Test data management for automated tests

New Thinking 2 February 2015 Alex Irvine

In any organisation, finding test data can be difficult. Finding data for an automated test is often even harder because of the dependency of the test on the data and the strong focus on reuse and repeatability for automated tests, says Alex Irvine.

If the data or the rules change, the test will simply fail – or in some cases not run at all – and may not give you any indication why. Because of the difficulty of the task, time is wasted identifying and acquiring appropriate data. When data is hard to find, the data used may end up being low quality. Low-quality test data results in invalid or inaccurate testing and low confidence in the findings. We’ve used more time (money!) and we can no longer be confident in the results of our testing.

We cannot avoid the need to identify and acquire test data – and we cannot accept that the status quo is the best we can do. Recently, I implemented a data management strategy based on a simple principle: change the way data is specified for automated tests, and use purpose-built tools to acquire that data dynamically at runtime.

Common data specification practice

A common practice when writing automated tests is to script against a specific client. This approach works well enough for it to be normal behaviour – and automation teams all over the world insist on having their own environment so they can maintain total control over their data.

If our application and environment were represented by the game ‘Guess Who™’, this would be equivalent to analysing the needs of the test and deciding that the ideal candidate is ‘Maria’, a woman with earrings, a green beret and long brown hair. In a typical automated test, all of those attributes will be verified.

Scripting against a single specific data set (Maria) is an imperfect and risky technique.

  • If Maria is no longer present in the database, the test will not be able to run
  • If the business rules change, Maria may no longer be appropriate
  • What if Maria’s services are modified and she no longer has earrings?
  • If the business rules contain age conditions, what happens when Maria gets older?

Unless there are clear comments in the code or in the test description, it will be very difficult to understand what it was about Maria that made her ideal for the test just by reading through it. If new data is required, it will be hard to find a replacement for Maria. If you are very lucky and have an environment you have complete control over, you may be able to keep it static and employ ‘clean-up tests’ to reset or roll back the data.
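To make the coupling concrete, here is a minimal sketch in Python of a test scripted against one named user. Everything in it is a hypothetical stand-in rather than code from the project: the in-memory `make_users` store, the `unsubscribe` function and the small `run` helper are assumptions made purely for illustration.

```python
# Hypothetical in-memory stand-in for the application's user store.
# All names, fields and functions here are illustrative assumptions.
def make_users():
    return {
        "Maria": {"hat": "green beret", "earrings": True, "services": {"HATS"}},
        "Claire": {"hat": "sun hat", "earrings": False, "services": {"HATS"}},
    }


def unsubscribe(users, name, service):
    """Remove a service from the named user's subscriptions."""
    users[name]["services"].discard(service)


def pinned_unsubscribe_check(users):
    """A check scripted against one specific user, Maria."""
    maria = users["Maria"]                  # KeyError if Maria has been removed
    assert maria["earrings"] is True        # incidental to the purpose of the check
    assert maria["hat"] == "green beret"    # incidental to the purpose of the check
    unsubscribe(users, "Maria", "HATS")
    assert "HATS" not in maria["services"]  # the only assertion that matters


def run(check, users):
    """Run a check and report the outcome as a short string."""
    try:
        check(users)
        return "pass"
    except AssertionError:
        return "fail"
    except KeyError as missing:
        return f"could not run: no user {missing}"


# 1. The data is exactly as the script expects.
print(run(pinned_unsubscribe_check, make_users()))

# 2. Maria's data drifts: she no longer has earrings.
drifted = make_users()
drifted["Maria"]["earrings"] = False
print(run(pinned_unsubscribe_check, drifted))

# 3. Maria is no longer present in the database.
gone = make_users()
del gone["Maria"]
print(run(pinned_unsubscribe_check, gone))
```

Only the first run passes. In the other two, the unsubscribe feature itself is untouched, yet the check either fails or cannot run, and nothing in the script records which of Maria's attributes actually mattered.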

In a complex environment where data comes from multiple sources and transactions cannot always be rolled back, tests need to be less tightly coupled to their data.

A way out

The profound change that needs to occur is the specification of test data by attribute instead of name. The purpose of the test above may be to simply verify that users can be unsubscribed from the ‘HATS’ service. All the test really needed was a user with a hat, so Claire and Bernard are just as valid as Maria.

The test should be written in such a way that the critical data is specified by attribute and condition, and not by name. In effect: ‘This test requires a user with a hat’ instead of ‘This test requires user: Maria’. For this method to work, mechanisms need to be built to interpret and correctly acquire an appropriate candidate.
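One possible shape for such a mechanism is sketched below in Python. It is an illustration under stated assumptions, not the tooling used on the project: the `USERS` list stands in for whatever database or API really holds the data, and `has_hat` and `acquire_user` are hypothetical names.

```python
from typing import Callable, Iterable, Optional

# Hypothetical user records; in practice these would come from the
# application's database or API rather than a literal list.
USERS = [
    {"name": "Anne", "hat": None},
    {"name": "Maria", "hat": "beret"},
    {"name": "Claire", "hat": "sun hat"},
    {"name": "Bernard", "hat": "cap"},
]


def has_hat(user: dict) -> bool:
    """The attribute the test cares about: the user has a hat of any kind."""
    return user.get("hat") is not None


def acquire_user(users: Iterable[dict],
                 condition: Callable[[dict], bool]) -> Optional[dict]:
    """Return the first user that satisfies the condition, or None."""
    return next((user for user in users if condition(user)), None)


# 'This test requires a user with a hat', resolved at runtime.
candidate = acquire_user(USERS, has_hat)
print(candidate["name"])

# If Maria disappears, the same specification still finds a valid user.
without_maria = [user for user in USERS if user["name"] != "Maria"]
replacement = acquire_user(without_maria, has_hat)
print(replacement["name"])
```

The test states its requirement once, as a condition, and the lookup decides who satisfies it on the day. The condition also documents why the data was chosen, which a hard-coded name never does.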

The test will also need to be built to handle the degree of variation that this approach brings. The simplest way to ensure this is to check only the things that matter to your test. Verifying that the user’s name is Maria, or that her hat is a green beret, will not make the test better. It will only make it more fragile.
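As a rough sketch of what checking only the things that matter can look like, the Python below acquires any user subscribed to ‘HATS’ and asserts on the subscription alone. The data layout, `unsubscribe` and `check_unsubscribe_from_hats` are hypothetical, and treating ‘has a hat’ as ‘is subscribed to HATS’ is an assumption made for the example.

```python
# Hypothetical user records and application call, for illustration only.
def make_users():
    return [
        {"name": "Anne", "hat": None, "services": set()},
        {"name": "Claire", "hat": "sun hat", "services": {"HATS"}},
        {"name": "Maria", "hat": "green beret", "services": {"HATS"}},
    ]


def unsubscribe(user, service):
    """Remove a service from the user's subscriptions."""
    user["services"].discard(service)


def check_unsubscribe_from_hats(users):
    """Verify that a user can be unsubscribed from HATS, whoever they are."""
    # Acquire any suitable candidate instead of naming one.
    candidate = next((u for u in users if "HATS" in u["services"]), None)
    if candidate is None:
        # Say clearly why the check could not run, rather than failing obscurely.
        raise LookupError("no user subscribed to HATS; the check cannot run")

    unsubscribe(candidate, "HATS")

    # The only assertion: the outcome the check exists to verify.
    # No checks on name, hat colour, earrings or hair.
    assert "HATS" not in candidate["services"]
    return candidate["name"]


print(check_unsubscribe_from_hats(make_users()))
```

Because nothing incidental is asserted, the check passes for whichever candidate is found, and when no candidate exists it reports the missing data explicitly.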


The approach outlined above, in which tests are designed to seek out appropriate data dynamically at runtime, worked really well on my project. The burden of discovering test data was built into the automated tests, saving many hours of manual effort per test cycle. The tests became more robust and the test results more meaningful.