Why test automation keeps breaking

When test automation keeps breaking, the first reaction is often to blame the tool.

Maybe we picked the wrong framework. Maybe the automation engineer did not write the tests properly. Maybe UI automation is just too flaky. These explanations can be true in some cases, but they often miss the deeper issue.

In my experience, test automation usually breaks because we create tests that try to prove too much in one flow. We couple them to unstable implementation details. We run them against environments that are not consistent enough to produce reliable results. Then, when the tests fail repeatedly, the team starts to lose trust in automation altogether.

This is where automation can become more of a burden than a confidence mechanism. Instead of helping teams move faster, it creates noise. Instead of giving clear feedback, it creates investigative work. Instead of showing whether the product is ready, it becomes another thing that needs to be maintained.

The problem is not that test automation is useless. The problem is that reliable automation requires design. It needs a clear scope, stable points of interaction, and an environment that supports repeatable execution.

In most cases, unreliable automation comes down to three recurring problems. The first is the most common mistake I see teams make.

1. The test is trying to prove too much

When a test tries to validate an entire business process through one UI journey, it may give the illusion of strong coverage, but when it fails, it becomes a poor diagnostic tool.

I was once the sole test engineer introducing test automation at a client site. The team was excited. It was their first time using automation, and they had heard all the usual promises: faster feedback, better coverage, and more confidence in delivery.

One of the first flows they wanted to automate was customer sign-up. On the surface, it sounded simple. A customer visits the website, fills in their details, submits the form, opens their email, clicks the verification link, and is then marked as verified in the CRM. The team also wanted the automated test to open the CRM and confirm that the customer's details had been captured correctly. The challenge was that they wanted to test the entire flow through the UI.

There are two problems with this. First, every step runs through a browser, webmail, and CRM, making the test slow and the feedback loop longer. Second, when it fails, the error message tells you almost nothing about where in the journey things went wrong.

The test covered a lot without giving clear feedback. A better approach is to split the flow based on what you are trying to test. If the purpose is to test customer sign-up, you may not need to open the webmail and click the verification link every time. That can be tested separately. If the purpose is to confirm that the CRM has received the correct customer information, checking through the API might be faster than relying on the UI.
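As a rough illustration of that split, here is a minimal Python sketch with two narrow checks in place of one long journey. It is not the client's actual suite: FakeCrm, sign_up, verify_email, and the field names are all hypothetical stand-ins, and the CRM is an in-memory fake so the example runs on its own. In a real project these calls would go to the application's sign-up endpoint and the CRM's API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeCrm:
    """Hypothetical in-memory stand-in for a CRM API client."""
    customers: dict = field(default_factory=dict)

    def get_customer(self, email: str) -> Optional[dict]:
        return self.customers.get(email)


def sign_up(crm: FakeCrm, name: str, email: str) -> dict:
    """Hypothetical sign-up call; a real one would submit to the application."""
    crm.customers[email] = {"name": name, "email": email, "verified": False}
    return {"status": "pending_verification"}


def verify_email(crm: FakeCrm, email: str) -> None:
    """Hypothetical verification step, normally triggered by the emailed link."""
    crm.customers[email]["verified"] = True


def test_sign_up_captures_customer_details() -> None:
    # Proves one thing: sign-up stores the right details in the CRM.
    crm = FakeCrm()
    response = sign_up(crm, "Ada Example", "ada@example.com")
    assert response["status"] == "pending_verification"
    customer = crm.get_customer("ada@example.com")
    assert customer is not None
    assert customer["name"] == "Ada Example"
    assert customer["verified"] is False


def test_verification_marks_customer_verified() -> None:
    # Proves one thing: verification flips the flag. No webmail involved.
    crm = FakeCrm()
    sign_up(crm, "Ada Example", "ada@example.com")
    verify_email(crm, "ada@example.com")
    assert crm.get_customer("ada@example.com")["verified"] is True
```

When one of these fails, its name already says which part of the journey is at fault, which is exactly what the single long UI test could not do.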

End-to-end testing through the UI is not bad in itself. A small number of full journey tests can provide feedback and confidence that the whole process works. But when every important verification is pushed through the UI, automation becomes slow, fragile, and difficult to diagnose.

2. The test is coupled to the wrong thing

A tightly coupled test is not automatically a bad test. Every test has to latch onto something: a button, a form, a confirmation message. The problem is not the coupling itself but coupling to unstable details.

This is especially common in UI automation. For example, a test might be written to click on the yellow “Sign up” button located at the top right corner of the page. That sounds precise, but it is also fragile. The button could change colour. The wording could change from “Sign up” to “Register”. The button could move to a different part of the screen. None of these changes necessarily means the customer journey is broken, but the automated test may still fail. It fails because its instructions were attached to details that were likely to change.

A better approach is to couple the test to something more stable. In a custom-built application, that may mean working with developers to add a unique test ID to important elements, such as the sign-up button. The button may move or change colour, but the automation still knows what to click.
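The difference can be shown with a small, self-contained Python sketch. It does not use a real browser or automation library: Element is a hypothetical, simplified model of a rendered button, and the test ID value "sign-up-button" is an assumed naming convention, not something from a specific framework.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Element:
    """Hypothetical, simplified model of a rendered button."""
    text: str
    colour: str
    position: str
    test_id: Optional[str] = None


def find_by_appearance(page: List[Element], text: str, colour: str,
                       position: str) -> Optional[Element]:
    """Fragile lookup: matches on wording, colour, and layout."""
    for element in page:
        if (element.text, element.colour, element.position) == (text, colour, position):
            return element
    return None


def find_by_test_id(page: List[Element], test_id: str) -> Optional[Element]:
    """More stable lookup: matches on an identifier agreed with developers."""
    for element in page:
        if element.test_id == test_id:
            return element
    return None


# The same button before and after a redesign (illustrative values only).
before = [Element("Sign up", "yellow", "top-right", test_id="sign-up-button")]
after = [Element("Register", "blue", "footer", test_id="sign-up-button")]

# The appearance-based lookup works at first, then breaks after the redesign.
assert find_by_appearance(before, "Sign up", "yellow", "top-right") is not None
assert find_by_appearance(after, "Sign up", "yellow", "top-right") is None

# The test-ID lookup keeps working, because the identifier did not change.
assert find_by_test_id(before, "sign-up-button") is not None
assert find_by_test_id(after, "sign-up-button") is not None
```

In a browser automation tool, the equivalent is usually a selector on a dedicated attribute such as data-testid. The exact attribute name is a team convention and is assumed here.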

In vendor platforms such as D365 or Salesforce, where adding a test ID may not be possible, the same principle applies: avoid relying on fragile visual details where possible and use the most reliable identifiers, such as accessible names or field labels.

The aim is to couple the test to something that represents the user intent, not the temporary layout of the application.

This is where test automation becomes more than a scripting exercise. Reliable automation often requires collaboration between testers, developers, business analysts, and platform specialists about how the system can be tested reliably. If the application is not built or configured with testability in mind, the automation will always be fighting the product instead of supporting it.

3. The test assumes a stable environment

Automated UI tests are often labelled as flaky, but the test itself is rarely the real problem. Usually, the test is simply running against conditions that are not stable enough to produce a reliable result.

Unlike a human tester, automation does not naturally pause, observe, and adjust. It executes instructions quickly and consistently. If the application has not finished loading, if a background process has not completed, or if a downstream system has not responded yet, the test may move to the next step too early and fail.

The most common cause is timing. Suppose a test submits a sign-up form and a downstream system needs a few seconds to update the CRM. If the test checks the result instantly, it fails, even though the product is working correctly. A hard wait, pausing for a fixed number of seconds, may reduce the failure rate, but it also slows the entire test suite. A better approach is a dynamic wait, where the test waits until it sees meaningful evidence that the system is ready, like a confirmation message or an updated API response, before moving on.
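A dynamic wait can be as simple as polling a condition until it holds or a time limit expires. The sketch below is one minimal way to write it in Python, using only the standard library. The wait_until helper and the crm_has_customer check are hypothetical names, and the downstream delay is simulated with a timestamp rather than a real CRM. Most UI automation tools ship their own version of this idea, so a hand-rolled helper is mainly useful for API or data checks.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 10.0,
               interval: float = 0.2) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds pass.

    Returns True as soon as the condition holds, False on timeout.
    The condition is always checked at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Simulated downstream system: the CRM record "appears" about 0.3 seconds
# after sign-up. In a real test this would be an API call to the CRM.
ready_at = time.monotonic() + 0.3


def crm_has_customer() -> bool:
    return time.monotonic() >= ready_at


# The test proceeds as soon as the record exists, instead of always
# sleeping for a fixed worst-case duration.
assert wait_until(crm_has_customer, timeout=2.0, interval=0.05)
```

Compared with a hard wait, the timeout only sets the upper bound. On a fast run the test moves on almost immediately, and on a slow run it fails with a clear "condition not met in time" outcome rather than an unrelated error at a later step.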

Timing is only half the battle. Tests also rely heavily on the infrastructure they run on. If a test environment is slow, shared by multiple teams, missing production-like configurations or relying on inconsistent test data, the exact same test can pass in one run and fail in the next.

This is why “works on my machine” is such a common frustration. The code and the test are identical, but the local machine, CI pipeline and third-party systems behave differently. One way to reduce this variation is containerisation or environment as code. The idea is that every machine runs the same defined environment, which removes many of these inconsistencies.

A failing automated test does not automatically mean the automation is badly written. More often than not, the environment is the problem, not the test.

Reliable test automation is not just about writing scripts. It is about design decisions: being clear about what each test proves, building applications with testability in mind, and treating test environments as part of the delivery system.

When those decisions are made well, automation stops being noise and starts being a signal. It tells the team what failed and where it failed, and that clarity is what makes it valuable.
