6 common test automation mistakes and how to avoid them

After nearly 20 years of experience in software testing, I've seen a great deal of test tooling, and more than a few failures.

Short-term success is common. It's not that hard to write a little code to exercise a system. The problem comes 6 to 12 months later, when a test run takes hours, the test environment and results get flaky, and the programmers change the expected behavior of the system, transforming the nature of the tool from test automation to change detection.

Success in test automation is less about getting it right and more about avoiding mistakes. With that in mind, here are a few of the recurring ("deadly") mistakes I've seen over the years. You know you've made a mistake when you...

1. Drive testing entirely through the user interface

If you do a Google search for "test automation," the first dozen examples are likely to be about driving the entire system through the user interface. That means opening up a browser or mobile simulator and connecting to a back end over the Internet. But that's slow.

Incredibly slow.

This approach works fine for the first weeks, when running checks only takes five minutes. Over time, though, five minutes turn into an hour, then two, then three. Before you know it, testing locks up the tester's computer or test environment all afternoon. So you start kicking off automated test runs at 5 am or 5 pm and get the results the next day. Unfortunately, if something goes wrong early in a run, all the results that follow will be corrupted. The feedback loop from development to test slows to a crawl, creating wait states in the work.

While programmers are waiting for feedback, they start the next thing, which leads to multitasking. Eventually, someone re-skins the user interface, and, unless there is some sort of business logic layer in the tool, all checks will fail and you will be left with no easy way to revise the system. In an attempt to just get done, teams revert to human exploration, the automation becomes even more out of date, and, eventually, it will be thrown away.

Worst case, your testers spend all day chasing false failures from the automation, adjusting the test code to match the current system, and rerunning the checks. This might have some marginal value, but it is incredibly expensive, and valuable only when the programmers are making changes that routinely cause real failure. But that's a problem you need to fix, not cover up with the Band-Aid of testing tools.

During my three years at Socialtext, I helped maintain a test tooling system through a user interface that was advanced for its time. O'Reilly took it as a case study in the book Beautiful Testing. The team at Socialtext uses the same framework today, although it now has several tests running at one time on Amazon's Elastic Compute Cloud. Although we had a great many GUI-driving tests, we also had developer-facing (unit) and web services (integration) tests, a visual slideshow that testers could watch for every browser, and a strategy to explore by hand for each release. This combination of methods to reduce risk meant we found problems early.

Socialtext's success with heavy testing through the user interface made us the exception, not the rule. We got there with our mixed approach, and also by avoiding some of the other killer mistakes below.

2. Ignore the build/test/deploy pipeline

Recently, a customer brought my organization in to do an analysis and make a recommendation on test tooling. When we asked about the team's build process and how they deployed new builds, they were surprised. That wasn't on the menu, they said; the ask was to automate the testing process.

We don't think of it that way.

Let's say we had simply created the automated checks for them. When we left our two-day consulting assignment, we would have waved our magic wands and the customer would be able to run a script to get results in, say, ten minutes. Magic!

But if the company had one shared test environment where changes needed to be negotiated through change control, that might not actually save any time. We'd have a big, fat bottleneck in front of testing. As Tanya Kravtsov pointed out recently in her presentation at TestBash New York, automating the thing that is not the bottleneck creates the illusion of speed but does not actually improve speed.

There's a lot more to testing than test execution and reporting. Environment setup, test design, strategy, and test data are all part of testing. If you don't take them into account when looking into test tooling, you end up automating only a very small part of the process.

Environment issues aside, automated checks that need to be run by hand create a drain on the team. Most teams we work with tend to want to just get started by running automated checks by hand. I suggest a different approach: Start with one check that runs end-to-end, through the continuous integration server, running on every build. Add additional scripts to that slowly, carefully, and with intention. Instead of trying to automate 100%, recognize that tooling creates drag and maintenance cost. Strive instead to automate the most powerful examples.

3. Set up test data through the user interface

During a recent consulting assignment, a tester told me he spent 90 percent of his time setting up test conditions. The application allowed colleges and other large organizations to configure their workflow for payment processing. One school might set up self-service kiosks, while another might have a cash window where the teller could only authorize up to a certain dollar amount. Still others might require a manager to cancel or approve a transaction over a certain dollar amount. Some schools took certain credit cards, while others accepted cash only. To reproduce any of these conditions, the tester had to log in, create a workflow manually, and establish a set of users with the right permissions before finally doing the testing. When we talked about automation approaches, our initial conversation was about tools to drive the user interface. For example, a batch script like this:

parking-user create -email email@address.com -firstname john -lastname tester -account_id 10 -permissionsset read/write

The command-line idea was fine, but driving through the GUI would, again, be slow, and we wanted fast feedback.

In this case, you could check the screens once to see if they still create a user with the right setup, but once that's done, there's no need to recheck that create user works over and over. Instead, consider creating actual command-line tools that set up the data directly to speed up testing. In the example at the client, a simple command-line tool could have flipped the ratio from one hour a day of testing and seven hours of setup to seven hours of testing and one hour of setup.
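A minimal sketch of such a utility, in Python, might look like the following. It mirrors the flags of the parking-user command shown earlier, but everything behind them is an assumption: the in-memory USERS list stands in for whatever database or API the real application exposes, and create_user is a hypothetical helper, not the client's actual code.

```python
import argparse

# Hypothetical in-memory store standing in for the application's database or API.
USERS = []


def create_user(email, firstname, lastname, account_id, permission_set):
    """Create a user record directly, without going through the GUI."""
    user = {
        "email": email,
        "firstname": firstname,
        "lastname": lastname,
        "account_id": account_id,
        "permission_set": permission_set,
    }
    USERS.append(user)
    return user


def build_parser():
    """Define the 'parking-user create' command and its flags."""
    parser = argparse.ArgumentParser(prog="parking-user")
    commands = parser.add_subparsers(dest="command", required=True)
    create = commands.add_parser("create", help="create a test user")
    create.add_argument("-email", required=True)
    create.add_argument("-firstname", required=True)
    create.add_argument("-lastname", required=True)
    create.add_argument("-account_id", type=int, required=True)
    create.add_argument("-permissionsset", default="read")
    return parser


def main(argv):
    """Parse the given argument list and run the requested command."""
    args = build_parser().parse_args(argv)
    if args.command == "create":
        return create_user(args.email, args.firstname, args.lastname,
                           args.account_id, args.permissionsset)
```

Calling main with the argument list from the command line (for example, sys.argv[1:]) is enough to turn this into a script. The point is that one short command replaces a long click-through, so a setup script can call it dozens of times before the first check runs.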

Utilities like these can deliver value outside of testing. Often, operations and support can see the immediate value and will advocate for them. They are the kinds of things a programmer might make over a lunch hour.

A second common type of test data is the export-to-zip/import-from-zip combination. Teams that do this create a common sample test data set, with known users and known expected search results. The deploy pipeline creates a sample environment with a clean database, then imports the zip file. Some of my customers who have a multitenant system, where many users share the same database, think this option isn't a realistic simulation. In that case I suggest finding a way to export, delete, and re-import by account.
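The export, delete, and re-import by account idea can be sketched in a few lines. This is an illustration under stated assumptions, not a real product's format: the "database" is a plain dictionary of tables, every row is assumed to carry an account_id field, and the function names are hypothetical.

```python
import io
import json
import zipfile


def export_account(db, account_id):
    """Return zip bytes holding every table's rows for one account."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for table, rows in db.items():
            owned = [row for row in rows if row["account_id"] == account_id]
            archive.writestr(f"{table}.json", json.dumps(owned))
    return buffer.getvalue()


def delete_account(db, account_id):
    """Remove one account's rows and leave the other tenants untouched."""
    for table in db:
        db[table] = [row for row in db[table] if row["account_id"] != account_id]


def import_account(db, zip_bytes):
    """Load rows from an exported zip back into the shared tables."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            table = name[: -len(".json")]
            db.setdefault(table, []).extend(json.loads(archive.read(name)))


def reset_account(db, account_id, zip_bytes):
    """Put one account back into its known state before a test run."""
    delete_account(db, account_id)
    import_account(db, zip_bytes)
```

A pipeline step could export the sample account once, keep the zip under version control, and call something like reset_account before each run, so the checks always start from the same users and the same expected search results.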

4. Keep tests separate and distinct from development

Another problem with test tooling, one that's more subtle, especially in user interface testing, is that it doesn't happen until the entire system is deployed. To create an automated test, someone must code, or at least record, all the actions. Along the way, things won't work, and there will be initial bugs that get reported back to the programmers. Eventually, you get a clean test run, days after the story is first coded. But once the test runs, it only has value in the event of some regression, where something that worked yesterday doesn't work today.

There's plenty of failure in that combination. First of all, the feedback loop from development to test is delayed. It is likely that the code doesn't have the hooks and affordances you need to test it. Element IDs might not be predictable, or might be tied to the database, for example. With one recent customer, we couldn't delete orders, and the system added a new order as a row at the bottom. Once we had 20 test runs, the new orders appeared on page two! That created a layer of back and forth where the code didn't do what it needed to do on the first pass. John Seddon, the British occupational psychologist, calls this "failure demand," which creates extra work (demand) on a system that only exists because the system failed the first time around.

Instead of creating the "tests" at the end, I suggest starting with examples at the beginning that can be run by a human or a software system. Get the programmer, tester, and product owner in a room to talk about what they need to be successful, to create examples, to define what the automation strategy will be, and to create a shared understanding to reduce failure demand. My preference is to do this at the story level — what some might call a minimum marketable feature — which requires a half-day to a week of work. George Dinwiddie, an agile coach in Maryland, popularized the term "the three amigos" for this style of work, referring to the programmer, tester, and analyst in these roles. Another term for the concept is acceptance test-driven development.

My best experiences with test tools have been when the tooling was part of the requirements. Either the programmer created the tooling to demonstrate that the code works ("watch this") or the tester was so integrated into the development process that the automated examples popped out when the code was complete.

5. Copy/paste your test code

Eventually, someone has to write the code. Even if the record/playback tool claims to be codeless, sooner or later your software will produce dates that need to be compared to today's date and formatted, and you'll need to drop down into some kind of code editor. The person writing the code is probably not a professional programmer, but even were that so, it is tempting to focus more on getting the code done than on doing it well.

Here's a simple example.

Say every logical example, every "test case," is isolated. You can run them independently or as a list. Each example starts with a login.

You could cut and paste something like this:

driver->goto_page(test_url);
driver->wait_for_element('userid');
driver->type_ok('userid', username_var);
driver->wait_for_element('password');
driver->type_ok('password', password_var);
driver->wait_for_element('submit');
driver->click_ok('submit');
driver->wait_for_page_to_load();
driver->wait_for_text('hello ' + firstname_var);


Or you could create a single function:

def login(username, password, expected_text)

The example is trivial; of course you'll create a login function that you can reuse. But when we get to the nitty-gritty of the application — creating new data, editing rows and profiles, searching, and so on — it is tempting to just get the code to work. As you add new features, you copy/paste to make a new automated example. Over a period of years, you end up with a lot of copied/pasted code.
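For illustration, here is one way the single login function could be filled in, written in Python. The driver method names are borrowed from the pasted steps above; the two-argument form of type_ok (field, then text) and the RecordingDriver stub are assumptions made so the sketch runs without a browser.

```python
class RecordingDriver:
    """Stand-in driver that records calls; a real one would drive a browser."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any method the login function asks for is recorded instead of executed.
        def record(*args):
            self.calls.append((name, *args))
        return record


def login(driver, test_url, username, password, expected_text):
    """Log in once, in one place, and confirm the landing page greets the user."""
    driver.goto_page(test_url)
    driver.wait_for_element("userid")
    driver.type_ok("userid", username)
    driver.wait_for_element("password")
    driver.type_ok("password", password)
    driver.wait_for_element("submit")
    driver.click_ok("submit")
    driver.wait_for_page_to_load()
    driver.wait_for_text(expected_text)
```

Each example then starts with a single call such as login(driver, test_url, "jtester", "secret", "hello john"), and a change to the login screen is a change to one function.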

That's when disaster strikes.

At some point, someone may want to change the way the code works. Some operation you call a hundred times suddenly requires that the users fill out a captcha or click a button before they can proceed, and all of the automation breaks. Fixing it requires a great deal of searching and replacing, and that could take days, while the programmers continue to move further and further ahead of you. Once this happens a few times, the test process becomes messy and expensive, and fails to deliver much value.

To avoid this, create functions for logical operations. The page objects pattern does this in a structured, object-oriented way.
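A rough sketch of a page object, in Python, shows why this helps when the application changes. The OrderPage class, its locators, and the FakeDriver stub are all hypothetical; a real page object would wrap whatever browser driver the team already uses.

```python
class FakeDriver:
    """Stand-in for a browser driver; records the steps it is asked to run."""

    def __init__(self):
        self.steps = []

    def type_into(self, locator, text):
        self.steps.append(("type", locator, text))

    def click(self, locator):
        self.steps.append(("click", locator))


class OrderPage:
    """Page object: the only place that knows how the order screen is laid out."""

    def __init__(self, driver):
        self.driver = driver

    def create_order(self, item, quantity):
        self.driver.type_into("item", item)
        self.driver.type_into("quantity", str(quantity))
        # A newly required step (say, a confirmation button) is added here once,
        # and every check that calls create_order picks it up.
        self.driver.click("confirm")
        self.driver.click("submit")


def check_single_order(page):
    """One automated example that relies on the page object."""
    page.create_order("parking pass", 1)


def check_bulk_order(page):
    """A second example; it shares the same create_order steps."""
    page.create_order("parking pass", 50)
```

When the screen gains an extra button, the fix is one line in OrderPage rather than a search-and-replace across a hundred pasted copies.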

6. Think of test tooling as a single, large computer program

Writing code to drive the application is straightforward at first, but eventually that code grows complex and hard to debug. Imagine debugging a test failure that might or might not point to a failure in a program. Without an explicit, clear separation, the code starts to look and feel like a big ball of mud.

It does not have to be that way. Imagine if, instead, an executable example looked more like this, and you viewed it in a tool like a spreadsheet:

A Human-Readable Automated Check Example

This table-based example doesn't include if statements or for loops, and the %% sign indicates a variable that can be passed in or assigned. In the past, I have created accounts and users with a standard name, followed by a time stamp, to ensure that the users were unique for each test run. Individual functions, like search_for, followed by what to search and what to expect in the results, consist of code. Those might have if statements or loops in them, but what we expose to the customer is a straight flow.
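To make the split concrete, here is a small Python sketch of a runner for that kind of table. It is an assumption-heavy illustration: the %%name%% marker syntax, the create_user and search_for keywords, and the in-memory state are all invented for the example, and real keyword functions would drive the application instead.

```python
import time


# Hypothetical keyword functions; real ones would drive the application.
def create_user(state, name):
    state["users"].append(name)


def search_for(state, term, expected):
    found = [user for user in state["users"] if term in user]
    assert expected in found, f"expected {expected!r} in {found!r}"


KEYWORDS = {"create_user": create_user, "search_for": search_for}

# The straight flow a tester or analyst would read and edit, one row per step.
EXAMPLE_ROWS = [
    ["create_user", "%%username%%"],
    ["search_for", "tester", "%%username%%"],
]


def substitute(cell, variables):
    """Replace each %%name%% marker with its value."""
    for name, value in variables.items():
        cell = cell.replace(f"%%{name}%%", value)
    return cell


def run_table(rows, variables):
    """Run rows top to bottom: first cell is the keyword, the rest are arguments."""
    state = {"users": []}
    for keyword, *cells in rows:
        args = [substitute(cell, variables) for cell in cells]
        KEYWORDS[keyword](state, *args)
    return state


def unique_name(prefix):
    """Standard name plus a time stamp, so each run gets fresh users."""
    return f"{prefix}_{int(time.time())}"
```

Running run_table(EXAMPLE_ROWS, {"username": unique_name("tester")}) executes the rows in order. The table has no branches or loops; any of that lives inside the keyword functions, which is where the toolsmith works.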

How you expose it doesn't matter; the interface could be a spreadsheet you export to .csv, a web page, or a wiki-based tool like Fitnesse.

These are the kinds of examples that a tester, analyst, or even technically inclined customer can read and edit. The less technical staff might not write new functions, but they could come up with this level of code and work with a toolsmith or programmer who creates the code beneath the function call.

By splitting the work into the isolated, human-readable examples and the implementation code, we make a system that benefits, and can be maintained by, more groups. The article "Serious Acceptance Checking" has a more detailed example of this strategy.

Tomorrow's test tool strategy

There is no right way to do test tooling, but there are certainly some wrong ones. "Quick wins" may be cheap and easy, but not when they result in a system that is expensive, or even impossible, to maintain. Worse, they could create extra delayed feedback loops between programmers and testers.

Instead of jumping into driving the GUI, take a step back. Look at the layers of the application. See what processes are actually repetitive and how much time you'd save by automating them.
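A back-of-the-envelope calculation is often enough to answer that question. The sketch below, in Python, is only an estimate under simple assumptions: the parameter names are invented here, and it treats build cost and weekly upkeep as the only costs of the tooling.

```python
def hours_saved(manual_minutes, runs_per_week, weeks, build_hours, upkeep_hours_per_week):
    """Net hours saved over the period; negative means the automation cost more."""
    manual_hours = manual_minutes / 60 * runs_per_week * weeks
    automation_hours = build_hours + upkeep_hours_per_week * weeks
    return manual_hours - automation_hours
```

For example, a 30-minute setup done ten times a week for 26 weeks, against 40 hours to build the tool and one hour a week of upkeep, nets 64 hours. A five-minute task done once a week over the same period comes out well below zero, which is a hint to leave it manual.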


Should you automate?
Source: XKCD

Review your playbook for test automation and pick something that could give you a big win for small effort. Engage the programmers to create a solution for the entire team. Mind the speed of feedback, focus on things that add speed but not drag, avoid the worst of the mistakes, and you'll probably be OK.

Those are ideas from my own practice. What are yours?
