
DevTest done right: How automation builds strong development, testing teams

Peter Wayner Freelance writer

Everyone's heard the stories of the shops where the developers and testers never click. The companies are like a middle school dance—on one side of the room are the developers and on the other side, staring back, is the testing crew. In theory, each side knows the value of the other. The developers can't ship working code without the careful review of the QA team, and the QA team wouldn't have anything to do without the programmers. But when they don't click, productivity suffers. If they speak at all, it's with short sentences. Little is said and even less is accomplished.

It doesn't need to be this way. Some teams are breaking down barriers to get the sides working together, and they're using software to make the change happen. They're building up the software tools that make it simpler for the developers and the testers to leave their silos and move forward together, working in sync.

One click between testing, deployment

The tools that are uniting the developers and testers are designed to bolster cooperation by simplifying communication, while bringing more automation and structure to the process. The old mechanism where developers would sign off on a version, hand it over to the testers, and wait for a reply is gone. Whenever new code is checked in to a content repository, the testing begins immediately. The tools analyze the code, compile it, and start up a collection of tests. Both the developers and QA get the results and the cycle continues. Some call this world "DevTest," a reference to the way that the tools are extensions of the world of DevOps. The DevTest world is becoming tightly integrated with the DevOps world—it's now just one click between testing and deploying the software.

Assaf Greenberg, a team manager in HP's Performance Center of Excellence, uses a collection of software packages to automate the daily debugging and testing cycle by running nightly stress tests on new products. The tools collect the latest version of the code, build it, and begin to push it through a sequence of tests automatically. It's like a collection of elves that appear magically each night. "Every night we are running a regression test," Greenberg explains. "In our case, it's about five hours. At the end of the test, we are comparing the results to the previous run."
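The compare-with-previous-run step Greenberg describes can be sketched in a few lines. This is an illustrative Python sketch, not Performance Center's actual API; the function name, the KPI format, and the 10 percent tolerance are all assumptions:

```python
# Illustrative sketch (not Performance Center's API): flag transactions whose
# average response time regressed beyond a tolerance versus the previous run.
def find_regressions(previous, current, tolerance=0.10):
    """Return transactions whose response time grew more than `tolerance`."""
    regressions = {}
    for transaction, prev_time in previous.items():
        curr_time = current.get(transaction)
        if curr_time is None:
            continue  # transaction dropped from the suite
        if curr_time > prev_time * (1 + tolerance):
            regressions[transaction] = (prev_time, curr_time)
    return regressions

previous_run = {"login": 0.42, "search": 1.10, "checkout": 2.30}
current_run = {"login": 0.44, "search": 1.55, "checkout": 2.31}
print(find_regressions(previous_run, current_run))
# {'search': (1.1, 1.55)}
```

A nightly Jenkins job would gather the numbers automatically; the morning email then points the team at the specific transactions that regressed.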

The new results show up in an email sent to the developers and the testers. If the application performed well under the stress tests, everyone is happy. If something went wrong, the details are in the logs. "In the morning, the team can go and check the specific transaction that created the regression," he says. "Everything is automated using Jenkins and Performance Center."

In this case, the entire process is orchestrated by Jenkins, a popular open source package that begins analysis of the code after it's committed. The tool first builds the code and then starts the various tests. The major stress testing is done with HP's Performance Center, a package triggered with a plugin that links it with Jenkins. Greenberg says his lab also runs a number of other tests to look for deeper problems. One is an endurance test that runs for 72 hours and simulates 1,000 users constantly asking for information. The test tracks KPIs like the response time for various database queries.

"When we are doing the automated testing, we are looking for the response time of the server," he explains. "We are using the API of the product. We are looking for the user experience. How long does it take for the user to take the response back to the client—including the rendering?"

This approach is growing more common as companies strategically integrate development and testing with a common platform. The developers and the testers work together on the same code base using a unified collection of tools that constantly checks and tests the code. As soon as the developers release some code by checking it into the repository, the tools and the testers take it apart, first to make sure it actually works, and second to ensure that it meets their standards.

Automated tools

The DevTest methodology is layered on top of the common source code repositories, such as Git, Subversion, Mercurial, or Perforce. Once developers started using the structure of a good central repository, they realized they could centralize more than just storage and history. They could create a central force that could unite all stages of the development pipeline.

Many, like Greenberg, rely on Jenkins, or its close cousin Hudson, to watch the repositories and start the automated testing when the code is modified. These open source continuous integration (CI) tools are constellations of their own, because they're really just loose frameworks that knit together a big ecosystem of plugins.
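The trigger-on-change behavior these CI hubs provide boils down to a simple loop: notice a new commit, then run the pipeline stages in order. A hedged Python sketch; the stage names and the fail-fast policy are illustrative, not Jenkins internals:

```python
# Sketch of the CI trigger idea: when the repository head changes,
# run the pipeline stages in order and stop at the first failure.
def run_pipeline(stages):
    """stages: list of (name, callable) pairs; returns (passed, results)."""
    results = {}
    for name, stage in stages:
        ok = stage()
        results[name] = ok
        if not ok:
            return False, results  # fail fast, as a CI server would
    return True, results

def ci_tick(last_seen, head, stages):
    """Trigger the pipeline only when a new commit has appeared."""
    if head == last_seen:
        return last_seen, None  # nothing new; stay idle
    passed, results = run_pipeline(stages)
    return head, (passed, results)

stages = [("build", lambda: True), ("unit", lambda: True), ("stress", lambda: False)]
head, outcome = ci_tick("abc123", "def456", stages)
print(outcome)  # (False, {'build': True, 'unit': True, 'stress': False})
```

A real CI server adds what this sketch omits: webhooks or polling to detect the commit, isolated build agents, and plugin hooks at every stage.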

Build it

The first job of the CI hubs is to build the code from scratch. While this sounds like a trivial test for code quality, it's essential that it be done as often as possible on the entire repository. In the past, teams often assumed that if each programmer's code worked as promised, then the entire project would work when it was all assembled. But it's too easy for developers to drift apart as they use slightly different versions of each other's code. Even small changes can crash the build.

When teams started realizing that versions of the code could easily slip into incompatibility, they turned to continuous integration to watch for problems. Simply using an automated build tool to rebuild the entire project regularly can save weeks at the crucial moment when teams are trying to fit all the parts together just before the deadline.

After the code is built, the testing and analysis can begin. There are hundreds of different test programs that integrate with Jenkins or Hudson, so many that companies are starting to ask staff members to specialize in installing and configuring them. Some offer stress testing on deployed tools, while others dig deeper inside the code itself to look for flaws.

The popular open source Java tools JUnit and JMeter are two good examples of the strategy of testing the performance of the code by pushing it with sample data. Even the most basic test data can ensure that the code base is responding correctly to input. Rapid and repeated tests can reveal the kind of flaws that only appear when the code is asked to run at scale.
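JUnit and JMeter are Java tools, but the pattern of pushing code with sample data translates to any language. A small sketch in the same spirit using Python's built-in unittest module, with a toy parser standing in for the code under test:

```python
# The same idea as a JUnit-style test, using Python's built-in unittest.
# The parser here is a toy stand-in for the real code under test.
import unittest

def parse_quantity(text):
    """Toy function under test: parse '3 apples' into (3, 'apples')."""
    count, _, item = text.partition(" ")
    return int(count), item

class ParseQuantityTest(unittest.TestCase):
    def test_basic_input(self):
        self.assertEqual(parse_quantity("3 apples"), (3, "apples"))

    def test_rejects_garbage(self):
        # Even basic bad data should fail cleanly, not silently.
        with self.assertRaises(ValueError):
            parse_quantity("many apples")

# run with: python -m unittest <this_file>
```

A CI hub runs suites like this on every commit, so a regression surfaces hours, not weeks, after the change that caused it.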

XebiaLabs builds a set of tools to automate the pipeline and simplify the job of managing Jenkins and the various plugins. When teams start adding many of them, the build-test-deploy cycle can grow very complex. "Many teams try to use tools like Jenkins to manage the execution of tests through their pipeline, which helps to a point, but they still have the problem of reporting the results and organization," says Randy Mazin, a DevOps consultant for XebiaLabs who helps companies adopt their product. "When it comes to the reporting, they have to comb through flat text files of result data and aren't sure what some of it means. Does that sound safe?"

One of their products, XL TestView, wraps around Jenkins to make it simpler for the entire team to track how the build and testing are progressing. It collects the data directly from Jenkins and other tools and delivers the current results on a dashboard.

Code analysis

The task of building the code and running it through an automated test like JUnit or JMeter will catch many basic bugs, which reveal themselves when the code fails or runs slowly, but these tests are just the beginning. The automated tools can also scan the code for common errors in logic, as well as more subtle problems, like security holes. The code doesn't even need to be compiled for some of this analysis, because the tools start with the raw source code. The best tools apply sophisticated analysis that examines the logic directly without running it, so they can infer potential problems without waiting for the code to fail. This matters because some bugs only occur when the load is especially high, and others only appear when malicious users are deliberately trying to break into the system.

Coverity's Code Advisor and Parasoft's Static Analysis are two tools that are able to spot a long list of potential coding flaws. Many of these mistakes are hard to identify in advance because the code is technically correct enough to compile, so the problem can't be found in a normal build. A flaw in the logic could lead to a failure later, perhaps even at the worst possible time when the load is peaking. Some of these errors would sail through traditional unit testing with basic data sets and only lead to problems down the road.

The list of flaws that might be spotted includes programmer omissions, such as the failure to check for null pointers, buffer overflows, integer overflows, and uninitialized objects. These tools can also identify complex potential flaws, including the race conditions and deadlocks that occur when programmers don't think through all the issues in multi-threaded code.
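A toy taste of what static analysis looks like in practice. Commercial tools like Code Advisor and Static Analysis are vastly deeper; this Python sketch flags just one classic mistake, comparing to None with == instead of is, by walking the abstract syntax tree without ever running the code:

```python
# Minimal static-analysis sketch: walk Python's AST and flag comparisons
# written `x == None`, which should be `x is None` and can hide subtle
# bugs when a class defines a custom __eq__ method.
import ast

def find_none_comparisons(source):
    """Return line numbers where `== None` or `!= None` appears."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Compare):
            for op, comparator in zip(node.ops, node.comparators):
                if (isinstance(op, (ast.Eq, ast.NotEq))
                        and isinstance(comparator, ast.Constant)
                        and comparator.value is None):
                    findings.append(node.lineno)
    return findings

sample = "x = lookup()\nif x == None:\n    handle_missing()\n"
print(find_none_comparisons(sample))  # [2]
```

Note that the code is never executed, which is exactly why this class of tool can run on raw source the moment it's checked in.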

Some of the tools try to be proactive by pushing a coding philosophy that will lead to better code. The open source project Cucumber, for instance, lets the team spell out the expectations for the code with a business-readable, domain-specific language—a simplified but rigorous language meant to be accessible to both programmers and business development team members. Cucumber turns these specifications into both documentation and a test program.

Business rules and coding policy

While the developers are understandably focused on bugs and potential bugs, the automated systems can also perform often thankless chores, such as forcing coders to follow company policies. Some teams, for instance, have specific stylistic rules, like requiring certain comments or avoiding some complex idioms. One team I worked with banned defining more than one variable in each line of code because the boss thought it made the code cleaner. CI plugins are ideal taskmasters for the thankless job of keeping the programmers focused on following the style guide.
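A style rule like the one-variable-per-line ban can be enforced mechanically. An illustrative Python checker for that anecdote; the function name and rule are hypothetical, not any particular plugin's API:

```python
# Sketch of a house-style check like the one described above: flag lines
# that define more than one variable at once (e.g. `a, b = 1, 2`).
import ast

def flag_multi_assignments(source):
    """Return line numbers that assign more than one variable."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            targets = node.targets
            # `a = b = 1` has two targets; `a, b = 1, 2` has one Tuple target
            if len(targets) > 1 or isinstance(targets[0], ast.Tuple):
                findings.append(node.lineno)
    return findings

sample = "a, b = 1, 2\nc = 3\nd = e = 4\n"
print(flag_multi_assignments(sample))  # [1, 3]
```

Wired into a CI plugin, a check like this nags on every commit, which is exactly the thankless taskmaster role described above.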

Some industries have policies that go beyond simple rules of style and aim to avoid legal swamps or harm to customers. Aerospace, automobile, and medical companies, for example, must comply with sets of coding rules defined by their industry and designed to keep fliers, drivers, and patients safe from injury or death. The software analysis may not catch all the problems with the code, but it can flag violations of these policies as soon as they appear in the code base.

Security remains a challenge

Ensuring that a software package keeps data private and prevents unauthorized access is a continual challenge for developers. The automated tools can also offer a layer of defense by searching the code for known weaknesses.

Identifying these flaws is best done with static analysis, if only because they are often impossible to detect with regular use cases and testing. The code usually runs smoothly and passes stress tests because the security holes don't reveal themselves by crashing.

The automated DevTest tools can use static analysis of the code to identify dangerous structures that could allow errors like SQL injection, cross-site scripting, and cross-site request forgery. More sophisticated mistakes, such as failing to test for the end of a buffer, can also be snagged with pattern-matching rules. While the mechanisms can't flag all permutations, they identify enough potential flaws to make them invaluable.
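A crude sketch of the pattern-matching idea for SQL injection. Real analyzers track data flow through the whole program; this Python version just flags lines that mix a SQL keyword with string building, so both the patterns and the hit rate are purely illustrative:

```python
# Toy pattern-matching scan: flag SQL built by string concatenation or
# formatting, a classic injection hazard versus parameterized queries.
# Real static analyzers track tainted data across functions; this does not.
import re

SQL_KEYWORD = re.compile(r"(?i)\b(select|insert|update|delete)\b")
RISKY_BUILD = re.compile(r"""["']\s*\+|%\s*\(|\.format\(|f["']""")

def scan_for_sql_injection(source):
    """Return line numbers that mix a SQL keyword with string building."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if SQL_KEYWORD.search(line) and RISKY_BUILD.search(line):
            findings.append(lineno)
    return findings

sample = (
    "query = \"SELECT * FROM users WHERE name = '\" + name + \"'\"\n"
    "cursor.execute(\"SELECT * FROM users WHERE name = ?\", (name,))\n"
)
print(scan_for_sql_injection(sample))  # [1]
```

The parameterized query on the second line passes because the user's input never becomes part of the SQL text itself.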

A different approach is to test the software by sending it a barrage of test data that's been slightly tweaked. The process is often called "dynamic fuzz testing," a reference to the idea that the test data is generated by adding a slight amount of randomness or "fuzz" to the process. In practice, the mechanisms also use wildly different data in the hope of generating the crashes or failures that will reveal weaknesses or security holes. If this is done enough times, the random changes may align in just the right way to trigger a flaw.

"Dynamic fuzz testing complements source analysis and adds another layer of defense against unknown vulnerabilities," says Mikko Varpiola, one of the founders of Codenomicon, a company that started to commercialize the process. In April 2015, the company was purchased by Synopsys, which also owns Coverity.

Varpiola says the fuzz testing is an ideal complement to the static analysis. "It can be deployed without access to source code," he says. "Although when combined with source code and appropriate developer tooling, located defects can be quickly and effectively remedied."

Catching complexity is key

Some of the automated tools do more than look for obvious errors. They also track the statistical structure of the underlying code and collect measurements, such as the number of variables per function or the depth of the function tree. These data points don't reveal anything concrete in and of themselves, but they can be good indications of burgeoning complexity, and complexity is often a precursor to failure.

If the code is complex, the development team may not understand the problem well or it may not be planning sufficiently. There are many reasons why the code might be growing unwieldy, and many of them may be dangerous. It often makes sense for a manager to track these values and take a second look at code with growing complexity.
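One measurement a manager might track is cyclomatic complexity, roughly the number of independent paths through a function. A back-of-the-envelope Python approximation; real tools are more nuanced, and the list of branch nodes here is a simplification:

```python
# Rough complexity metric: approximate cyclomatic complexity per function
# as 1 + the number of branch points found in its AST.
import ast

BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp)

def complexity(source):
    """Return {function_name: score} as a rough fragility signal."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            branches = sum(isinstance(n, BRANCH_NODES) for n in ast.walk(node))
            scores[node.name] = 1 + branches
    return scores

sample = """
def simple(x):
    return x + 1

def tangled(x):
    if x > 0:
        for i in range(x):
            if i % 2 and i % 3:
                x += i
    return x
"""
print(complexity(sample))  # {'simple': 1, 'tangled': 5}
```

Tracked over time, a steadily climbing score on one function is the kind of trend that should prompt the second look described above.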

Wayne Ariola, the chief strategy officer at Parasoft, says his company's tools are building statistical profiles of the code base to help team leaders recognize when code is starting to become more fragile. "Flushing out defects early is a no-brainer," he says. "We need to understand how we're injecting complexity. We need to analyze this to understand the systemic impact."

What happens if the analysis of a critical component shows that the new execution path is more complex? "It could automatically trigger a code review. It could highlight to QA that this particular part should be given much more rigorous scrutiny. It could alter the deployment path for the component," Ariola explains.

The last step of the testing process is deploying the code. This is still a nail-biting process for many organizations because unexpected differences in the test and production environments can lead to rapid and embarrassing crashes. The DevTest pipeline is solving that by making deployment part of the chain.

XebiaLabs, for instance, makes two products, XL Release and XL Deploy, that extend the DevTest process so the code can move smoothly out of testing and into common use. They're designed to automate the release and integrate the process with the development pipeline so that manual errors are minimized. The automation provides a structured and repeatable process that eliminates many of the mistakes that humans can introduce.

The limits of automation

The list of automated solutions is so impressive that it can lead some to wonder whether testing requires humans at all. While some lean startups leverage the automated tools to avoid bringing on a pure testing team, they usually find that the tools are far from perfect. The tools can catch many important flaws, but they can't spot them all. It's well-known in computer science that code analysis of any kind can't spot 100 percent of the errors (see Rice's theorem). The list of mistakes that are flagged by the tools is a good place to begin, but it's never complete.

Smart organizations see these tools as a way of amplifying the abilities of the human testers. The testers understand the code and craft the collection of use tests and performance tests for the application. They're able to understand the limitations of the automated solutions and create new tests to cover gaps.

This shift is also starting to change the job description of the testing team. Instead of pushing buttons and clicking links all day, they organize the automated tests and customize new ones. Sometimes they work part of the day as developers and then spend the other half curating the test suite. While there's always a role for the tester who plays the part of the end user, some of the more sophisticated tests require the tester to understand the underlying architecture and code. They must think like a programmer and then imagine how the code might fail.

Lisa Wells, VP of product marketing at XebiaLabs, says the new generation of tools can help bring the team together and end the split between developers and testers. Too many companies hire an army of QA testers but don't give them the power to change much. Testing shouldn't be the job, she says, of "a separate group where quality is your job but you can't do anything about it."

"The whole team must commit to quality," she says. And the solution to that is to use a centralized DevTest environment to keep everyone involved.
