New roles, responsibilities redefining performance engineering in the enterprise

Todd DeCapua, Technology leader, speaker & author, CSC

What is performance engineering? It depends on who you ask.

A new survey found different answers depending on the respondent's role and company size. But the consistent theme was that performance engineering is changing and expanding as its importance grows.

Why performance engineering matters

On June 2, 2015, United Airlines grounded all of its planes and offered little explanation. Some people online and in the media speculated that a terrorist attack was the cause, but the answer turned out to be something simpler.

Computers stopped delivering accurate information to the dispatchers, and the only safe solution was to stop planes from taking off. While the delay lasted only a few hours, so many people were affected that some estimates put the failure's cost at over $4 million.

This isn't the only problem United has had. In February 2014, the airline's check-in computers failed at three major hubs, and in January 2014, it canceled about 1,500 flights because of another malfunction. United's problems are actually quite common. Southwest Airlines' computers failed when the company rolled out a big web sale in June 2015, and for several days passengers had trouble buying tickets.

Stories like these are on the rise in every industry, and the damage to a business's reputation and bottom line can be substantial. When United grounded its planes, the direct costs were easy to calculate by tallying the costs of taking care of the passengers and rebooking them.

But that doesn't capture the indirect costs for the passengers, their business associates or families, or the downstream impact those delayed flights had throughout the system. When Southwest's site failed, how many customers decided to book a flight with another airline?

While we know something about the costs to the airlines themselves, the customers' troubles are impossible to measure. We simply understand the frustration and hassle involved when a plane doesn't arrive on time.

As more businesses consider failures like these, they're recognizing that they need to do something. They're restructuring their teams and redefining jobs so that some members focus on ensuring that the essential computer infrastructure and applications deliver good, stable performance at all times. They're embracing performance engineering practices and treating them as critical, adopting an organizational culture that supports this transformation, and rewarding individuals for their contributions.

Keep in mind that "performance engineering" doesn't refer only to a specific job, such as a performance engineer. More generally, it refers to the set of skills and practices that are gradually being understood across organizations, focused on achieving higher levels of performance in technology, in the business, and for end users.

Performance engineers have many responsibilities

Today, there are enough teams devoted to performance engineering that we can begin to understand just how far businesses are elevating these practices. Hewlett Packard Enterprise commissioned YouGov, an independent research firm, to conduct a survey of 400 IT professionals with a variety of titles: 50 percent were performance engineers or performance testers, 25 percent were application development managers, and 25 percent were IT operations managers. Conducted in April 2015, the survey only included companies with at least 500 employees; some firms that participated have more than 10,000 employees.

Survey questions were designed to reveal how companies are defining what's most important within the performance engineering domain. This provides a glimpse into how businesses have adopted these practices, as well as the value of these practices from technology, business, and end-user perspectives.

While the answers are still evolving as companies build their teams and adopt a performance engineering culture, they show a fair amount of agreement on the core practices. At the same time, the answers reveal differences in what tasks to include under the performance engineering umbrella.

For example, 70 percent of respondents agreed that "designing post-deployment performance testing" is a critical task. Sixty-seven percent agreed that "performance tuning" is also important. In other words, performance engineering must include the monitoring of deployed software to ensure that it's working smoothly and consistently. All of the top five responses involved some kind of monitoring of website performance.

Performance engineer responsibilities

Certain post-deployment roles were infrequently cited. "Working with a disaster recovery team," for example, was least likely to appear on the list of responsibilities. All the top 11 responsibilities were cited by at least 53 percent of respondents, indicating strong agreement about what the job should entail. The complete list can be found in the figure above.

Views of performance engineering differ widely

Many of those surveyed supported adopting performance engineering practices throughout the product lifecycle. For example, 58 percent of respondents included "perform design inspections" and "perform code inspections" in the list of performance engineer roles. A slightly smaller percentage (55 percent) believe that performance engineering should be practiced at the very beginning of a project, and "develop design guidelines" was noted as a responsibility.

Answers to a number of the questions indicated that performance engineering should start well before deployment. Slightly more than half of the respondents in each category embraced the idea that performance engineering should be part of the planning and building of applications.

Not all managers of application development and IT operations ranked the roles and responsibilities of performance engineering in the same order. Respondents who listed themselves as application development and IT operations managers selected "post-deployment testing" for the top of the list. A full 79 percent chose it as an essential part of the job. Second on the list was "monitoring system performance post-deployment."

Performance engineer responsibilities

But those who identified themselves as "performance engineers" or "performance testers" had a slightly different view. Only 50 percent listed "monitoring system performance post-deployment" as a chief responsibility of the job. They ranked roles from the planning stages well above monitoring: design inspection and code inspection were both chosen ahead of monitoring the running application. Performance engineers and performance testers saw themselves more as partners in the entire process, not just monitors watching throughput after deployment.

Despite these differences, they still ranked "performance testing and tuning" at the top of the list. Their answers don't reveal a deep disagreement as much as a recognition that the job begins well before deployment. Performance engineers and performance testers understand that they can have a bigger role if they're part of the project from the start.

The parity across the top 11 roles and responsibilities and across all three stakeholder groups (performance engineers/performance testers, application development managers, and IT operations managers) is significant. This further supports a shared vision for how performance engineering is needed prior to deployment.

Performance engineering spans the development process

The survey responses show how companies are creating a process for performance engineering that embeds engineers and practices at every stage of development—from initial sketching of the service or product, to production deployment, and into the next iteration.

The aggregated answers below show a general agreement, in the following sequence:

  1. Performance engineering needs to start from the beginning of the development cycle. It helps create models that define successful deployment by measuring the response times for services.
  2. From the start, the performance engineering effort must be coordinated with designers and offer inspections to flag any potential performance issues.
  3. When code is being designed and built, performance engineering practices should move beyond mere inspections. They should include a suite of tests that focus on performance, starting with a single user, then move to performance under load, and then other resiliency and capacity scenarios as needed.
  4. During deployment, continued monitoring ensures that performance meets (or exceeds) expectations; this is a validation of everything observed in pre-production and can be used as continuous feedback for the next iteration.
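The test progression in step 3 (a single user first, then performance under load) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `handle_request` is a hypothetical stand-in that merely sleeps for about 5 ms, and the user counts and the 0.5-second target are arbitrary. A real effort would call the actual service and would likely rely on a dedicated load-testing tool.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request() -> None:
    """Hypothetical stand-in for a real service call (assumption: ~5 ms of work)."""
    time.sleep(0.005)


def timed_call() -> float:
    """Return the wall-clock latency of one request, in seconds."""
    start = time.perf_counter()
    handle_request()
    return time.perf_counter() - start


def run_scenario(users: int, requests_per_user: int) -> dict:
    """Issue requests from `users` concurrent workers and summarize the latencies."""
    total = users * requests_per_user
    with ThreadPoolExecutor(max_workers=users) as pool:
        latencies = list(pool.map(lambda _: timed_call(), range(total)))
    return {
        "users": users,
        "requests": len(latencies),
        "median_s": statistics.median(latencies),
        "max_s": max(latencies),
    }


def meets_target(summary: dict, target_s: float) -> bool:
    """Check the median latency against an (assumed) response-time target."""
    return summary["median_s"] <= target_s


if __name__ == "__main__":
    # Start with a single user, then repeat the same measurement under load.
    for users in (1, 10):
        summary = run_scenario(users, requests_per_user=5)
        verdict = "ok" if meets_target(summary, target_s=0.5) else "too slow"
        print(summary, verdict)
```

Running the same measurement at each user count keeps the single-user baseline and the load result directly comparable, and the same check could be reused against production monitoring data in step 4.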

Adding performance engineering to the start of the process ensures that potential mistakes or failures can be prevented early before substantial resources are wasted on development. It's far worse to deploy a system to production, only to suffer losses from an incident or outage as performance issues are revealed.

Performance engineering involves many roles

As teams continue to evolve, some groups are defining sub-roles within performance engineering, and these jobs are reflected in the survey answers. While many rank "post-deployment testing" at the top of the list, a significant portion of the responses also include other sub-roles:

  1. Design inspector: Examines the structure and architecture of the software with an eye for performance. Identifies bottlenecks when possible. Ensures there's adequate planning for redesign and rescaling as loads shift.
  2. Code inspector: Ensures that code meets the design and includes the proper options for changing the configuration as demands shift with growth. Ensures that code standards are met.
  3. Software deployment reviewer: When major software installations are spread over multiple racks, the software deployment reviewer ensures that the code is deployed carefully and uniformly so all users experience the correct behavior.
  4. Disaster planner: Anticipates the possibility that some or all of the server or network infrastructure will be unworkable due to either human error or natural disasters. Ensures that architectural and design review includes a section devoted to recovering from a disaster.
  5. Capacity planner: The capacity planner should know the size of the business and also be able to plan for changes brought by expansion. Will the software have adequate disk space to store the files of all users? Will the servers be able to juggle all users at peak moments?
  6. Business analyst: Understands the nature of the business and ensures that the software supports it. Are all the requirements of the business being met? Are all regulations being implemented and enforced? Are the needs of marketing and fulfillment implemented?
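The capacity planner's two questions in the list above (disk space for all users, and servers for peak load) come down to rough arithmetic. The sketch below is illustrative only: the function names, the compound-growth model, and the 20 percent headroom default are assumptions, not figures from the survey.

```python
import math


def required_storage_gb(users: int, avg_gb_per_user: float,
                        annual_growth: float, years: int,
                        headroom: float = 0.2) -> float:
    """Estimate storage for a user base growing at a compound annual rate.

    `headroom` adds a safety margin on top of the projection (assumed 20%).
    """
    projected_users = users * (1 + annual_growth) ** years
    return projected_users * avg_gb_per_user * (1 + headroom)


def servers_needed(peak_concurrent_users: int, users_per_server: int) -> int:
    """Round up so that the final, partially filled server is still counted."""
    return math.ceil(peak_concurrent_users / users_per_server)


if __name__ == "__main__":
    # Hypothetical inputs: 1,000 peak users, 300 users per server.
    print(servers_needed(1000, 300))  # 4
    print(required_storage_gb(1000, 2.0, annual_growth=0.25, years=3))
```

The per-server capacity fed into such a calculation would normally come from load-test results rather than a guess.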

Not every company embraced all of these roles, but all roles were listed by at least 53 percent of respondents. For example, 58 percent said that "performing design inspections" was a core function of the performance engineer role. Fifty-five percent said that "performance modeling" was also a core function that should be done during design, not after deployment.

The answers also showed a difference of opinion between managers and those who identified as performance engineers or testers. When the roles were ranked according to the percentages indicated, the application development managers said that "monitoring system performance post-deployment" was second on the list.

Those in the performance engineering or performance tester roles ranked it eighth, well below many jobs that begin before deployment. They gave a higher rank to work done during the design phase, such as code inspections and design inspections. In other words, people in these roles felt it was important for them to be involved from the beginning. The full list, shown in the figure above and in the complete report, makes clear just how differently these two groups view the job.

Performance engineering can't ignore security

Many teams differ on how to handle security. On one hand, it's seen as a different role, one that can be separate from the application development and IT operations process. Of course, we might ask how security affects the end-to-end performance of systems for the end user, how users are reviewing and optimizing security through the lifecycle, and whether this is something the "security team" actually worries about, or if the organization just needs to live with the results.

But the teams also recognized that they can't ignore security. Seventy-eight percent of the IT Ops managers and 76 percent of the application development managers said that security was an area that teams must understand in order to do their jobs well. This was the highest value given to any of the areas listed in this question. Security out-polled other areas, including agile development, big data analysis, and mobile products, which emphasizes its relative importance.

Performance engineering importance

Growing significance of performance engineering roles

Some might be tempted to see the differences in numbers as a fault line or an indication that there's great disagreement, but the different numbers are really an indication that performance-engineering-related tasks are changing and expanding.

There's a clear consensus that performance engineering is essential for modern organizations. They understand that it's more than babysitting the work of the programmers after deployment in production. Performance engineering includes substantial planning and modeling before development even begins. From another perspective, performance engineering provides businesses with a crucial, high-level view of the system that helps programmers, who are often caught up in the details of data structures or output formats.

In all cases, performance engineering is taking a bigger seat at the application delivery table as businesses recognize that poor performance can be costly—even deadly—to a company, and end users don't stick around long after a disaster. When a computer failure can ground an airline or shut down a warehouse, businesses need to understand: the only way to manage a large system is by devoting a team and adopting a culture to ensure that those kinds of failures don't occur.
