Why cloud apps don't perform—and what you can do about it

Performance is an issue for any application deployed to the cloud, but it's especially so for those you have migrated via "lift and shift." Along the way, your developers may have missed a few things that needed to be addressed. Here's how to loop back and take care of those issues.

To optimize the performance of your applications in a cloud-based system, you need to take a multidimensional approach. Think about the design and deployment of the applications themselves, the size and configuration of the cloud resource within which they're running, and your operations and monitoring procedures and tools. 

Many applications that run in the cloud today don't leverage cloud-native features, and they lack the monitoring and management tools you need to gain visibility into performance. Other issues that plague app performance in the cloud include poor architecture and badly sized cloud machines, to name just a few.

But here are the biggest issues—the top three causes of poor application performance in the cloud—and what you can do about them.

Your app wasn't built for the cloud

The most common reason for poor performance is that your application was not purpose-built for a cloud-computing platform. Put simply, performance on the cloud was never engineered into the application.

By analogy, let's say your car does not accelerate well. You take it into the shop hoping that the mechanic can figure out what's up. It turns out that your engine is fine; the car is just built to accelerate slowly. 

You have a few choices. You could get a new car, but that's a huge investment. You could spend less by dropping a new engine into the car and upgrading the transmission. Or you could just live with it and give yourself plenty of time to accelerate into traffic. 

The lesson for cloud-computing applications is pretty much the same. These are your options if an application does not perform up to expectations (slow user interface response time, slow data refresh, etc.):

  • You could rewrite the application to leverage cloud-native features, which should provide better performance. 
  • You could perform small tweaks to, for example, provide a better approach to storage I/O. 
  • You could just live with suboptimal performance until such time as the organization is ready to fix the problem.

How to fix the problem

The problem with application performance issues on the cloud is that most people are quick to blame the cloud itself. Sometimes that's the cause. But when the bottleneck lies in the application's own design, adding more resources or increasing the power of those resources may not have much, if any, effect.

For answers, analyze the way the application was designed and developed. A fix probably means some redesign, rewrite, and redeployment work that will add risk and cost. 

To diagnose the problem, isolate the application and profile the performance. Then test the application using automated testing tools that simulate load. Gather metrics from the test simulation to determine where the problem exists and the likely ways you can fix it. 
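As a rough illustration of that load-and-measure loop, here is a minimal Python sketch. It is not a substitute for a real load-testing tool: `handle_request` is a hypothetical stand-in for one call into your application, and the request count and concurrency are arbitrary numbers chosen for the example.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(payload: int) -> int:
    """Hypothetical stand-in for one application request."""
    time.sleep(0.001)  # pretend to wait on I/O
    return payload * 2


def timed_call(payload: int) -> float:
    """Return the wall-clock latency of one request, in seconds."""
    start = time.perf_counter()
    handle_request(payload)
    return time.perf_counter() - start


def run_load_test(requests: int, concurrency: int) -> dict:
    """Issue `requests` calls from `concurrency` workers and summarize latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed_call, range(requests)))
    # 99 cut points; index 49 is the median, index 94 the 95th percentile.
    cuts = statistics.quantiles(latencies, n=100, method="inclusive")
    return {
        "count": len(latencies),
        "p50": cuts[49],
        "p95": cuts[94],
        "max": latencies[-1],
    }


if __name__ == "__main__":
    print(run_load_test(requests=200, concurrency=10))
```

Run it at several concurrency levels and compare the results. A 95th percentile that climbs much faster than the median as load rises points to contention inside the application, not just a slow code path.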

If that's not an option for your team, application monitoring works as well. Application monitoring tools gather performance data over time. In fact, this could be a better option, considering that it can provide a more realistic view of how the application is used by real users doing real work. 

Your app has been sandbagged by poor database performance

Sometimes the application itself may perform well but the database does not. Again, you determine the cause by testing the application, focusing this time on database response times. 

One way to do this is via a white-box approach, meaning that you examine the response times of all subsystems of the application—including the database. You can simulate application-level access to the database that makes database requests and then look at the response time coming back. 
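To make the white-box idea concrete, the sketch below times each subsystem of a request separately. Everything in it is an assumption made for illustration: the `orders` table, the `build_report` handler, and the in-memory SQLite database that stands in for whatever database you actually run.

```python
import sqlite3
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(list)  # subsystem name -> list of elapsed seconds


@contextmanager
def timed(subsystem: str):
    """Record how long the enclosed block takes under `subsystem`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[subsystem].append(time.perf_counter() - start)


def build_report(conn: sqlite3.Connection, customer_id: int) -> str:
    """Hypothetical request handler with a database step and a render step."""
    with timed("database"):
        rows = conn.execute(
            "SELECT amount FROM orders WHERE customer_id = ?", (customer_id,)
        ).fetchall()
    with timed("render"):
        total = sum(amount for (amount,) in rows)
        return f"customer {customer_id}: {len(rows)} orders, total {total}"


conn = sqlite3.connect(":memory:")  # stand-in for the real database
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(i % 10, i) for i in range(1000)]
)

for customer in range(10):
    build_report(conn, customer)

for subsystem, samples in timings.items():
    mean = sum(samples) / len(samples)
    print(f"{subsystem}: {len(samples)} calls, mean {mean:.6f}s")
```

If the "database" bucket dominates the total, you know where to dig next. If it doesn't, the database is probably not your problem.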

Or you can do black-box testing, where you examine the overall behavior of the application without looking at the individual subsystems, such as the database. 

A better approach might be operational monitoring over time, which lets you gather database performance data in much the same way you gather performance data for applications. Database monitoring data also tends to be easier to compare from one period to the next, because databases behave more consistently than applications do.

You can find tools and approaches to monitor databases such as MongoDB, AWS DynamoDB, Oracle, and so on, with connectors that are prebuilt to connect to the correct points of monitoring for your database. 

How to fix the problem

The fixes for performance issues with databases are much as you might expect. They include changing "tune-ables"—parameters that change the configuration of cache size, bucket size, I/O optimization, etc. 

If none of those work, increasing the resources that the database needs, such as CPU and storage, is a good option for performance engineering. As a last resort, you might need to examine how the application deals with the database. 

In some instances, you'll need a new database. Your current database might be purpose-built for other uses, and the way you use the specific database might not be the right fit. 

For instance, you don't want to use a transactional database for something that's analytics-oriented. Enterprises often look to a single database to support many types of applications. That's possible in some instances, but not always optimal. 

Your cloud services aren't up to the task

People often blame the cloud for poor performance when other issues are at fault. Culprits might include the application, the database, or the cloud-computing service itself. But again, the typical root issue is user error caused by incorrect use of the cloud-computing service. 

Just as you can pick the wrong database service in the cloud, you can use cloud services incorrectly. For example, you might use the wrong type of storage systems, such as file or block storage, when object storage will provide the best performance for your cloud-based applications. 

Other issues arise around applications that are CPU-limited. Here the problem is usually that you didn't pick the right platform, CPU, and memory configuration for the application.

Common mistakes include choosing a high-end CPU for a machine instance in the cloud but not correctly sizing the memory. When your application becomes memory-limited, it begins to swap to disk, and performance goes off a cliff. 
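A simple way to tell a CPU-limited instance from a memory-limited one is to look at utilization samples side by side. The sketch below assumes you can export CPU, memory, and swap figures from your monitoring tool. The `Sample` fields, the 85% "busy" threshold, and the cutoffs are all illustrative choices, not provider recommendations.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Sample:
    cpu_pct: float     # CPU utilization, 0-100
    mem_pct: float     # memory utilization, 0-100
    swap_in_mb: float  # data swapped in from disk during the interval


def diagnose(samples: Sequence[Sample], busy_pct: float = 85.0) -> str:
    """Guess which resource limits the instance; thresholds are illustrative."""
    n = len(samples)
    cpu_busy = sum(s.cpu_pct >= busy_pct for s in samples) / n
    mem_busy = sum(s.mem_pct >= busy_pct for s in samples) / n
    swapping = sum(s.swap_in_mb > 0 for s in samples) / n
    # Check memory first: swapping hurts far more than a busy CPU does.
    if swapping > 0.25 or mem_busy > 0.5:
        return "memory-limited: consider an instance type with more RAM"
    if cpu_busy > 0.5:
        return "cpu-limited: consider more or faster vCPUs"
    return "no obvious CPU or memory limit: look at I/O, network, or the database"


# Made-up samples for a fast CPU paired with too little memory.
fast_cpu_small_ram = [
    Sample(35, 92, 40),
    Sample(40, 95, 55),
    Sample(30, 90, 0),
    Sample(38, 96, 70),
]
print(diagnose(fast_cpu_small_ram))
```

With these made-up numbers the CPU sits mostly idle while memory is full and the instance is swapping, which is exactly the high-end-CPU, undersized-memory mistake described above.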

Diagnosing the problem is often straightforward, because the cloud service usually warns you when a configuration is holding you back. A better approach, again, is cloud operational monitoring, using either the native monitoring tools offered by your cloud provider or a third-party monitoring tool that connects to the cloud's monitoring API.

Analyze data you've gathered over time to spot trends that indicate problems that could lead to performance issues. Items to monitor include the network, I/O, database performance, CPU utilization, memory utilization, and so on.
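Trend spotting can start small. The sketch below fits a straight line to one daily metric and estimates how long until it crosses a limit. The function names are hypothetical, the sample values in the usage note are invented, and a straight-line fit is a deliberately crude assumption that real workloads often violate.

```python
def trend_per_day(values):
    """Least-squares slope of evenly spaced daily values (units per day).

    Needs at least two values.
    """
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def days_until(values, limit):
    """Days until the latest value reaches `limit` at the fitted slope.

    Returns None when the metric is flat or falling.
    """
    slope = trend_per_day(values)
    if slope <= 0:
        return None
    return max(0.0, (limit - values[-1]) / slope)
```

For example, feeding it a week of daily memory-utilization readings such as `[52, 54, 55, 57, 60, 61, 63]` with a limit of 90 gives a rough count of days before the instance runs short of memory, which is enough to schedule a resize before users notice.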

How to fix the problem

Keep in mind that some issues are harder to spot than others. For instance, applications may perform well on Mondays but terribly on Fridays. While it might seem as if your cloud instances are haunted, if you run monitoring tools and gather performance data over time, you'll discover the solutions to such hard-to-diagnose problems.

The case of the haunted application that performed differently on Monday versus Friday actually happened. After gathering data over a two-week period and analyzing it, the organization concluded that end-of-week processing by a separate application was saturating the company network on Fridays. So it placed that application on a separate network segment.
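The analysis behind a finding like that can be as simple as grouping latency by day of the week. Here is a minimal sketch; the two weeks of sample data are invented for illustration and are not the organization's actual measurements, and the 2x threshold is an arbitrary assumption.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import median

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
             "Saturday", "Sunday"]


def median_by_weekday(samples):
    """Map weekday name -> median latency for (date, latency_ms) samples."""
    buckets = defaultdict(list)
    for day, latency_ms in samples:
        buckets[DAY_NAMES[day.weekday()]].append(latency_ms)
    return {name: median(values) for name, values in buckets.items()}


def outlier_weekdays(samples, factor=2.0):
    """Weekdays whose median latency exceeds `factor` times the overall median."""
    overall = median(latency for _, latency in samples)
    by_day = median_by_weekday(samples)
    return sorted(name for name, value in by_day.items() if value > factor * overall)


# Invented data: two working weeks starting Monday, January 1, 2024.
start = date(2024, 1, 1)
samples = [
    (start + timedelta(days=7 * week + offset), 900 if offset == 4 else 120 + offset)
    for week in range(2)
    for offset in range(5)
]
print(outlier_weekdays(samples))
```

A grouping like this only tells you when the slowdown happens. You still need network, I/O, and database metrics for the same window to learn why.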

But the issue also could have been a database cache filling up at the end of the week, a load increase on Fridays for some reason, or security issues (such as attempted attacks) that drove down application performance because the security system was in defensive mode. Monitoring tools can help pinpoint the diagnosis.

Be proactive with your monitoring tools

Of course, the cloud-based application performance issues you encounter could be different from those listed above. I named just the three issues you're most likely to see. Regardless, the best defense is to be proactive with monitoring and management tools so you can spot and correct issues before they become serious.

Best of all, follow sound architectural practices on an ongoing basis. This includes correctly designing and building your applications, the databases they leverage, and all points in between. And don't forget about correctly configuring the cloud platforms and making sure to add performance testing to the DevOps processes. 

In other words: An ounce of prevention is worth a pound of cure. 
