Ramp up your app testing with mobile analytics

Deploy an app without testing? You might as well just delete all your code and give up.

According to a survey by Dimensional Research, 61 percent of users expect mobile apps to start within four seconds, while 49 percent want responses to inputs within two seconds. If an app crashes, freezes, or has errors, 53 percent of users will uninstall it.

Users have high expectations, and they won't hesitate to delete your app if they're disappointed. So, of course you test. But many developers are missing one of the most powerful tools to enhance their testing: mobile analytics.

According to Julian Harty—co-author of the ebook The Mobile Analytics Playbook: A practical guide to better testing and the consultant once responsible for testing Google's mobile apps in Europe—the use of mobile analytics in testing is "novel." Up until now, analytic data was seen as a tool for business purposes and for checking metrics such as revenue, in-app sales, or app popularity. It's not surprising that testers and developers can find a use for app analytics; what's surprising is that such data hasn't already been employed.


Guessing what users will do

"Let's assume that you are creating a mobile app," Harty says. "It's very hard to imagine how people would use your software when you're developing it or testing it." Whether the company is small or a global giant, even with a team of developers you can only imagine app use within the limitations of your own background, expectations, education, cultural orientation, and assumptions.

The number of variations of context for an app is vast. People use different devices across a variety of carrier networks, bring different personalities and specific needs to the app, and create a massive number of device, operating system version, and app collection interactions. Prior to having thousands, or millions, of people uncovering all these variations, it's impossible to fully identify the multidimensional array of factors that affect user perception.

Harty mentions working with a company that had a mobile app and customers in more than 40 countries. One of the testers came across data showing a high concentration of users in Paris. "We weren't testing the software in French," Harty says. "We had no clue, really, whether [customers in France] were getting a good experience or not."

Limitations of app store feedback

When technical personnel don't have access to the types of analytics data that could help their efforts, there is much they can never know for testing and design. Although monitoring user feedback on app stores is useful and important, the data is insufficient because only three types of users typically write anything to you, Harty says: "The people who are pissed with you, the people who are loquacious, and the people who love you to bits." He thinks that feedback generally comes from less than 5 percent of customers, and often from a subset of 1 percent or even much less.

But take information from mobile analytics, and suddenly you gain a different picture. Even with blacked-out periods—such as temporarily losing a wireless connection, or apps that first ask permission to share data—developers might expect a user sampling of 90 percent or more, Harty estimates, based on his broad experience in app testing.

Although Harty and co-author Antoine Aymer identify nine different ways that app quality can be measured in The Mobile Analytics Playbook, Harty says three of them are easier to diagnose through mobile analytics: usability, performance, and reliability. Consider each of these in turn.

Usability

Usability focuses on how people become proficient with an app. One aspect is learnability, or the speed at which someone learns how to use the software. The second aspect, correctness, focuses on whether someone can use the app to take the right actions and obtain the right answers.

Analytics can provide clues to both parts of usability. For learnability, developers can track how quickly a person comes to use features over time, pinpointing when they have mastered the software. Determining correctness is more difficult, because developers must know not only that a person used the software with ease, but that they attained the correct result. Inferring correctness is a subtle problem, perhaps revealed by observing how many times a person repeats the same action in a row, or by seeing whether they perform actions correctly in some known context, such as a tutorial or demo function.
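The learnability idea above can be sketched in a few lines. This is a minimal illustration, not a real analytics pipeline: the event schema (user, session number, seconds to complete a task) and the proficiency threshold are assumptions for the example.

```python
from collections import defaultdict

# Hypothetical event records: (user_id, session_index, seconds_to_complete_task).
# The schema is an assumption -- real analytics SDKs emit richer events.
EVENTS = [
    ("u1", 1, 42.0), ("u1", 2, 30.0), ("u1", 3, 12.0), ("u1", 4, 11.0),
    ("u2", 1, 55.0), ("u2", 2, 50.0), ("u2", 3, 48.0),
]

def sessions_to_proficiency(events, threshold_seconds=15.0):
    """Return, per user, the first session in which the task took less than
    threshold_seconds -- a crude proxy for "has learned the feature".
    None means the user never got there."""
    by_user = defaultdict(list)
    for user, session, seconds in events:
        by_user[user].append((session, seconds))
    result = {}
    for user, sessions in by_user.items():
        result[user] = None
        for session, seconds in sorted(sessions):
            if seconds < threshold_seconds:
                result[user] = session
                break
    return result

print(sessions_to_proficiency(EVENTS))  # u1 crosses the threshold at session 3; u2 never does
```

A user who never crosses the threshold, like u2 here, is a candidate signal that a feature is hard to learn, which is exactly the kind of question app-store reviews rarely answer.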

Performance

Perceived performance is one of the factors that, when gone awry, will most quickly drive users away. "People generally look at time taken, not something [more obscure] like memory usage," Harty says. Tracking how long particular combinations of screen actions take, or the time lag in obtaining data from a cloud-based service, can give insight into performance.
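As a rough sketch of the "time taken" measurement Harty describes, the timings for one screen flow can be summarized by their median (the typical experience) and a tail percentile (the slow sessions that drive users away). The sample values and flow name are made up for illustration.

```python
import statistics

# Hypothetical timing samples (milliseconds) for one screen flow,
# e.g. "tap search -> results rendered". The numbers are illustrative.
SAMPLES_MS = [180, 210, 195, 2300, 205, 190, 220, 185, 600, 200]

def timing_summary(samples):
    """Median plus 95th percentile: the median shows typical experience,
    the tail exposes the slow sessions a simple average would hide."""
    ordered = sorted(samples)
    p95_index = max(0, int(round(0.95 * len(ordered))) - 1)
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
    }

print(timing_summary(SAMPLES_MS))
```

Note how a single 2.3-second outlier barely moves the median but dominates the 95th percentile; tracking both prevents a handful of bad sessions from hiding in an average.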

Reliability

There are two main ways to measure reliability. One is the probability of failure on demand. That factor, often called PFoD or PFD, "measures whether a system works correctly when asked to do so," Harty says. "It is measured as a ratio between the number of times a request was done and the number of times it was completed without error." The other way is mean time between failures, or MTBF, which refers to the average time between incidents of any system or API crashing. Reliability specifically ignores normal exception handling, such as reminding a user of a required input before moving to the next step of an action.
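Both measures are straightforward to compute once analytics events capture each demand and its outcome. The sketch below assumes a simple log of (timestamp, succeeded) pairs; PFoD is the fraction of demands that failed, and MTBF averages the gaps between failure timestamps.

```python
# Hypothetical demand log: (timestamp_seconds, succeeded).
DEMANDS = [
    (0, True), (60, True), (120, False), (180, True),
    (240, True), (300, False), (360, True), (420, True),
]

def probability_of_failure_on_demand(demands):
    """PFoD: fraction of requests that did not complete without error."""
    failures = sum(1 for _, ok in demands if not ok)
    return failures / len(demands)

def mean_time_between_failures(demands):
    """MTBF: average elapsed time between successive failures."""
    failure_times = [t for t, ok in demands if not ok]
    if len(failure_times) < 2:
        return None  # need at least two failures to measure a gap
    gaps = [later - earlier for earlier, later in zip(failure_times, failure_times[1:])]
    return sum(gaps) / len(gaps)

print(probability_of_failure_on_demand(DEMANDS))  # 0.25
print(mean_time_between_failures(DEMANDS))        # 180.0
```

In practice the "failure" events would exclude normal exception handling, as noted above, and would be tagged with device and OS details so reliability can be sliced by context.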

"Mobile apps have to survive in more hostile conditions and a greater variety of conditions than other applications," Harty says. "Developers who don't consider these [reliability] aspects don't write good-quality apps and probably don't write the apps to collect analytics data."

Getting to a baseline

Harty offers some examples of how usability, performance, and reliability analytics data can aid testing. The more details on the types of devices people use, the other apps on the device, the carrier of choice, and usage patterns, the more intelligently testing can occur. If 90 percent of customers use one of five devices, perhaps testing can be narrowed, speeding time to delivery of new versions.
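The device-narrowing idea reduces to a small greedy calculation: take devices in order of popularity until the target share is covered. The device names and market shares below are invented for the sketch.

```python
# Hypothetical device-share data from analytics; the figures are made up.
DEVICE_SHARE = {
    "Phone A": 0.38, "Phone B": 0.24, "Phone C": 0.15,
    "Phone D": 0.09, "Phone E": 0.05, "long tail": 0.09,
}

def devices_for_coverage(share, target=0.90):
    """Smallest set of most-popular devices whose combined share reaches
    the target -- a candidate physical-test matrix."""
    chosen, total = [], 0.0
    for device, pct in sorted(share.items(), key=lambda kv: -kv[1]):
        if device == "long tail":
            continue  # an aggregate bucket, not a testable device
        chosen.append(device)
        total += pct
        if total >= target:
            break
    return chosen, total

print(devices_for_coverage(DEVICE_SHARE))
```

With these made-up numbers, five devices cover 91 percent of users, so the physical-device matrix can shrink to five handsets while emulators or crowdtesting mop up the long tail.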

"Maybe on an e-commerce app the modal [or most commonly found] number of items [in the shopping basket] is three," Harty says. Obviously you need to test with three items in the basket. But maybe 15 percent of users have fifteen items, so you would want to test the cart with that number as well, plus something a little larger than the largest number of items people put into their shopping cart, "just to see if anything breaks or not."
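Those analytics-informed cart sizes translate naturally into a small parameterized test: the modal size, the common large size, and one past the observed maximum. The checkout function and the observed maximum here are stand-ins, not a real implementation.

```python
def checkout_total(prices):
    """Stand-in for the real checkout path under test."""
    if len(prices) > 100:
        raise ValueError("cart too large")
    return round(sum(prices), 2)

OBSERVED_MAX_ITEMS = 60  # hypothetically, the largest cart seen in analytics

def test_cart_sizes():
    # 3 = modal size, 15 = common large size, max + 1 probes past observed use.
    for n in (3, 15, OBSERVED_MAX_ITEMS + 1):
        total = checkout_total([9.99] * n)
        assert total == round(9.99 * n, 2), f"wrong total for {n} items"

test_cart_sizes()
print("cart-size tests passed")
```

The point is that the parameter list comes from observed behavior rather than a tester's guess, which is exactly the shift Harty is describing.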

Once there is a baseline of such information, ongoing analytics can be used like an alarm clock. "If it changes, wake me up again," Harty says. "If not, I'll assume things are fine."
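The alarm-clock pattern can be sketched as a drift check against the baseline: stay quiet while tracked metrics remain near their baseline values, and flag only the ones that move. The metric names, values, and tolerance are illustrative.

```python
def baseline_alert(baseline, current, tolerance=0.05):
    """"Wake me up" only when a tracked metric drifts more than
    `tolerance` (as a fraction) from its baseline value."""
    alerts = []
    for metric, base in baseline.items():
        now = current.get(metric, 0.0)
        if base and abs(now - base) / base > tolerance:
            alerts.append(metric)
    return alerts

# Hypothetical baseline and latest readings.
BASELINE = {"modal_cart_size": 3.0, "crash_free_rate": 0.995}
CURRENT = {"modal_cart_size": 3.1, "crash_free_rate": 0.93}

print(baseline_alert(BASELINE, CURRENT))  # ['crash_free_rate']
```

Here the cart size wobbles within tolerance and stays silent, while the crash-free rate has slipped enough to wake someone up.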

Analytics can provide a powerful boost to testing and software quality. To learn more about integrating analytics and testing, click below to download The Mobile Analytics Playbook.

