The state of predictive analytics in IT Ops: What to expect in 2019

Christopher Null, Freelance writer, Null Media

Of all the AI applications at the disposal of IT Ops, perhaps none has affected daily operations as much as predictive analytics. Its ability to make predictions based on analysis of historical data and behaviors has allowed enterprises to proactively tackle problems with a speed that wasn’t even imaginable just a decade ago. And predictive analytics is only going to get more sophisticated in the next couple of years.

Here are the predictive analytics capabilities currently in play in IT operations management, how to leverage them to achieve the greatest accuracy, and how the field will mature in 2019.

The current state of predictive analytics in IT ops

Enterprises are currently using predictive analytics to get ahead of a wide range of IT operational issues. Sriram Parthasarathy, senior director of product architecture and predictive analytics at Logi Analytics, said these uses fall into three areas:

  • Predicting application or network downtime so proper mitigating actions can be taken for mission-critical applications.
  • Monitoring application health, such as identifying a burst in traffic or sudden degradation in the response time for different service APIs.
  • Predicting what resources will be needed, or available, on any given day. “For example, the technology can be used to predict how many calls an IT support center will get and the number of absent staff to anticipate and handle critical calls to maintain SLAs,” Parthasarathy said.

These capabilities have been made possible by a pair of trends that have only recently developed, said Sanjeev Agrawal, president of LeanTaaS, a data aggregation and tool consolidation company.

Over the past decade, tools have been introduced that gather logs, system events, and network events and provide a holistic view of what might be going on when you have an incident such as an outage or security breach, Agrawal said. If there’s a worm spreading on your network, for example, you can use these tools to search for the particular vulnerability associated with the worm across thousands of machines, and identify which machines have been vulnerable or which machines may have been compromised, he said.

“That capability didn’t exist before. With all this historical data aggregated, now vendors are looking to predictive analytics to opportunistically predict an outage before it happens.”
Sanjeev Agrawal

Agrawal added that while these tools are already quite powerful, this is still very much an emerging technology that's evolving.

The perils of false positives

Today, predictive analytics is hardly perfect. The biggest pitfall is the occurrence of false positives, said Gary Brandt, an evangelist at Micro Focus. That's likely to continue to be an issue in 2019.

“The problem has to do with the contextual aspects of the data. You don’t just need data. You need the right data, with the right context.”
Gary Brandt

The idea behind machine learning is to reduce errors in an automated way, Brandt said. The algorithms adjust small aspects of a model and check which variation produces the fewest errors. Once you find a model that works, you apply it to your real-world production set.
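As a rough illustration of that trial-and-error loop, the sketch below tries a range of alert thresholds against labeled history and keeps the one with the fewest errors. Everything in it is an assumption made for illustration: the CPU readings, the outage labels, and the helper names `count_errors` and `best_threshold` are hypothetical, and a real product would tune far more than a single threshold.

```python
# Hypothetical labeled history: (CPU utilization %, did an outage follow?)
history = [
    (35, False), (48, False), (52, False), (61, False), (66, True),
    (70, False), (74, True), (81, True), (88, True), (93, True),
]

def count_errors(threshold, samples):
    """Count false positives plus false negatives for one alert threshold."""
    errors = 0
    for value, outage in samples:
        predicted = value >= threshold
        if predicted != outage:
            errors += 1
    return errors

def best_threshold(candidates, samples):
    """Try each candidate and keep the one with the fewest errors."""
    return min(candidates, key=lambda t: count_errors(t, samples))

# Vary one small aspect of the model (the threshold) in steps of 5.
candidates = range(30, 100, 5)
chosen = best_threshold(candidates, history)
print(chosen, count_errors(chosen, history))
```

Even the best threshold here still misclassifies one reading, which is the kind of residual false positive the rest of this section is about.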

“The biggest challenge in dealing with false positives is having data that’s representative and that has enough context—things from the environment that play a part.”
–Gary Brandt

Part of the challenge of getting lots of quality data is the fact that most IT shops run multiple operations management tools, each of which creates its own silo of data, said Mathi Venkatachalam, vice president at ManageEngine. Most IT teams use separate tools for network monitoring, configuration management, and network orchestration. “If the data generated from these tools can be correlated and analyzed, the accuracy of predicted network outages will be much higher,” Venkatachalam said.
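One simple way to picture that correlation is to line up events from each silo by device and time. The sketch below is only an assumption about how this might look: the device names, timestamps, event text, the 15-minute window, and the `correlate` helper are all hypothetical and do not reflect any vendor's product.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical exports from three separate tools; all values are made up.
monitoring = [("core-sw-1", "2019-01-07 09:02", "packet loss 12%"),
              ("edge-rtr-3", "2019-01-07 11:40", "latency spike")]
config_mgmt = [("core-sw-1", "2019-01-07 08:55", "ACL change pushed")]
orchestration = [("core-sw-1", "2019-01-07 08:58", "VLAN workflow ran"),
                 ("edge-rtr-3", "2019-01-06 22:10", "firmware job ran")]

def parse(ts):
    """Turn a 'YYYY-MM-DD HH:MM' string into a datetime."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M")

def correlate(symptoms, *other_sources, window=timedelta(minutes=15)):
    """For each symptom, list events from the other tools that hit the
    same device within `window` before the symptom appeared."""
    by_device = defaultdict(list)
    for source in other_sources:
        for device, ts, detail in source:
            by_device[device].append((parse(ts), detail))
    result = {}
    for device, ts, detail in symptoms:
        seen = parse(ts)
        related = [d for when, d in sorted(by_device[device])
                   if timedelta(0) <= seen - when <= window]
        result[(device, detail)] = related
    return result

context = correlate(monitoring, config_mgmt, orchestration)
for key, related in context.items():
    print(key, related)
```

In this made-up data, the packet loss on `core-sw-1` picks up two preceding changes as context, while the latency spike on `edge-rtr-3` has none within the window.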

It's not just which datasets you choose to use, though. How you interpret those datasets also matters. Without enough data or enough of the right data, it’s all too easy to introduce bias, setting the stage for untrustworthy outcomes, said Splunk chief technology advocate Andi Mann.

One bias would be that “every week looks the same,” he said. “We all know that a Monday looks different from the rest of the week. If you’re a retailer, the Super Bowl and Black Friday—these things look very different from the rest of the year. So the first bias or false positive is not having enough training data.”

“If you only show a machine-learning solution a small amount of data, it will assume that’s all the data it has to work with, and then you can get false positives out of that.”
Andi Mann
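The “every week looks the same” bias can be shown with a toy anomaly check. The sketch below is a simplified assumption, not how any particular tool works: the request counts are invented, and the three-sigma rule and the `is_anomaly` helper are hypothetical stand-ins for a real model.

```python
from statistics import mean, pstdev

# Hypothetical peak request counts for three weeks, Monday first.
# Mondays run hot; the numbers are made up for illustration.
weeks = [
    [900, 510, 495, 505, 490, 300, 290],
    [920, 500, 505, 498, 502, 310, 295],
    [910, 495, 500, 510, 497, 305, 300],
]

def is_anomaly(value, samples, sigmas=3.0):
    """Flag a value more than `sigmas` standard deviations from the mean."""
    mu, sd = mean(samples), pstdev(samples)
    return abs(value - mu) > sigmas * max(sd, 1.0)

new_monday = 915  # an ordinary Monday by this data's standards

# Too little training data: one Tuesday-to-Friday stretch.
small_sample = weeks[0][1:5]
# More representative: every Monday seen so far.
mondays = [week[0] for week in weeks]

print(is_anomaly(new_monday, small_sample))  # True: a false positive
print(is_anomaly(new_monday, mondays))       # False
```

Trained only on four midweek days, the check treats a normal Monday as an incident; given Mondays to learn from, the same rule stays quiet.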

Garbage in/garbage out

Addressing the garbage-in/garbage-out problem will be critical as predictive analytics capabilities continue to mature. But Parthasarathy believes that part of the problem may be relieved by additional automation tools in the near future.

“Data cleansing and prep is a significant time investment of a predictive analytics project. I think you’ll see a lot of smart automated data prep and cleansing tools that have a built-in context to prepare the data based on the predictive question being asked.”
—Sriram Parthasarathy

Agrawal also sees more automation coming in the form of responses to predicted problems and events. For example, these systems will automatically detect the presence of a worm or other threat on a network and quarantine all vulnerable machines to contain that threat until you can complete a manual review. “Combining predictive capabilities with safe, automated responses to minimize damage is a key trend we’re likely to see in the coming years,” he said.

Mann sees predictive analytics moving beyond just performance and availability, and toward more business-minded prognostication.

“What we're increasingly starting to see now is operations using all sorts of business metrics around customer sign-ups, time on site, even revenue data. We're starting to see IT operations being able to really see what the customer experience is like and being able to use that data to predict what that customer experience will be like. This is a real game-changer.”
—Andi Mann

The industry is still in the early stages of being able to predict negative customer impact—people are far more unpredictable than computer systems—but despite that added complexity, the challenge remains much the same, Mann said.

“Whether it’s solving IT problems or business problems, the more data you have, the more successful you will be.”
—Andi Mann