Predictive analytics in hybrid IT: The future of ops

David Linthicum, Chief Cloud Strategy Officer, Deloitte Consulting

The predictive analytics systems of today and tomorrow will change the way we do operations. We will know how system modifications will affect IT operations, security, and governance risks. We'll also learn how to automate forthcoming complexity in ways that are cheaper and less risky, and we will have the ability to proactively plan for three years into the future.

The growth of complexity across on-premises and public cloud platforms, that is, in hybrid IT, is obvious to everyone at this point. Ops-related predictive analytics means leveraging AI and big data in new, more efficient ways to deal with that increasing complexity.

So, are you in? Most people in IT operations management, including cloud and traditional, see the value of systems that can literally predict the future. Apply that magic to IT Ops, and you have the ability to solve problems before they become known problems, perhaps problems that are never known to humans.

However, the costs of leveraging predictive analytics with ops are going up. This is true even with the use of the public cloud and its ability to leverage newer tooling and data sources. You'll need upgraded skill sets, expensive tooling replacements and upgrades, and, initially, more people at the helm. 

On the plus side, if done correctly, this should lead to a need for fewer ops and CloudOps people, much lower costs once you factor in avoided interruptions and their effect on the business, and the ability to scale enterprise IT, in the cloud or not, at any pace the business needs.

In other words, this has the potential, now and in the future, for a huge ROI. Here's what you need to know about the state of ops-related predictive analytics.

Trends in predictive analytics

Not everything is "puppies and bunnies." Those who implement predictive analytics with ops and CloudOps have cited issues that include dirty or redundant data in their current data sources, a shortage of the skills needed to develop and operate predictive analytics itself, and, worst of all, a lack of data integration within the enterprise, or between the enterprise and the cloud.

So ops-related predictive analytics is actually becoming an ops problem itself. 

While there are signs that the skills shortage is getting better, in 2019 IT Ops continues to struggle with integrations that can access many different databases, applications, and platforms using the same software systems. There are solutions to these problems, but they're traditionally slow and costly, and often they're the core reasons why ops-related predictive analytics are not used. 

On the IT Ops and cloud operations side of things, we saw a few trends last year that we're seeing this year as well. These include:

Increased data heterogeneity

While some predicted that ops-focused analytics would leverage single sources due to consolidation within the cloud, many enterprises have chosen to leave most of their data in the database models it already uses and migrate it to the cloud using database analogs.

For example, an enterprise might move from MySQL on premises to a MySQL lookalike in the cloud, such as AWS's Aurora database. But unless you move to new databases, you won't be able to leverage new database models, such as object-based databases.

There are two core answers to this potential problem.

First, employ data integration, which makes many data sources appear as one virtual or logical source, whether they are logs or proper databases. But this option adds cost and complexity.
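
To make the first option concrete, here is a minimal sketch of a logical view over two unlike sources: raw log lines and a database table. Everything named here is an assumption for illustration, including the log line format, the `incidents` table, and the `OpsEvent` shape. A real data integration layer would do far more, but the idea is the same: each source gets a small adapter that emits one common record type.

```python
import re
import sqlite3
from typing import Iterable, Iterator, List, NamedTuple


class OpsEvent(NamedTuple):
    """Common record shape shared by every source (hypothetical)."""
    source: str
    host: str
    severity: str
    message: str


# Assumed log format for this sketch: "<host> <SEVERITY> <message>"
LOG_LINE = re.compile(r"^(?P<host>\S+)\s+(?P<severity>[A-Z]+)\s+(?P<message>.+)$")


def events_from_log(lines: Iterable[str]) -> Iterator[OpsEvent]:
    """Adapter for log lines; lines that do not match the format are skipped."""
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            yield OpsEvent("log", match["host"], match["severity"], match["message"])


def events_from_db(conn: sqlite3.Connection) -> Iterator[OpsEvent]:
    """Adapter for a hypothetical incidents table."""
    rows = conn.execute("SELECT host, severity, message FROM incidents ORDER BY id")
    for host, severity, message in rows:
        yield OpsEvent("database", host, severity, message)


def unified_events(lines: Iterable[str], conn: sqlite3.Connection) -> List[OpsEvent]:
    """Present both sources as one logical list: log events first, then table rows."""
    return list(events_from_log(lines)) + list(events_from_db(conn))


def demo() -> List[OpsEvent]:
    """Build a tiny in-memory database and a few log lines, then merge them."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE incidents ("
        "id INTEGER PRIMARY KEY, host TEXT, severity TEXT, message TEXT)"
    )
    conn.execute(
        "INSERT INTO incidents (host, severity, message) VALUES (?, ?, ?)",
        ("db01", "ERROR", "replication lag above limit"),
    )
    log_lines = [
        "web01 WARN disk usage at 85 percent",
        "this line does not follow the format",
        "web02 ERROR I/O error on volume vol-1",
    ]
    return unified_events(log_lines, conn)
```

Calling `demo()` returns three events, two from the log and one from the table, all in the same shape. The analytics layer then only has to understand `OpsEvent`, which is where the added cost and complexity of this option comes from: someone has to write and maintain an adapter for every source.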

Second, just bite the bullet and move to whatever databases best support predictive analytics, and consolidate as much as you can. Keep in mind that you'll need to work within existing database frameworks in most enterprises, and moving to purpose-built databases is typically too expensive to justify for ops alone.

IoT data in the mix

You might think that IoT data and predictive analytics wouldn't mix, but they do, and they do so well.

Examples include the ability to predict maintenance events on core IT systems that are IoT-enabled. Take, for instance, the ability to fly an airliner better and more safely because its core systems leverage the IoT for monitoring. In IT, the same approach monitors such external characteristics as server temperature, room temperature, the presence of moisture, and more.

However, the models built for traditional operations data go right out the window. The new models mix analytical and operational data, mashed together to serve one of 10,000 emerging use cases that integrate the IoT and predictive analytics with core system operations for hybrid IT.

The trouble comes when considering IoT Ops data storage and analytics as unique use cases. Traditional analytics databases used for IoT ops-related predictive analytics, such as columnar databases like Amazon Redshift, are typically too costly for ops-related predictive analytics alone.

Embedding predictive analytics within ops or non-ops applications

Predictive analytics for traditional ops or CloudOps once invoked images of IT staffers sitting in front of two to four screens, all jam-packed with information and graphics. These days, more and more predictive analytics are done directly from an application, even non-ops-related applications. 

Ops-related predictive analytics services are embedded inside of the applications, typically as an API call directly to the ops-related predictive analytics engine, and thus to the back-end data and log files as well. 
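
As a rough sketch of what that embedding can look like, the example below has a piece of business logic consult a prediction engine before making a decision. The `PredictionClient` interface, its `outage_risk` method, and the stub are all hypothetical; they do not correspond to any particular product's API. In a real deployment the client method would wrap an HTTP call to whatever engine you use.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


class PredictionClient(Protocol):
    """Hypothetical interface to an ops predictive analytics engine."""

    def outage_risk(self, system: str, horizon_hours: int) -> float:
        """Estimated probability (0.0 to 1.0) of an outage within the horizon."""
        ...


@dataclass
class StubPredictionClient:
    """Stand-in for a real API client; returns canned scores for illustration."""

    scores: Dict[str, float]

    def outage_risk(self, system: str, horizon_hours: int) -> float:
        # A real client would send a request to the engine here.
        return self.scores.get(system, 0.0)


def pick_fulfillment_system(
    client: PredictionClient, candidates: List[str], max_risk: float = 0.2
) -> str:
    """Business logic that asks the engine before routing work.

    Returns the first candidate whose predicted 24-hour outage risk is at or
    below max_risk; if none qualifies, returns the lowest-risk candidate.
    """
    risks = {name: client.outage_risk(name, horizon_hours=24) for name in candidates}
    for name in candidates:
        if risks[name] <= max_risk:
            return name
    return min(candidates, key=lambda name: risks[name])
```

With a stub that scores `"east"` at 0.6 and `"west"` at 0.1, `pick_fulfillment_system(client, ["east", "west"])` returns `"west"`. The application never touches the back-end data or log files directly; it only sees the score the engine returns.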

By the way, this leads to a lot of innovative opportunities, since the services are embedded inside of ops or non-ops applications that we can build net-new or by modifying existing apps. 

Innovative opportunities include: 

  • The ability to embed trending ops issues within business applications to calculate the future costs and risks of system issues, and their effect on the business. You can even do creative things such as look at the likelihood of future outages or security breaches, and then gauge their effect on supply chains or other core business systems that are affected by operational issues that an ops-related predictive analytics system can reveal. 
  • The ability to embed ops-related predictive analytics within other ops tools. While most ops tools have "sort of" proactive trending, such as looking at log files to spot an increase in I/O errors within cloud or on-premises storage systems, those tools are not good at making predictions. Thus, you need to improve on the ability to leverage ops-related predictive analytics systems from other ops tools using APIs that you build or that are provided by the ops-related predictive analytics tool of choice. This provides several advantages, such as predictive corrections made directly from the tools. 
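
The difference between trending and predicting in the second item can be shown with a deliberately simple sketch. A trending tool reports that I/O errors are rising; a predictive one estimates when they will cross a limit you care about. The code below uses a straight-line least-squares fit as a stand-in for a real model, and the function names and threshold are assumptions made for this example.

```python
from typing import List, Optional, Tuple


def linear_fit(values: List[float]) -> Tuple[float, float]:
    """Least-squares line through (0, values[0]), (1, values[1]), ...

    Returns (slope, intercept).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def days_until_threshold(daily_errors: List[float], threshold: float) -> Optional[float]:
    """Days after the latest observation until the fitted line reaches threshold.

    Returns None when the trend is flat or falling, and 0.0 when the fitted
    line is already at or above the threshold.
    """
    slope, intercept = linear_fit(daily_errors)
    if slope <= 0:
        return None
    last_day = len(daily_errors) - 1
    crossing_day = (threshold - intercept) / slope
    return max(0.0, crossing_day - last_day)
```

For daily I/O error counts of `[2, 4, 6, 8, 10]` and a threshold of 20, `days_until_threshold` returns `5.0`. A flat series such as `[3, 3, 3]` returns `None`. Real engines use much richer models, but the output is the same kind of thing: a lead time an ops tool can act on.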

What about next year and the year after?

What’s still coming this year, and in the years to come? We'll see a few mostly positive changes, but also a few negative ones. It's best to understand these changes before they become issues within your enterprise. 

With the addition of cloud and other modern systems, most enterprises moved from around 5,000 endpoints in 2016 to around 7,000 in 2018. An endpoint is anything that needs to be maintained, such as an application, database, platform, or piece of software. You'll leverage even more endpoints in the future, hence the rise of endpoint complexity for IT Ops and CloudOps. The cloud makes it easier to build messy architectures faster, since the model lets you provision and deploy systems quickly. Creating 20 to 40 endpoints and putting them into production takes less than five minutes.

As you can see from the figure above, increased endpoints lead to a tipping point where the number of systems, or endpoints, reaches a level where we can no longer effectively manage them. To make matters worse, we're dealing with limited budgets, skills, and numbers of people. 
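
To put rough numbers on that tipping point, the endpoint counts cited earlier (about 5,000 in 2016 and about 7,000 in 2018) can be turned into a compound annual growth rate and projected forward. This is back-of-the-envelope arithmetic, not a forecast: it assumes growth continues at the same rate, which nothing in those two data points guarantees.

```python
def annual_growth_rate(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two counts."""
    return (end / start) ** (1 / years) - 1


def project(count: float, rate: float, years_ahead: int) -> float:
    """Count after years_ahead more years at a constant growth rate (an assumption)."""
    return count * (1 + rate) ** years_ahead


rate = annual_growth_rate(5000, 7000, years=2)   # 2016 to 2018
endpoints_2020 = project(7000, rate, years_ahead=2)
```

The rate works out to roughly 18 percent a year, and at that pace the count would reach about 9,800 endpoints two years after 2018. Wherever your own team's ceiling sits, compounding growth reaches it sooner than a straight line would suggest.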

What's an ops team to do? 

You need to get much more creative to manage the future. In many cases, that means going all-in with ops-related predictive analytics, which have proved to be effective in the last few years, and getting to work on the self-correcting mechanism. 

The path to this nirvana is not well defined. While a few operations tools offer predictive analytics as part of their core features and functions, as of now the use of predictive analytics by operations tools is almost nonexistent. It's also unlikely that more traditional on-premises ops tools will ever adopt ops-related predictive analytics.

The path will be difficult and complex. You need to look at this as a strategy, one that will take some initial funding and other resources. Once everything is defined, it's time to execute, test, and develop net-new ops systems. The end goal is to automate the list of required resources and take risk out of the equation. 
