Special Report: Buyer's Guide to AIOps Management Tools

David Linthicum, Chief Cloud Strategy Officer, Deloitte Consulting
 

 


 

Overview

AIOps combines traditional IT operations tooling with machine learning and analytics to help mitigate the complexities of managing emerging multicloud and other types of heterogeneous systems.

The value of AIOps lies in its ability to reduce costs, risks, and operational complexity while increasing reliability and security. However, those charged with selecting an AIOps tool can quickly become overwhelmed by the sheer number of features, functions, and approaches these tools offer. Moreover, tool providers take many different approaches to leverage new technologies, including advanced predictive analytics, chatbots, deep learning, and machine learning.

This Special Report provides IT operations managers with a framework for understanding the state of this complex operations technology as well as future trends. It also provides guidance on how to evaluate and compare the various types of available tools, so you can narrow down the list to exactly what your organization needs and select the correct tool or tools required.

Think of this report as a decision support tool that includes a list of key criteria and questions. You can use it to understand both your business and technology requirements, create the right selection criteria, and pick the right AIOps tool or tools for your needs.

Most AIOps tools available today share nine basic features (see the figure below). Most of these features leverage machine learning, which also serves as the integration point from one feature to the next. For example, the data/event correlation feature uses operational data as training data, so its ability to correlate data improves over time as the system continues to apply machine learning.

Figure 1: While all AIOps providers’ features and functions may vary, these are the core features you should evaluate.

Here’s what you’ll find in the rest of this Special Report:

  • AIOps management: What it is, why you need it. AIOps has been both an evolution and a revolution. The discipline can solve many problems, including handling complexity and using automation to adapt directly to the needs of the business.
  • The state of AIOps management tools and techniques. Here’s what the tools offer, and where they fall short. The AIOps landscape that exists today includes commercial and open source tools, along with other ways to break them down by category.
  • Top 7 AIOps management technology trends. Where is AIOps going? How do the changing requirements of enterprise IT affect the evolution of AIOps? How and when will shortcomings be addressed?
  • Building an AIOps toolbox: How to select the right tools for the job. Here are the top considerations when selecting a tool set, along with next actions you should take.
  • How to create an RFP for AIOps tools. The key requirements and steps you need to keep in mind.

Key takeaways:

  • AIOps tools are most valuable to IT operations with high levels of complexity and heterogeneity. A higher number of system patterns in the operational domain equates to a higher need for AIOps.
  • While all AIOps tools have some sort of automation, the capabilities of this automation vary greatly, from simple APIs to complete process and programming engines. Some tools have “canned” processes for self-healing, such as restarting a network device automatically. With other tools, you’ll need to create and test these processes yourself and review audit trail reports of what was done.
  • Connectivity to systems in an operational domain also varies. Keep in mind that this is connectivity you don’t want to create yourself. The fastest path to deployment includes an out-of-the-box tool that tests connections to common systems (including Amazon Web Services S3, SAP, Kubernetes, etc.).
  • Observability is an emerging concept that most of the AIOps tools on the market support. It provides IT with a better understanding of the systems under management. It helps interpret what the data and behaviors actually mean, and how the systems relate to each other.
  • To select the correct tools, you need to understand the systems to be managed, required security, details of your SLAs, required ops team skills, and how to relate all of this information to a business case.
  • Most AIOps tools have a quick ROI—usually less than six months. When building a business case, focus on the “as is” state and then the “to be” state with the AIOps tool. The value lies in the inefficiencies you remove between the two.
  • Most AIOps tools move from simple infrastructure monitoring to infrastructure and application monitoring. If there is no need to monitor applications in your operational domain, you should not go down this path with your AIOps provider. It just adds cost and risk.
  • Consider AIOps tools that can monitor specialized systems. The list includes purpose-built databases, containers, serverless systems, and other nontraditional infrastructure that should be monitored and managed in a specific way.
  • Improving root cause analysis is a key value of AIOps that many people overlook during the tool selection process. This is the feature that gets to the primary cause of a failure, such as a storage system crashing that manifests in database errors.

 

AIOps management: What it is, why you need it

AIOps represents both an evolution and a revolution in IT operations management. It’s an evolution in that traditional management and monitoring tools were AI-enabled, and thus they became AIOps by simply adding features. These tools are typically sold by traditional enterprise infrastructure management companies that have been around for years, many of which are name brands.

The revolution is in the number of startups that have entered the AIOps marketplace. Many became the companies that determined the direction of AIOps as an emerging tool category. While these are relatively new vendors, most less than 10 years old, they have grown along with the popularity of AIOps.

The use of artificial intelligence (AI) and/or machine learning (ML) added the ‘AI’ to ‘Ops.’ While the use of AI by each of the AIOps players differs substantially, there are a few core AI services that most tools share. These include:

Adaptive rules

AIOps tools use rules to set limits. However, there are times when you want those rules to bend, or to adapt proactively. For instance, you might have a rule set within your AIOps tool to automatically shut down a server if the internal temperature exceeds 200 degrees. If end-of-year processing is running, and shutting down the server would result in the loss of the process and the data, the AI engine understands the need for a work-around rule that benefits the business in this instance.
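One way to picture an adaptive rule is as a fixed threshold plus a check on business context. The sketch below is illustrative only. The 200-degree limit comes from the example above, while the names `ServerState` and `decide_action`, the `critical_job_running` flag, and the "throttle and alert" work-around are assumptions made for this example. A real AIOps engine would learn such exceptions from data rather than have them hard-coded.

```python
from dataclasses import dataclass

TEMP_LIMIT = 200  # degrees, taken from the example in the text


@dataclass
class ServerState:
    temperature: float
    critical_job_running: bool  # hypothetical flag, e.g. end-of-year processing


def decide_action(state: ServerState) -> str:
    """Return the action a rule engine might take for this server."""
    if state.temperature <= TEMP_LIMIT:
        return "none"
    if state.critical_job_running:
        # Assumed work-around: keep the job alive, reduce load, notify a human.
        return "throttle_and_alert"
    # Default rule: the limit is exceeded and nothing critical is running.
    return "shutdown"
```

The point of the design is that the default rule stays in place, and the exception is an explicit, auditable branch rather than a silently disabled rule.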

Learning from system data

Some AIOps tools come with pre-baked knowledge that can be leveraged from the start. An example is an understanding of 1,000 different patterns of network failures and those that can be fixed using the automation component of the tool. One instance is cycling a network device to attempt a self-heal. The trend is to provide some pre-built knowledge with the AIOps tools that can also adapt to new data as a stimulus.

Pre-baked knowledge can be convenient, but most of what AIOps tools learn over time is from the data that flows from the systems under management, with additional information from how issues are corrected, tuned, or otherwise optimized.

AIOps tools take on a learning role with the ability to get better over time, much as humans do. Since the AIOps tool can consider all data and all responses to that data, the AIOps engine quickly learns to make more accurate conclusions in near real-time than can most humans on operational teams.

Matching patterns

AIOps tools can understand repeating patterns of data coming from the systems under management and match those event patterns to solution patterns. For instance, a tool can recognize that a series of I/O errors coming from a storage system in the data center matches the pattern of a primary storage system that will fail soon, and that the applications should switch to a backup storage system as soon as possible.

There are thousands of event patterns to understand, as well as their ultimate meaning and the solution patterns that have the best chance of solving the problem. The ability to understand and track both problems and matching solutions through AI pattern matching allows the AIOps tool to be responsive and get better over time as it works through more problems and solution patterns.
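To make the idea of matching event patterns to solution patterns concrete, here is a minimal sketch. It assumes a hand-written catalog with a single pattern and a fixed error threshold, both hypothetical. Real tools learn thousands of patterns from operational data, so treat the names (`SOLUTIONS`, `match_pattern`) and the threshold as placeholders.

```python
from collections import Counter

# Hypothetical catalog: event pattern -> solution pattern.
SOLUTIONS = {
    "storage_failure_imminent": "switch_to_backup_storage",
}

IO_ERROR_THRESHOLD = 3  # assumed value, for illustration only


def match_pattern(events: list[tuple[str, str]]) -> list[str]:
    """Map recent events to recommended actions.

    `events` is a list of (source, event_type) pairs from a recent time window.
    """
    # Count I/O errors per source system.
    io_errors = Counter(src for src, kind in events if kind == "io_error")
    actions = []
    for source, count in io_errors.items():
        if count >= IO_ERROR_THRESHOLD:
            # A burst of I/O errors matches the "failure imminent" pattern.
            solution = SOLUTIONS["storage_failure_imminent"]
            actions.append(f"{solution}:{source}")
    return actions
```

In this sketch, three I/O errors from the same storage system within the window would produce a recommendation to switch that system to backup storage, while a single error from another system would produce nothing.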

Heuristics may be leveraged as well to determine likely outcomes from the discovered patterns. Moreover, heuristics can perform useful tasks such as predicting outcomes, negative or positive, months before the actual problem impacts the business.

Other common features of AIOps tools are also found in some non-AI-enabled tools. These include:

The ability to connect to remote systems under management through applications that gather the required data from the systems for the AIOps engine. Data consumption allows raw data to be classified and correlated, and erroneous information removed.

The ability to self-heal operational issues using traditional approaches such as system resetting or running through automated remote troubleshooting procedures. Typically, self-healing is accomplished by using an automation engine (see Figure 2) to create processes that fix issues either without human intervention or with human review.

Figure 2: Technology stack of a generic AIOps tool.

The ability to provide the operations staff with productive views of the monitoring data, either raw or calculated. This should be configurable to the personal preferences of the AIOps tool user, with many different views supported.

Integration with a CloudOps/Ops playbook, which means that you’ve predetermined which tasks need to occur, and in what sequence, to solve most issues and properly maintain the systems. The AIOps tools need to have this programmed in using the automation component.

What problems does AIOps address?

AIOps solves many issues. Some of the value drivers for AIOps tools include:

  • Dealing with operations using automated tools, and the ability for that automation to adapt directly to the needs of the business, gathering knowledge over time.
  • Dealing with complexity that comes from the use of heterogeneous systems that exist in multi-cloud deployments, as well as in traditional on-premises systems.
  • Dealing with SecOps. The AIOps tool can provide support for security subsystems by working together to guard against breach attempts that may manifest as excessive I/O, or as other anomalies.
  • Dealing with Ops governance. AIOps tools should augment the governance systems in place, working together to use and enforce policies.
  • Dealing with performance. You constantly monitor performance, not only to please the end users who want to leverage a speedy system, but to make sure that all resources live up to service level agreements.

Keep in mind that you need to understand the problems before you can understand the value a specific tool can bring.

Key takeaways:

  • The discipline of AIOps has come about through the marriage of artificial intelligence (AI) and IT operations.
  • Both new and established companies offer tools in this space.
  • Common features include adaptive rules, the ability to learn from systems under management about what may go wrong and how to fix any problems, and providing multiple views for different types of operations users.
  • AIOps can solve different types of problems, from dealing with the complexity of managing heterogeneous and multi-cloud infrastructures to helping with governance, performance, and security.

 

The state of AIOps management tools and techniques

According to a report from Mordor Intelligence, the 2019 AIOps market was valued at (US) $1.64 billion and is expected to reach $6.88 billion by 2025. This is a 27% CAGR over the forecast period.
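The growth rate can be checked with the standard compound annual growth rate formula. This sketch assumes the forecast period runs six years, from 2019 to 2025. The function name `cagr` is just an illustrative choice.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Market values in billions of US dollars, as quoted above.
growth = cagr(1.64, 6.88, 2025 - 2019)
print(f"{growth:.1%}")
```

Under that six-year assumption, the result rounds to the 27% figure quoted in the report.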

The growth will come partially from the growing amount of complexity in enterprises. Complexity is a byproduct of the increased use of cloud computing along with the inability to give up existing legacy systems. This naturally leads to more systems under management. Because IT can rarely increase operational budgets, it needs better force-multiplier tools to manage those systems.

So, what is the current state of the AIOps tools? What do they offer? Where do they fall short? Here’s the AIOps landscape that exists today, with a breakdown of the various types of tools. These include commercial, open source, and other ways to break them down by category.

The objective is to understand them better which, in turn, will point you toward the best tool for your solution.

The good news

The AIOps tool market is built upon well-developed management and monitoring technology that’s decades old. Most of the providers are not starting from scratch, no matter if they are building net-new tools using existing best practices developed by the industry, or if they’re building upon their own existing technology.

Other positives to consider in the AIOps market landscape:

Tools are typically well integrated with other ops tools. These include security operations (SecOps), governance operations (GovOps), and other such systems involved with running your infrastructure and applications. Because these tools can work and play well together, they do not require a great deal of integration by the operations team to leverage AIOps.

They are cost effective. The use of three different deployment types is pushing prices down to on-demand levels, unlike enterprise licensing models of the past. However, there are all kinds of pricing models.

On-demand models offer tools that can run on-premises or in an IaaS cloud, such as Amazon Web Services or Microsoft Azure. SaaS leverages AIOps tools with proprietary pre-built software services, which are special SaaS versions of the AIOps tool that can run only on a SaaS platform. Other unique deployment and pricing models can be deployed as well.

The bad news

The bad news is typical of any emerging and over-hyped space. There is confusion about just what AIOps is, and what value it brings to the industry. Reports such as this one can clear up much of the confusion. However, with billions of marketing dollars now spent in this space, over-hype and over-promising will make the facts even muddier for enterprise IT. This is likely to get worse before it gets better.

Other downsides to AIOps tools that the AIOps providers certainly won’t tell you about:

AIOps tools are complex unto themselves. Most AIOps tools require at least basic training before those in operations can leverage them correctly, or you’ll need to hire a consultant to properly set up the tools. This will likely improve in the future, but implementers report that most AIOps tools are currently difficult to configure and leverage, which is likely to remain the case for the next few years.

The deployment model could be a single point of failure for the AIOps tool. In cases where the AIOps tool runs on an IaaS cloud or is delivered via SaaS, an outage of that infrastructure takes your AIOps tool down with it, and the loss could be catastrophic. However, so far, the uptime for these platforms is good.

Tool deployment models

Current AIOps tools can be categorized into deployment models, or how they are positioned in the marketplace (depicted in Figure 3 below):

Figure 3: Existing roadmap of the market for AIOps tools.

Commercial

These AIOps tools are proprietary, and the IP and software are wholly owned by the provider. The provider is solely responsible for maintaining the tool. If that provider’s business goes away, either through bankruptcy or acquisition, you may have to remove the AIOps tool from service.

Open source

The software code is publicly available under an open source license. While open source models can vary in structure, they typically rely upon a few companies to maintain the code, as well as a network of developers. While the software and code are typically free, you may have to pay for deployment services, or the development of missing system connectors. The most common model is to provide the base software, and charge for the most commonly required add-ons. However, many open source tools in the monitoring area don’t support high scalability and aren’t feature rich.

Special purpose

SaaS: This model uses a purpose-built SaaS version of the software that runs remotely and is accessed over the open Internet. A well-known example of the SaaS delivery model is Salesforce.com. This model can also be open source, but it is typically commercial, and you’re charged by usage. The provider maintains the SaaS service.

On-demand: The on-premises version of the software runs in an IaaS cloud. This is a bit different than SaaS since you’re charged with installing, configuring, and maintaining the AIOps tool that runs on the IaaS cloud server or servers, just as you would if it were installed on premises.

On premises: These AIOps tools run on traditional platforms that you physically own.

Special requirements: You leverage something atypical, such as AIOps tools that run on edge servers or on other unusual platforms.

Key takeaways

  • The AIOps tool market is built on well-developed management and monitoring technology that’s decades old.
  • Like any new market, however, there’s much confusion and hype over what AIOps is and how it can help enterprises.
  • AIOps tools are complex to set up correctly; your staff will need training or you’ll need to engage a consultant to make sure you’re off to the proper start.
  • Although SaaS-based AIOps tools have a good performance track record thus far, be aware that if these tools go down for any reason, the results could be catastrophic.
  • AIOps tools come in many flavors: commercial, open source, SaaS, on-premises, and others.

 

Top 7 AIOps management technology trends

Where is AIOps going? How do the changing requirements of enterprise IT affect the evolution of AIOps? How and when will shortcomings be addressed? Here are the current technology trends, as well as a look ahead at the discipline’s growth.

Figure 4 represents the general and likely evolution of AIOps tools over the next several years. While some of these trends have already started, some have not. However, the increasing needs of AIOps users will ensure that these trends will be reflected in future tool releases.

Figure 4: AIOps will evolve around emerging enterprise requirements.

1. Specialization of tools

This is the evolution of certain tools that focus on specialized purposes, such as IoT, security, compliance, or performance. The idea is that if the tools are purpose-built for a certain solution pattern, the tool provider will do a much better job of solving that problem.

In the future, general-purpose AIOps tools may not be as popular as they are today. Most enterprises will instead leverage a suite of tools that solve different problems. Some of the advantages of this approach include the tool’s ability to:

Focus on specific complex problems that allow the tool to become best of breed for that category.

Track management costs to a more fine-grained level, considering that each tool has a specific purpose, cost, and cost model (e.g., on demand).

Focus the AI models on the problem. This would include a knowledge engine’s ability to become smarter around specific problems.

2. Generalization of tools

In other instances, AIOps tool providers will focus on generalization. The rationale? If one tool solves many problems, one-stop-shopping enterprises will find that tool more desirable. While the limitations of this approach are the opposite of specialization, there will be a need for tools that “do it all.” However, the case can be made that the trends will likely move more toward specialization.

3. Other ops systems integration

An AIOps tool’s ability to integrate with other management tools is now or will soon be table stakes for vendors in the AIOps marketplace. For instance, a general-purpose AIOps tool should work and play well with security operations (SecOps) tools, perhaps to determine when certain system behaviors (such as increased saturation of the processor) may indicate a denial-of-service attack.

Other integration opportunities include the ability to integrate with performance operations, or “PerfOps,” to continuously tune the systems using data gathered from the AIOps tools. Today, this type of integration is hard to find. Where it is offered, those services are rarely built into the AIOps tools to take advantage of the integration.

4. Centralized knowledge model sharing

An AIOps tool benefits from the data and knowledge it collects from your organization over time, as well as from the data and knowledge shared by other enterprises. This results in a few core advantages, including the ability to:

Increase the value of AIOps because you won’t have to wait months for the AIOps tool to get smart by processing locally generated data. (https://gigaom.com/report/key-criteria-for-aiops/)

Find emerging patterns related to attempted breaches that may occur worldwide, and thus raise the alarm early since the data/knowledge is centrally gathered and analyzed.

Find issues with specific hardware and/or software, as well as cloud services. Issues may occur at many enterprises where a specific brand and model of storage system is prematurely failing. The shared knowledge engine will spot that trend by analyzing data for many enterprises and report it to the users of that hardware and/or software as well as the manufacturer or cloud provider who can preempt outages with a proactive fix to that equipment or service.

5. Increased proactive automation

While AIOps tools have some automation, today it’s typically reactive. You must wait for problems to occur, and then take automated corrective action. Being proactive means you can identify issues with systems by using known data patterns that indicate activity that typically leads to a failure, and then correct the issues before they become failures.

These functions are complex for the enterprise IT operations team to set up. Vendors need to develop more pre-built proactive functions that can work with the initial installation or when leveraging an AIOps tool on-demand. An AIOps tool’s pre-built knowledge will become a core selling point and provide the most value for the dollars spent on AIOps.

6. Reporting and support for audit

Logging is a core feature of AIOps tools because they gather data on an ongoing basis, and react to that data, as well as train an AI knowledge base. However, support for compliance or financial audits is typically not provided, or is an afterthought at best. Moving forward, AIOps tools need built-in processes to analyze gathered data that will support audits. These options should be prebuilt into the functionality of the tools.

7. Cost management

Some AIOps tools already include cost management capabilities, but only with rudimentary functionality (such as usage). Most AIOps tools will soon adopt some or all of the following functionality:

Predictive maintenance costs of physical hardware and software. This will be combined with the AI engine’s features to determine tasks that must be accomplished to avoid failures, and money spent or saved.

The ability to monitor cloud-based usage down to the user, application, database, etc., and determine the best path to cost optimization. Today there are cost governance tools that do similar things, but they are not integrated with AIOps tools that have access to cost-pattern data that should be managed.

Key takeaways

  • AIOps tools will likely become more specialized for areas including the Internet of Things, security, compliance, and performance.
  • There will always be a need for general-purpose tools for shops that don’t want or can’t afford best-of-breed solutions.
  • Another key trend is increasing integration with other operations tools, including performance management.
  • Tools will increasingly become proactive, to fix problems before an outage or other issue occurs.
  • AIOps tools will provide ever-better cost management features, including predictive maintenance costs of hardware and software, and the ability to more finely monitor cloud usage.

 

Building an AIOps toolbox: How to select the right tools for the job

How do you select the right AIOps tool or tools for your needs? It’s a matter of logic. First, determine the correct requirements criteria for the set of systems you need to manage. Then assign a rank for each tool by criteria. Consider other non-obvious business issues, such as the business viability of the tool vendor. Then, pick the tool or tools.

This sounds simple, but it’s the most time-consuming part of the evaluation. You need to understand the actual individual features and capabilities of each tool. It’s helpful to go into an evaluation test process with the shortest list of tools that need to be considered.

Considerations

These are general considerations for the AIOps tool selection process, but they are just suggestions. You may find that this list is exactly right, or you may need to add a few more considerations. You may even need to remove some considerations.

Figure 5 depicts most of the considerations to rank against your own requirements.

Figure 5: Selecting an AIOps tool means understanding how to consider requirements.

Cost models

This is how the vendor charges for the tool. There are two basic types of cost models. Open source tools are free, but there are add-on costs in almost every installation. These costs include connectors, as well as professional services you’ll need to install and configure the software. Commercial tools have a traditional software license, or they have on-demand pricing, delivered either via the cloud or as SaaS.

What you need to look for here is the cost per day, with any model or any charging structure. This allows you to compare apples to apples and make the cost of the tools much less complex to understand.
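Normalizing to cost per day is simple arithmetic, and a small sketch can keep the comparison consistent across vendors. The conversion factors below (a 365-day year, 12 billing months, 24 hours of daily usage by default) are assumptions, and the function names are illustrative. Adjust them to match how each vendor actually bills.

```python
DAYS_PER_YEAR = 365  # assumption: ignore leap years


def per_day_from_annual(annual_cost: float) -> float:
    """Traditional license or annual subscription."""
    return annual_cost / DAYS_PER_YEAR


def per_day_from_monthly(monthly_cost: float) -> float:
    """Monthly subscription, annualized and then spread over the year."""
    return monthly_cost * 12 / DAYS_PER_YEAR


def per_day_from_hourly(hourly_cost: float, hours_per_day: float = 24) -> float:
    """On-demand pricing billed by the hour."""
    return hourly_cost * hours_per_day
```

Remember to fold add-on costs, such as connectors and professional services, into the annual figure before converting, or the open source option will look cheaper than it is.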

Deployment models

The tool provider leverages this model to deliver the software. While there are additional ways that AIOps providers deliver software, the basic choices are: On-premises, SaaS cloud hosted, and IaaS cloud hosted.

Each model has generalized advantages and disadvantages:

On premises: Offers complete control of the hardware and software. Most consider this a positive. However, the negative is the cost of owning and managing your own hardware and software.

SaaS: An advantage of the SaaS model is its ability to leverage the software using most devices and browsers. A disadvantage is that it’s more difficult to modify the configurations, or return to a previous release, considering that the SaaS software is a common service that’s static.

IaaS cloud hosted: This model functions much like hardware and software that you run on premises, but it runs on a remote cloud service. You have all of the advantages of on-premises control, typically at a reduced cost. However, the loss of direct control is viewed as a negative by some IT organizations, as is the risk that performance may be reduced when moving from low latency, on-premises systems.

Platform connections

This encompasses the operating systems and hardware where the AIOps tool runs: for example, Linux on an Intel core processor with a certain memory configuration, a browser interface, etc. Of course, there are many variations on this example. Your AIOps tool provider will offer guidance on the platform requirements, whether the tool is cloud hosted or on premises. Note: With a SaaS deployment, there is typically no choice as to the platform you run on.

Host platforms

The host platform could be that of an IaaS cloud provider, an on-premises system, or even a managed services provider. However, in any event you are the provider. It’s the job of the host platform provider to deliver the proper platform configuration in support of the tool. They should provide some rudimentary management functions as well, such as the ability to provide cost data to calculate burn in real time.

Closed or open source

The important note here is not the cost of the tool, including the fact that open source tools are typically free. It’s really about the legalities at play via the open source license. The license may include limitations that put the tool out-of-bounds for your requirements, such as limiting the number of systems under management to 1,000 when you have 2,030 that need to be managed. Commercial licenses are typically scalable with a relative cost to the systems under management.

Use of AI

This goes to how well, or not so well, the AIOps tool leverages a native AI system. What are the system’s capabilities to be trained by the incoming data from the systems under management? Can the system perform tasks such as correlation, noise reduction, predictive analytics, and automated actions?

Use of data

AIOps tools are really databases with processes and AI capabilities that ride on top. One consideration is how the data is leveraged, and how it’s physically stored. Most AIOps tool providers leverage object-based databases, sometimes third-party databases. Some leverage proprietary databases of their own creation.

Security features

What security features are built into the AIOps tools, such as support for identity management, encryption, and security auditing?

Governance features

Similar to security, this means support for service and resource governance. This typically involves integration with external governance systems, but some governance may be native to the AIOps tool.

Configurability

How do you manage volatility in a configurable domain? The use of configuration means that you can set thresholds, communication speeds, connector attributes, security models, etc. The more you can configure, the better.

APIs

APIs provide services-level and microservices-level access to the features of the AIOps tool. The more features that are accessible via the APIs, the better you can manage the features of the AIOps tool(s) directly from applications. This means that applications can perform management functions. In essence, they manage themselves.

Customization

This typically relates to customization of the dashboards that each operational team leverages to observe data and graphics related to the views that are most productive for them. While the AIOps tools come with useful pre-built dashboards, most who leverage AIOps tools are more productive with customized views into the data and operational behaviors.

Your next move

It’s helpful to create a table to score each AIOps tool; see an example in Table 1. You’ll need to provide a ranking for each consideration, based on its importance to your enterprise. It’s best to use 1-100 as the range.

Then provide a score based upon your evaluation using a 1-10 range. Then multiply the ranking times the score. Also, note the best attributes of each tool, or what would set the tool apart from the rest of the pack.

Table 1: Features and requirements table; how to make your tool selection decisions.
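The ranking-times-score calculation can be sketched in a few lines. The considerations, rankings, scores, and tool names below are made up for illustration. Summing the products into a single total per tool is also an assumption, since the text specifies only the multiplication.

```python
def weighted_total(rankings: dict[str, int], scores: dict[str, int]) -> int:
    """Sum ranking x score over every consideration."""
    total = 0
    for consideration, ranking in rankings.items():
        score = scores[consideration]
        if not 1 <= ranking <= 100:
            raise ValueError(f"ranking out of range for {consideration}")
        if not 1 <= score <= 10:
            raise ValueError(f"score out of range for {consideration}")
        total += ranking * score
    return total


# Hypothetical rankings: importance to your enterprise, 1-100.
rankings = {"Use of AI": 90, "Security features": 80, "Cost models": 60}

# Hypothetical evaluation scores (1-10) for two made-up tools.
tool_scores = {
    "Tool A": {"Use of AI": 8, "Security features": 6, "Cost models": 9},
    "Tool B": {"Use of AI": 6, "Security features": 9, "Cost models": 7},
}

totals = {name: weighted_total(rankings, s) for name, s in tool_scores.items()}
best = max(totals, key=totals.get)
```

A single total is convenient for ranking, but keep the per-consideration products visible as well, since a tool can win overall while failing a requirement you cannot compromise on.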

Key takeaways

  • Determine the correct requirements criteria for the set of systems you need to manage. Then assign a rank for each tool by criteria. Consider other non-obvious business issues, such as the business viability of the tool vendor. Then, pick the tool or tools.
  • Be prepared; this is the most time-consuming part of the evaluation.
  • It’s helpful to go into an evaluation test process with the shortest list of tools that need to be considered.
  • Understand the cost model, deployment model, whether the tool is closed or open source, and how exactly it uses AI.
  • Other key considerations include security and governance features, and how the tool allows you to manage its APIs.

 

How to create an RFP for AIOps tools

Most enterprises will create a formal RFP (request for proposal) in the selection process. RFPs should be easy-to-understand documents that allow the tool providers to sell the attributes of the technology in a consistent way. In turn, this allows enterprises to easily compare the tools and see how well each measures up to the requirements.

Examples of the types of questions that may be in the RFP include:

  • What cost model do you employ?
  • What is the per-day cost of managing X number of systems?
  • What deployment model do you leverage?
  • How do you support security?
  • How do you support governance?
  • What platforms do you support, and which is optimal for your tool?
  • Is your AIOps tool based on open source technology, and if so, what is the licensing model? Also, what other add-ons and technologies are needed for deployment, and how much more cost is typically required?
  • How does your tool employ AI during operations?
  • How does your tool employ data during operations?
  • How does your tool support configuration?
  • How does your tool support APIs?
  • How does your tool support customizations?

Of course, feel free to customize the list above, leaving some questions out, or adding a few more that are specific to your requirements.

Understand your own requirements

The best way to understand your requirements is to agree upon them as a team and write them down so they are documented and formal. Building on the scoring approach in Table 1, you can record each requirement, such as “Use of AI,” along with the RFP response for that question, how the RFP evaluation team ranked it, and ultimately how you scored each contender (see Table 2). Fill out one table for each provider.

Table 2: Sample RFP response evaluation table.

Building an RFP

The typical structure to build an AIOps tools RFP would be:

  • Introduction
  • Purpose of the RFP
  • Description of the problem
  • Desired solutions
  • Business objectives
  • Criteria
  • Key questions
  • Estimated costs
  • Risks
  • Evaluation process
  • Submission guidelines
  • Conclusion

Making a selection

Most would describe this process as simply selecting the tool with the highest score. But the consensus of the team should also carry weight in the final decision, alongside the score.

Preconceived biases may be a factor when evaluating the tools. It’s a good idea to have two separate teams evaluate each tool and provide a ranking and score for each. This may remove some of the bias because two or more sets of evaluators are looking at the same criteria.
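
One way to use two sets of scores is to average them and flag the criteria where the teams disagree most, so those can be discussed before a decision. The following is only a sketch under stated assumptions: the function name, the `max_gap` threshold, and the scores are all hypothetical, and both teams are assumed to have scored the same criteria on the 1-10 range.

```python
def compare_teams(team_a, team_b, max_gap=2):
    """Average two teams' scores and list criteria where they differ by more than max_gap.

    Assumes both dictionaries contain exactly the same criteria.
    """
    averaged = {}
    disputed = []
    for criterion, a in team_a.items():
        b = team_b[criterion]
        averaged[criterion] = (a + b) / 2
        if abs(a - b) > max_gap:
            disputed.append(criterion)
    return averaged, disputed


# Hypothetical 1-10 scores from two evaluation teams for the same tool.
team_a = {"Security": 8, "Use of AI": 9, "APIs": 5}
team_b = {"Security": 7, "Use of AI": 5, "APIs": 6}

averaged, disputed = compare_teams(team_a, team_b)
print(averaged)
print("Discuss before deciding:", disputed)
```

A large gap on a criterion does not say which team is right; it only marks where a bias or a misunderstanding of the criterion may be at work.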

Keep in mind that the selection you make may not be the cheapest solution on its face. The AIOps tool provider may charge more for a better tool that increases efficiencies in other areas. If you put more weight on the cost of the tool vs. the efficiencies of using one tool over another, you may find that the missing features or quality problems cost much more than any money saved in the initial purchase.
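
The trade-off can be made concrete with some simple arithmetic. The sketch below assumes that the cost of missing features or quality problems can be estimated as a yearly figure; the function name and all amounts are hypothetical and serve only to show the comparison.

```python
def cost_over_period(purchase_price, yearly_cost_of_gaps, years):
    """Purchase price plus the estimated yearly cost of missing features or quality problems."""
    return purchase_price + yearly_cost_of_gaps * years


# Hypothetical figures: a cheaper tool with costly gaps vs. a pricier, more capable tool.
cheaper_tool = cost_over_period(purchase_price=50_000, yearly_cost_of_gaps=40_000, years=3)
better_tool = cost_over_period(purchase_price=120_000, yearly_cost_of_gaps=5_000, years=3)

print("Cheaper tool over 3 years:", cheaper_tool)
print("Better tool over 3 years:", better_tool)
```

With these illustrative numbers, the tool that costs less up front ends up costing more over three years.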

Selection testing and implementations

Testing the top three or four AIOps tools as part of the RFP is a good idea. For example, if you evaluate 10 AIOps tools and the selection process yields three or four top contenders, run a test and acceptance process for each of them.

To do this, you must define the testing criteria: describe actual problems, and then evaluate each tool’s ability to solve them.

One sample testing problem would be:

“Demonstrate the recovery of a cloud-based storage system that is going into failure. This should include the criteria for detecting the failure, the automation of the recovery process, reporting the failure, logging the failure, and using the data to train the AI systems, etc.”

The provider should demonstrate both its approach and the actual operation of the tool in solving the problem. The problems you use for testing should be directly related to the problems you want to solve in your enterprise using the AIOps tool(s).
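
A testing problem like the sample above can be turned into a simple acceptance checklist that is applied the same way to every contender. This is a hedged sketch: the criteria are taken from the sample problem, while the function name and the observed results are hypothetical.

```python
# Criteria drawn from the sample testing problem above.
CRITERIA = [
    "detect the failure",
    "automate the recovery",
    "report the failure",
    "log the failure",
    "use the data to train the AI systems",
]


def acceptance_result(observed):
    """Return (passed, missing) given a dict mapping each criterion to True or False."""
    missing = [c for c in CRITERIA if not observed.get(c, False)]
    return (not missing, missing)


# Hypothetical observations from one provider's demonstration.
observed = {
    "detect the failure": True,
    "automate the recovery": True,
    "report the failure": True,
    "log the failure": False,
}

passed, missing = acceptance_result(observed)
print("Passed:", passed)
print("Not demonstrated:", missing)
```

A criterion that was never observed is treated as not demonstrated, which keeps the comparison consistent across providers.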

While building the right RFP for your AIOps tool purchase seems like a complex task, these are the essential steps to get right:

  • Understand your requirements in detail
  • Understand how that translates into tool requirements
  • Understand the market and the players
  • Understand the test and acceptance process

Missing any of the above tasks will just add risk to your AIOps selection process.
