15 AIOps resources for IT pros

Ericka Chickowski Freelance writer
 

As IT operations professionals increasingly turn to artificial intelligence (AI) and machine learning to streamline how they maintain system reliability, the field of AIOps is starting to heat up as a skill specialization. AIOps helps to automate the analysis of data streaming from IT monitoring tools and optimize the workflows in IT service management (ITSM) areas such as inquiry management and self-service capabilities.

To get the most out of new AIOps capabilities, operations, ITSM, site reliability engineering (SRE), and AIOps engineering teams will need to bolster their skills and processes, layering in best practices from specializations such as data science and process design.

While AIOps teams may need fewer Tier 1 incident responders on hand, they will need more data science gurus to curate data and train algorithms. They'll need to team up with experienced troubleshooters and responders to help design workflows and runbooks, as well as to intervene with complex root-cause analysis. Additionally, teams will need thoughtful planners who can get the most out of AIOps to design resilient architectures that feature automated preventative maintenance and self-healing capabilities.

To keep up, operations professionals should start developing the skills for an AIOps future now. Because the field is so nascent, there are no AIOps-specific professional organizations or major certifications yet. However, with a little creative digging into resources geared toward related subspecialties such as SRE, network automation, and application performance management, it's possible to find the contacts and learning needed to understand the AIOps tools and capabilities available today and to build up AIOps skills and practices.

Here are 15 great AIOps resources that IT Ops pros should know about.

AIOps-specific events

SKILup Day: AIOps & MLOps

Date: October 15, 2020

Location: Virtual

Cost: Free

Hosted by the DevOps Institute, this free, one-day virtual summit offers beginners in AIOps a chance to bone up on the brief history of the niche and hear about use cases and benefits of AIOps practices. The virtual event sports an online networking lounge to share ideas with fellow travelers, plus a resource library and relevant videos to come up to speed on AIOps.

AIOps 2020: International Workshop on Artificial Intelligence for IT Operations

Date: December 14-17, 2020

Location: Dubai, United Arab Emirates

Research paper submission deadline: August 16, 2020

Cost: Collocated with ICSOC 2020, which cost €150 to €850 in 2019

Planned as a workshop collocated with the International Conference on Service-Oriented Computing (ICSOC 2020), AIOps 2020 is an academic- and researcher-oriented event focused on cutting-edge advances in AIOps in areas such as self-healing; early anomaly, fault, and failure (AFF) detection; and root-cause analysis techniques.

AIOps Exchange

Date: The website notes that the event was postponed due to the pandemic, but in mid-August, it still said that the hope was to convene in early summer.

Location: TBA

Cost: TBA

2019 was the inaugural year for AIOps Exchange, a one-day forum that features best-practices advice from a handful of enterprise IT Ops professionals. It heavily emphasizes roundtables as an intimate, interactive format for exchanging ideas on the newest techniques for refining SRE strategies using AIOps, supporting DevSecOps, and developing the right culture within teams to use AIOps technology effectively. It is one of the few vendor-neutral, AIOps-specific events currently out there, so keep checking for a solid date.

AIOps Expo

Date: February 9-12, 2021

Location: Miami Beach, Florida

Cost: $599 to $3,599

AIOps Expo focuses on a range of areas relating to how AI and machine learning can be used for IT operations, including application performance, network performance, and security. The previous event featured an agenda that ran through in-depth discussions around topics such as AIOps maturity models, how to build teams for AIOps, and blending DevOps and AIOps. 

Related events

Interop Digital

Date: October 5-8, 2020

Location: Virtual

Cost: $499+

Interop has gone to a completely virtual format for the fall, and there will be plenty of crossover into AIOps territory across the four days of its programming. Sessions and training are particularly strong in AIOps leadership content such as overcoming cultural hurdles to leverage AIOps and introductory sessions on how AI and automation will serve as the backbone of IT operations in the future.

ONUG Fall 2020

Date: October 14-15, 2020

Locations: New York and online

Cost: Free to $199

Started as the Open Networking User Group, ONUG has evolved into an enterprise leadership group focused on building out the digital enterprise through evolved practices in enterprise cloud, DevSecOps, and automation. The agenda for its fall conference fits right into the AIOps wheelhouse, with content planned around automating cloud governance, cloud-native DevOps, automating multi-cloud observability, as well as specific AIOps topics.

SRECon20 Americas

Date: December 7-9, 2020

Location: Virtual

Cost: Free

A conference focused squarely on the site reliability engineers who are tasked with practicing AIOps, SRECon20 Americas hasn't yet released its agenda, but chances are high that there will be lots of good research on the use of AI in IT. Last year, the USENIX event featured many talks on automating management of cloud infrastructure, designing resilient data pipelines, using open AIOps tooling to improve observability, and many other topics of interest to AIOps professionals.

Training and courses

SRE Foundation

Cost: $1,595

Date: The next class begins September 22, 2020.

Location: Online

At many organizations today, site reliability engineers and AIOps engineers are one and the same, so it follows that a certification in SRE fundamentals will offer the foundation upon which ops professionals can thrive with AIOps. With accreditation run by the DevOps Institute, the SRE Foundation class by the ITSM Academy is a four-day course that dives deep into the principles of SRE, including teaching and labs on achieving service-level objectives (SLOs), monitoring service-level indicators (SLIs), and the tools and automation techniques for maintaining system reliability.

AIOps Essentials

Cost: Free with Linux Academy seven-day trial

Date: On demand

Location: Online

Designed for IT Ops professionals charged with the care and feeding of Kubernetes clusters, this class focuses squarely on bringing AIOps practices to cloud-native environments using the open-source Prometheus event monitoring and alerting platform. The goal is to get pros comfortable enough with the tooling to integrate Prometheus rules with the Kubernetes API server to start scaling nodes and effectively managing a hybrid cloud through machine learning.

AI for Everyone

Cost: $49 for 180 days of certificate eligibility

Date: Ongoing

Location: Online

To start leveraging AI for IT, it helps to truly understand the basics of AI and how it's being applied today. This is a relatively short (six-hour) nontechnical course designed to provide executives and managers with a broad overview of AI use cases, applications, and techniques, as well as pointers on how to get started building AI projects and teams.

Python for Data Science and AI

Cost: Free

Date: Ongoing

Location: Online

IT Ops pros seeking to build a foundation of learning for automation, data science, and AI would be well served to start brushing up on Python. This free beginner Coursera class covers Python basics, data structures, and programming fundamentals over the course of 22 hours of learning. It can be taken on its own or as part of the broader IBM Data Science Professional Certificate.

Books, research, and reports

Cognitive Computing Recipes

Cost: $19.24+

While it isn't specifically an AIOps book, Cognitive Computing Recipes offers a solid survey of the use of deep learning and machine learning for developers and IT pros and features an entire chapter on AIOps as a part of its catalog of real-world use cases. The chapter provides practical information on improving on key reliability metrics such as mean time to detect and mean time to repair through the use of AI.

Practical Network Automation

Cost: $19.79+

A primer in the fundamentals of network automation, infrastructure as code, and analytics-driven IT ops, this tome provides practical advice in leveraging Python, Ansible, and other tooling to improve network performance, support DevOps practices, and apply continuous integration/continuous delivery principles. The book includes a chapter on the key pillars of AIOps, including information on collecting and managing data, and analyzing it using machine-learning principles.

ACM Digital Library

Cost: $5/article+

The Association for Computing Machinery (ACM) has published some valuable research on AIOps and data-driven performance management in the last year. Among the highlights are broad looks at the latest research innovations in AIOps, as well as in-depth research into topics such as predicting node failures in ultra-large-scale cloud computing platforms, tools for automated log parsing, and the latest techniques in time-series anomaly detection.

AIOps Exchange Report

Cost: Free

Last year's inaugural AIOps Exchange event put together a survey and this report, which offers some interesting statistics on the drivers for AIOps adoption, the barriers to getting the most out of AI for IT, and the value delivered by AIOps. It's a quick read but provides some talking points for professionals and executives taking the first steps toward exploring AIOps.
