How to reinvent monitoring—and the relevance of IT Ops
In an age in which every company is a tech company, performance and availability are more business-essential than ever. Faced with goals of 99.99% uptime and strict SLA compliance, enterprise organizations are using new approaches to infrastructure and application management. These include adopting containers and microservices, and relying more on continuous integration and continuous delivery to keep those services on the cutting edge.
The result? The monitoring strategy you need to keep the entire IT stack up and running is now quite complicated. But it's never been more critical to business success, so how do you keep up? You'll need a new tool strategy.
From all-in-one to best-of-breed
Today’s approach to IT monitoring must be reinvented because the way companies do business is changing. Twenty years ago the monitoring landscape looked vastly different. Organizations moved at a slower pace: applications were updated just a few times per year, and IT infrastructure was primarily hosted on large, on-premises machines. It was completely acceptable for legacy vendors to sell a single, catch-all monitoring system—and administrators could adopt a low-touch, “set and forget” approach to systems management.
Today, both infrastructure and applications have changed radically. Apps that used to be updated once per quarter may be redeployed several times a day. Meanwhile, IT infrastructure is gradually migrating to the cloud, and organizations are using containers and microservices to ensure availability and uptime for services that are scaling and moving at a faster pace.
To accommodate this shift, organizations are moving toward a best-of-breed approach for monitoring, because the old “one tool to rule them all” approach no longer works. Instead, many enterprises are selecting the best tool for each part of their stack, with different choices for systems monitoring, application monitoring, error tracking, and web and user monitoring.
As a result, the monitoring industry has become highly fragmented. Hundreds of new tools have come to market to help organizations gain insight into an increasingly diverse set of complex systems. The monitoring tool industry has been gradually reinventing itself because, as the underlying software and infrastructure of organizations evolve and modernize, monitoring too must come along for the ride.
The relevance of enterprise IT
As these trends continue, enterprise IT is becoming more of a strategic asset than ever. In the past, IT's primary mandate was to maintain uptime while remaining as cost-efficient as possible. While this mandate still stands, IT's scope and responsibility have expanded to encompass the enablement of agility in three areas: R&D, go-to-market, and marketing and sales.
Enterprises now realize that, no matter what business they're in, they are effectively tech companies. Technology enables everything from production to point of sale to customer feedback, as well as internal human resources and billing systems. If your systems aren’t firing at 100% at all times, you’re giving the competition an opportunity to overtake you. In short, if IT applications and infrastructure are not dynamic and scalable, the business cannot be dynamic and scalable.
IT culture: What it is and why it matters
Enterprises are changing the way IT culture is defined, and the way IT does business. The goal of IT service management (ITSM) traditionally has been to align IT processes and roles to facilitate the goals of the business. That, in turn, set the tone for the organization’s IT culture.
But in recent years, ITSM has started to give way to DevOps, an approach that focuses on tearing down the walls separating development from operations in order to support agility, speed and shared responsibility.
While some organizations have swung to the other end of the spectrum and adopted DevOps with open arms, many still operate in the middle ground between traditional ITSM and DevOps. Either way, the notion of pushing code once per quarter in highly structured and scheduled releases is a thing of the past.
To remain competitive, companies must embrace the shift to some version of an IT culture that is nimble, dynamic and that supports the many moving parts of increasingly complex IT environments.
The new approach to monitoring
The very purpose of monitoring is to set thresholds that allow IT professionals to know exactly when a problem occurs or is likely to occur. This approach frees them from having to stare at a dashboard all day long.
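In practice, a threshold check can be as simple as comparing each metric against a configured limit and emitting an alert on any breach. The sketch below illustrates the idea; the metric names and threshold values are hypothetical, not drawn from any particular monitoring tool.

```python
def check_thresholds(metrics, thresholds):
    """Return an alert record for each metric that crosses its threshold."""
    alerts = []
    for name, value in metrics.items():
        limit = thresholds.get(name)
        if limit is not None and value > limit:
            alerts.append({"metric": name, "value": value, "threshold": limit})
    return alerts

# Illustrative limits and a sample reading from one host
thresholds = {"cpu_percent": 90, "disk_used_percent": 85}
metrics = {"cpu_percent": 96, "disk_used_percent": 60}

print(check_thresholds(metrics, thresholds))
# → [{'metric': 'cpu_percent', 'value': 96, 'threshold': 90}]
```

Real tools layer scheduling, notification routing, and hysteresis on top of this core comparison, but the principle — alert only when a limit is crossed, so no one has to watch a dashboard — is the same.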
However, this relatively low-touch approach gets more complicated as the systems those tools must monitor become increasingly complex and diverse. For example, one of our clients relied on just three monitoring tools five years ago. Now they have more than ten. As companies add more tools, the number of alerts that they must field can grow by orders of magnitude. It’s simply impossible for any human, or team of humans, to effectively manage that.
Organizations must adjust their monitoring strategies to address this issue. If they do nothing, they will not only cripple their ability to identify, triage and remediate issues, but they run the risk of violating SLAs, suffering downtime, and losing the trust of customers.
Implementing a smart correlation strategy to transform high volumes of alerts into related incidents allows organizations to turn what is effectively noise into intelligent, actionable insights. This sort of approach not only enables companies to scale and remain agile, but it also ensures that IT teams won’t be bothered ten times per night for a false alarm. The result is happier employees, happier customers, and a stronger business.
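One simple form of correlation is to group alerts that share an attribute (such as the originating host) and arrive within a short time window into a single incident. The sketch below assumes a flat list of alert dictionaries with hypothetical `host`, `ts` (seconds), and `msg` fields; production correlation engines use far richer signals (topology, text similarity, learned patterns), but this shows the noise-reduction principle.

```python
def correlate(alerts, window_seconds=300):
    """Group alerts from the same host arriving within window_seconds
    of each other into one incident."""
    incidents = []
    open_by_host = {}  # host -> most recent incident for that host
    for alert in sorted(alerts, key=lambda a: a["ts"]):
        inc = open_by_host.get(alert["host"])
        if inc and alert["ts"] - inc["last_ts"] <= window_seconds:
            inc["alerts"].append(alert)       # fold into existing incident
            inc["last_ts"] = alert["ts"]
        else:
            inc = {"host": alert["host"], "alerts": [alert], "last_ts": alert["ts"]}
            incidents.append(inc)             # open a new incident
            open_by_host[alert["host"]] = inc
    return incidents

alerts = [
    {"host": "web-1", "ts": 0,   "msg": "high CPU"},
    {"host": "web-1", "ts": 60,  "msg": "high latency"},
    {"host": "db-1",  "ts": 90,  "msg": "disk full"},
    {"host": "web-1", "ts": 900, "msg": "high CPU"},  # outside the window
]
print(len(correlate(alerts)))
# → 3 incidents instead of 4 raw alerts
```

Four raw alerts collapse into three incidents here; at enterprise scale, the same grouping can turn thousands of nightly alerts into a handful of actionable incidents.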
The evolution of IT and its tools and processes
Everything is accelerating. Release cycles are getting shorter, data centers are getting more complex, and infrastructure-as-code enables systems to evolve faster than ever. This shift towards agility is great for the business, but it introduces a new set of challenges for IT monitoring.
We’ve already seen an explosion in the number of tools available to tackle these problems, and there’s no one-size-fits-all solution anymore. Organizations must rely on an increasingly complex and diverse stack to effectively manage and monitor IT environments.
But this shift is just the beginning. In order to keep up with this pace of disruption, every component of IT—tools, people and processes—must be involved. And this imperative will only become stronger over the next ten years, as our technologies, and our expectations of those technologies, continue to evolve.
How IT Ops can remain relevant
The key to remaining relevant in IT operations is to embrace lifelong learning. If you pick up any technical training manual, you’ll often discover that 10 to 20 versions or more have been released since its original publication. Things move so fast, in fact, that manual writers run the risk of having their material become outdated before they can publish it.
Because the world of IT is changing so rapidly, so must the software and tools that support it. IT professionals simply can’t afford to be complacent. While this places a great deal of demand on them, it also makes them an extremely important and strategic asset to their organizations. You could say this is the golden age of IT. But you can only maintain your value if you keep up.