4 ways to align IT monitoring with your business

Companies often decide it’s time to get serious about IT operations monitoring after a disastrous event takes place that exacts a heavy toll on the business. An e-commerce service outage leads to painful revenue loss, for example. Once your company has been the victim of such an event, both business and IT leaders will be determined never to be caught off guard again.

You will most likely want to do a better job of IT operations monitoring to avoid a similar catastrophe, but you also have an opportunity to enhance overall application, infrastructure, and network performance. That’s understandable, but you need a logical approach to implementing a more rigorous IT operations monitoring environment.

You can’t start by monitoring every server, router, switch, and application. Rather, begin with the applications and business processes that most need improvement—especially those that deliver premium value to the organization—and the IT systems and operations that support them.

Determining what those are will drive the development of a new framework for honest discussions between IT and business teams that can inform all subsequent IT Ops monitoring efforts.

I take clients through a four-step agenda that helps business and IT personnel work together to determine which processes warrant immediate IT operations monitoring attention and how to implement best practices that ensure effective management.

The Journey to Hybrid Cloud: A Design and Transformation Guide

Change mindset, gain agility

When you break out of the mindset of “monitor everything”—instead focusing on explicitly defined, critical areas where IT and the business must be in sync—you bring agility to your IT Ops monitoring efforts.

To that end, our four-step agenda follows the path I'll explain below.

1. Start with known issues in business application processes 

We ask our clients: What's the last thing you want to hear on a Friday afternoon when you're about to head out for the weekend? Everybody typically has some process issue in mind that could use the help that comes with a 360-degree view of performance and availability offered by application performance monitoring.

Maybe the concern is the ERP system’s customer service module: downtime regularly disrupts the customer queue, interfering with agents’ ability to categorize and manage service calls and contributing to longer client waits and resolution times. You can see how, in the age of the customer, it may not be to the business’s benefit to disappoint on the service front.

Then we tie that known problem to its supporting server and database infrastructure, giving the organization a defined environment on which to focus core monitoring and a proof of concept at the same time.

2. Add in the end-user perspective

The next step is to discover issues that the business may not even be aware of but that may be having a negative impact in potentially critical areas. Such issues may warrant ongoing monitoring. To this end, we leverage technologies that simulate the actions users take and the results they can expect to get—for instance, in tests, we might find that it takes close to 20 seconds to confirm a ship date for a transaction. Absent any business context, that may seem an inordinate amount of time. But it's a baseline metric that we might want to improve upon.

Collaborating with the business and IT on what a normal response should be—say, seven seconds—we first ensure that the systems are all in a healthy state. We want to be clear that there aren’t inherent problems causing the slow performance. Once it’s determined that the systems are error-free, the next step is to estimate whatever hardware and/or software upgrades would be required to speed up the request—and to determine whether the costs to attain that speed are worth it given the nature of the business process.
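The synthetic-transaction approach described above can be sketched in a few lines of Python. This is a minimal, hypothetical illustration of timing a simulated user transaction against an agreed target, not any particular monitoring product's API; the `check_response_time` helper and the seven-second target are assumptions drawn from the example in the text.

```python
import time

# Target response time agreed on with the business (hypothetical figure
# from the text; your process may warrant a different number).
TARGET_SECONDS = 7.0

def timed_transaction(transaction):
    """Run one synthetic transaction and return its elapsed time in seconds."""
    start = time.monotonic()
    transaction()
    return time.monotonic() - start

def check_response_time(transaction, target=TARGET_SECONDS):
    """Time a transaction and flag it as OK or SLOW against the target."""
    elapsed = timed_transaction(transaction)
    status = "OK" if elapsed <= target else "SLOW"
    return elapsed, status

# Example: stand in for a ship-date confirmation with a short sleep.
elapsed, status = check_response_time(lambda: time.sleep(0.1))
print(f"{elapsed:.1f}s {status}")
```

Run on a schedule, checks like this produce the baseline numbers that make the business conversation about targets concrete.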

3. Dive into more data

With problematic applications and processes identified and very simple end-user transactions implemented, we can start to ask further questions. For example, is there a reason—such as a customer’s service-level agreement (SLA)—that a seven-second transaction response time should be a requirement rather than a nice-to-have?

While the business may still want to upgrade its infrastructure to try to live up to that metric, without SLA penalties in the wings, there may not be a justification for consistently monitoring IT operations to assure these transaction times. On the other hand, if SLAs are in place on that point, it’s important to know if you’re failing in this dimension so that you can correct that ASAP.    
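As a rough illustration of the SLA question, here is a hypothetical Python sketch that computes what fraction of recorded transaction times met a seven-second target. The sample values are invented; in practice they would come from the synthetic or real-user measurements discussed above.

```python
def sla_compliance(response_times, sla_seconds=7.0):
    """Return the fraction of recorded response times within the SLA."""
    if not response_times:
        return 1.0  # no transactions, nothing violated
    met = sum(1 for t in response_times if t <= sla_seconds)
    return met / len(response_times)

samples = [4.2, 6.8, 9.5, 5.1, 7.0]  # made-up measurements, in seconds
rate = sla_compliance(samples)
print(f"SLA compliance: {rate:.0%}")  # 4 of 5 within 7s -> 80%
```

A number like this tells you immediately whether you are failing on the SLA dimension and need to correct course.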

4. Start the monitoring on a small scale

As we dig into these performance issues for our clients using IT operations monitoring tools, we often discover that things weren’t set up properly to ensure effective monitoring of an application, process, or infrastructure in the first place. Particularly in large, dysfunctional environments, we often have to fix these setups before we can even begin real monitoring.

We explain to our clients that the right move is to start monitoring a few core assets, such as operating systems and virtualized servers, so that when we drill down into why an application process is slow or broken, we have baseline information from which to pivot.  
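A baseline for those core assets can start as simply as a periodic snapshot of host-level metrics. The sketch below is a hypothetical, Unix-only example using just the Python standard library (`os.getloadavg` is not available on Windows); a real deployment would use a proper monitoring agent, but the idea of capturing a baseline to pivot from is the same.

```python
import os
import shutil

def baseline_snapshot(path="/"):
    """Capture a minimal host baseline: CPU load averages and disk use."""
    load1, load5, load15 = os.getloadavg()   # 1-, 5-, 15-minute load averages
    disk = shutil.disk_usage(path)           # total/used/free bytes for path
    return {
        "load_1m": load1,
        "load_5m": load5,
        "load_15m": load15,
        "disk_used_pct": round(100 * disk.used / disk.total, 1),
    }

snapshot = baseline_snapshot()
print(snapshot)
```

Recording snapshots like this over time gives you the "healthy state" reference point to compare against when an application process later slows down or breaks.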

Put sensible monitoring in place

Don't wait for a disaster to start monitoring your systems along the lines I've described here. Begin with the applications and business processes that most need improvement, and add to your monitoring capabilities over time.

With sensible monitoring in place—agile monitoring, not overdone—you’ll set yourself up for the next stage of your IT Ops monitoring journey: automation. But that’s a topic for a future discussion.
