Monitoring for network change: 4 use cases

Michael Procopio Technology Evangelist, Micro Focus

Too often, the cause of network downtime is an incorrect change. To ensure their network is operating efficiently, teams must monitor for change as well as for performance and availability.

Typically, a network operations center (NOC) team monitors the network. When there's a problem, the team tries to find the cause. But whether it finds the cause or not, the next step is to pass that problem over to network engineering.

If the NOC finds the problem, network engineers can immediately start on repairs. But if the NOC is unable to locate the source of the issue, network engineers have to spend more time searching for the cause, which can take hours or even days.

Generally, NOC teams don't monitor the network for changes that might affect performance, such as configuration updates and software updates. That's typically done by the network engineers.

However, NOC teams that check for these types of changes can more quickly and accurately detect performance issues. To do that, you need a tool that not only monitors performance and availability, but also is aware of configuration changes. Here are four types of changes you should monitor for.

1. A change in the running state of the device

A change in the running state of any device can occur when an individual logs into the network device and makes a change. Most network devices have a configuration file that holds the device's parameters for when it's booted. Once it's running, any changes to the running state affect the running device but not the configuration file.

If a change is made incorrectly, a performance problem occurs and the NOC operator then tries to diagnose the problem. But since the operator doesn't always know what the change is, it can take a while to find it. 

A network monitoring and configuration tool can speed troubleshooting by showing the NOC operator the changes that were made to the device. The operator can then send that information to the network engineers, who won't have to go hunting around to find the problem. There's also a chance the tool will allow the NOC operator to fix the issue without involving the network engineers.
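As a rough illustration of what "showing the changes" can mean, here is a minimal sketch that diffs two snapshots of a device's running configuration using Python's standard difflib module. The snapshot text, the function name config_diff, and the labels are hypothetical; how a real tool collects the snapshots is outside the scope of this sketch.

```python
import difflib


def config_diff(before: str, after: str) -> list[str]:
    """Return unified-diff lines between two configuration snapshots."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="running-config (previous poll)",  # hypothetical label
            tofile="running-config (current poll)",     # hypothetical label
            lineterm="",
        )
    )


# Hypothetical snapshots taken at two polling intervals.
previous = "hostname edge-1\ninterface Gi0/1\n mtu 1500\n"
current = "hostname edge-1\ninterface Gi0/1\n mtu 9000\n"

for line in config_diff(previous, current):
    print(line)
```

Lines starting with "-" were present only in the earlier snapshot and lines starting with "+" only in the later one, which is the information an operator would attach to the ticket.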

2. Unsaved changes

An unsaved change is one that's made to fix a performance problem but isn't saved to the boot configuration file by the person making the change. The NOC team needs to know about changes that have been made but not saved, because later, when the router reboots, the performance problem will return.

Having this information allows the NOC team to notify the network engineers that someone made a change but didn't save it.

A networking performance monitoring and configuration tool can detect any unsaved changes. It also detects reboots and will show the change between the "fixed" configuration that wasn't saved and the "unfixed" one from the configuration file.
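One way such a check might work is to compare the running configuration with the saved boot configuration and report lines that appear in only one of them. The sketch below assumes both configurations are already available as text and that lines beginning with "!" are comments (a Cisco-style convention); the function names are hypothetical.

```python
def normalize(config: str) -> list[str]:
    """Drop blank lines and '!' comment lines (assumed comment syntax)."""
    lines = []
    for line in config.splitlines():
        stripped = line.rstrip()
        if not stripped or stripped.lstrip().startswith("!"):
            continue
        lines.append(stripped)
    return lines


def unsaved_changes(running: str, startup: str) -> dict[str, list[str]]:
    """Report lines that differ between the running and boot configurations."""
    running_lines = normalize(running)
    startup_lines = normalize(startup)
    saved = set(startup_lines)
    active = set(running_lines)
    return {
        # Active now, but would be lost on the next reboot.
        "only_in_running": [l for l in running_lines if l not in saved],
        # Saved, but not currently in effect.
        "only_in_startup": [l for l in startup_lines if l not in active],
    }


# Hypothetical example: the MTU was changed on the live device but never saved.
running = "hostname edge-1\n!\ninterface Gi0/1\n mtu 9000\n"
startup = "hostname edge-1\ninterface Gi0/1\n mtu 1500\n"
print(unsaved_changes(running, startup))
```

This line-set comparison ignores which interface or section a line belongs to, so it is only a first-pass signal; a non-empty result is the cue to notify network engineering before the next reboot.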

3. A change in the boot configuration of the device

Most IT departments have what are called change windows, which typically occur in the middle of the night, when changes affect the fewest people.

Let's say a new regulation has gone into effect and a company needs to modify a network device to adhere to the regulation. Since this isn't a critical change, the network engineer makes the change to the boot configuration during the normal change window. However, not every change works the way it's expected to.

When the device gets rebooted, likely in the middle of the night, the change the network engineer made could have a negative side effect that may show up either immediately or when a lot of people are using the network.

In many cases, the NOC operator sees that there's a big performance problem and needs to figure out what happened. The networking performance monitoring and configuration tool can allow the NOC operator to look back and see configuration changes alongside performance history. Some tools can look as far back as 30 days.

After determining the change that could be the source of the performance problem, the operator can send a ticket to network engineering saying, "We've got a performance problem. And by the way, there was a change made at 2:00 a.m. Maybe that's helpful to you, or maybe that's the problem."
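The look-back step can be sketched as a simple filter over recorded change events: given when the performance problem started, list the changes on that device within a look-back window, most recent first. The ChangeEvent record, the device names, and the timestamps are hypothetical; the 30-day default mirrors the look-back period mentioned above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ChangeEvent:
    """Hypothetical record of a detected configuration change."""
    device: str
    when: datetime
    summary: str


def changes_before(
    incident_start: datetime,
    device: str,
    changes: list[ChangeEvent],
    lookback: timedelta = timedelta(days=30),
) -> list[ChangeEvent]:
    """Return changes on `device` within the window, most recent first."""
    window_start = incident_start - lookback
    hits = [
        c for c in changes
        if c.device == device and window_start <= c.when <= incident_start
    ]
    return sorted(hits, key=lambda c: c.when, reverse=True)


# Hypothetical change history and incident time.
history = [
    ChangeEvent("edge-1", datetime(2024, 3, 1, 2, 0), "ACL update"),
    ChangeEvent("edge-1", datetime(2024, 5, 10, 2, 0), "boot configuration changed"),
    ChangeEvent("core-2", datetime(2024, 5, 9, 2, 0), "routing policy update"),
]
incident = datetime(2024, 5, 10, 9, 0)

for change in changes_before(incident, "edge-1", history):
    print(change.when.isoformat(), change.summary)
```

A change that lands close to the start of the problem is a candidate, not a proven cause, which is why the ticket presents it as a lead for network engineering to confirm.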

4. A software update on the device

This is a software change rather than a configuration change. Vendors such as Apple, Microsoft, and Cisco often send out updates to their software that don't always work as intended and may have negative side effects.

Software updates are also typically scheduled during a change window. But as in the above example of the change in the boot configuration of the device, the negative impact of the change may be delayed.

However, if network operators are using a network performance monitoring and configuration tool, they can see the potential problem and send that information off to the network engineering team.
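Spotting a software update can be as simple as comparing the version each device reported at two inventory polls. The sketch below assumes the versions have already been collected into dictionaries keyed by device name; the device names and version strings are made up for illustration.

```python
def version_changes(
    previous: dict[str, str], current: dict[str, str]
) -> dict[str, tuple[str, str]]:
    """Map each device whose version changed to (old_version, new_version)."""
    return {
        device: (previous[device], version)
        for device, version in current.items()
        if device in previous and previous[device] != version
    }


# Hypothetical inventory polls taken before and after a change window.
before_window = {"edge-1": "15.2(4)", "core-2": "9.3.1"}
after_window = {"edge-1": "15.2(7)", "core-2": "9.3.1"}

print(version_changes(before_window, after_window))
```

Devices that appear in only one of the two polls are ignored here; a fuller check would probably report those separately as added or removed devices.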

Share the knowledge

The bottom line is that making change information available to NOC operators makes them more valuable to their organizations. That's because rather than just telling network engineering that there's a performance problem, they can now let network engineering know the probable cause. And that means engineers can find and fix the problem more quickly, so that everybody's back up and running sooner.

Proactively monitoring the network for change is much more effective than just reacting to performance issues and then having to spend time searching for the cause of the problem.

Downtime is very expensive; in 2019 a 14-hour network outage is estimated to have cost Facebook around $90 million. Given these costs, implementing a networking performance monitoring and configuration tool can not only make people more productive; it can also save a business huge amounts of money.
