How to improve your observability systems

Jayne Groll, CEO, DevOps Institute

It takes less time and effort for your developers to fix problems and optimize a system when it's highly observable. So what's the best way to improve your observability systems? Here, several DevOps Institute ambassadors share their insights ahead of the upcoming Observability SKILup Day.

But before we get there, you need a clear understanding of what "observability systems" really means. DevOps Research and Assessment (DORA) perhaps defines it best, as "tooling or a technical solution that allows teams to actively debug their system. Observability is based on exploring properties and patterns not defined in advance."

Josh Atwell, senior technology advocate at Splunk, explains the emergence of observability as a reaction to IT's growing complexity. The complexity inherent in new system architectures and new application stacks has created a groundswell of emerging practices meant to ease the associated pain.

For example, the adoption of cloud and new staff services—combined with app containerization and new services and APIs—contributes greatly to the complexity that IT professionals encounter today.

Observability, when viewed as a framework within these tool chains, is designed to pull data from the environment and give teams the ability to observe what's happening, including what previously went unseen. This improves the baseline knowledge of developers and engineering teams, but the data must be analyzed regularly to improve decision making. And, like all IT systems, observability itself must be subject to continuous improvement through measurement and feedback loops.

Here are the best practices DevOps Institute experts say will improve your observability systems.

Start with your culture

As with many other DevOps initiatives and frameworks, culture is key to successful observability. Developers must adopt an open mindset, giving team members a window into their process, while operations pros continue to report on how systems are running.

These opportunities for collaboration often play out through new dashboards and capabilities, said Mark Peters, technical lead at analytics vendor Novetta.

"[Dashboards offer a] chance to drill down from monitored events to exact processes, and then scale the other way, with observability, to show how multiple monitored processes interact."
Mark Peters

This open mindset is similar to the principles of lean thinking. As Neelan Choksi, president and COO of value-stream management platform vendor Tasktop, said, "It's important to learn to see where we are stuck and wasting valuable time and resources." Developer teams need to understand the situation as a whole.

"Rise above the fray to truly see the end-to-end flow of business value."
Neelan Choksi

To achieve this cultural shift within your organization, identify the flow of value today and strive for continuous improvement. Tiffany Jachja, engineering manager at Vox Media, said continuous improvement for observability means never settling.

"[If everyone] had settled for JSON-based configurations, we would not have had YAML. If we had settled for YAML, we would not have UI-based configurations."
Tiffany Jachja

Mix in automation and AIOps

When it comes to improving observability systems, automation and AIOps show promising potential, said Helen Beal, chief ambassador at the DevOps Institute.

“Observability produces huge amounts of data—far more than a human can hope to analyze for insights.”
Helen Beal

This is where automation and AIOps come in: They can process that data far faster than a human can, as Ryan Sheldrake, field CTO at security platform provider Lacework, said.

"Real-time mapping on a continuous basis is the only hope of observing such a volatile, ever-moving, complex entity that the SRE team(s) need to wrap their arms around."
Ryan Sheldrake

However, when using automation, AI, and other technology to help with observability systems, it is important for DevOps teams to remember a few basic rules. Observability is achievable only when three things are in place, said Sushant Mehta, senior manager of application development at software and services provider Diyar United: Application developers provide the needed alerts and logs while developing applications, monitoring tools read and display this data, and automation tools take action on the insights that the data and monitoring tools provide.
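As a rough illustration of how those three pieces fit together, the Python sketch below has an application emit structured logs, a monitoring step compute an error rate from them, and an automation step choose an action based on that rate. It uses only the standard library. The service name, the window size, the 0.5 threshold, and the restart action are all hypothetical choices made for this example, not recommendations from the experts quoted here.

```python
import json
import logging

# Everything named here (the "checkout" service, the 0.5 threshold, the
# "restart_service" action) is a hypothetical stand-in for illustration.


class ListHandler(logging.Handler):
    """Keeps formatted log lines in memory; stands in for a log pipeline."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(self.format(record))


class JsonFormatter(logging.Formatter):
    """Writes each record as one JSON object so tools can parse it."""

    def format(self, record):
        return json.dumps({
            "service": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        })


def build_logger(name="checkout"):
    """Developer step: give the application a structured logger."""
    handler = ListHandler()
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger, handler


def error_rate(lines, window=10):
    """Monitoring step: share of ERROR events in the last `window` lines."""
    recent = [json.loads(line) for line in list(lines)[-window:]]
    if not recent:
        return 0.0
    errors = sum(1 for event in recent if event["level"] == "ERROR")
    return errors / len(recent)


def decide_action(rate, threshold=0.5):
    """Automation step: choose an action from the monitored signal."""
    return "restart_service" if rate >= threshold else "none"


if __name__ == "__main__":
    logger, handler = build_logger()
    logger.info("order accepted")
    logger.error("payment gateway timeout")
    logger.error("payment gateway timeout")
    rate = error_rate(handler.lines)
    print(rate, decide_action(rate))
```

In a real system each step would be a separate tool, but the dependency is the same: The automation can only act on what the monitoring step can read, and the monitoring step can only read what developers chose to emit.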

The more integrated these key aspects are, the more efficient and effective observability systems will be. And when you apply automation and AIOps, the possibilities for improvement are endless.

Integrate observability into your development lifecycle

Another way to improve your observability systems is by integrating observability into each stage of the software development lifecycle. One way to do this, said Parveen Arora, co-founder and director at consultancy VVnT SeQuor, is by including observability in your CI/CD pipeline. 

While operations teams are typically responsible for observability and application health, developers know their code better than anyone, said Supratip Banerjee, solutions architect at investment management firm Principal Global Services.

"[They] know how the codes will translate in production, and implementing specific observability goals in the CI/CD pipeline in stages will improve the total result."
Supratip Banerjee
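One way to picture staged observability goals is as a gate that each pipeline stage must pass before the next one runs. The Python sketch below is a minimal example under that assumption. The stage names and the goals attached to them are hypothetical, and a real pipeline would express the same check in its own CI/CD tooling.

```python
# Hypothetical stages and goals, chosen only for illustration.
STAGE_GOALS = {
    "build": {"structured_logs"},
    "test": {"structured_logs", "request_metrics"},
    "deploy": {"structured_logs", "request_metrics", "health_endpoint"},
}


def missing_goals(stage, provided):
    """Return the goals a stage requires that the service does not provide."""
    return sorted(STAGE_GOALS[stage] - set(provided))


def first_failing_stage(provided, stages=("build", "test", "deploy")):
    """Walk the stages in order; return (stage, missing) at the first gap."""
    for stage in stages:
        missing = missing_goals(stage, provided)
        if missing:
            return stage, missing
    return None


if __name__ == "__main__":
    # A service with logs and metrics but no health endpoint.
    print(first_failing_stage({"structured_logs", "request_metrics"}))
```

Because later stages ask for more than earlier ones, a team can add instrumentation step by step and still see exactly where a release would be stopped.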

Failure to properly introduce observability into the development process is why OpenTelemetry, with the OpenTelemetry Protocol (OTLP), was introduced in the first place, said José Adan Ortiz, solutions engineer at Akamai Technologies. OpenTelemetry provides a native way to integrate different observability tools and providers into a single stream of high-quality telemetry across the stack.

"One of the best ways to improve and scale observability is to grow in Open Telemetry adoption."
José Adan Ortiz

Principal Global's Banerjee added: "Integrating and simplifying observability solutions may save IT operations teams a significant amount of time and resources." 

Play the game of trial and error

As with many other tools and frameworks, improvement will only come through trial and error, said Maciek Jarosz, DevOps and process expert. Many factors are at play when it comes to an observability system, including politics, economy, and environment, he said.

"Scaled systems are complex beyond imagination at times."
Maciek Jarosz

Additionally, because every system is unique to the team or organization using it, there is no single silver bullet to improve observability systems for all. However, one approach to better observability systems is to make everything observable, said Anshul Lalit, head of technology and transformation at software and services provider Kongsberg Digital.

"A good observability solution will give visibility into the runtime behavior of a system to allow for better decision making, debugging, and better performance." 
Anshul Lalit

Remember your objectives

In the end, it takes less time and effort for developers to fix problems and optimize a system that is highly observable. With a baseline knowledge of the data being analyzed within a system, developers and engineering teams can make better decisions and deliver business value to customers more efficiently.

That's why you should continually strive to make observability systems more comprehensive and efficient. Whether it's through automation and AIOps or by integrating observability into your CI/CD pipeline, there are several ways you can improve observability now and in the future.

Join the humans of DevOps at the Observability SKILup Day, a free virtual event sponsored by the DevOps Institute, on September 23, 2021.
