DevOps reorganization at six months: 5 things we learned
CSG International embarked on a major DevOps reorganization this past March, when we began the process of bringing together separate development and operations organizations into one joint organization. The journey has been both challenging and rewarding, and I'm looking forward to sharing our experiences at DevOps Enterprise Summit 2016 this week.
As we’ve moved towards teams that build and run their own software, we’ve exposed many problems and begun architecting solutions. Exposure to the operations world has brought a whole new perspective to our development teams.
Some things went as expected. For example, we knew we’d find many opportunities for automation, and we understood that we’d be getting more involved in production troubleshooting. We also saw the need to expand development best practices, such as peer reviews and continuous integration, to our operations partners.
But while we thought we were prepared for this new operations world and its unique set of challenges, several things still caught us by surprise. These included the need to redefine architecture, the desire to slow changes down, details of the change process, and our experience with off-hours support.
Our redefined view of architecture
We’ve been surprised to learn that our view on architecture needs to be expanded. We traditionally thought of architecture as a development-only domain, while operations got treated as something that should be handled tactically.
However, we’ve discovered that good design and architecture principles don’t stop at the edge of code. The entire system needs an architect to oversee such things as standards, design principles, interfaces, and surrounding automation. This requires rethinking the architect's role and its place in the enterprise.
Go slow before you speed up
As developers, we want to push our changes to production sooner, getting new capabilities into the hands of our users as fast as possible. Operations has historically been opposed to this, wanting to slow change down. These opposing forces have been a source of frustration at times, with each side failing to understand the other's viewpoint.
However, as development has started to share responsibility for implementing customer-facing changes and dealing with any fallout from those changes, we have begun to understand that a small level of inherent risk comes with each change. This risk can cause fear of change, and this realization has given us a surprising new perspective on the desire to slow things down.
Our background in numerous DevOps techniques, such as automated tests, continuous integration, configuration as code, and continuous delivery, has enabled us to understand that it is possible to move quickly and safely. But until those concepts are in place, development now has much more empathy with operations' desire to reduce the frequency of change.
We have a new respect for change processes
Yet another surprise is the extensive nature of the change process. Development has survived for years, blissfully unaware of the intricacies of our change process. We wrote code, performed testing, and found out when it was going to go to production. It was operations that handled the details of the paperwork to get it there.
That's no longer the case. In the operations world, the change process is interwoven into the daily business of accomplishing work. The process can be cumbersome and time-consuming, but it is an essential part of getting work done. We must understand it well and comply with it.
As developers are exposed to this process, we want to streamline and improve it. However, we must first demonstrate a track record of making changes safely before we can change the process.
Achieving that safe track record is a complicated activity requiring multiple DevOps techniques. While we are making strides to get there, we must live within the existing system in the meantime. This emphasizes the need to improve quality to reduce process overhead.
24 x 7 support means Dev, Ops are always on
24 x 7 support is another area where it has been eye-opening to gain firsthand experience. Most developers are aware that we ask our operations teams to provide off-hours support. However, living this experience has shown us how far support can reach into life outside of work. For instance, I now sleep with my phone next to my bed at night, with the ringer at maximum volume, so I’ll be sure to hear it when those middle-of-the-night calls come, and they do come.
Quick responses are required no matter what I am doing, whether I'm on a road trip with family or boating with friends. When a DevOps team has an issue, they're responsible for making sure that it gets addressed, regardless of the time or day of the week.
Operations personnel are hard-working and dedicated, providing this support as needed. The need to be “always on” is tough, though, and requires a different, even more passionate mindset.
We have a new respect for Ops
Overall, our development organization agrees that operations is harder than expected. It’s one thing to read about operations in a book, and quite another to gain direct experience doing the work.
Dealing with production issues and supporting multiple operations teams present their own set of challenges. There are new processes and new risks. There are frequent disruptions and unplanned work.
DevOps provides us with methodologies and a framework to deal with these challenges and improve the situation. We have begun applying several techniques, with success. But our DevOps journey continues to be full of learning and surprises. Each new challenge presents us with an opportunity to take in new knowledge and improve the state of things.
Want to hear more? See Erica's presentation, "When Ops Swallows Dev," at DevOps Enterprise Summit 2016 San Francisco.