How Jaguar Land Rover hit the accelerator with DevOps

Our DevOps journey at Jaguar Land Rover started around our infotainment system. There were a handful of us in Portland, Oregon, and in the United Kingdom—software engineers—who knew that our throughput wasn't quite meeting our potential. We had read about higher-performing teams and about this DevOps thing, and we knew we could change the business for the better.

Two of us from Portland spent six months convincing my boss to allow us to go to our first DevOps Enterprise Summit in San Francisco in 2016. It was there that we were first introduced to the inspiration we needed.

We were given books such as The Phoenix Project: A Novel about IT, DevOps, and Helping Your Business Win, by Gene Kim, Kevin Behr, and George Spafford, and The Goal: A Process of Ongoing Improvement, by Eliyahu Goldratt, which talks about the theory of constraints. As we identified the similarities of the books to our own projects, we were motivated by the thought that it was possible to transform our organization.

When we got back to Portland after the conference, we were excited and ready to transform our entire company. We had all these ideas about how we were going to change the business and the way we developed software. Here's what we did.

Transitioning from DevOps as an idea to practice

The problem was that no one else had gone to the same conference, and consequently no one else was as energized as we were about making this change. We soon realized that to get everyone on board, we had to prove ourselves—at least on a small scale.

So we built a server. We started with Linux, Git, GitLab, and GitLab CI (continuous integration), and we ran open-source software. We began with three projects, and then we encountered our first issues: Only one person in each project knew how the build worked and how automated testing worked.

We slowly brought each of them around to the idea of continuous integration and testing. However, the larger builds consumed so much of our only server's resources that we would lose the revision control system whenever the build runners hogged the machine.

Running into our first issues

So we did what any self-respecting software team does when it's in a pickle: we bought more hardware. With three new servers, we started to add more projects. But as we added more volume to our three-runner set, demand outgrew capacity, and in came complaints from users who had to wait for their jobs to finish, particularly when everybody was running builds at peak business hours.

One user was so frustrated that the ops team hadn't completed a three-week-old request to add a build dependency on the build servers that he wanted to go back to running builds on his own machine.

To alleviate this concern, we moved to ephemeral Docker containers to run all of our builds. With those, we defined every piece of build infrastructure as code. We used Packer templates to build the Docker images, empowering application developers to change the underlying infrastructure that built their applications. Essentially, we handed over the keys to establish self-service development.
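To make the idea of ephemeral, code-defined build environments concrete, here is a minimal Python sketch. It is not the actual Jaguar Land Rover tooling: the `BuildEnv` schema, the image name, and the paths are all hypothetical. It only shows how a build environment kept in version control can be turned into a throwaway `docker run --rm` invocation.

```python
from dataclasses import dataclass, field


@dataclass
class BuildEnv:
    """Declarative description of a build environment (hypothetical schema)."""
    image: str                                # container image holding the toolchain
    workdir: str = "/src"                     # where the source tree is mounted
    env: dict = field(default_factory=dict)   # extra environment variables


def docker_run_argv(build: BuildEnv, source_dir: str, command: list) -> list:
    """Return the argv that runs one build in a throwaway container."""
    argv = ["docker", "run", "--rm"]          # --rm removes the container on exit
    argv += ["-v", f"{source_dir}:{build.workdir}", "-w", build.workdir]
    for key, value in sorted(build.env.items()):
        argv += ["-e", f"{key}={value}"]
    return argv + [build.image] + list(command)


# Assumed example values; a developer would edit these in the repository.
example = BuildEnv(image="example/infotainment-build:1.0", env={"JOBS": "8"})
print(docker_run_argv(example, "/home/dev/app", ["make", "all"]))
```

Because the environment is plain data in the repository, adding a build dependency becomes a reviewed change to that definition rather than a request to an operations team.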

We began to see further adoption and added new projects, but we still ran into capacity issues and problems maintaining the bare-metal hardware because of power outages, network failures, CPU fans overheating, switch outages, etc.

The journey continued across the pond

At that point, we knew we had to move to the cloud. Now, rather than running ephemeral Docker containers on our own hardware, we ran them on ephemeral Amazon EC2 instances. We kept the ephemeral Docker containers, but we eliminated the capacity ceiling and could scale with incoming demand.
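Scaling with demand usually comes down to a small sizing rule. The following Python sketch is an illustration only: the function name, the jobs-per-runner ratio, and the limits are assumptions, not figures from our setup. It derives how many ephemeral build machines to run from the number of queued jobs.

```python
import math


def desired_runners(queued_jobs: int, jobs_per_runner: int = 2,
                    min_runners: int = 0, max_runners: int = 20) -> int:
    """Number of ephemeral build machines needed for the current queue."""
    if jobs_per_runner < 1:
        raise ValueError("jobs_per_runner must be at least 1")
    needed = math.ceil(queued_jobs / jobs_per_runner)
    # Clamp to the configured floor and ceiling (assumed cost limits).
    return max(min_runners, min(max_runners, needed))


for queued in (0, 5, 100):
    print(queued, "queued ->", desired_runners(queued), "runners")
```

With a floor of zero, idle periods cost nothing, while the ceiling caps spending during the busiest hours.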

When we moved to the cloud, we had about 50 or 60 developers using the newly set-up tool set, so we added new projects, especially larger projects. Then I was invited to move to England, where the company's headquarters is located, to continue the transformation there.

Around this time, we continued an ongoing effort to bring infotainment on board as our largest project yet. We looked at some of the key indicators in the infotainment developer environment, and we knew we had a lot of room for improvement. The main problem was that our feedback loops took four to six weeks. Can you imagine writing code today and six weeks from now being told whether it works or is broken?

Infotainment also had a significantly higher number of contributors: up to 1,000. And we noticed that commit activity didn't come steadily, but rather in bursts. We discovered that our highest commit days were Thursdays, which became the days we were always short of available approvers.

We also had a tremendous amount of complexity in the build process of our Linux distribution. And there were only three people in the world who knew how our build system worked end to end.

Solving the problems

Ultimately, we cut the time it took to get feedback on a new feature from four to six weeks to 30 minutes by automating the build process and queuing Linux recipe changes during the large bursts.
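One common way to absorb bursts like these is to queue incoming changes and hand them to the builder in batches. The Python sketch below is a hypothetical illustration of that pattern, not a description of our real pipeline: the `ChangeQueue` class, the batch size, and the change identifiers are all assumed.

```python
from collections import deque


class ChangeQueue:
    """FIFO queue that hands pending changes to the builder in batches."""

    def __init__(self, max_batch: int = 10):
        self.max_batch = max_batch      # assumed limit on changes per build
        self._pending = deque()

    def submit(self, change_id: str) -> None:
        """Record a change that is waiting to be built."""
        self._pending.append(change_id)

    def next_batch(self) -> list:
        """Pop up to max_batch changes, oldest first, for a single build."""
        batch = []
        while self._pending and len(batch) < self.max_batch:
            batch.append(self._pending.popleft())
        return batch


queue = ChangeQueue(max_batch=2)
for change in ("recipe-a", "recipe-b", "recipe-c"):
    queue.submit(change)
print(queue.next_batch())   # first build takes the two oldest changes
print(queue.next_batch())   # the remaining change goes into the next build
```

Batching trades a little isolation for throughput: if a batch fails, its changes still have to be bisected or rebuilt individually.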

We also refactored the build system and adopted a standard so that everyone, not just three people, knew how it worked. The build system was now intended to be extremely simple, so simple that even JavaScript developers who didn't know Linux or Linux builds could add their applications to it in just 30 minutes.

By simplifying things, we were able to increase throughput.

Since then, we've delivered an infotainment system for nine different vehicles; each vehicle uses the same infotainment system and Linux distribution. With this platform, we were able to incorporate the company's first continuous deployment with software over the air, and we're now able to go from build to delivery in an hour, rather than six weeks.

Lessons learned

I've come to realize that dev and ops aren't always so sexy, and that transforming or causing a change within an enterprise takes hard work. There are some qualities that I've come to respect because of how much hard work is involved: inspiration, persistence, and an attitude of continuous improvement.

Without these qualities, creating change in a traditional organization, such as Jaguar Land Rover, is very difficult. And I no longer underestimate the effort behind being the change agent.

Here are some other lessons learned over the last two years that I want to share.

  • At first I didn't understand the difference between a true strategy and a set of objectives. But, ultimately, a true strategy is what can make you more competitive in the marketplace. It's not just a vision statement.
  • If you're doing principle-based software delivery, it means listening to uncomfortable and conflicting opinions.
  • If you think at your core that something is right, fight for it.
  • Democracy isn't always the best approach to making technical decisions. In some cases, taking a strong stand is the healthiest decision, even if it's not the most favored by the group.
  • Articulating the "why" can be challenging without having the backup of similar, like-minded individuals.
  • Lead with focus, positivity, and transparency.
  • The pursuit of a blameless culture, and the psychological safety that implies, can create an environment where no one is afraid to point out mistakes or to think out loud about what could go wrong.

The bottom line: Invest in your process as you do in your product, and you will achieve higher quality, faster.

This is just a brief summary of our multi-year journey. To learn more about how we accelerated software development and what we're working on now, drop in on our session, "DevOps and Jaguar Land Rover," at DevOps Enterprise Summit Las Vegas, which runs October 22–24, 2018. TechBeacon readers can get $300 off registration using code DEVOPS300.
