How to put your CI/CD data to work

A continuous integration/continuous delivery (CI/CD) environment, especially one at scale, generates a large amount of information about builds, versions, machines, storage, tests, promotions, dependencies, and so on. You need to collect and classify this information for traceability and audit purposes. But you can also use this data to drive your pipelines.

At one client, a large financial services firm, we gathered information and meta-information around the raw facts produced by our CI/CD chains, and we have used it to provide more feedback to all users, going well beyond developers and testers. By using a unified point of collection for all our data, we have been able to extract intelligence about the firm's systems.

It's pouring information all day long

During a presentation at All Day DevOps last year, I talked about how we went from zero jobs to more than 10,000 in our Jenkins-based environment. Today we are approaching 15,000. Those jobs, which run 24 hours a day, six days a week, do such things as:

  • Create tags in source control

  • Upload artifacts to our Artifactory instances

  • Trigger other jobs

  • Trigger other projects

  • Deploy applications in different environments

  • Run some tests and collect the results

  • Scan the code for anomalies and security flaws

  • Measure performance

All of this generates a lot of information from many different tools and applications.

Seeing is believing

Early in the process of setting up the CD chain, we decided to store this information for traceability and audit purposes. While we managed some measurements using specific tools for code quality, we have been using the open-source Ontrack application for everything else.

Using concepts such as builds, validation stamps, and promotion levels, we have been able to collect and visualize the quality of application versions at a glance. For example, for a given version, were all automated tests successful? How has this evolved over the last few days?
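As a rough illustration of how these three concepts can fit together, here is a minimal Python sketch. It does not use Ontrack's API. The class names, the stamp names (unit-tests, smoke, perf), and the promotion name BRONZE are all hypothetical, and the rule that a promotion level is reached when all of its required stamps have passed is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Build:
    """One application version, with the latest result of each validation stamp."""
    name: str
    # Hypothetical stamp name -> True if the latest run of that stamp passed.
    validations: dict[str, bool] = field(default_factory=dict)


@dataclass(frozen=True)
class PromotionLevel:
    """A named quality gate and the stamps it requires (assumed rule)."""
    name: str
    required_stamps: frozenset[str]

    def is_reached_by(self, build: Build) -> bool:
        # A stamp that was never run counts as not passed.
        return all(build.validations.get(s, False) for s in self.required_stamps)


def all_validations_passed(build: Build) -> bool:
    """Answer 'were all automated tests successful?' for one build."""
    return bool(build.validations) and all(build.validations.values())


bronze = PromotionLevel("BRONZE", frozenset({"unit-tests", "smoke"}))
build_1 = Build("1.0.1", {"unit-tests": True, "smoke": True, "perf": False})
build_2 = Build("1.0.2", {"unit-tests": True, "smoke": True, "perf": True})

print(bronze.is_reached_by(build_1))    # True
print(all_validations_passed(build_1))  # False
print(all_validations_passed(build_2))  # True
```

Keeping stamps as raw facts and promotion levels as rules computed from them means the same data can answer both the detailed question (which test failed?) and the summary one (is this version good enough for the next stage?).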

By linking the builds to their source control management information (we use both Git and Subversion) and their associated tickets, we can answer questions such as: In which version was a given ticket resolved, and what was the quality of the tests and code for that version?
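The ticket-to-version lookup can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Ontrack implements it: the build list, the commit messages, and the PRJ-nn ticket keys are invented, and the sketch assumes that ticket keys appear in commit messages and that the earliest build mentioning a ticket is the one of interest.

```python
import re
from typing import Optional

# Assumed ticket key format: uppercase project key, hyphen, number (e.g. PRJ-12).
TICKET_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")

# Hypothetical data: builds in chronological order, each with the commit
# messages added since the previous build.
BUILDS = [
    ("1.0.0", ["PRJ-10 Initial import"]),
    ("1.0.1", ["PRJ-12 Fix rounding error", "Update README"]),
    ("1.0.2", ["PRJ-12 Add regression test", "PRJ-15 New export format"]),
]


def tickets_in(messages: list[str]) -> set[str]:
    """Collect every ticket key mentioned in a list of commit messages."""
    return {key for message in messages for key in TICKET_PATTERN.findall(message)}


def first_build_containing(ticket: str, builds: list[tuple[str, list[str]]]) -> Optional[str]:
    """Return the earliest build whose commits mention the ticket, or None."""
    for name, messages in builds:
        if ticket in tickets_in(messages):
            return name
    return None


print(first_build_containing("PRJ-12", BUILDS))  # 1.0.1
print(first_build_containing("PRJ-15", BUILDS))  # 1.0.2
print(first_build_containing("PRJ-99", BUILDS))  # None
```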

Sources for even more information

We follow the same principle for all sources of information in our CD ecosystem: We link the tools to Ontrack and automatically collect meta-information to attach to the projects, branches, and builds. For example, we pull build information from Artifactory to derive the dependencies between applications, and we collect virtual machine information for QA deployments.

This structured and centralized information allows us to use Ontrack as a single point of knowledge about the CD state. We have been using this information to generate KPI reports and TV wallboards for development teams, with dedicated and live information. These cover not only the state of their pipelines (build and test failures) but also the pipelines' overall health, the obsolescence of their dependencies, their security flaws, and so on.

While most of the information is gathered automatically, we also allow project managers, team leaders, and QA staff to add manual information to the system. This complements the automatically collected data and provides even better feedback to the teams.

Automated feedback

Finally, we can use this large amount of information to drive new events in the CD chain. For example, an application A being promoted to a given level will trigger a backward compatibility test for application B. This provides a high level of interaction in the CD chain, but without the complexity of a classical, job/pipeline-based implementation.
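One way to picture this event-driven style is a small publish/subscribe sketch in Python. It is an assumption-laden illustration rather than Ontrack's actual mechanism: the PromotionBus class, the SILVER and BRONZE level names, and the version number are hypothetical, and the handler only records the request where a real setup would trigger a CI job.

```python
from collections import defaultdict
from typing import Callable

# A handler receives the promoted application and its version.
Handler = Callable[[str, str], None]


class PromotionBus:
    """Dispatch promotion events to the handlers registered for them."""

    def __init__(self) -> None:
        self._handlers: dict[tuple[str, str], list[Handler]] = defaultdict(list)

    def on_promotion(self, application: str, level: str, handler: Handler) -> None:
        """Register a handler for 'application promoted to level'."""
        self._handlers[(application, level)].append(handler)

    def promote(self, application: str, version: str, level: str) -> None:
        """Record a promotion and notify every matching handler."""
        for handler in self._handlers.get((application, level), []):
            handler(application, version)


triggered: list[str] = []


def run_b_compatibility_test(application: str, version: str) -> None:
    # Placeholder: a real handler would start a job; this one records the request.
    triggered.append(f"B backward-compatibility test against {application} {version}")


bus = PromotionBus()
bus.on_promotion("A", "SILVER", run_b_compatibility_test)

bus.promote("A", "2.3.0", "BRONZE")  # no handler registered for this level
bus.promote("A", "2.3.0", "SILVER")  # triggers the test for application B

print(triggered)  # ['B backward-compatibility test against A 2.3.0']
```

The pipeline for application A knows nothing about application B here; the dependency lives in the subscription, which is what keeps the individual jobs simple.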

Integration

The Ontrack application natively speaks to and receives information from many different tools, including Git/Subversion, JIRA, Jenkins, Artifactory, and InfluxDB. Its extension system has allowed us to extend the communication channels to VMware, Foreman, HP Fortify, and SonarQube.

In turn, our own tools can send information to Ontrack or collect information from it, using a Jenkins plugin, a Groovy-based DSL, a REST HTTP API, or a GraphQL schema.

This makes it easy to integrate the CD information with all our other systems.

Want to know more? Come to my presentation during the All Day DevOps 2017 online conference, where I’ll talk in detail about Ontrack and how we use it to gather intelligence about our continuous delivery ecosystem. Registration is free for this 24-hour event, and you can also watch my talk (or any of 100 others) on YouTube after the event concludes.
