Big data analytics isn't just for marketing. Here's how to use dark data from your own development processes to dramatically reduce cycle times and accelerate application delivery.

Use big data analytics to blow away application delivery deadlines

People associate big data analytics with market segmentation and ad targeting, but you can also use it to turbocharge your own software development process.

With traditional development models, development cycles and corresponding test cycles were quite long. Gradually, manual testing generated quantities of data that could be managed with relational databases. However, as agile methods continued to infiltrate every aspect of software development, cycle times shortened, with everything tending towards the continuous: integration, testing, delivery, and so on.

These continuous processes generate data that, according to Gartner, meets the definition of big data. Most of that is "dark data" that never gets analyzed or put to use. But it does hold a potential goldmine of information that you can use to help drive decision making at many levels. For example, an analysis of the data might help you strike the balance between costly, exhaustive testing that lowers risk through better code coverage and a more focused, cost-effective testing process suited to agile application delivery.

Gathering your data

There are several sources of data you can collect as an application evolves from design to production. Developers start off with a set of requirements that are translated into a design. Both requirements and the design can change at different stages of the application life cycle, so the requirements, design, and changes to them all constitute sources of data that you should capture.

Next, the design gets coded. How many developers participated in the coding? How long did it take? You can use this data to correlate a design with the effort required to code it. What parts of the code were touched? What output do you have in your software configuration management system and build logs? That's more data for your analysis.

Quality assurance (QA) engineers test and enter defects, take screenshots, and record videos. You can use that data to correlate coding effort with the number of defects. QA staff then provide their feedback for another iteration of coding. Each time the cycle repeats, the development team generates new data. Finally, the mature application is released into production, where users interact with it.
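As a rough illustration of correlating coding effort with defect counts, here is a minimal Python sketch. The records, the field meanings (developer-days per feature, defects logged per feature), and the choice of the Pearson coefficient are all assumptions made for the example; a real pipeline would pull these numbers from your own tracking and defect systems.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical records, one per feature:
# (developer-days spent coding, defects logged by QA).
features = [(3, 2), (8, 5), (5, 4), (13, 9), (2, 1)]
effort = [days for days, _ in features]
defects = [count for _, count in features]

print(f"effort vs. defects correlation: {pearson(effort, defects):.2f}")
```

A coefficient near 1 would suggest that defects rise with effort. A value near 0 would suggest that effort alone is a poor predictor and that other factors, such as which parts of the code were touched, deserve a look.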

This is the point where businesses apply traditional big data analytics to analyze user behavior, but you can use all the data created by designers, developers, and QA teams before the application is ever released to drive your own decision making. The data is there for the taking. Too often, however, that data is lost to the dark reaches of a database without any analysis.

Using big data to your advantage

Here's how my own organization used dark data to find the right balance between exhaustive and agile testing to optimize application performance. When we transitioned to continuous delivery, we started getting changes delivered to production daily. To achieve this velocity, we had to optimize development cycles and make them much shorter. But how could we do that without compromising quality?

We decided to analyze performance data across multiple builds to see how it correlated with different areas of the code. We could then identify areas of code that needed more attention. We moved testing of those areas from an exhaustive regression testing process at night to ongoing tests run on each code commit. In this way, we were able to identify and address performance bottlenecks sooner in the development process and attain our goal of shortening the overall development cycle.
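The kind of analysis described above might look something like the following Python sketch. It is not the tooling we used. The build records, the area names, and the latency metric are hypothetical, and attributing each build-to-build change to every area touched in that build is a deliberately crude heuristic.

```python
from collections import defaultdict


def rank_areas_by_regression(builds):
    """Rank code areas by the average latency change of the builds that touched them.

    `builds` is a chronological list of (touched_areas, latency_ms) tuples.
    Areas with the largest average increase come first.
    """
    deltas = defaultdict(list)
    # Compare each build with its predecessor and credit the change
    # to every area touched in the newer build.
    for (_, prev_latency), (areas, latency) in zip(builds, builds[1:]):
        change = latency - prev_latency
        for area in areas:
            deltas[area].append(change)
    averages = {area: sum(changes) / len(changes) for area, changes in deltas.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Hypothetical history: which areas each build touched and the measured latency.
history = [
    ({"checkout"}, 100),
    ({"search"}, 120),
    ({"checkout"}, 118),
    ({"search", "catalog"}, 140),
]

for area, avg_change in rank_areas_by_regression(history):
    print(f"{area}: {avg_change:+.1f} ms on average")
```

Areas at the top of the ranking are candidates for moving out of the nightly regression run and into the tests executed on each commit.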

Big data analysis is a disruptive opportunity you can use to rethink how you work across the application life cycle, and it can be applied to every stage of the software delivery pipeline. Everyone involved in the process generates data you can use. Whether it's developers fixing bugs or QA engineers reporting them, you can use this data to help you make better business decisions. In the future, smart systems may even work autonomously, learning from past decisions and suggesting optimizations based on historical data. One could imagine a continuous integration server that correlates tests with the code and runs only the tests relevant to code that has changed.
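The test-selection idea can be sketched in a few lines of Python. This assumes you already have a map from each test to the source files it exercises (for instance, derived from per-test coverage data). The map, the file names, and the test names below are all hypothetical.

```python
def select_tests(coverage_map, changed_files):
    """Return the tests whose covered files overlap the changed files.

    `coverage_map` maps a test name to the source files that test exercises.
    """
    changed = set(changed_files)
    return sorted(
        test for test, files in coverage_map.items() if changed & set(files)
    )


# Hypothetical coverage map and commit contents.
coverage_map = {
    "test_login": ["auth.py", "session.py"],
    "test_checkout": ["cart.py", "payment.py"],
    "test_profile": ["auth.py", "profile.py"],
}
changed_files = ["auth.py"]

print(select_tests(coverage_map, changed_files))
```

A map like this goes stale as the code changes, so a practical setup would likely still run the full suite periodically as a safety net and use that run to refresh the map.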

You can change everything by applying big data analysis to the wealth of dark data in the development life cycle. The way development teams have worked for the past 20 years fundamentally changes when you combine the power of machine learning with the goldmine of data at your fingertips.

Whether you look at build logs, analyze source code, evaluate defect histories, or assess vulnerability patterns, you can open a new chapter in application delivery fueled by big data analytics. Potentially, developers will check in code, and without any manual intervention the system will rapidly execute only the relevant tests. Code changes will be automatically flagged as high risk based on history and the relevant business impact of the changes, in the process triggering additional regression tests for execution. Step into the future, where machine learning and big data analytics help you build and deliver software faster and better than you ever could before.
