Key requirements for a successful DataOps initiative

David Weldon, Freelance Writer and Research Analyst

Most large- and medium-size organizations are engaged in some level of digital transformation. It is vital for those organizations to be able to work with data at scale and respond to real-world events as they happen.

That can be greatly aided by DataOps practices, which have emerged in the past couple of years as a way for organizations to improve data quality and customer experiences.

DataOps is similar to DevOps, but instead of focusing on IT operations and software development teams, it is geared toward data analysts, data developers, and data scientists.

For most organizations, DataOps involves a cultural change to focus more on collaboration and service delivery using lean practices. It also focuses on data operations, intelligent systems, and advanced analytics.

Many firms haven't yet embraced DataOps, are unclear what DataOps really is, don't understand how it relates to their data management practices, and can't predict what benefits they can expect from it. 

Here's why you should learn more about it—and get started.


DataOps means better data models

DataOps began just a few years ago as a set of best practices, but it has since matured into a new approach to data analytics at leading organizations.

Joshua Shackelford, senior manager of IS operations at wine producer E. & J. Gallo Winery, said DataOps lets analysts, programmers, scientists, engineers, and others track versions of their data, models, or code. They are able to version, test, package, deploy, compare, and monitor data.

"When we talk about DataOps we are referring to the pipeline that data models and insights should flow through in order to deliver reliable models for enterprise consumption."
—Joshua Shackelford
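Those pipeline stages can be illustrated with a minimal sketch in Python. The field names and checks below are invented for illustration, not Gallo's actual practice: a dataset is fingerprinted so every change produces a new version id, and a simple quality test gates it before it moves downstream.

```python
import hashlib
import json

def data_version(rows):
    """Fingerprint a dataset so any change produces a new version id."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]

def validate(rows):
    """Minimal quality gate: required fields present, values in range."""
    errors = []
    for i, row in enumerate(rows):
        if row.get("vintage") is None:
            errors.append(f"row {i}: missing vintage")
        elif not 1900 <= row["vintage"] <= 2100:
            errors.append(f"row {i}: vintage out of range")
    return errors

rows = [{"sku": "A1", "vintage": 2019}, {"sku": "B2", "vintage": 2021}]
version = data_version(rows)
issues = validate(rows)
print(version, "clean" if not issues else issues)
```

In practice teams reach for purpose-built tools for this, but the idea is the same: a version id ties every model and insight back to the exact data it came from, and the gate keeps bad data out of the pipeline.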

DataOps is often conducted by different teams, which may tackle different stages of the process. Individuals involved should understand data modeling techniques, teams should learn from each other, and everyone should document how they arrived at successes.

"DataOps practices should be iterative," Shackelford said. "Each team will go through a maturity model." For instance, early cases might not have rigorous monitoring or live A/B testing.

"Any team that plans on taking a model to production should define their DataOps pipeline practices."
—Joshua Shackelford

The correct way to lead a DataOps effort

When an organization decides the time is right to invest in DataOps, who will lead the effort depends on its organizational structure.

Typically, a chief data officer (CDO) would own this process, but an organization that already has a CDO probably also has a DataOps practice at some level of maturity, Shackelford said.

Some companies might start DataOps as a community of practice that spans multiple departments, while others would centralize the effort within their data analytics or data science team. "At Gallo, we are working toward federated data access, so each team will be at a different maturity level," he said.

Convincing managers that they should put an employee on a DataOps team should not be difficult, he said.

"DataOps should be a grass-roots initiative. Team members should be able to highlight the value in streamlining their data practices."
—Joshua Shackelford

If an employee tells their manager that they want to automate the deployment of data or set up tests against their models, Shackelford said, it would be hard for any manager to say no.

The employee should start small and iterate on the value. Shackelford's advice: Don't pitch an entirely new toolset, because it will take much longer to see a return.

As noted, a strong understanding of data modeling techniques is important for DataOps success.

"Anyone that wants to push their DataOps practices forward needs to have an understanding of what it means to have a data model in production."
—Joshua Shackelford

In other words, people need to know what it takes to build a model or how to move data. Next, they need to merge versioning and testing knowledge into their build practices.
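What merging versioning and testing into a build practice can look like is sketched below. The registry, version names, and scores are hypothetical: a candidate model version is promoted to production only if it passes a comparison against the current baseline.

```python
# Hypothetical promotion gate: a new model version reaches "production"
# only when it beats the current baseline by a required margin.
def promote(candidate_score, baseline_score, min_gain=0.0):
    """Return True if the candidate model should replace the baseline."""
    return candidate_score >= baseline_score + min_gain

# A toy model registry mapping the production slot to a version and score.
registry = {"prod": {"version": "v12", "score": 0.81}}

candidate = {"version": "v13", "score": 0.84}
if promote(candidate["score"], registry["prod"]["score"], min_gain=0.01):
    registry["prod"] = candidate

print(registry["prod"]["version"])
```

The same gate fits naturally into a CI job: the build fails, and the old version stays in production, whenever the candidate does not clear the bar.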


The right time to launch a DataOps initiative

Most organizations adopting DataOps practices seem to do so in one of two ways. For newer firms, that often means doing it early in the creation of the organization so that it becomes part of the corporate culture, said Daniel Skidmore, senior director of DataOps at Overstock.com.

For more mature teams, DataOps can make sense when the urgency of rapid deployments and agility outweighs the cost of changing the existing monolithic infrastructure.

In terms of managing a DataOps initiative, Skidmore said there should be one central figure, ideally with authority over both development and operations resources. The farther-reaching the changes to processes will be, the higher up in the organization the individual should be.

For those wanting to join the effort, the best way to do so is to educate management on the specific challenges that the organization is encountering and how DataOps can help resolve them.

"For example, if deployment time is an issue, explain how pipelines can streamline and improve the situation."
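One way to make that pitch concrete is to show deployment as ordered, timed stages, so it is obvious where the time goes. The step names and data here are hypothetical, a minimal sketch rather than a real pipeline:

```python
import time

def extract():
    # Stand-in for pulling rows from a source system.
    return [1, 2, 3]

def transform(data):
    # Stand-in for cleaning or reshaping the data.
    return [x * 10 for x in data]

def load(data):
    # Stand-in for writing to the target; returns the row count.
    return len(data)

def run_pipeline(steps):
    """Run steps in order, timing each so slow stages are visible."""
    data, timings = None, {}
    for step in steps:
        start = time.perf_counter()
        data = step() if data is None else step(data)
        timings[step.__name__] = time.perf_counter() - start
    return data, timings

result, timings = run_pipeline([extract, transform, load])
print(result, sorted(timings))
```

Even a toy like this makes the argument Skidmore describes: once the stages are explicit and measured, streamlining a slow one becomes a visible, defensible win.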
—Daniel Skidmore

Top skills that go into DataOps success

Skidmore agreed with Shackelford that strong knowledge of data modeling techniques is important for DataOps success. Other critical skills include data integrity and data-testing best practices, along with the ability to navigate data management systems and other kinds of software.

Other important core competencies, Skidmore said, include knowledge of logical and physical data models, data-loading best practices, orchestration processes, and containers.

As for technical skills, Skidmore said specific software experience can include Docker, Jenkins for pipelines, and Git for version control. This is on top of the typical data management software such as databases and streaming software such as Kafka.

Shackelford offered a similar list, noting that most DataOps practices revolve around Python skills. But understanding R, Lambda, Git, Jenkins, Anaconda, and other containers would be "a big boost to standardizing and automating the pipeline," he said. "Other tools will vary depending on which integration and middleware tools were selected by the company."

Executive buy-in is vital 

Since DataOps requires a cultural change to be successful, top-level executive support is key, both Shackelford and Skidmore said. Obtaining buy-in at the rank-and-file technical level is usually the easy part.

"Historically, I have found that with the technical people, if you give them enough time and resources, they are excited to try and experiment with new things. This will be even easier in new, smaller organizations."
—Daniel Skidmore

Shackelford said that executive support is also important because success may not come quickly. "This is a newer practice. Applying automation, testing, and monitoring to data will take time to learn." The concept of failing quickly needs to be supported at the highest levels, he said.

Also keep in mind, Shackelford said, that it's okay to switch languages, tools, or practices as you go.

"The person needs to experiment and report out on their learning. The key to success in a data world is who can learn the fastest."
—Joshua Shackelford
