Abstract
Over the past decade, we have witnessed unprecedented technological disruption in application and data functions, which now enable enterprise-wide digital transformation initiatives that empower customers and colleagues. This evolution, coupled with an avalanche of data, creates a tremendous opportunity to establish data-driven enterprises, while simultaneously posing challenges around managing, governing, monitoring, and improving the value of data. Enterprises need to re-imagine how they deliver data and analytics programs, as current processes often fall short of meeting the rapidly evolving demands of the business. A DataOps framework acts as an antidote to these data-related challenges, enabling a shift towards agile and reliable delivery of data and analytics programs. This paper discusses the significance of DataOps and how to leverage it to alleviate problems faced in data analytics programs. It also provides a framework for implementing DataOps successfully in enterprises.
Challenges abound in the data value chain
A typical data value chain comprises the following stages: Acquire, Process, Publish, Consume, and Act. Challenges arise at every stage of this value chain (Ref1).
Figure 1 illustrates several key statistics on enterprise data value chains (Ref2).
Figure 1: Enterprise data value chain statistics
Six steps to leveraging DataOps to mitigate challenges
DataOps (see Figure 2) brings together the best software engineering and data engineering tools and methodologies, coupled with cultural changes and monitoring controls, to create trust in data and accelerate analytics delivery. It puts analytics at the heart of an enterprise. DataOps acts as a bridge between data providers and consumers by facilitating bidirectional communication flow to improve the quality of the data value chain.
DataOps combines Agile, DevOps, and statistical process controls to improve delivery efficiency and increase the value extracted from the data value chain.
Figure 2: DataOps architecture
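To make the statistical process control element concrete, here is a minimal sketch (not from the paper) that applies three-sigma control limits to a pipeline metric such as daily row counts; the metric and values are hypothetical.

```python
"""Minimal sketch of statistical process control (SPC) applied to a data
pipeline metric. Assumes daily row counts are available as a list; any
value outside mean +/- 3 standard deviations is flagged for review."""
from statistics import mean, stdev

def control_limits(history, sigmas=3.0):
    """Return (lower, upper) control limits from historical observations."""
    mu, sd = mean(history), stdev(history)
    return mu - sigmas * sd, mu + sigmas * sd

def check_metric(history, latest):
    """Flag the latest observation if it falls outside the control limits."""
    lower, upper = control_limits(history)
    return {"latest": latest, "lower": lower, "upper": upper,
            "in_control": lower <= latest <= upper}

if __name__ == "__main__":
    # Hypothetical daily row counts from previous pipeline runs
    daily_row_counts = [10_120, 9_980, 10_050, 10_210, 9_940, 10_080, 10_150]
    print(check_metric(daily_row_counts, latest=7_300))  # out of control -> investigate
```

The same pattern extends to any metric the DataOps function chooses to track, such as data quality pass rates or pipeline run times.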
Let’s delve into what it takes to implement a successful enterprise-wide DataOps strategy (Ref3). The six key steps are as follows:
#1 Establish the DataOps function with a culture of deeper collaboration
It is critical to establish the DataOps function with senior enterprise stakeholders, with representation from both business and IT. Define the operating model, establish KPIs across the data value chain that are pertinent to the DataOps function and to the enterprise as a whole, track them throughout the DataOps implementation, and continuously refine them to further increase the value of data.
It is important to establish an enterprise-focused strategy. Key stakeholders, such as the Chief Information Officer (CIO), Chief Technology Officer (CTO), Chief Data Officer (CDO), Chief Digital Officer, Chief Analytics Officer (CAO), Chief Data Architect, Chief Data Scientist and Head of Business Functions and Finance representatives, must be included.
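One hedged way to make such KPIs trackable is to capture them as versioned, machine-readable definitions; the KPI names and targets in the sketch below are purely illustrative.

```python
"""Hedged sketch: representing data value chain KPIs as versioned,
machine-readable definitions. KPI names and targets are hypothetical."""
from dataclasses import dataclass

@dataclass
class Kpi:
    name: str        # e.g. "time_to_publish_hours"
    stage: str       # Acquire / Process / Publish / Consume / Act
    target: float    # agreed target value
    direction: str   # "lower_is_better" or "higher_is_better"

    def met(self, observed: float) -> bool:
        """Check whether the observed value meets the target."""
        if self.direction == "lower_is_better":
            return observed <= self.target
        return observed >= self.target

KPIS = [
    Kpi("time_to_publish_hours", "Publish", target=24, direction="lower_is_better"),
    Kpi("data_quality_pass_rate", "Process", target=0.98, direction="higher_is_better"),
]

if __name__ == "__main__":
    print(KPIS[0].met(18), KPIS[1].met(0.95))  # True, False
```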
#2 Leverage/set up Enterprise-level Agile and DevOps capabilities
Most modern enterprises have either built or are in the process of building Agile and DevOps capabilities. Data & Analytics teams should therefore join forces with these teams and leverage the enterprise’s Agile and DevOps capabilities in their own delivery.
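For example, data quality checks can run in the same CI pipelines as application code; the pytest-style sketch below is a hedged illustration in which the fixture file and column names are hypothetical.

```python
"""Hedged sketch: automated data checks that can run in an existing CI
pipeline (e.g. via pytest). The fixture file and column names are hypothetical."""
import csv

def load_rows(path):
    """Load a small CSV sample published by the pipeline for testing."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def test_customer_ids_are_unique():
    rows = load_rows("data/customers_sample.csv")   # hypothetical test fixture
    ids = [r["customer_id"] for r in rows]
    assert len(ids) == len(set(ids)), "Duplicate customer_id values found"

def test_no_missing_email():
    rows = load_rows("data/customers_sample.csv")
    assert all(r["email"].strip() for r in rows), "Blank email values found"
```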
#3 Automate the provisioning of data, analytics, and AI infrastructure
One critical principle of DataOps is the ability to scale IT infrastructure in an agile manner to meet rapidly evolving business requirements. Many commercial and open-source tools are available to automate infrastructure. Regardless of the hosting environment (cloud, on-premise, or hybrid), enterprises should rely on infrastructure as code to set up, configure, and scale Data & Analytics platform services. Version-control this code just as you would application or analytics code, and automate security and compliance requirements as well.
Examples of data infrastructure automation include provisioning storage and compute as code, automated configuration of platform services, and automated enforcement of security and compliance controls.
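As a hedged illustration, the Python sketch below provisions and secures a storage bucket for a raw data layer using boto3; the bucket name and tags are hypothetical, and in practice a declarative infrastructure-as-code tool such as Terraform, Pulumi, or CloudFormation, kept under version control, would typically play this role.

```python
"""Hedged sketch: scripted provisioning of a storage bucket for a raw data
layer on AWS with boto3. Names and tags are hypothetical; declarative IaC
kept under version control is the more typical choice."""
import boto3

def provision_raw_layer(bucket_name: str, region: str = "eu-west-1"):
    s3 = boto3.client("s3", region_name=region)
    # Create the bucket for the raw data layer
    s3.create_bucket(
        Bucket=bucket_name,
        CreateBucketConfiguration={"LocationConstraint": region},
    )
    # Tag the bucket so the platform layer and owner are discoverable
    s3.put_bucket_tagging(
        Bucket=bucket_name,
        Tagging={"TagSet": [
            {"Key": "data-layer", "Value": "raw"},
            {"Key": "owner", "Value": "dataops"},
        ]},
    )
    # Block public access by default to satisfy security requirements
    s3.put_public_access_block(
        Bucket=bucket_name,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )

if __name__ == "__main__":
    provision_raw_layer("example-enterprise-raw-zone")  # hypothetical bucket name
```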
#4 Establish multi-layered data architecture to support a variety of analytical needs
Modern data platforms are complex and must serve varied needs, so it is important to design your data platform in alignment with business priorities to support myriad data processing and consumption patterns. One proven design pattern is a multi-layered architecture (raw, enriched, reporting, analytics, sandbox, etc.), with each layer serving a different purpose and data increasing in value as it moves through the layers.
It is also important to establish owners for the different layers. Register data assets across the layers to support enterprise data discovery initiatives, and set up data quality controls at each layer to create assurance and trust in the data. Put appropriate access controls in place so that data providers and consumers can safely share and access data and insights. Containerize these capabilities so they can be scaled and reused across analytical engagements.
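A minimal sketch of this layering, with illustrative layer names and owners, might register the layers and enforce a simple rule that data is promoted to the next layer only when its quality checks pass.

```python
"""Hedged sketch: a multi-layered data architecture expressed as a simple
registry, with a promotion rule that requires quality checks to pass before
a dataset moves to the next layer. Layer names and owners are illustrative."""
from typing import Optional

LAYERS = ["raw", "enriched", "reporting", "analytics"]

LAYER_OWNERS = {
    "raw": "ingestion-team",
    "enriched": "data-engineering",
    "reporting": "bi-team",
    "analytics": "data-science",
}

def next_layer(current: str) -> Optional[str]:
    """Return the next layer in the chain, or None if already at the end."""
    idx = LAYERS.index(current)
    return LAYERS[idx + 1] if idx + 1 < len(LAYERS) else None

def promote(dataset: str, current: str, quality_checks_passed: bool) -> str:
    """Promote a dataset to the next layer only if its quality checks pass."""
    target = next_layer(current)
    if target is None:
        raise ValueError(f"{dataset} is already in the final layer")
    if not quality_checks_passed:
        raise ValueError(f"{dataset} failed quality checks; cannot leave {current}")
    return target

if __name__ == "__main__":
    print(promote("customer_events", "raw", quality_checks_passed=True))  # enriched
```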
#5 Build data value chain orchestration pipelines
Orchestration plays a pivotal role in stitching together the data flows from one layer to another to bring “ideas to operationalization.” Leverage containerization capabilities to ensure that the sub-components of these orchestration pipelines are scalable and reusable across the enterprise.
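As one hedged example of such orchestration, the sketch below wires the acquire, process, and publish stages of the value chain into a daily Apache Airflow DAG; the task callables are placeholders for real, typically containerized, pipeline steps.

```python
"""Hedged sketch: orchestrating acquire -> process -> publish stages as a
daily Apache Airflow DAG. Task callables are placeholders for real pipeline
steps (e.g. containerized jobs triggered per stage)."""
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def acquire():   # placeholder: pull data from source systems into the raw layer
    print("acquiring data")

def process():   # placeholder: cleanse and enrich data into the next layer
    print("processing data")

def publish():   # placeholder: publish curated data for consumption
    print("publishing data")

with DAG(
    dag_id="data_value_chain_daily",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    t_acquire = PythonOperator(task_id="acquire", python_callable=acquire)
    t_process = PythonOperator(task_id="process", python_callable=process)
    t_publish = PythonOperator(task_id="publish", python_callable=publish)

    t_acquire >> t_process >> t_publish
```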
DataOps supports pipelines across the entire data value chain, from acquisition and processing through to publication and consumption.
#6 Define and implement a holistic monitoring and alerting framework
Build a comprehensive monitoring and alerting framework to continuously measure how each stage of your data value chain responds to change. Socialize these KPIs with the DataOps function to take the right course of action, and build reusable artifacts where possible.
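A minimal, hedged sketch of such a framework follows: each stage reports a few metrics against thresholds and breaches are routed to an alert channel; the metric names, thresholds, and webhook URL are hypothetical.

```python
"""Hedged sketch: a simple monitoring-and-alerting loop over data value chain
metrics. Thresholds, metric names, and the alert sink are hypothetical."""
import json
import urllib.request

# Hypothetical thresholds per stage metric: ("max"/"min", limit)
THRESHOLDS = {
    "acquire.freshness_minutes": ("max", 60),
    "process.quality_pass_rate": ("min", 0.98),
    "publish.latency_minutes": ("max", 30),
}

def breaches(observed: dict) -> list:
    """Return metrics whose observed values breach their thresholds."""
    out = []
    for metric, (kind, limit) in THRESHOLDS.items():
        value = observed.get(metric)
        if value is None:
            continue
        if (kind == "max" and value > limit) or (kind == "min" and value < limit):
            out.append((metric, value, limit))
    return out

def alert(breached, webhook_url="https://example.invalid/alerts"):  # hypothetical sink
    """Post breached metrics to an alert channel monitored by the DataOps function."""
    payload = json.dumps({"breaches": breached}).encode()
    req = urllib.request.Request(webhook_url, data=payload,
                                 headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req)

if __name__ == "__main__":
    observed = {"acquire.freshness_minutes": 95, "process.quality_pass_rate": 0.99}
    print(breaches(observed))  # [('acquire.freshness_minutes', 95, 60)]
```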
Benefits of DataOps
DataOps is the future of data management
Given the rapid and constant changes in data, enterprises need a comprehensive solution to bring together every part of a business into one pipeline. That’s what DataOps enables. It drives companies to use data more efficiently, leveraging the right tools, technologies, and skill-sets.
With better end-to-end data pipeline visibility, automated orchestration, higher quality, and faster cycle times, DataOps enables data analytics groups to better communicate and coordinate their activities. DataOps is the antidote organizations have long needed for their data value chain challenges, and it will become a critical discipline for those who want to thrive in the new-age data landscape.
References
Ravi Varanasi
Partner, Data Analytics and AI, Wipro
Ravi Varanasi has more than two decades of experience in data, analytics, cloud, architecture, innovation, and thought leadership. He brings a mix of experience from major banks and consultancies, working in various capacities across business functions such as pensions, investments, IT operations, global standards, commercial banking, wealth management, payments, and anti-money laundering.
Dilip Maringanti
Partner, Data Analytics and AI, Wipro
Dilip Maringanti has worked with global financial institutions and retailers in setting up their data strategies and leading many data transformation engagements. He specializes in providing strategy, advisory, and architecture services in multi-cloud, data, and AI spaces.