May 13, 2021
Companies need to take time to get governance frameworks, unit economics and integration capabilities right, or they risk paying the price in the long run, according to members of Protocol's Braintrust.
Good afternoon! In this week's Braintrust, we asked the experts about the best ways to craft a data strategy by having them identify the corners that can't be cut when that strategy is being set up. Specifically, we wanted to highlight those areas where allocating more resources and time up front could pay dividends down the line. Questions or comments? Send us a note at email@example.com
CEO at Informatica
As we enter the second wave of digital transformation where we will see rapid acceleration towards a cloud-first data strategy, data governance will remain the one corner that you absolutely cannot cut.
With the proliferation of data across multiple systems, multiple clouds, hybrid infrastructures and increased access to data across the enterprise, comes the greater responsibility to protect and safeguard your data with a strong governance and compliance framework in place. The most important thing companies must invest in as they design their data strategy is establishing a data governance framework that does the following:
- Defines and documents standards and norms, accountability and ownership of data so only the right people have access to the right data.
- Optimizes for trust, privacy and protection of data, particularly sensitive data in highly regulated industries like health care and financial services.
- Adheres to global and regional compliance regulations like GDPR and CCPA.
Data governance is the holy grail of enterprise data strategy and also the foundation to building and maintaining customer trust and loyalty in a digital-first world.
Chief Strategy Officer at DataStax
The essential purpose of an enterprise data strategy must be to keep the company competitive for as long as it hopes to be in business. Google's breakthrough ability to remain competitive by generating multiple novel $1 billion-plus businesses per year is based in part on the "resource economy." This is an internal standard that lets new data products scale up and down with demand, while preserving the unit economics appropriate to the value of those products.
Over a 10-year period, you can assume that the number of data requests from new classes of applications will increase by two orders of magnitude (100x) or more. These patterns will be taken for granted by users and demanded by developers, as machine learning becomes the basis of most software.
How much will serving a data request cost, roughly, per response? What will that cost need to be in a year? Ten years? What is your enterprise's CAGR of original and replicated data? How can you trace these costs to the data products and the apps that consume them?
You don't have to get these answers exactly right. The key is to ask the questions so that the organization learns to think in curves. It is the shock of seeing the future costs that will galvanize the entire company to collaborate on a data strategy.
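As a rough illustration of "thinking in curves," the arithmetic behind these questions can be sketched in a few lines of Python. The 100x-in-10-years figure comes from the text above; every other number (the starting request volume, the cost per request and the yearly decline in unit cost) is a made-up placeholder, and the function names are hypothetical.

```python
def implied_cagr(growth_multiple: float, years: int) -> float:
    """Annual growth rate that compounds to `growth_multiple` over `years`."""
    return growth_multiple ** (1 / years) - 1


def project_costs(requests_now: float, cost_per_request: float,
                  request_cagr: float, unit_cost_decline: float, years: int):
    """Return one (year, requests, unit_cost, total_cost) row per year.

    Requests compound upward at `request_cagr`; the cost of serving one
    request falls by `unit_cost_decline` each year (an assumed rate).
    """
    rows = []
    for year in range(years + 1):
        requests = requests_now * (1 + request_cagr) ** year
        unit_cost = cost_per_request * (1 - unit_cost_decline) ** year
        rows.append((year, requests, unit_cost, requests * unit_cost))
    return rows


if __name__ == "__main__":
    # 100x growth over 10 years implies roughly 58.5% growth per year.
    cagr = implied_cagr(100, 10)
    print(f"Implied request CAGR: {cagr:.1%}")

    # Placeholder inputs: 1 billion requests a year at $0.0001 each,
    # with unit cost assumed to fall 20% a year.
    for year, requests, unit_cost, total in project_costs(
            1e9, 0.0001, cagr, 0.20, 10):
        print(f"year {year:2d}: {requests:.2e} requests, "
              f"${unit_cost:.6f} each, ${total:,.0f} total")
```

Even with unit costs falling steadily in this toy setup, total spend keeps climbing because request volume compounds faster. The exact inputs matter less than seeing the shape of the curve.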
Seth Dobrin, Ph.D.
Global Chief AI Officer at IBM
Businesses are facing many hurdles when it comes to data — how do you bring it all together, how do you make sense of it and how do you make it usable for AI? In fact, a recent survey from IBM shows the proliferation of data across the enterprise has resulted in over two-thirds of global IT professionals drawing from more than 20 different data sources to inform their AI. As businesses develop data and AI strategies, it's essential that they have the tools and processes in place to securely and compliantly connect the right data, to the right users, at the right time, anywhere it's needed. There are three corners not to cut when it comes to designing an enterprise data strategy:
- Tackle data integration: A data fabric can help break down silos and automate the integration of data to ensure it is viewable and usable by all. The data can be processed, managed and stored as it moves within the data fabric so business users have a single point of access to find, understand, shape and utilize data throughout the organization.
- Bring AI to your data: For businesses today, the ability to train and run AI on data anywhere is essential. New capabilities like federated learning will help businesses apply machine learning techniques to situations where data cannot or should not be moved due to reasons such as data privacy, secrecy, regulatory compliance or simply the size of data involved.
- Data must be protected and secure: Companies must develop their data and AI strategies with security built in from the start, then remain vigilant about ensuring their systems are protected during production and implementation.
CEO at Talend
Sound business decisions demand a robust enterprise data strategy. At Talend, we believe data quality, and in a broader sense data health, is vital to a successful enterprise data strategy.
Data management focuses on the mechanics of moving and storing more data rather than maintaining the health of data. In trying to manage data, companies' inability to ensure data is clean, complete and compliant is resulting in digital landfills of corporate information. This needs to change. Data health is the foundation of a successful data strategy because it recognizes the need to create and maintain standards for the well-being of corporate information.
Data health is Talend's vision for a holistic system of preventative measures, effective treatments and a supportive culture necessary to ensure that the data fueling decisions and direction is sound. It will allow companies to answer basic questions about their data that remain challenging for many to address — where it resides, who has access to it, whether it's accurate and how much it's worth. Ultimately, data health will help organizations understand and communicate — in a quantifiable way — the reliability, risk and return of their most highly critical business asset.
Lena Mass-Cresnik, Ph.D.
Chief Data Officer at Moelis & Company
The ultimate data strategy goal is to turn data into value, and while there are five- and even seven-step plans for how to do that, one element is core to effective execution: A data strategy must be aligned with business strategy. The initial focus needs to be on target business outcomes, not defining which technologies to use from the outset. What are you solving for? Then you can design and implement technical solutions with a comprehensive understanding of the key business problems across a broad perspective.
Because AI/ML offers numerous opportunities, setting priorities, determining a starting point and identifying use cases requires a deep understanding of the methodologies, opportunities and strategic vision, as well as strong relationships with business and product counterparts. While CDOs and CIOs can quickly experiment with AI/ML algorithms, they must concurrently plan how to scale and productionalize models. The entire life cycle needs to be explicitly discussed and enforced in the data strategy and incorporated into the organization structure with partnership among business, data science and technology teams. This can simultaneously lead to enterprise-level efficiency and faster innovation.
While a holistic data strategy will have implications for data and technology, people and organizational issues are also crucial. Establishing a data platform, infrastructure, talent and culture requires time, and those results become assets along with the data. Working with data assets is critically different from building products, programs or interfaces. Careful management, strategic investment and persistent effort are required. CIOs and CDOs can position themselves for effective execution over the long term by aligning data strategy with business strategy.
CIO at XPO Logistics
Data analytics are increasingly driving decision-making in business. A comprehensive enterprise data strategy — from collection to collaboration — can pay dividends through increased efficiency and speed. There are three key elements to designing a data strategy that should never be compromised:
- People: Technology is about the people, not the solution. Build a team of engineers and data scientists with a common understanding of the strategic goals, who can apply cutting-edge thinking to commercial practices. Data is dynamic and it's crucial to provide opportunities for the team to engage with the latest data science tools and advancements. Career development helps retain and grow talent within your organization.
- Data collection: Organizations amass huge quantities of data every day, but not all of it is relevant for analysis. Insist on having clear guidance on data sources, data completeness, data freshness, data size and data storage. Each one of these attributes can have a significant impact on business value and usability, system performance, overall cost, system architecture and information availability. For example, our company's digital freight marketplace, XPO Connect™, uses proprietary algorithms to turn massive amounts of data into relevant information in real time, so that our customers can buy, sell and manage transportation efficiently.
- Feedback loops: Strong feedback loops support a culture of continuous improvement and the ability to measure results. Feedback loops can be built into an organization in many ways, including social media, staff surveys and daily meetings. These all create opportunities to collect feedback and generate ideas from the team.
In the era of big data, data should be viewed as a strategic asset to unlock value.
Co-founder and CEO at Cockroach Labs
These days, an "enterprise data strategy" should simply be called a cloud data strategy. Cloud is the present and the future, and our decisions around how we manage and use data in this environment will determine competitive advantage and how efficiently we run our businesses over the next decade.
For transactional data workloads, cloud databases currently deliver opex/capex gains and may speed time to market; however, they often fail to deliver on the resilience and scale that the cloud offers because they are not "distributed" in nature.
As many shift to a cloud-native approach for applications, they may also "lift and shift" a legacy database or use a "move and improve" modified database in the cloud, but this limits their success. These options are the weakest links in this strategy. A cloud-native, distributed database will future-proof your strategy and your investment in cloud. This new breed of database can deliver bulletproof resilience (because even the cloud will fail) and provide the natural scale to take advantage of the "infinite" cloud resources.
Kevin McAllister (@k__mcallister) is a Research Editor at Protocol, leading the development of Braintrust. Prior to joining the team, he was a rankings data reporter at The Wall Street Journal, where he oversaw structured data projects for the Journal's strategy team.