Data meshes have been the talk of coffee tables and conference rooms alike, both as a technology and as a philosophy.
So, what is a data mesh?
Sadly, most material on the subject is thick with buzzwords aimed at VPs and C-level execs, and all but unparsable to engineers. Before we get into the depth of data mesh, it is important to know that the motivation behind it is not only sound but also intuitive and practical.
What is a data mesh?
It is an analytical data architecture and operating model in which data is treated as a product and is owned by the business domain teams that know and consume that data most intimately.
In the modern digital world, data is ubiquitous, a byproduct of every digital action we take. Every system, process, and sensor generates data. At the same time, the latest technology makes it easy for businesses to collect and store that data, then use it to make better-informed decisions and to create a more tailored experience for their customers.
That said, many organizations still struggle to empower their employees to make timely, informed decisions. Centralized data architectures and platforms fail to deliver insights with the flexibility and speed that scaling businesses require.
This is where data mesh comes in as the solution, helping you avoid making a mesh of your business operations (pun intended).
Data mesh simply applies modern software engineering principles, along with the lessons learned from building robust internet-scale solutions, to unlock the true potential of enterprise data.
Make Teams Self-Servicing
If the goal of an organization using data mesh is to empower teams by putting them in charge of their success and destiny, then self-service is key.
Rather than letting another team act as an intermediary, which results in a loss of context and urgency, data producers create and serve data directly to their consumers. From there, consumers can customize the data to suit their needs. Data producers no longer have to build or maintain any kind of specialized infrastructure, which offers them no value, and consumers no longer have to depend on others for their needs.
The challenge, however, is to make every producer and consumer of data effective across varying use cases, levels of technical sophistication, stacks, tools, and time constraints.
Four Principles of Data Mesh
The four principles of data mesh are:
Domain Ownership
This involves decreasing the hops between data consumers and data sources.
Data as a Product
This means applying design thinking to data by encapsulating the relevant code, infrastructure, and policies within a cohesive product.
Self-Service Data Framework
This means reducing friction and hiding technical complexity in the interactions between data producers and consumers.
Computational Governance
This principle involves the automation of governing policies without having a centralized authority.
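To make "data as a product" and "computational governance" a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `DataProduct` descriptor, its fields, and the two policies are hypothetical names, not part of any data mesh standard or product. The idea is only that a product carries its own metadata and that governance rules are plain functions any domain team can run automatically, with no central authority in the loop.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical descriptor for one domain-owned data product."""
    name: str
    owner_domain: str                  # team accountable for the product
    schema: dict[str, str]             # column name -> type name
    pii_columns: frozenset[str] = frozenset()
    masked_columns: frozenset[str] = frozenset()


# A policy inspects a product and returns a list of violation messages.
Policy = Callable[[DataProduct], list[str]]


def has_owner(product: DataProduct) -> list[str]:
    if product.owner_domain:
        return []
    return [f"{product.name}: missing owner domain"]


def pii_is_masked(product: DataProduct) -> list[str]:
    unmasked = sorted(product.pii_columns - product.masked_columns)
    return [f"{product.name}: unmasked PII column '{col}'" for col in unmasked]


# Illustrative policy set; a real organization would agree on its own.
POLICIES: list[Policy] = [has_owner, pii_is_masked]


def check(product: DataProduct, policies: list[Policy] = POLICIES) -> list[str]:
    """Run every policy and collect the violations."""
    violations: list[str] = []
    for policy in policies:
        violations.extend(policy(product))
    return violations


orders = DataProduct(
    name="orders",
    owner_domain="sales",
    schema={"order_id": "int", "email": "str"},
    pii_columns=frozenset({"email"}),
)
print(check(orders))
```

In this sketch the `orders` product fails the masking policy because `email` is declared as PII but not listed as masked. A check like this could run in each team's own pipeline, which is one way to automate governance without routing every change through a central team.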
What else can make teams successful?
Single-Way Production and Consumption of Data
Follow one standardized way of making data available for consumption. It is critical that this data be discoverable and usable by consumers.
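One way to picture a single, standardized path for publishing and discovering data is a shared catalog. The sketch below is a toy, in-memory version written for illustration only: `Catalog`, `CatalogEntry`, and the `sql://...` endpoint strings are all hypothetical, and a real catalog would be a shared service rather than a Python object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalog record for one published data product."""
    name: str
    owner_domain: str
    endpoint: str           # where consumers read the data (made-up URI)
    tags: frozenset


class Catalog:
    """Toy in-memory catalog: one way to publish, one way to discover."""

    def __init__(self):
        self._entries = {}

    def publish(self, entry):
        # Every producer goes through the same call, so names stay unique.
        if entry.name in self._entries:
            raise ValueError(f"data product '{entry.name}' is already published")
        self._entries[entry.name] = entry

    def search(self, tag):
        # Consumers discover products by tag, in a stable order.
        return sorted(e.name for e in self._entries.values() if tag in e.tags)

    def resolve(self, name):
        # Consumers look up where to read a product they have found.
        return self._entries[name].endpoint


catalog = Catalog()
catalog.publish(CatalogEntry("orders", "sales", "sql://sales/orders",
                             frozenset({"revenue", "orders"})))
catalog.publish(CatalogEntry("refunds", "finance", "sql://finance/refunds",
                             frozenset({"revenue"})))

print(catalog.search("revenue"))
print(catalog.resolve("orders"))
```

Because every producer uses the same `publish` call and every consumer the same `search` and `resolve` calls, nobody has to learn a different mechanism for each domain.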
Dead Simple Adoption
Teams must not be left alone to deploy and manage the infrastructure for serving or consuming data. Such infrastructure should be offered as a service, much as application developers rely on systems like Kubernetes. These systems must handle the murky parts of production deployments: a serverless system, if you will.
Normalize SQL
To meet your consumers where they are, you must speak in terms they are familiar with. Almost everyone is familiar with SQL, and it is all that is needed anyway. Do not send your team on a wild goose chase to learn new languages or paradigms.
SQL is the lingua franca of data.
Your goal should be to bridge the specific domains and nurture accessibility for everyone.
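As a small illustration of SQL bridging two domains, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table names, columns, and rows are invented for the example; in a real mesh each table would be a data product served by its own domain team.

```python
import sqlite3

# In-memory database standing in for two domain-owned data products.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER,
        amount      REAL
    );
    CREATE TABLE crm_customers (
        customer_id INTEGER PRIMARY KEY,
        region      TEXT
    );
    INSERT INTO sales_orders VALUES (1, 10, 25.0), (2, 11, 40.0), (3, 10, 15.0);
    INSERT INTO crm_customers VALUES (10, 'EU'), (11, 'US');
""")

# One query joins data from the sales domain with data from the CRM domain.
rows = conn.execute("""
    SELECT c.region, SUM(o.amount) AS revenue
    FROM sales_orders AS o
    JOIN crm_customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
""").fetchall()

print(rows)  # [('EU', 40.0), ('US', 40.0)]
conn.close()
```

A consumer who knows SQL can answer a cross-domain question like "revenue by region" without learning anything specific to either producing team's stack.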
Open and Accessible
Systems built to bridge user communities have to be open and accessible.
It may sound obvious, but if people find it hard to connect a system to a data mesh, they won't do it, and you will lose value. This is one more reason to like real-time messaging systems.
Real-Time Messaging Systems
Relatively speaking, offering robust connectivity for both real-time and batch systems is easy. Instead of fighting with teams to re-platform their applications onto a centralized infrastructure, give them the independence to choose what works for them. This also gives them a way to get the data they need into their own ecosystems.
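To show how one stream can serve both real-time and batch consumers, here is a deliberately tiny, in-memory stand-in for a message broker. It is a sketch under stated assumptions: `InMemoryBroker` and its methods are hypothetical and do not represent the API of Memphis or any other real messaging system.

```python
from collections import defaultdict


class InMemoryBroker:
    """Toy broker: keeps a log per topic and notifies live subscribers."""

    def __init__(self):
        self._log = defaultdict(list)          # topic -> retained messages
        self._subscribers = defaultdict(list)  # topic -> handler callables

    def subscribe(self, topic, handler):
        # Real-time path: the handler is called for each new message.
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        self._log[topic].append(message)
        for handler in self._subscribers[topic]:
            handler(message)

    def read_batch(self, topic, offset=0):
        # Batch path: read retained messages starting at an offset.
        return list(self._log[topic][offset:])


broker = InMemoryBroker()

# A real-time consumer reacts to each order as it arrives.
seen_live = []
broker.subscribe("orders", seen_live.append)

for event in ({"order_id": 1}, {"order_id": 2}, {"order_id": 3}):
    broker.publish("orders", event)

# A batch consumer catches up later, skipping what it already processed.
nightly = broker.read_batch("orders", offset=1)

print(len(seen_live))
print(nightly)
```

The producer publishes once. The real-time consumer and the batch consumer each read in the way that suits their own stack, which is the independence the paragraph above argues for.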
Final Word
In this post, we have discussed data mesh, its challenges, and how it helps enterprises solve the data problems of the modern digital world. We touched upon a few topics that will need deeper thinking to implement correctly. We have purposefully restricted ourselves to the 'what' of data mesh, as we intend to pursue the 'how' in our upcoming article.
Stay tuned with Memphis; we hope this post is useful.