One area of interest that seems to be gathering some attention is the idea that data mesh and knowledge graphs (or graphs in general) can serve as complementary architecture tooling within the distributed data stack. This idea is attractive because it offers representation at two layers:
the discovery of domain data as it exists within its natural habitat (apps, services, databases, …) and the consumption protocols for cross-domain access and sharing;
the discovery of contextual organizational knowledge and learning from a holistic interconnected body that is considered a first-class citizen in its own right.
Note the distinction between data and knowledge. This to me is a powerful combination and a real enabler towards realizing a more adaptable and resilient enterprise.
The big 10,000ft architecture question I’d like to pose is: when designing for this combination, and considering the language of both producers and consumers across the organization, which approach would you favor in terms of precedence?
Mesh-first. Here, the data and domains are what they are and this informs the architecture in real terms, and how it should be consumed. The knowledge graph merely tracks the existing shape and language of the mesh.
Graph-first. Here, we consider there may be a more integrated organization language (ontology) that is better suited to enterprise knowledge, learning, and expression as a whole. The graph is both consumer and producer within the mesh and the mesh may evolve to match the shape and language of the graph.
A notion of a hybrid approach, at least in my understanding, appears to be gaining ground in enterprise practices: one or more KGs sit as an overlay on the data mesh, such that the KG layer is definitive about metadata, the people involved, downstream use case dependencies, etc. In other words, these are two distinct abstraction layers and practices.
That idea may be closer to the “Mesh-first” category above, in the sense that the data mesh provides a foundation for what gets described in the graph?
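To make the overlay idea a bit more tangible, here is a minimal sketch, assuming the KG layer is just a list of (subject, predicate, object) triples that describe data products already living in the mesh. The product, team, and use case names and the "ownedBy" / "feeds" predicates are all hypothetical, not taken from any of the practices mentioned.

```python
# KG overlay as plain triples about mesh data products (all names hypothetical).
triples = [
    ("product:orders", "ownedBy", "team:sales"),
    ("product:orders", "feeds", "usecase:churn-model"),
    ("product:customers", "ownedBy", "team:crm"),
    ("product:customers", "feeds", "usecase:churn-model"),
    ("usecase:churn-model", "feeds", "usecase:retention-dashboard"),
]


def downstream(node, triples):
    """Return every node reachable from `node` by following 'feeds' edges."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "feeds" and obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return seen


def owners(node, triples):
    """Return the teams recorded as owning `node`."""
    return {o for s, p, o in triples if s == node and p == "ownedBy"}


# Which use cases depend, directly or indirectly, on the orders product?
impacted = downstream("product:orders", triples)
print(sorted(impacted))
print(owners("product:orders", triples))
```

The mesh stays the source of the data itself; the overlay only answers questions about it (who owns what, what breaks downstream), which is why I read it as closer to mesh-first.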
FWIW, we tried to summarize from about a dozen large-scale practices in the Metadata Day event, which has videos, slides, etc., online https://metadataday2020.splashthat.com/
There’s also the new Data Mesh Learning community just launched this week: Data-Mesh-Learning (where I was glad to run into Phil too!)
To espouse a knowledge-graph-first approach would suggest that the graph is the first thing to come into being. In practice, most enterprises already have a rich ecosystem which the knowledge graph enters into.
Hi all
I think the data mesh, with its necessary technical metadata to access and process data, and the knowledge graph should be able to evolve independently from each other, but in a coordinated fashion. One can influence the other.
Example: A) A data profiling job running in the data mesh detects new distinct values in a data source that have no mapping to business entities in the knowledge graph (KG). This can trigger a process on the KG team to close the gap in the conceptual model.
B) An AI team codifies findings about business processes that they learned from using the data. They do this by adding content to the corresponding nodes and edges in the KG, and they discover that the data they used is not mapped to these business entities. This triggers a process to identify the data they used in the technical metadata repository of the data mesh and correlate it with the business entities in the KG.
Each has value in itself, but together they are more valuable.
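Scenario A could be sketched roughly as follows. This is only an illustration, assuming the profiling output and the KG mappings are both available as plain dictionaries; the `GapTicket` type, the `find_mapping_gaps` function, and the source, column, and entity names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapTicket:
    """Hypothetical work item handed to the KG team."""
    source: str
    column: str
    unmapped_values: tuple


def find_mapping_gaps(profile, kg_mappings):
    """Compare profiled distinct values with the values the KG already maps.

    profile:     {(source, column): set of distinct values seen by profiling}
    kg_mappings: {(source, column): {value: business entity id}}
    """
    tickets = []
    for (source, column), values in sorted(profile.items()):
        known = kg_mappings.get((source, column), {})
        unmapped = sorted(v for v in values if v not in known)
        if unmapped:
            tickets.append(GapTicket(source, column, tuple(unmapped)))
    return tickets


# Assumed profiling result from the mesh side.
profile = {
    ("crm.orders", "status"): {"OPEN", "SHIPPED", "RETURNED"},
    ("crm.orders", "channel"): {"WEB", "STORE"},
}
# Assumed value-to-entity mappings held on the KG side.
kg_mappings = {
    ("crm.orders", "status"): {"OPEN": "biz:OpenOrder", "SHIPPED": "biz:ShippedOrder"},
    ("crm.orders", "channel"): {"WEB": "biz:OnlineChannel", "STORE": "biz:RetailChannel"},
}

tickets = find_mapping_gaps(profile, kg_mappings)
for t in tickets:
    print(f"{t.source}.{t.column}: unmapped values {list(t.unmapped_values)}")
```

Scenario B would be the same comparison run in the other direction: start from the KG nodes the AI team touched and look for datasets in the technical metadata repository that have no link to them.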
Best
Markus