What is semantic data modeling?

Semantic data modeling can be defined as the development of descriptions and representations of data in such a way that the latter’s meaning is explicit, accurate, and commonly understood by both humans and computer systems. This definition encompasses a wide range of data artifacts, including metadata schemas, controlled vocabularies, taxonomies, ontologies, knowledge graphs, entity-relationship (E-R) models, property graphs, and other conceptual models for data representation.


That’s a great summary. One of the most overlooked aspects of semantic modeling is that it applies to dynamic systems as well as statically modeled concepts. Temporal graphs fit neatly into both (transient and point-in-time), and so do communications protocols. In fact, the OSI 7-layer model and the accompanying OSI/X-CCITT standards provide one of the best examples of how semantics facilitate communication across complex distributed environments, with clear separations covering transmission, negotiation, encoding, session, and application-level concerns. We take this pretty much for granted, but our present-day interconnectedness would not exist without these distinct, interacting semantic layers at work. This model also illustrates how to break a complex model down into discrete, separately manageable sub-models that work together as a whole.


In practical terms, for me this consists of two parts:

  • creating examples, application profiles, and RDF shapes to describe the relevant semantic data
  • ontology research, reuse, extension, engineering
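
To make the first bullet concrete, here is a minimal sketch of example data plus a shape-like check, in plain Python. It is only a stand-in for what a real shape language such as SHACL or ShEx would do; the `ex:` names, the `PERSON_SHAPE` layout, and the `validate` function are all made up for illustration.

```python
# Example data as (subject, predicate, object) triples.
# All identifiers here are hypothetical, chosen only for this sketch.
TRIPLES = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:birthYear", 1985),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:bob", "ex:birthYear", "unknown"),
]

# A shape-like description: property -> (expected Python type, minimum count).
# This dict layout is an assumption, not the syntax of any real shape language.
PERSON_SHAPE = {
    "target_class": "ex:Person",
    "properties": {
        "ex:name": (str, 1),
        "ex:birthYear": (int, 0),
    },
}


def validate(triples, shape):
    """Return human-readable violations for every node of the target class."""
    violations = []
    targets = sorted(
        s for s, p, o in triples
        if p == "rdf:type" and o == shape["target_class"]
    )
    for node in targets:
        for prop, (expected_type, min_count) in shape["properties"].items():
            values = [o for s, p, o in triples if s == node and p == prop]
            if len(values) < min_count:
                violations.append(f"{node}: missing {prop}")
            for value in values:
                if not isinstance(value, expected_type):
                    violations.append(
                        f"{node}: {prop} should be {expected_type.__name__}"
                    )
    return violations


if __name__ == "__main__":
    for line in validate(TRIPLES, PERSON_SHAPE):
        print(line)
```

Even a toy check like this forces the examples and the shape to agree with each other, which is where most modeling mistakes show up first.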

Before these steps come competency questions, dataset research, and data schema harmonization.
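
A competency question is easiest to pin down when it is written as a query against example data. Below is a rough sketch in plain Python for the question "which people were born before a given year?". In practice this would more likely be a SPARQL query; the data, the `ex:` names, and both helper functions are hypothetical.

```python
# Tiny example dataset as (subject, predicate, object) triples.
# Every identifier is invented for this sketch.
TRIPLES = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:birthYear", 1985),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:bob", "ex:birthYear", 1992),
    ("ex:acme", "rdf:type", "ex:Organization"),
]


def objects(triples, subject, predicate):
    """Return all objects for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def people_born_before(triples, year):
    """Competency question: which ex:Person nodes have a birth year before `year`?"""
    people = sorted(
        s for s, p, o in triples if p == "rdf:type" and o == "ex:Person"
    )
    return [
        person for person in people
        if any(b < year for b in objects(triples, person, "ex:birthYear"))
    ]


if __name__ == "__main__":
    print(people_born_before(TRIPLES, 1990))
```

If the model cannot answer a question like this from the example data, that is a sign the model (or the question) needs another pass before any ontology engineering starts.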

Just to repeat something @VladimirAlexiev wrote for emphasis: creating examples, creating examples, creating examples. A search on Google Scholar for “ontology” finds hundreds of projects that were all about the creation of ontologies with no actual data associated with them.

I have found that an agile approach of starting with a small model, proving its value with data that conforms to that model, and then iterating up from there results in data models that are so much more useful than some giant top-down thing.