Taxonomy/Ontology KGC presentations Q&A

hhedden · May 20, 2022, 1:19pm

At a moderator’s request, I am reposting the Q&A that I had posted in the KGC Slack channel topic-taxonomy. It compiles the questions that were submitted via Airmeet for the two presentations that I gave at KGC, along with my answers.

KGC tutorial presented May 3: “Foundation for a Knowledge Graph: Taxonomy Design Best Practices”

What are the downsides/problems of having multiple skos:broader?
Excessive use of multiple broader concepts (polyhierarchy) compromises the overall hierarchy “tree” structure, which can be a helpful guide when displayed to end users. If there are consistently multiple ways to classify concepts, then perhaps a faceted taxonomy design would be better. Sometimes polyhierarchy is created in violation of hierarchical relationship best practices, in order to “promote” the visibility of subcategories, such as in e-commerce implementations.
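As a rough illustration of how one might audit a taxonomy for polyhierarchy, the sketch below lists every concept that has more than one skos:broader. It assumes the taxonomy has already been loaded as plain (subject, predicate, object) tuples; the ex: concept names are hypothetical.

```python
from collections import defaultdict

SKOS_BROADER = "skos:broader"

# Hypothetical triples (subject, predicate, object), for illustration only.
triples = [
    ("ex:Laptops", SKOS_BROADER, "ex:Computers"),
    ("ex:Laptops", SKOS_BROADER, "ex:BackToSchoolDeals"),  # "promoted" placement
    ("ex:Desktops", SKOS_BROADER, "ex:Computers"),
]

def polyhierarchical_concepts(triples):
    """Return {concept: sorted broader concepts} for concepts with 2+ broader concepts."""
    parents = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == SKOS_BROADER:
            parents[subject].add(obj)
    return {c: sorted(ps) for c, ps in parents.items() if len(ps) > 1}

report = polyhierarchical_concepts(triples)
```

A long report from a check like this may be a hint that a faceted design is worth considering.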
How is “instance of” modeled in SKOS?
There is no specific designation for “instance of” in SKOS. It is simply a kind of broader (hierarchical relationship) between two concepts. ANSI/NISO Z39.19 does indicate the specific types of broader term/narrower term (BT/NT) as: BTI/NTI (broader term-instance and narrower term-instance).
Aren’t comparative adjectives a kind of hierarchy e.g. big, bigger, biggest?
In a certain sense, this can be considered a hierarchy, but it is not a taxonomic or thesaurus hierarchy. There are other hierarchies (org chart roles, military ranks, Maslow’s hierarchy of needs, Bloom’s taxonomy for education), which are also not taxonomy/thesaurus/ontology hierarchies of broader/narrower or class/subclass. I wrote a blog post on this topic:
The Accidental Taxonomist: Hierarchies in Taxonomies, Thesauri, Ontologies, and Beyond
Standards for ‘wording of concepts’ (pg# 62): good guidelines, however, are these based on industry standards or generally used in your practice?
These are all based on the ANSI/NISO Z39.19 standard, except for my recommendation of which form of capitalization to use.
Here I repeat slide #62:

Unambiguous; understood even out of context of the hierarchy.
Example: Nursing Certification, rather than Certification as narrower to Nurses
Consistent capitalization: initial capitalization is recommended.
Example: Corporate finance, rather than corporate finance or Corporate Finance
Single words or multi-word phrases; Nouns or noun phrases
Example: Employment; Part-time employment
Countable nouns are usually plural
Example: Occupational accidents (countable); Occupational health (not countable)
Adjectives alone may exist within term lists of characteristics/properties (metadata or facets), but not within hierarchical taxonomies or thesauri. For example, colors, sizes.
Parenthetical qualifiers may be used for disambiguation, not modification.
Example: Walnut (wood)
Avoid term inversions (e.g. noun, adjective) because labels are searchable
Example: Racial discrimination, not Discrimination, racial
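Some of these wording guidelines can be approximated with simple automated checks. The sketch below is a heuristic only, not part of any standard: the function name and rules are assumptions, and proper nouns will legitimately trigger the capitalization warning.

```python
import re

def label_warnings(label):
    """Return a list of heuristic warnings for a proposed concept label."""
    warnings = []
    # Ignore a trailing parenthetical qualifier such as "(wood)".
    core = re.sub(r"\s*\([^)]*\)\s*$", "", label)
    if ", " in core:
        warnings.append("possible term inversion")
    words = core.split()
    if words and not words[0][0].isupper():
        warnings.append("first word not capitalized")
    if any(w[0].isupper() for w in words[1:]):
        warnings.append("later word capitalized (fine for proper nouns)")
    return warnings

for candidate in ["Corporate finance", "Corporate Finance", "Discrimination, racial"]:
    print(candidate, "->", label_warnings(candidate))
```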

Can we use terms of proprietary cots (Commercial-off-the-shelf) application database in creating SKOS for defining pvt ontology?
This question is not quite clear. Proprietary commercial database management systems or taxonomy management systems do not usually come with taxonomy terms. Some taxonomy management software vendors (including PoolParty) partner with a taxonomy vendor to provide prebuilt SKOS taxonomies for editing and enhancing. Prebuilt ontologies are a separate matter, and more of these are available from various sources.
What about SHACL?
SHACL (Shapes Constraint Language) is a language for validating RDF graphs against a set of conditions. In particular, SHACL is used to check constraints, which is something that open-world reasoning cannot do. It is important to perform validation, whether by SHACL or by a validation method built into software for editing SKOS taxonomies and OWL ontologies. PoolParty has a built-in validator, which is a more convenient way to run conformance checks on data imported into PoolParty. So, currently PoolParty does not use SHACL, except for providing two SHACL data processing units (DPUs) for the ETL pipeline.
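To make the idea of constraint checking concrete, here is a minimal sketch in plain Python. It is not SHACL and not how PoolParty validates data; it only illustrates a closed-world check. The rule (each concept must have exactly one skos:prefLabel) and the ex: data are assumptions chosen for the example.

```python
SKOS_PREF_LABEL = "skos:prefLabel"

# Hypothetical data as (subject, predicate, object) triples.
triples = [
    ("ex:Employment", "rdf:type", "skos:Concept"),
    ("ex:Employment", SKOS_PREF_LABEL, "Employment"),
    ("ex:PartTime", "rdf:type", "skos:Concept"),  # no prefLabel stated
]

def concepts_violating_one_pref_label(triples):
    """Return (concept, label_count) for concepts without exactly one prefLabel."""
    concepts = {s for s, p, o in triples if p == "rdf:type" and o == "skos:Concept"}
    violations = []
    for concept in sorted(concepts):
        count = sum(1 for s, p, o in triples if s == concept and p == SKOS_PREF_LABEL)
        if count != 1:
            violations.append((concept, count))
    return violations

violations = concepts_violating_one_pref_label(triples)
```

An open-world reasoner would not flag ex:PartTime, because a missing label is treated as merely unknown; a validation step treats it as an error.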
Can the Ontology be a realistic reflection of a Business/Org, whereas a Taxonomy a scheme to facilitate data analytics/mgmt?
To a certain extent, yes, perhaps. The taxonomy itself contains the specific concepts to tag content. However, the ontology also facilitates data analytics and management.
Are facets labels?
It depends on how you define “labels.” Facets are not taxonomy concepts used in tagging content. Facets are groupings of concepts of a certain type or aspect. Facets have labels, although taxonomy concepts also have labels. Using SKOS, it is common practice to base facets on concept schemes, but that is not required.
Is this an example of an ontology ? Person (male, female), Relationship (sibling, uncle, parent, child, son, brother, sister, spouse ,husband, wife, neighbor, employee)
Not quite. The ontology needs triples of person-relationship-person.
These relationships should be constructed as reciprocal/inverse pairs, each going in opposite directions. If you want to have gender-specific relations, then you need to create subclasses of Person, such as Male person and Female person.
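A minimal sketch of person-relationship-person triples with reciprocal pairs follows. The relation names and individuals are hypothetical, and in a real ontology the inverses would be declared in the ontology itself (for example with owl:inverseOf) rather than in application code.

```python
# Hypothetical inverse pairs; symmetric relations are their own inverse.
INVERSE_OF = {
    "ex:parentOf": "ex:childOf",
    "ex:childOf": "ex:parentOf",
    "ex:siblingOf": "ex:siblingOf",
    "ex:spouseOf": "ex:spouseOf",
}

def with_inverses(triples):
    """Return the given triples plus the reciprocal triple for each known relation."""
    result = set(triples)
    for subject, relation, obj in triples:
        if relation in INVERSE_OF:
            result.add((obj, INVERSE_OF[relation], subject))
    return result

asserted = [
    ("ex:Ann", "ex:parentOf", "ex:Ben"),
    ("ex:Ben", "ex:siblingOf", "ex:Cy"),
]
graph = with_inverses(asserted)
```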
How do you define the relevance / performance of a term while evaluation of taxonomies?
This pertains to the tagging of the taxonomy concept to content. Relevance/performance is within context of a set of content.
How much fine-grained should a taxonomy/classification be? Is there a preferred standard or general practice in industry?
It depends on the use case. However, a guideline to keep in mind is how many content items should be retrieved with a query on the taxonomy. A good guideline would be not much more than would display in a single screen view (such as 10 document search results, but screenviews vary based on the UI). Then the question is what is involved in a single query: a single taxonomy concept or a combination of taxonomy concepts as is done with facets. Thus, if the taxonomy is constructed as faceted, then concepts may be less granular than a single hierarchical taxonomy or thesaurus.
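One way to apply this guideline is to count how many content items each concept retrieves and flag the concepts that return more than one screen of results. This is a sketch under assumptions: the tagging data is hypothetical, and the default of 10 simply mirrors the example figure above.

```python
from collections import Counter

def overloaded_concepts(taggings, page_size=10):
    """taggings: iterable of (content_id, concept) pairs.

    Return {concept: item_count} for concepts tagged to more items than page_size.
    """
    counts = Counter(concept for _, concept in set(taggings))
    return {concept: n for concept, n in counts.items() if n > page_size}

# Hypothetical tagging data: 12 documents tagged "Employment", 1 tagged more narrowly.
taggings = [(f"doc{i}", "Employment") for i in range(12)]
taggings.append(("doc1", "Part-time employment"))

candidates_for_splitting = overloaded_concepts(taggings)
```

Concepts flagged this way are candidates for narrower concepts, or for combination with another facet at query time.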
What models/standards would be the best to build the taxonomy of skills’ name?
Models and standards are general and not domain specific.
What is your opinion about “alternative label” vs. “semantic relation between things, e.g., with owl:sameAs”
The SKOS notion of preferred versus alternative (not preferred) does not exist in owl:sameAs, whereby the linked individual or class names have equal standing.
Also, because the preferred label is “used for” the alternative label, the alternative does not have to be equivalent but could be conceptually narrower.
Also, owl:sameAs is often used to link equivalent individuals or classes in different ontologies. SKOS uses skos:exactMatch for this purpose.
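The asymmetry between preferred and alternative labels can be sketched as a lookup in which any label resolves to a single concept and its one preferred label. The concept and labels below are hypothetical; "Sedans" is included as an alternative label that is conceptually narrower rather than equivalent.

```python
# Hypothetical concept with one preferred label and several alternative labels.
concepts = {
    "ex:Automobiles": {
        "prefLabel": "Automobiles",
        "altLabels": ["Cars", "Motor cars", "Sedans"],
    },
}

def build_label_index(concepts):
    """Map every label (lowercased) to the URI of the concept it belongs to."""
    index = {}
    for uri, labels in concepts.items():
        for text in [labels["prefLabel"], *labels["altLabels"]]:
            index[text.lower()] = uri
    return index

def preferred_label_for(term, concepts, index):
    """Resolve any label to the concept's preferred label, or None if unknown."""
    uri = index.get(term.lower())
    return concepts[uri]["prefLabel"] if uri else None

index = build_label_index(concepts)
```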

KGC presentation May 5: “Taxonomy-Driven Ontology Design” (http://www.hedden-information.com/wp-content/uploads/2022/05/KGC-Taxonomy-Driven-Ontology-Design.pdf)

How did you link/map a SKOS concept, which shall be an individual, to an OWL class?
Regardless of whether it’s a SKOS concept or an OWL instance, the link to an OWL class is the same; it’s called type, and this designation is from RDF (thus, not specific to SKOS or OWL). You don’t need to know this is called “type” when using a tool such as PoolParty, where I showed in screenshots how the link is done through UI menus. However, if you want to view the triples in PoolParty, this link predicate is indicated with this URI for “type”: http://www.w3.org/1999/02/22-rdf-syntax-ns#type
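At the triple level, the link looks like the sketch below. The RDF namespace is defined with the http scheme, so the predicate is written that way here; the example.org URIs are hypothetical, and the tuples stand in for whatever triple store or tool actually holds the data.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS_CONCEPT = "http://www.w3.org/2004/02/skos/core#Concept"

# Hypothetical URIs: one resource typed both as a SKOS concept and as a custom class.
triples = [
    ("http://example.org/taxonomy/AcmeCorp", RDF_TYPE, SKOS_CONCEPT),
    ("http://example.org/taxonomy/AcmeCorp", RDF_TYPE, "http://example.org/ontology/Organization"),
]

def types_of(resource, triples):
    """Return the sorted class URIs that a resource is linked to via rdf:type."""
    return sorted(o for s, p, o in triples if s == resource and p == RDF_TYPE)

acme_types = types_of("http://example.org/taxonomy/AcmeCorp", triples)
```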
How would you properly link multiple ontologies in an effort to reuse Vocabularies? would owl:sameAs be a good candidate?
Linking is not necessarily done for the purpose of vocabulary reuse. If you reuse a vocabulary (whether a taxonomy or ontology), you simply take the URIs as they are from their source. If you want to extend an existing vocabulary (whether a taxonomy or ontology) by reusing another, then you would not have identical entities already.
owl:sameAs should only be used between entities (classes or individuals) that are exactly the same but just have different names/labels. To link SKOS taxonomies, you use the mapping relations (exactMatch, broadMatch, etc.); to link entities between different ontologies, use the semantic relations (object properties) that exist for either ontology.
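A minimal sketch of how SKOS mapping relations differ from a blanket equivalence: each mapping carries its own relation type, and a consumer chooses which types to follow. The scheme prefixes and concepts are hypothetical.

```python
# Hypothetical mappings between two concept schemes.
mappings = [
    ("schemeA:Cars", "skos:exactMatch", "schemeB:Automobiles"),
    ("schemeA:Sedans", "skos:broadMatch", "schemeB:Automobiles"),
]

def translate(concept, mappings, relations=("skos:exactMatch",)):
    """Return target concepts reachable from `concept` via the allowed mapping relations."""
    return [o for s, p, o in mappings if s == concept and p in relations]

exact_only = translate("schemeA:Sedans", mappings)
with_broad = translate("schemeA:Sedans", mappings, ("skos:exactMatch", "skos:broadMatch"))
```

Following only skos:exactMatch keeps results precise; also following skos:broadMatch trades precision for recall.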
In your opinion, what are the most critical requirements to create a great ontology and taxonomy?
Despite a lot of interest in vocabulary and ontology reuse, I would still emphasize focusing on serving the specific use case(s) and user needs.
A lot more could be written here. So, I suggest consulting The Knowledge Graph Cookbook, available as a free PDF download at: https://www.poolparty.biz/the-knowledge-graph-cookbook
How do you distinguish between say Organization as a thing versus data about Organization as a kind of data? Especially where meaning arises from something other than data e.g. legal constructs?
“Organization as a thing” would be a SKOS (taxonomy) concept Organization. “Organization as a kind of data” would be an ontology/custom schema class (defined in an ontology) with properties like CEO, number of employees, etc.
In PoolParty, the two can then be connected by creating a hierarchy of concepts (perhaps a subtree of the Organization concept) representing individual organizations; assigning the class Organization to them; and then adding the custom properties defined by the schema.
Can you link the use of a label to the context in which that label is used for that concept? How do you define ‘context’?
This sounds like a case for SKOS-XL (SKOS eXtension for Labels), which I did not have time to mention. With SKOS-XL, labels become resources and you can attach relations/properties to them, such as indicating a specific context, as you define it.
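A minimal sketch of the SKOS-XL pattern, again using plain tuples as triples. The skosxl: terms (Label, prefLabel, altLabel, literalForm) come from SKOS-XL; the ex:usedInContext property, the label resources, and the context are hypothetical, since the definition of "context" is left to the modeler.

```python
# Labels are resources (ex:label1, ex:label2) rather than plain strings.
triples = [
    ("ex:Organization", "skosxl:prefLabel", "ex:label1"),
    ("ex:Organization", "skosxl:altLabel", "ex:label2"),
    ("ex:label1", "rdf:type", "skosxl:Label"),
    ("ex:label1", "skosxl:literalForm", "Organization"),
    ("ex:label2", "rdf:type", "skosxl:Label"),
    ("ex:label2", "skosxl:literalForm", "Legal entity"),
    ("ex:label2", "ex:usedInContext", "ex:LegalContracts"),  # hypothetical property
]

LABEL_RELATIONS = ("skosxl:prefLabel", "skosxl:altLabel")

def labels_in_context(concept, context, triples):
    """Return the literal forms of a concept's labels that are attached to a context."""
    label_nodes = {o for s, p, o in triples if s == concept and p in LABEL_RELATIONS}
    in_context = {s for s, p, o in triples if p == "ex:usedInContext" and o == context}
    wanted = label_nodes & in_context
    return sorted(o for s, p, o in triples if s in wanted and p == "skosxl:literalForm")

legal_labels = labels_in_context("ex:Organization", "ex:LegalContracts", triples)
```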
Heather Hedden