How do you explain the Semantic Web to people?

My mother isn't on Semantic Overflow and does not know about the Semantic Web - well, she does not even own a computer. But even many people with whom I studied computer science still have no clear idea about the Semantic Web. I have been working in this field daily for years now, but every time I am asked what I do, I get the embarrassing feeling that I'm not very good at explaining it.

So, how do you explain the Semantic Web to people?

You may distinguish between different kinds of people, from the "mother" type to the "knows-everything-about-the-Web-except-for-the-semantics-stuff" people.

Suppose you want to buy a digital camera that weighs less than 6 ounces, costs less than $500, and can take close-ups from 2" away. Every digital camera maker in the world has a database with the data that would tell you which models to consider (weight, MSRP, minimum focusing distance), but to actually find out which, yourself, you'd have to figure out who all those makers are, go to each web site, look up each camera's specs page and find and check those three bits of information. You'd probably make a spreadsheet as you went, or write them all down on a piece of paper or something. Or you'd hope somebody else already did this and put up a page about it that you can find in Google, because it would take forever and be annoying as hell.

This is a pathetic state of things. If there's one thing computers are great at, it's repetitive tasks involving data. It ought to be even easier for some computer somewhere to answer this very precise, well-defined question than it is for Google to read all those pages and try to find one that would help you. But it isn't, because all those manufacturer databases are on different computers, in different formats, using different terminology.

The Semantic Web is an idea and a project to try to make it possible for questions like this to be answered on the internet. It's a way of representing and exchanging data such that different clumps of that data, like each camera manufacturer's product-spec database, can be put together, reconciled, sorted, filtered, analyzed, extracted, etc., not by people slowly poking through readable web pages one at a time, but by computers whizzing through thousands and millions of individual bits of data at a time on our behalf.
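
For the more technical listener, the camera example can be sketched in a few lines of Python. This is only an illustration of the "put together, reconcile, filter" idea, not real Semantic Web tooling: the two makers, their field names and all the numbers are invented, and I am assuming that each maker's terminology and units can be mapped onto one shared vocabulary.

```python
# Hypothetical records, as two imaginary manufacturers might publish them.
# Maker A uses ounces and inches; maker B uses grams and centimetres.
maker_a = [
    {"model": "A100", "weight_oz": 5.2, "msrp_usd": 449, "min_focus_in": 1.5},
    {"model": "A200", "weight_oz": 7.5, "msrp_usd": 399, "min_focus_in": 1.0},
]
maker_b = [
    {"name": "B-Mini", "mass_g": 150, "price_usd": 299, "closest_focus_cm": 4.0},
    {"name": "B-Pro", "mass_g": 160, "price_usd": 650, "closest_focus_cm": 3.0},
]

GRAMS_PER_OUNCE = 28.3495
CM_PER_INCH = 2.54


def from_maker_a(rec):
    """Map maker A's terminology onto the shared vocabulary."""
    return {
        "model": rec["model"],
        "weight_oz": rec["weight_oz"],
        "price_usd": rec["msrp_usd"],
        "min_focus_in": rec["min_focus_in"],
    }


def from_maker_b(rec):
    """Map maker B's terminology and units onto the shared vocabulary."""
    return {
        "model": rec["name"],
        "weight_oz": rec["mass_g"] / GRAMS_PER_OUNCE,
        "price_usd": rec["price_usd"],
        "min_focus_in": rec["closest_focus_cm"] / CM_PER_INCH,
    }


# Once everything speaks the same vocabulary, the data can simply be pooled.
cameras = [from_maker_a(r) for r in maker_a] + [from_maker_b(r) for r in maker_b]


def matches(cam):
    """Under 6 oz, under $500, and able to focus from 2 inches or closer."""
    return cam["weight_oz"] < 6 and cam["price_usd"] < 500 and cam["min_focus_in"] <= 2


candidates = [cam["model"] for cam in cameras if matches(cam)]
print(candidates)  # ['A100', 'B-Mini']
```

The tedious part is the two hand-written mapping functions. As I understand it, the point of the Semantic Web is that makers would publish against shared, machine-readable terms in the first place, so nobody has to write those mappings per site.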

You go into a library looking for books to research a difficult question... you think you'll need several books to find the answer... you want an answer quickly so that you can get back to watching Dr. Phil at home.

At the door, a sprightly woman greets you and tells you that she has read a great many of the books in the library. She admits that her English isn't great, but she has a sage glint in her eye...

...so, in simple English, you ask her your difficult question. She understands, and [in broken English] answers directly, giving you the library codes for the several books she has read the information from in case you need to verify.

With her answer in hand—and having thumbed through one or two of those books to verify the answer—you return home to watch the remainder of Dr. Phil.

The End.


Thereafter, we're researching:

  • how to teach the sprightly woman with the superb memory to better understand the books she reads and the questions she answers;
  • how people should write [esp. the boring, purely informational] books in a way the sprightly woman can better read them;
  • how sprightly people in different libraries [who often speak different languages] can communicate between themselves to answer queries.

The way I've described it to people before is by explaining just two key concepts:

  1. That the concept of a "web address" can be extended to identify not just documents but other things. Even if we can't use the address to retrieve those things, it can still serve as an identifier.
  2. We can describe things using triples. Triples are like the subject-verb-object structure they'll be fairly familiar with from most spoken languages.

I find that most people grok the concepts above with less than five minutes of explanation. A piece of paper on which to draw a rudimentary boxes-and-lines graph, containing resources and relationships they're likely to be personally familiar with, will also help.
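
If the audience is technical, the same two concepts also fit in a few lines of code. The sketch below is plain Python rather than any real RDF library, and the example.org addresses and the names in them (alice, bob, knows, livesIn) are made up for illustration. They identify things without being retrievable documents.

```python
# Concept 1: web-address-style identifiers for things, not just documents.
# Everything under example.org here is a hypothetical identifier.
EX = "http://example.org/"

# Concept 2: each statement is a (subject, verb, object) triple.
triples = {
    (EX + "alice", EX + "knows", EX + "bob"),
    (EX + "alice", EX + "livesIn", EX + "london"),
    (EX + "bob", EX + "livesIn", EX + "paris"),
}


def match(s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return sorted(
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Everything we know about Alice.
alice_facts = match(s=EX + "alice")

# Everyone who lives somewhere.
residents = [subject for subject, _, _ in match(p=EX + "livesIn")]

for fact in alice_facts:
    print(fact)
```

The boxes-and-lines drawing is this same data: every identifier is a box and every triple is a labelled line between two boxes.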

Syntax and query languages are implementation details that most non-technical types will not be interested in, so there's probably no need to explain.

People tend to be more interested in what opportunities the technology enables. Things like: it could allow your computer to understand that this document you're reading and that document over there are about the same topic. It could provide a common data format that your diary and your digital photo library can both use, so that when you're flicking through previous dates in your diary, you can instantly find photos from those dates. It allows computers to understand and deal with the connectedness of real life.

The Semantic Web revolves around the notion that we can represent a lot of useful knowledge as simple statements that a computer can understand. Each statement contains just three things: a subject, a verb (or adjective) and an object. For example: "cats, are, mammals". Another example might be "a mammal, has, teeth" and a third might be "lions, are bigger than, mice".

These statements can be combined to create new knowledge. For example, we know that mammals have teeth, we know that cats are mammals, and thus we know that cats have teeth. That's obvious to you and me, but to a computer it is new knowledge that it has to work out. The Semantic Web gives it a way to figure out new knowledge like this using 'reasoning'.
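
Here is a minimal sketch of that kind of reasoning in plain Python. It hard-codes a single rule (if X is a Y, and Y has Z, then X has Z) and applies it until nothing new turns up. The predicate names "is_a" and "has" are my own shorthand, and real reasoners are far more general than this.

```python
# The statements from the example, written as (subject, verb, object).
facts = {
    ("cat", "is_a", "mammal"),
    ("mammal", "has", "teeth"),
}


def infer(facts):
    """Apply one rule repeatedly: X is_a Y and Y has Z  =>  X has Z."""
    known = set(facts)
    while True:
        new = set()
        for x, p1, y in known:
            if p1 != "is_a":
                continue
            for y2, p2, z in known:
                if y2 == y and p2 == "has":
                    new.add((x, "has", z))
        if new <= known:  # nothing new was derived, so we are done
            return known
        known |= new


derived = infer(facts) - facts
print(derived)  # {('cat', 'has', 'teeth')}
```

Nobody told the program that cats have teeth. It worked that out from the two statements it was given.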

Because the Semantic Web also includes the notion of a universal language that all computers will agree to speak, they know that the words they are using have the same meaning (unlike the English language, where words can have lots of different meanings). The Semantic Web also defines how computers can exchange knowledge, so multiple computers can now work together to answer questions that a single computer could not answer previously.

In a nutshell, the Semantic Web will allow computers to answer questions that could not be answered before using normal web searches. For example, there is no way to search the web today for "a vegetarian restaurant in New York near a museum that's running an exhibit on Japan" but with the Semantic Web that will become possible.

Someday soon Semantic Web technology will allow computers to complete school examinations and even play and win a game of Jeopardy against humans!

  1. Federated databases
  2. Homogenisation, contextualisation and reasoning

= An application of collective intelligence

semanticweb.com has done a series of blog posts very relevant to your question. The original post was a call for video submissions, limited to 90 seconds, to pitch the Semantic Web to an audience of your choosing (technically savvy or not). They have had several responses... some of which I list here:

And my favorite: Semantic Web Elevator Pitch for… Government and Citizenry?