BBC World Cup 2010 portal
What
The BBC web portal for the 2010 football World Cup.
How?
Architecture
- BigOWLIM triple store, accessed through a REST service offering SPARQL query processing;
- Dynamic aggregation and publishing, page-rendering using Zend;
- IBM LanguageWare language and ontological linguistic platform for concept extraction.
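To make the first architecture point concrete, here is a minimal sketch of how a client might prepare a SPARQL query for a REST endpoint, following the standard SPARQL protocol convention of a `query` parameter on a GET request. The endpoint URL, the `ex:` vocabulary and the function name are assumptions made up for illustration; the BBC's actual service and ontology are not described in the sources. The request is only built, not sent.

```python
from urllib.parse import urlencode, urlparse, parse_qs
from urllib.request import Request

# Hypothetical endpoint, not the BBC's real service.
ENDPOINT = "http://example.org/sparql"

# Hypothetical vocabulary: which players play for England?
QUERY = """
PREFIX ex: <http://example.org/sport/>
SELECT ?player WHERE { ?player ex:playsFor ex:England . }
"""


def build_sparql_request(endpoint, query):
    """Build (but do not send) a SPARQL protocol GET request."""
    # The SPARQL protocol passes the query text in the 'query' parameter.
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON serialisation of SELECT results.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
```

Passing `request` to `urllib.request.urlopen` would execute the query against a real endpoint; that step is left out here because the endpoint above does not exist.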
Performance
The portal has been serving millions of page requests a day throughout the World Cup, based on continually changing, OWL-reasoned semantic RDF data. The platform currently serves an average of a million SPARQL queries a day.
Why?
Practical advantages
The developers stress flexibility and inference as reasons why they used semantic web technology over more traditional technology.
The flexibility paid off both on the data layer, where it "facilitates agile modeling" and allowed for increased query complexity compared to relational schema databases, and on the presentation layer.
With regards to the presentation layer, the developers stressed that they "are not publishing pages, but publishing content as assets which are then organised by the metadata dynamically into pages, but could be re-organised into any format we want much more easily".
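The idea of "content as assets organised by metadata into pages" can be illustrated with a small sketch. The asset records, tag names and the `build_pages` function below are invented for illustration and do not reflect the BBC's actual data model; the point is only that pages are derived from metadata rather than authored by hand.

```python
from collections import defaultdict

# Hypothetical content assets, each tagged with metadata concepts.
ASSETS = [
    {"id": "a1", "title": "Match report", "tags": {"England", "Group_C"}},
    {"id": "a2", "title": "Squad news", "tags": {"England"}},
    {"id": "a3", "title": "Group preview", "tags": {"Group_C"}},
]


def build_pages(assets):
    """Group asset ids by tag, so that every tag yields one index page."""
    pages = defaultdict(list)
    for asset in assets:
        for tag in sorted(asset["tags"]):
            pages[tag].append(asset["id"])
    return dict(pages)


pages = build_pages(ASSETS)
```

Re-organising the same assets into a different format then amounts to grouping on a different piece of metadata; the assets themselves stay untouched.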
A second technical advantage mentioned is inference. Thanks to the reasoning facilities of the triple store, inferred statements are automatically derived from the metadata concepts that journalists apply explicitly. This made both the journalist tagging and the SPARQL queries against the triple store simpler, and indeed quicker, than a traditional SQL approach.
Dynamic aggregations based on inferred statements in turn increase the quality and breadth of content across the site.
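A toy version of this inference step, assuming a single hand-written rule instead of the OWL reasoning that the triple store performs: if a story is about X, and X plays for or competes in Y, then the story is also about Y. The predicate names, the rule and the sample triples are illustrative assumptions, not the BBC's ontology.

```python
# Explicit statements: one journalist tag plus background knowledge.
EXPLICIT = {
    ("story1", "about", "Frank_Lampard"),
    ("Frank_Lampard", "playsFor", "England"),
    ("England", "competesIn", "Group_C"),
}

# Hypothetical predicates along which "about" is propagated.
PROPAGATING = ("playsFor", "competesIn")


def infer(triples):
    """Apply the propagation rule repeatedly until nothing new is derived."""
    inferred = set(triples)
    while True:
        new = set()
        for (s, p, o) in inferred:
            if p != "about":
                continue
            for (s2, p2, o2) in inferred:
                if s2 == o and p2 in PROPAGATING:
                    new.add((s, "about", o2))
        if new <= inferred:  # fixed point reached
            return inferred
        inferred |= new


def stories_about(triples, topic):
    """Return the sorted subjects that are 'about' the given topic."""
    return sorted(s for (s, p, o) in triples if p == "about" and o == topic)


closure = infer(EXPLICIT)
```

With only the explicit triples, `stories_about(EXPLICIT, "England")` finds nothing; on `closure` the same call returns `story1`, even though the journalist only tagged the player. This is the sense in which inference keeps both the tagging and the queries simple.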
Future of newsmedia
The BBC (and other media actors) are not using semantic web technology solely for its direct, technical advantages. There is a general feeling that the traditional media somewhat missed the boat when the web took off, as many publishing companies simply re-published their existing content in a static format on the net, failing to take advantage of the hyperlinked, interconnected and two-way nature of the medium.
The calculated guess is that, just as the hyperlink revolutionized digital content distribution, the semantic hyperlink and URI promise to have an even greater impact. Focused projects such as the World Cup 2010 portal will allow organizations such as the BBC to be at the forefront of such a change.
So?
Regarding the semantic web stack of technologies, the World Cup 2010 portal is arguably the first large-scale, mass-media site to use concept extraction, RDF and a triple store to deliver content. It demonstrates that this kind of technology is ready to deliver large-scale, mainstream products.
The BBC intends to continue using this approach for presenting content. They expect that "this technological approach will play a key role in the creation, navigation and management of over 12,000 athletes and index pages for the London 2012 Olympics".
Read more/sources
PLOS Publishing
The quite sizeable PLOS open access publishing platform uses the semantic web publishing framework Ambra. I am not familiar enough with the project to dig up the "why" part, but given the size of the project, the choice was presumably not made just to play with experimental tech...