This morning’s first session was dedicated to Using the Web of Data, or, as Alan Dix put it: “In the end, it’s not about data – it’s about use!” Alan and Richard Cyganiak were the keynoters for this session.
To start with, Alan pointed to the two sides of achieving the web of data: Firstly generating the web of data (a billion triples, as mighty as this may sound, is actually tiny, says Alan) and then, secondly, accessing the web of data.
With regard to generating the Web of Data, Alan distinguished between top down and bottom up approaches, counting to the former the creation of the web of data from legacy sources (i.e. where you take existing data and semantically lift them, e.g. from structured data) or web scraping such as DBpedia‘s extraction of data from Wikipedia.
N.B.: This notion of ‘top-down’ does not imply a hierarchical relationship, but rather means that there is already a plan for what is going to be put on the web of data (e.g. ‘all semi-structured information on Wikipedia’ or ‘dataset XY from project Z’). The bottom-up idea here implies that data is added as the result of an action, or interaction, as the user/s go, e.g. relationships are created as the user expands his or her social network. For instance on Amazon, user interaction is used to generate semantics: People do not tell Amazon what they like, they simply buy it.
Having relationships of course does not imply yet that these relationships are part of the Semantic Web. Or, as Alan put it, “why should I be RDFizing my online presence if none of my friends are?”
Please take a look at the PDF of the Alan’s slides (2,4 MB) – what I cannot reproduce here is a chart he developed, which was very useful for describing current scenarios on the web and which posed a twofold question:
Does a website/platform have the web of data implemented? YES/NO
Is the web of data on ta website/platform apparent to the user? YES/NO
The possible combinations (YES/YES, YES/NO, NO/YES, NO/NO) provide a good heuristic tool for describing what is currently available, with and without the Semantic Web. Take, for instance, the shiny interface of Talis’ Project Cenote: Cenote’s vision is to “make library data visible in many contexts, inside and outside of the library, making the data much more accessible and visible to a wider audience – benefiting current and potential users of library services wherever they are.” On Cenote, the user doesn’t see that it’s got the Web of Dat in it – it is actually implemented, but not in a way that is apparent to the user.
On the other end of the spectrum, you have a platform like Facebook: Alan referred to Facebook as “the user’s own web of data”, i.e. web of relationships: The user is aware of these relationships (they actually shape his interaction and communication with the site), and the (numerous!) apps on Facebook continually add relationships, but, regrettably, insulated from one another and not using RDF (and don’t you try to take data out of Facebook!).
Two examples of public data that Alan cited and that grow as people/institutions add data do them are Freebase (the “open database of the world’s information” – see previous posts on this blog about Freebase) and Swivel. Swivel allows people, institutions, anyone to upload and explore data, also featuring official data sources such as (links go to their Swivel pages): New York Federal Reserve Bank, UNESCO Institute for Statistics, DukeResearch or EUROSTAT. According to Alan, there is already more data on Swivel now than in the whole Linked Data cloud.
Alan also mentioned the Social Graph API – o yesterday evening Luca Hammer (one of the web 2.0 people who had joined the Open Hacking Session) introduced me to the WordPress Plugin “Meet your commenters” – Meet you commenters uses Social Graph to find social relations on the web, and adds these data to the commenter profiles it creates in WordPress.
Image via WikipediaOn a different note: I took sometime today to explore Alan’s homepage and found the cute Christmas Cracker’s application which was first developed in 1999 and which is now also available on Facebook. As trivial as it may sound at first – sending virtual Christmas Crackers (with more than 5000 possible combinations!) is a good showcase for developing Human Interaction Scenarios, and a number of papers have been written about the application. Here is the casestudy which Alan recommends to begin with: Designing experience – virtual Christmas Crackers.
The abstract and a list of links to all websites and demos Alan discussed can be found here. Full reference: A. Dix and R. Cyganiak (2008). Using the Web of Data. Keynote at WOD-PD 2008 | Web of Data Practitioners Days, Vienna, Austria – Oct 22-23, 2008. http://www.hcibook.com/alan/papers/WOD-PD-2008/
Even if you have not met Richard Cyganiak in person, you have certainly come across one of his creations: The Linked Data Cloud. Richard is a research assistant at DERI Galway. In his demo, he gave us the opportunity to gain hands on experience, introducing a tool he dubbed Snorql, which is basically an easier to use version of a SPARQL-endpoint, as it already has the required prefixes ‘pre-installed’:
Using the Snorql interface, we could explore the dataset we had created collaboratively during Keith Alexander and Yves Raimond’s session. Writing SPARQL queries manually can be a challenge, but is next to impossible if you (like me) don’t know the syntax. But today we could just copy and paste all the queries from a website Richard had put up prior to his session – thanks a lot for the excellent preparation and demonstration!
Richard also showed a couple of RDF browsers in action, e.g. the Tabulator Plugin (“a Firefox extension which allows Firefox to handle data as well as documents”), or the Marbles Linked Data browser which is running right on beckr.org/marbles; enter, for instance
http://api.talis.com/stores/wod-pd-sandbox/items/People/JanaHerwig (learn more about Marbles here).
Thank you, Alan and Richard – the combination of talk and demo was indeed a perfect intro towards using the Web of Data.