- Oct 22, 2008
Semantic Desktop, Lifting and Human Language Technology [WOD-PD]
The next session at WOD-PD was given by Leo Sauermann (German Research Center for Artificial Intelligence DFKI, Germany) and Brian Davis (DERI Galway, Ireland). Leo introduced the idea of the Semantic Desktop, and more specifically, the Nepomuk Social Semantic Desktop. There’s a good article about Nepomuk on Linux.com, written by Bruce Byfield on August 26, 2008, from which I quote the following enlightening passages:
Ansgar Bernardi, deputy head of the Knowledge Management Department at Deutsches Forschungszentrum für Künstliche Intelligenz (DFKI, or the German Research Center for Artificial Intelligence) and Nepomuk’s coordinator, explains, “The basic problem that we all face nowadays is how to handle vast amounts of information at a sensible rate.” […] “The point is, you have a vast amount of information on your desktop, hidden in files, hidden in emails, hidden in the names and structures of your folders. Nepomuk gives a standard way to handle such information.”
At a high level of generalization, Nepomuk has three main aspects, according to Bernardi. First, there is a standard framework for annotating pieces of information so that connections can be made between them. Second, there are ontologies, the sets of “documented shared understanding” or common concepts that can be defined for particular types of information, such as bio-science or computer desktop use. Finally, there are the tools for making or using the annotations and ontologies, what Bernardi calls the “workspaces that connect to other workspaces and help you in your day to day activities of collecting information, structuring it, making sense of it, and creating new information and communicating it.”
On his blog, Leo has provided the relevant download links for those who “want to get their hands dirty” with Nepomuk (as he put it). Leo Sauermann and Ansgar Bernardi also contributed an article about the Semantic Desktop to the recently published Social Semantic Web volume – a preview of the article is available here (in German – I’m sorry!).
Brian Davis’s part of the talk focused on Lifting and Human Language Technology (HLT) for the Semantic Desktop. Semantic lifting means capturing semantics and translating them into ontologies. Human language technology, in its broadest sense, can be described as computational methods for processing and manipulating language (for instance, text analysis).
One of the goals of the Semantic Desktop is speech act detection for email, with “speech act” understood as defined by John Searle. At its most basic, a speech act is simply an utterance, but it is often understood more specifically as an illocutionary act (a term introduced by John L. Austin in How to Do Things with Words), or a ‘performative utterance’, meaning that by saying something, one actually does something. For instance, the sentence “Please have the document ready for Workshop 1.” contains an instruction: it informs the reader about the requirements for a particular event, and asks him or her to meet these requirements.
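To make the idea a little more concrete, here is a minimal sketch of what rule-based speech act detection on single sentences might look like. It is not how Nepomuk does it: the cue patterns, the four labels and the function name `classify_speech_act` are all assumptions made up for illustration, and a real system would need far richer linguistic analysis.

```python
import re

# Hypothetical cue patterns, invented for this sketch (not taken from Nepomuk).
REQUEST_CUES = re.compile(r"^\s*(please|could you|can you|would you)\b", re.IGNORECASE)
COMMIT_CUES = re.compile(r"\b(i will|i'll|we will|we'll)\b", re.IGNORECASE)


def classify_speech_act(sentence: str) -> str:
    """Assign a coarse, illustrative speech act label to one sentence."""
    text = sentence.strip()
    # Polite imperatives and "could you ..." openers are treated as requests.
    if REQUEST_CUES.search(text):
        return "request"
    # Anything else ending in a question mark is treated as a question.
    if text.endswith("?"):
        return "question"
    # First-person future statements are treated as commitments.
    if COMMIT_CUES.search(text):
        return "commit"
    # Fallback: the sentence merely states something.
    return "inform"


print(classify_speech_act("Please have the document ready for Workshop 1."))
```

With these toy rules, the example sentence from above is labelled as a request, because it opens with “Please”.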
Brian also introduced Roundtrip Ontology Authoring (ROA), a process that allows non-expert users to author or amend an ontology using a simple, easy-to-learn controlled natural language. The process combines Controlled Language for Information Extraction (CLIE) with Text Generation and is developed on top of GATE. ROA is documented on the Nepomuk website; for further information about CLIE, read this article by Valentin Tablan, Tamara Polajnar, Hamish Cunningham and Kalina Bontcheva: User-friendly ontology authoring using a controlled language (PDF, 64 KB).
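The “roundtrip” idea can be illustrated with a toy example: controlled sentences are parsed into ontology statements, and the same statements are rendered back into controlled sentences that a user could edit again. The two-sentence grammar, the predicate names `subClassOf` and `instanceOf`, and the functions `parse` and `generate` below are assumptions of mine; the real CLIE grammar is richer and differs in detail, and this sketch does not use GATE at all.

```python
import re

# Hypothetical mini-grammar: each pattern maps one sentence shape to a predicate.
PATTERNS = [
    (re.compile(r"^(?P<s>\w+) is a type of (?P<o>\w+)\.$"), "subClassOf"),
    (re.compile(r"^(?P<s>\w+) is a (?P<o>\w+)\.$"), "instanceOf"),
]

# Text generation templates, one per predicate, mirroring the grammar above.
TEMPLATES = {
    "subClassOf": "{s} is a type of {o}.",
    "instanceOf": "{s} is a {o}.",
}


def parse(sentence):
    """Turn a controlled sentence into a (subject, predicate, object) triple."""
    for pattern, predicate in PATTERNS:
        match = pattern.match(sentence.strip())
        if match:
            return (match.group("s"), predicate, match.group("o"))
    return None  # sentence is outside the controlled language


def generate(triple):
    """Render a triple back into a controlled sentence."""
    subject, predicate, obj = triple
    return TEMPLATES[predicate].format(s=subject, o=obj)


sentence = "Workshop is a type of Event."
triple = parse(sentence)
print(triple)
print(generate(triple))
```

Because every template mirrors a grammar pattern, generating text from a parsed triple gives back the original sentence, which is the property that lets a non-expert read, change and re-submit the ontology as text.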