- Sep 26, 2008
Why Faviki is able to suggest tags in 13 languages
I recently got in touch with Vuk Miličić from Faviki. Faviki has been selected as a featured project on Google Code, and in that context Vuk describes in a little more detail how Faviki retrieves its suggestions. It’s really interesting! It also sheds more light on how DBpedia is used in Faviki: not directly for retrieving tags, but for translating them. Long live the smartness of linked data!
- Faviki fetches a web page and extracts the core text (without HTML and non-relevant content).
- Then it tries to figure out whether the content is in English. If it isn’t, it is sent to the Google language API, which detects the original language automatically, translates it into English and returns the translation.
- The content is then sent to the Zemanta API, which analyzes it and finds relevant links. Faviki uses the links to English Wikipedia: the article titles are used as semantic tags.
- If the user’s language is not English, we must translate the tags. Using the DBpedia datasets “Links to Wikipedia Article”, we can find the corresponding Wikipedia titles in one of 13 languages. These datasets contain the connections between English Wikipedia articles and the matching articles in other language editions of Wikipedia.
- Finally, the suggested tags are offered to the user.
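
The steps above can be sketched roughly as follows. This is only an illustration of the flow, not Faviki's actual code: the function `suggest_tags`, its parameters, and the `links` dictionary are all hypothetical names. The language detection, translation and link analysis services are passed in as plain callables instead of calling any real API, and the DBpedia datasets are stood in for by a small in-memory dictionary. Falling back to the English title when no translation exists is also an assumption.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for the DBpedia interlanguage-link data:
# English Wikipedia title -> {language code -> title in that language}.
InterlanguageLinks = Dict[str, Dict[str, str]]


def suggest_tags(
    core_text: str,
    user_lang: str,
    detect_language: Callable[[str], str],
    translate_to_english: Callable[[str], str],
    find_wikipedia_titles: Callable[[str], List[str]],
    links: InterlanguageLinks,
) -> List[str]:
    """Return tag suggestions for already-extracted page text."""
    # Translate the text into English only if it is not English already.
    if detect_language(core_text) != "en":
        core_text = translate_to_english(core_text)

    # The analyzer returns English Wikipedia article titles, used as tags.
    english_tags = find_wikipedia_titles(core_text)

    # English-speaking users get the titles unchanged.
    if user_lang == "en":
        return english_tags

    # Otherwise look up each title in the user's language.
    # Assumption: keep the English title when no translation is known.
    return [links.get(tag, {}).get(user_lang, tag) for tag in english_tags]


if __name__ == "__main__":
    # Toy data and stub services, for illustration only.
    toy_links = {"Germany": {"de": "Deutschland", "fr": "Allemagne"}}
    tags = suggest_tags(
        core_text="Ein Text über Deutschland und Musik",
        user_lang="de",
        detect_language=lambda text: "de",
        translate_to_english=lambda text: "A text about Germany and music",
        find_wikipedia_titles=lambda text: ["Germany", "Music"],
        links=toy_links,
    )
    print(tags)  # ['Deutschland', 'Music']
```

Note that the translation step and the tag lookup are independent: the page text is translated into English so it can be analyzed, while the interlanguage links only translate the resulting tags back into the user's language.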
Read the whole blog post on Vuk’s Faviki blog