Why Faviki is able to suggest tags in 13 languages
I recently got in touch with Vuk Miličić from Faviki – Faviki has been selected as a featured project on Google Code, and in that context, Vuk describes in a little more detail how Faviki retrieves its suggestions. It’s really interesting! It also sheds more light on the way that DBpedia is used in Faviki: not directly for retrieving tags, but for translating them – long live the smartness of linked data!
- Faviki fetches a web page and extracts its core text (without HTML and irrelevant content).
- Then it tries to figure out whether the content is in English. If it isn’t, it is sent to the Google language API, which detects the original language automatically, translates it into English and returns the translation.
- The content is then sent to the Zemanta API, which analyzes it and finds relevant links. Faviki uses the links to English Wikipedia – their titles are used as semantic tags.
- If the user’s language is not English, we must translate the tags. Using the DBpedia “Links to Wikipedia Article” datasets, we can find the corresponding Wikipedia titles in one of 13 languages. These datasets actually contain the connections between English Wikipedia articles and articles from Wikipedia in other languages.
- Finally, the suggested tags are offered to the user.
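The translation step in that list can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Faviki's actual code: the `INTERLANGUAGE_LINKS` table is a hypothetical in-memory stand-in for the DBpedia data (its entries are illustrative and not taken from a real dump), `translate_tags` is a made-up helper name, and falling back to the English title when no translation exists is my own assumption.

```python
from typing import Dict, List

# Hypothetical stand-in for the DBpedia interlanguage-link data:
# English Wikipedia title -> {language code -> title in that language}.
# Illustrative sample entries only, not taken from a real DBpedia dump.
INTERLANGUAGE_LINKS: Dict[str, Dict[str, str]] = {
    "Semantic Web": {"de": "Semantic Web", "fr": "Web sémantique"},
    "Linked data": {"fr": "Web des données"},
}


def translate_tags(english_titles: List[str], user_lang: str) -> List[str]:
    """Map English Wikipedia titles to titles in the user's language.

    Assumption: when no link exists for a title in the requested language,
    the English title is kept as the tag.
    """
    if user_lang == "en":
        # English tags need no translation.
        return list(english_titles)
    translated = []
    for title in english_titles:
        links = INTERLANGUAGE_LINKS.get(title, {})
        translated.append(links.get(user_lang, title))
    return translated


if __name__ == "__main__":
    print(translate_tags(["Semantic Web", "Linked data"], "fr"))
```

Because the tags are identified by their English Wikipedia titles, supporting another language in a sketch like this only means adding more entries to the lookup table; the earlier steps of the pipeline stay untouched.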
Read the whole blog post on Vuk’s Faviki blog