Semantic Web Company

  • Feb 17, 2011

Transforming spreadsheets into SKOS with Google Refine

  • Tools & Software, Uncategorized, Vocabularies & Languages

Looking for high-quality enterprise vocabularies, we recently turned our attention to the Global Industry Classification Standard (GICS), an industry taxonomy designed to categorize major public companies. It was developed by Morgan Stanley Capital International and Standard & Poor’s and is mainly used by the global financial community to aid in the investment research process.

GICS is available for download as .xls spreadsheet files in several languages. Of course, it would be much better to have this valuable taxonomy in a standard, machine-readable format. The Simple Knowledge Organization System (SKOS) is a perfect fit for a taxonomy like GICS. But how can a spreadsheet be turned into SKOS with minimal manual effort?

I chose to try Google Refine for this task because a promising RDF extension had recently been released by DERI’s Fadi Maali and Richard Cyganiak.

Google Refine is “a power tool for working with messy data, cleaning it up, transforming it from one format into another, extending it with web services, and linking it to databases”. It was previously known as Freebase Gridworks and has been developed further by Google since its acquisition of Metaweb.

[Image: Google Refine UI]

Refine is a very useful tool for filtering rows and then transforming rows, columns, and cells according to customizable patterns.

After applying all necessary transformations to the spreadsheet, one can edit the “RDF Skeleton”, where columns are mapped to literals, RDF properties, and RDF classes (which can be imported from their namespaces).
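To make the idea of mapping columns to SKOS more concrete, here is a minimal sketch in plain Python. It does not use Refine or its RDF extension. The two-column layout (`code`, `label`), the `http://example.org/gics/` namespace, and the rule that a parent's code is the child's code minus its last two digits are all assumptions made for illustration, and the sample rows are not copied from the official spreadsheets.

```python
import csv
import io

# Hypothetical namespace for illustration; not an official GICS URI.
BASE = "http://example.org/gics/"

# Assumed layout: one row per category with a numeric code and a label.
SAMPLE = """code,label
10,Energy
1010,Energy
101010,Energy Equipment & Services
10101010,Oil & Gas Drilling
"""


def concept_uri(code):
    """Build the URI reference for a concept from its code."""
    return f"<{BASE}{code}>"


def escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def rows_to_turtle(rows):
    """Map each row to a skos:Concept and return a Turtle document."""
    scheme = f"<{BASE}scheme>"
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "",
        f"{scheme} a skos:ConceptScheme .",
    ]
    for row in rows:
        code = row["code"].strip()
        lines.append("")
        lines.append(f"{concept_uri(code)} a skos:Concept ;")
        lines.append(f'    skos:notation "{code}" ;')
        lines.append(f'    skos:prefLabel "{escape(row["label"].strip())}"@en ;')
        if len(code) > 2:
            # Assumption: each level appends two digits to its parent's code.
            lines.append(f"    skos:broader {concept_uri(code[:-2])} ;")
        else:
            lines.append(f"    skos:topConceptOf {scheme} ;")
        lines.append(f"    skos:inScheme {scheme} .")
    return "\n".join(lines)


turtle = rows_to_turtle(csv.DictReader(io.StringIO(SAMPLE)))
print(turtle)
```

In Refine, the same decisions (which column becomes the URI, which becomes `skos:prefLabel`, how `skos:broader` is derived) are made interactively in the RDF Skeleton editor rather than in code.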

[Image: Editing the RDF Skeleton]

Once your valid SKOS model is ready, you can export it in RDF/XML or Turtle format. You may then want to load it into an ontology editor like Protégé or a thesaurus management tool like PoolParty in order to build upon it or connect it to other knowledge models. With PoolParty, the GICS taxonomy can also be used to tag and categorize documents and to provide semantic search and faceted navigation, and it can be published as Linked Data without further effort.

[Image: GICS loaded in PoolParty]

Working with Refine and its RDF extension was easy and fun. It is even possible to extract and save the transformation steps performed in Refine so that they can be re-applied to similarly structured spreadsheets. This came in very handy, as GICS is published in nine languages, each in a separate, identically structured spreadsheet.
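The benefit of re-applying one set of steps to several identically structured files can be sketched as follows. This is plain Python rather than Refine's own mechanism, and it assumes each language's spreadsheet has already been reduced to a code-to-label table. The `SHEETS` data, the `merge_labels` and `labels_to_turtle` helpers, and the `http://example.org/gics/` namespace are hypothetical, and the labels are illustrative rather than taken from the official files.

```python
from collections import defaultdict

# Hypothetical namespace for illustration; not an official GICS URI.
BASE = "http://example.org/gics/"

# Assumed input: one code -> label table per language tag.
SHEETS = {
    "en": {"10": "Energy", "55": "Utilities"},
    "de": {"10": "Energie", "55": "Versorgungsbetriebe"},
}


def merge_labels(sheets):
    """Collect the labels of every language under their shared code."""
    merged = defaultdict(dict)
    for lang, table in sheets.items():
        for code, label in table.items():
            merged[code][lang] = label
    return merged


def labels_to_turtle(merged):
    """Emit one skos:prefLabel per language for each concept."""
    lines = ["@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."]
    for code in sorted(merged):
        literals = " ,\n        ".join(
            '"{}"@{}'.format(label.replace("\\", "\\\\").replace('"', '\\"'), lang)
            for lang, label in sorted(merged[code].items())
        )
        lines.append("")
        lines.append(f"<{BASE}{code}> skos:prefLabel\n        {literals} .")
    return "\n".join(lines)


merged = merge_labels(SHEETS)
multilingual = labels_to_turtle(merged)
print(multilingual)
```

Because the concept URI is derived from the code, which is the same in every language, the per-language exports end up describing the same concepts and can be combined into a single multilingual SKOS model.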


2023 © Semantic Web Company