SEMAGROW: Stream Processing For Semantic Data


The agricultural industry shares huge amounts of data and embraces Linked Data technologies to a large extent. This research project aims to improve semantic data processing through enhanced Linked Data standards and the automatic alignment of different data models across semantic information infrastructures.


“A large number of benchmarks for SPARQL have been devised over the last years. However, only a few tackle federated query processing.”
— Axel-Cyrille Ngonga Ngomo (University of Leipzig)

What Are the Main Challenges in Querying Linked Data?

Linked Data is spread across distributed graph databases. When a SPARQL endpoint is provided, the data can be reused directly in other applications. Linked Data rests on the principle that data is published only once, at its source, and is then linked. Federated queries can therefore be very time-consuming, as they have to follow multiple link patterns. Another obstacle to a fully operable Linked Data environment is the variety of data models used in Linked Data applications. This leads to fragmentation and often prevents successful data queries and reuse.
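To make the idea of a federated query more concrete, here is a minimal sketch in Python that assembles a SPARQL 1.1 query using the standard SERVICE keyword to join data from two endpoints. The endpoint URLs and the choice of properties are assumptions made purely for illustration; they do not describe actual SEMAGROW data sources.

```python
# Hypothetical endpoint URLs, used for illustration only.
CROPS_ENDPOINT = "http://example.org/crops/sparql"
PUBLICATIONS_ENDPOINT = "http://example.org/publications/sparql"


def build_federated_query(crops_endpoint: str, publications_endpoint: str) -> str:
    """Return a SPARQL 1.1 query that joins patterns from two remote endpoints."""
    return f"""
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?crop ?label ?publication WHERE {{
  # First source: concepts and their preferred labels.
  SERVICE <{crops_endpoint}> {{
    ?crop skos:prefLabel ?label .
  }}
  # Second source: publications that link to those concepts.
  SERVICE <{publications_endpoint}> {{
    ?publication dcterms:subject ?crop .
  }}
}}
""".strip()


query = build_federated_query(CROPS_ENDPOINT, PUBLICATIONS_ENDPOINT)
print(query)
```

Each SERVICE block is evaluated against a different endpoint, and the shared variable ?crop is the link the query engine has to follow between them. This is where the cost of federated querying comes from.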

Which Solutions Are Considered to Improve Linked Data Processing?

Algorithms will be developed that identify the fastest way to process Linked Data. They will also indicate up front where barriers to successful data querying occur. These general quality improvements will require conceptual changes to Uniform Resource Identifiers as well as additional metadata. In this way the performance of the Linked Data environment can be improved. Another important quality improvement will be the automatic detection of differences between data models and the linking of congruent concepts.
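As a rough illustration of what linking congruent concepts can mean in practice, the sketch below matches concepts from two vocabularies by their normalized labels and writes the candidate links as skos:closeMatch statements. This is a deliberately naive technique chosen for the example, not the method developed in the project; the vocabularies, URIs, and function names are all hypothetical.

```python
SKOS_CLOSE_MATCH = "http://www.w3.org/2004/02/skos/core#closeMatch"


def normalize(label: str) -> str:
    """Lowercase a label and collapse surrounding and repeated whitespace."""
    return " ".join(label.lower().split())


def match_concepts(vocab_a: dict[str, str], vocab_b: dict[str, str]) -> list[tuple[str, str]]:
    """Pair concept URIs from two vocabularies whose normalized labels are equal."""
    # Index the second vocabulary by normalized label for quick lookup.
    by_label: dict[str, list[str]] = {}
    for uri, label in vocab_b.items():
        by_label.setdefault(normalize(label), []).append(uri)

    matches = []
    for uri_a, label in vocab_a.items():
        for uri_b in by_label.get(normalize(label), []):
            matches.append((uri_a, uri_b))
    return matches


def to_ntriples(matches: list[tuple[str, str]]) -> list[str]:
    """Serialize candidate links as N-Triples lines using skos:closeMatch."""
    return [f"<{a}> <{SKOS_CLOSE_MATCH}> <{b}> ." for a, b in matches]


# Two hypothetical vocabularies mapping concept URI to label.
vocab_a = {
    "http://example.org/a/maize": "Maize",
    "http://example.org/a/wheat": "Wheat",
}
vocab_b = {
    "http://example.org/b/c1": "maize ",
    "http://example.org/b/c2": "Barley",
}

matches = match_concepts(vocab_a, vocab_b)
for line in to_ntriples(matches):
    print(line)
```

Label equality only finds the easy cases. Differences in language, spelling, or granularity between data models need richer evidence, which is why automatic alignment is treated as a research problem.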

Further Developments at Semantic Web Company

The PoolParty semantic middleware is a key player in the Linked Data technology field. The software enables companies to process Linked Data, but it is also limited when different data models are used across various sources. SEMAGROW's tools and improvements for aligning data models automatically will be incorporated directly into PoolParty.

Project title


Project website


Project duration

3 years

Methods Applied

  • Data Modelling
  • Linked Data
  • Machine Learning

Project sponsors

LOD2: The Linked Data Technology Stack for Enterprises
SemaGrow is partially funded by the Seventh Framework Programme of the European Commission (FP7-ICT-2011.4.4a Intelligent Information Management) under Grant Agreement No. 318497.

Semantic Data Integration

Scalability and performance are key considerations in data integration projects.
