On metrics for measuring fragmentation of federation over SPARQL endpoints

Nur Aini Rakhmawati, Marcel Karnstedt, Michael Hausenblas, Stefan Decker

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

Processing a federated query over Linked Data is challenging because it must account for the number of sources, the source locations, and heterogeneous systems that differ in hardware, software, data structure, and data distribution. In this work, we investigate the relationship between data distribution and communication cost in a federated SPARQL query framework. We introduce the spreading factor as a dataset metric for computing the distribution of classes and properties throughout a set of data sources. To observe the relationship between the spreading factor and the communication cost, we generate 9 datasets using several data fragmentation and allocation strategies. Our experimental results show that the spreading factor is correlated with the communication cost between a federated engine and the SPARQL endpoints. In terms of partitioning strategies, partitioning triples based on properties and classes can minimize the communication cost. However, such partitioning can also reduce the performance of the SPARQL endpoints within the federation framework.

Original language: English
Title of host publication: WEBIST 2014 - Proceedings of the 10th International Conference on Web Information Systems and Technologies
Publisher: SciTePress
Pages: 119-126
Number of pages: 8
ISBN (Print): 9789897580239
DOIs
Publication status: Published - 2014
Externally published: Yes
Event: 10th International Conference on Web Information Systems and Technologies, WEBIST 2014 - Barcelona, Spain
Duration: 3 Apr 2014 - 5 Apr 2014

Publication series

Name: WEBIST 2014 - Proceedings of the 10th International Conference on Web Information Systems and Technologies
Volume: 1

Conference

Conference: 10th International Conference on Web Information Systems and Technologies, WEBIST 2014
Country/Territory: Spain
City: Barcelona
Period: 3/04/14 - 5/04/14

Keywords

  • Data distribution
  • Federated SPARQL query
  • Linked data
  • SPARQL endpoint
