On the impact of data distribution in federated SPARQL queries

Nur Aini Rakhmawati*, Michael Hausenblas

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

11 Citations (Scopus)

Abstract

With the growing number of publicly available SPARQL endpoints, federated queries become more and more attractive and feasible. Compared to queries against a single endpoint, queries that range over a number of endpoints pose new challenges, ranging from the type and number of datasets involved to the data distribution across the datasets. Existing research focuses on the data distribution in a central store and is mainly concerned with adopting well-known, traditional database techniques. In this work we investigate the impact of the data distribution in the context of federated SPARQL queries. We perform a number of experiments with four federation frameworks (Sesame Alibaba, Splendid, FedX, and Darq) against an RDF dataset, Dailymed, that we partition by graph and class. Our preliminary results confirm the intuition that the more datasets involved in query processing, the worse performance of federation query is and that the data distribution significantly influences the performance.

Original language: English
Title of host publication: Proceedings - IEEE 6th International Conference on Semantic Computing, ICSC 2012
Pages: 255-260
Number of pages: 6
Publication status: Published - 2012
Externally published: Yes
Event: 6th IEEE International Conference on Semantic Computing, ICSC 2012 - Palermo, Italy
Duration: 19 Sept 2012 to 21 Sept 2012

Publication series

Name: Proceedings - IEEE 6th International Conference on Semantic Computing, ICSC 2012

Conference

Conference: 6th IEEE International Conference on Semantic Computing, ICSC 2012
Country/Territory: Italy
City: Palermo
Period: 19/09/12 to 21/09/12
