TY - GEN
T1 - On the impact of data distribution in federated SPARQL queries
AU - Rakhmawati, Nur Aini
AU - Hausenblas, Michael
PY - 2012
Y1 - 2012
N2 - With the growing number of publicly available SPARQL endpoints, federated queries become more and more attractive and feasible. Compared to queries against a single endpoint, queries that range over a number of endpoints pose new challenges, ranging from the type and number of datasets involved to the data distribution across the datasets. Existing research focuses on the data distribution in a central store and is mainly concerned with adopting well-known, traditional database techniques. In this work we investigate the impact of the data distribution in the context of federated SPARQL queries. We perform a number of experiments with four federation frameworks (Sesame Alibaba, Splendid, FedX, and Darq) against an RDF dataset, Dailymed, that we partition by graph and class. Our preliminary results confirm the intuition that the more datasets involved in query processing, the worse performance of federation query is and that the data distribution significantly influences the performance.
AB - With the growing number of publicly available SPARQL endpoints, federated queries become more and more attractive and feasible. Compared to queries against a single endpoint, queries that range over a number of endpoints pose new challenges, ranging from the type and number of datasets involved to the data distribution across the datasets. Existing research focuses on the data distribution in a central store and is mainly concerned with adopting well-known, traditional database techniques. In this work we investigate the impact of the data distribution in the context of federated SPARQL queries. We perform a number of experiments with four federation frameworks (Sesame Alibaba, Splendid, FedX, and Darq) against an RDF dataset, Dailymed, that we partition by graph and class. Our preliminary results confirm the intuition that the more datasets involved in query processing, the worse performance of federation query is and that the data distribution significantly influences the performance.
UR - http://www.scopus.com/inward/record.url?scp=84870716155&partnerID=8YFLogxK
U2 - 10.1109/ICSC.2012.72
DO - 10.1109/ICSC.2012.72
M3 - Conference contribution
AN - SCOPUS:84870716155
SN - 9780769548593
T3 - Proceedings - IEEE 6th International Conference on Semantic Computing, ICSC 2012
SP - 255
EP - 260
BT - Proceedings - IEEE 6th International Conference on Semantic Computing, ICSC 2012
T2 - 6th IEEE International Conference on Semantic Computing, ICSC 2012
Y2 - 19 September 2012 through 21 September 2012
ER -