Tutorial ESWC 2018: From heterogeneous data to RDF graphs and back

Tutorial held during the 15th ESWC 2018

3rd or 4th June 2018

Olivier Corby, Catherine Faron Zucker, Maxime Lefrançois, Antoine Zimmermann


Abstract

With the rise of the Web and the Web of Data, the RDF data model may be used as a lingua franca to achieve semantic interoperability and to integrate and query data in heterogeneous formats. This tutorial introduces two domain-independent SPARQL extensions that together aim at lowering the overhead for knowledge engineers to adopt Semantic Web models and technologies. Application examples will be given in the field of open data.

SPARQL-Generate is an extension of SPARQL for querying not only RDF datasets but also documents in arbitrary formats. It offers a simple template-based option to generate RDF Graphs from documents in heterogeneous formats. SPARQL Template Transformation Language (STTL) is an extension of SPARQL which enables Semantic Web developers to support the many cases where they need to transform RDF data. It enables them to write specific yet compact RDF transformers toward other languages and formats, including RDF itself. Combining SPARQL-Generate and STTL enables users to develop a new variety of applications where RDF is used as a pivot language in Web applications requiring heterogeneous data transformation processes.
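The pivot idea described above can be illustrated with a plain-Python analogy. This is only a sketch of the two steps (lifting a document to triples, then rendering triples through a template); it does not use SPARQL-Generate or STTL syntax, and the sample readings, the `example.org` namespace, and the function names are assumptions made for illustration.

```python
import json

# Hypothetical air-quality readings standing in for a non-RDF source document.
SOURCE = '[{"id": "s1", "pm10": 21.5}, {"id": "s2", "pm10": 8.0}]'
BASE = "http://example.org/"  # assumed namespace, for illustration only


def lift(document):
    """Generation step: map each JSON record to a (subject, predicate, value) triple."""
    return [
        (f"{BASE}sensor/{record['id']}", f"{BASE}pm10", str(record["pm10"]))
        for record in json.loads(document)
    ]


def to_ntriples(triples):
    """Serialize the pivot graph as N-Triples, one statement per line."""
    return "\n".join(f'<{s}> <{p}> "{o}" .' for s, p, o in triples)


def to_html(triples):
    """Transformation step: render the same triples through a text template."""
    items = "".join(f"<li>{s}: {o}</li>" for s, _p, o in triples)
    return f"<ul>{items}</ul>"


triples = lift(SOURCE)
print(to_ntriples(triples))
print(to_html(triples))
```

In the tutorial itself, the first step is written declaratively as a SPARQL-Generate query and the second as STTL templates, so neither step requires general-purpose code like the above.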

Topic and Relevance

Developers commonly understand that Semantic Web models and technologies are enablers of semantic interoperability on the Web and the Web of Things, but that their adoption is bound to that of RDF data formats. The topic of this tutorial is SPARQL-Generate and STTL, which were developed by the organizers and recently presented at some of the main Semantic Web conferences (ISWC, ESWC). Both contribute to making the choice of a data format and that of a data model orthogonal. We claim that together they contribute to lowering the costs of reaching semantic interoperability on the Web.

This is the first tutorial that combines the two SPARQL extensions. A tutorial on STTL has already been given at the fourth French days on software engineering, JDEV, in 2017 (http://devlog.cnrs.fr/jdev2017). SPARQL-Generate was used during two hackathons in 2017, each involving a dozen Web developers and data scientists, in the context of the OpenSensingCity project (http://opensensingcity.emse.fr/tuba/ and http://opensensingcity.emse.fr/peniche/).

Duration and Sessions

The tutorial will take place over a half day and will alternate between presentations of the theory and short exercises that make learners proficient in using the two SPARQL extensions. Applications will focus on the management and browsing of existing RDF and non-RDF open data, including real-time sensor data.

Audience

This tutorial targets practitioners who use, or are responsible for, open data publication and management. It aims to give attendees some insight into how the presented SPARQL extensions make it easier to turn open data into Linked Open Data. Familiarity with RDF and SPARQL is assumed.

Tutorial Material

We ask attendees to come with their own laptop with Java 1.9 and an IDE installed.

Introductory slides

Introductory slides are available as a PDF.

Part 1: SPARQL-Generate

The data we will be using during this tutorial is available at the following location: https://ci.mines-stetienne.fr/aqi/.

The recommended ways to use SPARQL-Generate for this tutorial are the online playground or our brand-new standalone app.

If you are using the standalone app, you must download and unzip the tutorial workspaces.

Part 2: STTL

STTL slides are available as well as the STTL exercise.

Technical documentation for STTL and Linked Data Script (LDScript) is also available.

To test the STTL language, download the Corese server, then download and extract the STTL archive. The commands to run the server are shown below:

# with Linux:
java -jar corese-server-4.0.2.jar -pp /home/path/eswc/profile.ttl -lh -debug

# with Windows:
java "-Dfile.encoding=UTF-8" -jar corese-server-4.0.2.jar -pp eswc/profile.ttl -lh -debug

Then access the service using your web browser at http://localhost:8080, click on Demo (top right), then select Demo.
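The two launch commands above differ only in the encoding option and the profile path. As a minimal sketch, a small Python helper can assemble the right argument list for the current platform. The function name `corese_command` is hypothetical; the jar name and the `-pp`, `-lh`, and `-debug` options are copied verbatim from the commands above, not taken from separate Corese documentation.

```python
import platform
import shlex

JAR = "corese-server-4.0.2.jar"  # jar name taken from the commands above


def corese_command(profile, system=""):
    """Build the argument list that launches the Corese server with a profile."""
    system = system or platform.system()
    cmd = ["java"]
    if system == "Windows":
        # The Windows command above sets UTF-8 as the file encoding.
        cmd.append("-Dfile.encoding=UTF-8")
    # Remaining options are copied verbatim from the commands above.
    cmd += ["-jar", JAR, "-pp", profile, "-lh", "-debug"]
    return cmd


# Show the command for this platform; adjust the profile path to where
# the STTL archive was extracted.
print(shlex.join(corese_command("eswc/profile.ttl")))
```

The returned list can be passed to `subprocess.run` from the directory that contains the jar to actually start the server.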

References

Corby, Olivier; Faron-Zucker, Catherine; and Gandon, Fabien. "LDScript: a Linked Data Script Language". International Semantic Web Conference. Springer, Cham, 2017.

Corby, Olivier; and Faron-Zucker, Catherine. "STTL: a SPARQL-based transformation language for RDF". International Conference on Web Information Systems and Technologies, 2015.

Lefrançois, Maxime; Zimmermann, Antoine; and Bakerally, Noorani. "A SPARQL extension for generating RDF from heterogeneous formats". Extended Semantic Web Conference, ESWC, 2017.

Lefrançois, Maxime; Zimmermann, Antoine; and Bakerally, Noorani. "Flexible RDF generation from RDF and heterogeneous data sources with SPARQL-Generate". International Conference on Knowledge Engineering and Knowledge Management, 2016.

Agenda

Duration   Topic                                   Presenter
15'        Introduction                            All
1h15'      SPARQL-Generate                         Maxime Lefrançois, Antoine Zimmermann
1h15'      STTL                                    Olivier Corby, Catherine Faron Zucker
15'        Conclusion - questions - discussions    All

Organizers

Olivier Corby and Catherine Faron Zucker are both members of the WIMMICS joint team between Inria Sophia Antipolis Méditerranée and the I3S laboratory, within Université Côte d’Azur. They are co-authors of a book and two MOOCs on Semantic Web models and technologies. They designed the STTL language (http://ns.inria.fr/sparql-template).

Olivier Corby is a senior researcher at Inria Sophia Antipolis. His research interests are Programming the Semantic Web of Linked Data and Knowledge Representation and Reasoning. He is the designer of the Corese Semantic Web Factory, which implements STTL. He teaches the Semantic Web at the University of Nice and has given tutorials at several conferences.

Catherine Faron Zucker has been an associate professor at Université Nice Sophia Antipolis since 2002. Her main research interests are the Semantic Web, Ontologies, and Knowledge Representation and Reasoning. She is responsible for a master track on Web Science and Technologies, where she teaches Semantic Web models and technologies on a regular basis, and she has given several introductory presentations and tutorials on these topics.

Maxime Lefrançois and Antoine Zimmermann are both associate professors at École des Mines de Saint-Étienne and members of Laboratoire Hubert Curien. They co-designed the SPARQL-Generate language.

Maxime Lefrançois focuses on lowering the costs of adoption of the Semantic Web formalisms and technologies. He is the initiator and main developer of the SPARQL-Generate reference implementation, the main developer of the SEAS versioned and modular ontologies, and recently co-edited the joint OGC/W3C Semantic Sensor Network recommendation.

Antoine Zimmermann’s work is related to representing, reasoning about, and querying data and knowledge on the Semantic Web, with a particular interest in multicontextual, heterogeneous information modelling and exploitation. He currently coordinates the national project OpenSensingCity and has participated in several research projects at the national level in France and at the European level. He has taught Semantic Web technologies to Master-level students for 6 years.