About Linked Stage Graph

Linked Stage Graph is a Knowledge Graph developed during the Coding da Vinci Süd 2019 hackathon. The graph is built from a dataset provided by the National Archive of Baden-Württemberg, which contains black-and-white photographs and metadata about the Stuttgart State Theatre from the 1890s to the 1940s.
The nearly 7,000 photographs give vivid insights into on-stage events such as plays, operas and ballet performances, as well as off-stage moments and theater buildings. However, the images and the data set as currently organized are hard to use and explore for anyone unfamiliar with the logic an archive uses to structure information.
This project proposes ways for both humans and machines to explore and understand the data, using linked data and visualizations.

Goals

Create a Knowledge Graph

Create a linked data knowledge graph (KG) from the photographs and metadata and make it explorable for (linked) data, web and information enthusiasts via a SPARQL endpoint. Users are invited to query the data, build their own applications and visualizations from it, or connect it to other data sources.

Connect with Others

Extract named entities such as persons or performances from the (often unstructured) text and connect as many of them as possible to existing KGs, such as Wikidata. This lets us extend the information in the original graph with new knowledge.

Explore

Use the data from the KG to create a simple visualization that brings the photographs to life, so that culture, theater, photography and history enthusiasts can browse through the timeline of the Stuttgart State Theatre.

Creating the Knowledge Graph

What is a Knowledge Graph?

A knowledge graph is a "graph of data with the intent to compose knowledge".

Graph of data refers to a data set viewed as a set of entities represented as nodes, with their relations represented as edges.
Composing knowledge refers to a continual process of extracting and representing knowledge that enhances the interpretability of the resulting knowledge graph.

A knowledge graph is therefore a method to create and organize knowledge in a continuing process in a way that it can be interpreted by humans and machines alike.

Source: "Knowledge Graphs: New Directions for Knowledge Representation on the Semantic Web". Bonatti et al.
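The node-and-edge view described above can be made concrete with a minimal Python sketch. It is only an illustration: the shortened identifiers are stand-ins invented for this example, not actual URIs from the Linked Stage Graph, and `schema:author` is used here as an assumed property for the authorship edge.

```python
from collections import defaultdict

# Each fact is a (subject, predicate, object) triple. Subjects and objects
# are nodes; the predicate labels the edge between them.
# The shortened identifiers are illustrative stand-ins, not real URIs
# from the Linked Stage Graph.
TRIPLES = [
    ("performance:was-ihr-wollt-1923", "schema:isBasedOn", "work:twelfth-night"),
    ("performance:was-ihr-wollt-1923", "dbo:setDesigner", "person:felix-cziossek"),
    ("work:twelfth-night", "schema:author", "person:william-shakespeare"),
]


def outgoing_edges(triples):
    """Index the graph by subject: node -> list of (edge label, target node)."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return dict(index)


def reachable(edges, start):
    """Return every node reachable from `start` by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _, target in edges.get(node, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


EDGES = outgoing_edges(TRIPLES)
print(sorted(reachable(EDGES, "performance:was-ihr-wollt-1923")))
```

Following edges from the performance node also reaches the author of the underlying work, which is the sense in which adding edges "composes" knowledge that no single record states on its own.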

Workflow

[Figure: workflow]

From XML (EAD-DDB) to RDF

The metadata was provided in EAD (Encoded Archival Description), an XML standard for encoding descriptive information about archival records. To create a knowledge graph, the data has to be transformed into the Resource Description Framework (RDF).
Many XML-to-RDF converters already exist, but because of the unique structure of the provided metadata, none of them worked out of the box. In the end, we combined two approaches: the XML2RDF converter by rhizomik, and an EADRDF XSLT stylesheet that we adapted. Both outputs were imported into OpenLink Virtuoso and connected using owl:sameAs on the archival unit ids, which enabled us to merge them by semantic reasoning.
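As a rough illustration of the linking step, the sketch below pairs resources from two conversion outputs that share an archival unit id and emits one owl:sameAs statement per pair in N-Triples syntax. It is not the project's actual code: the `example.org` namespaces, the unit ids, and the function names are all hypothetical, and the real outputs may identify units differently.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical resource URIs keyed by archival unit id. The example.org
# namespaces and the ids are placeholders for whatever the two converters
# actually emitted.
XML2RDF_UNITS = {
    "2-2599390": "http://example.org/xml2rdf/unit/2-2599390",
    "2-0000001": "http://example.org/xml2rdf/unit/2-0000001",
}
EADRDF_UNITS = {
    "2-2599390": "http://example.org/eadrdf/unit/2-2599390",
}


def same_as_links(units_a, units_b):
    """Create one owl:sameAs triple for every unit id present in both outputs."""
    shared_ids = sorted(units_a.keys() & units_b.keys())
    return [(units_a[uid], OWL_SAME_AS, units_b[uid]) for uid in shared_ids]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) URI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


LINKS = same_as_links(XML2RDF_UNITS, EADRDF_UNITS)
NTRIPLES = to_ntriples(LINKS)
print(NTRIPLES)
```

Loaded into a triple store with owl:sameAs reasoning enabled, statements of this shape let a query against either resource see the properties of both.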

Named Entity Extraction and Linking (Connecting with Others)

The provided metadata contains interesting information about the performances and photographs in the form of semi-structured or unstructured text. For example, the resource http://slod.fiz-karlsruhe.de/labw-2-2599390 has a title (dcterms:title) and an abstract (dcterms:description). This semi-structured textual information, shown in the first table below, can be interpreted by humans but not by machines, so it cannot be queried or visualized in a meaningful way. We started to tackle this issue in two steps:

Before Linking

Property             Value
dcterms:title        Was ihr wollt (William Shakespeare)
dcterms:description  Schauspiel
                     Art und Datum der Aufführung: Neuinszenierung, 11.03.1923
                     Inszenierung: Curt Elwenspoek
                     Bühnenbild: Felix Cziossek
                     Kostüme: Ernst Pils

After Linking

Property             Value
dcterms:title        Was ihr wollt (William Shakespeare)
dcterms:description  Schauspiel
                     Art und Datum der Aufführung: Neuinszenierung, 11.03.1923
                     Inszenierung: Curt Elwenspoek
                     Bühnenbild: Felix Cziossek
                     Kostüme: Ernst Pils
schema:isBasedOn     <http://www.wikidata.org/entity/Q221211>
                     <http://d-nb.info/gnd/4316770-6>
dbo:setDesigner      <http://www.wikidata.org/wiki/Q55638867>
schema:contributor   <http://www.wikidata.org/entity/Q692>

The second table above adds several mappings. For instance, Felix Cziossek is mapped to the corresponding item in Wikidata via the property dbo:setDesigner, and the play "Was ihr wollt" is mapped to the corresponding creative work in Wikidata and GND via the property schema:isBasedOn.
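The "Label: value" layout of dcterms:description in the tables above lends itself to a simple first extraction pass. The following Python sketch splits such a description into a genre, a set of labeled fields and a performance date. It is an assumption about how one could approach the extraction, not the project's pipeline, and it does not perform the linking itself, which would additionally require looking the extracted names up in Wikidata or GND.

```python
import re
from datetime import date

# Description text copied from the example resource shown above.
DESCRIPTION = """Schauspiel
Art und Datum der Aufführung: Neuinszenierung, 11.03.1923
Inszenierung: Curt Elwenspoek
Bühnenbild: Felix Cziossek
Kostüme: Ernst Pils"""

# German day.month.year dates such as 11.03.1923.
DATE_PATTERN = re.compile(r"(\d{2})\.(\d{2})\.(\d{4})")


def parse_description(text):
    """Split a description into genre, labeled fields and the first date found."""
    record = {"genre": None, "fields": {}, "date": None}
    for line in (raw.strip() for raw in text.splitlines()):
        if not line:
            continue
        label, separator, value = line.partition(":")
        if not separator:
            # A line without a label; assume the first such line is the genre.
            if record["genre"] is None:
                record["genre"] = line
            continue
        record["fields"][label.strip()] = value.strip()
        match = DATE_PATTERN.search(value)
        if match and record["date"] is None:
            day, month, year = (int(part) for part in match.groups())
            record["date"] = date(year, month, day)
    return record


PARSED = parse_description(DESCRIPTION)
print(PARSED["genre"], PARSED["date"], PARSED["fields"]["Bühnenbild"])
```

Each extracted field then becomes a candidate for a typed property; for example, the value under "Bühnenbild" (stage design) is the name that ends up behind dbo:setDesigner once a matching Wikidata item has been found.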

What does this mean?

In this example, we have now created new knowledge in the form of facts that both humans and machines can interpret.

This new, structured information can now be queried using SPARQL. Another advantage of linking entities from our knowledge graph to other knowledge graphs is that we can make use of all the data linked to these resources (e.g. for Twelfth Night), as the Lodview Linked Open Data view of the resource shows.

Exploration

We provide several means of exploration. For non-technical users who simply want to enjoy the photographs along with their descriptions and the persons involved, we created the Linked Stage Graph Viewer and also set up the Vikus Viewer. For technically advanced users, we provide an endpoint that can be queried using SPARQL.

Preprocessing: AI-Based Image Coloring

What breathes more life into photographs than a little bit of color? Using a tool based on artificial intelligence, we automatically colorized each photo in the data set with interesting outcomes. While the results aren’t close to perfection, we believe that the color adds a new vibrant dimension to these historical photos.

User Interfaces

Linked Stage Graph Viewer

The Linked Stage Graph Viewer is an exploration interface created by us. It lets users explore the images from the data set in a fashion similar to an Instagram feed. We cropped the photographs automatically to focus on their most interesting sections.
The photographs are arranged in a timeline from 1912 to 1943, which can be explored by scrolling up and down. Swiping left and right reveals other performances that took place in the same year. Clicking on a title takes you to the Lodview interface, which shows all the metadata we have for that performance.

Vikus Viewer

The Vikus Viewer was created by Christopher Pietsch in the context of the Urban Complexity Lab at FH Potsdam. We found that the viewer works great with the data and photographs from our knowledge graph. The timeline allows the user to dynamically explore the images and metadata. Users can filter the content, zoom into it and focus on individual images which also reveals some of the metadata we have gathered.

SPARQL Endpoint

You can query the Linked Stage Graph using our SPARQL endpoint.

Use the following prefixes with your query:


	PREFIX dcterms: <http://purl.org/dc/terms/>
	PREFIX gnd: <http://d-nb.info/gnd/>
	PREFIX schema: <http://schema.org/>
	PREFIX slod: <http://slod.fiz-karlsruhe.de/>
	PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
	PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

Example Query 1: Query Linked Stage Graph

Select all resources and their labels, for which there is a contributor (schema:contributor) listed in the data set. Optionally, also show the genre of each resource (schema:genre).


    SELECT DISTINCT ?resource ?label ?contributor ?name ?genre
    WHERE {
        ?resource schema:contributor ?contributor .
        ?resource rdfs:label ?label .
        ?contributor rdfs:label ?name .
        OPTIONAL { ?resource schema:genre ?genre }
    }
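To send a query like Example Query 1 from a script rather than the web form, the prefixes and the query body have to be combined and passed to the endpoint. The Python sketch below only builds the request URL and makes no network call. The endpoint address is a placeholder, since the real address is not given in this text, and the helper names are made up for this example. It relies on the SPARQL 1.1 Protocol convention that a GET request carries the query text in the `query` parameter, and it declares `rdfs` because the query uses rdfs:label.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder address: substitute the actual Linked Stage Graph endpoint.
ENDPOINT = "http://example.org/sparql"

PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "gnd": "http://d-nb.info/gnd/",
    "schema": "http://schema.org/",
    "slod": "http://slod.fiz-karlsruhe.de/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

QUERY_BODY = """SELECT DISTINCT ?resource ?label ?contributor ?name ?genre
WHERE {
    ?resource schema:contributor ?contributor .
    ?resource rdfs:label ?label .
    ?contributor rdfs:label ?name .
    OPTIONAL { ?resource schema:genre ?genre }
}"""


def build_query(prefixes, body):
    """Prepend one PREFIX declaration per namespace to the query body."""
    header = "\n".join(f"PREFIX {name}: <{iri}>" for name, iri in prefixes.items())
    return f"{header}\n{body}"


def build_request_url(endpoint, query):
    """Encode the query as the `query` parameter of a GET request URL."""
    return f"{endpoint}?{urlencode({'query': query})}"


def decode_query(url):
    """Recover the query text from a request URL (useful for debugging)."""
    return parse_qs(urlparse(url).query)["query"][0]


QUERY = build_query(PREFIXES, QUERY_BODY)
REQUEST_URL = build_request_url(ENDPOINT, QUERY)
print(REQUEST_URL)
```

The resulting URL can be fetched with any HTTP client. Which result formats the endpoint returns (for example JSON via an `Accept: application/sparql-results+json` header) depends on the server and should be checked against it.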

Example Query 2: Federated Query (via Wikidata)

Select all resources, their labels and their respective representation in Wikidata (schema:isBasedOn). Additionally, query Wikidata for these resources and select the publication date (Wikidata property P577) of each one.


    SELECT DISTINCT ?resource ?resourcelabel ?publicationdate
    WHERE {
        ?resource schema:isBasedOn ?wikiresource .
        ?resource rdfs:label ?resourcelabel .

        SERVICE <https://query.wikidata.org/sparql> {
            ?wikiresource <http://www.wikidata.org/prop/direct/P577> ?publicationdate .
        }
    }

Disclaimer: When the Wikidata server is busy, this federated query may cause a timeout. If this occurs, please try again a few minutes later. Thank you!
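When federated queries are run from a script, the "try again later" advice can be automated with a small retry helper. The sketch below is one possible approach, not part of the project: `run_with_retries` and its parameters are hypothetical names, and it assumes the HTTP client signals a timeout by raising `TimeoutError`, which may differ for the client you use. The demonstration replaces the real query with a simulated one that fails twice.

```python
import time


def run_with_retries(run_query, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call run_query(); on TimeoutError wait and retry with doubling delays."""
    for attempt in range(attempts):
        try:
            return run_query()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            sleep(base_delay * 2 ** attempt)


# Demonstration with a simulated endpoint that times out twice.
CALLS = {"count": 0}


def flaky_query():
    CALLS["count"] += 1
    if CALLS["count"] < 3:
        raise TimeoutError("simulated endpoint timeout")
    return ["row"]


DELAYS = []  # records the waits instead of actually sleeping
RESULT = run_with_retries(flaky_query, sleep=DELAYS.append)
print(RESULT, DELAYS)
```

For a busy public endpoint, a base delay of a minute or more is closer to the advice in the disclaimer than the one-second default used here for illustration.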

Example Query 3: Federated Query (via DBpedia)

Select the English-language DBpedia abstract of each linked resource (e.g. persons and plays).


		
    SELECT DISTINCT ?resource ?resourcelabel ?dbp ?abstract
    WHERE {
        ?resource schema:isBasedOn ?wikiresource .
        ?resource rdfs:label ?resourcelabel .

        SERVICE <http://dbpedia.org/sparql> {
            # Find the DBpedia resource that points to the linked Wikidata entity.
            ?dbp ?p ?wikiresource .
            FILTER regex(str(?dbp), 'http://dbpedia.org/', 'i')
            ?dbp <http://dbpedia.org/ontology/abstract> ?abstract .
            FILTER (lang(?abstract) = 'en') .
        }
    }

Disclaimer: When the DBpedia server is busy, this federated query may cause a timeout. If this occurs, please try again a few minutes later. Thank you!

Team

Tabea Tietz

Junior Researcher at FIZ Karlsruhe and Karlsruhe Institute of Technology (AIFB).

Supervisor:
Prof. Harald Sack

Kanran Zhou

Student of electrical engineering at Karlsruhe Institute of Technology and student co-worker at FIZ Karlsruhe.

Supervisor:
Prof. Harald Sack

Jörg Waitelonis

Linked Data enthusiast,
yovisto GmbH

Paul Felgentreff

Web Dev

Addicted to progress and nature, design & Green Marketing ‐ not green washing but actual good marketing ✌