Tutorial 1: Introducing Graph Data

The semantic web can seem like unfamiliar and daunting territory at first. To understand what the semantic web is and how it works, you must first understand how it stores data. We start from the ground up by outlining the graph database - the data storage model used by the semantic web.

After this tutorial, you should be able to:

  • Describe in basic terms what the semantic web is.
  • Experience the paradigm shift of storing information in a graph database, rather than in a hierarchical or relational database.
  • Understand that the semantic web of data is defined using Resource Description Framework (RDF).
  • Understand the basic principles of RDF statements and how they can define data graphs.

Estimated time: 5 minutes

If you come from a traditional IT background and are used to the idea of storing data either in a hierarchy (for example XML) or in a relational database (for example MySQL, MS SQL), you may not yet have come across Resource Description Framework, or RDF.

RDF is a common acronym within the semantic web community because it is one of the basic building blocks of the web of semantic data. What it defines is a type of database which you may not be immediately familiar with: something called a graph database.

Although it might not be familiar to you, it is the type of database on which the semantic web is built, globally. We will learn why in these tutorials.

During this lesson, you will learn what a graph database is and how RDF defines one, and you will visualise graph data so you can get a feel for what it looks like. We will begin by comparing hierarchical, relational, and graph databases to see how they differ.

1.1 Introducing The Graph Database

Relational database vs graph database

In most types of data storage, some elements of data (for example, data nodes or data tables) take precedence over, or are more important than, other elements.

For example, take an XML document. An XML document typically contains nodes of information each with a parent node. At the root of the document is the highest level node, which has no parent.

Take a look at the illustration above. In a data graph, there is no concept of roots (or a hierarchy). A graph consists of resources related to other resources, with no single resource having any particular intrinsic importance over another.
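The contrast between a hierarchy and a graph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny XML document, the resource names A, B and C, and the "knows" relationship are all made up for the example. The hierarchy has exactly one element with no parent, while this particular graph has no resource that stands above the others; more generally, nothing in the graph model designates a root.

```python
import xml.etree.ElementTree as ET

# Hierarchy: every element except one (the root) has exactly one parent.
# The document below is a made-up example.
document = ET.fromstring("<library><shelf><book/><book/></shelf></library>")
parent_of = {child: parent for parent in document.iter() for child in parent}
tree_roots = [el.tag for el in document.iter() if el not in parent_of]

# Graph: a set of (source, relationship, target) edges between resources.
# The resources A, B, C and the "knows" relationship are hypothetical.
edges = {("A", "knows", "B"), ("B", "knows", "C"), ("C", "knows", "A")}
nodes = {source for source, _, _ in edges} | {target for _, _, target in edges}

# A root-like resource would be one that no edge points to.
graph_roots = sorted(nodes - {target for _, _, target in edges})

print(tree_roots)   # ['library']
print(graph_roots)  # []
```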

An Example Of A Data Graph

It's easiest first to look at a series of statements about how things relate to each other and to visualize these as a graph before looking at how these relationships might be expressed in RDF. Look at the following statements describing the relationship between a dog (called Bengie) and a cat (called Bonnie):

Bengie is a dog.
Bonnie is a cat.
Bengie and Bonnie are friends.

Let's turn these three simple statements into a data graph:

The relationships implied by this graph are fairly intuitive, but to be thorough let's review them. We can see that our two things - identified by "Thing 1" and "Thing 2" - have the properties name, animalType and friendsWith.

From this, we can see that "Thing 1"'s name is Bengie, and "Thing 2"'s name is Bonnie. "Thing 1" is a dog, and "Thing 2" is a cat. And finally, both are friends with each other (implied by the friendsWith property pointing in both directions).

Important Point: The arrows in the above diagram are properties, which RDF terminology sometimes calls predicates. Remember for now that the terms property and predicate are interchangeable, and that it is the arrows that describe the properties in the graph.
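The graph described above can also be written down as a plain list of statements. The sketch below is one possible encoding in Python, not RDF itself: each tuple holds a thing, a property (an arrow in the graph), and the value the arrow points to. The helper name values_of is an assumption made for illustration.

```python
# Each entry is (thing, property, value); the property is the arrow in the graph.
statements = [
    ("Thing 1", "name", "Bengie"),
    ("Thing 1", "animalType", "dog"),
    ("Thing 1", "friendsWith", "Thing 2"),
    ("Thing 2", "name", "Bonnie"),
    ("Thing 2", "animalType", "cat"),
    ("Thing 2", "friendsWith", "Thing 1"),
]


def values_of(thing, prop):
    """Follow every arrow labelled `prop` that leaves `thing`."""
    return [value for s, p, value in statements if s == thing and p == prop]


print(values_of("Thing 1", "name"))         # ['Bengie']
print(values_of("Thing 1", "friendsWith"))  # ['Thing 2']
print(values_of("Thing 2", "friendsWith"))  # ['Thing 1']
```

Note that the friendship appears twice, once in each direction, which matches the two friendsWith arrows in the graph.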

Before formally introducing simple RDF, here is a quick example to give you a flavor of what it looks like.

1.2 A Starting Example Of RDF

<?xml version="1.0" encoding="UTF-8"?>

<rdf:RDF
	xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:region="http://www.country-regions.fake/">

	<rdf:Description rdf:about="http://en.wikipedia.org/wiki/Oxford">
		<dc:title>Oxford</dc:title>
		<dc:coverage>Oxfordshire</dc:coverage>
		<dc:publisher>Wikipedia</dc:publisher>
		<region:population>10000</region:population>
		<region:principaltown rdf:resource="http://www.country-regions.fake/oxford"/>
	</rdf:Description>

</rdf:RDF>

Don't worry about the details for now; we will review them later. For now, know that this is RDF/XML - the XML form of RDF. There are other ways of recording RDF, but we will look only at RDF/XML in these starting tutorials.

1.3 The RDF Statement (Triple)

Each property element inside the <rdf:Description> tags above, taken together with the resource being described, forms an RDF statement, sometimes called an RDF triple. Of the two, triple is the more helpful term, as it reflects how a statement breaks down into three constituent parts: the subject, predicate, and object.

It is easiest first to illustrate these terms in the form of a simple graph. Look at the following graph of data describing the color of a T-shirt:

Example graph showing color property of a T-shirt

In terms of the simple graph above, the:

  • Subject is the T-shirt
  • Predicate (property) is the color
  • Object is white
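The three parts listed above can be given names in code. The following is a minimal sketch, assuming a hypothetical Triple type that is not part of any RDF library; it simply labels the three positions of the T-shirt statement.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: a subject, a predicate (property), and an object."""
    subject: str
    predicate: str
    object: str


statement = Triple(subject="T-shirt", predicate="color", object="white")

# The statement unpacks into its three constituent parts, in order.
subject, predicate, obj = statement
print(f"{subject} --{predicate}--> {obj}")  # T-shirt --color--> white
```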

Important Point: RDF, whilst the foundation for defining data structures on the semantic web, does not in itself describe the semantics, or meaning, behind the data. That comes later, when we introduce RDFS (RDF Schema) and OWL (Web Ontology Language). Don't worry about these for now. First, we need to learn how RDF structures data and relationships, and how that differs from the more familiar ways of storing data: we need to make a paradigm shift from relational or hierarchical modelling to a graph model.

Let's look at the T-shirt graph in terms of a simple RDF/XML statement:

<?xml version="1.0" encoding="UTF-8"?>

<rdf:RDF
	xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
	xmlns:feature="http://www.linkeddatatools.com/clothing-features#">

	<rdf:Description rdf:about="http://www.linkeddatatools.com/clothes#t-shirt">

		<feature:color rdf:resource="http://www.linkeddatatools.com/colors#white"/>

	</rdf:Description>

</rdf:RDF>

We will formally break this statement down in the next lesson, but before moving on, see if you can get a feel for how the RDF above describes the subject, predicate, and object.
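One way to check your reading of the T-shirt statement is to pull the three parts out with a short script. The sketch below uses only Python's standard XML parser and handles just the simple layout used in this lesson (rdf:Description elements with rdf:about, and property elements holding either rdf:resource or plain text); it is not a general RDF/XML parser. The function name extract_triples is an assumption made for illustration, and the XML declaration is omitted from the embedded string for simplicity.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The T-shirt statement from this lesson.
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:feature="http://www.linkeddatatools.com/clothing-features#">
  <rdf:Description rdf:about="http://www.linkeddatatools.com/clothes#t-shirt">
    <feature:color rdf:resource="http://www.linkeddatatools.com/colors#white"/>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples from simple RDF/XML."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        # The subject is the resource the description is about.
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree stores a qualified tag as "{namespace}local";
            # joining the two parts gives the predicate's full URI.
            namespace, local = prop.tag[1:].split("}")
            predicate = namespace + local
            # The object is another resource (rdf:resource) or literal text.
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


triples = extract_triples(RDF_XML)
for subject, predicate, obj in triples:
    print("subject:  ", subject)
    print("predicate:", predicate)
    print("object:   ", obj)
```

Running it prints the T-shirt URI as the subject, the color property URI as the predicate, and the white URI as the object.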

You have completed this lesson. You should now understand the following:

  • What a data graph is.
  • That the semantic web is a giant, global data graph defined in RDF (Resource Description Framework).
  • The all-important shift in thinking from storing data in relational or hierarchical models to storing it in graph models.
  • The subject, predicate and object in terms of basic data graphs and RDF statements.
  • A basic familiarity with the layout of an RDF document.

You should now be able to start the next tutorial: Introducing RDF.
