Nepomuk Appendix A - RDF for Dummies in a Nutshell

In my previous posts I used some terms that probably need explaining. The following descriptions should not be used as the basis for any exam and may very well scare some academic semantic web professionals, but they get me through the day. And I think they are sufficient to understand most of what is going on with Nepomuk data in KDE.

RDF - The Resource Description Framework describes a way of storing data. While "classical" databases are based on tables, RDF data consists of triples and only triples. Each triple, called a statement, consists of

subject - predicate - object

The subject is a resource, the predicate is a relation, and the object is either another resource or a literal value. A literal can be a string, an integer, a double, or any other type defined by XML Schema (it is even possible to define custom literal types). Since RDF was born as a web technology, all resources and relations are identified by a unique URI. (This means they have a namespace, often ending in a #, and a name. Typically abbreviations such as foo:bar are used.) Thus, a dataset in RDF is basically a graph where resources are the nodes, predicates the links, and literals act as leaves.
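To make the graph picture concrete, here is a minimal Python sketch (not how Nepomuk or Soprano store data) that keeps statements as plain tuples. The prefix table, the resource names and the foo:age property are assumptions made up for illustration.

```python
# Hypothetical prefix table: maps an abbreviation to its namespace URI.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foo": "http://foo.bar/types#",
}

def expand(name):
    """Turn an abbreviation such as foo:Mary into its full URI."""
    prefix, local = name.split(":", 1)
    return PREFIXES[prefix] + local

# Each statement is a (subject, predicate, object) tuple.
# Resources are written as prefixed names; the literal is a plain Python int.
triples = [
    ("foo:Mary", "foo:isMotherOf", "foo:Carl"),
    ("foo:Mary", "foo:age", 42),  # foo:age is invented for this sketch
]

def outgoing(subject):
    """Return the (predicate, object) links leaving one node of the graph."""
    return [(p, o) for s, p, o in triples if s == subject]
```

Calling outgoing("foo:Mary") walks the links that leave the foo:Mary node, while a literal such as 42 can only ever appear at the end of a link.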

RDF defines one important default property: rdf:type, which assigns a type to a resource.

RDFS - The RDF Schema defines a set of resources and properties extending RDF. This extension basically makes it possible to define ontologies. RDFS defines the two important classes rdfs:Resource and rdfs:Class, which introduce the distinction between instances and types. It also defines properties for building type hierarchies (rdfs:subClassOf and rdfs:subPropertyOf) and for specifying details when defining properties (rdfs:domain and rdfs:range).

This makes it possible to create new classes and properties much like in object-oriented programming. For example:

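@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foo: <http://foo.bar/types#> .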

foo:Human rdf:type rdfs:Class .
foo:Woman rdf:type rdfs:Class .
foo:Woman rdfs:subClassOf foo:Human .

foo:isMotherOf rdf:type rdf:Property .
foo:isMotherOf rdfs:domain foo:Woman .
foo:isMotherOf rdfs:range foo:Human .

foo:Mary rdf:type foo:Woman .
foo:Mary foo:isMotherOf foo:Carl .

This is a simple example of how to define an ontology in RDFS (using the Turtle language). The last two important predicates in RDFS are rdfs:label and rdfs:comment, which define human-readable names and comments for any resource (the labels are used for matching fields and grouping results in my previous blog entry on search).
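The object-oriented analogy can be sketched in a few lines of Python. This is only an illustration of what rdfs:subClassOf, rdfs:domain and rdfs:range imply for the example data, not a complete RDFS reasoner and not how Nepomuk does it. The function names are made up.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# The relevant statements from the Turtle example, as plain tuples.
triples = {
    ("foo:Woman", SUBCLASS, "foo:Human"),
    ("foo:isMotherOf", "rdfs:domain", "foo:Woman"),
    ("foo:isMotherOf", "rdfs:range", "foo:Human"),
    ("foo:Mary", RDF_TYPE, "foo:Woman"),
    ("foo:Mary", "foo:isMotherOf", "foo:Carl"),
}

def superclasses(cls):
    """Return cls plus every class reachable via rdfs:subClassOf."""
    found, todo = set(), [cls]
    while todo:
        current = todo.pop()
        if current not in found:
            found.add(current)
            todo.extend(o for s, p, o in triples
                        if s == current and p == SUBCLASS)
    return found

def types_of(resource):
    """Collect explicit types plus those implied by domain and range."""
    direct = {o for s, p, o in triples if s == resource and p == RDF_TYPE}
    for s, p, o in triples:
        if s == resource:  # the subject of p gets p's domain as a type
            direct |= {c for prop, rel, c in triples
                       if prop == p and rel == "rdfs:domain"}
        if o == resource:  # the object of p gets p's range as a type
            direct |= {c for prop, rel, c in triples
                       if prop == p and rel == "rdfs:range"}
    result = set()
    for cls in direct:
        result |= superclasses(cls)
    return result
```

Here foo:Mary comes out as both a foo:Woman and a foo:Human, and foo:Carl is a foo:Human even though nobody stated his type explicitly: it follows from the range of foo:isMotherOf.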

NRL - The Nepomuk Representation Language was developed in Nepomuk to extend RDFS further. I will not go into detail and explain everything about NRL, but stick to what is important with respect to KDE at the moment.

Most importantly, NRL changes triples to quadruples, where the fourth "parameter" is another resource defining the graph in which the statement is stored (it may be empty, which means the statement is stored in the "default graph"). This graph (or context, as it is called in Soprano) is just another resource which groups a set of statements and lets us "attach" information to this set. NRL defines a set of graph types, of which two are important here: nrl:InstanceBase and nrl:Ontology. The first one defines graphs that contain instances and the second one, well, you guessed it, defines graphs that contain types and predicates.

To make this clearer let's extend our example with NRL stuff:

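@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix nrl: <http://semanticdesktop.org/ontologies/2007/08/15/nrl#> .
@prefix foo: <http://foo.bar/types#> .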

foo:graph1 rdf:type nrl:Ontology .
foo:graph2 rdf:type nrl:InstanceBase .

foo:Human rdf:type rdfs:Class foo:graph1 .
foo:Woman rdf:type rdfs:Class foo:graph1 .
foo:Woman rdfs:subClassOf foo:Human foo:graph1 .

foo:isMotherOf rdf:type rdf:Property foo:graph1 .
foo:isMotherOf rdfs:domain foo:Woman foo:graph1 .
foo:isMotherOf rdfs:range foo:Human foo:graph1 .

foo:Mary rdf:type foo:Woman foo:graph2 .
foo:Mary foo:isMotherOf foo:Carl foo:graph2 .
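The quadruples above can be modelled in Python as four-element tuples. The following is a rough sketch of how the fourth element groups statements, with made-up helper names; Soprano's real API looks different.

```python
from collections import defaultdict

# (subject, predicate, object, graph) tuples taken from the example.
quads = [
    ("foo:Human", "rdf:type", "rdfs:Class", "foo:graph1"),
    ("foo:Woman", "rdfs:subClassOf", "foo:Human", "foo:graph1"),
    ("foo:Mary", "rdf:type", "foo:Woman", "foo:graph2"),
    ("foo:Mary", "foo:isMotherOf", "foo:Carl", "foo:graph2"),
]

# The type of each graph, as declared at the top of the example.
graph_types = {
    "foo:graph1": "nrl:Ontology",
    "foo:graph2": "nrl:InstanceBase",
}

def statements_by_graph(quads):
    """Group the plain triples by the graph they are stored in."""
    grouped = defaultdict(list)
    for s, p, o, g in quads:
        grouped[g].append((s, p, o))
    return dict(grouped)

def statements_in(graph_type):
    """Return all triples stored in graphs of the given type."""
    return [(s, p, o) for s, p, o, g in quads
            if graph_types.get(g) == graph_type]
```

Asking for statements_in("nrl:InstanceBase") returns only the two statements about foo:Mary, which is exactly the ontology/instance split described above.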

But making a distinction between ontology and instance resources is not all we gain from contexts.

NAO - The Nepomuk Annotation Ontology defines resource types and properties you have already encountered in KDE, such as nao:Tag and nao:rating. But it also defines nao:created, a property that assigns an xsd:dateTime literal to a resource, in our case a graph. This way we store information about when a piece of information was inserted into the Nepomuk repository.

foo:graph1 nao:created "2008-02-12T14:43:00.022Z"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
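Because the date is attached to the graph rather than to each statement, finding out when a statement was inserted means looking up its graph first. Here is a small Python sketch of that lookup. The timestamp value and the helper names are assumptions for illustration, and the parser only handles the one lexical form used here (UTC with fractional seconds), not every valid xsd:dateTime.

```python
from datetime import datetime, timezone

quads = [
    ("foo:Woman", "rdfs:subClassOf", "foo:Human", "foo:graph1"),
    ("foo:Mary", "foo:isMotherOf", "foo:Carl", "foo:graph2"),
]

# nao:created values per graph (illustrative timestamp).
created = {"foo:graph1": "2008-02-12T14:43:00.022Z"}

def parse_datetime(lexical):
    """Parse a dateTime of the form YYYY-MM-DDThh:mm:ss.fffZ."""
    return datetime.strptime(lexical, "%Y-%m-%dT%H:%M:%S.%f%z")

def inserted_at(statement):
    """Return the creation date of the graph holding the statement."""
    for s, p, o, g in quads:
        if (s, p, o) == statement and g in created:
            return parse_datetime(created[g])
    return None  # unknown statement, or its graph has no nao:created
```

The statement about foo:Mary yields None in this sketch simply because no creation date was recorded for foo:graph2.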

SPARQL - The query language for RDF is what we use to query the RDF repository. Its syntax was designed to resemble SQL, but since it is still quite young it is not nearly as powerful yet.

Anyway, this is what a simple query that retrieves the mother of Carl looks like:

prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foo: <http://foo.bar/types#>

select ?r where { ?r foo:isMotherOf foo:Carl . }
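What the query engine does with such a pattern can be imitated in a few lines of Python. This is a toy matcher for a single triple pattern, written only to show the idea of binding variables. It is not how Soprano or any real SPARQL engine is implemented, and the function name is made up.

```python
triples = [
    ("foo:Mary", "rdf:type", "foo:Woman"),
    ("foo:Mary", "foo:isMotherOf", "foo:Carl"),
]

def match(pattern, triples):
    """Yield one binding dict per triple that fits the pattern.

    Terms starting with '?' are variables, everything else must be equal.
    """
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if isinstance(term, str) and term.startswith("?"):
                # A variable used twice must bind to the same value.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield bindings

# Equivalent of: select ?r where { ?r foo:isMotherOf foo:Carl . }
mothers = [b["?r"] for b in match(("?r", "foo:isMotherOf", "foo:Carl"), triples)]
```

A full query with several patterns would additionally have to join the bindings of the individual patterns, which is left out here.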

Or if we take NRL into account:

prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foo: <http://foo.bar/types#>
prefix nrl: <http://semanticdesktop.org/ontologies/2007/08/15/nrl#>

select ?r where { graph ?g { ?r foo:isMotherOf foo:Carl . } . ?g rdf:type nrl:InstanceBase . }
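The graph-aware query boils down to two steps: find the graphs typed as nrl:InstanceBase, then keep only matches stored in one of them. Below is a Python sketch of that logic over plain tuples. The foo:Anna statement and foo:graph3 are invented so that there is something to filter out, and None stands in for the default graph.

```python
quads = [
    # Graph type declarations live in the default graph (None here).
    ("foo:graph1", "rdf:type", "nrl:Ontology", None),
    ("foo:graph2", "rdf:type", "nrl:InstanceBase", None),
    ("foo:isMotherOf", "rdfs:domain", "foo:Woman", "foo:graph1"),
    ("foo:Mary", "foo:isMotherOf", "foo:Carl", "foo:graph2"),
    # Hypothetical statement in a graph that has no declared type.
    ("foo:Anna", "foo:isMotherOf", "foo:Carl", "foo:graph3"),
]

def mothers_of(child):
    """Mothers of child, restricted to statements in instance graphs."""
    # Step 1: ?g rdf:type nrl:InstanceBase
    instance_graphs = {s for s, p, o, g in quads
                       if p == "rdf:type" and o == "nrl:InstanceBase"}
    # Step 2: graph ?g { ?r foo:isMotherOf child }
    return [s for s, p, o, g in quads
            if p == "foo:isMotherOf" and o == child and g in instance_graphs]
```

Only foo:Mary is returned for foo:Carl, since the invented foo:Anna statement sits in a graph that was never declared an nrl:InstanceBase.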

I think this is enough for today. I hope this blog entry helps in understanding the inner workings of Nepomuk better. Let me just give one more hint: Soprano (the RDF storage solution we use in KDE) comes with static QUrl objects for most of the common resource URIs. You can find them in the Soprano::Vocabulary namespace.


in 24 hours?


For a quick fun intro to Semantic Web vision in general, see: http://www.youtube.com/watch?v=OGg8A2zfWKg

For a slightly longer intro to RDF see: http://research.talis.com/2005/rdf-intro/

For a quite in-depth RDF tutorial see: http://www.w3.org/TR/REC-rdf-syntax/

By gromgull at Tue, 02/12/2008 - 14:10

by Tim Berners-Lee and fellows is


By leobard at Wed, 04/16/2008 - 09:58