Fountain depicting the Fox and the Stork. Image credit: Jordiferrer

Aesopica, Part 3: Named Graphs

This article is the third part of a series, examining the use of the Clojure language for representing Linked Data, with examples from Aesop’s stories. In part one the basic elements of “The Fox and the Stork” story were formalised as Linked Data in Clojure, while in part two we investigated how various literal values can be described. In this article we examine how information about facts themselves, such as meta-information, can be described with Linked Data. As always, the functionality detailed in these articles can be found in the Aesopica library for using Clojure to write Linked Data.

In Linked Data, facts are represented as triples of subjects, predicates and objects. For example, when representing the story of “The Fox and the Stork”, one fact that we want to represent is “The Fox gives an invitation.” In this fact “The Fox” is the subject, “gives” is the predicate and “an invitation” is the object. Of course, as we mentioned in our previous articles, one of the strengths of Linked Data is that the elements are more precisely defined than just their natural language representations in a sentence. A Uniform Resource Identifier (URI) is used to identify these elements more formally. The previous fact can then be written as follows, using the Turtle notation of RDF:

When using a base prefix, this can be shortened to:

@base <> .
<fox> <gives-invitation> <invitation1>.

Using the Clojure-based notation introduced in the previous articles, we can write this same fact as follows:

   ;; ::aes/... assumes the Aesopica core namespace is aliased as aes
   {::aes/context
    {nil ""}
    ::aes/facts
    #{[:fox :gives-invitation :invitation1]}}

The above-mentioned fact is just one of many needed to represent the full story of “The Fox and the Stork”. In most cases a multitude of facts is needed to capture the required knowledge. A set of facts, each consisting of a subject, a predicate and an object, forms a knowledge graph, which provides us with a very general, but precise, way to represent knowledge.

However, there are scenarios in which we want to represent knowledge about the facts themselves. One way Linked Data/RDF facilitates this is through the use of “named graphs”. Named graphs allow us to associate an identifier (a URI) with a fact, or a set of facts. This essentially gives a name to a graph in the knowledge base, hence the notion of “named graphs”. Such an identifier can then be used to add information about the facts with which it is associated.

The NQUADS syntax for RDF illustrates one way such named graphs can be represented. In this representation all the elements of the fact (the subject, the predicate, the object and, optionally, a graph name) are written out fully, separated by spaces and concluded with a dot (.).

To take a single fact as an example, here is an NQUADS representation stating that “for the first invitation the Stork has been invited”, with this fact being part of the “first dinner” named graph:

<> <> <> <> .

This fact has four elements, so we can refer to them together as a quad, as opposed to a triple for facts consisting of just a subject, predicate and object. As mentioned previously, URIs are used to precisely identify each element: <> is the subject, <> is the predicate and <> is the object. In addition, the graph is identified by the URI <>.

The big benefit of using such identifiers as names for the graphs is that they themselves can be part of facts. For example, if we want to express that the facts contained in the “first dinner” graph occur before the facts of the “second dinner” graph, we can use the following fact:

<> <> <> .

Note that this fact itself is not part of any named graph. In a knowledge base of facts this would make it a part of the “default graph”. A default graph is a graph without any particular name. This makes the mixing of “regular” facts, where each fact consists of a triple, and facts in explicit named graphs, where each fact is a quad, possible in a single knowledge base.

A slightly expanded version of “The Fox and the Stork” story using named graphs in the NQUADS format can therefore be written as follows:

<> <> <> .
<> <> <> .
<> <> <> .
<> <> <> <> .
<> <> <> <> .
<> <> <> <> .
<> <> <> <> .
<> <> <> <> .
<> <> <> <> .
<> <> <> <> .
<> <> <> <> .
<> <> <> <> .

Now that we have introduced the concept of “named graphs”, we want to introduce a way to represent them in the Clojure representation of Linked Data. Just as NQUADS extends triples to quads to indicate the name of the graph, we extend our previously introduced Clojure syntax so that facts can be quads as well as triples. As in NQUADS, the optional fourth element of each fact is the identifier of the named graph. Any regular triple-based fact is part of the default graph in the knowledge base, again as in the NQUADS representation.

The resulting Clojure representation of the above-mentioned Linked Data story can be written as follows:

(def fox-and-stork-named-graph-edn
  ;; ::aes/... assumes the Aesopica core namespace is aliased as aes
  {::aes/context
   {nil ""
    :rdf ""
    :time ""}
   ::aes/facts
   #{[:fox :rdf/type :animal]
     [:stork :rdf/type :animal]
     [:fox :gives-invitation :invitation1 :dinner1]
     [:invitation1 :has-invited :stork :dinner1]
     [:invitation1 :has-food :soup :dinner1]
     [:invitation1 :serves-using :shallow-plate :dinner1]
     [:stork :gives-invitation :invitation2 :dinner2]
     [:invitation2 :has-invited :fox :dinner2]
     [:invitation2 :has-food :crumbled-food :dinner2]
     [:invitation2 :serves-using :narrow-mouthed-jug :dinner2]
     [:invitation1 :serves-using :narrow-mouthed-jug :dinner2]
     [:dinner1 :time/before :dinner2]}})
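
To illustrate how the optional fourth element separates the default graph from the named graphs, here is a minimal sketch in plain Clojure. It is not part of Aesopica: the helper names `graph-name`, `facts-by-graph` and `facts-about-graphs` are hypothetical, and the sketch works on a small subset of the facts with no context map.

```clojure
;; A subset of the story's facts: three elements for a triple,
;; an optional fourth element naming the graph.
(def facts
  #{[:fox :rdf/type :animal]
    [:fox :gives-invitation :invitation1 :dinner1]
    [:invitation1 :has-invited :stork :dinner1]
    [:stork :gives-invitation :invitation2 :dinner2]
    [:dinner1 :time/before :dinner2]})

(defn graph-name
  "Hypothetical helper: the fourth element of a fact, or nil when the
  fact is a plain triple and so belongs to the default graph."
  [fact]
  (nth fact 3 nil))

(defn facts-by-graph
  "Groups facts by graph name, reducing each fact to its triple.
  The nil key holds the default graph."
  [facts]
  (reduce (fn [acc fact]
            (update acc (graph-name fact) (fnil conj #{}) (vec (take 3 fact))))
          {}
          facts))

(defn facts-about-graphs
  "Default-graph facts whose subject is itself the name of a graph."
  [facts]
  (let [grouped (facts-by-graph facts)
        names   (set (remove nil? (keys grouped)))]
    (set (filter #(contains? names (first %)) (get grouped nil)))))
```

Calling `(facts-by-graph facts)` yields a map with the keys `nil`, `:dinner1` and `:dinner2`, while `(facts-about-graphs facts)` picks out the `:time/before` fact, since its subject `:dinner1` is also used as a graph name.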

The main difference between the Clojure representation and NQUADS is that the Clojure representation uses prefixes and NQUADS uses full URIs written out each time. This is a deliberate design choice in syntax from both perspectives. In NQUADS this allows the format to represent each fact on a single line, without the need for a lookup based on context for the full URI of elements. In the Clojure representation the prefixes allow for a much more compact fact representation that makes for easier reading and writing by human users.
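
This trade-off can be made concrete with a small sketch that expands prefixed keywords into full URIs and writes one NQUADS line per fact. It is only an illustration, not how Aesopica performs its conversion: the function names `expand` and `fact->nquad` are hypothetical, the base URI is a placeholder, and the `:rdf` and `:time` entries assume the standard RDF and OWL-Time namespaces. It handles only keyword elements, not literal values.

```clojure
(require '[clojure.string :as str])

;; Assumed context for illustration: the nil key is the base prefix.
;; The example.org base URI is a placeholder, not the article's own.
(def context
  {nil   "http://example.org/foxstork/"
   :rdf  "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   :time "http://www.w3.org/2006/time#"})

(defn expand
  "Expands a keyword such as :rdf/type or :fox into a full URI string by
  looking up its namespace (nil for unprefixed keywords) in the context."
  [context kw]
  (str (get context (some-> (namespace kw) keyword)) (name kw)))

(defn fact->nquad
  "Writes a triple or quad of keywords as a single NQUADS line."
  [context fact]
  (str (str/join " " (map #(str "<" (expand context %) ">") fact)) " ."))

;; One line per fact, each with every URI written out in full.
(def example-lines
  (map #(fact->nquad context %)
       [[:fox :rdf/type :animal]
        [:invitation1 :has-invited :stork :dinner1]]))
```

Each resulting line can be read without any surrounding context, at the cost of repeating the same long URI prefixes for every element.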

There are a number of other formats for writing Linked Data, some of which support named graphs. TriG, for example, is an extension of the Turtle format used in previous articles in this series. JSON-LD is another very commonly used Linked Data format that supports named graphs. With the introduction of the Clojure way of writing Linked Data in this series, it makes sense to enable translating Linked Data into these formats, both for compatibility and to reach a wider audience. The details of how to achieve this will be covered in another article.

Newres Al Haider
Postdoctoral Researcher and Software Engineer