olmstad

Reputation: 696

How to store a tree with ordered children in RDF? How to traverse such a structure in SPARQL?

How do I store a tree with ordered children in RDF?

Input:

1. Title 1
   Some text under title 1.
2. Title 2
2.1. Title 2.1
     Some text under title 2.1.
2.2. Title 2.2
     Some text under title 2.2.

Titles can be arbitrary and do not necessarily contain numbering.

How do I get all elements back, still in their original order, in one query?

Desired output:

|-----------+----------------------------|
| Title     | Content                    |
|-----------+----------------------------|
| Title 1   | Some text under title 1.   |
| Title 2   |                            |
| Title 2.1 | Some text under title 2.1. |
| Title 2.2 | Some text under title 2.2. |
|-----------+----------------------------|

EDIT: "Calculate length of path between nodes?" doesn't answer my question. It discusses unordered nodes. My question is specifically about an ordered collection (a list of lists) and getting the elements back in their original order.

Upvotes: 2

Views: 669

Answers (2)

Stanislav Kralin

Reputation: 11479

Option 1

You could serialize the RDF as flattened JSON-LD and write a simple recursive function in, e.g., JavaScript.

var nquads = `
<http://ex.com/titleCollection> <http://ex.com/subtitles> _:b1 .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> _:b2 .
_:b1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:b3 .
_:b2 <http://www.w3.org/2000/01/rdf-schema#label> "Title 1" .
_:b2 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ex.com/Title> .
_:b2 <http://www.w3.org/2000/01/rdf-schema#comment> "some text under title 1" .
_:b3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> _:b4 .
_:b3 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .
_:b4 <http://ex.com/subtitles> _:b5 .
_:b4 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ex.com/Title> .
_:b4 <http://www.w3.org/2000/01/rdf-schema#comment> "some text under title 2" .
_:b4 <http://www.w3.org/2000/01/rdf-schema#label> "Title 2" .
_:b5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> _:b6 .
_:b5 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> _:b7 .
_:b6 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ex.com/Title> .
_:b6 <http://www.w3.org/2000/01/rdf-schema#comment> "some text under title 2.1" .
_:b6 <http://www.w3.org/2000/01/rdf-schema#label> "Title 2.1" .
_:b7 <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> _:b8 .
_:b7 <http://www.w3.org/1999/02/22-rdf-syntax-ns#rest> <http://www.w3.org/1999/02/22-rdf-syntax-ns#nil> .
_:b8 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ex.com/Title> .
_:b8 <http://www.w3.org/2000/01/rdf-schema#comment> "some text under title 2.2" .
_:b8 <http://www.w3.org/2000/01/rdf-schema#label> "Title 2.2" .
`;

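// Convert the N-Quads to JSON-LD (a flat array of node objects), then walk from the root.
jsonld.fromRDF(nquads, {format: 'application/nquads'}, function (err, doc) {
   if (err) throw err
   print(doc, "http://ex.com/titleCollection")
});

// Print a node's label and comment, then recurse into its ordered subtitles.
function print(doc, id) {
   var what = get(doc, id)
   var label = what['http://www.w3.org/2000/01/rdf-schema#label']
   var comment = what['http://www.w3.org/2000/01/rdf-schema#comment']
   var subtitles = what['http://ex.com/subtitles']
   if (label) console.log(label[0]['@value'])
   if (comment) console.log(comment[0]['@value'])
   if (subtitles) {
      // The rdf:first/rdf:rest chain arrives as an ordered '@list' array.
      for (var i of subtitles[0]['@list']) print(doc, i['@id'])
   }
}

// Find the node object with the given '@id'.
function get(doc, id) {return doc.find((element) => (element['@id'] == id))}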
<script src="https://cdnjs.cloudflare.com/ajax/libs/jsonld/0.4.12/jsonld.min.js"></script>

Original Turtle was:

@prefix ex: <http://ex.com/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:titleCollection ex:subtitles
    (
        [
        a ex:Title ; rdfs:label "Title 1" ;
        rdfs:comment "some text under title 1" 
        ]
        [
        a ex:Title ; rdfs:label "Title 2" ;
        rdfs:comment "some text under title 2" ;
        ex:subtitles
            (
                [
                a ex:Title ; rdfs:label "Title 2.1" ;
                rdfs:comment "some text under title 2.1" 
                ]
                [
                a ex:Title ; rdfs:label "Title 2.2" ;
                rdfs:comment "some text under title 2.2" 
                ]
            )
        ]
    ) .
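The Turtle collection syntax above expands into the rdf:first / rdf:rest chains shown in the N-Quads. As a rough sketch of the same traversal without the jsonld library, the chains can be walked directly. The `triples` array, and the helper names `object`, `listMembers` and `orderedLabels`, are made up for this illustration; only the labels are included to keep it short.

```javascript
const RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
const RDFS = "http://www.w3.org/2000/01/rdf-schema#";
const EX = "http://ex.com/";

// Hypothetical in-memory copy of the relevant triples (labels only).
const triples = [
  [EX + "titleCollection", EX + "subtitles", "_:b1"],
  ["_:b1", RDF + "first", "_:b2"],
  ["_:b1", RDF + "rest", "_:b3"],
  ["_:b2", RDFS + "label", "Title 1"],
  ["_:b3", RDF + "first", "_:b4"],
  ["_:b3", RDF + "rest", RDF + "nil"],
  ["_:b4", RDFS + "label", "Title 2"],
  ["_:b4", EX + "subtitles", "_:b5"],
  ["_:b5", RDF + "first", "_:b6"],
  ["_:b5", RDF + "rest", "_:b7"],
  ["_:b6", RDFS + "label", "Title 2.1"],
  ["_:b7", RDF + "first", "_:b8"],
  ["_:b7", RDF + "rest", RDF + "nil"],
  ["_:b8", RDFS + "label", "Title 2.2"],
];

// Return the first object found for (subject, predicate), or undefined.
function object(s, p) {
  const t = triples.find(([ts, tp]) => ts === s && tp === p);
  return t ? t[2] : undefined;
}

// Collect the members of an RDF collection in list order.
function listMembers(head) {
  const members = [];
  for (let cell = head; cell && cell !== RDF + "nil"; cell = object(cell, RDF + "rest")) {
    members.push(object(cell, RDF + "first"));
  }
  return members;
}

// Depth-first walk: a node's label first, then its ordered children.
function orderedLabels(node, out = []) {
  const label = object(node, RDFS + "label");
  if (label) out.push(label);
  const head = object(node, EX + "subtitles");
  if (head) listMembers(head).forEach((child) => orderedLabels(child, out));
  return out;
}

console.log(orderedLabels(EX + "titleCollection"));
// → [ 'Title 1', 'Title 2', 'Title 2.1', 'Title 2.2' ]
```

The order comes entirely from following rdf:rest, so it does not depend on how the store happens to lay out the triples.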

Option 2

Another option is to rely on storage order, hoping that items are stored in order of appearance.

Turtle syntax for blank node property lists and collections forces correct "order of appearance".

In GraphDB, after importing the above Turtle, you could run:

PREFIX ex: <http://ex.com/> 
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ent: <http://www.ontotext.com/owlim/entity#>

SELECT ?label ?comment {
    ?s a ex:Title ; rdfs:label ?label ; rdfs:comment ?comment
} ORDER BY ent:id(?s)

Option 3

Another option is to use inferencing.

  1. First, let's invent our own format for ordered trees, e.g. the following one:

    :title0 a :Node; rdfs:label "Book";
            :down :title1.
    :title1 a :Node; rdfs:label "Title 1";
            :down :title11;
            :right :title2.
    :title2 a :Node; rdfs:label "Title 2";
            :down :title21;
            :right :title3.
    :title3 a :Node; rdfs:label "Title 3";
            :down :title31.
    
  2. Second, let's restore initial tree ordering (and transitively close it). In SWRL:

    right(?a, ?b) ^ right(?b, ?c) -> right(?a, ?c)
    down(?a, ?b) ^ right(?b, ?c) -> down(?a, ?c)
    down(?a, ?b) ^ down(?b, ?c) -> down(?a, ?c)
    

    You could use OWL axioms instead, or assert some of the inferred statements explicitly.

  3. Third, let's formulate rules that define an ordering corresponding to depth-first traversal order:

    right(?a, ?b) -> after(?a, ?b)
    down(?a, ?b) -> after(?a, ?b)
    down(?a, ?c) ^ right(?a, ?b) ^ down(?b, ?d) -> after(?c, ?d)
    down(?a, ?c) ^ right(?a, ?b) -> after(?c, ?b)
    right(?a, ?b) ^ down(?b, ?c) -> after(?a, ?c)
    

    I'm not sure that this set of rules is minimal or elegant.

  4. Now, your SPARQL query should be:

    SELECT ?s (SAMPLE(?label) AS ?title) (COUNT(?o) AS ?count) {
        ?s a :Node ; rdfs:label ?label .
        OPTIONAL { ?s :after ?o }
    } GROUP BY ?s ORDER BY DESC(?count)
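For comparison, the depth-first order that the rules in step 3 aim to encode can be sketched client-side by following :down and :right directly. This is only an illustration, not the inference itself: the `nodes` object is a hand-written stand-in for the data in step 1, and the labels for title11, title21 and title31 are invented, since the example leaves those nodes undescribed.

```javascript
// Hypothetical in-memory version of the :down / :right data from step 1.
const nodes = {
  title0:  { label: "Book",      down: "title1" },
  title1:  { label: "Title 1",   down: "title11", right: "title2" },
  title11: { label: "Title 1.1" },                 // label assumed
  title2:  { label: "Title 2",   down: "title21", right: "title3" },
  title21: { label: "Title 2.1" },                 // label assumed
  title3:  { label: "Title 3",   down: "title31" },
  title31: { label: "Title 3.1" },                 // label assumed
};

// Pre-order walk: visit a node, then its first child's sibling chain,
// then move on to the node's own right-hand sibling.
function preorder(id, out = []) {
  for (let cur = id; cur; cur = nodes[cur].right) {
    out.push(nodes[cur].label);
    if (nodes[cur].down) preorder(nodes[cur].down, out);
  }
  return out;
}

console.log(preorder("title0"));
// → [ 'Book', 'Title 1', 'Title 1.1', 'Title 2', 'Title 2.1', 'Title 3', 'Title 3.1' ]
```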
    

Upvotes: 2

Jeen Broekstra

Reputation: 22052

You could model your example data as follows:

ex:title1 a ex:Title ;
          rdfs:label "Title 1";
          rdfs:comment "some text under title 1".

ex:title2 a ex:Title ;
          rdfs:label "Title 2";
          rdfs:comment "some text under title 2".


ex:title21 a ex:Title ;
          rdfs:label "Title 2.1";
          rdfs:comment "some text under title 2.1".

ex:title22 a ex:Title ;
          rdfs:label "Title 2.2";
          rdfs:comment "some text under title 2.2".
ex:title2 ex:subtitles (ex:title21 ex:title22).
ex:titleCollection ex:subtitles (ex:title1 ex:title2) .

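Then a query for all items in order could use a very basic lexical ordering by title:

# ex: is assumed to be the same namespace as in the other answer
PREFIX ex: <http://ex.com/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

select ?title ?content 
where {  
    [] ex:subtitles/rdf:rest*/rdf:first [ 
                      rdfs:label ?title ;
                      rdfs:comment ?content ] .
} 
order by ?title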

Result:

Evaluating SPARQL query...
+-------------------------------------+-------------------------------------+
| title                               | content                             |
+-------------------------------------+-------------------------------------+
| "Title 1"                           | "some text under title 1"           |
| "Title 2"                           | "some text under title 2"           |
| "Title 2.1"                         | "some text under title 2.1"         |
| "Title 2.2"                         | "some text under title 2.2"         |
+-------------------------------------+-------------------------------------+
4 result(s) (4 ms)

If you don't want to rely on the actual title property to provide correct ordering, you could of course introduce an explicit ordering property with hierarchical numbering, and use the value of that in your order by clause.
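To illustrate why an explicit ordering value can be safer than the title itself: plain string ordering puts "Title 10" before "Title 2". Below is a minimal sketch of comparing hierarchical numbers segment by segment, assuming the query results have been fetched and are sorted client-side. The `order` field and the `compareHierarchical` helper are hypothetical names for this example, and "Title 10" is added only to show the problem.

```javascript
// Compare hierarchical numbers such as "2.1" and "10" segment by segment.
function compareHierarchical(a, b) {
  const as = a.split(".").map(Number);
  const bs = b.split(".").map(Number);
  for (let i = 0; i < Math.max(as.length, bs.length); i++) {
    if (as[i] === undefined) return -1; // a is a prefix of b, so the parent comes first
    if (bs[i] === undefined) return 1;
    if (as[i] !== bs[i]) return as[i] - bs[i];
  }
  return 0;
}

// Hypothetical result rows carrying an explicit ordering value.
const rows = [
  { order: "10",  title: "Title 10" },
  { order: "2.2", title: "Title 2.2" },
  { order: "2",   title: "Title 2" },
  { order: "1",   title: "Title 1" },
  { order: "2.1", title: "Title 2.1" },
];

// Plain string comparison on the title, similar to ordering by a string value.
const lexical = [...rows].sort((x, y) => (x.title < y.title ? -1 : x.title > y.title ? 1 : 0));
// Segment-wise numeric comparison on the explicit ordering value.
const numeric = [...rows].sort((x, y) => compareHierarchical(x.order, y.order));

console.log(lexical.map((r) => r.title));
// → [ 'Title 1', 'Title 10', 'Title 2', 'Title 2.1', 'Title 2.2' ]
console.log(numeric.map((r) => r.title));
// → [ 'Title 1', 'Title 2', 'Title 2.1', 'Title 2.2', 'Title 10' ]
```

If the sorting has to stay inside the SPARQL query, one option is to store the numbering zero-padded (for example "02.01") so that string ordering and numeric ordering agree.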

Upvotes: 2
