Mir4culix
Mir4culix

Reputation: 81

Is there a "DISTINCT ON" equivalent in SPARQL?

My data is basically an event log in RDF. I have cases and events, the latter belong to the former. Events have timestamps and an actor who triggered them.

For each case I now need the latest event, when it happened, and who triggered it.

This is roughly my current query:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex: <http://example.org/>
  
SELECT ?case ?event ?timestamp ?actor
WHERE {
  ?case rdf:type ex:Case ;
        ex:hasEvent ?event .
  ?event ex:timestamp ?timestamp ;
         ex:hasActor ?actor .
  } 
ORDER BY ASC(?case) DESC(?timestamp)

Which yields something like this:

| case     | event     | timestamp                                 | actor      |
=================================================================================
| ex:case1 | ex:event1 | "2020-01-01T02:00:00Z"^^xsd:dateTimeStamp | ex:Alice   |
| ex:case1 | ex:event2 | "2020-01-01T01:00:00Z"^^xsd:dateTimeStamp | ex:Bob     |
| ex:case2 | ex:event3 | "2020-01-01T03:00:00Z"^^xsd:dateTimeStamp | ex:Charlie |
| ex:case2 | ex:event4 | "2020-01-01T02:00:00Z"^^xsd:dateTimeStamp | ex:Dan     |

However I would like to only get the first and third row, as they correspond to the latest events for this case. Like this:

| case     | event     | timestamp                                 | actor      |
=================================================================================
| ex:case1 | ex:event1 | "2020-01-01T02:00:00Z"^^xsd:dateTimeStamp | ex:Alice   |
| ex:case2 | ex:event3 | "2020-01-01T03:00:00Z"^^xsd:dateTimeStamp | ex:Charlie |

In order to achieve this I tried to use SELECT ?case ?event (MAX(?timestamp) AS ?latest) ?actor combined with GROUP BY ?case however SPARQL complains I need to group by ?event and ?actor as well which is not what I want of course.

I am aware that PostgreSQL has DISTINCT ON which would solve my problem, but I need to do it in SPARQL. Is there a nice way to achieve this?

Upvotes: 1

Views: 87

Answers (1)

Mir4culix
Mir4culix

Reputation: 81

Self answer based on @UninformedUser's comment:

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX ex: <http://example.org/> 

SELECT ?case ?event (?latest as ?timestamp) ?actor WHERE { 
   ?case ex:hasEvent ?event . 
   ?event ex:timestamp ?latest ; 
          ex:hasActor?actor . 
   { SELECT ?case (MAX(?timestamp) AS ?latest) {
       ?case rdf:type ex:case ; 
             ex:hasEvent ?event . 
       ?event ex:timestamp ?timestamp } 
   group by ?case } 
} 

Upvotes: 2

Related Questions