rotzaug

Reputation: 65

How to find the next biggest value of a resource with a SPARQL query

I have trouble solving the following problem: "Construct a graph with the apoapsis of each planet together with a reference to the planet that comes next with regard to its distance from the sun."

Here is a dump of part of the graph:

:Saturn  
  skos:exactMatch dbr:Saturn;
  rdf:type dbo:Planet;
  v:orbits :Sun;
  v:apoapsis [rdf:value 9.0412; v:uom unit:AU] ;
  v:orbitalPeriod [rdf:value 29.45; v:uom unit:YR ];
  v:radius [rdf:value 60268; v:uom unit:KM] ;
  v:temperature 
    [rdf:value -139; 
      v:uom unit:Deg_C ];  
.

The graph contains some data about the solar system. Every planet (dbo:Planet) has a v:apoapsis property whose value gives its distance from the sun. I have already figured out how to find all larger values, but I only want the next largest one. My current result looks like this:

:Mars   v:apoapsis    1.666 ;
        v:nextPlanet  :Saturn , :Jupiter , :Uranus .

:Mercury  v:apoapsis  0.467 ;
        v:nextPlanet  :Saturn , :Jupiter , :Uranus , :Mars , :Earth , :Venus .

:Earth  v:apoapsis    1.017 ;
        v:nextPlanet  :Saturn , :Jupiter , :Uranus , :Mars .

:Venus  v:apoapsis    0.728 ;
        v:nextPlanet  :Saturn , :Jupiter , :Uranus , :Mars , :Earth .

:Jupiter  v:apoapsis  5.4588 ;
        v:nextPlanet  :Saturn , :Uranus .

:Saturn  v:apoapsis   9.0412 ;
        v:nextPlanet  :Uranus .

The expected result should look like this:

:Mars v:apoapsis 1.666 ;
 v:nextPlanet :Jupiter .
:Mercury v:apoapsis 0.467 ;
 v:nextPlanet :Venus .
:Uranus v:apoapsis 20.11 ;
 v:nextPlanet :Neptune .

I'm fairly new to SPARQL and stuck on the idea of iterating over elements for this kind of task. A full solution is not necessary; I just want to know how to approach the problem, and I'd be happy about some ideas. Thank you.

My most promising query looks like this:

CONSTRUCT {
  ?planet v:apoapsis ?AUdist ;
          v:nextPlanet ?nextPlanet .
}
WHERE {
  ?planet a dbo:Planet .
  ?planet v:apoapsis ?dist .
  ?dist v:uom unit:AU ;
        rdf:value ?AUdist .

  FILTER(?AUdist > ?AUdist2)
  {
    SELECT ?nextPlanet ?AUdist2
    WHERE {
      ?nextPlanet a dbo:Planet .
      ?nextPlanet v:apoapsis ?dist2 .
      ?dist2 v:uom unit:AU ;
             rdf:value ?AUdist2 .
    }
    ORDER BY ASC(?AUdist2)
  }
}
ORDER BY ASC(?AUdist)

Upvotes: 3

Views: 181

Answers (2)

cygri

Reputation: 9472

The general approach to this sort of query is to:

  1. Get all combinations of value1 and value2
  2. Keep only those combinations where value1 is smaller than value2
  3. Use GROUP BY and MIN to find the smallest value2 for a given value1

You've already done steps 1 and 2. To rewrite your query a bit:

SELECT * {
    ?planet v:apoapsis/rdf:value ?dist.
    ?otherPlanet v:apoapsis/rdf:value ?otherDist.
    FILTER (?dist < ?otherDist)
}

Now in step 3, we want to group by ?planet, and find the smallest ?otherDist within each group:

SELECT ?planet (MIN(?otherDist) AS ?nextDist) {
    ?planet v:apoapsis/rdf:value ?dist.
    ?otherPlanet v:apoapsis/rdf:value ?otherDist.
    FILTER (?dist < ?otherDist)
}
GROUP BY ?planet
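
To try the grouped query without loading any data, the following sketch inlines the apoapsis values quoted in the question as VALUES rows. The planet names are plain strings standing in for the real IRIs, and the two VALUES blocks are an assumption made only to keep the example self-contained:

```sparql
# Self-contained sketch: distances (in AU) are copied from the question;
# the string labels are stand-ins for the planet IRIs.
SELECT ?planet (MIN(?otherDist) AS ?nextDist) {
    VALUES (?planet ?dist) {
        ("Mercury" 0.467) ("Venus" 0.728) ("Earth" 1.017) ("Mars" 1.666)
        ("Jupiter" 5.4588) ("Saturn" 9.0412) ("Uranus" 20.11)
    }
    VALUES (?otherPlanet ?otherDist) {
        ("Mercury" 0.467) ("Venus" 0.728) ("Earth" 1.017) ("Mars" 1.666)
        ("Jupiter" 5.4588) ("Saturn" 9.0412) ("Uranus" 20.11)
    }
    # Keep only pairs where the other planet is farther out
    FILTER (?dist < ?otherDist)
}
GROUP BY ?planet
```

For these rows, Mars should be paired with 5.4588 (Jupiter's apoapsis) and Mercury with 0.728 (Venus's). Uranus, the largest value listed here, has no farther row and so drops out of the result.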

That was the hard part. What remains is to turn the query above into a subquery inside a CONSTRUCT query that finds the ?nextPlanet corresponding to ?nextDist and constructs the target graph.
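
One way to assemble that final step is sketched below. It is untested against the asker's data, since only a fragment of it was posted. The v: namespace IRI is a placeholder, and the sketch assumes that every apoapsis is given in AU and that no two planets share the same value:

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbo: <http://dbpedia.org/ontology/>
# Placeholder: the question does not show the real v: namespace
PREFIX v:   <http://example.org/v#>

CONSTRUCT {
    ?planet v:apoapsis ?dist ;
            v:nextPlanet ?nextPlanet .
}
WHERE {
    # Steps 1-3: smallest apoapsis strictly greater than the planet's own
    {
        SELECT ?planet ?dist (MIN(?otherDist) AS ?nextDist) {
            ?planet a dbo:Planet ;
                    v:apoapsis/rdf:value ?dist .
            ?otherPlanet a dbo:Planet ;
                         v:apoapsis/rdf:value ?otherDist .
            FILTER (?dist < ?otherDist)
        }
        GROUP BY ?planet ?dist
    }
    # Find the planet whose apoapsis equals that minimum
    ?nextPlanet a dbo:Planet ;
                v:apoapsis/rdf:value ?nextDist .
}
```

The outermost planet has no larger value to pair with, so it produces no triples in this sketch.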

Upvotes: 0

UninformedUser

Reputation: 8465

The idea is to get the minimum distance value in a subquery and then get the corresponding planet in the outer query:

CONSTRUCT {
    ?planet v:apoapsis ?dist ;
            v:nextPlanet ?nextPlanet .
} WHERE {
    ?planet v:apoapsis ?dist ;
            v:nextPlanet ?nextPlanet .
    ?nextPlanet v:apoapsis ?nextDist .
    BIND(abs(?dist - ?nextDist) AS ?diff)
    FILTER(?diff = ?minDiff)

    # get each planet and the minimum distance to its next planet
    {
        SELECT ?planet (min(?diff) AS ?minDiff) {
            ?planet v:apoapsis ?dist ;
                    v:nextPlanet/v:apoapsis ?nextDist .
            BIND(abs(?dist - ?nextDist) AS ?diff)
        }
        GROUP BY ?planet
    }
}

Note that this query starts from your intermediate results, in which v:apoapsis is a plain number and v:nextPlanet already lists every farther planet. You didn't share the whole dataset, so I could only test against what you posted.

Upvotes: 2
