user855443

Reputation: 2948

SPARQL group by and order by: not ordered

This follows up on an earlier query that uses the schema.org database to find the number of children of a class; it is a simpler database than the one in my application. I want the names of the children concatenated in alphabetical order. The query:

prefix schema:  <http://schema.org/>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>

select    ?child  (group_concat (?string) as ?strings) 
where {
  ?child  rdfs:subClassOf schema:Event .
     ?grandchild rdfs:subClassOf ?child .
  bind (strafter(str(?grandchild), "http://schema.org/") as ?string)
}   group by ?child  order by asc(?string)
limit 20 

gives

schema:PublicationEvent   "OnDemandEvent BroadcastEvent"
schema:UserInteraction    "UserPageVisits UserComments UserPlays UserBlocks UserDownloads UserPlusOnes UserLikes UserCheckins UserTweets"

This is not alphabetically ordered. If I change the sort order to desc, the result is exactly the same. I do not seem to understand how group by, order by, and possibly bind interact.

Upvotes: 4

Views: 4739

Answers (3)

Finn Årup Nielsen

Reputation: 6736

The asker seems to be using Jena or Fuseki. Some readers who end up here may also be interested in the Wikidata Query Service, Wikidata's SPARQL endpoint, which uses Blazegraph.

In the few examples I tried with the Wikidata Query Service, an ORDER BY in a subquery, as suggested by @user855443, "seems" to do the trick. I am not familiar enough with the internals of Blazegraph to say whether this will be consistent.

Here is an example with a Scholia-like query (https://scholia.toolforge.org/work/Q57267388): https://w.wiki/AFFx

PREFIX target: <http://www.wikidata.org/entity/Q57267388>

SELECT
  (GROUP_CONCAT(?author; separator=", ") AS ?authors)

WITH {

SELECT
  ?order ?author
WHERE {
  {
    target: p:P50 ?author_statement .
    ?author_statement ps:P50 ?author_ .
    ?author_ rdfs:label ?author .
    FILTER (LANG(?author) = 'en')
    OPTIONAL {
      ?author_statement pq:P1545 ?order_ .
      BIND(xsd:integer(?order_) AS ?order)
    }
  }
  UNION
  {
    target: p:P2093 ?authorstring_statement .
    ?authorstring_statement ps:P2093 ?author .
    OPTIONAL {
      ?authorstring_statement pq:P1545 ?order_ .
      BIND(xsd:integer(?order_) AS ?order)
    }
  }
} ORDER BY DESC(?order)
} AS %authors

WHERE {
  INCLUDE %authors
}
GROUP BY ?dummary  # ?dummary is never bound, so all rows fall into a single group

Upvotes: 0

Stanislav Kralin

Reputation: 11479

From the SPARQL 1.1 Query Language specification, section 18.5.1.7 GroupConcat:

The order of the strings is not specified.


From the horse's mouth:

On 2011-04-22, at 19:01, Steve Harris wrote:

On 2011-04-22, at 06:18, Jeen Broekstra wrote:

However, looking at the SPARQL 1.1 query spec, I think this is not a guaranteed result: as far as I can tell the solution modifier ORDER BY should be applied to the solution sequence after grouping and aggregation, so it can not influence the order of the input for the GROUP_CONCAT.

That's correct.
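
To make the evaluation order concrete, here is a small Python model of it. This is only a sketch, not how any SPARQL engine is implemented: the rows, the group_concat helper, and the arrival order of the rows are all made up for illustration. It shows that when sorting happens after grouping and aggregation, it can only reorder the result rows, so asc and desc leave the concatenated strings untouched.

```python
# Hypothetical solution rows (?child, ?string), in whatever order the store returns them.
rows = [
    ("PublicationEvent", "OnDemandEvent"),
    ("PublicationEvent", "BroadcastEvent"),
    ("UserInteraction", "UserPageVisits"),
    ("UserInteraction", "UserComments"),
    ("UserInteraction", "UserBlocks"),
]


def group_concat(solutions, sep=" "):
    """Group by child and join the strings in the order they were encountered."""
    groups = {}
    for child, string in solutions:
        groups.setdefault(child, []).append(string)
    return [(child, sep.join(strings)) for child, strings in groups.items()]


# Step 1: grouping and aggregation run first.
aggregated = group_concat(rows)

# Step 2: the ordering step only sees the aggregated rows, one per group.
ordered_asc = sorted(aggregated, key=lambda row: row[0])
ordered_desc = sorted(aggregated, key=lambda row: row[0], reverse=True)

for child, strings in ordered_asc:
    print(child, strings)
```

In this model the asc and desc results contain the same concatenated strings, just with the rows in opposite order, which matches the behaviour described in the question.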

Upvotes: 2

user855443

Reputation: 2948

An additional select subquery is required to push the order inside the groups:

prefix schema:  <http://schema.org/>
prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#>

select    ?child  (group_concat (?string) as ?strings) 

where {
    select * 
    {
     ?child  rdfs:subClassOf schema:Event .
     ?grandchild rdfs:subClassOf ?child .
     bind (strafter(str(?grandchild), "http://schema.org/") as ?string)
    } order by asc(?string)
}   group by ?child  
limit 20 
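
Since the specification leaves the order of the strings in group_concat unspecified, a fallback that does not depend on the engine is to sort the concatenated value on the client after the query returns. A minimal Python sketch, assuming the separator (a space by default) never occurs inside the individual names, which holds for schema.org local names; sort_concatenated is a hypothetical helper, not part of any SPARQL library:

```python
def sort_concatenated(value, sep=" "):
    """Split a group_concat result on its separator, sort the parts, and rejoin them."""
    parts = [part for part in value.split(sep) if part]
    return sep.join(sorted(parts))  # sorted() compares by code point, so it is case-sensitive


strings = (
    "UserPageVisits UserComments UserPlays UserBlocks UserDownloads "
    "UserPlusOnes UserLikes UserCheckins UserTweets"
)
print(sort_concatenated(strings))
```

If the names may contain spaces, pass a different separator both to group_concat (for example separator="|") and to this helper.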

Upvotes: 4
