Reputation: 3819
I am using the DBpedia SPARQL endpoint and trying to retrieve a list of persons with their details.
SPARQL Query (Not working):
SELECT DISTINCT ?dbpedia_link ?freebase_link str(?abstract) as ?abstract str(?activeYearsStartYear) as ?activeYearsStartYear str(?alias) as ?alias
str(?birthDate) as ?birthDate str(?birthName) as ?birthName str(?birthPlace) as ?birthPlace str(?children) as ?children
str(?label) as ?label str(?occupation) as ?occupation str(?otherNames) as ?otherNames str(?residence) as ?residence
str(?shortDescription) as ?shortDescription str(?spouse) as ?spouse str(?description) as ?description str(?subject) as ?subject
str(?comment) as ?comment str(?almaMater) as ?almaMater str(?award) as ?award str(?education) as ?education str(?knownFor) as ?knownFor
str(?networth) as ?networth str(?parents) as ?parents str(?salary) as ?salary str(?viafId) as ?viafId str(?wikiPageID) as ?wikiPageID
str(?wikiPageRevisionID) as ?wikiPageRevisionID WHERE {
{
?dbpedia_link rdf:type dbpedia-owl:Person
}
OPTIONAL {?dbpedia_link dbpedia-owl:abstract ?abstract. }
OPTIONAL {?dbpedia_link dbpedia-owl:activeYearsStartYear ?activeYearsStartYear .}
OPTIONAL {?dbpedia_link dbpedia-owl:alias ?alias .}
OPTIONAL {?dbpedia_link dbpprop:birthDate ?birthDate .}
OPTIONAL {?dbpedia_link dbpprop:birthName ?birthName .}
OPTIONAL {?dbpedia_link dbpprop:birthPlace ?birthPlace .}
OPTIONAL {?dbpedia_link dbpprop:children ?children .}
OPTIONAL {?dbpedia_link rdfs:label ?label .}
OPTIONAL {?dbpedia_link dbpprop:occupation ?occupation .}
OPTIONAL {?dbpedia_link dbpprop:otherNames ?otherNames .}
OPTIONAL {?dbpedia_link dbpprop:residence ?residence .}
OPTIONAL {?dbpedia_link dbpprop:shortDescription ?shortDescription .}
OPTIONAL {?dbpedia_link dbpprop:spouse ?spouse .}
OPTIONAL {?dbpedia_link dc:description ?description .}
OPTIONAL {?dbpedia_link dcterms:subject ?subject .}
OPTIONAL {?dbpedia_link rdfs:comment ?comment .}
OPTIONAL {?dbpedia_link dbpprop:almaMater ?almaMater .}
OPTIONAL {?dbpedia_link dbpprop:awards ?award .}
OPTIONAL {?dbpedia_link dbpprop:education ?education .}
OPTIONAL {?dbpedia_link dbpprop:knownFor ?knownFor .}
OPTIONAL {?dbpedia_link dbpprop:networth ?networth .}
OPTIONAL {?dbpedia_link dbpprop:parents ?parents .}
OPTIONAL {?dbpedia_link dbpprop:salary ?salary .}
OPTIONAL {?dbpedia_link dbpedia-owl:viafId ?viafId .}
OPTIONAL {?dbpedia_link dbpedia-owl:wikiPageID ?wikiPageID .}
OPTIONAL {?dbpedia_link dbpedia-owl:wikiPageRevisionID ?wikiPageRevisionID .}
OPTIONAL {?dbpedia_link owl:sameAs ?freebase_link
FILTER regex(?freebase_link, "^http://rdf.freebase.com") .}
OPTIONAL {?dbpedia_link dcterms:subject ?sub .}
}LIMIT 2 Offset 5
I have set the limit to 2 and the offset to 5, but the query still fails with a timeout error, and I don't know why.
When I remove about half of the fields and their OPTIONAL clauses from the query, it returns results and works fine.
SPARQL query (working):
SELECT DISTINCT ?dbpedia_link str(?abstract) as ?abstract str(?activeYearsStartYear) as ?activeYearsStartYear str(?alias) as ?alias
str(?birthDate) as ?birthDate str(?birthName) as ?birthName str(?birthPlace) as ?birthPlace str(?children) as ?children
str(?label) as ?label str(?occupation) as ?occupation str(?otherNames) as ?otherNames str(?residence) as ?residence
WHERE {
{
?dbpedia_link rdf:type dbpedia-owl:Person
}
OPTIONAL {?dbpedia_link dbpedia-owl:abstract ?abstract. }
OPTIONAL {?dbpedia_link dbpedia-owl:activeYearsStartYear ?activeYearsStartYear .}
OPTIONAL {?dbpedia_link dbpedia-owl:alias ?alias .}
OPTIONAL {?dbpedia_link dbpprop:birthDate ?birthDate .}
OPTIONAL {?dbpedia_link dbpprop:birthName ?birthName .}
OPTIONAL {?dbpedia_link dbpprop:birthPlace ?birthPlace .}
OPTIONAL {?dbpedia_link dbpprop:children ?children .}
OPTIONAL {?dbpedia_link rdfs:label ?label .}
OPTIONAL {?dbpedia_link dbpprop:occupation ?occupation .}
OPTIONAL {?dbpedia_link dbpprop:otherNames ?otherNames .}
OPTIONAL {?dbpedia_link dbpprop:residence ?residence .}
}LIMIT 2 offset 5
I don't understand why it does not work with all the fields.
Is there a limit on the number of fields in a DBpedia SPARQL query?
Upvotes: 1
Views: 224
Reputation: 85813
With that many variables, all of them optional, you are going to need to post-process the results anyway. So I'd suggest asking just for persons, and for any property that's in that list of properties, using values. E.g.:
select distinct ?s ?p ?o {
values ?p { dbpedia-owl:abstract
dbpedia-owl:activeYearsStartYear
dbpedia-owl:alias
dbpprop:birthDate
dbpprop:birthName
dbpprop:birthPlace
dbpprop:children
rdfs:label
dbpprop:occupation
dbpprop:otherNames
dbpprop:residence }
?s a dbpedia-owl:Person ; ?p ?o .
}
order by ?s ?p
limit 100
offset 50
That returns a lot more rows, since there is one per property value, but it doesn't time out. Ordering by ?s and then by ?p groups the rows by person, with the properties in a predictable order, so post-processing shouldn't be all that hard. In fact, you could even use optional here, so that every person gets at least one row per property (exactly one when the property has at most one value), which would make it very easy (but I haven't tested this):
select ?s ?p ?o {
values ?p { #-- ...
}
?s a dbpedia-owl:Person .
optional { ?s ?p ?o }
}
order by ?s ?p
Upvotes: 1
Reputation: 3428
It's both a limitation and a feature.
If you run your first query on http://dbpedia.org/sparql and read the reply it should say
Virtuoso 42000 Error The estimated execution time 4626142 (sec) exceeds the limit of 240 (sec).
This essentially tells you that your query is pretty complex. The query planner estimated that it would need 4626142 seconds (~54 days) to run it. Because DBpedia is a free, best-effort service, the endpoint refuses such queries so that it can keep providing a good service to as many people as possible.
As you noticed, your query gets a lot less complicated with fewer OPTIONAL clauses. You might be unaware that you're asking for a cross join (Cartesian product) of all the matching values for the variables in all the OPTIONAL clauses: a person with several values for several properties produces one row per combination. There are far fewer value combinations if you bind fewer variables.
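To see the multiplication on a toy case, here is a minimal sketch that needs no DBpedia data at all. The example.org IRI and the literal values are made up for illustration, and I haven't run it against the DBpedia endpoint; it only relies on standard SPARQL 1.1 VALUES and OPTIONAL.

```sparql
# Hypothetical inline data: one person with 2 occupations and 3 subjects.
SELECT ?person ?occupation ?subject
WHERE {
  VALUES ?person { <http://example.org/alice> }
  # First OPTIONAL: two matching values for the same person.
  OPTIONAL {
    VALUES (?person ?occupation) {
      (<http://example.org/alice> "writer")
      (<http://example.org/alice> "actor")
    }
  }
  # Second OPTIONAL: three matching values for the same person.
  OPTIONAL {
    VALUES (?person ?subject) {
      (<http://example.org/alice> "Category A")
      (<http://example.org/alice> "Category B")
      (<http://example.org/alice> "Category C")
    }
  }
}
```

Under the SPARQL 1.1 semantics this returns 2 x 3 = 6 rows for a single person, one per combination of occupation and subject. Every additional OPTIONAL on a multi-valued property multiplies the row count again.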
If you're just interested in one value per variable you might want to have a look at the SAMPLE keyword.
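As a rough sketch of what that could look like (untested against the live endpoint, so I can't promise it stays under the time limit), the query below groups by person and takes one arbitrary value per property. It uses only three of the properties from the question; the ?any... output names are my own, and the PREFIX lines just spell out the namespaces that the DBpedia endpoint normally predefines.

```sparql
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbpprop: <http://dbpedia.org/property/>

# One result row per person: SAMPLE returns an arbitrary value from each
# group, so multi-valued properties no longer multiply the result rows.
SELECT ?person
       (SAMPLE(?birthName) AS ?anyBirthName)
       (SAMPLE(?occupation) AS ?anyOccupation)
       (SAMPLE(?residence) AS ?anyResidence)
WHERE {
  ?person a dbpedia-owl:Person .
  OPTIONAL { ?person dbpprop:birthName ?birthName }
  OPTIONAL { ?person dbpprop:occupation ?occupation }
  OPTIONAL { ?person dbpprop:residence ?residence }
}
GROUP BY ?person
LIMIT 2
OFFSET 5
```

Note that SAMPLE makes no promise about which value you get, so use it only when any one value per property is good enough.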
Upvotes: 2