Hani Goc

Reputation: 2441

return full wikipedia page for a query using dbpedia

I am using the following code to retrieve disambiguation pages for a given query.

from SPARQLWrapper import JSON  # result-format constant used below

# disambiguation function
def disambiguation(name, sparql):
  # Find everything listed on a disambiguation page that mentions `name`,
  # plus everything `name` itself disambiguates if it is such a page.
  query = "SELECT DISTINCT ?syn WHERE { { ?disPage dbpedia-owl:wikiPageDisambiguates <http://dbpedia.org/resource/"+name+"> . ?disPage dbpedia-owl:wikiPageDisambiguates ?syn . }  UNION {<http://dbpedia.org/resource/"+name+"> dbpedia-owl:wikiPageDisambiguates ?syn . } }"
  sparql.setQuery(query)
  sparql.setReturnFormat(JSON)
  results_list = sparql.query().convert()
  return results_list

Question:

Is it possible to return the full Wikipedia page for every element in the results_list?

Upvotes: 1

Views: 881

Answers (1)

Joshua Taylor

Reputation: 85913

Simplifying your query

SELECT DISTINCT ?syn WHERE {
  { ?disPage dbpedia-owl:wikiPageDisambiguates <http://dbpedia.org/resource/"+name+"> .
    ?disPage dbpedia-owl:wikiPageDisambiguates ?syn . }
  UNION
  { <http://dbpedia.org/resource/"+name+"> dbpedia-owl:wikiPageDisambiguates ?syn . }
}

This query can be more cleanly written as

select distinct ?syn where {
  ?syn (dbpedia-owl:wikiPageDisambiguates|^dbpedia-owl:wikiPageDisambiguates)* dbpedia:name
}

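This query says to find everything that's connected to dbpedia:name by a path of zero or more dbpedia-owl:wikiPageDisambiguates properties, followed in either direction. Because the path may have any length, it can match more than the one- and two-step cases in your original query, and it also matches dbpedia:name itself.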
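As a rough sketch of how the property-path query could be assembled in Python before being handed to sparql.setQuery, the helper below builds the query string for a given resource name. The function name build_synonym_query is hypothetical, the explicit PREFIX line assumes the endpoint may not predefine dbpedia-owl, and the full resource IRI is used instead of dbpedia:name because names containing characters such as parentheses are awkward to write as prefixed names.

```python
from urllib.parse import quote

# Declared explicitly in case the endpoint does not predefine the prefix.
PREFIXES = "PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>\n"


def build_synonym_query(name):
    """Build the property-path query for one DBpedia resource name.

    `build_synonym_query` is a hypothetical helper, not part of any library.
    """
    # Percent-encode characters that may not appear inside <...> in SPARQL,
    # while keeping punctuation that DBpedia resource names commonly contain.
    resource = "http://dbpedia.org/resource/" + quote(name, safe="_(),'-.:")
    # Follow wikiPageDisambiguates forwards (p) or backwards (^p), zero or more times.
    path = "(dbpedia-owl:wikiPageDisambiguates|^dbpedia-owl:wikiPageDisambiguates)*"
    return (
        PREFIXES
        + "SELECT DISTINCT ?syn WHERE {\n"
        + "  ?syn " + path + " <" + resource + ">\n"
        + "}"
    )
```

Inside your disambiguation function, the string concatenation could then be replaced with sparql.setQuery(build_synonym_query(name)).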

Getting the Wikipedia article URL

I actually wanted to retrieve the whole Wikipedia page. For example, when I find a name in a different language, I want to go to the corresponding Wikipedia page and retrieve it.

If you actually want to retrieve the page (using some other library, or whatever you have), then you just need to get the Wikipedia article URL. That's the value of the foaf:isPrimaryTopicOf property. E.g., if you look at property values for Johnny Cash, you'll see

http://dbpedia.org/resource/Johnny_Cash foaf:isPrimaryTopicOf http://en.wikipedia.org/wiki/Johnny_Cash

Based on that, it sounds like you'd want a query more like:

select distinct ?page where {
  ?syn (dbpedia-owl:wikiPageDisambiguates|^dbpedia-owl:wikiPageDisambiguates)* dbpedia:name ;
       foaf:isPrimaryTopicOf ?page
}

Then each value of ?page should be a Wikipedia article URL that you can download.
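For illustration, here is a minimal sketch of pulling the ?page values out of the converted JSON result and downloading one of them. It assumes the result follows the standard SPARQL JSON results layout (results, then bindings, then one value per variable). The names article_urls and fetch_article and the User-Agent string are hypothetical, and the sample dictionary is hand-written so the extraction step can be tried without a network connection.

```python
import urllib.request


def article_urls(results):
    """Collect the ?page values from a SPARQL JSON result document."""
    bindings = results.get("results", {}).get("bindings", [])
    # Each binding maps a variable name to a dict holding its type and value.
    return [row["page"]["value"] for row in bindings if "page" in row]


def fetch_article(url, timeout=10):
    """Download one article and return the raw response body as bytes."""
    # The User-Agent value is a placeholder; use one that identifies your script.
    request = urllib.request.Request(url, headers={"User-Agent": "example-script/0.1"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


# Hand-written sample in the SPARQL JSON results shape (not real endpoint output).
sample = {
    "head": {"vars": ["page"]},
    "results": {
        "bindings": [
            {"page": {"type": "uri", "value": "http://en.wikipedia.org/wiki/Johnny_Cash"}}
        ]
    },
}

urls = article_urls(sample)
```

With real results you would pass the value returned by sparql.query().convert() to article_urls and then call fetch_article on each URL, ideally with a short pause between requests.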

Upvotes: 2
