Griff

Reputation: 2124

How to pull out a list of all hyperlinked people on a person's Wikipedia page using SPARQL and DBpedia

I want to pull out a list of all the "persons" which have a link to another person on Wikipedia.

For instance, George H. W. Bush has this sentence in his bio:

"Bush was born in Milton, Massachusetts, to Senator Prescott Bush and Dorothy Walker Bush."

Now Dorothy Bush is hyperlinked to her own page. Can I get a list which looks like:

George H. W. Bush | Dorothy Walker Bush
George H. W. Bush | Babe Ruth
George H. W. Bush | Bill Clinton

And can this be extended to everyone on Wikipedia? I'll obviously have to break this down into bite-sized chunks for it to output, but I am just not sure how to code this to select for linked persons only. Thanks

Upvotes: 1

Views: 153

Answers (1)

Thomas

Reputation: 2193

One way to start would simply be to search for connected resources that are both of type Person. You can use DBpedia's web-based query form.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?person1 ?p ?person2
WHERE { 
   ?person1 ?p ?person2. 
   ?person1 a foaf:Person. 
   ?person2 a foaf:Person.
}
ORDER BY ?person1
LIMIT 10
OFFSET 0

You can "split this data into chunks" by using the ORDER BY keyword and iterating over the value after OFFSET (e.g. 10, 20, 30, ...). You should save the results of these separate queries and then combine them afterwards to get the full result.
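The paging loop described above could be sketched in Python roughly as follows. This is only an illustration under a few assumptions: the endpoint URL https://dbpedia.org/sparql, the helper names (build_query, fetch_page, collect_all), and the stop-on-empty-page rule are all hypothetical choices, not part of the answer itself.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public DBpedia endpoint; replace it if you use a mirror.
ENDPOINT = "https://dbpedia.org/sparql"

# The first query from above, with LIMIT and OFFSET left as placeholders.
# Doubled braces are literal braces for str.format.
QUERY_TEMPLATE = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?person1 ?p ?person2
WHERE {{
   ?person1 ?p ?person2 .
   ?person1 a foaf:Person .
   ?person2 a foaf:Person .
}}
ORDER BY ?person1
LIMIT {limit}
OFFSET {offset}
"""


def build_query(offset, limit=10):
    """Return the query text for one chunk of results."""
    return QUERY_TEMPLATE.format(limit=limit, offset=offset)


def fetch_page(query, endpoint=ENDPOINT):
    """Send one query via HTTP GET and return its list of result bindings."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        data = json.load(response)
    return data["results"]["bindings"]


def collect_all(fetch=fetch_page, limit=10, max_pages=None):
    """Step OFFSET by `limit` until an empty page comes back, merging all rows."""
    rows = []
    page = 0
    while max_pages is None or page < max_pages:
        bindings = fetch(build_query(page * limit, limit))
        if not bindings:
            break  # no more results
        rows.extend(bindings)
        page += 1
    return rows
```

Calling collect_all(max_pages=3) would fetch at most three chunks of ten rows each; a larger limit means fewer round trips, within whatever result size the endpoint allows.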

If you are only looking for a particular kind of relationship between persons on DBpedia, the following query will give you all the properties used to connect two persons.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT DISTINCT ?p
WHERE { 
   ?person1 ?p ?person2. 
   ?person1 a foaf:Person. 
   ?person2 a foaf:Person.
}

Choose one or several of those properties, e.g. http://dbpedia.org/property/married, and get a list of persons related by this property using the following query.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?person1 ?person2
WHERE { 
   ?person1 <http://dbpedia.org/property/married> ?person2. 
   ?person1 a foaf:Person. 
   ?person2 a foaf:Person.
}
ORDER BY ?person1
LIMIT 10
OFFSET 0

As you will see for yourself, property usage on DBpedia is quite heterogeneous, so it might take some effort to get what you want.

Hope this helps as a starting point.

Upvotes: 2
