Reputation: 1731
For @en text alone, a single item from the Wikidata dump contains multiple names:
<http://www.wikidata.org/entity/Q26> <http://www.w3.org/2000/01/rdf-schema#label> "Northern Ireland"@en .
<http://www.wikidata.org/entity/Q26> <http://www.w3.org/2004/02/skos/core#prefLabel> "Northern Ireland"@en .
<http://www.wikidata.org/entity/Q26> <http://schema.org/name> "Northern Ireland"@en .
On the Wikidata page for this item (http://www.wikidata.org/entity/Q26), which of these (if any) corresponds to the canonical title used on the associated English Wikipedia page?
Upvotes: 1
Views: 725
Reputation: 86
Grab the triple in which the predicate is schema:isPartOf and the object is the Wikipedia you want (for example, https://en.wikipedia.org/). The subject of that triple is the URL of the corresponding Wikipedia page.
Here's an example using Python's rdflib:
>>> import rdflib
>>> g = rdflib.Graph()
>>> r = g.parse("https://www.wikidata.org/entity/Q26.nt")
>>> for s, p, o in g:
...     if p == rdflib.URIRef('http://schema.org/isPartOf') and o == rdflib.URIRef('https://en.wikipedia.org/'):
...         print(s)
...
https://en.wikipedia.org/wiki/Northern_Ireland
You can adjust this approach according to whatever parser you're using, of course.
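If you are working from the full dump rather than a single entity file, loading everything into one graph may not be practical. Here is a rough sketch of the same schema:isPartOf filter applied line by line, using only the standard library. It assumes one triple per line in N-Triples form and is not a real N-Triples parser. The names sitelink_pages, title_from_url and SAMPLE are made up for illustration, and the sample triples are modelled on the output above rather than copied from the dump.

```python
from urllib.parse import unquote

IS_PART_OF = "<http://schema.org/isPartOf>"

# Hypothetical sample lines, modelled on the triples discussed above.
SAMPLE = """\
<http://www.wikidata.org/entity/Q26> <http://www.w3.org/2000/01/rdf-schema#label> "Northern Ireland"@en .
<https://en.wikipedia.org/wiki/Northern_Ireland> <http://schema.org/about> <http://www.wikidata.org/entity/Q26> .
<https://en.wikipedia.org/wiki/Northern_Ireland> <http://schema.org/isPartOf> <https://en.wikipedia.org/> .
<https://de.wikipedia.org/wiki/Nordirland> <http://schema.org/isPartOf> <https://de.wikipedia.org/> .
"""


def sitelink_pages(lines, site="https://en.wikipedia.org/"):
    """Yield subjects whose schema:isPartOf object is the given site."""
    target = f"<{site}>"
    for line in lines:
        # Split into subject, predicate, and the rest (object plus final dot).
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        subject, predicate, rest = parts
        obj = rest.rstrip().rstrip(".").rstrip()
        if predicate == IS_PART_OF and obj == target:
            yield subject[1:-1]  # drop the surrounding angle brackets


def title_from_url(url):
    """Turn the last path segment of a page URL into a readable title."""
    return unquote(url.rsplit("/", 1)[-1]).replace("_", " ")


pages = list(sitelink_pages(SAMPLE.splitlines()))
print(pages)
print([title_from_url(p) for p in pages])
```

For the real dump you would pass an open file handle (or a decompressing reader) as lines instead of SAMPLE.splitlines(), since the function only needs an iterable of strings.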
Upvotes: 1