Alex

Reputation: 189

How to delete duplicate blank nodes

I'm trying to remove duplicate resources from a dataset, but am running into issues as the resources are blank nodes, and are not truly identical.

The data in question:

<http://faculty.washington.edu/tgis/ld/brumfield/uwDataset/places#NaroFominskiiraionMoskovskaia>
        a                        vra:AdministrativeArea ;
        rdfs:label               "Naro-Fominskii raion" ;
        uwext:typeOfAdminArea    "Raion" ;
        schema:containedInPlace  <http://faculty.washington.edu/tgis/ld/brumfield/uwDataset/places#MoskovskaiaoblastMoskovskaia> , <http://faculty.washington.edu/tgis/ld/brumfield/uwDataset/places#RussiaFederation> ;
        ns1:sameAs               <http://dbpedia.org/resource/Naro-Fominsky_District> ;
        schema:geo
                [ a                 schema:GeoCoord ;
                  schema:latitude   "53.3793416" ;
                  schema:longitude  "58.9708374"
                ] ,
                [ a                 schema:GeoCoord ;
                  schema:latitude   "53.3793416" ;
                  schema:longitude  "58.9708374"
                ] .

What I've tried:

delete {
    ?q a schema:GeoCoord .
    ?q schema:latitude ?lat .
    ?q schema:longitude ?long .
}
where {
    ?s a schema:GeoCoord .
    ?s schema:latitude ?lat .
    ?s schema:longitude ?long .
    ?q a schema:GeoCoord .
    ?q schema:latitude ?lat .
    ?q schema:longitude ?long .
    filter(?q != ?s)
}

This deletes both schema:GeoCoord resources though. How can I remove the duplicate resource?

Upvotes: 1

Views: 175

Answers (1)

Jeen Broekstra

Reputation: 22053

There's a trick for this. Use

 filter(str(?q) < str(?s))

instead of

 filter(?q != ?s) 

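The reason this works is that if you compare using !=, you get two matches, because each bnode is unequal to the other: one match with ?q bound to the first bnode and one with ?q bound to the second, so both get deleted. With <, only one bnode id is smaller than the other, so you get a single match and only that bnode is deleted.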
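
For reference, here is a sketch of what the full update might look like with that filter in place. It rests on a few assumptions that are not in the question: the schema: prefix is guessed to be <http://schema.org/>, and the ?place variable is added so that only duplicates attached to the same resource are compared and the schema:geo link to the removed bnode is deleted too (otherwise that triple would be left behind). Note also that the SPARQL 1.1 specification defines str() only for literals and IRIs, so applying it to blank nodes depends on your triple store accepting it.

```sparql
# Assumed prefix; use whatever your dataset actually declares.
PREFIX schema: <http://schema.org/>

DELETE {
    # Remove the link to the duplicate as well as its own triples.
    ?place schema:geo ?q .
    ?q a schema:GeoCoord ;
       schema:latitude ?lat ;
       schema:longitude ?long .
}
WHERE {
    # Two coordinate nodes on the same resource with identical values.
    ?place schema:geo ?s , ?q .
    ?s a schema:GeoCoord ;
       schema:latitude ?lat ;
       schema:longitude ?long .
    ?q a schema:GeoCoord ;
       schema:latitude ?lat ;
       schema:longitude ?long .
    # Strict ordering: each unordered pair matches once, never both ways.
    FILTER (str(?q) < str(?s))
}
```

If there are more than two identical nodes, every node that has a "larger" twin matches as ?q, so all but the one with the largest id are removed.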

Upvotes: 3
