Reputation: 3096
A colleague and I are independently instantiating electronic health records as RDF triples. We'd like to compare our sets of 10k to 100k triples to see if they have the same shapes.
As a policy, I create URIs based on UUIDs, so nothing semantic is embedded in them. I'd like to stick with this policy, as my colleague and I are really trying to holistically compare existing workflows.
I know how to compare two RDF files in TopBraid Composer, but I don't think it will be useful if we have the same data patterns but different URIs. I store my triples in Ontotext GraphDB but am glad to use any other tool.
For example, the triples about person ...fe54977c174a and person ...4bcdc1c8abf9 should be considered equivalent, but ...fe54977c174a and ...ae00dc86b3bb should not. Is this feasible?
I would prefer not to spot-check with hand-crafted SPARQL ASK statements.
    @prefix ns0: <http://example.com/> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    <http://example.com/4f79ea05-2358-4f43-a335-fe54977c174a>
      a <http://example.com/Person> ;
      ns0:gender ns0:Male ;
      ns0:participatesIn ns0:5d2dfc7b-994c-4933-b787-f7971dae397c .

    ns0:5d2dfc7b-994c-4933-b787-f7971dae397c
      a ns0:HealthCareEncounter ;
      ns0:startDate "2019-05-01"^^xsd:date ;
      ns0:hasOutput ns0:a129ca96-c6d2-4a07-a4eb-4cf9ce23a314 .

    ns0:a129ca96-c6d2-4a07-a4eb-4cf9ce23a314
      a ns0:Diagnosis ;
      ns0:mentions ns0:Headache .

has the same shape as this (despite the different URIs):

    @prefix ns0: <http://example.com/> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    <http://example.com/a740d254-084c-4621-b06d-4bcdc1c8abf9>
      a <http://example.com/Person> ;
      ns0:gender ns0:Male ;
      ns0:participatesIn ns0:060d2091-b4f7-406d-ab0d-75b39b400823 .

    ns0:060d2091-b4f7-406d-ab0d-75b39b400823
      a ns0:HealthCareEncounter ;
      ns0:startDate "2019-05-01"^^xsd:date ;
      ns0:hasOutput ns0:bc549711-ed9d-4db6-8cf9-d43022903ef7 .

    ns0:bc549711-ed9d-4db6-8cf9-d43022903ef7
      a ns0:Diagnosis ;
      ns0:mentions ns0:Headache .

but this is structurally different (due to the different gender and diagnosis mention):

    @prefix ns0: <http://example.com/> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

    <http://example.com/aa3a977a-999a-4c5c-9524-ae00dc86b3bb>
      a <http://example.com/Person> ;
      ns0:gender ns0:Female ;
      ns0:participatesIn ns0:b31a62a5-337a-454d-a637-85aefef26684 .

    ns0:b31a62a5-337a-454d-a637-85aefef26684
      a ns0:HealthCareEncounter ;
      ns0:startDate "2019-05-01"^^xsd:date ;
      ns0:hasOutput ns0:6566d543-773e-4649-b589-66eb3d0f3165 .

    ns0:6566d543-773e-4649-b589-66eb3d0f3165
      a ns0:Diagnosis ;
      ns0:mentions ns0:Nausea .

Upvotes: 0
Views: 111
Reputation: 22052
Eclipse RDF4J (bundled with GraphDB) contains a graph isomorphism utility: Models.isomorphic. By default it only maps blank nodes to blank nodes, so two graphs whose URIs differ will not be reported as isomorphic. So you have two options:
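To make the blank-node point concrete, here is a minimal, stdlib-only Python sketch of the general idea: treat every UUID-based URI as if it were anonymous, then compare the graphs by structure alone. It does not use RDF4J or its API. The helper names (`is_anonymous`, `shape_signature`, `record`) and the rule "a term ending in a UUID is opaque" are assumptions made for illustration. The comparison uses iterative signature refinement, a heuristic that can report a false match on highly symmetric graphs, so it is not a full isomorphism test.

```python
import hashlib
import re

# Assumption: any term whose text ends in a UUID is an opaque identifier.
UUID_RE = re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$")
EX = "http://example.com/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def is_anonymous(term):
    """True if the term should be treated like a blank node."""
    return bool(UUID_RE.search(term))


def _digest(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


def shape_signature(triples, rounds=4):
    """Replace UUID terms with hashes of their neighbourhood and return
    the sorted triples, so that graphs differing only in UUIDs compare equal."""
    nodes = {t for s, _, o in triples for t in (s, o) if is_anonymous(t)}
    color = {n: "anon" for n in nodes}
    for _ in range(rounds):
        refined = {}
        for n in nodes:
            edges = []
            for s, p, o in triples:
                if s == n:
                    edges.append(("out", p, color.get(o, o)))
                if o == n:
                    edges.append(("in", p, color.get(s, s)))
            # New color = old color plus the sorted, colored neighbourhood.
            refined[n] = _digest(color[n] + repr(sorted(edges)))
        color = refined
    return tuple(sorted((color.get(s, s), p, color.get(o, o)) for s, p, o in triples))


def record(person, encounter, diagnosis, gender, mention):
    """Hypothetical helper that rebuilds one example record from the question."""
    return [
        (EX + person, RDF_TYPE, EX + "Person"),
        (EX + person, EX + "gender", EX + gender),
        (EX + person, EX + "participatesIn", EX + encounter),
        (EX + encounter, RDF_TYPE, EX + "HealthCareEncounter"),
        (EX + encounter, EX + "startDate", '"2019-05-01"^^xsd:date'),
        (EX + encounter, EX + "hasOutput", EX + diagnosis),
        (EX + diagnosis, RDF_TYPE, EX + "Diagnosis"),
        (EX + diagnosis, EX + "mentions", EX + mention),
    ]


graph_a = record("4f79ea05-2358-4f43-a335-fe54977c174a",
                 "5d2dfc7b-994c-4933-b787-f7971dae397c",
                 "a129ca96-c6d2-4a07-a4eb-4cf9ce23a314", "Male", "Headache")
graph_b = record("a740d254-084c-4621-b06d-4bcdc1c8abf9",
                 "060d2091-b4f7-406d-ab0d-75b39b400823",
                 "bc549711-ed9d-4db6-8cf9-d43022903ef7", "Male", "Headache")
graph_c = record("aa3a977a-999a-4c5c-9524-ae00dc86b3bb",
                 "b31a62a5-337a-454d-a637-85aefef26684",
                 "6566d543-773e-4649-b589-66eb3d0f3165", "Female", "Nausea")

same_ab = shape_signature(graph_a) == shape_signature(graph_b)
same_ac = shape_signature(graph_a) == shape_signature(graph_c)
print(same_ab, same_ac)
```

On the three records from the question, the first two produce the same signature and the third does not. For real data sets of 10k to 100k triples, a proper isomorphism utility such as the one named above is the safer choice; this sketch only shows why the UUID-based URIs have to be anonymized first.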
Upvotes: 1