Reputation: 9
I'm working with RDF data that was loaded into a PostgreSQL database using the RDFLib-SQLAlchemy library. While querying the asserted_statements table with SQL, I noticed that some subjects and objects have IDs that start with the letter "N".
Here's a snippet of my SQL query and the results:
SELECT a.subject, a.predicate, a.object, b.subject, b.predicate, b.object
FROM public.kb_d5c47fc464_asserted_statements a
JOIN public.kb_d5c47fc464_asserted_statements b
ON a.object = b.subject
WHERE a.object LIKE 'N%' AND b.subject LIKE 'N%' AND a.subject LIKE 'http%'
ORDER BY a.id ASC;
Sample Data Output:
subject | predicate | object |
---|---|---|
http://purl.obolibrary.org/obo/BFO_0000062 | http://www.w3.org/2002/07/owl#propertyChainAxiom | N160ea22f83814f728990ceaafb6fbc43 |
http://purl.obolibrary.org/obo/BFO_0000062 | http://www.w3.org/2002/07/owl#propertyChainAxiom | N1cb51000d673480fb7bff6975709ab97 |
I'm curious about the significance of these IDs starting with N. Are they generated by RDFLib or related to blank nodes? What role do they play in the RDF structure, and how should I interpret them?
Any insights into why these IDs are being used and their purpose would be helpful.
Upvotes: 0
Views: 52
Reputation: 1
Those nodes starting with the letter N are just blank nodes. Think of them as existential variables: you can't reuse their IDs in further SPARQL queries. You can, however, use them inside rdflib, e.g.:
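import rdflib
# graph: rdflib.Graph (assumed to be loaded already)
bnode_from_query = rdflib.BNode("N1234...")
# all triples that have the blank node as their subject
triples_using_bnode_as_subject = graph.triples((bnode_from_query, None, None))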
They are helpful when you need to compare two graphs that contain nodes without a name; see e.g. rdflib.compare.
Upvotes: 0
Reputation: 49321
The rest of the id looks like a UUID (version 4, variant 1) encoded as hex without dashes.
Looking through the rdflib source for UUID leads to BNode.__new__, which contains:
value = _prefix + f"{node_id}"
where value becomes the node id and _prefix defaults to _unique_id().
What is this unique id?
It is the letter 'N'!
def _unique_id() -> str:
    # Used to read: """Create a (hopefully) unique prefix"""
    # now retained merely to leave internal API unchanged.
    # From BNode.__new__() below ...
    #
    # acceptable bnode value range for RDF/XML needs to be
    # something that can be serialzed as a nodeID for N3
    #
    # BNode identifiers must be valid NCNames" _:[A-Za-z][A-Za-z0-9]*
    # http://www.w3.org/TR/2004/REC-rdf-testcases-20040210/#nodeID
    return "N"  # ensure that id starts with a letter
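The two observations in this answer (an "N" prefix followed by a hex-encoded UUID) can be checked with the standard library alone. The sketch below is only an illustration: `parse_bnode_id` is a hypothetical helper name, not part of rdflib, and it assumes the ID has exactly the shape seen in the question.

```python
import uuid

def parse_bnode_id(term: str):
    """Return the UUID embedded in an 'N' + 32-hex-digit ID, or None."""
    if len(term) != 33 or not term.startswith("N"):
        return None
    try:
        return uuid.UUID(hex=term[1:])
    except ValueError:
        # The part after "N" was not valid hex.
        return None

parsed = parse_bnode_id("N160ea22f83814f728990ceaafb6fbc43")
print(parsed, parsed.version, parsed.variant)

# An ordinary IRI from the same table does not match the pattern.
print(parse_bnode_id("http://purl.obolibrary.org/obo/BFO_0000062"))
```

A check like this could be used to separate blank node IDs from IRIs when reading rows out of the asserted_statements table, although a string that merely happens to match the pattern would be misclassified.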
Upvotes: 0