Reputation: 325
I have a Neo4j database populated with thousands of nodes but no relationships. I also have a file that lists the relationships between those nodes, and I would like to create them in the database. My current approach is:
from py2neo import NodeSelector, Graph, Node, Relationship
graph = Graph('http://127.0.0.1:7474/db/data')
tx = graph.begin()
selector = NodeSelector(graph)
with open("file", "r") as relations:
    for line in relations:
        line_split = line.split(";")
        node1 = selector.select("Node", unique_name=line_split[0]).first()
        node2 = selector.select("Node", unique_name=line_split[1]).first()
        rs = Relationship(node1, "Relates to", node2)
        tx.create(rs)
tx.commit()
This approach needs two database queries per line to fetch the nodes, in addition to creating the relationship itself. Is there a more efficient way, given that the nodes already exist in the database?
Upvotes: 2
Views: 480
Reputation: 5682
You can use some form of node caching while populating relations:
from py2neo import NodeSelector, Graph, Node, Relationship
graph = Graph('http://127.0.0.1:7474/db/data')
tx = graph.begin()
selector = NodeSelector(graph)
# Maps unique_name -> node already fetched from the database
node_cache = {}
with open("file", "r") as relations:
    for line in relations:
        # strip() drops the trailing newline, which would otherwise
        # end up in the last field and break the unique_name match
        line_split = line.strip().split(";")
        # Check if we have this node in the cache
        if line_split[0] in node_cache:
            node1 = node_cache[line_split[0]]
        else:
            # Query and store for later
            node1 = selector.select("Node", unique_name=line_split[0]).first()
            node_cache[line_split[0]] = node1
        if line_split[1] in node_cache:
            node2 = node_cache[line_split[1]]
        else:
            node2 = selector.select("Node", unique_name=line_split[1]).first()
            node_cache[line_split[1]] = node2
        rs = Relationship(node1, "Relates to", node2)
        tx.create(rs)
# Commit once, after all relationships have been queued
tx.commit()
With the above, each node is loaded at most once, and only if it appears in your input file.
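To see the effect of the cache without a running database, here is a minimal sketch of the same idea with the lookup factored into a helper. The names `make_cached_lookup`, `parse_relations` and `fake_fetch` are made up for this illustration; `fake_fetch` only stands in for the real query, which in your code would presumably be something like `selector.select("Node", unique_name=name).first()`.

```python
def make_cached_lookup(fetch):
    """Wrap a fetch function so each name is fetched at most once."""
    cache = {}

    def lookup(name):
        # Only call the (expensive) fetch on a cache miss
        if name not in cache:
            cache[name] = fetch(name)
        return cache[name]

    return lookup


def parse_relations(lines):
    """Yield (source, target) name pairs from 'a;b' style lines."""
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) >= 2:
            yield fields[0], fields[1]


# Stand-in for the database query; records every call it receives
calls = []

def fake_fetch(name):
    calls.append(name)
    return {"unique_name": name}


lookup = make_cached_lookup(fake_fetch)
lines = ["a;b\n", "b;c\n", "a;c\n"]
pairs = [(lookup(src), lookup(dst)) for src, dst in parse_relations(lines)]

# Six node references in the input, but only three fetches
print(len(calls))  # 3
```

Note that, as in the code above, a name that is not found is cached too, so a missing node is not queried again on later lines.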
Upvotes: 2