Reputation: 813
I have term nodes connected to content nodes, and a query that is meant to update a content node's connections to these term nodes.
First I decrease the count of connected content nodes for each term node originally attached, then delete the relationships.
After that I create a new relationship to all the specified term nodes, attempting to increase the count of connected content nodes for each newly connected term node by one.
The problem is, after the query runs, the count of connected content nodes is not increased by one, but rather increased by what looks like the total number of new term nodes being connected.
It seems I'm still having trouble grasping exactly how the data is handled behind the query. I suspect the answer involves counting the connected nodes, as has been the case previously when I've gotten stuck.
Here is the query:
var query = [
"MATCH (contentNode:content {UUID: {contentID} })-[r:TAGGED_WITH]->(oldTermNode:term) ",
"SET oldTermNode.contentConnections = oldTermNode.contentConnections - 1 ",
"DELETE r ",
"WITH contentNode ",
"MATCH (newTermNode:term) ",
"WHERE newTermNode.UUID IN {termIDs} ",
"CREATE UNIQUE contentNode-[:TAGGED_WITH]->newTermNode ",
"SET newTermNode.contentConnections = newTermNode.contentConnections + 1 ",
].join('\n');
As a side question, when updating the terms, often many of the new terms are the same as the old terms (the user only adds/removes one or two terms, leaving the rest the same). Would it make more sense/have faster performance if only the relationships that wouldn't be reconnected were deleted and then only the new terms added?
Thanks a lot.
Upvotes: 1
Views: 362
Reputation: 67019
This does not answer your question directly, but I wonder if your terms actually need to have a 'contentConnections' property at all. If not, then your original question becomes moot.
Based just on the info from your question, it looks like the term.contentConnections value is just a count of the number of times that the term is pointed to by a :TAGGED_WITH relationship. If that is the case, then you should be able to get an equivalent count with something like the following:
MATCH ()-[:TAGGED_WITH]->(t:term {UUID:{termId}}) RETURN count(t);
This query would be really fast if you create an index (or, probably even better, a uniqueness constraint) for the UUID property of term nodes. If this works for you, then you can simplify and speed up your other queries, since there would be no need to maintain the contentConnections value.
For example, your original query could be simplified to:
var query = [
"MATCH (contentNode:content {UUID: {contentID} })-[r:TAGGED_WITH]->(oldTermNode:term) ",
"DELETE r ",
"WITH contentNode ",
"MATCH (newTermNode:term) ",
"WHERE newTermNode.UUID IN {termIDs} ",
"CREATE UNIQUE contentNode-[:TAGGED_WITH]->newTermNode ",
].join('\n');
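The index or uniqueness constraint mentioned above could be set up with something like the following. This is only a sketch: it assumes the older schema syntax that goes with the {param} placeholders and CREATE UNIQUE used in this thread (newer Neo4j releases use a different syntax), and the variable names are made up for illustration.

```javascript
// Older (Neo4j 2.x-era) schema syntax; an assumption based on the query style in this thread.
// Run one of these once, as a separate statement from the data queries.

// A uniqueness constraint also gives you an index on the property.
var createConstraint = "CREATE CONSTRAINT ON (t:term) ASSERT t.UUID IS UNIQUE";

// Alternative if term UUIDs are not guaranteed to be unique: a plain index.
var createIndex = "CREATE INDEX ON :term(UUID)";

// The count query from above, built in the same style as the other queries.
var countQuery = [
  "MATCH ()-[:TAGGED_WITH]->(t:term {UUID: {termId}}) ",
  "RETURN count(t) AS contentConnections"
].join('\n');
```

With either of these in place, the lookup of the term by UUID should no longer need to scan every term node.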
Upvotes: 2
Reputation: 3308
I've revised your query to work the way you described. What I've done is collect your terms into a distinct collection and iterate through each node to increment or decrement its connection count. This should work in theory, but I would advise taking other precautions to keep the relationship counts on the term nodes consistent.
I'm assuming, though, that each term could have an unbounded number of connections, so it would be computationally expensive to go through each of your terms, count its connections, and then set that count as a weight on the node.
// Gather the old relationships and their term nodes into a single row
MATCH (contentNode:content {UUID: "1234" })-[r:TAGGED_WITH]->(oldTermNode:term)
WITH contentNode, collect(r) AS oldRels, collect(DISTINCT oldTermNode) AS oldTermNodes
// Decrement each old term once, then delete the old relationships
FOREACH (oldTermNode IN oldTermNodes |
  SET oldTermNode.contentConnections = oldTermNode.contentConnections - 1)
FOREACH (r IN oldRels | DELETE r)
WITH contentNode
// Connect the content node to the new terms
MATCH (newTermNode:term)
WHERE newTermNode.UUID IN ["1112", "1113"]
CREATE UNIQUE (contentNode)-[:TAGGED_WITH]->(newTermNode)
// Increment each new term exactly once
WITH collect(DISTINCT newTermNode) AS newTermNodes
FOREACH (newTermNode IN newTermNodes |
  SET newTermNode.contentConnections = newTermNode.contentConnections + 1)
You'll need to reinsert your parameters; I used literal values in this example so that I could actually run it and make sure it worked.
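To illustrate why the collections matter, here is a rough simulation in plain JavaScript of how rows flow through the two versions of the query. It is a simplified model of Cypher's row handling, not Neo4j itself, and the function names are made up for this sketch. As far as I can tell, the original query over-counts because "WITH contentNode" still carries one row per old relationship, so everything after it runs once per row.

```javascript
// Simplified model: each clause runs once per row produced by the previous clause.

// Original query: one row per old relationship survives the WITH,
// so the second MATCH and the SET are repeated for every one of those rows.
function naiveIncrements(oldTermIds, newTermIds) {
  var increments = {};
  oldTermIds.forEach(function () {        // one row per deleted relationship
    newTermIds.forEach(function (id) {    // second MATCH runs again for every row
      increments[id] = (increments[id] || 0) + 1; // SET ... + 1 fires per row
    });
  });
  return increments;
}

// Revised query: collect() folds the old rows into a single row, and
// collect(DISTINCT newTermNode) plus FOREACH touches each new term once.
function collectedIncrements(oldTermIds, newTermIds) {
  var increments = {};
  if (oldTermIds.length === 0) {
    return increments; // the first MATCH found nothing, so no rows continue
  }
  var distinctNew = newTermIds.filter(function (id, i) {
    return newTermIds.indexOf(id) === i;  // keep the first occurrence only
  });
  distinctNew.forEach(function (id) {
    increments[id] = 1;
  });
  return increments;
}

// Three old terms, two new terms:
console.log(naiveIncrements(['a', 'b', 'c'], ['x', 'y']));     // each new term incremented three times
console.log(collectedIncrements(['a', 'b', 'c'], ['x', 'y'])); // each new term incremented once
```

In this model the over-count equals the number of old relationships. That would look like "the number of new terms" whenever the old and new term lists are about the same size.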
You asked: "As a side question, when updating the terms, often many of the new terms are the same as the old terms (the user only adds/removes one or two terms, leaving the rest the same). Would it make more sense/have faster performance if only the relationships that wouldn't be reconnected were deleted and then only the new terms added?"
You could revise the query by specifying you only want oldTermNodes that are not in the newTermNode collection. So yes, to answer your question, this would prevent unnecessary writes, which would increase performance. You'll just need to make sure that you remove from your newTermNodes collection any of the redundant terms so that the contentConnections are not incremented for those terms in the last line of the script.
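One way to work out which terms actually changed is to compute the difference on the JavaScript side before building the query. The sketch below assumes you already have the content node's current term UUIDs available (for example from an earlier read); diffTerms and its field names are hypothetical and only for illustration.

```javascript
// Split the requested terms into what must be removed and what must be added.
function diffTerms(oldTermIds, newTermIds) {
  var toRemove = oldTermIds.filter(function (id) {
    return newTermIds.indexOf(id) === -1; // connected now, but no longer wanted
  });
  var toAdd = newTermIds.filter(function (id) {
    return oldTermIds.indexOf(id) === -1; // wanted, but not connected yet
  });
  return { toRemove: toRemove, toAdd: toAdd };
}

// "1112" is in both lists, so it is left alone.
var diff = diffTerms(['1111', '1112'], ['1112', '1113']);
console.log(diff.toRemove, diff.toAdd);
```

The two arrays could then be passed as separate query parameters, so that only the terms in toRemove are decremented and disconnected, and only the terms in toAdd are connected and incremented.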
Upvotes: 2