Emmanuel Sunday

Reputation: 65

How to get counts of Edges related to a node in one query - NEO4J

I have two questions.

What is the best way to index user activities like posts, reposts, comments, upvotes, and downvotes? My current solution represents every activity as a Post node. It should work, but I know it's quite expensive to treat upvotes and downvotes as new nodes when I could just use a relationship to represent them. Still, I want to be able to fetch everything at once and order it.

Secondly: when I run the following query without the WITH clauses and the MATCH clauses that follow them, the result is larger. But as I try to get the counts of reposts, replies, and upvotes, the result keeps getting smaller and eventually becomes empty.

MATCH (me:User {id: "172ed572-e3af-d3ee-77c0-8d9d181b12f1"})-[:COLLEAGUE_OF]-(u:User)-[posted:POSTED]->(p:Post) WHERE posted.date >= 0
WITH p, posted, u AS user MATCH (p)-[ro:REPOST_OF]-(:Post)
WITH count(ro) AS reposts, posted, ro, user MATCH (p)-[rt:REPLY_TO]->(:Post)
WITH count(rt) AS replies, posted, user, reposts MATCH (p)-[uv:UP_VOTE]->(:Post)
WITH count(uv) AS upvotes, posted, user, reposts, replies, p
RETURN p AS post,  posted, user, reposts, replies
ORDER BY -posted.date

Upvotes: 0

Views: 121

Answers (1)

cybersam

Reputation: 66999

  1. You need to read the documentation on aggregating functions (like COUNT). In particular, you need to understand that the WITH (and RETURN) clause treats terms that do not contain aggregating functions as the "grouping keys" for the terms that do contain aggregating functions.

    For example, a clause such as WITH foo, COUNT(foo) AS fooCount makes foo its own grouping key, so fooCount will be 1 whenever each foo value appears in only one row.

  2. WITH clauses must specify the bound variables whose values you want to use later in the same query; any unspecified variables will be dropped. Since your second and third WITH clauses do not specify p, their subsequent MATCH clauses are actually NOT using the previously bound value for p (but creating totally new p variables, each having multiple values).

  3. You should use OPTIONAL MATCH instead of MATCH to get the counts of things that may not exist. A MATCH that finds nothing discards the current row, so any post that lacks a repost, a reply, or an upvote drops out of the result entirely.

  4. You neglected to make the (p)-[ro:REPOST_OF]-(:Post) relationship pattern directional. If you wanted to get a count of the number of times that p was reposted, you should have used the pattern (p)<-[ro:REPOST_OF]-(:Post).

  5. You forgot to return upvotes.

  6. You should use ORDER BY posted.date DESC instead of ORDER BY -posted.date.
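
To make points 1 and 3 concrete, here is a minimal sketch that reuses the Post label and REPOST_OF relationship type from the question. The two statements are illustrative only and assume nothing else about the data model. They differ only in whether ro is listed next to the aggregate.

```cypher
// Grouping key is p alone: one row per post with its total repost count.
// OPTIONAL MATCH keeps posts without reposts, and COUNT(ro) is 0 for them
// because COUNT ignores nulls.
MATCH (p:Post)
OPTIONAL MATCH (p)<-[ro:REPOST_OF]-(:Post)
WITH p, COUNT(ro) AS reposts
RETURN p, reposts;

// Grouping keys are p and ro: one row per (post, repost relationship) pair,
// so reposts is 1 on every row that has a repost and 0 where ro is null.
MATCH (p:Post)
OPTIONAL MATCH (p)<-[ro:REPOST_OF]-(:Post)
WITH p, ro, COUNT(ro) AS reposts
RETURN p, reposts;
```

The second form is what happens in the question's second WITH clause, which lists ro alongside count(ro).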

This may work better for you:

MATCH (:User {id: "172ed572-e3af-d3ee-77c0-8d9d181b12f1"})-[:COLLEAGUE_OF]-(user:User)-[posted:POSTED]->(p:Post)
WHERE posted.date >= 0
OPTIONAL MATCH (p)<-[ro:REPOST_OF]-(:Post)
WITH p, posted, user, COUNT(ro) AS reposts
OPTIONAL MATCH (p)-[rt:REPLY_TO]->(:Post)
WITH p, posted, user, reposts, COUNT(rt) AS replies
OPTIONAL MATCH (p)-[uv:UP_VOTE]->(:Post)
RETURN p, posted, user, reposts, replies, COUNT(uv) AS upvotes
ORDER BY posted.date DESC
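
As an alternative sketch, and assuming a Neo4j version that supports COUNT subquery expressions (Neo4j 5 or later), each count can be computed per row without chaining OPTIONAL MATCH and WITH clauses. The relationship types and directions are copied from the query above and have not been checked against your actual data.

```cypher
MATCH (:User {id: "172ed572-e3af-d3ee-77c0-8d9d181b12f1"})-[:COLLEAGUE_OF]-(user:User)-[posted:POSTED]->(p:Post)
WHERE posted.date >= 0
// Each COUNT { ... } is evaluated independently for the current row,
// so a post with no matches simply gets 0 instead of being dropped.
RETURN p, posted, user,
       COUNT { (p)<-[:REPOST_OF]-(:Post) } AS reposts,
       COUNT { (p)-[:REPLY_TO]->(:Post) } AS replies,
       COUNT { (p)-[:UP_VOTE]->(:Post) } AS upvotes
ORDER BY posted.date DESC
```

Because no aggregation happens in the RETURN clause itself, there are no grouping keys to keep track of in this form.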

Upvotes: 1
