Reputation: 61
I have experience with Neo4j and Cypher, but still struggle with aggregate functions. I'm trying to pull a CSV out of Neo4j that should look like this:
Location | Number of Node X at Location | Number of Node Y at Location |
---|---|---|
ABC | 5 | 20 |
DEF | 39 | 4 |
Etc. | # | # |
My current query looks like this:
MATCH (loc:Location)--(x:Node_X)
RETURN loc.key AS Location, count(x) AS `Number of Node X at Location`, 0 AS `Number of Node Y at Location`
UNION
MATCH (loc:Location)--(y:Node_Y)
RETURN loc.key AS Location, 0 AS `Number of Node X at Location`, count(y) AS `Number of Node Y at Location`
Which yields a table like:
Location | Number of Node X at Location | Number of Node Y at Location |
---|---|---|
ABC | 5 | 0 |
DEF | 39 | 0 |
Etc. | # | # |
ABC | 0 | 20 |
DEF | 0 | 4 |
Etc. | # | # |
I think I'm close, but I have double the number of Location rows as I need, and am not sure how to make the results more succinct.
Upvotes: 0
Views: 583
Reputation: 146
I'd try this:
MATCH (loc:Location)
with distinct loc
OPTIONAL MATCH (loc)--(x:Node_X)
WITH distinct loc, count(x) AS xnum
OPTIONAL MATCH (loc)--(y:Node_Y)
WITH DISTINCT loc, count(y) AS ynum, xnum
RETURN
DISTINCT loc.key as Location,
xnum as `Number of Node X at Location`,
ynum as `Number of Node Y at Location`
Upvotes: 0
Reputation: 5385
I think You can solve it like this with even when counts are 0
MATCH (loc:loc1)
RETURN loc.type ,
size((loc)--(:Node_X)) AS xCount,
size((loc)--(:Node_Y)) AS yCount
You can also do
MATCH (loc:loc1)
RETURN loc.type ,
size([(loc)—-(x:Node_X) | x]) AS xCount
Upvotes: 1
Reputation: 61
@PrashantUpadhyay got the answer started, but I think this is the final answer I was looking for. It accounts for cases where counts may return zero, but still includes all Location rows.
MATCH (loc:Location)
OPTIONAL MATCH (loc)--(x:Node_X)
OPTIONAL MATCH (loc)--(y:Node_Y)
RETURN loc.key AS Location,
coalesce(count(distinct(x)), 0) as Node_X,
coalesce(count(distinct(y)), 0) as Node_Y
ORDER BY Location
Upvotes: 0
Reputation: 877
You can aggregate with distinct here.
MATCH (loc:loc1)--(x:Node_X), (loc)--(y:Node_Y)
RETURN loc.key ,
count(distinct(x)) as NODES_OF_TYPE_X,
count(distinct(y)) as NODES_OF_TYPE_Y
The problem in accessing x and y in above query is it changes the cardinality of the solution. For each solution of x, it will have all the solutions of y. If you had n1 x nodes and n2 y nodes and you don't use distinct, then you would get n1*n2 nodes for each x and y.
Upvotes: 0