Reputation: 2872
I have the following query I'm running in the browser client:
MATCH (m:Media)-[r:WITH]->(:Tag {name:'peripheral'})
WITH count(r) as totes, m
WHERE m-[:WITH]->(:Tag {name: 'target'})
RETURN COUNT(*), totes, count(*)/(totes*1.0)
I am trying to find the percentage of Media nodes with Tag 'target' out of all the Media nodes with Tag 'peripheral'.
Ex: # of Media nodes w/ 'target' & 'peripheral" / # of Media nodes w/ just 'peripheral'
Most of the query works except for the variable totes
which always returns 1. If I separate the query just keeping the first half, the count is correct - it is only when I use the WITH and subsequent WHERE clause that the count is messed up. Any ideas why totes
is incorrect?
Upvotes: 0
Views: 89
Reputation: 11216
Your query is incorrect for what you want to do. Your query only matches Media
with relationships to Tag
nodes named peripheral
where it also matches Tag
nodes named target
.
I believe what you really want is something more along the lines of this...
MATCH (m:Media)-[r:WITH]->(:Tag {name:'Peripheral'})
WITH count(r) as totes, collect(m) as mcol
unwind mcol as m
MATCH m-[:WITH]->(:Tag {name: 'Target'})
RETURN COUNT(*) as both, totes, count(*)/(totes*1.0)
Upvotes: 1
Reputation: 1893
My take on this is that your query is returning a row for each matched media node and hence the count of totes
will always be 1. To carry the total count forward you need to aggregate the returned nodes - hence using COLLECT
and then UNWIND
. Dave Bennett is correct although I believe that his second suggestion is what you need. To break it down:
This line gets your initial pattern match.
MATCH (m:Media)-[r:WITH]->(:Tag {name:'Peripheral'})
Now you need to return your total count and carry forward a collection of the matched media nodes. Just carrying m
forward results in the issue you have, i.e. getting a count of 1 for each media node you match
WITH count(r) as totes, collect(m) as mcol
Now you can unwind the collection of media nodes that you carried forward to transform it back into individual rows that you can use to match a subset of those media nodes that have the tag target
UNWIND mcol as media
MATCH media-[:WITH]->(:Tag {name: 'Target'})
Now you should have all the necessary info to get the result that you desire, i.e. the percentage of Media nodes with Tag 'target' out of all the Media nodes with Tag 'peripheral'
RETURN COUNT(*) as both, totes, COUNT(*)/(totes*1.0)
For other readers it may be worth pointing out that (totes*1.0)
is necessary to force one of the arguments to be a float. If this is not done then integer maths will take place and you will get 0 or 1! I also find it more expressive to indicate the exact expression that I am aggregating, so the return statement would become
RETURN COUNT(media) as both, totes, COUNT(media)/(totes*1.0)
and to express an actual percentage you could use
RETURN COUNT(media) as both, totes, COUNT(media)/(totes*1.0)*100
The only other thing would be to pass forward the second count explicitly
WITH COUNT(media) as both, totes
RETURN both, totes, both/(totes*1.0)*100
So to bring it all together it becomes
MATCH (m:Media)-[r:WITH]->(:Tag {name:'Peripheral'})
WITH count(r) as totes, collect(m) as mcol
UNWIND mcol as media
MATCH media-[:WITH]->(:Tag {name: 'Target'})
WITH COUNT(media) as both, totes
RETURN both, totes, both/(totes*1.0)*100
Not trying to take anything away from the previous answer - just expanding on it as I find sometimes it is useful to know why as well as how. So this is for my own benefit and future reference! :)
Upvotes: 1