Reputation: 6588
I have a graph:
CREATE (c1:Contract { id: "V-9087-4321" }),
(c2:Contract { id: "V-8046-2304" }),
(p:Partner {id: "C-4837-4536"}),
(p)-[:POLICY_HOLER]->(c1),
(p)-[:POLICY_HOLER]->(c2)
MATCH (p:Partner) WHERE p.id = "C-4837-4536"
CREATE (c:Claim { id: "ef70", date: "2019-04-27" }),
(p)-[:CLAIMANT {on: "2019-04-27"}]->(c)
MATCH (p:Partner) WHERE p.id = "C-4837-4536"
CREATE (c:Claim { id: "ab90", date: "2019-04-28" }),
(p)-[:CLAIMANT {on: "2019-04-28"}]->(c)
I want to find all partners who have more than one claim after April 1st:
I've tried this:
MATCH (claim:Claim)<-[r:CLAIMANT]-(p:Partner)
WITH count(r) as cnt, p, claim
WHERE cnt > 1 AND claim.date > '2019-04-01'
RETURN cnt, p.id
It doesn't work, because (if I remove the where clause), it is returning the two claims each with cnt=1
, rather than actually aggregating them. I've tried selecting based on claim.date
as well as r.on
.
How can this be done with Cypher?
Upvotes: 0
Views: 469
Reputation: 4052
@cybersam explained about aggregate functions very well. I suggest looking at their answer to understand how aggregate functions work. But their query is not filtering claims based on the count.
You can use the following query to find all partners who have more than one claim after April 1st:
MATCH (claim:Claim)<-[r:CLAIMANT]-(p:Partner)
WHERE claim.date > '2019-04-01'
WITH count(r) as cnt, p
WHERE cnt > 1
RETURN cnt, p.id
Upvotes: 1
Reputation: 66989
The neo4j aggregating functions (like COUNT()
) use the non-aggregating terms in the same WITH
or RETURN
clause as the "grouping keys".
In your example's WITH
clause, the grouping keys are p
and claim
, so the COUNT()
function is counting the number of r
relationships between each distinct p
and claim
pair (and so the count will always be 1). What you actually want instead is the number of r
relationships for each p
.
Assuming every Partner
node has a unique id
, you can find all partners who have more than one claim after April 1st using this simplified query:
MATCH (claim:Claim)<-[:CLAIMANT]-(p:Partner)
WHERE claim.date > '2019-04-01'
RETURN COUNT(claim) AS cnt, p.id
If you also want to return a list of all the claims for each matching partner:
MATCH (claim:Claim)<-[:CLAIMANT]-(p:Partner)
WHERE claim.date > '2019-04-01'
RETURN COUNT(claim) AS cnt, COLLECT(claim) AS claims, p.id
In addition, faster searches you should create either an index or a uniqueness constraint on :Claim(date)
.
Upvotes: 1