Sasja Vandendriessche
Sasja Vandendriessche

Reputation: 749

How to aggregate count of nodes at every depth of variable length relationship

Data structure

I have User and Organisation nodes in my graph stored in neo4j.

A User can refer another User (relationship: REFFERED) A User can subscribe to an Organisation (relationship: SUBSCRIBED_TO)

Goal

I want to find the referrals that are coming from one user (eg. friends-of-friends-of-n*/referral-of-referral-of-n*) and the subscriptions those users made, ordered by each level of depth.

Problem

I have following cypher query:

MATCH p= (start:User {uid: 1})-[:REFERRED|SUBSCRIBED_TO*]->(end)
WITH *, relationships(p) as rel
WITH *, size(rel) as Depth
WITH Depth, type(rel[Depth - 1]) as Type, count(p) as Count
RETURN Depth, Type, Count

Returning:

╒═══════╤══════════════════════╤═══════╕
│"Depth"│"Type"                │"Count"│
╞═══════╪══════════════════════╪═══════╡
│1      │"REFERRED"            │10     │
├───────┼──────────────────────┼───────┤
│2      │"REFERRED"            │100    │
├───────┼──────────────────────┼───────┤
│3      │"REFERRED"            │1000   │
├───────┼──────────────────────┼───────┤
│4      │"REFERRED"            │10000  │
├───────┼──────────────────────┼───────┤
│4      │"SUBSCRIBED_TO"       │6      │
├───────┼──────────────────────┼───────┤
│2      │"SUBSCRIBED_TO"       │2      │
└───────┴──────────────────────┴───────┘

And the results are correct. However, if I want to pass on end in WITH to use a UNION to MATCH a second result with the end value, my result set changes.

Query:

MATCH p= (start:User {uid: 1})-[:REFERRED|SUBSCRIBED_TO*]->(end)
WITH *, relationships(p) as rel
WITH *, size(rel) as Depth
WITH Depth, type(rel[Depth - 1]) as Type, count(p) as Count, end
RETURN Depth, Type, Count

Result:

╒═══════╤══════════════════════╤═══════╕
│"Depth"│"Type"                │"Count"│
╞═══════╪══════════════════════╪═══════╡
│1      │"REFERRED"            │1      │
├───────┼──────────────────────┼───────┤
│2      │"REFERRED"            │1      │
├───────┼──────────────────────┼───────┤
│3      │"REFERRED"            │1      │
├───────┼──────────────────────┼───────┤
│4      │"REFERRED"            │1      │
├───────┼──────────────────────┼───────┤
│4      │"REFERRED"            │1      │
├───────┼──────────────────────┼───────┤
│4      │"REFERRED"            │1      │
├───────┼──────────────────────┼───────┤
│4      │"REFERRED"            │1      │
├───────┼──────────────────────┼───────┤

...
├───────┼──────────────────────┼───────┤
│4      │"SUBSCRIBED_TO"       │2      │
├───────┼──────────────────────┼───────┤
│4      │"SUBSCRIBED_TO"       │2      │
├───────┼──────────────────────┼───────┤
│4      │"SUBSCRIBED_TO"       │1      │
├───────┼──────────────────────┼───────┤
│4      │"SUBSCRIBED_TO"       │1      │
...

Question

So my question has two parts:

  1. How would one most efficiently construct this kind of query with cypher?
  2. What paradigm or what principle of cypher makes it that the result gets spread over multiple rows when end is passed along

Upvotes: 0

Views: 182

Answers (1)

cybersam
cybersam

Reputation: 67019

Aggregating functions like COLLECT use the non-aggregating terms in the same WITH (or RETURN) clause as grouping keys. You should read the documentation for more details.

Here is a query that might be suitable for your use case:

MATCH p = (:User {uid: 1})-[:REFERRED|SUBSCRIBED_TO*]->(end)
WITH
  LENGTH(p) AS depth,
  TYPE(LAST(RELATIONSHIPS(p))) AS type,
  COLLECT(end) as ends
RETURN depth, type, SIZE(ends) AS count, ends

depth, type, and count should have your expected values. And ends will be a collection of the end nodes that have the same depth and type values.

Upvotes: 1

Related Questions