Jonathan Livni
Jonathan Livni

Reputation: 107092

Count of distinct relationships for each node in cypher

The following seems like a trivial query to find distinct relationships for each node:

MATCH (n:Department)-[r]-()
WITH id(n) AS node_id, type(r) AS rel_type
RETURN node_id, DISTINCT rel_type

And yet neo4j complains:

Invalid input 'rel_type'...
"RETURN node_id, DISTINCT rel_type"
                          ^

I must be missing something fundamental, as AFAIK rel_type is simply a column of strings, over which I should be able to DISTINCT.

The goal was to later also count:

MATCH (n:Department)-[r]-()
WITH id(n) as node_id, type(r) as rel_type
RETURN node_id, DISTINCT rel_type AS unique_rel_type, count(DISTINCT rel_type) as rel_type_count

But that fails with the same error in the same spot. How do I get the count of distinct relationship types per node?

Upvotes: 1

Views: 223

Answers (1)

Tomaž Bratanič
Tomaž Bratanič

Reputation: 6514

Neo4j uses implicit aggregation keys, meaning that all the non-aggregated keys in the WITH or RETURN statement are used as aggregation keys. You can simply use:

MATCH (n:Department)-[r]-()
WITH id(n) as node_id, type(r) as rel_type
RETURN node_id, rel_type AS unique_rel_type, count(rel_type) as rel_type_count

The DISTINCT can be used to deduplicate the whole row, or it can be used in aggregations.

Whole row example:

MATCH (n:Department)-[r]-()
RETURN distinct id(n) as node_id, type(r) as rel_type

Or used in aggregations:

MATCH (n:Department)-[r]-()
RETURN id(n) as node_id, collect(distinct type(r)) as rel_type

You cannot deduplicate only a single variable in a row. Pretty sure you can't do that in SQL either.

Upvotes: 3

Related Questions