Bill Brown

Reputation: 11

Neo4j count is not predictable

Using the preferential attachment example from the guides, I have a node C with degree 3 and a node E with degree 1:

UNWIND [["A", "C"], ["A", "B"], ["B", "D"],
    ["B", "C"], ["B", "E"], ["C", "D"]] AS pair
MERGE (n1:Node {name: pair[0]})
MERGE (n2:Node {name: pair[1]})
MERGE (n1)-[:FRIENDS]-(n2)

When I run a simple degree query for node E, I get the correct answer. But when I add a MATCH for another node, the count changes to 3.

MATCH (e:Node {name: 'E'})--(othere)
RETURN e, othere, count(othere)

This returns 1 for count(othere).

MATCH (c:Node {name: 'C'})--(otherc)
MATCH (e:Node {name: 'E'})--(othere)
RETURN e, othere, count(othere)

This returns 3 for count(othere). Why is that?

Upvotes: 0

Views: 41

Answers (1)

cybersam

Reputation: 66989

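In your last query, the two MATCH clauses are independent of each other, so the result is the Cartesian product of their matches: 3 rows for c (one per neighbor of C) times 1 row for e. In the RETURN clause, the COUNT aggregating function counts the number of result rows that have the same e and othere values. With your sample data, there are 3 such result rows.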
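To see those rows for yourself, here is a small sketch that runs the same two MATCH clauses without any aggregation. It assumes the graph contains only the sample data from the question.

```cypher
// Same two MATCH clauses as in the question, but returning raw rows
// instead of aggregating, so the row multiplication is visible.
MATCH (c:Node {name: 'C'})--(otherc)
MATCH (e:Node {name: 'E'})--(othere)
RETURN c.name, otherc.name, e.name, othere.name
```

This should produce three rows, one per neighbor of C (A, B and D), and each of them repeats the same E and B pair. That repetition is what COUNT(othere) ends up counting.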

Here is one way to get a correct count of the number of relationships between e and othere:

MATCH (c:Node {name: 'C'})--(otherc)
MATCH (e:Node {name: 'E'})-[r]-(othere)
RETURN e, othere, COUNT(DISTINCT r)

[DISCUSSION]

In general, there can be any number of relationships between any two nodes. So, to make this a more general discussion, suppose the "E" node has 2 relationships to the "B" node (and no other relationships).

  • My query would return a correct COUNT (degree) of 2. Remember, the degree of a node is its number of relationships.

  • A similar-looking query that returned COUNT(DISTINCT othere) instead of COUNT(DISTINCT r) would return 1, which is incorrect.

  • Your second query's RETURN clause (RETURN e, othere, COUNT(othere)) would return a COUNT of 6, because there would be 6 result rows with the same e and othere values: 3 rows for c times 2 rows for e.

I hope this helps to make clear why I used COUNT(DISTINCT r). You should also read the aggregating function documentation carefully.
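If you actually want the degrees of both nodes from one query, one possible approach (a sketch, not the only way) is to aggregate the first node's relationships with WITH before matching the second node, so the rows never multiply. The aliases cDegree and eDegree are names I made up for illustration.

```cypher
// Collapse C's relationships into a single row before matching E,
// so the second MATCH is not multiplied by C's neighbors.
MATCH (c:Node {name: 'C'})-[rc]-()
WITH c, count(DISTINCT rc) AS cDegree
MATCH (e:Node {name: 'E'})-[re]-()
RETURN c.name, cDegree, e.name, count(DISTINCT re) AS eDegree
```

With the sample data from the question, this should return a single row with cDegree = 3 and eDegree = 1.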

Upvotes: 1
