Neo4j COUNT function and query confusion

Question

I'm just starting out with Neo4j, kicking the tyres if you will, but I'm getting a count I wasn't expecting.

I have made a dummy database with 999 employees and 26901 expense claims made by them. (I know that's an odd number: I had a loop inserting dummy expense claims using setInterval in JavaScript, assigning random claims to random employees, and I forgot to stop it :S)

Goal: I would like to know how many employees have made expense claims in my database.

I started with the following query:

MATCH (ex:EXPENSE)<-[:MADE_EXPENSE_CLAIM]-(employee:Employee)-[:WORKS_IN]->(:Department) 
RETURN COUNT(employee)

However, I'm getting 20211 as the result of my count, which seems a little high.

If I run the query a different way I get a result that makes more sense (918 as the result):

MATCH (n:Employee)-[:WORKS_IN]->(:Department)
WHERE (n)-[:MADE_EXPENSE_CLAIM]->()
RETURN COUNT(n)

Can anyone tell me what my two queries are doing and which one (if either) actually achieves my goal? If neither does, can you please correct my query?

Dave Bennett · Accepted Answer

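Your first query counts one row per matched path, so an employee is counted once for every expense claim they have made. The result is the total number of expense claims made by employees who work in a department.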

Your second query returns the number of employees who work in a department and have made at least one expense claim.
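To see the difference in one place, a rough sketch like the following reuses the MATCH from the first query and returns both counts side by side. The aliases matchedRows and distinctEmployees are just illustrative names, not anything from your database.

```cypher
// Same pattern as the first query in the question.
MATCH (ex:EXPENSE)<-[:MADE_EXPENSE_CLAIM]-(employee:Employee)-[:WORKS_IN]->(:Department)
RETURN count(employee) AS matchedRows,                // one per matched path (claim)
       count(DISTINCT employee) AS distinctEmployees  // each employee counted once
```

With the data described in the question, matchedRows should be the 20211 you already saw. distinctEmployees should line up with the second query's result as long as each employee works in only one department, which is an assumption here.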

To simply return the number of distinct employees who have made expense claims, you could do something like this:

MATCH (ex:EXPENSE)<-[:MADE_EXPENSE_CLAIM]-(employee:Employee)
RETURN count(DISTINCT employee)
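Another way to get the same number, sketched here on the assumption that the labels and relationship type are as in the question, is to match each employee once and use the relationship pattern only as a filter, the way your second query does. Since every employee appears in at most one row, DISTINCT isn't needed.

```cypher
// Match each Employee node once, then keep only those with at least one claim.
MATCH (employee:Employee)
WHERE (employee)-[:MADE_EXPENSE_CLAIM]->()
RETURN count(employee)
```

Like the query above, this ignores departments, so it also counts employees who have made claims but have no WORKS_IN relationship.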

To see the number of expense claims per employee, you could do something like this. The number of rows returned will be the number of distinct employees who have made expense claims. If MADE_EXPENSE_CLAIM relationships only ever point to expense nodes, you can save some overhead by leaving the label off the expense node in the pattern.

MATCH ()<-[:MADE_EXPENSE_CLAIM]-(employee:Employee)
RETURN employee.name, count(*)
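One thing to watch: grouping on employee.name puts two employees who happen to share a name into a single row. A variation that groups on the node itself and sorts the busiest claimants first might look like this. The aliases name and claims are illustrative, and the name property is assumed to exist as in the query above.

```cypher
// Group by the employee node so employees with the same name stay separate.
MATCH ()<-[:MADE_EXPENSE_CLAIM]-(employee:Employee)
WITH employee, count(*) AS claims
RETURN employee.name AS name, claims
ORDER BY claims DESC
```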
