Reputation: 1

NEO4j Join and select query

I'm new to Neo4j and have a basic question. My database has two types of nodes: employeetoprojects and employees.

For my first query, I want to return employees with 2 or more projects. I used this:

MATCH (md:employeetoproject)
RETURN md.eid as ID, count(md.projectid) AS count
ORDER BY count DESC

Two questions:

I want to limit my view to only employees with more than 5 projects. How do I do that?
Also, I want to return the employee name from the employee table, joined into the result. How do I do this? Thanks.

Upvotes: 0

Answers (1)

InverseFalcon

Reputation: 30417

Welcome to the world of graph databases!

A few things to keep in mind: Neo4j doesn't have tables, and graph database terminology is a little different from what you may be used to.

Neo4j has nodes and relationships. A node may have a label, which makes it similar to an entry in a table, but nodes with a given label are not required to have a strict set of properties (this differs from table columns in an RDBMS). Nodes can also have multiple labels, so the graph database model is a bit more flexible: a node can play multiple roles depending on the context in which you want to view the data.

Data modeling is also different in graph databases: relationships should be created and used instead of a foreign key / table join approach. Join tables aren't needed either; just use relationships to connect the nodes in question. Don't think about it the way you would relational data, because these are not table joins.

The more graphy way to model your data might be:

(:Employee)-[:WORKS_ON]->(:Project)

So you would have :Employee nodes, :Project nodes, and :WORKS_ON relationships connecting them. During your data load/import process you might create all your :Employee nodes, then all your :Project nodes, then create the relationships between them. An index on each of these labels, on the properties you want to use for lookup, would also help future queries, such as an index on :Employee(eid).
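As a rough sketch of that load order, the existing employeetoproject nodes could be used to build the new model. This assumes those nodes carry the eid and projectid properties seen in the question's query; the projectid property on :Project and the index names are hypothetical choices, and the index syntax shown is the Neo4j 4.0+ form (older versions use CREATE INDEX ON :Employee(eid)).

```cypher
// Indexes to speed up the MERGE lookups below (Neo4j 4.0+ syntax).
// The index names are illustrative, not taken from the question.
CREATE INDEX employee_eid FOR (e:Employee) ON (e.eid);
CREATE INDEX project_projectid FOR (p:Project) ON (p.projectid);

// Assumes each employeetoproject node has eid and projectid properties.
// MERGE creates a node or relationship only if it does not already exist.
MATCH (md:employeetoproject)
MERGE (e:Employee {eid: md.eid})
MERGE (p:Project {projectid: md.projectid})
MERGE (e)-[:WORKS_ON]->(p);
```

Properties such as the employee name are not copied by this sketch; they would still need to be set from your existing employee nodes.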

A query to find employees who work on more than 5 projects, and show the employee name, eid, and count, would look like:

MATCH (e:Employee)
WITH e, size((e)-[:WORKS_ON]->(:Project)) as count
WHERE count > 5
RETURN e.name as name, e.eid as eid, count
ORDER BY count DESC

Note that if :WORKS_ON relationships always connect to a :Project node, then you can remove the label from that pattern and the query will become more efficient. It can then use a degree check on :WORKS_ON relationships instead of expanding each relationship and filtering the node on the other end to make sure it's a :Project node:

WITH e, size((e)-[:WORKS_ON]->()) as count

This is the main reason why I used size() on the desired pattern, rather than MATCHing to the full pattern and using the count() aggregation function. That approach works fine, but cannot be optimized as mentioned above to work with relationship degrees.

For reference, however, this is how you would get your same answer using the full pattern in the MATCH and the count() function:

MATCH (e:Employee)-[:WORKS_ON]->(p:Project)
WITH e, count(p) as count
WHERE count > 5
RETURN e.name as name, e.eid as eid, count
ORDER BY count DESC
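One version caveat about the size() form used earlier: in Neo4j 5 and later, size() no longer accepts a pattern as its argument, and a COUNT { } subquery expression takes its place. A minimal sketch of the equivalent query, assuming Neo4j 5+ and the same :Employee / :WORKS_ON model (the projectCount alias is just an illustrative name):

```cypher
// Neo4j 5+ form: COUNT { } replaces size() on a pattern.
MATCH (e:Employee)
WITH e, COUNT { (e)-[:WORKS_ON]->() } AS projectCount
WHERE projectCount > 5
RETURN e.name AS name, e.eid AS eid, projectCount
ORDER BY projectCount DESC
```

Leaving the label off the end node, as here, should still let the planner use a relationship degree check rather than expanding each relationship.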

Upvotes: 1
