mysterious_guy

Reputation: 435

Complex neo4j cypher query to traverse a graph and extract nodes of a specific label and use them in optional match

I have a large database (260 GB) that stores transaction information. Its node labels are Agent, Customer, Phone, and ID_Card. The relationship types are Agent_Send, Customer_Send, Customer_at_Agent, Customer_used_Phone, and Customer_used_ID.

A single agent is connected to many customers, so passing through an Agent node while querying a path is not feasible. Below is my query:

match p=((ph: Phone {Phone_ID : "3851308.0"})-[r:Customer_Send 
  | Customer_used_ID | Customer_used_Phone *1..5]-(n2)) 
with nodes(p) as ns 
return extract (node in ns | Labels(node) ) as Labels

I am starting with a phone number and trying to extract a big "Customer" network. I am intentionally not touching the "Customer_at_Agent" relationship in the above networked query as it is not optimal as far as performance is concerned.

So, the idea is to extract all the nodes labeled "Customer" from the path and then match them against the [Customer_at_Agent] relationship.

For instance, something like:

match p=((ph: Phone {Phone_ID : "3851308.0"})-[r:Customer_Send 
  | Customer_used_ID | Customer_used_Phone *1..5]-(n2)) 
with nodes(p) as ns 
return extract (node in ns | Labels(node) ) as Labels 
of "type customer as c " 
optional match (c)-[r1:Customer_at_Agent]-(n3) 
return distinct p,r1

I am still new to Neo4j and Cypher, and I can't figure out a way to extract only the "Customer" nodes from the path and use them in the optional match.

Thanks in advance.

Upvotes: 0

Views: 999

Answers (1)

Tore Eschliman

Reputation: 2507

Use a filtering list comprehension instead of extract, and you can drop any nodes that don't carry the right label. Try this query instead:

MATCH p = (ph:Phone {Phone_ID : "3851308.0"}) - [:Customer_Send|:Customer_used_ID|:Customer_used_Phone*1..5] - ()
WITH ph, [node IN NODES(p) WHERE node:Customer] AS customer_nodes
UNWIND customer_nodes AS c_node
OPTIONAL MATCH (c_node) - [r1:Customer_at_Agent] - ()
RETURN ph, COLLECT(DISTINCT r1)

The second line takes the phone node and the generated path and produces customer_nodes, a list of the nodes on that path that have the Customer label. Line 3 unwinds this list so you have individual nodes you can use in pattern matching. Line 4 performs your optional match and finds the r1 relationships you're interested in, and line 5 returns the phone node you started with along with a collection of all the distinct r1 relationships found on customer nodes connected to that phone.
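Because a variable-length match up to 5 hops can return many paths that share the same customers, the same Customer node may reach the OPTIONAL MATCH many times. A possible variant, offered only as a sketch and not tested against your data, deduplicates the customers first. The alias agent_relationships is a name I made up for illustration, and I've written the relationship types with a single leading colon, which as far as I know is the form newer Neo4j versions expect:

```cypher
// Sketch only: same traversal as the query above, with an extra dedup step.
MATCH p = (ph:Phone {Phone_ID: "3851308.0"})-[:Customer_Send|Customer_used_ID|Customer_used_Phone*1..5]-()
// Keep only the Customer-labelled nodes of each path and turn them into rows.
UNWIND [node IN nodes(p) WHERE node:Customer] AS c_node
// Collapse repeated customers so each one is expanded to agents only once.
WITH DISTINCT ph, c_node
OPTIONAL MATCH (c_node)-[r1:Customer_at_Agent]-()
// agent_relationships is a hypothetical alias, not part of the original query.
RETURN ph, collect(DISTINCT r1) AS agent_relationships
```

Whether this is actually faster depends on how many paths overlap, so it's worth checking with PROFILE before relying on it.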

UPDATE: I added some modifications to clean up your first query line as well. If you aren't going to use an alias (like r or n2 in the first line), then don't assign them in the first place; they can affect performance and cause confusion. Empty nodes and relationships are totally fine if you don't actually have any restrictions to place on them. You also don't need parentheses to mark off a path; they are used as a core part of Cypher's ASCII art to signify nodes, so I find they are more confusing than helpful.
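If you'd rather get back the paths together with r1, which is the shape of the "return distinct p, r1" in your question, something along these lines might work. Treat it as an untested sketch; it assumes every matched path contains at least one Customer node, since UNWIND over an empty list drops that row:

```cypher
// Sketch only: keep the path variable p so it can be returned alongside r1.
MATCH p = (ph:Phone {Phone_ID: "3851308.0"})-[:Customer_Send|Customer_used_ID|Customer_used_Phone*1..5]-()
// One row per (path, Customer node on that path); p stays in scope after UNWIND.
UNWIND [node IN nodes(p) WHERE node:Customer] AS c_node
// r1 is null for customers that have no Customer_at_Agent relationship.
OPTIONAL MATCH (c_node)-[r1:Customer_at_Agent]-()
RETURN DISTINCT p, r1
```

Be aware that this can return a lot of rows on a graph your size, because every path is repeated once for each agent relationship found on its customers.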

Upvotes: 1
