Reputation: 175
I have a node label CUSTOMER that has one key, CUSTOMER_ID.
Each customer ID is linked to other customer IDs; these bidirectional relationships are created from CSV files.
I want the result in the form below for all nodes:
CUSTOMER_ID, MIN(CUSTOMER_ID) over the set of related nodes
600,600
601,600
602,600
604,600
605,600
There will be many such groups of linked nodes (subgraphs) in the total data. I was able to get the result using the query below:
MATCH (a:Member_Matching_1)-[r:MATCHED*]->(b:Member_Matching_1)
WITH DISTINCT a, b
RETURN a.OPTUM_LAB_ID, min(b.OPTUM_LAB_ID)
ORDER BY toInt(min(b.OPTUM_LAB_ID)), toInt(a.OPTUM_LAB_ID)
but the issue is that the query traverses the graph many more times than necessary, because it also follows paths that revisit nodes.
Example:
wanted : 600 -> 601 -> 602 -> 604
Unwanted : 600 -> 601 -> 602 -> 603 -> 602 -> 604
As the data volume will be very high, I want to use the most efficient query.
After spending some time searching the web, I came across this solution:
MATCH p=(a:Member_Matching_1)-[:MATCHED*]->(b:Member_Matching_1)
WHERE NONE (n IN nodes(p)
  WHERE size(filter(x IN nodes(p) WHERE n = x)) > 1)
RETURN EXTRACT(n IN nodes(p) | n.OPTUM_LAB_ID);
But I am getting this error:
Neo.DatabaseError.General.UnknownError
key not found: UNNAMED32
Please advise
Thanks in advance
Reputation: 5057
As of today, Cypher is not really well-suited for this sort of query, as it only supports edge uniqueness, but not vertex uniqueness. There is a proposal in the openCypher language to support configurable matching semantics, but it has only been accepted recently and has not been merged into Neo4j.
So currently, for this sort of traversal, you are probably better off using the APOC library's apoc.path.expandConfig stored procedure. This allows you to set uniqueness constraints such as NODE_PATH, which enforces that "For each returned node there’s a unique path from the start node to it."
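A minimal sketch of what such a call might look like for the query in the question, assuming the APOC plugin is installed and that the label Member_Matching_1, the relationship type MATCHED and the property OPTUM_LAB_ID are as shown there. Note that I use NODE_GLOBAL here instead of NODE_PATH: it visits each node at most once per start node, which should be enough when you only need the set of reachable nodes rather than every path. I have not run this against your data, so treat it as a starting point.

```cypher
// Sketch only: names are taken from the question, not verified against a real schema.
MATCH (a:Member_Matching_1)
CALL apoc.path.expandConfig(a, {
  relationshipFilter: 'MATCHED>',  // follow outgoing MATCHED relationships only
  uniqueness: 'NODE_GLOBAL'        // swapped in for NODE_PATH: each node is visited once per start node
}) YIELD path
// The last node of each returned path is a node reachable from a.
WITH a, last(nodes(path)) AS b
RETURN a.OPTUM_LAB_ID AS id,
       min(toInteger(b.OPTUM_LAB_ID)) AS min_id
ORDER BY min_id, toInteger(id)
```

Whether the start node itself shows up among the results can depend on the minLevel setting and the APOC version, so it is worth checking on a small sample that a row such as 600,600 comes out as expected. On older Neo4j versions, toInt may be needed in place of toInteger.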
Also, when I faced a similar problem, I tried the following hack: set a fixed depth for the traversal and manually specify the uniqueness constraints. This did not work well for my use case, but it might be worth a try. Sketch code:
MATCH p=(n)-[*5]->(n)
WHERE nodes(p)[0] <> nodes(p)[2]
AND nodes(p)[0] <> nodes(p)[4]
AND nodes(p)[2] <> nodes(p)[4]
RETURN nodes(p)
LIMIT 1
The error you got, Neo.DatabaseError.General.UnknownError / key not found: UNNAMED32, is very strange indeed. It seems that your query overstressed the database, which resulted in this (quite unusual) error message.
Note: I agree with the comment of @TomGeudens stating that you should not create the MATCHED edge twice - just use a single direction and incorporate the undirected nature of the edge in your queries, i.e. use (...)-[...]-(...) in Cypher.
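To illustrate the note above, here is a rough sketch of storing each link once and ignoring direction when querying. The IDs '600' and '601' are placeholder values and are assumed to be stored as strings (as they would be after a plain CSV load); the label, relationship type and property come from the question.

```cypher
// Sketch only: '600' and '601' are illustrative values, not real data.
// An undirected MERGE matches an existing MATCHED relationship in either
// direction and creates a single one only if none exists.
MATCH (x:Member_Matching_1 {OPTUM_LAB_ID: '600'}),
      (y:Member_Matching_1 {OPTUM_LAB_ID: '601'})
MERGE (x)-[:MATCHED]-(y);

// Query without an arrow so that the stored direction does not matter.
MATCH (a:Member_Matching_1)-[:MATCHED*]-(b:Member_Matching_1)
WITH DISTINCT a, b
RETURN a.OPTUM_LAB_ID, min(toInteger(b.OPTUM_LAB_ID));
```

This halves the number of relationships, but the variable-length pattern still only enforces edge uniqueness, so on large components the APOC-based traversal is probably still the better option.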