zulfi123786

Reputation: 175

neo4j error while avoiding loops in relationships

I have a node label CUSTOMER with a single key, CUSTOMER_ID.

Each customer ID is linked to other customer IDs; these bidirectional relationships are created from CSV files.

[Image: Graph-Data]

I want the result in the form below for all nodes:

CUSTOMER_ID, MIN(CUSTOMER_ID) over the set of related nodes
600,600
601,600
602,600
604,600
605,600

There will be many such groups of linked nodes (subgraphs) in the full data set. I was able to get the result using the query below:

MATCH (a:Member_Matching_1)-[r:MATCHED*]->(b:Member_Matching_1)
WITH DISTINCT a, b
RETURN a.OPTUM_LAB_ID, min(b.OPTUM_LAB_ID)
ORDER BY toInt(min(b.OPTUM_LAB_ID)), toInt(a.OPTUM_LAB_ID)

The issue is that this query traverses the graph many more times than necessary, because it also follows paths that revisit nodes.

Example:

wanted : 600 -> 601 -> 602 -> 604

Unwanted : 600 -> 601 -> 602 -> 603 -> 602 -> 604

As the data volume will be very high, I want to use the most efficient query.

After spending some time searching the web, I came across this solution:

MATCH  p=(a:Member_Matching_1) -[:MATCHED*]-> (b:Member_Matching_1)
WHERE NONE (n IN nodes(p) 
            WHERE size(filter(x IN nodes(p) 
                              WHERE n = x))> 1)
RETURN EXTRACT(n IN NODES(p)| n.OPTUM_LAB_ID) ;

But I am getting this error:

Neo.DatabaseError.General.UnknownError
key not found:   UNNAMED32

Please advise

Thanks in advance

Upvotes: 1

Views: 390

Answers (1)

Gabor Szarnyas

Reputation: 5057

As of today, Cypher is not well suited for this sort of query, as it only supports edge uniqueness, not vertex uniqueness. There is a proposal in the openCypher language to support configurable matching semantics, but it was accepted only recently and has not been merged into Neo4j.

So currently, for this sort of traversal, you are probably better off using the APOC library's apoc.path.expandConfig stored procedure. This allows you to set uniqueness constraints such as NODE_PATH, which enforces that "For each returned node there’s a unique path from the start node to it."
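
A minimal sketch of such a call, assuming the APOC plugin is installed and reusing the label, relationship type and property from the question's queries. It uses the NODE_GLOBAL uniqueness setting rather than NODE_PATH, on the assumption that only the set of reachable nodes matters here and not the individual paths; the aliases id and min_id are made up for illustration. I have not run it against your data.

```cypher
// Sketch only: requires APOC; names are taken from the question's queries.
MATCH (a:Member_Matching_1)
CALL apoc.path.expandConfig(a, {
  relationshipFilter: 'MATCHED',   // no direction marker: follow both directions
  uniqueness: 'NODE_GLOBAL',       // visit each node at most once per start node
  minLevel: 0                      // include the start node itself
}) YIELD path
WITH a, last(nodes(path)) AS b
RETURN a.OPTUM_LAB_ID AS id, min(toInt(b.OPTUM_LAB_ID)) AS min_id
ORDER BY min_id, toInt(id)
```

On newer Neo4j versions toInt has been replaced by toInteger, so the conversion calls may need adjusting.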

Also, when I faced a similar problem, I tried the following hack: set a fixed traversal depth and manually specify the uniqueness constraints. This did not work well for my use case, but it might be worth a try. Sketch code:

MATCH p=(n)-[*5]->(n)
WHERE nodes(p)[0] <> nodes(p)[2]
  AND nodes(p)[0] <> nodes(p)[4]
  AND nodes(p)[2] <> nodes(p)[4]
RETURN nodes(p)
LIMIT 1

The error you got, Neo.DatabaseError.General.UnknownError / key not found: UNNAMED32, is very strange indeed. It seems that your query overstressed the database, which resulted in this (quite unique) error message.
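
If you want to retry the node-uniqueness filter in plain Cypher, the same check can be written with the all() and single() predicates and a list comprehension instead of filter() and extract(). This is only a sketch: I have not verified that it avoids the error on your version, the upper bound of 10 hops is an arbitrary assumption to keep the search finite, and it still enumerates every edge-unique path before filtering.

```cypher
// Sketch: keep only paths in which every node occurs exactly once.
// The *..10 bound is an assumed limit, not something from the question.
MATCH p = (a:Member_Matching_1)-[:MATCHED*..10]->(b:Member_Matching_1)
WHERE all(n IN nodes(p) WHERE single(m IN nodes(p) WHERE m = n))
RETURN [n IN nodes(p) | n.OPTUM_LAB_ID] AS ids
```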

Note: I agree with the comment of @TomGeudens stating that you should not create the MATCHED edge twice. Just store a single direction and incorporate the undirected nature of the edge in your queries, i.e. use (...)-[...]-(...) in Cypher.
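
For illustration, the first query from the question could be written against single-direction edges roughly as follows. This is a sketch that assumes each MATCHED relationship is stored once; the alias min_id is made up. It only changes the direction handling, so it still enumerates all edge-unique paths and does not by itself fix the performance problem.

```cypher
// Sketch: no arrowhead, so MATCHED is followed in either direction.
// *0.. also matches the zero-length path, so a itself is included in the minimum.
MATCH (a:Member_Matching_1)-[:MATCHED*0..]-(b:Member_Matching_1)
WITH DISTINCT a, b
RETURN a.OPTUM_LAB_ID, min(toInt(b.OPTUM_LAB_ID)) AS min_id
ORDER BY min_id, toInt(a.OPTUM_LAB_ID)
```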

Upvotes: 1
