wanderingcatto

Reputation: 53

Neo4j: Exclude certain nodes in variable path relationship

I have a graph database consisting of two types of nodes (persons and businesses) and one type of relationship (payment).

A person may pay either another person or a business. Likewise, a business may pay a person or another business. That is, all four of these path types are possible:

(person)-[:PAYS]->(person)
(person)-[:PAYS]->(business)
(business)-[:PAYS]->(person)
(business)-[:PAYS]->(business)

To detect possible money laundering, I would like to extract cases where a payment made by a person went through several businesses before reaching another person. That is (omitting the relationships for convenience):

(person)-(business)-(business)-(business)-(person)

My Cypher query should therefore look something like this:

(person)-[:PAYS*0..3]-(person)

However, this also returns the following path, which isn't what I want:

(person)-(business)-(person)-(business)-(person)

What can I do to exclude (person) from the variable length relationship [:PAYS*0..3]?

I've followed the solution given here and tried this:

MATCH path=((person)-[:PAYS*0..3]-(person))
WHERE NONE(n IN nodes(path) WHERE n:person)
RETURN path

However, this query ran for a long time before returning zero results (which isn't correct). Another obvious solution is to change my relationships to distinguish between [:PAYS_BUSINESS] and [:PAYS_PERSON], but I would like to find out whether a solution is possible without changing my graph schema.

Upvotes: 0

Views: 909

Answers (3)

David A Stumpf

Reputation: 793

You might want to look at how I handled this with X-linked inheritance. In that use case you aggregate the sex of each parent (M or F) along the path and can then exclude MM from the aggregated string, since a man never passes an X to his son.

http://stumpf.org/genealogy-blog/graph-databases-in-genealogy

The query excludes every path whose concatenated string contains MM and accepts everything else:

MATCH p=(n:Person{RN:32})<-[:father|mother*..99]-(m)
WITH m,
     reduce(status = '', q IN nodes(p) | status + q.sex) AS c,
     reduce(srt2 = '|', q IN nodes(p) | srt2 + q.RN + '|') AS PathOrder
WHERE c = replace(c, 'MM', '')
RETURN DISTINCT m.fullname AS Fullname

In your case it's P and B (person or business).
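The same string-signature idea can be sketched outside the database. The following is only an illustration in Python, assuming each path has already been reduced to one letter per node ("P" for person, "B" for business); the function names and sample paths are hypothetical and not part of the original query.

```python
# Hypothetical toy data: each path is a list of node kinds,
# "P" for a person and "B" for a business.
def path_signature(kinds):
    """Concatenate one letter per node, e.g. ['P', 'B', 'B', 'P'] -> 'PBBP'."""
    return "".join(kinds)


def is_business_only_chain(signature):
    """True when the path starts and ends at a person and every
    interior node is a business."""
    if len(signature) < 3:
        return False  # no interior node at all
    interior = signature[1:-1]
    return signature[0] == "P" and signature[-1] == "P" and "P" not in interior


paths = [
    ["P", "B", "B", "B", "P"],  # wanted
    ["P", "B", "P", "B", "P"],  # person in the middle, unwanted
    ["P", "P"],                 # direct payment, no business involved
]
kept = [path_signature(p) for p in paths if is_business_only_chain(path_signature(p))]
print(kept)  # prints ['PBBBP']
```

In Cypher the signature would be built with reduce() over nodes(path), as in the query above, and the interior check becomes a string test on that signature.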

Upvotes: 0

Graphileon

Reputation: 5385

The reason that

MATCH path=((person)-[:PAYS*0..3]-(person))
WHERE NONE(n IN nodes(path) WHERE n:person)
RETURN path

returns nothing seems to be that nodes(path) includes the first and the last node, which are persons, so the NONE predicate fails for every path.

If you want to find the paths from :person to :person with only :business nodes in between, you could do this:
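To make the difference concrete, here is a minimal Python sketch of the two predicates applied to one path. The label list is a hypothetical stand-in for nodes(path); it is not taken from the asker's data.

```python
# Hypothetical path: node labels in order along one matched path.
path_labels = ["person", "business", "business", "person"]

# Counterpart of NONE(n IN nodes(path) WHERE n:person):
# every node is checked, endpoints included, so the person
# at either end makes the predicate false.
none_over_all_nodes = not any(label == "person" for label in path_labels)

# Checking only the interior nodes (everything except the
# first and last) gives the intended result.
none_over_interior = not any(label == "person" for label in path_labels[1:-1])

print(none_over_all_nodes, none_over_interior)  # prints False True
```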

MATCH path=((p1:Person)-[:PAYS*1..3]-(p2:Person))
WHERE ALL(n IN nodes(path)[1..-1] WHERE n:Business)
RETURN path
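As a rough model of what this query does, the sketch below enumerates undirected paths in a small in-memory graph and keeps those that start and end at a person with only businesses in between. The graph, the node names, and the helper functions are all hypothetical; edges are never reused within a path, which is how I understand Cypher treats relationships within a single pattern.

```python
# Hypothetical toy graph: node -> label, plus PAYS edges as (payer, payee).
labels = {"alice": "person", "bob": "person", "carol": "person",
          "shop": "business", "bank": "business", "cafe": "business"}
pays = [("alice", "shop"), ("shop", "bank"), ("bank", "bob"),
        ("alice", "cafe"), ("cafe", "carol"), ("carol", "bob")]


def expand(path, used, min_hops, max_hops):
    """Yield undirected paths of min_hops..max_hops edges that never reuse an edge."""
    hops = len(path) - 1
    if hops >= min_hops:
        yield list(path)
    if hops == max_hops:
        return
    for i, (a, b) in enumerate(pays):
        if i in used:
            continue
        if path[-1] == a:
            nxt = b
        elif path[-1] == b:
            nxt = a
        else:
            continue
        yield from expand(path + [nxt], used | {i}, min_hops, max_hops)


def person_to_person_via_businesses(min_hops=1, max_hops=3):
    """Counterpart of ALL(n IN nodes(path)[1..-1] WHERE n:Business)."""
    results = []
    for start, label in labels.items():
        if label != "person":
            continue
        for path in expand([start], frozenset(), min_hops, max_hops):
            interior = path[1:-1]  # drop both endpoints
            if labels[path[-1]] == "person" and all(
                    labels[n] == "business" for n in interior):
                results.append(path)
    return results


for p in person_to_person_via_businesses():
    print(" - ".join(p))
```

Two things show up in this model. With a lower bound of 1, a direct person-to-person payment (carol - bob here) passes the filter, because ALL over an empty interior is true; a lower bound of 2 excludes it. And a chain with three businesses between two persons has four relationships, so the upper bound would need to be 4 to reach it.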

You may also want to look at the apoc.path.expand and apoc.path.expandConfig procedures (https://neo4j.com/labs/apoc/4.1/overview/apoc.path/). They are powerful, but they introduce a dependency on the APOC library.

Upvotes: 2

wanderingcatto

Reputation: 53

Five minutes after I posted this question, I thought of and tried a possible solution that seems to work. I'm not sure if answering my own question is against the rules, but here's a possible way out of my own problem (in case someone else is facing the same problem):

MATCH x=(p1:person)-[:PAYS]-(b1:business)
WITH *
MATCH y=(b1:business)-[:PAYS*..3]-(b2:business)-[:PAYS]-(p2:person)
RETURN x, y

Upvotes: 0
