Reputation: 439
I would like to find all nodes that are reachable from a starting node via one specific relationship type.
My graph has the following relationships:
(User) --[LOGGED_IN]--> (Ip)
(User) --[FRIEND]--> (User)
I would like to find all User nodes reachable through the LOGGED_IN relationship. For example:
user1 logged_in ip1
user2 logged_in ip1
user2 logged_in ip2
user3 logged_in ip2
user3 logged_in ip3
user4 logged_in ip3
user5 logged_in ip4
user1 friend user5
If I start from user1, I want to find user1, user2, user3 and user4. I would like to ignore the FRIEND relationship.
I know that if I only had the [:LOGGED_IN] relationship, I could use the Cypher below. But I also have the FRIEND relationship, so this query also returns the users linked by [:FRIEND].
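For anyone who wants to reproduce the example, the data above could be loaded with something like the following. This is only a sketch: the `address` property on Ip nodes and the literal `user_id` values are assumptions made for illustration, since the question names only the `user_id` property.

```cypher
// Sample data from the example above.
// Assumed for illustration: the Ip `address` property and the string ids.
CREATE (u1:User {user_id: 'user1'}),
       (u2:User {user_id: 'user2'}),
       (u3:User {user_id: 'user3'}),
       (u4:User {user_id: 'user4'}),
       (u5:User {user_id: 'user5'}),
       (ip1:Ip {address: 'ip1'}),
       (ip2:Ip {address: 'ip2'}),
       (ip3:Ip {address: 'ip3'}),
       (ip4:Ip {address: 'ip4'}),
       (u1)-[:LOGGED_IN]->(ip1),
       (u2)-[:LOGGED_IN]->(ip1),
       (u2)-[:LOGGED_IN]->(ip2),
       (u3)-[:LOGGED_IN]->(ip2),
       (u3)-[:LOGGED_IN]->(ip3),
       (u4)-[:LOGGED_IN]->(ip3),
       (u5)-[:LOGGED_IN]->(ip4),
       (u1)-[:FRIEND]->(u5);
```

In this data, user5 is connected to user1 only through FRIEND, so it should not appear in the result.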
MATCH (u:User)-[*]->(connected:User)
WHERE u.user_id = <user1_id>
RETURN connected
Upvotes: 1
Views: 675
Reputation: 30397
If your nodes are deeply interconnected, then Cypher alone may not work out for you. MATCH operations with variable-length paths find every possible path that fits the pattern, and the number of possible paths can quickly go through the roof. That isn't a good fit when you're only concerned with the distinct connected nodes.
If you have access to APOC Procedures, there are some path expander procedures that are optimized toward finding connected nodes. After installing and configuring APOC, give this a try:
MATCH (u:User)
WHERE u.user_id = <user1_id>
CALL apoc.path.subgraphNodes(u, {relationshipFilter:'LOGGED_IN', labelFilter:'>User', filterStartNode:true}) YIELD node as connected
RETURN connected;
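The same call can also be written with a query parameter and a depth cap. This is a hedged sketch rather than a tuned query: `$userId` is an assumed parameter name, and the `maxLevel` value of 6 is an arbitrary illustration that you would adjust to your data.

```cypher
// $userId is a query parameter supplied by the caller (assumed name).
MATCH (u:User {user_id: $userId})
CALL apoc.path.subgraphNodes(u, {
  relationshipFilter: 'LOGGED_IN',  // no < or > prefix: follow in either direction
  labelFilter: '>User',             // expand through other labels, return only User nodes
  filterStartNode: true,            // apply the label filter to the start node too
  maxLevel: 6                       // optional cap on traversal depth, in relationships
}) YIELD node AS connected
RETURN connected.user_id AS user_id
ORDER BY user_id;
```

Because FRIEND is not listed in `relationshipFilter`, the expansion never follows it.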
Upvotes: 1
Reputation: 66999
This should work (with the appropriate value for <user1_id>):
MATCH (u:User)-[:LOGGED_IN*0..]-(connected:User)
WHERE u.user_id = <user1_id>
RETURN DISTINCT connected;
The (u:User)-[:LOGGED_IN*0..]-(connected:User) pattern:
- follows only LOGGED_IN relationships, so FRIEND relationships are ignored. Relationship types are case-sensitive, so the type must be written exactly as it is stored.
- uses 0 as the lower bound for the variable-length path pattern, which allows the u node itself to be assigned to connected.
- does not specify a direction for the LOGGED_IN relationship, to permit traversals from Ip nodes to User nodes (and vice versa).
The DISTINCT keyword is used to eliminate duplicate results.
This query will always return the u node (if it exists), since a node is trivially reachable from itself.
[UPDATED]
If you have enough data, then the variable-length path pattern will have to specify a reasonable upper bound (e.g., *0..5 in place of *0..) to avoid running out of memory or having the query take forever to complete.
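When picking the bound, it may help to remember that two User nodes are only ever linked through an Ip node, so every path between users has an even number of relationships and an odd upper bound behaves like the even number below it. A minimal sketch of a bounded query follows, assuming the relationship type is stored as LOGGED_IN (as in the question's schema) and that `$userId` is a parameter supplied by the caller:

```cypher
// Upper bound of 6 relationships = up to 3 shared-Ip hops between users.
// $userId is an assumed parameter name.
MATCH (u:User)-[:LOGGED_IN*0..6]-(connected:User)
WHERE u.user_id = $userId
RETURN DISTINCT connected;
```

With the question's example data, user4 is six relationships away from user1 (user1, ip1, user2, ip2, user3, ip3, user4), so a bound of 6 still reaches it, whereas a bound of 5 would stop at user3.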
Upvotes: 3