Reputation: 539
I have created a database from Twitter data, with a relationship between Users and Places like this:
(:User)-[:WAS_AT]->(p:Place)
There are 610,464 relationships of that type, between 59,257 Users and 823 Places.
I want to get all the users who were in the same place:
MATCH q=(u1:User)-[:WAS_AT]->(:Place)<-[:WAS_AT]-(u2:User)
RETURN q
That query has not finished after more than two hours. What am I doing wrong?
I tried adding an index on the users, but that did not improve performance.
Thanks in advance,
Upvotes: 0
Views: 34
Reputation: 66999
Your query is trying to get every distinct pair of visits to the same Place. So if there were N visits to a Place, you are trying to get N*(N-1) paths. And you are trying to do that for each and every Place.
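To get a feel for the size of that result before running the original query, a rough sketch like the one below counts the WAS_AT relationships per Place and applies the N*(N-1) formula. The aliases n and paths are just illustrative names. With the figures from the question, an even spread would mean about 742 visits per Place and therefore on the order of 450 million paths in total; an uneven spread would make it larger.

```cypher
// Count the visits (WAS_AT relationships) arriving at each Place
MATCH (:User)-[r:WAS_AT]->(p:Place)
WITH p, count(r) AS n
// n * (n - 1) is the number of paths the pairwise query would return for this Place
RETURN p, n, n * (n - 1) AS paths
ORDER BY paths DESC
```

This only aggregates once per Place, so it should be far cheaper than materializing the paths themselves.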
What you actually want is a list of the distinct Users who visited the same Place (which will be at most N in size). Here is how you can do that:
MATCH (u:User)-[:WAS_AT]->(place:Place)
RETURN place, COLLECT(DISTINCT u) AS users
The DISTINCT option is only needed if a User can have multiple WAS_AT relationships to the same Place.
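If Places visited by only a single User are not interesting, one possible variation (an assumption about what is wanted, not part of the original question) is to filter on the size of the collected list:

```cypher
// Collect the distinct visitors of each Place
MATCH (u:User)-[:WAS_AT]->(place:Place)
WITH place, collect(DISTINCT u) AS users
// Keep only Places that at least two different Users visited
WHERE size(users) > 1
RETURN place, users
```

The result still has at most one row per Place, so it stays small compared with returning every pair of visits.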
Upvotes: 1