Reputation: 1181
I have graph data containing millions of records, and I am trying to figure out query optimization methods in Neo4j.
For example, I have a query like:
MATCH ((a1:App{appId:1}) <- [:PAID_BY] - (k:Keyword{countryCode:'US'}) - [:PAID_BY] -> (a2:App{appId:2}))
return distinct k.value
limit 50
Which indexes should I create to optimize this query? Or is there another way to optimize it?
Note: In this query, I am trying to find the keywords that two apps have in common.
Upvotes: 1
Views: 46
Reputation: 12684
Your query collects the keywords that app1 and app2 have in common. You can create indexes on App.appId and Keyword.countryCode, then use this query:
RETURN apoc.coll.toSet(
[ (:App{appId:1}) <- [:PAID_BY] -
(k:Keyword{countryCode:'US'}) -
[:PAID_BY] -> (:App{appId:2}) | k.value]) as keywords
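Here apoc.coll.toSet ensures that the list contains only unique values.
[ ] is a pattern comprehension; it builds a list much like the collect() function does.
The | (pipe) separates the pattern from the expression that is evaluated for each match, so your list will contain k.value. (Filtering, if you need it, goes in a WHERE clause before the pipe.)
We love APOC functions and one-liners, don't we?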
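For reference, here is a minimal sketch of how the two indexes mentioned above could be created. It assumes a recent Neo4j version (4.x or 5.x) that supports named indexes and IF NOT EXISTS; the index names app_app_id and keyword_country_code are made up for illustration. On Neo4j 3.x the older form CREATE INDEX ON :App(appId) would be needed instead.

```cypher
// Assumption: Neo4j 4.x/5.x syntax; the index names are illustrative only.
CREATE INDEX app_app_id IF NOT EXISTS
FOR (a:App) ON (a.appId);

CREATE INDEX keyword_country_code IF NOT EXISTS
FOR (k:Keyword) ON (k.countryCode);

// Prefix the original query with PROFILE to inspect the execution plan
// and check whether the planner actually uses the new indexes.
PROFILE
MATCH (a1:App {appId: 1})<-[:PAID_BY]-(k:Keyword {countryCode: 'US'})-[:PAID_BY]->(a2:App {appId: 2})
RETURN DISTINCT k.value
LIMIT 50;
```

If the plan shows a label scan on App rather than an index seek, the index is probably not online yet or the property name does not match. The index on App.appId is likely to matter most here, since it narrows each end of the pattern down to a single node before the relationships are expanded.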
Upvotes: 2