Pouria
Pouria

Reputation: 431

Improve performance for querying count

I have table in my website to show a list of links and the number of times they have been visited. Here's the cypher query I use to get such a data:

MATCH (u:USER {email: $email})-[:CREATED]->(l:URL)
OPTIONAL MATCH (l)<-[v:VISITED]-(:VISIT)
RETURN l, COUNT(v) AS count
LIMIT 10

I create a VISIT node for each visit for a URL in order to store analytics data for each visit. So in the above code, I grab the links that a user have created and count the visits for each one.

The problem is the above query is not performant. Now that data has got huge, it takes at least 8 seconds to resolve.

Any ways to improve this query?

Upvotes: 0

Views: 33

Answers (2)

InverseFalcon
InverseFalcon

Reputation: 30397

For the :VISITED relationships, if those only connect :VISIT nodes to :URL nodes, then you can use the size() function on the pattern, excluding the node label, which will get the degree information from the :URL node itself without having to expand out (you can confirm this by doing a PROFILE or EXPLAIN of the plan and expand all elements, look for GetDegreePrimitive in the Projection operation).

Also, since you're using LIMIT 10 without any kind of ordering, it's better to do the LIMIT earlier so you only perform subsequent operations with the limited set of nodes rather than doing all the work for all the nodes then only keeping 10.

MATCH (u:USER {email: $email})-[:CREATED]->(l:URL)
WITH l
LIMIT 10
RETURN l, size((l)<-[:VISITED]-()) as count

Also, as noted by cybersam, you'll absolutely want an index on :USER(email) so lookup to your specific :USER node is fast.

Upvotes: 1

cybersam
cybersam

Reputation: 66989

In addition to @InverseFalcon's suggestions, you should either create an index or uniqueness constraint on :USER(email), to avoid having to scan through all USER nodes to find the one of interest.

Upvotes: 1

Related Questions