Matt Bryson
Matt Bryson

Reputation: 2934

Neo4j : how to implement SKIP and LIMIT on a UNION cypher ? What are the alternatives?

I am trying to create an activity feed across various relationships.

The only way I have managed to do it so far is using UNION to run 3 statements. However, you can't LIMIT and ORDER across the results of all the unions – it appears to be a limitation of Neo at the moment.

Does anyone know how I could re write the below so I can order and limit the whole lot?

match (me:User {username:'bob'})-[:Owner]->(m:Message)<-[r:Likes|:Reply|:Share]-(u:User)
return m as message, u as user, lower(type(r)) as activity, r.created_on as date  
order by date skip 0 limit 25

union match (me:User {username:'bob'})<-[r:Mentions]-(m:Message)-[:Owner]-(u:User) 
return m as message, u as user, lower(type(r)) as activity, r.created_on as date
order by date skip 0 limit 25

union match (me:User {username:'bob'})<-[r:Follows]-(u:User)
return NULL as message, u as user, lower(type(r)) as activity, r.created_on as date
order by date skip 0 limit 25

I got to this, which returns a single column with the data i need as nested properties, but I cant work out how to order the final collection.....

match (me:User {username:'bob'})-[:Owner]->(m:Message)<-[r:Likes|:Reply|:Share]-(u:User)
with collect({activity:lower(type(r)), user:u, message:m, date:r.created_on}) as a1

optional match (me:User {username:'bob'})<-[r:Mentions]-(m:Message)-[:Owner]-(u:User) 
with collect({activity:lower(type(r)), user:u, message:m, date:r.created_on }) as a2, a1 

optional match (me:User {username:'bob'})<-[r:Follows]-(u:User) 
with collect({activity:lower(type(r)), user:u, date:r.created_on }) as  a3, a2, a1

with a3 + a2 +a1 as all
unwind all as activity
return activity
skip 0 limit 25

Any help much appreciated!

UPDATE

So now I have this....

MATCH (me:User { username:'bob' })--(u:User)
OPTIONAL MATCH (me)-[:Owner]->(m:Message)<-[r:Likes|:Reply|:Share]-(u)
WITH me, collect({ type:lower(type(r)), user:u, message:m, date:r.created_on }) AS a1
OPTIONAL MATCH (me)<-[r:Mentions]-(m:Message)<-[:Owner]-(u)
WITH me, collect({ type:lower(type(r)), user:u, message:m, date:r.created_on }) AS a2, a1
OPTIONAL MATCH (me)<-[r:Follows]-(u)
WITH collect({ type:lower(type(r)), user:u, date:r.created_on })+ a2 + a1 AS all
UNWIND all AS activity
WITH activity
WHERE  activity.type is not null
RETURN activity
ORDER BY activity.date
LIMIT 25;

Can anyone see any performance issues with this?

Im matching me--user at the start to make sure I only look for users I have some relationship with. Then at the end I filter out the NUll matches from the OPTIONAL matches. As i'm manually collecting a literal object, it gets an object with NULL entries if nothing was matched, so I just remove these at the end....

All this because you cant filter POST UNION!

Upvotes: 3

Views: 1707

Answers (3)

Brian Underwood
Brian Underwood

Reputation: 10856

I want to steal yours / cybersam's solution with another suggestion. I'm going to work off cybersam's. Firstly, I'm thinking you don't want to only return results where there is always a path from user-[:Owner]->(:Message)-:Likes|:Reply|:Share]-(:User), so that needs to be an optional match. But secondly, I think it will be faster if you only match the end user variable once. Like so:

MATCH (me:User { username:'bob' }), (user:User)
OPTIONAL MATCH (me)-[:Owner]->(m:Message)<-[r:Likes|:Reply|:Share]-(user)
WITH me, collect({ type:lower(type(r)), user:user, message:m, date:r.created_on }) AS a1
OPTIONAL MATCH (me)<-[r:Mentions]-(m:Message)<-[:Owner]-(user)
WITH me, collect({ type:lower(type(r)), user:user, message:m, date:r.created_on }) AS a2, a1
OPTIONAL MATCH (me)<-[r:Follows]-(user)
WITH collect({ type:lower(type(r)), user:user, date:r.created_on }) + a2 + a1 AS all
UNWIND all AS activity
RETURN activity
ORDER BY activity.date
LIMIT 25;

Upvotes: 0

Brian Underwood
Brian Underwood

Reputation: 10856

You could do this:

MATCH (me:User {username:'bob'}), (user:User)
OPTIONAL MATCH (me)-[:Owner]->(m1:Message)<-[message_type:Likes|:Reply|:Share]-(user)
OPTIONAL MATCH (me)<-[mentions:Mentions]-(m2:Message)-[:Owner]-(user) 
OPTIONAL MATCH (me)<-[follows:Follows]-(user)
WITH *
WHERE message_type IS NOT NULL OR mentions IS NOT NULL OR follows IS NOT NULL
RETURN
  COALESCE(m1, m2) AS message,
  user,
  lower(COALESCE(type(message_type), type(mentions), type(follows))),
  COALESCE(message_type.date, mentions.date, follows.date) AS date
ORDER BY date
SKIP 0
LIMIT 25

The main disadvantage here (aside from all the COALESCEs ;) is that since there are only OPTIONAL MATCH paths from me to user that you'll end up matching me against every single user which means your query might grow more slow as you get more users. That's why I put the WHERE in there after the WITH, because you need to filter out all of the cases where there is no path between the two users.

EDIT:

I just realized that this is a problematic solution. If two or more of your OPTIONAL MATCH clauses match then you'll only get one result. You could do something like this:

MATCH (me:User {username:'bob'}), (user:User)
OPTIONAL MATCH (me)-[:Owner]->(message:Message)<-[message_type:Likes|:Reply|:Share]-(user)
OPTIONAL MATCH (me)<-[mentions:Mentions]-(mention_message:Message)-[:Owner]-(user) 
OPTIONAL MATCH (me)<-[follows:Follows]-(user)
WITH *
WHERE message_type IS NOT NULL OR mentions IS NOT NULL OR follows IS NOT NULL
RETURN
  message,
  mention_message
  user,
  type(message_type) AS message_type_type,
  type(mentions) AS mentions_type,
  type(follows) AS follows_type,
  message_type.date AS message_type_date,
  mentions.date AS mentions_date,
  follows.date AS follows_date
ORDER BY date
SKIP 0
LIMIT 25

But then I think you might get repeats of values depending on how you match.

Upvotes: 0

cybersam
cybersam

Reputation: 66989

The ability to do "post-UNION processing" is not yet supported, but is promised "as soon as possible" (see neo4j issue 2725). You may want to add comments to that issue if you would like this to happen soon.

Your attempted solution was close. This query should work better:

MATCH (me:User { username:'bob' })-[:Owner]->(m:Message)<-[r:Likes|:Reply|:Share]-(u:User)
WITH me, collect({ type:lower(type(r)), user:u, message:m, date:r.created_on }) AS a1
OPTIONAL MATCH (me)<-[r:Mentions]-(m:Message)<-[:Owner]-(u:User)
WITH me, collect({ type:lower(type(r)), user:u, message:m, date:r.created_on }) AS a2, a1
OPTIONAL MATCH (me)<-[r:Follows]-(u:User)
WITH collect({ type:lower(type(r)), user:u, date:r.created_on })+ a2 + a1 AS all
UNWIND all AS activity
RETURN activity
ORDER BY activity.date
LIMIT 25;

This query:

  • ORDERS BY the date of the combined activities, so that the 25 earliest activities are returned. This was the main change needed to your query.
  • Eliminates the unneeded SKIP 0 clause.
  • Passes me through the first 2 WITH clauses so that the OPTIONAL MATCH clauses don't have to re-find me again.

Upvotes: 1

Related Questions