user201411

Reputation: 254

Returning most common paths in neo4j

I have a very simple structure:

U1-:VISITS->P1-:VISITS->P2-:VISITS->P3-:VISITS->P4...

Each VISITS relationship has a rating on a scale of 1 to 10. I am interested in paths that start with U1-:VISITS->P1-:VISITS->P2 where the first rating is less than 2 and the second rating is greater than 5. Each page node has the page link as a property. After that, I am interested in the next 2 pages the user visits. This should return a list of paths. I want the most frequent paths the user takes, ordered by the number of times they appear. My query doesn't return the correct count of paths. What did I do wrong?

MATCH p=(a)-[r:VISITS]-(b)-[t:VISITS]-(c)-[q*1..2]-(page:Page) WHERE r.rating<2 AND t.rating>5 RETURN EXTRACT (n IN nodes(p)|n.page_id) ,count(p) ORDER BY count(p) DESC;

For example:

U1->P1->P2
U2->P1->P2
U3->P3->P4

should have

P1,P2  2
P3,P4  1

as the final result.

EDIT: This is my solution that returns the correct result for the above problem (u->p1->p2):

MATCH p=(a)-[r:VISITS]-(b:Page)-[t:VISITS]-(page:Page) WHERE r.rating<2 AND t.rating>5 WITH EXTRACT (n IN nodes(p)|n.page_id) AS my_pages,t AS rels RETURN DISTINCT(my_pages) AS pages,count(DISTINCT rels) as count;

I need to extend it now to include longer paths.

Upvotes: 2

Views: 612

Answers (1)

Brian Underwood

Reputation: 10856

The first thing that I notice (and it may have just been a transcription error) is that there are no directions on the relationships. Also, you're not using labels, so you could be matching on any sub-section of the path. This might work better:

MATCH p=(a:User)-[r:VISITS]->(b:Page)-[t:VISITS]->(c:Page)-[q*1..2]->(page:Page)
  WHERE r.rating<2 AND t.rating>5
  RETURN EXTRACT (n IN nodes(p)|n.page_id) ,count(p)
  ORDER BY count(p) DESC;

If you don't have labels, you could maybe also add WHERE NOT(()-[:VISITS]->(a)), so that a can only match a node with no incoming VISITS relationship, i.e. the start of a chain.
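
As a rough sketch of that label-free variant (untested against your data, so treat it as a starting point): the :VISITS type on the variable-length part, the nodes(p)[1..] slice and the pages/freq aliases are my own assumptions, not something from your query. The slice assumes the first node is the user and has no page_id. The list comprehension is equivalent to EXTRACT.

```cypher
// Label-free sketch: anchor `a` at the start of a chain instead of relying on :User.
MATCH p=(a)-[r:VISITS]->(b)-[t:VISITS]->(c)-[:VISITS*1..2]->(page)
WHERE NOT (()-[:VISITS]->(a))   // `a` has no incoming VISITS, so it starts the chain
  AND r.rating < 2 AND t.rating > 5
// nodes(p)[1..] skips the starting node (assumed to be the user, with no page_id)
RETURN [n IN nodes(p)[1..] | n.page_id] AS pages, count(p) AS freq
ORDER BY freq DESC;
```

Note that *1..2 matches both the one-step and the two-step continuation, so a user who visits four pages contributes a three-page row as well as a four-page row. If you only want exactly the next 2 pages, *2 should do it.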

Upvotes: 2
