Mariano Soto

Reputation: 136

How to query Gremlin when multiple connections between nodes are present

I'm trying to build a suggestion engine using Gremlin, but I'm having a hard time understanding how to write a query when multiple nodes are connected through different intermediate nodes.

Playground: https://gremlify.com/alxrvpfnlo9/2

Graph: [image not shown]

In this simple example I have two users who both like cheese and bread. User2 also likes sandwiches, which seems like a good suggestion for User1, since he shares some interests with User2.

The question I'm trying to answer is: "What can I suggest to User1 based on what other users like?"

The answer should be: everything liked by other users who like the same things as User1, excluding what User1 already likes. In this case it should return a sandwich.

So far I have this query:

g.V(2448600).as('user1')
 .out().as('user1Likes')
 .in().where(neq('user1')) // to get to User2
 .out().where(neq('user1Likes')) // to get to what User2 likes but excluding items that User1 likes

Which returns:

Sandwich, bread, Sandwich (again), cheese

I think it returns that data because it walks the graph through the Cheese node first, so Bread is not in 'user1Likes' at that point and is therefore not excluded from the final result. Then it walks through the Bread node, and this time Cheese is not excluded, so it shows up as a suggestion.

Any ideas or suggestions on how to write that query? Keep in mind that it should scale to many users and ingredients.

Upvotes: 0

Views: 741

Answers (1)

Bassem

Reputation: 3006

I suggest that you model your problem differently. Normally the vertex label is used to determine the type of the entity, not to identify the entity. In your case, I think you need two vertex labels: "user" and "product".

Here is the code that creates the graph.

g.addV('user').property('name', 'User1').as('user1').
  addV('user').property('name', 'User2').as('user2').
  addV('product').property('name', 'Cheese').as('cheese').
  addV('product').property('name', 'Bread').as('bread').
  addV('product').property('name', 'Sandwiches').as('sandwiches').
  addE('likes').from('user1').to('cheese').
  addE('likes').from('user1').to('bread').
  addE('likes').from('user2').to('cheese').
  addE('likes').from('user2').to('bread').
  addE('likes').from('user2').to('sandwiches')

And here is the traversal that gets the recommended products for "User1".

g.V().has('user', 'name', 'User1').as('user1').
  out('likes').aggregate('user1Likes').
  in('likes').
  where(neq('user1')).
  dedup().
  out('likes').
  where(without('user1Likes')).
  dedup()

The aggregate step aggregates all the products liked by "User1" into a collection named "user1Likes".

The without predicate passes only the vertices that are not within the collection "user1Likes".
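To see why this differs from the query in the question, here is a rough sketch in plain Python (not Gremlin, and not the TinkerPop API). It assumes the same two users and three products. The function names `per_path_filter` and `collection_filter` are made up for illustration. The first mimics `as('user1Likes')` with `where(neq('user1Likes'))`, which only compares against the single product on the current path. The second mimics `aggregate('user1Likes')` with `where(without('user1Likes'))` and the two `dedup()` steps, which compares against the whole collection.

```python
# Plain-Python model of the two traversals; hypothetical helper names,
# not Gremlin or TinkerPop API.
likes = {
    "User1": ["Cheese", "Bread"],
    "User2": ["Cheese", "Bread", "Sandwiches"],
}


def liked_by(product):
    """Users that like the given product (models in('likes'))."""
    return [user for user, products in likes.items() if product in products]


def per_path_filter(user):
    """Models as('user1Likes') + where(neq('user1Likes')).

    Each path remembers only the one product it passed through, so only
    that product is excluded on that path.
    """
    results = []
    for liked in likes[user]:
        for other in liked_by(liked):
            if other == user:
                continue
            for candidate in likes[other]:
                if candidate != liked:
                    results.append(candidate)
    return results


def collection_filter(user):
    """Models aggregate('user1Likes') + where(without('user1Likes')) + dedup()."""
    own = set(likes[user])  # everything the user likes, collected up front
    others = []
    for liked in likes[user]:
        for other in liked_by(liked):
            if other != user and other not in others:  # first dedup()
                others.append(other)
    results = []
    for other in others:
        for candidate in likes[other]:
            if candidate not in own and candidate not in results:  # second dedup()
                results.append(candidate)
    return results


print(per_path_filter("User1"))    # ['Bread', 'Sandwiches', 'Cheese', 'Sandwiches']
print(collection_filter("User1"))  # ['Sandwiches']
```

The first result has the same four items reported in the question (the order in a real traversal may differ), and the second contains only the sandwich.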

Upvotes: 1
