Reputation: 3632
I am using python+gremlin to implement my graph queries, but I am still far from understanding many of the concepts, and I have run into an interesting query that I don't know how to write.
Let's say we have a number of chef vertices with label Chef, ingredient vertices with label Ingredient, and dish vertices with label Dish. At any given time, a chef can have ingredients at hand to use, indicated with an edge between Chef and Ingredient called has. Dishes have ingredients, indicated with an edge between Dish and Ingredient called uses. There is also an edge between Chef and Dish indicating if he/she has made it before, called madeBefore.
Probably obvious, but there are Dishes that a chef has never made, and not all Dishes use all ingredients, and a chef most likely doesn't have all ingredients.
I would like to create a query which does the following:
Get the Dishes that the chef has never made, sorted by how many of each dish's ingredients the chef already has (getting the ratio too would be great). So the first dishes in the results are ones the chef has never made and may have all the ingredients for; somewhere in the middle are dishes they have never made and have around half the ingredients for; and last are dishes they have never made and have almost none of the ingredients for.
The following query will find all dishes that the chef has never made:
g.V()\
    .hasLabel("Dish")\
    .filter(
        __.not_(
            __.in_("madeBefore").has("Chef", "name", "chefbot1")
        ))\
    .valueMap(True)\
    .toList()
But from here I have no idea how to start sorting those dishes based on how many of the ingredients the chef has.
My other idea was to query over ingredients instead, using project to get the counts of edges connecting both the chef and the dish and then filter on those in some way, but I don't know what to do next:
g.V()\
    .hasLabel("Ingredient")\
    .filter(
        __.in_("has").has("Chef", "name", "chefbot1"))\
    .project("v", "dishesUsingIngredient")\
    .by(__.valueMap(True))\
    .by(__.inE().hasLabel("uses").count())\
    .order().by("dishesUsingIngredient", Order.desc)\
    .toList()
My problem right now with Gremlin is understanding how to chain together more complicated queries. Can anyone shed some light on how to solve this kind of problem?
Upvotes: 2
Views: 550
Reputation: 2856
If I understood your description correctly, you can do something like this:
g.V().hasLabel('Dish').
  filter(__.not(__.in('madeBefore').
    has('Chef', 'name', 'chefbot1'))).
  group().by('name').
    by(out('uses').in('has').
      has('Chef', 'name', 'chefbot1').count()).
  order(local).by(values, desc)
example: https://gremlify.com/8w
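To sanity-check what the traversal returns, and to see the ratio the question asks about, it can help to model the expected result in plain Python first. The following is only a sketch of the ranking logic, not Gremlin: the function name rank_unmade_dishes and the sample dishes and ingredients are made up for illustration, and ordering by owned count first and ratio second is an assumption about the tie-breaking you want.

```python
from typing import Dict, List, Set, Tuple


def rank_unmade_dishes(
    dish_ingredients: Dict[str, Set[str]],
    chef_has: Set[str],
    made_before: Set[str],
) -> List[Tuple[str, int, float]]:
    """Return (dish, owned_count, ratio) for dishes not yet made, best match first."""
    rows = []
    for dish, needed in dish_ingredients.items():
        if dish in made_before:
            continue  # plays the role of the not(in('madeBefore')...) filter
        owned = len(needed & chef_has)  # ingredients the dish uses that the chef has
        ratio = owned / len(needed) if needed else 0.0
        rows.append((dish, owned, ratio))
    # Highest owned count first, then highest ratio; dish name breaks remaining ties.
    rows.sort(key=lambda r: (-r[1], -r[2], r[0]))
    return rows


# Hypothetical sample data standing in for the uses / has / madeBefore edges.
dish_ingredients = {
    "omelette": {"egg", "butter", "salt"},
    "pancakes": {"egg", "flour", "milk", "butter"},
    "salad": {"lettuce", "tomato"},
    "toast": {"bread", "butter"},
}
chef_has = {"egg", "butter", "salt", "flour"}
made_before = {"toast"}

ranking = rank_unmade_dishes(dish_ingredients, chef_has, made_before)
for dish, owned, ratio in ranking:
    print(f"{dish}: {owned} owned, ratio {ratio:.2f}")
```

With this sample data, toast is excluded because it was made before, and both omelette and pancakes have three owned ingredients, so the ratio decides their order. In the graph, the denominator for the ratio would be the number of uses edges leaving each Dish.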
Upvotes: 1