John Stephenson

Reputation: 501

Gremlin - how do you merge vertices to combine their properties without listing the properties explicitly?

Background: I'm trying to implement a time-series versioned DB using this approach, using Gremlin (TinkerPop 3).

[Image: identity nodes (blue) linked to state nodes (red) by 'state' edges that carry a timestamp range]

I want to get the latest state node (in red) for a given identity node (in blue). The two are linked by a 'state' edge that carries a timestamp range. I want to return a single aggregated object containing the id (cid) from the identity node and all the properties from the state node, without listing those properties explicitly. (8640000000000000 is my way of indicating no 'to' date, i.e. the edge is current, which differs slightly from the image shown.)

I've got this far:

:> g.V().hasLabel('product').
     as('cid').
     outE('state').
     has('to', 8640000000000000).
     inV().
     as('name').
     as('price').
     select('cid', 'name','price').
     by('cid').
     by('name').
     by('price')

=>{cid=1, name="Cheese", price=2.50}
=>{cid=2, name="Ham", price=5.00}

But as you can see, I have to list out the properties of the 'state' node: in the example above, the name and price properties of a product. This will apply to any domain object, so I don't want to list the properties every time. I could run a query beforehand to get the property names, but I shouldn't need two queries and the overhead of two round trips. I've looked at 'aggregate', 'union', 'fold' etc., but nothing seems to do this.

Any ideas?

===================

Edit: Based on Daniel's answer (which doesn't quite do what I want at the moment), I'm going to use his example graph. In the 'modernGraph', person -created-> software. If I run:

> g.V().hasLabel('person').valueMap()
==>[name:[marko], age:[29]]
==>[name:[vadas], age:[27]]
==>[name:[josh], age:[32]]
==>[name:[peter], age:[35]]

then the results are a list of entities with their properties. Assuming that a person can only ever create one piece of software (although hopefully we will see how this could later be opened up to lists of created software), I want to include the created software's 'lang' property in the returned entity, to get:

> <run some query here>
==>[name:[marko], age:[29], lang:[java]]
==>[name:[vadas], age:[27], lang:[java]]
==>[name:[josh], age:[32], lang:[java]]
==>[name:[peter], age:[35], lang:[java]]

The best suggestion so far produces the following:

> g.V().hasLabel('person').union(identity(), out("created")).valueMap().unfold().group().by {it.getKey()}.by {it.getValue()}
==>[name:[marko, lop, lop, lop, vadas, josh, ripple, peter], lang:[java, java, java, java], age:[29, 27, 32, 35]]

I hope that's clearer. If not please let me know.

Upvotes: 7

Views: 7697

Answers (3)

Vinit Siriah

Reputation: 327

Thanks to Daniel Kuppitz and youhans for their answers; they gave me the basic idea for a solution. However, I later found that the solution does not work for multiple rows: a local step is required to handle them. The modified Gremlin query looks like this:

g.V()
.local(
        __.union(__.valueMap(), __.outE().inV().valueMap())
        .unfold().group().by(__.select(Column.keys)).by(__.select(Column.values))
)

    

This will limit the scope of union and group by to a single row.
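
To make the scoping point concrete, here is a minimal plain-Java sketch of the difference between grouping globally and grouping per row. It does not use TinkerPop at all: the class `LocalScopeSketch`, its helpers, and the sample rows are hypothetical and exist only for illustration. It also assumes that colliding keys collect every value into a list; how an actual traversal resolves collisions depends on the by() modulators used.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; no TinkerPop classes are involved.
final class LocalScopeSketch {

    // Hypothetical helper: builds an ordered property map from key/value pairs.
    static Map<String, Object> props(Object... kv) {
        Map<String, Object> m = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            m.put((String) kv[i], kv[i + 1]);
        }
        return m;
    }

    // Unfolds every map into entries and groups the values by key.
    static Map<String, List<Object>> groupByKey(List<Map<String, Object>> maps) {
        Map<String, List<Object>> grouped = new LinkedHashMap<>();
        for (Map<String, Object> m : maps) {
            for (Map.Entry<String, Object> e : m.entrySet()) {
                grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
            }
        }
        return grouped;
    }

    // Global scope: all rows are flattened into one stream before grouping.
    static Map<String, List<Object>> mergeGlobally(List<List<Map<String, Object>>> rows) {
        List<Map<String, Object>> all = new ArrayList<>();
        for (List<Map<String, Object>> row : rows) {
            all.addAll(row);
        }
        return groupByKey(all);
    }

    // Local scope: each row is grouped on its own, giving one merged map per row.
    static List<Map<String, List<Object>>> mergePerRow(List<List<Map<String, Object>>> rows) {
        List<Map<String, List<Object>>> merged = new ArrayList<>();
        for (List<Map<String, Object>> row : rows) {
            merged.add(groupByKey(row));
        }
        return merged;
    }

    public static void main(String[] args) {
        // Each row holds a vertex's own properties plus those of one neighbour.
        List<List<Map<String, Object>>> rows = List.of(
            List.of(props("name", "marko", "age", 29), props("name", "lop", "lang", "java")),
            List.of(props("name", "peter", "age", 35), props("name", "lop", "lang", "java")));

        System.out.println(mergeGlobally(rows));
        System.out.println(mergePerRow(rows));
    }
}
```

The global version prints a single map with every row's values mixed together, `{name=[marko, lop, peter, lop], age=[29, 35], lang=[java, java]}`, while the per-row version prints one merged map per starting vertex.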

If you can work with a custom DSL, you can add a step like this one in Java:

public default GraphTraversal<S, LinkedHashMap> unpackMaps() {
        // Flatten one level: an entry whose value is itself a map is replaced by that map's entries.
        return map(x -> {
            LinkedHashMap mapSource = (LinkedHashMap) x.get();
            LinkedHashMap mapDest = new LinkedHashMap();

            mapSource.forEach((key, obj) -> {
                if (obj instanceof LinkedHashMap) {
                    // Nested valueMap: copy its entries into the top level.
                    mapDest.putAll((LinkedHashMap) obj);
                } else {
                    // Plain value: keep it under its original key.
                    mapDest.put(key, obj);
                }
            });

            return mapDest;
        });
    }

and use it like this:

g.V().as("s")

.valueMap().as("value_map_0")
.select("s").outE("INFO1").inV().valueMap().as("value_map_1")
.select("s").outE("INFO2").inV().valueMap().as("value_map_2")
.select("s").outE("INFO3").inV().valueMap().as("value_map_3")

.select("s").local(__.outE("INFO1").count()).as("value_1")
.select("s").outE("INFO1").inV().values("name").as("value_2")


.project("val_map1","val_map2","val_map3","val1","val2")
.by(__.select("value_map_1"))
.by(__.select("value_map_2"))
.by(__.select("value_map_3"))
.by(__.select("value_1"))
.by(__.select("value_2"))
.unpackMaps()

This results in rows of the form:

 map1_val1, map1_val2, ..., map2_val1, map2_val2, ..., value1, value2

This handles a mix of plain values and valueMaps in a natural Gremlin way.
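
For anyone who wants to check the flattening logic without a graph, the core of unpackMaps can be sketched in plain Java. This is only an approximation under a few assumptions: the class `UnpackMapsSketch` and the sample row are hypothetical, it accepts any `Map` rather than only `LinkedHashMap`, and it unpacks a single level of nesting. If two nested maps share a key, the later one overwrites the earlier one.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; no TinkerPop classes are involved.
final class UnpackMapsSketch {

    // Replaces every map-valued entry with the entries of that nested map;
    // non-map values are kept under their original key.
    static Map<Object, Object> unpack(Map<?, ?> row) {
        Map<Object, Object> flat = new LinkedHashMap<>();
        for (Map.Entry<?, ?> e : row.entrySet()) {
            if (e.getValue() instanceof Map) {
                flat.putAll((Map<?, ?>) e.getValue());
            } else {
                flat.put(e.getKey(), e.getValue());
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        // Hypothetical projected row: one nested valueMap plus one scalar.
        Map<String, Object> nested = new LinkedHashMap<>();
        nested.put("name", List.of("lop"));
        nested.put("lang", List.of("java"));

        Map<String, Object> row = new LinkedHashMap<>();
        row.put("val_map1", nested);
        row.put("val1", 3L);

        System.out.println(unpack(row));
    }
}
```

With the sample row above, the nested map's keys are lifted to the top level and the scalar keeps its own key, so the program prints `{name=[lop], lang=[java], val1=3}`.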

Upvotes: 0

youhans

Reputation: 6859

Merging edge and vertex properties using the Gremlin Java DSL:

g.V().has("User", "id", userDbId).outE(Edges.TWEETS)
    .union(__.identity().valueMap(), __.inV().valueMap())
    .unfold().group().by(__.select(Column.keys)).by(__.select(Column.values))
    .map(v -> converter.toTweet((Map) v.get())).toList();

Upvotes: 0

Daniel Kuppitz

Reputation: 10904

Since you didn't provide a sample graph, I'll use TinkerPop's toy graph to show how it's done.

Assume you want to merge marko and lop:

gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(1).valueMap()
==>[name:[marko],age:[29]]
gremlin> g.V(1).out("created").valueMap()
==>[name:[lop],lang:[java]]

Note that there are two name properties, and in theory you can't predict which name makes it into your merged result; however, that doesn't seem to be an issue in your graph.

Get the properties for both vertices:

gremlin> g.V(1).union(identity(), out("created")).valueMap()
==>[name:[marko],age:[29]]
==>[name:[lop],lang:[java]]

Merge them:

gremlin> g.V(1).union(identity(), out("created")).valueMap().
           unfold().group().by(select(keys)).by(select(values))
==>[name:[lop],lang:[java],age:[29]]

UPDATE

Thank you for the added sample output. That makes it a lot easier to come up with a solution (although I think your output contains errors; vadas didn't create anything).

gremlin> g.V().hasLabel("person").
           filter(outE("created")).map(
             union(valueMap(),
                   outE("created").limit(1).inV().valueMap("lang")).
             unfold().group().by {it.getKey()}.by {it.getValue()})
==>[name:[marko], lang:[java], age:[29]]
==>[name:[josh], lang:[java], age:[32]]
==>[name:[peter], lang:[java], age:[35]]

Upvotes: 10
