Hadi Awad

Reputation: 23

Using the Union Method of a Gremlin Query and Its Effect on Neptune Transactions

Suppose I have a graph on AWS Neptune and I want to fetch several pieces of information about a vertex.

I used the union() step to fetch them in one traversal with the following query:

this.graphTraversalSource.V().hasLabel("label1").hasId("id1")
        .union(__.valueMap(true),
               __.outE().groupCount().by(T.label).unfold(),
               __.inE("created_post").outV().valueMap(true))

Put simply, the query tries to get the following:

  1. The attributes of the vertex itself.
  2. The out-edges, grouped by label, and their counts.
  3. The attributes of the user who created the post with id id1.

A sample result of the above query on Gremlin Server looks like this:

==>{label=label1, id=id1, text=[My sample text]}
==>has_comment=1
==>has_like=1
==>{label=user, id=user1}

Now suppose I write this in Java. As you know, the traversal must end with a terminal step such as next() or iterate(). If I call next(), only the first item of the result above is returned: {label=label1, id=id1, text=[My sample text]}. If I call next(4) instead, I get the full result. Alternatively, I know I can use hasNext() and next() on the GraphTraversal to fetch the results as needed without using next(4).

I am concerned about the transactions being submitted to Neptune, because according to the Neptune documentation:

Transactions

Neptune opens a new transaction at the beginning of each Gremlin traversal and closes the transaction upon the successful completion of the traversal. The transaction is rolled back when there is an error.

Multiple statements separated by a semicolon (;) or a newline character (\n) are included in a single transaction. Every statement other than the last must end with a next() step to be executed. Only the final traversal data is returned.

Manual transaction logic using tx.commit() and tx.rollback() is not supported.

Based on the above, my understanding is that each call to .next() causes a separate transaction on Neptune.

With all that said, my questions are:

  1. Is my understanding of how Neptune handles transactions correct?
  2. If yes, does that mean a transaction occurs each time I call .next()?
  3. How does .next(int) behave? For example, does .next(4) mean that 4 transactions take place?
  4. How can I optimize this so that all the needed data is fetched with one database access and one transaction? That is, is there a better way to accomplish this?
  5. How can I get all results of the above query in one line, without using hasNext() and next()? Is there a way to know the size of the result set so that I can call next(result-set-size)?

Thanks, Hadi

Upvotes: 2

Views: 1776

Answers (1)

stephen mallette

Reputation: 46226

You can terminate the traversal in a variety of ways beyond next() and iterate() - for example toList() grabs all the results and packages them into a List object. I'd say that's the most common way people terminate traversals (and easiest to work with), assuming you don't mind realizing a List in memory on the client.
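Once toList() has returned, the mixed results of the union() still need to be told apart on the client. The following is only a sketch of one way to do that in plain Java, under some assumptions: the class and method names (UnionResultSplitter, PostInfo, split) are hypothetical, the vertex and user rows are assumed to arrive as maps that contain a label key, and the edge counts are assumed to arrive either as Map.Entry objects or as small maps without a label key, depending on how the driver deserializes them. The labelKey parameter is whatever key the maps use for the label; with a TinkerPop client that would presumably be T.label, and a plain string is used as a stand-in here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: splits the List<Object> produced by
//   ...union(valueMap(true), outE()...unfold(), inE(...).outV().valueMap(true)).toList()
// into its three logical parts. It works purely on client-side data.
public class UnionResultSplitter {

    /** Holds the three logical parts of the union() result. */
    public static final class PostInfo {
        public final Map<Object, Object> post = new LinkedHashMap<>();
        public final Map<String, Long> outEdgeCounts = new LinkedHashMap<>();
        public final List<Map<Object, Object>> creators = new ArrayList<>();
    }

    /**
     * @param results   the list returned by toList()
     * @param labelKey  key under which each property map stores its label (assumption)
     * @param postLabel label of the vertex the traversal started from, e.g. "label1"
     */
    public static PostInfo split(List<Object> results, Object labelKey, String postLabel) {
        PostInfo info = new PostInfo();
        for (Object item : results) {
            if (item instanceof Map.Entry) {
                // An unfolded groupCount() entry such as has_comment=1
                addCount(info, (Map.Entry<?, ?>) item);
            } else if (item instanceof Map) {
                Map<?, ?> map = (Map<?, ?>) item;
                if (!map.containsKey(labelKey)) {
                    // No label key: assume the map wraps one or more edge counts
                    for (Map.Entry<?, ?> e : map.entrySet()) {
                        addCount(info, e);
                    }
                } else if (postLabel.equals(map.get(labelKey))) {
                    info.post.putAll(map);
                } else {
                    // Any other labelled map is treated as a creator (user) vertex
                    info.creators.add(new LinkedHashMap<Object, Object>(map));
                }
            }
        }
        return info;
    }

    private static void addCount(PostInfo info, Map.Entry<?, ?> e) {
        info.outEdgeCounts.put(String.valueOf(e.getKey()),
                ((Number) e.getValue()).longValue());
    }
}
```

With something like this, a single toList() call followed by UnionResultSplitter.split(results, labelKey, "label1") would give typed access to the post attributes, the edge counts, and the creator, without needing to know the result size in advance.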

Since Neptune uses TinkerPop drivers, I can say that calls to next() (or toList() or other terminators) on bytecode-based traversals will not trigger additional requests to Neptune and therefore will not start new transactions. Calls to next(), toList() and other terminating steps operate on data streamed back to the client.

I believe that this portion of the documentation you pointed out:

Multiple statements separated by a semicolon (;) or a newline character (\n) are included in a single transaction. Every statement other than the last must end with a next() step to be executed. Only the final traversal data is returned.

is related to script-based execution and probably doesn't apply, since you appear to be using bytecode-based traversals.

Upvotes: 2
