Odomontois

Reputation: 16308

Execute query lazily in Orient-DB

In our current project we need to find the cheapest paths in an almost fully connected graph, which can contain many edges per vertex pair.

We developed a plugin containing functions

  1. for a special traversal of this graph that reduces recurrences of similar paths during TRAVERSE execution. We will refer to it as search()
  2. for efficient extraction of the desired information from the results of such traversals. We will refer to it as extract()
  3. for extracting the best N records according to a target parameter without a costly ORDER BY. We will refer to it as best()
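
To illustrate the idea behind item 3, here is a minimal sketch of keeping the best N costs in a bounded max-heap instead of sorting everything. The class name BestN and its methods are hypothetical and are not the plugin's real best() implementation; it only assumes that each path can be reduced to a numeric cost where lower is better.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper (not the plugin's real best()): keeps the N cheapest costs seen so far.
final class BestN {
    private final int limit;
    // Max-heap: the head is the most expensive of the costs currently kept.
    private final PriorityQueue<Double> heap = new PriorityQueue<>(Collections.reverseOrder());

    BestN(int limit) {
        if (limit <= 0) throw new IllegalArgumentException("limit must be positive");
        this.limit = limit;
    }

    // Offers one cost; returns true if it was kept among the best N.
    boolean offer(double cost) {
        if (heap.size() < limit) {
            heap.add(cost);
            return true;
        }
        if (cost < heap.peek()) {
            heap.poll();      // drop the current worst
            heap.add(cost);
            return true;
        }
        return false;
    }

    // Cost a new candidate must beat; +infinity while fewer than N costs are kept.
    double bound() {
        return heap.size() < limit ? Double.POSITIVE_INFINITY : heap.peek();
    }

    // Kept costs in ascending order.
    List<Double> sorted() {
        List<Double> out = new ArrayList<>(heap);
        Collections.sort(out);
        return out;
    }
}
```

Each offer costs O(log N) rather than the O(M log M) of sorting all M candidates, and bound() is the value a pruning traversal would compare against.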

But the resulting query still performs unsatisfactorily on the full data.

So we decided to modify the search() function so that it visits the best edges first and prunes paths that definitely lead to an undesired result, using the current state of the best() function.
The overall solution is effectively a flexible implementation of the branch and bound method.
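
For readers unfamiliar with branch and bound, the following is a rough standalone sketch of the pruning idea, outside OrientDB. Everything in it (the BranchAndBound class, its Edge type, the string vertex names) is hypothetical, it tracks only the single cheapest path rather than the best N, and it assumes edge costs are non-negative, since otherwise a partial path that is already too expensive could still become cheaper later.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration of branch and bound on a multigraph; not the plugin's code.
final class BranchAndBound {
    // One directed edge; several edges may connect the same vertex pair.
    static final class Edge {
        final String to;
        final double cost;
        Edge(String to, double cost) { this.to = to; this.cost = cost; }
    }

    private final Map<String, List<Edge>> graph = new HashMap<>();
    private double bestCost = Double.POSITIVE_INFINITY;

    void addEdge(String from, String to, double cost) {
        graph.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(to, cost));
    }

    // Returns the cheapest path cost, or +infinity if target is unreachable.
    double cheapest(String source, String target) {
        bestCost = Double.POSITIVE_INFINITY;
        search(source, target, 0.0, new HashSet<>());
        return bestCost;
    }

    private void search(String vertex, String target, double costSoFar, Set<String> onPath) {
        // Bound: with non-negative costs this branch cannot beat the best result so far.
        if (costSoFar >= bestCost) return;
        if (vertex.equals(target)) {
            bestCost = costSoFar;
            return;
        }
        onPath.add(vertex);
        List<Edge> edges = new ArrayList<>(graph.getOrDefault(vertex, new ArrayList<>()));
        // Branch: try the cheapest edges first so a good bound is found early.
        edges.sort(Comparator.comparingDouble(e -> e.cost));
        for (Edge e : edges) {
            if (!onPath.contains(e.to)) {
                search(e.to, target, costSoFar + e.cost, onPath);
            }
        }
        onPath.remove(vertex);
    }
}
```

In this sketch the bound lives in a field of the same object. In the query form discussed here the bound is owned by best() while the branching happens in search(), which is why the two functions need a way to share state.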

The resulting query (omitting the extract() step) should look like this:

SELECT best(path, <limit>) FROM (
   TRAVERSE search(<params>) FROM #<starting_point>
   WHILE <conditions on intermediate vertices>
  ) WHERE <conditions on result elements> 

This form is highly desirable, because it lets us adapt the conditions under WHILE and WHERE to the task at hand. The path field is generated by search() and contains all the information best() needs to proceed.

The trouble is that the best() function is executed strictly after the search() function, so search() cannot prune non-optimal branches according to results already evaluated by best().

So the question is:
Is there a way to pipeline results from the TRAVERSE step to the SELECT step, so that later paths are TRAVERSEd with search() only after earlier paths have been handled by SELECT with best()?

Upvotes: 3

Views: 178

Answers (1)

Luigi Dell'Aquila

Reputation: 2814

The query execution in this case will be streamed. If you add a

 System.out.println()

or put a breakpoint in your functions, you'll see that the invocation sequence will be

search
best
search
best
search
...

You can use a ThreadLocal object (http://docs.oracle.com/javase/7/docs/api/java/lang/ThreadLocal.html) to store some context data and share it between the two functions, or you can use the OCommandContext (the last parameter of the OSQLFunction.execute() method) to store context information.

You can use context.getVariable() and context.setVariable() for this.

The contexts of the two queries (the parent and the inner query) are different, but they should be linked by a parent/child relationship, so you should be able to reach one from the other using OCommandContext.getParent().
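
As a rough illustration of the ThreadLocal option, here is a minimal sketch using only the JDK. The PruningBound class and its method names are hypothetical, and it assumes that search() and best() are invoked on the same thread for a given query, which the streamed invocation sequence above suggests but which you should verify for your setup.

```java
// Hypothetical holder shared by search() and best() on the same thread.
final class PruningBound {
    // Each thread starts with "no bound yet", so nothing is pruned at first.
    private static final ThreadLocal<Double> BOUND =
            ThreadLocal.withInitial(() -> Double.POSITIVE_INFINITY);

    private PruningBound() {}

    // Intended to be called from best(): publish the cost a new path must beat.
    static void publish(double bound) {
        BOUND.set(bound);
    }

    // Intended to be called from search(): true if a partial path with this
    // cost can no longer enter the result (assumes non-negative edge costs).
    static boolean shouldPrune(double costSoFar) {
        return costSoFar >= BOUND.get();
    }

    // Call when the query finishes so a pooled thread does not keep a stale bound.
    static void reset() {
        BOUND.remove();
    }
}
```

With this sketch, best() would call PruningBound.publish(...) whenever its worst kept cost changes, and search() would call PruningBound.shouldPrune(...) before expanding a vertex. Storing the same value through context.setVariable() and reading it with context.getVariable() would avoid the static state, at the price of having to navigate between the parent and inner contexts.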

Upvotes: 2
