tarka
tarka

Reputation: 5587

OrientDB Time Span Search Query

In OrientDB I have setup a time series using this use case. However, instead of appending my Vertex as an embedded list to the respective hour I have opted to just create an edge from the hour to the time dependent Vertex.

For arguments sake lets say that each hour has up to 60 time Vertex each identified by a timestamp. This means I can perform the following query to obtain a specific desired Vertex:

SELECT FROM ( SELECT expand( month[5].day[12].hour[0].out() ) FROM Year WHERE year = 2015) WHERE timestamp = 1434146922

I can see from the use case that I can use UNION to get several specified time branches in one go.

SELECT expand( records ) FROM (
  SELECT union( month[3].day[20].hour[10].out(), month[3].day[20].hour[11].out() ) AS records
  FROM Year WHERE year = 2015
)

This works fine if you only have a small number of branches but it doesn't work very well if you want to get all the records for a given time span. Say you wanted to get all the records between;

month[3].day[20].hour[11] -> month[3].day[29].hour[23]

I could iterate through the timespan and create a huge union query but at some point I guess the query would be too long and my guess is that it wouldn't be very efficient. I could also completely bypass the time branches and query the Vectors directly based on the timestamp.

SELECT FROM Sector WHERE timestamp BETWEEN 1406588622 AND 1406588624

The problem being that you loose all efficiencies gained by the time branches.

Upvotes: 1

Views: 406

Answers (1)

AlexB
AlexB

Reputation: 3548

By experimenting and reading a bit about data types in orientdb, I found that :

The squared brackets allow to :

  • filtering by one index, example out()[0]
  • filtering by multiple indexes, example out()[0,2,4]
  • filtering by ranges, example out()[0-9]

OPTION 1 (UPDATE) :

Using a union to join on multiple time is the only option if you don't want to create all indexes and if your range is small. Here is a query exemple using union in the documentation.

OPTION 2 :

If you always have all the indexes created for your time and if you filter on wide ranges, you should filter by ranges. This is more performant then option 1 for the cost of having to create all indexes on which you want to filter on. Official documentation about field part.

This is how the query would look like :

select 
    *
from 
(
    select 
        expand(hour[0-23].out()) 
    from 
       (select 
            expand(month[3].day[20-29]) 
       from 
            Year 
       where 
            year = 2015)
)
where timestamp > 1406588622

I would highly recommend reading this.

Upvotes: 5

Related Questions