siddharth ubale

Reputation: 53

Titan DB Aggregations

I want to use Titan DB (storage backend: HBase) to perform aggregations on the graph data we maintain. We aim to store our data warehouse data as a graph in Titan DB. However, aggregations take a long time, and I am using a single Titan instance. Steps followed:

  1. Created the graph: 4.5 lakh (450,000) vertices and 4 lakh (400,000) edges.
  2. Created indexes on vertices and edges.
  3. Enabled database caching.

When I traverse the graph to a depth of 4 to sum a property over approximately 8,000 vertices, the first query takes about 30 seconds to respond. Subsequent queries are served in under one second until the database cache is flushed at 3 minutes. After that, it again takes about 30 seconds to rebuild the cache and return the response. Has anyone had a similar use case, and does anyone have suggestions for making Titan perform aggregations faster? I am expecting real-time performance from Titan.
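For reference, the caching setup described above would correspond roughly to a Titan properties file like the sketch below. The 180000 ms expiry is inferred from the 3-minute flush I observe, and the cache size is an assumed placeholder, not my exact setting.

```properties
# Storage backend used in this setup
storage.backend = hbase

# Database-level cache (step 3 above)
cache.db-cache = true

# Expiry time for cache entries, in milliseconds.
# 180000 ms = 3 minutes, matching the observed flush interval.
# The Titan docs describe 0 as disabling time-based expiry.
cache.db-cache-time = 180000

# Assumed placeholder: values between 0 and 1 are read as a fraction of the JVM heap.
cache.db-cache-size = 0.25
```

Since queries are fast while the cache is warm, the expiry time looks like the first setting to examine. With a single Titan instance there are no other instances writing behind the cache, so a longer expiry (or none) may be acceptable here.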

Upvotes: 0

Views: 130

Answers (1)

Filipe Teixeira

Reputation: 3565

You might know this already but I will post what we did to get some performance boosts from Titan. The list here is all based on this chapter of the Titan Docs.

  1. Composite Indices - You probably have these, but it's worth mentioning as they greatly speed up direct lookups of specific vertices.
  2. Vertex Centric Indices - If you have supernodes, these can very quickly eliminate the edges you don't need to traverse.
  3. Mixed Indices - These are great for any operation that requires numeric ranges or ordering, and Elasticsearch is a very powerful indexing tool.

If the problem is not reading but writing, you could also try bulk loading.
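As a minimal sketch, bulk loading is switched on through the graph configuration. This assumes your data is already consistent before it is loaded, because this mode relaxes some of Titan's internal consistency checks (locking in particular).

```properties
# Enable Titan's batch-loading mode for the initial import.
# Only safe when the input data is already consistent.
storage.batch-loading = true
```

You would normally turn this back off once the import is done and the graph is serving regular transactional traffic.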

Upvotes: 1
