Ajay

Reputation: 2643

Efficient way to increment a vertex counter property in JanusGraph

I am using JanusGraph 0.2.0 with a Cassandra backend and Elasticsearch (ES).

I want to store the number of views in a vertex property, and I need an efficient, scalable way to increment and store the view count without impacting read performance.

  1. Read the views property from the graph while fetching the vertex, then write the new view count in a separate query. (This won't impact read performance, but the counter is not synchronised.)

    g.V().has("key","keyId").valueMap(true);
    g.V(id).property('views', 21);
    
  2. Use a sack to read the current views value, add 1 to it, and write the result back to the views property.

    g.withSack(0).V().has("key","keyId").
       sack(assign).by("views").sack(sum).by(constant(1)).
       property("views", sack())
    
  3. Use an in-memory store (Redis) to increment counters, and persist the updates to the graph periodically.
  4. Is there any other, better approach?

Is there any way to use Cassandra's counter functionality in JanusGraph?

Upvotes: 2

Views: 523

Answers (1)

Oleksandr

Reputation: 3744

There is no way to use Cassandra counters with JanusGraph. What is more, counters cannot be used in a regular Cassandra table: counter columns have to live in a dedicated counter table. Cassandra counters are designed so that updating a counter does not require a lock, which is why you get a lot of limitations in exchange for great performance.

Counting views isn't that easy a task. In short, my suggestion would be to go with option 3.

I would go with Redis and periodic updates to JanusGraph if you are in a single data center and a single master Redis server can handle all requests. (You can of course use a hash ring to split your counters among different Redis servers, but that increases complexity and maintenance costs.)
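To illustrate the hash ring idea, here is a minimal sketch of consistent hashing that maps a counter key to one of several servers. The server names, the `views:` key format, and the `HashRing` class are all hypothetical, and the sketch only picks a server name; it does not talk to Redis.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring mapping counter keys to server names."""

    def __init__(self, servers, replicas=100):
        # Each server is placed on the ring at `replicas` virtual points
        # so that keys are spread more evenly.
        self._ring = sorted(
            (self._hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        # SHA-256 is used only to distribute keys, not for security.
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def server_for(self, key):
        # Take the first virtual point clockwise from the key's hash,
        # wrapping around to the start of the ring.
        index = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._ring[index][1]


# Hypothetical server names and key format, for illustration only.
ring = HashRing(["redis-a", "redis-b", "redis-c"])
shard = ring.server_for("views:keyId")
```

The same key always lands on the same server, and adding a server only moves the keys that the new server takes over, which is what keeps resharding cheap compared to `hash(key) % n`.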

If you have multiple data centers, or a single master Redis server cannot handle all requests, I would go with Cassandra counters.

If you have such a large volume of view events that even Cassandra counters (with their cache) cannot handle all requests because the disk is accessed too often, and you cannot scale further because of the cost, then the logic gets harder. I have never been in that situation, so this is only theoretical. In this case I would have the application servers cache and group views and periodically send the cached data to RabbitMQ workers, which would update the Cassandra counters and then update the corresponding vertex in JanusGraph with the total view count. That way, views of frequently viewed vertices are grouped, so instead of updating a counter with +1 each time we update it with +100 or +1000 in a single update. This would greatly decrease disk usage, and you would have eventually consistent, fast counters. Again, this solution is only theoretical and should be tested. I believe other solutions also exist.
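The grouping step on the application server could look roughly like the sketch below. It is an illustration only: `ViewBatcher` and its `sink` callable are hypothetical names, and the sink here just appends to a list, where a real one might publish to RabbitMQ or issue the counter update.

```python
from collections import Counter
import threading


class ViewBatcher:
    """Accumulates view events in memory and flushes them as grouped deltas."""

    def __init__(self, sink):
        # `sink` is a hypothetical callable taking (vertex_key, delta).
        self._sink = sink
        self._pending = Counter()
        self._lock = threading.Lock()

    def record_view(self, vertex_key):
        # Called on every view; only touches memory.
        with self._lock:
            self._pending[vertex_key] += 1

    def flush(self):
        # Swap out the pending counts under the lock, then send them outside
        # it so a slow sink does not block incoming views.
        with self._lock:
            batch, self._pending = self._pending, Counter()
        for vertex_key, delta in batch.items():
            self._sink(vertex_key, delta)
        return len(batch)


# Stand-in sink that just records what would be sent downstream.
sent = []
batcher = ViewBatcher(lambda key, delta: sent.append((key, delta)))
for _ in range(1000):
    batcher.record_view("keyId")
batcher.record_view("otherKey")
batcher.flush()
```

Here 1001 view events turn into two downstream updates, one of them a single +1000. In practice `flush` would run on a timer, and views recorded since the last flush are lost if the process dies, which is part of the eventual-consistency trade-off.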

Upvotes: 1
