Saurabh Kumar

Reputation: 16651

Cassandra batch select and batch update

I have a requirement to update all users with a specific value in a job.

I have millions of users in my Cassandra database. Is it okay to query all of those users first and then do some kind of batch update, or is there an existing implementation for this kind of work? I am using the Hector API to interact with Cassandra. What is the best way to do this?

Upvotes: 1

Views: 2600

Answers (1)

Arya

Reputation: 2153

You never want to fetch 1 million users and keep them locally. Instead, iterate over the user keys with a range query, which Hector exposes as RangeSlicesQuery. There is a good example here:

http://irfannagoo.wordpress.com/2013/02/27/hector-slice-query-options-with-cassandra/

Use null for both the start and end key, and call rangeQuery.setRowCount(100) to fetch 100 rows at a time.

Do this inside a loop. The first query uses null for the start and end key; the last key in each result set becomes the start key of the next query. Since the start key is inclusive, the first row of every later page repeats the last row of the previous page and should be skipped. Keep paginating like this until a query returns fewer rows than the row count.
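
To make the loop concrete, here is a minimal sketch of the paging logic only. It does not call Hector: a sorted in-memory map stands in for the column family, and RangePagingSketch, fetchPage and collectAllKeys are hypothetical names made up for illustration. With Hector, fetchPage would roughly correspond to executing the range query with the given start key and row count. Also note that a real cluster returns rows in partitioner order rather than sorted key order, but the "last key becomes the next start key" pattern is the same.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for a Cassandra column family, for illustration only.
final class RangePagingSketch {

    // Mimics one range query: up to rowCount keys, starting at startKey (inclusive).
    // A null startKey means "from the beginning".
    static List<String> fetchPage(NavigableMap<String, String> store, String startKey, int rowCount) {
        NavigableMap<String, String> tail = (startKey == null) ? store : store.tailMap(startKey, true);
        List<String> page = new ArrayList<>();
        for (String key : tail.keySet()) {
            if (page.size() == rowCount) break;
            page.add(key);
        }
        return page;
    }

    // Walks the whole store page by page and returns every key exactly once.
    static List<String> collectAllKeys(NavigableMap<String, String> store, int rowCount) {
        // With rowCount 1 every page would contain only the repeated start key.
        if (rowCount < 2) throw new IllegalArgumentException("rowCount must be at least 2");
        List<String> seen = new ArrayList<>();
        String startKey = null;
        while (true) {
            List<String> page = fetchPage(store, startKey, rowCount);
            // On every page after the first, row 0 repeats the previous page's last key.
            int from = (startKey == null) ? 0 : 1;
            for (int i = from; i < page.size(); i++) seen.add(page.get(i));
            if (page.size() < rowCount) break; // short page: nothing left to read
            startKey = page.get(page.size() - 1);
        }
        return seen;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> store = new TreeMap<>();
        for (int i = 0; i < 250; i++) store.put(String.format("user%04d", i), "old");
        System.out.println(collectAllKeys(store, 100).size());
    }
}
```

Only one page of keys is held in memory at a time, which is the point of paginating instead of fetching everything up front.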

You can then apply the updates in batches with a batch mutation, for example one batch per page of keys:

http://hector-client.github.io/hector/source/content/API/core/1.0-1/me/prettyprint/cassandra/service/BatchMutation.html
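
The batching side can be sketched the same way. This is again plain Java with no Hector dependency: BatchUpdateSketch and updateInBatches are hypothetical names, and the flush callback is an assumption standing in for whatever actually sends the batch. With Hector that would, as far as I recall, be queuing one insertion per key on a Mutator and then calling execute() once per batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not part of Hector.
final class BatchUpdateSketch {

    // Groups keys into batches of at most batchSize and hands each batch to flush.
    // Returns the number of batches that were flushed.
    static int updateInBatches(Iterable<String> keys, int batchSize, Consumer<List<String>> flush) {
        if (batchSize < 1) throw new IllegalArgumentException("batchSize must be positive");
        List<String> pending = new ArrayList<>(batchSize);
        int batches = 0;
        for (String key : keys) {
            pending.add(key);
            if (pending.size() == batchSize) {
                flush.accept(new ArrayList<>(pending)); // copy, so clearing does not affect the callee
                pending.clear();
                batches++;
            }
        }
        if (!pending.isEmpty()) { // send the final partial batch
            flush.accept(new ArrayList<>(pending));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 250; i++) keys.add("user" + i);
        int batches = updateInBatches(keys, 100,
                batch -> System.out.println("would send " + batch.size() + " updates"));
        System.out.println(batches + " batches");
    }
}
```

Keeping the batch size in the low hundreds keeps each mutation small; very large batches tend to put pressure on the coordinator node, so it is worth tuning this against your own cluster.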

Upvotes: 1
