tnk_peka

Reputation: 1535

Cassandra bulk insert solution

I have a Java program that runs as a service. It must insert 50k rows/s (each row has 25 columns) into a Cassandra cluster.

My cluster contains 3 nodes; each node has a 4-core CPU (Core i5, 2.4 GHz) and 4 GB of RAM.

I used the Hector API with multiple threads and bulk inserts, but the performance is lower than expected (about 25k rows/s).

Can anyone suggest another solution? Does Cassandra support an internal bulk insert (without using Thrift)?
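For reference, the multithreaded batching approach described above can be sketched roughly as follows. This is only an illustration using the Java standard library: `BatchSink` is a hypothetical stand-in for whatever client call actually writes one batch (for example a Hector mutator), and the batch size and thread count are arbitrary values, not tuned recommendations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

class BatchInsertSketch {

    /** Hypothetical stand-in for the client call that writes one batch to the cluster. */
    interface BatchSink<T> {
        void write(List<T> batch) throws Exception;
    }

    /** Splits rows into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += batchSize) {
            int end = Math.min(i + batchSize, rows.size());
            batches.add(new ArrayList<>(rows.subList(i, end)));
        }
        return batches;
    }

    /** Writes all batches on a fixed-size thread pool and returns the number of rows written. */
    static <T> long insertAll(List<T> rows, int batchSize, int threads, BatchSink<T> sink)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (List<T> batch : partition(rows, batchSize)) {
                pending.add(pool.submit(() -> {
                    sink.write(batch);
                    return batch.size();
                }));
            }
            long written = 0;
            for (Future<Integer> f : pending) {
                // get() surfaces a sink failure as an ExecutionException
                written += f.get();
            }
            return written;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            rows.add(i);
        }
        AtomicLong counter = new AtomicLong();
        // The lambda only counts rows; a real sink would send the batch to Cassandra.
        long written = insertAll(rows, 500, 4, batch -> counter.addAndGet(batch.size()));
        System.out.println("rows written: " + written);
    }
}
```

With a setup like this, throughput is bounded by how fast the sink call returns, so the batch size and thread count are the main knobs to experiment with.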

Upvotes: 5

Views: 3788

Answers (3)

samarth

Reputation: 4024

The fastest way to bulk-insert data into Cassandra is sstableloader, a utility provided by Cassandra from version 0.8 onwards. To use it, you first have to create sstables, which is possible with SSTableSimpleUnsortedWriter; more about this is described here.
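To give a rough idea of what an "unsorted writer" has to do, here is a conceptual sketch in plain Java. It is not the Cassandra class and does not use its API: `SortedFlushBufferSketch` is a hypothetical name, segments are kept in memory instead of being written as sstable files, and rows are ordered by plain string key as a simplification (Cassandra orders rows according to the partitioner). The idea shown is only this: accept rows in any order, buffer them, and emit a sorted segment each time the buffer fills up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SortedFlushBufferSketch {
    private final int maxBufferedRows;
    // TreeMap keeps buffered rows ordered by key, so a flush can emit them in sorted order.
    private final TreeMap<String, String> buffer = new TreeMap<>();
    // Stand-in for files on disk: each flushed segment is a sorted list of "key=value" rows.
    private final List<List<String>> flushedSegments = new ArrayList<>();

    SortedFlushBufferSketch(int maxBufferedRows) {
        if (maxBufferedRows <= 0) {
            throw new IllegalArgumentException("maxBufferedRows must be positive");
        }
        this.maxBufferedRows = maxBufferedRows;
    }

    /** Accepts rows in any order and flushes a sorted segment once the buffer is full. */
    void addRow(String key, String value) {
        buffer.put(key, value);
        if (buffer.size() >= maxBufferedRows) {
            flush();
        }
    }

    /** Emits the buffered rows as one sorted segment; does nothing if the buffer is empty. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        List<String> segment = new ArrayList<>();
        for (Map.Entry<String, String> e : buffer.entrySet()) {
            segment.add(e.getKey() + "=" + e.getValue());
        }
        flushedSegments.add(segment);
        buffer.clear();
    }

    List<List<String>> segments() {
        return flushedSegments;
    }

    public static void main(String[] args) {
        SortedFlushBufferSketch writer = new SortedFlushBufferSketch(3);
        writer.addRow("c", "3");
        writer.addRow("a", "1");
        writer.addRow("b", "2"); // buffer is full here, so the first segment is flushed
        writer.addRow("e", "5");
        writer.addRow("d", "4");
        writer.flush();          // flush the remaining rows explicitly
        System.out.println(writer.segments());
    }
}
```

In the real workflow, each flushed segment would be an sstable on disk, and the resulting directory is what gets handed to sstableloader.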

Another fast way is Cassandra's BulkOutputFormat for Hadoop. With it, we can write a Hadoop job to load data into Cassandra. See more on this: bulkload to cassandra with hadoo

Upvotes: 1

libjack

Reputation: 6443

I've had good luck creating sstables and loading them directly. There is an sstableloader tool included in the distribution, as well as a JMX interface. You can create the sstables using the SSTableSimpleUnsortedWriter class.

Details here.

Upvotes: 1

phuongdo

Reputation: 271

Astyanax is a high-level Java client for Apache Cassandra. Apache Cassandra is a highly available column-oriented database. Astyanax is currently in use at Netflix. Issues are generally fixed as quickly as possible, and releases are done frequently.

https://github.com/Netflix/astyanax

Upvotes: 1
