RIP SunMicrosystem

Reputation: 426

Batch queries into Cassandra

I'm trying to insert a batch of objects into Cassandra, something like this:

public void insertIntoCassandra(ArrayList<FileLoaderVo> LoadVo)
        throws NitroLoaderException {

    int temp = LoadVo.size();

    try {
        Session session = cassandraDAO.getSession();
        if (session == null) {
            String msg = "CassandraDAO.getSession() returned null";
            logger.error(msg);
            throw new FileLoaderException(msg);
        }

        BoundStatement bStmtHistTable = null;

        if (pStmtHistTable == null) {
            pStmtHistTable = session.prepare(insertToCassandra);
            pStmtHistTable.setConsistencyLevel(ConsistencyLevel.ONE);
        }

        for (FileLoaderVo fileLoaderVo : LoadVo) {              

            bStmtHistTable = pStmtHistTable.bind(fileLoaderVo.getSecurityCode(),
                    fileLoaderVo.getType(), fileLoaderVo.getAreaCode(),
                    fileLoaderVo.getEmpName(), fileLoaderVo.getCityType(),
                    fileLoaderVo.getHomeFIPS(), fileLoaderVo.getLastName(),
                    fileLoaderVo.getDst(), fileLoaderVo.getCssCode(),
                    fileLoaderVo.getAbbr(), fileLoaderVo.getOfficeFIPS(),
                    fileLoaderVo.getMiddleName(), fileLoaderVo.getZone(),
                    fileLoaderVo.getUtc());

            session.execute(bStmtHistTable);
            logger.info("LoadVo.size() is :"+temp);
            temp--;
        }
    } catch (Exception e) {
        // Log through the logger (with the stack trace) rather than stdout
        logger.error("Insert into Cassandra failed", e);
    }

}

Here I'm passing this method an ArrayList of objects to be inserted into Cassandra, and it executes one INSERT per object. Is there any way I could run a single query on these objects, like a batch insert?

I've looked through the DataStax documentation but couldn't find anything. Your input would be appreciated.

Thanks in advance.

Upvotes: 1

Views: 4766

Answers (2)

Christopher Batey

Reputation: 684

Batches that span different partitions add a lot of overhead on the coordinator, so they aren't recommended unless you want to ensure that the statements succeed even if the coordinator and your application crash.

You'll likely see the best performance from making many async calls, then collecting the results and retrying any that failed.
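The "many async calls, then retry the failures" pattern can be sketched roughly as follows. This is only an illustration using the JDK's `CompletableFuture`: the `writer` function is a hypothetical stand-in for an asynchronous driver call such as `session.executeAsync(...)`, whose actual future type differs, and the class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Sketch only: `writer` is a hypothetical stand-in for an async driver call.
class AsyncInsertSketch {

    /** Starts one async write per row, waits for all, and returns the rows that failed. */
    static <T> List<T> writeAll(List<T> rows, Function<T, CompletableFuture<Void>> writer) {
        // Fire off every write before waiting on any of them.
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        for (T row : rows) {
            futures.add(writer.apply(row));
        }
        // Collect results; remember which rows need a retry.
        List<T> failed = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            try {
                futures.get(i).join();
            } catch (RuntimeException e) {
                // join() reports failures as unchecked exceptions
                failed.add(rows.get(i));
            }
        }
        return failed;
    }

    /** Retries failed rows up to maxAttempts times; returns whatever still failed. */
    static <T> List<T> writeWithRetry(List<T> rows,
                                      Function<T, CompletableFuture<Void>> writer,
                                      int maxAttempts) {
        List<T> pending = rows;
        for (int attempt = 0; attempt < maxAttempts && !pending.isEmpty(); attempt++) {
            pending = writeAll(pending, writer);
        }
        return pending;
    }
}
```

In the question's code, each row would be a `FileLoaderVo` and the writer would bind the prepared statement and execute it asynchronously. A real loader would probably also cap the number of in-flight requests rather than submit the whole list at once.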

For full details including common anti-patterns see:

Logged batches: http://christopher-batey.blogspot.co.uk/2015/03/cassandra-anti-pattern-cassandra-logged.html

Unlogged batches: http://christopher-batey.blogspot.co.uk/2015/02/cassandra-anti-pattern-misuse-of.html

Upvotes: 2

Alex Popescu

Reputation: 4002

Depending on the version of Cassandra you are running, you could either add bound statements to a batch (C* 2.0) or prepare a batch statement (C* 1.2). These two options are covered in this blog post: http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0

Basically with C* 2.0 you can do:

if (pStmtHistTable == null) {
    pStmtHistTable = session.prepare(insertToCassandra);
    pStmtHistTable.setConsistencyLevel(ConsistencyLevel.ONE);
}
// create a batch statement
BatchStatement batch = new BatchStatement();

for (FileLoaderVo fileLoaderVo : LoadVo) {
    // add bound statements to the batch
    batch.add(pStmtHistTable.bind(...));
}
// execute all
session.execute(batch);
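Because a batch that spans many partitions puts extra load on the coordinator, one option is to group the objects by partition key first and build one batch per group. The helper below is a minimal sketch of that grouping step using only the JDK. The class name and the `partitionKeyOf` parameter are hypothetical, and which field is actually the partition key depends on the table schema, which the question does not show.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Sketch only: groups rows so that each batch can target a single partition.
class PartitionGroupingSketch {

    /** Returns the rows grouped by partition key, preserving first-seen key order. */
    static <T, K> Map<K, List<T>> groupByPartitionKey(List<T> rows,
                                                      Function<T, K> partitionKeyOf) {
        Map<K, List<T>> groups = new LinkedHashMap<>();
        for (T row : rows) {
            // Create the bucket for this key on first use, then append the row.
            groups.computeIfAbsent(partitionKeyOf.apply(row), k -> new ArrayList<>()).add(row);
        }
        return groups;
    }
}
```

Each resulting list could then be bound and added to its own `BatchStatement` as in the loop above. If, for example, `securityCode` were the partition key (an assumption), the key function would be `FileLoaderVo::getSecurityCode`.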

Upvotes: 1
