How to multi insert rows in cassandra

ajjain picture ajjain · Jul 26, 2013 · Viewed 35.9k times · Source

What is the most efficient way of inserting multiple rows in cassandra column family. Is it possible to do this in a single call.

Right now my approach is to addinsert multiple column and then execute. There in a single call I am persisting one row. I am looking for strategy so that I can do a batch insert.

Answer

Priyank Desai picture Priyank Desai · Jul 15, 2015

CQL contains a BEGIN BATCH...APPLY BATCH statement that allows you to group multiple inserts so that a developer can create and execute a series of requests (see http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0).

The following worked for me (Scala):

PreparedStatement ps = session.prepare(
"BEGIN BATCH" +    
"INSERT INTO messages (user_id, msg_id, title, body) VALUES (?, ?, ?, ?);" +    
"INSERT INTO messages (user_id, msg_id, title, body) VALUES (?, ?, ?, ?);" +    
"INSERT INTO messages (user_id, msg_id, title, body) VALUES (?, ?, ?, ?);" +    
"APPLY BATCH" ); 

session.execute(ps.bind(uid, mid1, title1, body1, uid, mid2, title2, body2, uid, mid3, title3, body3));

If you don't know in advance which statements you want to execute, you can use the following syntax (Scala):

var statement: PreparedStatement = session.prepare("INSERT INTO people (name,age) VALUES (?,?)")
var boundStatement = new BoundStatement(statement)
val batchStmt = new BatchStatement()
batchStmt.add(boundStatement.bind("User A", "10"))
batchStmt.add(boundStatement.bind("User B", "12"))
session.execute(batchStmt)

Note: BatchStatement can only hold up to 65536 statements. I learned that the hard way. :-)