What is the most efficient way to insert multiple rows into a Cassandra column family? Is it possible to do this in a single call?
Right now my approach is to add multiple column insertions and then execute. That way, a single call persists one row. I am looking for a strategy that lets me do a batch insert.
CQL contains a BEGIN BATCH...APPLY BATCH statement that lets you group multiple inserts, so that you can build a series of requests and execute them together (see http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0).
The following worked for me (Java):
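// Prepare the whole batch as one statement. Note the trailing space after
// BEGIN BATCH: without it the concatenated CQL reads "BEGIN BATCHINSERT ...".
PreparedStatement ps = session.prepare(
    "BEGIN BATCH " +
    "INSERT INTO messages (user_id, msg_id, title, body) VALUES (?, ?, ?, ?); " +
    "INSERT INTO messages (user_id, msg_id, title, body) VALUES (?, ?, ?, ?); " +
    "INSERT INTO messages (user_id, msg_id, title, body) VALUES (?, ?, ?, ?); " +
    "APPLY BATCH");
// Bind the values for all three rows in order, then execute in a single call.
session.execute(ps.bind(uid, mid1, title1, body1, uid, mid2, title2, body2, uid, mid3, title3, body3));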
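When the number of rows is only known at run time, the batch text itself can be generated instead of written out by hand. Below is a minimal sketch, assuming the same messages table as above. The class and method names (BatchCql, buildBatchInsert) are hypothetical helpers for illustration, not part of the driver:

```java
// Hypothetical helper (not part of the Cassandra driver): builds the CQL text
// of a BEGIN BATCH ... APPLY BATCH statement containing `rows` identical INSERTs.
class BatchCql {
    // Assumed table and columns, taken from the example above.
    static final String INSERT =
        "INSERT INTO messages (user_id, msg_id, title, body) VALUES (?, ?, ?, ?);";

    static String buildBatchInsert(int rows) {
        if (rows < 1) {
            throw new IllegalArgumentException("rows must be at least 1");
        }
        StringBuilder cql = new StringBuilder("BEGIN BATCH ");
        for (int i = 0; i < rows; i++) {
            // Separate statements with a space so keywords never run together.
            cql.append(INSERT).append(' ');
        }
        return cql.append("APPLY BATCH").toString();
    }
}
```

The resulting string could then be passed to session.prepare(...) and bound with four values per row, in row order.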
If you don't know in advance which statements you want to execute, you can use the following syntax (Scala):
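// Prepare the INSERT once; each statement.bind(...) call returns a new BoundStatement.
val statement: PreparedStatement = session.prepare("INSERT INTO people (name, age) VALUES (?, ?)")
val batchStmt = new BatchStatement()
// Binding through a single shared BoundStatement would add the same object twice,
// so both entries would carry the values bound last.
// The ages are bound as strings here; use Int.box(10) instead if age is an int column.
batchStmt.add(statement.bind("User A", "10"))
batchStmt.add(statement.bind("User B", "12"))
session.execute(batchStmt)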
Note: BatchStatement can only hold up to 65535 statements. I learned that the hard way. :-)
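One way to stay under that limit is to split the work into several smaller batches. Below is a minimal sketch in plain Java. The class and method names (BatchChunks, partition) and the maxPerBatch parameter are hypothetical and not part of the driver; the items would typically be bound statements:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the Cassandra driver): splits a list of
// items (for example bound statements) into consecutive chunks of bounded size.
class BatchChunks {
    static <T> List<List<T>> partition(List<T> items, int maxPerBatch) {
        if (maxPerBatch < 1) {
            throw new IllegalArgumentException("maxPerBatch must be at least 1");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += maxPerBatch) {
            int end = Math.min(start + maxPerBatch, items.size());
            // Copy the sub-list so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }
}
```

Each chunk would then go into its own BatchStatement and be executed separately. The chunk size is left to the caller.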