Spring data Cassandra 2.0 Select BLOB column returns incorrect ByteBuffer data

berhauz · Aug 18, 2014

Context: Spring data cassandra official 1.0.2.RELEASE from Maven Central repo, CQL3, cassandra 2.0, datastax driver 2.0.4

Background: The cassandra blob data type is mapped to a Java ByteBuffer.

The sample code below demonstrates that a select does not return the same bytes that the equivalent insert stored. The data actually retrieved is prefixed by numerous garbage bytes that look like a serialization of the entire row. This older post relating to Cassandra 1.2 suggested that we may have to read ByteBuffer.remaining() bytes starting at ByteBuffer.arrayOffset(), but the arrayOffset value is actually 0.

I discovered a spring-data-cassandra 2.0.0 SNAPSHOT, but its CassandraOperations API is quite different, and so is its package name: org.springdata... versus org.springframework...

Any help in fixing this would be very welcome.

In the meantime, it looks like I have to Base64-encode my binary data, store it in a text column, and decode it again on read.
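For reference, a minimal sketch of that Base64 workaround, assuming Java 8's java.util.Base64 is available. The class name BlobAsText and its two helper methods are hypothetical, chosen only for illustration; the encoded string is what would be stored in the text column.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper: stores binary data as Base64 text instead of a blob.
public class BlobAsText {

    // Encode raw bytes as Base64 text suitable for a CQL text column.
    public static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Decode the Base64 text read back from the column into the original bytes.
    public static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] ba = { 0x00, 0x01, 0x02, (byte) 0xAA, (byte) 0xCC, (byte) 0xFF };
        String stored = encode(ba);
        byte[] back = decode(stored);
        System.out.println(stored + " round-trips: " + Arrays.equals(ba, back));
    }
}
```

The obvious cost is that the stored value is about a third larger than the raw bytes.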

--- here is the simple table CQL meta data I use -------------

CREATE TABLE person (
  id text,
  age int,
  name text,
  pict blob,
  PRIMARY KEY (id)
) ;

--- follows the simple data object mapped to a CQL table ---

package org.spring.cassandra.example; 
 
import java.nio.ByteBuffer;
import org.springframework.data.cassandra.mapping.PrimaryKey; 
import org.springframework.data.cassandra.mapping.Table; 
 
@Table 
public class Person { 
 
 @PrimaryKey 
 private String id; 
 
 private int age; 
 private String name; 
 private ByteBuffer pict; 
 
 public Person(String id, int age, String name, ByteBuffer pict) { 
  this.id = id; this.name = name; this.age = age; this.pict = pict;
 } 
 
 public String getId() { return id; } 
 public String getName() { return name; } 
 public int getAge() { return age; } 
 public ByteBuffer getPict() { return pict; } 
     
}

--- and the plain java application code that simply inserts and retrieves a person object --

package org.spring.cassandra.example;

import java.nio.ByteBuffer;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;
import org.springframework.data.cassandra.core.CassandraOperations;

import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.querybuilder.QueryBuilder;
import com.datastax.driver.core.querybuilder.Select;

public class CassandraApp {

    private static final Logger logger = LoggerFactory
            .getLogger(CassandraApp.class);

    public static String hexDump(ByteBuffer bb) {
        char[] hexArray = "0123456789ABCDEF".toCharArray();
        bb.rewind();
        char[] hexChars = new char[bb.limit() * 2];
        for ( int j = 0; j < bb.limit(); j++ ) {
            int v = bb.get() & 0xFF;
            hexChars[j * 2] = hexArray[v >>> 4];
            hexChars[j * 2 + 1] = hexArray[v & 0x0F];
        }
        bb.rewind();
        return new String(hexChars);
    }

    public static void main(String[] args) {

        ApplicationContext applicationContext = new ClassPathXmlApplicationContext("app-context.xml");

        try {

            CassandraOperations cassandraOps = applicationContext.getBean(
                    "cassandraTemplate", CassandraOperations.class);

            cassandraOps.truncate("person");
            // prepare data
            byte[] ba = { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x11, 0x22, 0x33, 0x44, 0x55, (byte) 0xAA, (byte) 0xCC, (byte) 0xFF };
            ByteBuffer myPict = ByteBuffer.wrap(ba);
            String myId = "1234567890";
            String myName = "mickey";
            int myAge = 50;
            
            logger.info("We try id=" + myId + ", name=" + myName + ", age=" + myAge +", pict=" + hexDump(myPict));
            
            cassandraOps.insert(new Person(myId, myAge, myName, myPict ));

            Select s = QueryBuilder.select("id","name","age","pict").from("person");
            s.where(QueryBuilder.eq("id", myId));

            ResultSet rs = cassandraOps.query(s);
            Row r = rs.one();
            
            logger.info("We got id=" + r.getString(0) + ", name=" + r.getString(1) + ", age=" + r.getInt(2) +", pict=" + hexDump(r.getBytes(3)));

        } catch (Exception e) {
            e.printStackTrace();
        }

    }
}

--- assuming you have configured a simple Spring project for cassandra as explained at http://projects.spring.io/spring-data-cassandra/

The actual execution yields:

[main] INFO org.spring.cassandra.example.CassandraApp - We try id=1234567890, name=mickey, age=50, pict= 0001020304051122334455AACCFF

[main] INFO org.spring.cassandra.example.CassandraApp - We got id=1234567890, name=mickey, age=50, pict=8200000800000073000000020000000100000004000A6D796B657973706163650006706572736F6E00026964000D00046E616D65000D000361676500090004706963740003000000010000000A31323334353637383930000000066D69636B657900000004000000320000000E 0001020304051122334455AACCFF

although the insert looks correct in the database itself, as seen from cqlsh command line:

cqlsh:mykeyspace> select * from person;

 id         | age | name   | pict
------------+-----+--------+--------------------------------
 1234567890 |  50 | mickey | 0x0001020304051122334455aaccff

(1 rows)

Answer

Martin · Sep 10, 2014

I had exactly the same problem but fortunately found a solution. The problem is that ByteBuffer usage is confusing. Try reading only the bytes that remain from the buffer's current position, something like:

ByteBuffer bb = resultSet.one().getBytes("column_name");
byte[] data = new byte[bb.remaining()];
bb.get(data);

Thanks to Sylvain for this suggestion: http://grokbase.com/t/cassandra/user/134brvqzd3/blobs-in-cql
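As far as I can tell from the output in the question, the buffer returned by getBytes() is a view into a larger response buffer: arrayOffset() is 0, but position() is not, and the blob occupies only the bytes between position() and limit(). The hexDump() in the question calls rewind(), which resets the position to 0 and therefore dumps everything that precedes the blob as well. Below is a minimal, driver-free sketch of that effect. The class name BlobBytes and the simulated frame contents are made up for illustration; only the java.nio.ByteBuffer calls are real.

```java
import java.nio.ByteBuffer;

// Hypothetical helper illustrating position-aware reads from a ByteBuffer.
public class BlobBytes {

    // Copy only the bytes between position() and limit().
    // duplicate() gives an independent position, so the caller's buffer is not consumed.
    public static byte[] toBytes(ByteBuffer bb) {
        ByteBuffer view = bb.duplicate();
        byte[] data = new byte[view.remaining()];
        view.get(data);
        return data;
    }

    // Hex dump that respects the buffer's current position instead of rewinding it.
    public static String hexDump(ByteBuffer bb) {
        StringBuilder sb = new StringBuilder();
        for (byte b : toBytes(bb)) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Simulated frame: 4 bytes of unrelated header followed by a 3-byte blob value.
        byte[] frame = { 0x7F, 0x00, 0x00, 0x03, 0x11, 0x22, 0x33 };
        ByteBuffer blob = ByteBuffer.wrap(frame);
        blob.position(4); // the blob starts here; arrayOffset() is still 0

        System.out.println("arrayOffset=" + blob.arrayOffset() + ", position=" + blob.position());
        System.out.println("blob only: " + hexDump(blob));

        // Rewinding first, as the question's hexDump() does, exposes the header bytes too.
        ByteBuffer rewound = blob.duplicate();
        rewound.rewind();
        System.out.println("after rewind: " + hexDump(rewound));
    }
}
```

If you need to index into the backing array directly, the start of the value is arrayOffset() + position(), not arrayOffset() alone.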