Row Inserts having same primary key, are replacing previous writes in Cassandra

Vikram picture Vikram · Sep 12, 2014 · Viewed 8.6k times · Source

Created a table in Cassandra where the primary key is based on two columns(groupname,type). When I'm trying to insert more than 1 row where the groupname and type is same, then in such situation its not storing more than one row, subsequent writes where in the groupname and type are same.. then the latest write is replacing the previous similar writes. Why Cassandra is replacing in this manner instead of writing every row im inserting?

Write 1

cqlsh:resto> insert into restmaster (rest_id,type,rname,groupname,address,city,country)values(blobAsUuid(timeuuidAsBlob(now())),'SportsBar','SportsDen','VK Group','Majestic','Bangalore','India');

Write 2

insert into restmaster (rest_id,type,rname,groupname,address,city,country)values(blobAsUuid(timeuuidAsBlob(now())),'SportsBar','Sports Spot','VK Group','Bandra','Mumbai','India');

Write 3

cqlsh:resto> insert into restmaster (rest_id,type,rname,groupname,address,city,country)values(blobAsUuid(timeuuidAsBlob(now())),'SportsBar','Cricket Heaven ','VK Group','Connaught Place','New Delhi','India');

The result Im expecting(check rows 4,5,6)

 groupname      | type       | rname
----------------+------------+-----------------
           none |      Udipi |  Gayatri Bhavan
           none |     dinein |    Blue Diamond
       VK Group |  FoodCourt |        FoodLion
       VK Group |  SportsBar |      Sports Den
       VK Group |  SportsBar |     Sports Spot
       VK Group |  SportsBar |  Cricket Heaven
  Viceroy Group | Vegetarian |  Palace Heights
 Mainland Group |    Chinese |  MainLand China
      JSP Group |  FoodCourt |        Nautanki
          Ohris |  FoodCourt |           Ohris

But this is the actual result (write 3 has replaced previous 2 inserts [rows 4,5])

 cqlsh:resto> select groupname,type,rname From restmaster;

 groupname      | type       | rname
----------------+------------+-----------------
           none |      Udipi |  Gayatri Bhavan
           none |     dinein |    Blue Diamond
       VK Group |  FoodCourt |        FoodLion
       VK Group |  SportsBar | Cricket Heaven 
  Viceroy Group | Vegetarian |  Palace Heights
 Mainland Group |    Chinese |  MainLand China
      JSP Group |  FoodCourt |        Nautanki
          Ohris |  FoodCourt |           Ohris


cqlsh:resto> describe table restmaster;

CREATE TABLE restmaster (
  groupname text,
  type text,
  address text,
  city text,
  country text,
  rest_id uuid,
  rname text,
  PRIMARY KEY ((groupname), type)
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.100000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.000000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

Answer

RussS picture RussS · Sep 13, 2014

All inserts to the Cassandra database are actually insert/update operations and there can only be on set of non-key values per uniquely defined primary key. This means that you can not ever have more than one set of values for one primary key and that you will only see the last write.

More info: http://www.datastax.com/documentation/cql/3.1/cql/cql_intro_c.html

Update: A datamodel

If you used a key like

Primary Key ((groupname),type,rname)

As long as you have unique restaurant names you will be able to get the results you are expecting. But what you really should be asking is "What queries would I like to perform on this data?" All Cassandra Tables should be based around satisfying a class of queries. The key I wrote above basically says "This table is constructed to quickly look up all the restaurants in a particular group and the only conditionals I will use will be on type and on restaurant name"

Examples queries you could perform with that schema

 SELECT * FROM restmaster WHERE groupname = 'Lettuce Entertain You' ;
 SELECT * FROM restmaster WHERE groupname = 'Lettuce Entertain You' and type = 'Formal'  ;
 SELECT * FROM restmaster WHERE groupname = 'Lettuce Entertain You' and type = 'Formal' 
    and rname > 'C' and rname < 'Y' ;

If that isn't the kind of queries you want to be performing in your application or you want other queries in addition to those, you most likely will need additional tables.