How to change PARTITION KEY column in Cassandra?

Кирилл Давиденко · Aug 18, 2015 · Viewed 8.7k times

Suppose we have a table like this:

create table users (
    id text,
    roles set<text>,
    PRIMARY KEY ((id))
);

I want all the rows of this table to be stored on the same Cassandra node (OK, not literally one node but the same 3 replicas, so that all the data is mirrored; you get the point). To achieve that, I want to change the table to look like this:

create table users_v2 (
    partition int,
    id text,
    roles set<text>,
    PRIMARY KEY ((partition), id)
);

How can I do that without losing the data from the first table? It seems to be impossible to add such a column with ALTER TABLE, and I'm OK with that. What I am trying to do is copy the data from the first table and insert it into the second. When I copy it as is, the partition column is missing, which is expected. I can ALTER the first table to add a 'partition' column at the end and then COPY the columns in the correct order, but I can't update all the rows in the first table to set the same partition value, and there seems to be no "default" value when a column is added.

Answer

sam · Aug 18, 2015

You simply cannot alter the primary key of a Cassandra table. You need to create another table with your new schema and perform a data migration. I would suggest using Spark for that, since a migration between two tables takes only a few lines of code.

This also answers the question about altering a primary key.
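
If the table is small enough to export with cqlsh, the missing partition column from the question can also be filled in outside Cassandra, without Spark. This is a different technique from the Spark migration suggested above, and only a minimal sketch: it assumes the data was exported with `COPY users (id, roles) TO 'users.csv';` using cqlsh's default CSV settings (comma delimiter, double-quote quoting, backslash escape), and the file names, the function name `add_partition_column`, and the constant partition value 0 are made up for illustration.

```python
import csv

# Assumed to match cqlsh COPY defaults: comma delimiter, '"' quoting,
# backslash escaping instead of doubled quotes.
CQLSH_DIALECT = dict(delimiter=",", quotechar='"', escapechar="\\", doublequote=False)


def add_partition_column(src_path, dst_path, partition=0):
    """Copy every CSV row from src_path to dst_path, prepending a constant
    partition value as the first column. Returns the number of rows written."""
    count = 0
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src, **CQLSH_DIALECT)
        writer = csv.writer(dst, lineterminator="\n", **CQLSH_DIALECT)
        for row in reader:
            if not row:  # skip blank lines
                continue
            # New column order: partition, id, roles
            writer.writerow([partition, *row])
            count += 1
    return count


if __name__ == "__main__":
    # Hypothetical file names; adjust to wherever the export was written.
    written = add_partition_column("users.csv", "users_v2.csv", partition=0)
    print(f"wrote {written} rows")
```

The resulting file could then be loaded with `COPY users_v2 (partition, id, roles) FROM 'users_v2.csv';`. Keep in mind that giving every row the same partition value puts the whole table into a single partition, which is what the question asks for but only stays practical while the table remains small.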