Query using composite keys, other than Row Key in Cassandra

user988544 picture user988544 · Jul 19, 2013 · Viewed 10.9k times · Source

I want to query data filtering by composite keys other than Row Key in CQL3. These are my queries:

CREATE TABLE grades (id int,
  date timestamp,
  subject text,
  status text,
  PRIMARY KEY (id, subject, status, date)
);

When I try and access the data,

SELECT * FROM grades where id = 1098; //works fine
SELECT * FROM grades where subject = 'English' ALLOW FILTERING; //works fine
SELECT * FROM grades where status = 'Active' ALLOW FILTERING; //gives an error

Bad Request: PRIMARY KEY part status cannot be restricted (preceding part subject is either not restricted or by a non-EQ relation)

Just to experiment, I shuffled the keys around keeping 'id' as my Primary Row Key always. I am always ONLY able to query using either the Primary Row key or the second key, considering above example, if I swap subjects and status in Primary Key list, I can then query with status but I get similar error if I try to do by subject or by time.

Am I doing something wrong? Can I not query data using any other composite key in CQL3? I'm using Cassandra 1.2.6 and CQL3.

Answer

emgsilva picture emgsilva · Jul 19, 2013

That looks all normal behavior according to Cassandra Composite Key model (http://www.datastax.com/docs/1.2/cql_cli/cql/SELECT). Cassandra data model aims (and this is a general NoSQL way of thinking) at granting that queries are performant, that comes to the expense of "restrictions" on the way you store and index your data, and then how you query it, namely you "always need to restrict the preceding part of subject" on the primary key.

You cannot swap elements on the primary key list on the queries (that is more a SQL way of thinking). You always need to "Constraint"/"Restrict" the previous element of the primary key if you are to use multiple elements of the composite key. This means that if you have composite key = (id, subject, status, date) and want to query "status", you will need to restrict "id" and/or "subject" ("or" is possible in case you use "allow filtering", i.e., you can restrict only "subject" and do not need to restrict "id"). So, if you want to query on "status" you will b able to query in two different ways:

select * from grades where id = '1093' and subject = 'English' and status = 'Active';

Or

select * from grades where subject = 'English' and status = 'Active' allow filtering;

The first is for a specific "student", the second for all the "students" on the subject in status = "Active".