Cassandra CQL query check multiple values

user2122264 · Oct 7, 2013

How can I check if a non-primary key field's value is either 'A' or 'B' with a Cassandra CQL query? (I'm using Cassandra 2.0.1)

Here's the table definition:

CREATE TABLE my_table (
  my_field text,
  my_field2 text,
  PRIMARY KEY (my_field)
);

I tried:

1> SELECT * FROM my_table WHERE my_field2 IN ('A', 'B');

2> SELECT * FROM my_table WHERE my_field2 = 'A' OR my_field2 = 'B';

The first one failed with this message:

Bad Request: IN predicates on non-primary-key columns (my_field2) is not yet supported

The second one failed because Cassandra CQL doesn't support the OR keyword.

I couldn't get this simple query working in any straightforward way. I'm pretty frustrated dealing with CQL queries in general. Is it because Cassandra is not mature enough and has really poor support for queries, or is it me who must change my way of thinking?

Answer

Chris Brewer · Oct 7, 2013

This is intentional behavior in Cassandra. You cannot query using a WHERE clause on columns that are not

  • the partition key
  • part of a composite key

This is because your data is partitioned around a ring of Cassandra nodes. You want to avoid having to ask the entire ring to return the answer to your query. Ideally you want to be able to retrieve your data from a single node in the ring.
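To make the ring argument concrete, here is a toy Python model. It is only a sketch: the node count, the hash function and all helper names are made up for illustration, and this is not how Cassandra's real partitioner is implemented. It just shows that a lookup by partition key touches one node, while a lookup by a non-key column has to visit every node.

```python
import hashlib

NUM_NODES = 4  # hypothetical cluster size, chosen only for the example


def node_for(partition_key):
    """Map a partition key to a node index (toy stand-in for a partitioner)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_NODES


# Each node holds its share of my_table as {my_field: my_field2}.
nodes = [dict() for _ in range(NUM_NODES)]


def insert(my_field, my_field2):
    """Store the row on the single node that owns its partition key."""
    nodes[node_for(my_field)][my_field] = my_field2


def select_by_key(my_field):
    """Lookup by partition key: returns (value, number of nodes consulted)."""
    owner = nodes[node_for(my_field)]
    return owner.get(my_field), 1


def select_by_value(wanted):
    """Lookup by non-key column: returns (matching keys, nodes consulted)."""
    hits = []
    for node in nodes:  # no way to know which node holds a match
        hits.extend(key for key, value in node.items() if value in wanted)
    return sorted(hits), len(nodes)
```

After inserting a few rows, select_by_key("k1") consults one node, whereas select_by_value({"A", "B"}) consults all NUM_NODES of them, which is the cost the restriction on WHERE clauses is meant to avoid.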

Generally in Cassandra you want to structure your tables to match your queries, as opposed to relational normalization. So you have a few options to deal with this.

1) Write your data to multiple tables to support your various queries. In your case you may want to create a second table, under a different name, such as

CREATE TABLE my_table_by_field2 (
  my_field2 text,
  my_field text,
  PRIMARY KEY (my_field2)
);

Then your first query, run against this second table, will return correctly.
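With option 1 the application has to write every row twice, once per table. A minimal Python sketch of that dual write follows. The name my_table_by_field2 for the second table, the write_row helper and the execute callable are all assumptions for the example; execute stands in for something like a driver session's execute method, and the %s placeholder style is assumed as well. One caveat worth keeping in mind: if several rows share a my_field2 value, a table keyed on my_field2 alone keeps only the last one written, and adding my_field as a clustering column would be one way around that.

```python
# Hypothetical statements: the second table is keyed on my_field2 instead.
INSERT_BY_FIELD = (
    "INSERT INTO my_table (my_field, my_field2) VALUES (%s, %s)"
)
INSERT_BY_FIELD2 = (
    "INSERT INTO my_table_by_field2 (my_field2, my_field) VALUES (%s, %s)"
)


def write_row(execute, my_field, my_field2):
    """Write one logical row to both tables.

    `execute` is any callable taking (cql, params); it is an assumption
    of this sketch, not a specific driver API.
    """
    execute(INSERT_BY_FIELD, (my_field, my_field2))
    execute(INSERT_BY_FIELD2, (my_field2, my_field))
```

The two writes here are independent, so a failure between them can leave the tables out of step; how to guard against that depends on the application.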

2) Create your table with a composite key as

CREATE TABLE my_table (
  my_field text,
  my_field2 text,
  PRIMARY KEY (my_field, my_field2)
);

With this method, if you do not specify a value for my_field, you will need to append a qualifier to your query to tell Cassandra that you really want to query the entire ring:

SELECT * FROM my_table WHERE my_field2 IN ('A', 'B') ALLOW FILTERING;

-edit-

You cannot use a secondary index to search for multiple values. Per the CQL documentation:

http://www.datastax.com/documentation/cql/3.0/webhelp/cql/ddl/ddl_primary_index_c.html

"An index is a data structure that allows for fast, efficient lookup of data matching a given condition."

So, you must give it one and only one value.
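If you do go the secondary index route (for example after CREATE INDEX ON my_table (my_field2);), one possible workaround is to run one equality query per value and merge the results on the client. Here is a rough Python sketch of that idea. The select_where_field2_in helper and the execute callable are hypothetical; execute is assumed to take (cql, params) and return an iterable of rows, and the %s placeholder style is an assumption too.

```python
QUERY = "SELECT * FROM my_table WHERE my_field2 = %s"


def select_where_field2_in(execute, values):
    """Emulate `my_field2 IN (...)` with one equality query per value.

    `execute` is any callable taking (cql, params) and returning rows;
    it is an assumption of this sketch, not a specific driver API.
    """
    rows = []
    # dict.fromkeys drops repeated values while keeping their order,
    # so the same query is not sent twice.
    for value in dict.fromkeys(values):
        rows.extend(execute(QUERY, (value,)))
    return rows
```

Each of these queries gives the index exactly one value, as the documentation requires, at the price of one round trip per value.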