Cassandra cql select sorting

ptmoy2 picture ptmoy2 · Feb 22, 2014 · Viewed 14.4k times · Source

If I define a table like this using cql:

CREATE TABLE scores (
 name text,
 age int,
 score int,
 date timestamp,
 PRIMARY KEY (name, age, score)
);

And do a SELECT in cqlsh like this:

select * from mykeyspace.scores;

The result displayed seems to always be sorted by 'age', then 'score' automatically in ascending order regardless of input-data ordering (as expected, return rows are not sorted by the partition key 'name'). I have the following questions:

  1. Does SELECT automatically sort return rows by the clustering keys?
  2. If yes, what's the purpose of using the ORDER BY clause when using SELECT?
  3. If no, how do I get the return rows to sort by the clustering keys since cql doesn't allow ORDER BY on a select *?

Answer

Mikhail Stepura picture Mikhail Stepura · Feb 22, 2014

Your clustering columns define the order (in your case age then score)

http://cassandra.apache.org/doc/cql3/CQL.html#createTableStmt

On a given physical node, rows for a given partition key are stored in the order induced by the clustering columns, making the retrieval of rows in that clustering order particularly efficient (see SELECT).

http://cassandra.apache.org/doc/cql3/CQL.html#selectStmt

The ORDER BY option allows to select the order of the returned results. It takes as argument a list of column names along with the order for the column (ASC for ascendant and DESC for descendant, omitting the order being equivalent to ASC). Currently the possible orderings are limited (which depends on the table CLUSTERING ORDER):

  • if the table has been defined without any specific CLUSTERING ORDER, then the allowed orderings are the order induced by the clustering columns and the reverse of that one.
  • otherwise, the orderings allowed are the order of the CLUSTERING ORDER option and the reversed one.