Cassandra order and clustering key

farhawa · Mar 1, 2016

I have this table:

CREATE TABLE custumer_events_service.events_by_websiteId_time(
    event_id text,
    currentTime timestamp,
    websiteId varchar,

    OTHER COLUMNS ...

    PRIMARY KEY(event_id, websiteId, currentTime)
)

In this case, would I get 10000 rows ordered by currentTime when I execute this query?

SELECT * FROM events_by_websiteid_time WHERE websiteid='xxxx' LIMIT 10000 ALLOW FILTERING;

Or do I have to add WITH CLUSTERING ORDER BY (currentTime DESC); at the end of the table definition?

Answer

Aaron · Mar 1, 2016

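Cassandra can only enforce a sort order within a partition. Because you are using ALLOW FILTERING to avoid providing your partition key (event_id), your result set will be ordered by the hashed token value of each event_id, and only then, within each partition, by websiteId and currentTime. Adding a CLUSTERING ORDER to the existing table would not change that, since it only affects the order of rows inside a single partition.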
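If you want to see this ordering for yourself, a query along the following lines may help. It is only a sketch: it assumes the table from the question exists with columns that can be referenced unquoted (as the question's own query does), and the LIMIT of 20 is arbitrary.

```cql
-- token() returns the partitioner's hash of the partition key.
-- With no WHERE clause, rows come back in ascending token order,
-- not in currentTime order.
SELECT token(event_id), event_id, websiteid, currentTime
FROM custumer_events_service.events_by_websiteId_time
LIMIT 20;
```

The first column should increase from row to row, while currentTime will most likely jump around, because each event_id is its own partition.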

To get your results ordered by currentTime, you would need to either create a new query table or change the PRIMARY KEY definition (and perhaps the CLUSTERING ORDER) of your existing table. Note that Cassandra does not let you alter a primary key in place, so the second option means dropping and recreating the table. If you decide to create a new query table, it would have to look something like this:

CREATE TABLE custumer_events_service.events_by_websiteId_time_eventid(
  event_id text,
  currentTime timestamp,
  websiteId varchar,

  OTHER COLUMNS ...

  PRIMARY KEY (websiteId, currentTime, event_id))
WITH CLUSTERING ORDER BY (currentTime DESC, event_id ASC);

That would allow this query:

SELECT * FROM events_by_websiteid_time_eventid WHERE websiteid='xxxx' LIMIT 10000;

...to work as you expect.
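Because currentTime is now the first clustering column, the same table should also support time-range and reverse-order reads within one website's partition. The queries below are a sketch under that assumption; the date bounds are made-up example values, and 'xxxx' is the placeholder from the question.

```cql
-- Newest first (the table's own clustering order), restricted to one month.
SELECT * FROM custumer_events_service.events_by_websiteid_time_eventid
WHERE websiteid = 'xxxx'
  AND currentTime >= '2016-02-01 00:00:00+0000'
  AND currentTime <  '2016-03-01 00:00:00+0000'
LIMIT 10000;

-- Oldest first: reverse the clustering order for this query only.
SELECT * FROM custumer_events_service.events_by_websiteid_time_eventid
WHERE websiteid = 'xxxx'
ORDER BY currentTime ASC
LIMIT 10000;
```

Neither query needs ALLOW FILTERING, since both restrict the partition key (websiteid) with an equality.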