I use the following code for creating table:
CREATE KEYSPACE mykeyspace
WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
USE mykeyspace;
CREATE TABLE users (
user_id int PRIMARY KEY,
fname text,
lname text
);
INSERT INTO users (user_id, fname, lname)
VALUES (1745, 'john', 'smith');
INSERT INTO users (user_id, fname, lname)
VALUES (1744, 'john', 'doe');
INSERT INTO users (user_id, fname, lname)
VALUES (1746, 'john', 'smith');
I would like to find the distinct value of lname
column (that is not a PRIMARY KEY). I would like to get the following result:
lname
-------
smith
By using SELECT DISTINCT lname FROM users;
However since lname
is not a PRIMARY KEY
I get the following error:
InvalidRequest: code=2200 [Invalid query] message="SELECT DISTINCT queries must
only request partition key columns and/or static columns (not lname)"
cqlsh:mykeyspace> SELECT DISTINCT lname FROM users;
How can I get the distinct values from lname
?
User - Undefined_variable - makes two good points:
DISTINCT
only works on partition keys.So, one way to get this to work, would be to build a specific table to support that query:
CREATE TABLE users_by_lname (
lname text,
fname text,
user_id int,
PRIMARY KEY (lname, fname, user_id)
);
Now after I run your INSERTs to this new query table, this works:
aploetz@cqlsh:stackoverflow> SELECT DISTINCT lname FROm users_by_lname ;
lname
-------
smith
doe
(2 rows)
Notes: In this table, all rows with the same partition key (lname
) will be sorted by fname
, as fname
is a clustering key. I added user_id
as an additional clustering key, just to ensure uniqueness.