I have a table that has a column of list
type (tags):
CREATE TABLE "Videos" (
video_id UUID,
title VARCHAR,
tags LIST<VARCHAR>,
PRIMARY KEY (video_id, upload_timestamp)
) WITH CLUSTERING ORDER BY (upload_timestamp DESC);
I have plenty of rows containing various values in the tags
column, ie. ["outdoor","funny cats","funny mice"]
.
I want to perform a SELECT
query that will return all rows that contain "funny cats" in the tags
column. How can I do that?
To directly answer your question, yes there is a way to accomplish this. As of Cassandra 2.1 you can create a secondary index on a collection. First, I'll re-create your column family definition (while adding a definition for upload_timestamp timeuuid
) and put some values in it.
aploetz@cqlsh:stackoverflow> SELECT * FROM videos ;
video_id | upload_timestamp | tags | title
--------------------------------------+--------------------------------------+-----------------------------------------------+---------------------------
2977b806-df76-4dd7-a57e-11d361e72ce1 | fc011080-64f9-11e4-a819-21b264d4c94d | ['sci-fi', 'action', 'adventure'] | Star Wars
ab696e1f-78c0-45e6-893f-430e88db7f46 | 8db7c4b0-64fa-11e4-a819-21b264d4c94d | ['documentary'] | The Witches of Whitewater
15e6bc0d-6195-4d8b-ad25-771966c780c8 | 1680d120-64fa-11e4-a819-21b264d4c94d | ['dark comedy', 'action', 'language warning'] | Pulp Fiction
(3 rows)
Next, I'll create a secondary index on the tags
column:
aploetz@cqlsh:stackoverflow> CREATE INDEX ON videos (tags);
Now, if I want to query the videos that contain the tag "action," I can accomplish this with the CONTAINS
keyword:
aploetz@cqlsh:stackoverflow> SELECT * FROM videos WHERE tags CONTAINS 'action';
video_id | upload_timestamp | tags | title
--------------------------------------+--------------------------------------+-----------------------------------------------+--------------
2977b806-df76-4dd7-a57e-11d361e72ce1 | fc011080-64f9-11e4-a819-21b264d4c94d | ['sci-fi', 'action', 'adventure'] | Star Wars
15e6bc0d-6195-4d8b-ad25-771966c780c8 | 1680d120-64fa-11e4-a819-21b264d4c94d | ['dark comedy', 'action', 'language warning'] | Pulp Fiction
(2 rows)
With this all being said, I should pass along a couple of warnings:
videosbytag
query table, with the same data but keyed like this: PRIMARY KEY (tag,video_id)