A few days ago I read about wide-column stored type of NoSQL and exclusively Apache-Cassandra.
What I understand is that Cassandra consist of:
A keyspace(like database in relational databases) and supporting many column families or tables (Same as table in relational databases) and unlimited rows.
From Stackoverflow tags:
A wide column store is a type of key-value database. It uses tables, rows, and columns, but unlike a relational database, the names and format of the columns can vary from row to row in the same table.
In Cassandra all of the rows (in a table) should have a row key then each row key can have multiple columns. I read about differences in implementation and storing data of Relational database and NoSQL (Cassandra).
But I don't understand the difference between structure:
Imagine a scenario which I have a table (or column family in Cassandra):
When I execute a query (CQL) like this :
select * from users;
It gives me the result as you can see :
lastname | age | city | email
----------+------+---------------+----------------------
Doe | 36 | Beverly Hills | [email protected]
Jones | 35 | Austin | [email protected]
Byrne | 24 | San Diego | [email protected]
Smith | 46 | Sacramento | null
Jones2 | null | Austin | [email protected]
So I perform the above scenario in relational database (MS SQL) with the following query:
select * from [users]
And the result is:
lastname | age | city | email
----------+------+---------------+----------------------
Doe | 36 | Beverly Hills | [email protected]
Jones | 35 | Austin | [email protected]
Byrne | 24 | San Diego | [email protected]
Smith | 46 | Sacramento | NULL
Jones2 | NULL | Austin | [email protected]
I know that Cassandra supports dynamic column and I can perform this by using sth like:
ALTER TABLE users ADD website varchar;
But it is available in relational model for example in mssql the above code can be implemented too. Something like:
ALTER TABLE users ADD website varchar(MAX);
What I see is that the first select and second select result is the same.
In Cassandra , they just give a row key (lastname) as a standalone object but it is same as a unique field (like ID or a text) in mssql (and all relational databases) and I see the type of column in Cassandra is static (in my example varchar
) unlike what it describes in Stackoverflow tag.
So my questions is:
Is there any misunderstanding in my imagination about Cassandra?!
So what is different between two structure ?! I show you the result is same.
Is there any special scenarios (JSON like) that cannot be implemented in relational databases but Cassandra supports? (For example I know that nested column doesn't support in Cassandra.)
Thank you for reading.
We have to look at more complex example to see the differences :)
For start:
Table is defined as "two-dimensional view of a multi-dimensional column family".
The term "wide-rows" was related mainly to the Thrift API. In cql it is defined a bit differently, but underneath looks the same.
Comparing SQL and CQL. In SQL table is a set of rows. In simple example it looks like in CQL it is the same, but it is not. CQL table is a set of partitions, where each partition can be just a single row (e.g. when you don't have a clustering key) or multiple rows. Partition containing multiple rows is in Thrift therminology named "wide-row". To see how it is stored underneath, please read e.g. part about composite-keys from here.
There are more differences:
address
as type, and reuse this type in many places), or collection
can be a collection of user defined typesI hope I was able to make it a bit more clear for you. I recommend watching some vidoes (or reading slides) from Datastax Core Concepts Course as solid introduction to Cassandra.