I have a table in clickhouse, say "my_table", which has replicates (my_table_rep1,...). And I need to add a column, of type float64, with default value (-1).
How should I do that?
I would prefer that the default are not actually added to existing entries.
The documentation is pretty straight-forward:
ALTER TABLE [db].name [ON CLUSTER cluster] ADD COLUMN name [type] [default_expr] [AFTER name_after]
Regarding that:
I would prefer that the default are not actually added to existing entries.
There's also a statement in the docs:
If you add a new column to a table but later change its default expression, the values used for old data will change (for data where values were not stored on the disk)
So, basically, you have to:
DEFAULT 0
, or with something else - depends on what do you want to have in the existing entries)OPTIMIZE TABLE .. FINAL
to force Clickhouse to write the new data to the diskAn example:
:) CREATE TABLE my_table (date Date DEFAULT today(), s String) ENGINE = MergeTree(date, (date), 8192);
:) INSERT INTO my_table (s) VALUES ('1. foo');
:) ALTER TABLE my_table ADD COLUMN f Float64;
:) INSERT INTO my_table (s) VALUES ('2. bar');
:) SELECT * FROM my_table;
┌───────date─┬─s──────┬─f─┐
│ 2018-04-20 │ 1. foo │ 0 │
│ 2018-04-20 │ 2. bar │ 0 │
└────────────┴────────┴───┘
:) OPTIMIZE TABLE my_table PARTITION 201804 FINAL;
:) ALTER TABLE my_table MODIFY COLUMN f Float64 DEFAULT -1;
:) INSERT INTO my_table (s) VALUES ('3. baz');
:) SELECT * FROM my_table;
┌───────date─┬─s──────┬──f─┐
│ 2018-04-20 │ 3. baz │ -1 │
│ 2018-04-20 │ 1. foo │ 0 │
│ 2018-04-20 │ 2. bar │ 0 │
└────────────┴────────┴────┘
You really have to do OPTIMIZE TABLE ... FULL
, because if you won't do that, weird things will happen: https://gist.github.com/hatarist/5e7653808e59349c34d4589b2fc69b14