* initial split * initial adjust links * make en buildable * re-normalize
7.5 KiB
toc_priority | toc_title |
---|---|
37 | COLUMN |
Column Manipulations
A set of queries that allow changing the table structure.
Syntax:
ALTER TABLE [db].name [ON CLUSTER cluster] ADD|DROP|CLEAR|COMMENT|MODIFY COLUMN ...
In the query, specify a list of one or more comma-separated actions. Each action is an operation on a column.
The following actions are supported:
- ADD COLUMN — Adds a new column to the table.
- DROP COLUMN — Deletes the column.
- CLEAR COLUMN — Resets column values.
- COMMENT COLUMN — Adds a text comment to the column.
- MODIFY COLUMN — Changes column’s type, default expression and TTL.
These actions are described in detail below.
ADD COLUMN
ADD COLUMN [IF NOT EXISTS] name [type] [default_expr] [codec] [AFTER name_after]
Adds a new column to the table with the specified name
, type
, codec
and default_expr
(see the section Default expressions).
If the IF NOT EXISTS
clause is included, the query won’t return an error if the column already exists. If you specify AFTER name_after
(the name of another column), the column is added after the specified one in the list of table columns. Otherwise, the column is added to the end of the table. Note that there is no way to add a column to the beginning of a table. For a chain of actions, name_after
can be the name of a column that is added in one of the previous actions.
Adding a column just changes the table structure, without performing any actions with data. The data doesn’t appear on the disk after ALTER
. If the data is missing for a column when reading from the table, it is filled in with default values (by performing the default expression if there is one, or using zeros or empty strings). The column appears on the disk after merging data parts (see MergeTree).
This approach allows us to complete the ALTER
query instantly, without increasing the volume of old data.
Example:
ALTER TABLE visits ADD COLUMN browser String AFTER user_id
DROP COLUMN
DROP COLUMN [IF EXISTS] name
Deletes the column with the name name
. If the IF EXISTS
clause is specified, the query won’t return an error if the column doesn’t exist.
Deletes data from the file system. Since this deletes entire files, the query is completed almost instantly.
Example:
ALTER TABLE visits DROP COLUMN browser
CLEAR COLUMN
CLEAR COLUMN [IF EXISTS] name IN PARTITION partition_name
Resets all data in a column for a specified partition. Read more about setting the partition name in the section How to specify the partition expression.
If the IF EXISTS
clause is specified, the query won’t return an error if the column doesn’t exist.
Example:
ALTER TABLE visits CLEAR COLUMN browser IN PARTITION tuple()
COMMENT COLUMN
COMMENT COLUMN [IF EXISTS] name 'comment'
Adds a comment to the column. If the IF EXISTS
clause is specified, the query won’t return an error if the column doesn’t exist.
Each column can have one comment. If a comment already exists for the column, a new comment overwrites the previous comment.
Comments are stored in the comment_expression
column returned by the DESCRIBE TABLE query.
Example:
ALTER TABLE visits COMMENT COLUMN browser 'The table shows the browser used for accessing the site.'
MODIFY COLUMN
MODIFY COLUMN [IF EXISTS] name [type] [default_expr] [TTL]
This query changes the name
column properties:
-
Type
-
Default expression
-
TTL
For examples of columns TTL modifying, see [Column TTL](../../engines/table_engines/mergetree_family/mergetree.md#mergetree-column-ttl).
If the IF EXISTS
clause is specified, the query won’t return an error if the column doesn’t exist.
When changing the type, values are converted as if the toType functions were applied to them. If only the default expression is changed, the query doesn’t do anything complex, and is completed almost instantly.
Example:
ALTER TABLE visits MODIFY COLUMN browser Array(String)
Changing the column type is the only complex action – it changes the contents of files with data. For large tables, this may take a long time.
There are several processing stages:
- Preparing temporary (new) files with modified data.
- Renaming old files.
- Renaming the temporary (new) files to the old names.
- Deleting the old files.
Only the first stage takes time. If there is a failure at this stage, the data is not changed. If there is a failure during one of the successive stages, data can be restored manually. The exception is if the old files were deleted from the file system but the data for the new files did not get written to the disk and was lost.
The ALTER
query for changing columns is replicated. The instructions are saved in ZooKeeper, then each replica applies them. All ALTER
queries are run in the same order. The query waits for the appropriate actions to be completed on the other replicas. However, a query to change columns in a replicated table can be interrupted, and all actions will be performed asynchronously.
Limitations
The ALTER
query lets you create and delete separate elements (columns) in nested data structures, but not whole nested data structures. To add a nested data structure, you can add columns with a name like name.nested_name
and the type Array(T)
. A nested data structure is equivalent to multiple array columns with a name that has the same prefix before the dot.
There is no support for deleting columns in the primary key or the sampling key (columns that are used in the ENGINE
expression). Changing the type for columns that are included in the primary key is only possible if this change does not cause the data to be modified (for example, you are allowed to add values to an Enum or to change a type from DateTime
to UInt32
).
If the ALTER
query is not sufficient to make the table changes you need, you can create a new table, copy the data to it using the INSERT SELECT query, then switch the tables using the RENAME query and delete the old table. You can use the clickhouse-copier as an alternative to the INSERT SELECT
query.
The ALTER
query blocks all reads and writes for the table. In other words, if a long SELECT
is running at the time of the ALTER
query, the ALTER
query will wait for it to complete. At the same time, all new queries to the same table will wait while this ALTER
is running.
For tables that don’t store data themselves (such as Merge
and Distributed
), ALTER
just changes the table structure, and does not change the structure of subordinate tables. For example, when running ALTER for a Distributed
table, you will also need to run ALTER
for the tables on all remote servers.