Merge branch 'master' into standardize-dictionary-categories

Dan Roscigno 2022-11-15 16:18:18 -05:00 committed by GitHub
commit b178a3711c
GPG Key ID: 4AEE18F83AFDEB23
24 changed files with 184 additions and 39 deletions

@ -7,8 +7,8 @@ Contains information about stack traces for fatal errors. The table does not exi
Columns:
- `event_date` ([Datetime](../../sql-reference/data-types/datetime.md)) — Date of the event.
- `event_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — Time of the event.
- `event_date` ([DateTime](../../sql-reference/data-types/datetime.md)) — Date of the event.
- `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Time of the event.
- `timestamp_ns` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Timestamp of the event with nanoseconds.
- `signal` ([Int32](../../sql-reference/data-types/int-uint.md)) — Signal number.
- `thread_id` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Thread ID.

@ -15,7 +15,7 @@ Columns:
- `command` ([String](/docs/en/sql-reference/data-types/string.md)) — The mutation command string (the part of the query after `ALTER TABLE [db.]table`).
- `create_time` ([Datetime](/docs/en/sql-reference/data-types/datetime.md)) — Date and time when the mutation command was submitted for execution.
- `create_time` ([DateTime](/docs/en/sql-reference/data-types/datetime.md)) — Date and time when the mutation command was submitted for execution.
- `block_numbers.partition_id` ([Array](/docs/en/sql-reference/data-types/array.md)([String](/docs/en/sql-reference/data-types/string.md))) — For mutations of replicated tables, the array contains the partitions' IDs (one record for each partition). For mutations of non-replicated tables the array is empty.
@ -39,7 +39,7 @@ If there were problems with mutating some data parts, the following columns cont
- `latest_failed_part` ([String](/docs/en/sql-reference/data-types/string.md)) — The name of the most recent part that could not be mutated.
- `latest_fail_time` ([Datetime](/docs/en/sql-reference/data-types/datetime.md)) — The date and time of the most recent part mutation failure.
- `latest_fail_time` ([DateTime](/docs/en/sql-reference/data-types/datetime.md)) — The date and time of the most recent part mutation failure.
- `latest_fail_reason` ([String](/docs/en/sql-reference/data-types/string.md)) — The exception message that caused the most recent part mutation failure.

@ -29,7 +29,7 @@ Columns:
- `MUTATE_PART` — Apply one or several mutations to the part.
- `ALTER_METADATA` — Apply alter modification according to global /metadata and /columns paths.
- `create_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — Date and time when the task was submitted for execution.
- `create_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Date and time when the task was submitted for execution.
- `required_quorum` ([UInt32](../../sql-reference/data-types/int-uint.md)) — The number of replicas waiting for the task to complete with confirmation of completion. This column is only relevant for the `GET_PARTS` task.
@ -47,13 +47,13 @@ Columns:
- `last_exception` ([String](../../sql-reference/data-types/string.md)) — Text message about the last error that occurred (if any).
- `last_attempt_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — Date and time when the task was last attempted.
- `last_attempt_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Date and time when the task was last attempted.
- `num_postponed` ([UInt32](../../sql-reference/data-types/int-uint.md)) — The number of postponed tasks.
- `postpone_reason` ([String](../../sql-reference/data-types/string.md)) — The reason why the task was postponed.
- `last_postpone_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — Date and time when the task was last postponed.
- `last_postpone_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Date and time when the task was last postponed.
- `merge_type` ([String](../../sql-reference/data-types/string.md)) — Type of the current merge. Empty if it's a mutation.

@ -6,7 +6,7 @@ sidebar_label: Date32
# Date32
A date. Supports the date range same with [Datetime64](../../sql-reference/data-types/datetime64.md). Stored in four bytes as the number of days since 1900-01-01. Allows storing values till 2299-12-31.
A date. Supports the date range same with [DateTime64](../../sql-reference/data-types/datetime64.md). Stored in four bytes as the number of days since 1900-01-01. Allows storing values till 2299-12-31.
**Examples**

@ -4,7 +4,7 @@ sidebar_position: 48
sidebar_label: DateTime
---
# Datetime
# DateTime
Allows to store an instant in time, that can be expressed as a calendar date and a time of a day.

@ -4,7 +4,7 @@ sidebar_position: 49
sidebar_label: DateTime64
---
# Datetime64
# DateTime64
Allows to store an instant in time, that can be expressed as a calendar date and a time of a day, with defined sub-second precision

@ -550,7 +550,7 @@ Alias: `dateTrunc`.
- Value, truncated to the specified part of date.
Type: [Datetime](../../sql-reference/data-types/datetime.md).
Type: [DateTime](../../sql-reference/data-types/datetime.md).
**Example**
@ -881,7 +881,7 @@ now([timezone])
- Current date and time.
Type: [Datetime](../../sql-reference/data-types/datetime.md).
Type: [DateTime](../../sql-reference/data-types/datetime.md).
**Example**
@ -932,7 +932,7 @@ now64([scale], [timezone])
- Current date and time with sub-second precision.
Type: [Datetime64](../../sql-reference/data-types/datetime64.md).
Type: [DateTime64](../../sql-reference/data-types/datetime64.md).
**Example**
@ -968,7 +968,7 @@ nowInBlock([timezone])
- Current date and time at the moment of processing of each block of data.
Type: [Datetime](../../sql-reference/data-types/datetime.md).
Type: [DateTime](../../sql-reference/data-types/datetime.md).
**Example**

@ -2,9 +2,134 @@
slug: /en/sql-reference/statements/alter/projection
sidebar_position: 49
sidebar_label: PROJECTION
title: "Manipulating Projections"
title: "Projections"
---
Projections store data in a format that optimizes query execution. This feature is useful for:
- Running queries on a column that is not a part of the primary key
- Pre-aggregating columns, which reduces both computation and I/O
You can define one or more projections for a table. During query analysis, ClickHouse selects the projection with the least data to scan, without modifying the query provided by the user.
## Example filtering without using primary keys
Creating the table:
```
CREATE TABLE visits_order
(
`user_id` UInt64,
`user_name` String,
`pages_visited` Nullable(Float64),
`user_agent` String
)
ENGINE = MergeTree()
PRIMARY KEY user_agent
```
Using `ALTER TABLE`, we can add the projection to an existing table:
```
ALTER TABLE visits_order ADD PROJECTION user_name_projection (
SELECT
*
ORDER BY user_name
);
ALTER TABLE visits_order MATERIALIZE PROJECTION user_name_projection;
```
Inserting the data:
```
INSERT INTO visits_order SELECT
number,
'test',
1.5 * (number / 2),
'Android'
FROM numbers(1, 100);
```
The projection allows us to filter by `user_name` quickly, even though `user_name` was not defined as the `PRIMARY KEY` in the original table.
At query time, ClickHouse determines that less data will be processed if the projection is used, because the data is ordered by `user_name`.
```
SELECT
*
FROM visits_order
WHERE user_name='test'
LIMIT 2
```
To verify that a query used the projection, we can review the `system.query_log` table. The `projections` field contains the name of the projection used, or is empty if none was used:
```
SELECT query, projections FROM system.query_log WHERE query_id='<query_id>'
```
## Example pre-aggregation query
Creating the table with the projection:
```
CREATE TABLE visits
(
`user_id` UInt64,
`user_name` String,
`pages_visited` Nullable(Float64),
`user_agent` String,
PROJECTION projection_visits_by_user
(
SELECT
user_agent,
sum(pages_visited)
GROUP BY user_id, user_agent
)
)
ENGINE = MergeTree()
ORDER BY user_agent
```
Inserting the data:
```
INSERT INTO visits SELECT
number,
'test',
1.5 * (number / 2),
'Android'
FROM numbers(1, 100);
```
```
INSERT INTO visits SELECT
number,
'test',
1. * (number / 2),
'IOS'
FROM numbers(100, 500);
```
First, we execute a query that uses `GROUP BY` on the field `user_agent`. This query will not use the projection, because the pre-aggregation does not match.
```
SELECT
user_agent,
count(DISTINCT user_id)
FROM visits
GROUP BY user_agent
```
To use the projection, we can execute queries that select some or all of the pre-aggregation and `GROUP BY` fields.
```
SELECT
user_agent
FROM visits
WHERE user_id > 50 AND user_id < 150
GROUP BY user_agent
```
```
SELECT
user_agent,
sum(pages_visited)
FROM visits
GROUP BY user_agent
```
As mentioned before, we can review the `system.query_log` table. The `projections` field contains the name of the projection used, or is empty if none was used:
```
SELECT query, projections FROM system.query_log WHERE query_id='<query_id>'
```
# Manipulating Projections
The following operations with [projections](/docs/en/engines/table-engines/mergetree-family/mergetree.md/#projections) are available:
## ADD PROJECTION

@ -7,8 +7,8 @@ slug: /ru/operations/system-tables/crash-log
Колонки:
- `event_date` ([Datetime](../../sql-reference/data-types/datetime.md)) — Дата события.
- `event_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — Время события.
- `event_date` ([DateTime](../../sql-reference/data-types/datetime.md)) — Дата события.
- `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Время события.
- `timestamp_ns` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Время события с наносекундами.
- `signal` ([Int32](../../sql-reference/data-types/int-uint.md)) — Номер сигнала, пришедшего в поток.
- `thread_id` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Идентификатор треда.

@ -15,7 +15,7 @@ slug: /ru/operations/system-tables/mutations
- `command` ([String](../../sql-reference/data-types/string.md)) — команда мутации (часть запроса после `ALTER TABLE [db.]table`).
- `create_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — дата и время создания мутации.
- `create_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — дата и время создания мутации.
- `block_numbers.partition_id` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — Для мутаций реплицированных таблиц массив содержит содержит номера партиций (по одной записи для каждой партиции). Для мутаций нереплицированных таблиц массив пустой.
@ -39,7 +39,7 @@ slug: /ru/operations/system-tables/mutations
- `latest_failed_part` ([String](../../sql-reference/data-types/string.md)) — имя последнего куска, мутация которого не удалась.
- `latest_fail_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — дата и время последней ошибки мутации.
- `latest_fail_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — дата и время последней ошибки мутации.
- `latest_fail_reason` ([String](../../sql-reference/data-types/string.md)) — причина последней ошибки мутации.

@ -29,7 +29,7 @@ slug: /ru/operations/system-tables/replication_queue
- `MUTATE_PART` — применить одну или несколько мутаций к куску.
- `ALTER_METADATA` — применить изменения структуры таблицы в результате запросов с выражением `ALTER`.
- `create_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — дата и время отправки задачи на выполнение.
- `create_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — дата и время отправки задачи на выполнение.
- `required_quorum` ([UInt32](../../sql-reference/data-types/int-uint.md)) — количество реплик, ожидающих завершения задачи, с подтверждением о завершении. Этот столбец актуален только для задачи `GET_PARTS`.
@ -47,13 +47,13 @@ slug: /ru/operations/system-tables/replication_queue
- `last_exception` ([String](../../sql-reference/data-types/string.md)) — текст сообщения о последней возникшей ошибке, если таковые имеются.
- `last_attempt_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — дата и время последней попытки выполнить задачу.
- `last_attempt_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — дата и время последней попытки выполнить задачу.
- `num_postponed` ([UInt32](../../sql-reference/data-types/int-uint.md)) — количество отложенных задач.
- `postpone_reason` ([String](../../sql-reference/data-types/string.md)) — причина, по которой была отложена задача.
- `last_postpone_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — дата и время, когда была отложена задача в последний раз.
- `last_postpone_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — дата и время, когда была отложена задача в последний раз.
- `merge_type` ([String](../../sql-reference/data-types/string.md)) — тип текущего слияния. Пусто, если это мутация.

@ -6,7 +6,7 @@ sidebar_label: Date32
# Date32 {#data_type-datetime32}
Дата. Поддерживается такой же диапазон дат, как для типа [Datetime64](../../sql-reference/data-types/datetime64.md). Значение хранится в четырех байтах и соответствует числу дней с 1900-01-01 по 2299-12-31.
Дата. Поддерживается такой же диапазон дат, как для типа [DateTime64](../../sql-reference/data-types/datetime64.md). Значение хранится в четырех байтах и соответствует числу дней с 1900-01-01 по 2299-12-31.
**Пример**

@ -602,7 +602,7 @@ date_trunc(unit, value[, timezone])
- Дата и время, отсеченные до указанной части.
Тип: [Datetime](../../sql-reference/data-types/datetime.md).
Тип: [DateTime](../../sql-reference/data-types/datetime.md).
**Примеры**
@ -913,7 +913,7 @@ now([timezone])
- Текущие дата и время.
Тип: [Datetime](../../sql-reference/data-types/datetime.md).
Тип: [DateTime](../../sql-reference/data-types/datetime.md).
**Пример**

@ -7,8 +7,8 @@ slug: /zh/operations/system-tables/crash-log
列信息:
- `event_date` ([Datetime](../../sql-reference/data-types/datetime.md)) — 事件日期.
- `event_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — 事件时间.
- `event_date` ([DateTime](../../sql-reference/data-types/datetime.md)) — 事件日期.
- `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — 事件时间.
- `timestamp_ns` ([UInt64](../../sql-reference/data-types/int-uint.md)) — 以纳秒为单位的事件时间戳.
- `signal` ([Int32](../../sql-reference/data-types/int-uint.md)) — 信号编号.
- `thread_id` ([UInt64](../../sql-reference/data-types/int-uint.md)) — 线程ID.

@ -15,7 +15,7 @@ slug: /zh/operations/system-tables/mutations
- `command` ([String](../../sql-reference/data-types/string.md)) — mutation命令字符串`ALTER TABLE [db.]table`语句之后的部分)。
- `create_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — mutation命令提交执行的日期和时间。
- `create_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — mutation命令提交执行的日期和时间。
- `block_numbers.partition_id` ([Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md))) — 对于复制表的mutation该数组包含分区的ID每个分区都有一条记录。对于非复制表的mutation该数组为空。
@ -39,7 +39,7 @@ slug: /zh/operations/system-tables/mutations
- `latest_failed_part`([String](../../sql-reference/data-types/string.md)) — 最近不能mutation的part的名称。
- `latest_fail_time`([Datetime](../../sql-reference/data-types/datetime.md)) — 最近的一个mutation失败的时间。
- `latest_fail_time`([DateTime](../../sql-reference/data-types/datetime.md)) — 最近的一个mutation失败的时间。
- `latest_fail_reason`([String](../../sql-reference/data-types/string.md)) — 导致最近part的mutation失败的异常消息。

@ -29,7 +29,7 @@ slug: /zh/operations/system-tables/replication_queue
- `MUTATE_PART` — 对分片应用一个或多个突变.
- `ALTER_METADATA` — 根据全局 /metadata 和 /columns 路径应用alter修改.
- `create_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — 提交任务执行的日期和时间.
- `create_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — 提交任务执行的日期和时间.
- `required_quorum` ([UInt32](../../sql-reference/data-types/int-uint.md)) — 等待任务完成并确认完成的副本数. 此列仅与 `GET_PARTS` 任务相关.
@ -47,13 +47,13 @@ slug: /zh/operations/system-tables/replication_queue
- `last_exception` ([String](../../sql-reference/data-types/string.md)) — 发生的最后一个错误的短信(如果有).
- `last_attempt_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — 上次尝试任务的日期和时间.
- `last_attempt_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — 上次尝试任务的日期和时间.
- `num_postponed` ([UInt32](../../sql-reference/data-types/int-uint.md)) — 延期任务数.
- `postpone_reason` ([String](../../sql-reference/data-types/string.md)) — 任务延期的原因.
- `last_postpone_time` ([Datetime](../../sql-reference/data-types/datetime.md)) — 上次推迟任务的日期和时间.
- `last_postpone_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — 上次推迟任务的日期和时间.
- `merge_type` ([String](../../sql-reference/data-types/string.md)) — 当前合并的类型. 如果是突变则为空.

@ -152,7 +152,7 @@ sidebar_label: "ANSI\u517C\u5BB9\u6027"
| F051-02 | TIME时间数据类型并支持用于表达时间的字面量小数秒精度至少为0 | 否 {.text-danger} | |
| F051-03 | 时间戳数据类型并支持用于表达时间戳的字面量小数秒精度至少为0和6 | 是 {.text-danger} | |
| F051-04 | 日期、时间和时间戳数据类型的比较谓词 | 是 {.text-success} | |
| F051-05 | Datetime 类型和字符串形式表达的时间之间的显式转换 | 是 {.text-success} | |
| F051-05 | DateTime 类型和字符串形式表达的时间之间的显式转换 | 是 {.text-success} | |
| F051-06 | CURRENT_DATE | 否 {.text-danger} | 使用`today()`替代 |
| F051-07 | LOCALTIME | 否 {.text-danger} | 使用`now()`替代 |
| F051-08 | LOCALTIMESTAMP | 否 {.text-danger} | |

@ -6,7 +6,7 @@ sidebar_position: 49
sidebar_label: DateTime64
---
# Datetime64 {#data_type-datetime64}
# DateTime64 {#data_type-datetime64}
此类型允许以日期date加时间time的形式来存储一个时刻的时间值具有定义的亚秒精度

@ -539,7 +539,7 @@ date_trunc(unit, value[, timezone])
- 按指定的单位向前取整后的DateTime。
类型: [Datetime](../../sql-reference/data-types/datetime.md).
类型: [DateTime](../../sql-reference/data-types/datetime.md).
**示例**
@ -850,7 +850,7 @@ now([timezone])
- 当前日期和时间。
类型: [Datetime](../../sql-reference/data-types/datetime.md).
类型: [DateTime](../../sql-reference/data-types/datetime.md).
**示例**

@ -1628,6 +1628,14 @@ void ClientBase::processParsedSingleQuery(const String & full_query, const Strin
global_context->applySettingChange(change);
}
global_context->resetSettingsToDefaultValue(set_query->default_settings);
/// Query parameters inside SET queries should be also saved on the client side
/// to override their previous definitions set with --param_* arguments
/// and for substitutions to work inside INSERT ... VALUES queries
for (const auto & [name, value] : set_query->query_parameters)
query_parameters.insert_or_assign(name, value);
global_context->addQueryParameters(set_query->query_parameters);
}
if (const auto * use_query = parsed_query->as<ASTUseQuery>())
{
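
Note on the hunk above: the added loop uses `insert_or_assign`, so a parameter defined by a `SET` query replaces an earlier value with the same name rather than being ignored. The following is a minimal standalone sketch of that behavior, assuming the client-side storage behaves like a `std::map<std::string, std::string>`. The alias `QueryParameters` and the helpers `applySetParameters` and `exampleOverride` are hypothetical names used only for illustration, not the client's actual declarations.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for the client-side parameter storage.
// The real container type is not shown in the diff.
using QueryParameters = std::map<std::string, std::string>;

// Merge parameters coming from a SET query into the stored ones.
// insert_or_assign overwrites an existing key; plain insert() would
// keep the old value and silently drop the new one.
void applySetParameters(QueryParameters & stored, const QueryParameters & from_set)
{
    for (const auto & [name, value] : from_set)
        stored.insert_or_assign(name, value);
}

// Usage: a value defined earlier (for example on the command line)
// is replaced by the value from a later SET query.
QueryParameters exampleOverride()
{
    QueryParameters stored{{"x", "0"}};        // as if given as --param_x=0
    applySetParameters(stored, {{"x", "1"}});  // as if given as SET param_x = 1
    assert(stored.at("x") == "1");             // the SET value wins
    return stored;
}
```

With plain `insert`, the earlier value `"0"` would survive, which is presumably why the change does not use it.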

@ -182,6 +182,7 @@ void ReplicatedMergeTreeAttachThread::runImpl()
storage.createNewZooKeeperNodes();
storage.syncPinnedPartUUIDs();
std::lock_guard lock(storage.table_shared_id_mutex);
storage.createTableSharedID();
};

@ -7609,8 +7609,6 @@ std::unique_ptr<MergeTreeSettings> StorageReplicatedMergeTree::getDefaultSetting
String StorageReplicatedMergeTree::getTableSharedID() const
{
/// Lock is not required in other places because createTableSharedID()
/// can be called only during table initialization
std::lock_guard lock(table_shared_id_mutex);
/// Can happen if table was partially initialized before drop by DatabaseCatalog
@ -7637,8 +7635,12 @@ String StorageReplicatedMergeTree::getTableSharedID() const
void StorageReplicatedMergeTree::createTableSharedID() const
{
LOG_DEBUG(log, "Creating shared ID for table {}", getStorageID().getNameForLogs());
// can be set by the call to getTableSharedID
if (table_shared_id != UUIDHelpers::Nil)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Table shared id already initialized");
{
LOG_INFO(log, "Shared ID already set to {}", table_shared_id);
return;
}
auto zookeeper = getZooKeeper();
String zookeeper_table_id_path = fs::path(zookeeper_path) / "table_shared_id";
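
Note on the hunk above: `createTableSharedID()` now returns early when the ID is already set instead of throwing `LOGICAL_ERROR`, which makes a repeated call harmless. A minimal sketch of that idempotent-initialization pattern follows. The class `SharedIdHolder` and its members are hypothetical, an empty string stands in for `UUIDHelpers::Nil`, and the lock is taken inside the method for brevity, whereas in the diff the caller in `runImpl()` takes `table_shared_id_mutex` before calling.

```cpp
#include <mutex>
#include <string>

// Hypothetical, simplified stand-in for a storage object that creates
// its shared ID at most once.
class SharedIdHolder
{
public:
    // Returns true if this call set the ID, false if it was already set.
    bool createSharedId(const std::string & candidate)
    {
        std::lock_guard<std::mutex> lock(mutex);
        if (!shared_id.empty())
            return false;  // already initialized: keep the existing value, do not throw
        shared_id = candidate;
        return true;
    }

    // Reads the ID under the same mutex, so it never observes a half-written value.
    std::string getSharedId() const
    {
        std::lock_guard<std::mutex> lock(mutex);
        return shared_id;
    }

private:
    mutable std::mutex mutex;  // mutable so that the const getter can lock it
    std::string shared_id;     // empty plays the role of a nil UUID in this sketch
};
```

A second call to `createSharedId` leaves the first value in place and reports `false`, mirroring the early `return` that replaces the exception.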

@ -0,0 +1,8 @@
DROP TABLE IF EXISTS 02476_query_parameters_insert;
CREATE TABLE 02476_query_parameters_insert (x Int32) ENGINE=MergeTree() ORDER BY tuple();
SET param_x = 1;
INSERT INTO 02476_query_parameters_insert VALUES ({x: Int32});
SELECT * FROM 02476_query_parameters_insert;
DROP TABLE 02476_query_parameters_insert;