Merge branch 'master' into better_remove_empty_parts

mergify[bot] 2021-06-30 08:42:26 +00:00 committed by GitHub
commit 8182b6e890
51 changed files with 434 additions and 242 deletions

View File

@ -0,0 +1,3 @@
wget 'https://builds.clickhouse.tech/master/freebsd/clickhouse'
chmod a+x ./clickhouse
sudo ./clickhouse install

View File

@ -0,0 +1,3 @@
wget 'https://builds.clickhouse.tech/master/macos-aarch64/clickhouse'
chmod a+x ./clickhouse
./clickhouse

View File

@ -0,0 +1,3 @@
wget 'https://builds.clickhouse.tech/master/macos/clickhouse'
chmod a+x ./clickhouse
./clickhouse

View File

@ -107,9 +107,10 @@ sudo ./clickhouse install
For non-Linux operating systems and for the AArch64 CPU architecture, ClickHouse builds are provided as a cross-compiled binary from the latest commit of the `master` branch (with a few hours' delay).
- [macOS](https://builds.clickhouse.tech/master/macos/clickhouse) — `curl -O 'https://builds.clickhouse.tech/master/macos/clickhouse' && chmod a+x ./clickhouse`
- [FreeBSD](https://builds.clickhouse.tech/master/freebsd/clickhouse) — `curl -O 'https://builds.clickhouse.tech/master/freebsd/clickhouse' && chmod a+x ./clickhouse`
- [AArch64](https://builds.clickhouse.tech/master/aarch64/clickhouse) — `curl -O 'https://builds.clickhouse.tech/master/aarch64/clickhouse' && chmod a+x ./clickhouse`
- [MacOS x86_64](https://builds.clickhouse.tech/master/macos/clickhouse) — `curl -O 'https://builds.clickhouse.tech/master/macos/clickhouse' && chmod a+x ./clickhouse`
- [MacOS Aarch64 (Apple Silicon)](https://builds.clickhouse.tech/master/macos-aarch64/clickhouse) — `curl -O 'https://builds.clickhouse.tech/master/macos-aarch64/clickhouse' && chmod a+x ./clickhouse`
- [FreeBSD x86_64](https://builds.clickhouse.tech/master/freebsd/clickhouse) — `curl -O 'https://builds.clickhouse.tech/master/freebsd/clickhouse' && chmod a+x ./clickhouse`
- [Linux AArch64](https://builds.clickhouse.tech/master/aarch64/clickhouse) — `curl -O 'https://builds.clickhouse.tech/master/aarch64/clickhouse' && chmod a+x ./clickhouse`
After downloading, you can use the `clickhouse client` to connect to the server, or `clickhouse local` to process local data.
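
As a quick smoke test (a sketch; `data.csv` is a hypothetical local file), you can run a query through `clickhouse local`, e.g. `./clickhouse local --query "<query>"` with:

``` sql
-- Hypothetical CSV file; file() reads it with the given format and structure.
SELECT count(*), max(value)
FROM file('data.csv', 'CSV', 'id UInt64, value Float64');
```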

View File

@ -154,5 +154,6 @@ toc_title: Adopters
| <a href="https://www.hydrolix.io/" class="favicon">Hydrolix</a> | Cloud data platform | Main product | — | — | [Documentation](https://docs.hydrolix.io/guide/query) |
| <a href="https://www.argedor.com/en/clickhouse/" class="favicon">Argedor</a> | ClickHouse support | — | — | — | [Official website](https://www.argedor.com/en/clickhouse/) |
| <a href="https://signoz.io/" class="favicon">SigNoz</a> | Observability Platform | Main Product | — | — | [Source code](https://github.com/SigNoz/signoz) |
| <a href="https://chelpipegroup.com/" class="favicon">ChelPipe Group</a> | Analytics | — | — | — | [Blog post, June 2021](https://vc.ru/trade/253172-tyazhelomu-proizvodstvu-user-friendly-sayt-internet-magazin-trub-dlya-chtpz) |
[Original article](https://clickhouse.tech/docs/en/introduction/adopters/) <!--hide-->

View File

@ -4,14 +4,14 @@ The `median*` functions are the aliases for the corresponding `quantile*` functi
Functions:
- `median` — Alias for [quantile](#quantile).
- `medianDeterministic` — Alias for [quantileDeterministic](#quantiledeterministic).
- `medianExact` — Alias for [quantileExact](#quantileexact).
- `medianExactWeighted` — Alias for [quantileExactWeighted](#quantileexactweighted).
- `medianTiming` — Alias for [quantileTiming](#quantiletiming).
- `medianTimingWeighted` — Alias for [quantileTimingWeighted](#quantiletimingweighted).
- `medianTDigest` — Alias for [quantileTDigest](#quantiletdigest).
- `medianTDigestWeighted` — Alias for [quantileTDigestWeighted](#quantiletdigestweighted).
- `median` — Alias for [quantile](../../../sql-reference/aggregate-functions/reference/quantile.md#quantile).
- `medianDeterministic` — Alias for [quantileDeterministic](../../../sql-reference/aggregate-functions/reference/quantiledeterministic.md#quantiledeterministic).
- `medianExact` — Alias for [quantileExact](../../../sql-reference/aggregate-functions/reference/quantileexact.md#quantileexact).
- `medianExactWeighted` — Alias for [quantileExactWeighted](../../../sql-reference/aggregate-functions/reference/quantileexactweighted.md#quantileexactweighted).
- `medianTiming` — Alias for [quantileTiming](../../../sql-reference/aggregate-functions/reference/quantiletiming.md#quantiletiming).
- `medianTimingWeighted` — Alias for [quantileTimingWeighted](../../../sql-reference/aggregate-functions/reference/quantiletimingweighted.md#quantiletimingweighted).
- `medianTDigest` — Alias for [quantileTDigest](../../../sql-reference/aggregate-functions/reference/quantiletdigest.md#quantiletdigest).
- `medianTDigestWeighted` — Alias for [quantileTDigestWeighted](../../../sql-reference/aggregate-functions/reference/quantiletdigestweighted.md#quantiletdigestweighted).
**Example**

View File

@ -0,0 +1,48 @@
---
toc_priority: 55
toc_title: s3Cluster
---
# s3Cluster Table Function {#s3Cluster-table-function}
Allows processing files from [Amazon S3](https://aws.amazon.com/s3/) in parallel from many nodes in a specified cluster. On the initiator, it creates a connection to all nodes in the cluster, expands the asterisks in the S3 file path, and dispatches each file dynamically. On a worker node, it asks the initiator for the next task to process and processes it. This is repeated until all tasks are finished.
**Syntax**
``` sql
s3Cluster(cluster_name, source, [access_key_id, secret_access_key,] format, structure)
```
**Arguments**
- `cluster_name` — Name of a cluster that is used to build a set of addresses and connection parameters to remote and local servers.
- `source` — URL of a file or of several files. Supports the following wildcards in read-only mode: `*`, `?`, `{'abc','def'}` and `{N..M}`, where `N`, `M` are numbers and `abc`, `def` are strings. For more information, see [Wildcards In Path](../../engines/table-engines/integrations/s3.md#wildcards-in-path).
- `access_key_id` and `secret_access_key` — Keys that specify the credentials to use with the given endpoint. Optional.
- `format` — The [format](../../interfaces/formats.md#formats) of the file.
- `structure` — Structure of the table. Format `'column1_name column1_type, column2_name column2_type, ...'`.
**Returned value**
A table with the specified structure for reading or writing data in the specified file.
**Examples**
Select the data from all files in the cluster `cluster_simple`:
``` sql
SELECT * FROM s3Cluster('cluster_simple', 'http://minio1:9001/root/data/{clickhouse,database}/*', 'minio', 'minio123', 'CSV', 'name String, value UInt32, polygon Array(Array(Tuple(Float64, Float64)))') ORDER BY (name, value, polygon);
```
Count the total number of rows in all files in the cluster `cluster_simple`:
``` sql
SELECT count(*) FROM s3Cluster('cluster_simple', 'http://minio1:9001/root/data/{clickhouse,database}/*', 'minio', 'minio123', 'CSV', 'name String, value UInt32, polygon Array(Array(Tuple(Float64, Float64)))');
```
!!! warning "Warning"
If your listing of files contains number ranges with leading zeros, use the construction with braces for each digit separately or use `?`.
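For instance (a hedged sketch assuming files named `data-001.csv` through `data-120.csv`): a range like `{001..120}` would be affected by the leading zeros, so expand each digit separately, or use `?`:

``` sql
SELECT count(*) FROM s3Cluster('cluster_simple',
    'http://minio1:9001/root/data/data-{0..1}{0..9}{0..9}.csv',
    'minio', 'minio123', 'CSV', 'name String, value UInt32');
```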
**See Also**
- [S3 engine](../../engines/table-engines/integrations/s3.md)
- [s3 table function](../../sql-reference/table-functions/s3.md)

View File

@ -0,0 +1,48 @@
---
toc_priority: 55
toc_title: s3Cluster
---
# s3Cluster Table Function {#s3Cluster-table-function}
Allows processing files from [Amazon S3](https://aws.amazon.com/s3/) in parallel from many nodes in a specified cluster. On the initiator, the function creates a connection to all nodes in the cluster, expands the '*' characters in the S3 file path, and dispatches each file dynamically. On a worker node, the function asks the initiator for the next task and processes it. This is repeated until all tasks are finished.
**Syntax**
``` sql
s3Cluster(cluster_name, source, [access_key_id, secret_access_key,] format, structure)
```
**Arguments**
- `cluster_name` — Name of the cluster used to build a set of addresses and connection parameters to remote and local servers.
- `source` — URL of a file or of several files. Supports the following wildcards: `*`, `?`, `{'abc','def'}` and `{N..M}`, where `N`, `M` are numbers and `abc`, `def` are strings. For details, see [Wildcards In Path](../../engines/table-engines/integrations/s3.md#wildcards-in-path).
- `access_key_id` and `secret_access_key` — Keys that specify the credentials to use with the given endpoint. Optional.
- `format` — The [format](../../interfaces/formats.md#formats) of the file.
- `structure` — Structure of the table. Format: `'column1_name column1_type, column2_name column2_type, ...'`.
**Returned value**
A table with the specified structure for reading or writing data in the specified file.
**Examples**
Select the data from all files in the cluster `cluster_simple`:
``` sql
SELECT * FROM s3Cluster('cluster_simple', 'http://minio1:9001/root/data/{clickhouse,database}/*', 'minio', 'minio123', 'CSV', 'name String, value UInt32, polygon Array(Array(Tuple(Float64, Float64)))') ORDER BY (name, value, polygon);
```
Count the total number of rows in all files in the cluster `cluster_simple`:
``` sql
SELECT count(*) FROM s3Cluster('cluster_simple', 'http://minio1:9001/root/data/{clickhouse,database}/*', 'minio', 'minio123', 'CSV', 'name String, value UInt32, polygon Array(Array(Tuple(Float64, Float64)))');
```
!!! warning "Warning"
    If the list of files contains number ranges with leading zeros, use the brace construction for each digit separately, or use `?`.
**See Also**
- [S3 engine](../../engines/table-engines/integrations/s3.md)
- [s3 table function](../../sql-reference/table-functions/s3.md)

View File

@ -154,9 +154,6 @@ def build(args):
if not args.skip_website:
website.build_website(args)
if not args.skip_test_templates:
test.test_templates(args.website_dir)
if not args.skip_docs:
generate_cmake_flags_files()
@ -197,7 +194,6 @@ if __name__ == '__main__':
arg_parser.add_argument('--skip-blog', action='store_true')
arg_parser.add_argument('--skip-git-log', action='store_true')
arg_parser.add_argument('--skip-docs', action='store_true')
arg_parser.add_argument('--skip-test-templates', action='store_true')
arg_parser.add_argument('--test-only', action='store_true')
arg_parser.add_argument('--minify', action='store_true')
arg_parser.add_argument('--htmlproofer', action='store_true')

View File

@ -7,36 +7,6 @@ import bs4
import subprocess
def test_template(template_path):
if template_path.endswith('amp.html'):
# Inline CSS/JS is ok for AMP pages
return
logging.debug(f'Running tests for {template_path} template')
with open(template_path, 'r') as f:
soup = bs4.BeautifulSoup(
f,
features='html.parser'
)
for tag in soup.find_all():
style_attr = tag.attrs.get('style')
assert not style_attr, f'Inline CSS is prohibited, found {style_attr} in {template_path}'
if tag.name == 'script':
if tag.attrs.get('type') == 'application/ld+json':
continue
for content in tag.contents:
assert not content, f'Inline JavaScript is prohibited, found "{content}" in {template_path}'
def test_templates(base_dir):
logging.info('Running tests for templates')
for root, _, filenames in os.walk(base_dir):
for filename in filenames:
if filename.endswith('.html'):
test_template(os.path.join(root, filename))
def test_single_page(input_path, lang):
with open(input_path) as f:
soup = bs4.BeautifulSoup(

View File

@ -54,7 +54,7 @@ SELECT * FROM file_engine_table
## Usage in clickhouse-local {#zai-clickhouse-local-zhong-de-shi-yong}
When using [clickhouse-local](../../../engines/table-engines/special/file.md), the File engine accepts a file path argument in addition to `Format`. Standard input/output streams can be specified using numbers or human-readable names, such as `0` or `stdin`, `1` or `stdout`.
When using [clickhouse-local](../../../operations/utilities/clickhouse-local.md), the File engine accepts a file path argument in addition to `Format`. Standard input/output streams can be specified using numbers or names, such as `0` or `stdin`, `1` or `stdout`.
**For example:**
``` bash

View File

@ -16,7 +16,7 @@ machine_translated_rev: 5decc73b5dc60054f19087d3690c4eb99446a6c3
**示例**
``` sql
SELECT * FROM system.data_type_families WHERE alias_to = 'String'
SELECT * FROM system.data_type_families WHERE alias_to = 'String';
```
``` text

View File

@ -288,21 +288,35 @@ DataTypePtr getLeastSupertype(const DataTypes & types)
ErrorCodes::NO_COMMON_TYPE);
if (have_datetime64 == 0)
{
for (const auto & type : types)
{
if (isDateTime(type))
return type;
}
return std::make_shared<DataTypeDateTime>();
}
UInt8 max_scale = 0;
size_t max_scale_date_time_index = 0;
for (const auto & t : types)
for (size_t i = 0; i < types.size(); ++i)
{
if (const auto * dt64 = typeid_cast<const DataTypeDateTime64 *>(t.get()))
const auto & type = types[i];
if (const auto * date_time64_type = typeid_cast<const DataTypeDateTime64 *>(type.get()))
{
const auto scale = dt64->getScale();
if (scale > max_scale)
const auto scale = date_time64_type->getScale();
if (scale >= max_scale)
{
max_scale_date_time_index = i;
max_scale = scale;
}
}
}
return std::make_shared<DataTypeDateTime64>(max_scale);
return types[max_scale_date_time_index];
}
}
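
The user-visible effect of this change, mirrored by the tests added later in this commit, is that the least supertype now keeps the timezone and the largest scale of the DateTime/DateTime64 source types instead of producing a timezone-less type:

``` sql
SELECT toTypeName([toDate('2000-01-01'), toDateTime('2000-01-01', 'Europe/Moscow')]);
-- Array(DateTime('Europe/Moscow'))
SELECT toTypeName([toDateTime64('2000-01-01', 5, 'Europe/Moscow'), toDateTime64('2000-01-01', 6, 'Europe/Moscow')]);
-- Array(DateTime64(6, 'Europe/Moscow'))
```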

View File

@ -4,6 +4,7 @@
#include <Core/DecimalFunctions.h>
#include <Common/Exception.h>
#include <common/DateLUTImpl.h>
#include <common/DateLUT.h>
#include <Columns/ColumnVector.h>
#include <Columns/ColumnDecimal.h>
#include <Functions/FunctionHelpers.h>
@ -863,19 +864,27 @@ struct DateTimeTransformImpl
{
using Op = Transformer<typename FromDataType::FieldType, typename ToDataType::FieldType, Transform>;
size_t time_zone_argument_position = 1;
if constexpr (std::is_same_v<ToDataType, DataTypeDateTime64>)
time_zone_argument_position = 2;
const DateLUTImpl & time_zone = extractTimeZoneFromFunctionArguments(arguments, time_zone_argument_position, 0);
const ColumnPtr source_col = arguments[0].column;
if (const auto * sources = checkAndGetColumn<typename FromDataType::ColumnType>(source_col.get()))
{
auto mutable_result_col = result_type->createColumn();
auto * col_to = assert_cast<typename ToDataType::ColumnType *>(mutable_result_col.get());
Op::vector(sources->getData(), col_to->getData(), time_zone, transform);
WhichDataType result_data_type(result_type);
if (result_data_type.isDateTime() || result_data_type.isDateTime64())
{
const auto & time_zone = dynamic_cast<const TimezoneMixin &>(*result_type).getTimeZone();
Op::vector(sources->getData(), col_to->getData(), time_zone, transform);
}
else
{
size_t time_zone_argument_position = 1;
if constexpr (std::is_same_v<ToDataType, DataTypeDateTime64>)
time_zone_argument_position = 2;
const DateLUTImpl & time_zone = extractTimeZoneFromFunctionArguments(arguments, time_zone_argument_position, 0);
Op::vector(sources->getData(), col_to->getData(), time_zone, transform);
}
return mutable_result_col;
}
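
The observable effect, matching the cast tests added in this commit: when the result type itself carries a timezone, the conversion now takes the timezone from the result type rather than from a function argument:

``` sql
SELECT CAST(toDate('2000-01-01') AS DateTime('UTC')) AS x, toTypeName(x);
-- 2000-01-01 00:00:00    DateTime('UTC')
```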

View File

@ -15,7 +15,7 @@ ADDINCL(
contrib/libs/libdivide
contrib/libs/rapidjson/include
contrib/libs/xxhash
contrib/restricted/murmurhash
GLOBAL contrib/restricted/murmurhash
)
PEERDIR(

View File

@ -14,7 +14,7 @@ ADDINCL(
contrib/libs/libdivide
contrib/libs/rapidjson/include
contrib/libs/xxhash
contrib/restricted/murmurhash
GLOBAL contrib/restricted/murmurhash
)
PEERDIR(

View File

@ -457,6 +457,14 @@ struct ContextSharedPart
{
auto lock = std::lock_guard(mutex);
/** Compiled expressions stored in the cache need to be destroyed before the destruction of static objects,
* because the CHJIT instance can be a static object.
*/
#if USE_EMBEDDED_COMPILER
if (auto * cache = CompiledExpressionCacheFactory::instance().tryGetCache())
cache->reset();
#endif
/// Preemptive destruction is important, because these objects may have a refcount to ContextShared (cyclic reference).
/// TODO: Get rid of this.

View File

@ -1504,7 +1504,7 @@ ExpressionAnalysisResult::ExpressionAnalysisResult(
if (auto actions = query_analyzer.appendPrewhere(chain, !first_stage, additional_required_columns_after_prewhere))
{
prewhere_info = std::make_shared<PrewhereDAGInfo>(actions, query.prewhere()->getColumnName(settings));
prewhere_info = std::make_shared<PrewhereInfo>(actions, query.prewhere()->getColumnName(settings));
if (allowEarlyConstantFolding(*prewhere_info->prewhere_actions, settings))
{
@ -1725,7 +1725,6 @@ void ExpressionAnalysisResult::checkActions() const
check_actions(prewhere_info->prewhere_actions);
check_actions(prewhere_info->alias_actions);
check_actions(prewhere_info->remove_columns_actions);
}
}

View File

@ -239,7 +239,7 @@ struct ExpressionAnalysisResult
/// Columns will be removed after prewhere actions execution.
NameSet columns_to_remove_after_prewhere;
PrewhereDAGInfoPtr prewhere_info;
PrewhereInfoPtr prewhere_info;
FilterDAGInfoPtr filter_info;
ConstantFilterDescription prewhere_constant_filter_description;
ConstantFilterDescription where_constant_filter_description;

View File

@ -204,6 +204,7 @@ HashJoin::HashJoin(std::shared_ptr<TableJoin> table_join_, const Block & right_s
if (table_join->dictionary_reader)
{
LOG_DEBUG(log, "Performing join over dict");
data->type = Type::DICT;
std::get<MapsOne>(data->maps).create(Type::DICT);
chooseMethod(key_columns, key_sizes); /// init key_sizes
@ -319,30 +320,23 @@ public:
using Mapped = RowRef;
using FindResult = ColumnsHashing::columns_hashing_impl::FindResultImpl<Mapped, true>;
KeyGetterForDict(const ColumnRawPtrs & key_columns_, const Sizes &, void *)
: key_columns(key_columns_)
{}
FindResult findKey(const TableJoin & table_join, size_t row, const Arena &)
KeyGetterForDict(const TableJoin & table_join, const ColumnRawPtrs & key_columns)
{
const DictionaryReader & reader = *table_join.dictionary_reader;
if (!read_result)
{
reader.readKeys(*key_columns[0], read_result, found, positions);
result.block = &read_result;
table_join.dictionary_reader->readKeys(*key_columns[0], read_result, found, positions);
if (table_join.forceNullableRight())
for (auto & column : read_result)
if (table_join.rightBecomeNullable(column.type))
JoinCommon::convertColumnToNullable(column);
}
for (ColumnWithTypeAndName & column : read_result)
if (table_join.rightBecomeNullable(column.type))
JoinCommon::convertColumnToNullable(column);
}
FindResult findKey(void *, size_t row, const Arena &)
{
result.block = &read_result;
result.row_num = positions[row];
return FindResult(&result, found[row], 0);
}
private:
const ColumnRawPtrs & key_columns;
Block read_result;
Mapped result;
ColumnVector<UInt8>::Container found;
@ -851,6 +845,7 @@ void setUsed(IColumn::Filter & filter [[maybe_unused]], size_t pos [[maybe_unuse
/// Makes filter (1 if row presented in right table) and returns offsets to replicate (for ALL JOINS).
template <ASTTableJoin::Kind KIND, ASTTableJoin::Strictness STRICTNESS, typename KeyGetter, typename Map, bool need_filter, bool has_null_map>
NO_INLINE IColumn::Filter joinRightColumns(
KeyGetter && key_getter,
const Map & map,
AddedColumns & added_columns,
const ConstNullMapPtr & null_map [[maybe_unused]],
@ -880,8 +875,6 @@ NO_INLINE IColumn::Filter joinRightColumns(
if constexpr (need_replication)
added_columns.offsets_to_replicate = std::make_unique<IColumn::Offsets>(rows);
auto key_getter = createKeyGetter<KeyGetter, is_asof_join>(added_columns.key_columns, added_columns.key_sizes);
IColumn::Offset current_offset = 0;
for (size_t i = 0; i < rows; ++i)
@ -980,35 +973,51 @@ NO_INLINE IColumn::Filter joinRightColumns(
template <ASTTableJoin::Kind KIND, ASTTableJoin::Strictness STRICTNESS, typename KeyGetter, typename Map>
IColumn::Filter joinRightColumnsSwitchNullability(
const Map & map, AddedColumns & added_columns, const ConstNullMapPtr & null_map, JoinStuff::JoinUsedFlags & used_flags)
KeyGetter && key_getter,
const Map & map,
AddedColumns & added_columns,
const ConstNullMapPtr & null_map,
JoinStuff::JoinUsedFlags & used_flags)
{
if (added_columns.need_filter)
{
if (null_map)
return joinRightColumns<KIND, STRICTNESS, KeyGetter, Map, true, true>(map, added_columns, null_map, used_flags);
return joinRightColumns<KIND, STRICTNESS, KeyGetter, Map, true, true>(
std::forward<KeyGetter>(key_getter), map, added_columns, null_map, used_flags);
else
return joinRightColumns<KIND, STRICTNESS, KeyGetter, Map, true, false>(map, added_columns, nullptr, used_flags);
return joinRightColumns<KIND, STRICTNESS, KeyGetter, Map, true, false>(
std::forward<KeyGetter>(key_getter), map, added_columns, nullptr, used_flags);
}
else
{
if (null_map)
return joinRightColumns<KIND, STRICTNESS, KeyGetter, Map, false, true>(map, added_columns, null_map, used_flags);
return joinRightColumns<KIND, STRICTNESS, KeyGetter, Map, false, true>(
std::forward<KeyGetter>(key_getter), map, added_columns, null_map, used_flags);
else
return joinRightColumns<KIND, STRICTNESS, KeyGetter, Map, false, false>(map, added_columns, nullptr, used_flags);
return joinRightColumns<KIND, STRICTNESS, KeyGetter, Map, false, false>(
std::forward<KeyGetter>(key_getter), map, added_columns, nullptr, used_flags);
}
}
template <ASTTableJoin::Kind KIND, ASTTableJoin::Strictness STRICTNESS, typename Maps>
IColumn::Filter switchJoinRightColumns(
const Maps & maps_, AddedColumns & added_columns, HashJoin::Type type, const ConstNullMapPtr & null_map, JoinStuff::JoinUsedFlags & used_flags)
const Maps & maps_,
AddedColumns & added_columns,
HashJoin::Type type,
const ConstNullMapPtr & null_map,
JoinStuff::JoinUsedFlags & used_flags)
{
constexpr bool is_asof_join = STRICTNESS == ASTTableJoin::Strictness::Asof;
switch (type)
{
#define M(TYPE) \
case HashJoin::Type::TYPE: \
return joinRightColumnsSwitchNullability<KIND, STRICTNESS,\
typename KeyGetterForType<HashJoin::Type::TYPE, const std::remove_reference_t<decltype(*maps_.TYPE)>>::Type>(\
*maps_.TYPE, added_columns, null_map, used_flags);
{ \
using KeyGetter = typename KeyGetterForType<HashJoin::Type::TYPE, const std::remove_reference_t<decltype(*maps_.TYPE)>>::Type; \
auto key_getter = createKeyGetter<KeyGetter, is_asof_join>(added_columns.key_columns, added_columns.key_sizes); \
return joinRightColumnsSwitchNullability<KIND, STRICTNESS, KeyGetter>( \
std::move(key_getter), *maps_.TYPE, added_columns, null_map, used_flags); \
}
APPLY_FOR_JOIN_VARIANTS(M)
#undef M
@ -1025,8 +1034,12 @@ IColumn::Filter dictionaryJoinRightColumns(const TableJoin & table_join, AddedCo
STRICTNESS == ASTTableJoin::Strictness::Semi ||
STRICTNESS == ASTTableJoin::Strictness::Anti))
{
assert(added_columns.key_columns.size() == 1);
JoinStuff::JoinUsedFlags flags;
return joinRightColumnsSwitchNullability<KIND, STRICTNESS, KeyGetterForDict>(table_join, added_columns, null_map, flags);
KeyGetterForDict key_getter(table_join, added_columns.key_columns);
return joinRightColumnsSwitchNullability<KIND, STRICTNESS, KeyGetterForDict>(
std::move(key_getter), nullptr, added_columns, null_map, flags);
}
throw Exception("Logical error: wrong JOIN combination", ErrorCodes::LOGICAL_ERROR);
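
The refactored path above serves joins whose right-hand side is a dictionary ("Performing join over dict"); the performance test added later in this commit exercises it with essentially this query:

``` sql
-- Join keys are looked up directly in the dictionary instead of building a hash table.
SELECT count()
FROM join_dictionary_source_table
JOIN join_hashed_dictionary
    ON join_dictionary_source_table.key = join_hashed_dictionary.key;
```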

View File

@ -958,11 +958,11 @@ void InterpreterSelectQuery::executeImpl(QueryPlan & query_plan, const BlockInpu
if (expressions.prewhere_info)
{
if (expressions.prewhere_info->row_level_filter_actions)
if (expressions.prewhere_info->row_level_filter)
{
auto row_level_filter_step = std::make_unique<FilterStep>(
query_plan.getCurrentDataStream(),
expressions.prewhere_info->row_level_filter_actions,
expressions.prewhere_info->row_level_filter,
expressions.prewhere_info->row_level_column_name,
false);
@ -978,18 +978,6 @@ void InterpreterSelectQuery::executeImpl(QueryPlan & query_plan, const BlockInpu
prewhere_step->setStepDescription("PREWHERE");
query_plan.addStep(std::move(prewhere_step));
// To remove additional columns in dry run
// For example, sample column which can be removed in this stage
// TODO There seems to be no place initializing remove_columns_actions
if (expressions.prewhere_info->remove_columns_actions)
{
auto remove_columns = std::make_unique<ExpressionStep>(
query_plan.getCurrentDataStream(), expressions.prewhere_info->remove_columns_actions);
remove_columns->setStepDescription("Remove unnecessary columns after PREWHERE");
query_plan.addStep(std::move(remove_columns));
}
}
}
else
@ -1479,33 +1467,29 @@ void InterpreterSelectQuery::addEmptySourceToQueryPlan(
if (prewhere_info.alias_actions)
{
pipe.addSimpleTransform(
[&](const Block & header) { return std::make_shared<ExpressionTransform>(header, prewhere_info.alias_actions); });
pipe.addSimpleTransform([&](const Block & header)
{
return std::make_shared<ExpressionTransform>(header,
std::make_shared<ExpressionActions>(prewhere_info.alias_actions));
});
}
if (prewhere_info.row_level_filter)
{
pipe.addSimpleTransform([&](const Block & header)
{
return std::make_shared<FilterTransform>(header, prewhere_info.row_level_filter, prewhere_info.row_level_column_name, true);
return std::make_shared<FilterTransform>(header,
std::make_shared<ExpressionActions>(prewhere_info.row_level_filter),
prewhere_info.row_level_column_name, true);
});
}
pipe.addSimpleTransform([&](const Block & header)
{
return std::make_shared<FilterTransform>(
header, prewhere_info.prewhere_actions, prewhere_info.prewhere_column_name, prewhere_info.remove_prewhere_column);
header, std::make_shared<ExpressionActions>(prewhere_info.prewhere_actions),
prewhere_info.prewhere_column_name, prewhere_info.remove_prewhere_column);
});
// To remove additional columns
// In some cases, we did not read any marks so that the pipeline.streams is empty
// Thus, some columns in prewhere are not removed as expected
// This leads to mismatched header in distributed table
if (prewhere_info.remove_columns_actions)
{
pipe.addSimpleTransform(
[&](const Block & header) { return std::make_shared<ExpressionTransform>(header, prewhere_info.remove_columns_actions); });
}
}
auto read_from_pipe = std::make_unique<ReadFromPreparedSource>(std::move(pipe));
@ -1560,7 +1544,7 @@ void InterpreterSelectQuery::addPrewhereAliasActions()
if (does_storage_support_prewhere && settings.optimize_move_to_prewhere)
{
/// Execute row level filter in prewhere as a part of "move to prewhere" optimization.
expressions.prewhere_info = std::make_shared<PrewhereDAGInfo>(
expressions.prewhere_info = std::make_shared<PrewhereInfo>(
std::move(expressions.filter_info->actions),
std::move(expressions.filter_info->column_name));
expressions.prewhere_info->prewhere_actions->projectInput(false);
@ -1572,9 +1556,9 @@ void InterpreterSelectQuery::addPrewhereAliasActions()
else
{
/// Add row level security actions to prewhere.
expressions.prewhere_info->row_level_filter_actions = std::move(expressions.filter_info->actions);
expressions.prewhere_info->row_level_filter = std::move(expressions.filter_info->actions);
expressions.prewhere_info->row_level_column_name = std::move(expressions.filter_info->column_name);
expressions.prewhere_info->row_level_filter_actions->projectInput(false);
expressions.prewhere_info->row_level_filter->projectInput(false);
expressions.filter_info = nullptr;
}
}
@ -1613,9 +1597,9 @@ void InterpreterSelectQuery::addPrewhereAliasActions()
auto prewhere_required_columns = prewhere_info->prewhere_actions->getRequiredColumns().getNames();
required_columns_from_prewhere.insert(prewhere_required_columns.begin(), prewhere_required_columns.end());
if (prewhere_info->row_level_filter_actions)
if (prewhere_info->row_level_filter)
{
auto row_level_required_columns = prewhere_info->row_level_filter_actions->getRequiredColumns().getNames();
auto row_level_required_columns = prewhere_info->row_level_filter->getRequiredColumns().getNames();
required_columns_from_prewhere.insert(row_level_required_columns.begin(), row_level_required_columns.end());
}
}
@ -1898,28 +1882,7 @@ void InterpreterSelectQuery::executeFetchColumns(QueryProcessingStage::Enum proc
auto & prewhere_info = analysis_result.prewhere_info;
if (prewhere_info)
{
auto actions_settings = ExpressionActionsSettings::fromContext(context, CompileExpressions::yes);
query_info.prewhere_info = std::make_shared<PrewhereInfo>();
query_info.prewhere_info->prewhere_actions
= std::make_shared<ExpressionActions>(prewhere_info->prewhere_actions, actions_settings);
if (prewhere_info->row_level_filter_actions)
query_info.prewhere_info->row_level_filter
= std::make_shared<ExpressionActions>(prewhere_info->row_level_filter_actions, actions_settings);
if (prewhere_info->alias_actions)
query_info.prewhere_info->alias_actions
= std::make_shared<ExpressionActions>(prewhere_info->alias_actions, actions_settings);
if (prewhere_info->remove_columns_actions)
query_info.prewhere_info->remove_columns_actions
= std::make_shared<ExpressionActions>(prewhere_info->remove_columns_actions, actions_settings);
query_info.prewhere_info->prewhere_column_name = prewhere_info->prewhere_column_name;
query_info.prewhere_info->remove_prewhere_column = prewhere_info->remove_prewhere_column;
query_info.prewhere_info->row_level_column_name = prewhere_info->row_level_column_name;
query_info.prewhere_info->need_filter = prewhere_info->need_filter;
}
query_info.prewhere_info = prewhere_info;
/// Create optimizer with prepared actions.
/// Maybe we will need to calc input_order_info later, e.g. while reading from StorageMerge.

View File

@ -98,12 +98,12 @@ Block getHeaderForProcessingStage(
if (prewhere_info.row_level_filter)
{
prewhere_info.row_level_filter->execute(header);
header = prewhere_info.row_level_filter->updateHeader(std::move(header));
header.erase(prewhere_info.row_level_column_name);
}
if (prewhere_info.prewhere_actions)
prewhere_info.prewhere_actions->execute(header);
header = prewhere_info.prewhere_actions->updateHeader(std::move(header));
if (prewhere_info.remove_prewhere_column)
header.erase(prewhere_info.prewhere_column_name);

View File

@ -94,6 +94,7 @@ ReadFromMergeTree::ReadFromMergeTree(
, data(data_)
, query_info(query_info_)
, prewhere_info(getPrewhereInfo(query_info))
, actions_settings(ExpressionActionsSettings::fromContext(context_))
, metadata_snapshot(std::move(metadata_snapshot_))
, metadata_snapshot_base(std::move(metadata_snapshot_base_))
, context(std::move(context_))
@ -157,7 +158,7 @@ Pipe ReadFromMergeTree::readFromPool(
i, pool, min_marks_for_concurrent_read, max_block_size,
settings.preferred_block_size_bytes, settings.preferred_max_column_in_block_size_bytes,
data, metadata_snapshot, use_uncompressed_cache,
prewhere_info, reader_settings, virt_column_names);
prewhere_info, actions_settings, reader_settings, virt_column_names);
if (i == 0)
{
@ -180,7 +181,7 @@ ProcessorPtr ReadFromMergeTree::createSource(
return std::make_shared<TSource>(
data, metadata_snapshot, part.data_part, max_block_size, preferred_block_size_bytes,
preferred_max_column_in_block_size_bytes, required_columns, part.ranges, use_uncompressed_cache,
prewhere_info, true, reader_settings, virt_column_names, part.part_index_in_query);
prewhere_info, actions_settings, true, reader_settings, virt_column_names, part.part_index_in_query);
}
Pipe ReadFromMergeTree::readInOrder(

View File

@ -90,6 +90,7 @@ private:
const MergeTreeData & data;
SelectQueryInfo query_info;
PrewhereInfoPtr prewhere_info;
ExpressionActionsSettings actions_settings;
StorageMetadataPtr metadata_snapshot;
StorageMetadataPtr metadata_snapshot_base;

View File

@ -198,7 +198,7 @@ NameDependencies IStorage::getDependentViewsByColumn(ContextPtr context) const
return name_deps;
}
std::string PrewhereDAGInfo::dump() const
std::string PrewhereInfo::dump() const
{
WriteBufferFromOwnString ss;
ss << "PrewhereDagInfo\n";
@ -213,11 +213,6 @@ std::string PrewhereDAGInfo::dump() const
ss << "prewhere_actions " << prewhere_actions->dumpDAG() << "\n";
}
if (remove_columns_actions)
{
ss << "remove_columns_actions " << remove_columns_actions->dumpDAG() << "\n";
}
ss << "remove_prewhere_column " << remove_prewhere_column
<< ", need_filter " << need_filter << "\n";

View File

@ -27,6 +27,7 @@ MergeTreeBaseSelectProcessor::MergeTreeBaseSelectProcessor(
const MergeTreeData & storage_,
const StorageMetadataPtr & metadata_snapshot_,
const PrewhereInfoPtr & prewhere_info_,
ExpressionActionsSettings actions_settings,
UInt64 max_block_size_rows_,
UInt64 preferred_block_size_bytes_,
UInt64 preferred_max_column_in_block_size_bytes_,
@ -50,6 +51,23 @@ MergeTreeBaseSelectProcessor::MergeTreeBaseSelectProcessor(
for (auto it = virt_column_names.rbegin(); it != virt_column_names.rend(); ++it)
if (header_without_virtual_columns.has(*it))
header_without_virtual_columns.erase(*it);
if (prewhere_info)
{
prewhere_actions = std::make_unique<PrewhereExprInfo>();
if (prewhere_info->alias_actions)
prewhere_actions->alias_actions = std::make_shared<ExpressionActions>(prewhere_info->alias_actions, actions_settings);
if (prewhere_info->row_level_filter)
prewhere_actions->row_level_filter = std::make_shared<ExpressionActions>(prewhere_info->row_level_filter, actions_settings);
prewhere_actions->prewhere_actions = std::make_shared<ExpressionActions>(prewhere_info->prewhere_actions, actions_settings);
prewhere_actions->row_level_column_name = prewhere_info->row_level_column_name;
prewhere_actions->prewhere_column_name = prewhere_info->prewhere_column_name;
prewhere_actions->remove_prewhere_column = prewhere_info->remove_prewhere_column;
prewhere_actions->need_filter = prewhere_info->need_filter;
}
}
@ -79,14 +97,14 @@ void MergeTreeBaseSelectProcessor::initializeRangeReaders(MergeTreeReadTask & cu
{
if (reader->getColumns().empty())
{
current_task.range_reader = MergeTreeRangeReader(pre_reader.get(), nullptr, prewhere_info, true);
current_task.range_reader = MergeTreeRangeReader(pre_reader.get(), nullptr, prewhere_actions.get(), true);
}
else
{
MergeTreeRangeReader * pre_reader_ptr = nullptr;
if (pre_reader != nullptr)
{
current_task.pre_range_reader = MergeTreeRangeReader(pre_reader.get(), nullptr, prewhere_info, false);
current_task.pre_range_reader = MergeTreeRangeReader(pre_reader.get(), nullptr, prewhere_actions.get(), false);
pre_reader_ptr = &current_task.pre_range_reader;
}
@ -397,16 +415,17 @@ void MergeTreeBaseSelectProcessor::injectVirtualColumns(
chunk.setColumns(columns, num_rows);
}
void MergeTreeBaseSelectProcessor::executePrewhereActions(Block & block, const PrewhereInfoPtr & prewhere_info)
Block MergeTreeBaseSelectProcessor::transformHeader(
Block block, const PrewhereInfoPtr & prewhere_info, const DataTypePtr & partition_value_type, const Names & virtual_columns)
{
if (prewhere_info)
{
if (prewhere_info->alias_actions)
prewhere_info->alias_actions->execute(block);
block = prewhere_info->alias_actions->updateHeader(std::move(block));
if (prewhere_info->row_level_filter)
{
prewhere_info->row_level_filter->execute(block);
block = prewhere_info->row_level_filter->updateHeader(std::move(block));
auto & row_level_column = block.getByName(prewhere_info->row_level_column_name);
if (!row_level_column.type->canBeUsedInBooleanContext())
{
@ -418,7 +437,7 @@ void MergeTreeBaseSelectProcessor::executePrewhereActions(Block & block, const P
}
if (prewhere_info->prewhere_actions)
prewhere_info->prewhere_actions->execute(block);
block = prewhere_info->prewhere_actions->updateHeader(std::move(block));
auto & prewhere_column = block.getByName(prewhere_info->prewhere_column_name);
if (!prewhere_column.type->canBeUsedInBooleanContext())
@ -441,12 +460,7 @@ void MergeTreeBaseSelectProcessor::executePrewhereActions(Block & block, const P
ErrorCodes::ILLEGAL_TYPE_OF_COLUMN_FOR_FILTER);
}
}
}
Block MergeTreeBaseSelectProcessor::transformHeader(
Block block, const PrewhereInfoPtr & prewhere_info, const DataTypePtr & partition_value_type, const Names & virtual_columns)
{
executePrewhereActions(block, prewhere_info);
injectVirtualColumns(block, nullptr, partition_value_type, virtual_columns);
return block;
}

View File

@ -13,7 +13,7 @@ namespace DB
class IMergeTreeReader;
class UncompressedCache;
class MarkCache;
struct PrewhereExprInfo;
/// Base class for MergeTreeThreadSelectProcessor and MergeTreeSelectProcessor
class MergeTreeBaseSelectProcessor : public SourceWithProgress
@ -24,6 +24,7 @@ public:
const MergeTreeData & storage_,
const StorageMetadataPtr & metadata_snapshot_,
const PrewhereInfoPtr & prewhere_info_,
ExpressionActionsSettings actions_settings,
UInt64 max_block_size_rows_,
UInt64 preferred_block_size_bytes_,
UInt64 preferred_max_column_in_block_size_bytes_,
@ -36,8 +37,6 @@ public:
static Block transformHeader(
Block block, const PrewhereInfoPtr & prewhere_info, const DataTypePtr & partition_value_type, const Names & virtual_columns);
static void executePrewhereActions(Block & block, const PrewhereInfoPtr & prewhere_info);
protected:
Chunk generate() final;
@ -61,6 +60,7 @@ protected:
StorageMetadataPtr metadata_snapshot;
PrewhereInfoPtr prewhere_info;
std::unique_ptr<PrewhereExprInfo> prewhere_actions;
UInt64 max_block_size_rows;
UInt64 preferred_block_size_bytes;

View File

@ -272,16 +272,16 @@ MergeTreeReadTaskColumns getReadTaskColumns(
if (prewhere_info)
{
if (prewhere_info->alias_actions)
pre_column_names = prewhere_info->alias_actions->getRequiredColumns();
pre_column_names = prewhere_info->alias_actions->getRequiredColumnsNames();
else
{
pre_column_names = prewhere_info->prewhere_actions->getRequiredColumns();
pre_column_names = prewhere_info->prewhere_actions->getRequiredColumnsNames();
if (prewhere_info->row_level_filter)
{
NameSet names(pre_column_names.begin(), pre_column_names.end());
for (auto & name : prewhere_info->row_level_filter->getRequiredColumns())
for (auto & name : prewhere_info->row_level_filter->getRequiredColumnsNames())
{
if (names.count(name) == 0)
pre_column_names.push_back(name);

View File

@ -3945,15 +3945,9 @@ bool MergeTreeData::getQueryProcessingStageWithAggregateProjection(
if (analysis_result.prewhere_info)
{
const auto & prewhere_info = analysis_result.prewhere_info;
candidate.prewhere_info = std::make_shared<PrewhereInfo>();
candidate.prewhere_info->prewhere_column_name = prewhere_info->prewhere_column_name;
candidate.prewhere_info->remove_prewhere_column = prewhere_info->remove_prewhere_column;
// std::cerr << fmt::format("remove prewhere column : {}", candidate.prewhere_info->remove_prewhere_column) << std::endl;
candidate.prewhere_info->row_level_column_name = prewhere_info->row_level_column_name;
candidate.prewhere_info->need_filter = prewhere_info->need_filter;
candidate.prewhere_info = analysis_result.prewhere_info;
auto prewhere_actions = prewhere_info->prewhere_actions->clone();
auto prewhere_actions = candidate.prewhere_info->prewhere_actions->clone();
auto prewhere_required_columns = required_columns;
// required_columns should not contain columns generated by prewhere
for (const auto & column : prewhere_actions->getResultColumns())
@ -3961,28 +3955,27 @@ bool MergeTreeData::getQueryProcessingStageWithAggregateProjection(
// std::cerr << fmt::format("prewhere_actions = \n{}", prewhere_actions->dumpDAG()) << std::endl;
// Prewhere_action should not add missing keys.
prewhere_required_columns = prewhere_actions->foldActionsByProjection(
prewhere_required_columns, projection.sample_block_for_keys, prewhere_info->prewhere_column_name, false);
prewhere_required_columns, projection.sample_block_for_keys, candidate.prewhere_info->prewhere_column_name, false);
// std::cerr << fmt::format("prewhere_actions = \n{}", prewhere_actions->dumpDAG()) << std::endl;
// std::cerr << fmt::format("prewhere_required_columns = \n{}", fmt::join(prewhere_required_columns, ", ")) << std::endl;
if (prewhere_required_columns.empty())
return false;
candidate.prewhere_info->prewhere_actions = std::make_shared<ExpressionActions>(prewhere_actions, actions_settings);
candidate.prewhere_info->prewhere_actions = prewhere_actions;
if (prewhere_info->row_level_filter_actions)
if (candidate.prewhere_info->row_level_filter)
{
auto row_level_filter_actions = prewhere_info->row_level_filter_actions->clone();
auto row_level_filter_actions = candidate.prewhere_info->row_level_filter->clone();
prewhere_required_columns = row_level_filter_actions->foldActionsByProjection(
prewhere_required_columns, projection.sample_block_for_keys, prewhere_info->row_level_column_name, false);
prewhere_required_columns, projection.sample_block_for_keys, candidate.prewhere_info->row_level_column_name, false);
// std::cerr << fmt::format("row_level_filter_required_columns = \n{}", fmt::join(prewhere_required_columns, ", ")) << std::endl;
if (prewhere_required_columns.empty())
return false;
candidate.prewhere_info->row_level_filter
= std::make_shared<ExpressionActions>(row_level_filter_actions, actions_settings);
candidate.prewhere_info->row_level_filter = row_level_filter_actions;
}
if (prewhere_info->alias_actions)
if (candidate.prewhere_info->alias_actions)
{
auto alias_actions = prewhere_info->alias_actions->clone();
auto alias_actions = candidate.prewhere_info->alias_actions->clone();
// std::cerr << fmt::format("alias_actions = \n{}", alias_actions->dumpDAG()) << std::endl;
prewhere_required_columns
= alias_actions->foldActionsByProjection(prewhere_required_columns, projection.sample_block_for_keys, {}, false);
@ -3990,7 +3983,7 @@ bool MergeTreeData::getQueryProcessingStageWithAggregateProjection(
// std::cerr << fmt::format("alias_required_columns = \n{}", fmt::join(prewhere_required_columns, ", ")) << std::endl;
if (prewhere_required_columns.empty())
return false;
candidate.prewhere_info->alias_actions = std::make_shared<ExpressionActions>(alias_actions, actions_settings);
candidate.prewhere_info->alias_actions = alias_actions;
}
required_columns.insert(prewhere_required_columns.begin(), prewhere_required_columns.end());
}

View File

@ -520,7 +520,7 @@ size_t MergeTreeRangeReader::ReadResult::countBytesInResultFilter(const IColumn:
MergeTreeRangeReader::MergeTreeRangeReader(
IMergeTreeReader * merge_tree_reader_,
MergeTreeRangeReader * prev_reader_,
const PrewhereInfoPtr & prewhere_info_,
const PrewhereExprInfo * prewhere_info_,
bool last_reader_in_chain_)
: merge_tree_reader(merge_tree_reader_)
, index_granularity(&(merge_tree_reader->data_part->index_granularity))

View File

@ -15,6 +15,25 @@ class MergeTreeIndexGranularity;
struct PrewhereInfo;
using PrewhereInfoPtr = std::shared_ptr<PrewhereInfo>;
class ExpressionActions;
using ExpressionActionsPtr = std::shared_ptr<ExpressionActions>;
/// The same as PrewhereInfo, but with ExpressionActions instead of ActionsDAG
struct PrewhereExprInfo
{
/// Actions which are executed to compute the alias columns used by the prewhere actions.
ExpressionActionsPtr alias_actions;
/// Actions for row level security filter. Applied separately before prewhere_actions.
/// These actions are separate because the prewhere condition should not be executed over filtered rows.
ExpressionActionsPtr row_level_filter;
/// Actions which are executed on block in order to get filter column for prewhere step.
ExpressionActionsPtr prewhere_actions;
String row_level_column_name;
String prewhere_column_name;
bool remove_prewhere_column = false;
bool need_filter = false;
};
/// MergeTreeReader iterator which allows sequential reading for arbitrary number of rows between pairs of marks in the same part.
/// Stores reading state, which can be inside granule. Can skip rows in current granule and start reading from next mark.
/// Used generally for reading number of rows less than index granularity to decrease cache misses for fat blocks.
@ -24,7 +43,7 @@ public:
MergeTreeRangeReader(
IMergeTreeReader * merge_tree_reader_,
MergeTreeRangeReader * prev_reader_,
const PrewhereInfoPtr & prewhere_info_,
const PrewhereExprInfo * prewhere_info_,
bool last_reader_in_chain_);
MergeTreeRangeReader() = default;
@ -217,7 +236,7 @@ private:
IMergeTreeReader * merge_tree_reader = nullptr;
const MergeTreeIndexGranularity * index_granularity = nullptr;
MergeTreeRangeReader * prev_reader = nullptr; /// If not nullptr, read from prev_reader firstly.
PrewhereInfoPtr prewhere_info;
const PrewhereExprInfo * prewhere_info;
Stream stream;

View File

@ -23,6 +23,7 @@ MergeTreeReverseSelectProcessor::MergeTreeReverseSelectProcessor(
MarkRanges mark_ranges_,
bool use_uncompressed_cache_,
const PrewhereInfoPtr & prewhere_info_,
ExpressionActionsSettings actions_settings,
bool check_columns,
const MergeTreeReaderSettings & reader_settings_,
const Names & virt_column_names_,
@ -31,7 +32,7 @@ MergeTreeReverseSelectProcessor::MergeTreeReverseSelectProcessor(
:
MergeTreeBaseSelectProcessor{
metadata_snapshot_->getSampleBlockForColumns(required_columns_, storage_.getVirtuals(), storage_.getStorageID()),
storage_, metadata_snapshot_, prewhere_info_, max_block_size_rows_,
storage_, metadata_snapshot_, prewhere_info_, std::move(actions_settings), max_block_size_rows_,
preferred_block_size_bytes_, preferred_max_column_in_block_size_bytes_,
reader_settings_, use_uncompressed_cache_, virt_column_names_},
required_columns{std::move(required_columns_)},

View File

@ -27,6 +27,7 @@ public:
MarkRanges mark_ranges,
bool use_uncompressed_cache,
const PrewhereInfoPtr & prewhere_info,
ExpressionActionsSettings actions_settings,
bool check_columns,
const MergeTreeReaderSettings & reader_settings,
const Names & virt_column_names = {},

View File

@ -23,6 +23,7 @@ MergeTreeSelectProcessor::MergeTreeSelectProcessor(
MarkRanges mark_ranges_,
bool use_uncompressed_cache_,
const PrewhereInfoPtr & prewhere_info_,
ExpressionActionsSettings actions_settings,
bool check_columns_,
const MergeTreeReaderSettings & reader_settings_,
const Names & virt_column_names_,
@ -31,7 +32,7 @@ MergeTreeSelectProcessor::MergeTreeSelectProcessor(
:
MergeTreeBaseSelectProcessor{
metadata_snapshot_->getSampleBlockForColumns(required_columns_, storage_.getVirtuals(), storage_.getStorageID()),
storage_, metadata_snapshot_, prewhere_info_, max_block_size_rows_,
storage_, metadata_snapshot_, prewhere_info_, std::move(actions_settings), max_block_size_rows_,
preferred_block_size_bytes_, preferred_max_column_in_block_size_bytes_,
reader_settings_, use_uncompressed_cache_, virt_column_names_},
required_columns{std::move(required_columns_)},

View File

@ -27,6 +27,7 @@ public:
MarkRanges mark_ranges,
bool use_uncompressed_cache,
const PrewhereInfoPtr & prewhere_info,
ExpressionActionsSettings actions_settings,
bool check_columns,
const MergeTreeReaderSettings & reader_settings,
const Names & virt_column_names = {},

View File

@ -19,11 +19,12 @@ MergeTreeThreadSelectBlockInputProcessor::MergeTreeThreadSelectBlockInputProcess
const StorageMetadataPtr & metadata_snapshot_,
const bool use_uncompressed_cache_,
const PrewhereInfoPtr & prewhere_info_,
ExpressionActionsSettings actions_settings,
const MergeTreeReaderSettings & reader_settings_,
const Names & virt_column_names_)
:
MergeTreeBaseSelectProcessor{
pool_->getHeader(), storage_, metadata_snapshot_, prewhere_info_, max_block_size_rows_,
pool_->getHeader(), storage_, metadata_snapshot_, prewhere_info_, std::move(actions_settings), max_block_size_rows_,
preferred_block_size_bytes_, preferred_max_column_in_block_size_bytes_,
reader_settings_, use_uncompressed_cache_, virt_column_names_},
thread{thread_},

View File

@ -25,7 +25,9 @@ public:
const StorageMetadataPtr & metadata_snapshot_,
const bool use_uncompressed_cache_,
const PrewhereInfoPtr & prewhere_info_,
ExpressionActionsSettings actions_settings,
const MergeTreeReaderSettings & reader_settings_,
const Names & virt_column_names_);
String getName() const override { return "MergeTreeThread"; }

View File

@ -21,9 +21,6 @@ using ActionsDAGPtr = std::shared_ptr<ActionsDAG>;
struct PrewhereInfo;
using PrewhereInfoPtr = std::shared_ptr<PrewhereInfo>;
struct PrewhereDAGInfo;
using PrewhereDAGInfoPtr = std::shared_ptr<PrewhereDAGInfo>;
struct FilterInfo;
using FilterInfoPtr = std::shared_ptr<FilterInfo>;
@ -45,34 +42,19 @@ using ClusterPtr = std::shared_ptr<Cluster>;
struct PrewhereInfo
{
/// Actions which are executed to compute the alias columns used by the prewhere actions.
ExpressionActionsPtr alias_actions;
ActionsDAGPtr alias_actions;
/// Actions for row level security filter. Applied separately before prewhere_actions.
/// These actions are separate because the prewhere condition should not be executed over filtered rows.
ExpressionActionsPtr row_level_filter;
ActionsDAGPtr row_level_filter;
/// Actions which are executed on block in order to get filter column for prewhere step.
ExpressionActionsPtr prewhere_actions;
/// Actions which are executed after reading from storage in order to remove unused columns.
ExpressionActionsPtr remove_columns_actions;
String row_level_column_name;
String prewhere_column_name;
bool remove_prewhere_column = false;
bool need_filter = false;
};
/// Same as PrewhereInfo, but with ActionsDAG.
struct PrewhereDAGInfo
{
ActionsDAGPtr alias_actions;
ActionsDAGPtr row_level_filter_actions;
ActionsDAGPtr prewhere_actions;
ActionsDAGPtr remove_columns_actions;
String row_level_column_name;
String prewhere_column_name;
bool remove_prewhere_column = false;
bool need_filter = false;
PrewhereDAGInfo() = default;
explicit PrewhereDAGInfo(ActionsDAGPtr prewhere_actions_, String prewhere_column_name_)
PrewhereInfo() = default;
explicit PrewhereInfo(ActionsDAGPtr prewhere_actions_, String prewhere_column_name_)
: prewhere_actions(std::move(prewhere_actions_)), prewhere_column_name(std::move(prewhere_column_name_)) {}
std::string dump() const;
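
For orientation, a minimal sketch (hypothetical table and columns) of the query shape these PREWHERE structures describe: the condition is evaluated first on the few columns it needs, and the remaining columns are read only for rows that pass:

``` sql
-- Only `key` is read to evaluate the filter; the heavier `payload`
-- column is read just for the surviving rows.
SELECT payload FROM hits PREWHERE key = 42;
```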

View File

@ -369,13 +369,14 @@ void StorageBuffer::read(
{
if (query_info.prewhere_info)
{
auto actions_settings = ExpressionActionsSettings::fromContext(local_context);
if (query_info.prewhere_info->alias_actions)
{
pipe_from_buffers.addSimpleTransform([&](const Block & header)
{
return std::make_shared<ExpressionTransform>(
header,
query_info.prewhere_info->alias_actions);
std::make_shared<ExpressionActions>(query_info.prewhere_info->alias_actions, actions_settings));
});
}
@ -385,7 +386,7 @@ void StorageBuffer::read(
{
return std::make_shared<FilterTransform>(
header,
query_info.prewhere_info->row_level_filter,
std::make_shared<ExpressionActions>(query_info.prewhere_info->row_level_filter, actions_settings),
query_info.prewhere_info->row_level_column_name,
false);
});
@ -395,7 +396,7 @@ void StorageBuffer::read(
{
return std::make_shared<FilterTransform>(
header,
query_info.prewhere_info->prewhere_actions,
std::make_shared<ExpressionActions>(query_info.prewhere_info->prewhere_actions, actions_settings),
query_info.prewhere_info->prewhere_column_name,
query_info.prewhere_info->remove_prewhere_column);
});

View File

@ -554,9 +554,7 @@ def test_concurrent_queries(started_cluster):
busy_pool = Pool(5)
p = busy_pool.map_async(node_insert, range(5))
p.wait()
result = node1.query("SELECT count() FROM test_pg_table", user='default')
logging.debug(result)
assert(int(result) == 5 * 5 * 1000)
assert_eq_with_retry(node1, "SELECT count() FROM test_pg_table", str(5*5*1000))
def node_insert_select(_):
for i in range(5):
@ -566,9 +564,7 @@ def test_concurrent_queries(started_cluster):
busy_pool = Pool(5)
p = busy_pool.map_async(node_insert_select, range(5))
p.wait()
result = node1.query("SELECT count() FROM test_pg_table", user='default')
logging.debug(result)
assert(int(result) == 5 * 5 * 1000 * 2)
assert_eq_with_retry(node1, "SELECT count() FROM test_pg_table", str(5*5*1000*2))
node1.query('DROP TABLE test_pg_table;')
cursor.execute('DROP TABLE clickhouse.test_pg_table;')

View File

@ -39,8 +39,8 @@ def test_mutate_and_upgrade(start_cluster):
node2.restart_with_latest_version(signal=9)
# After hard restart table can be in readonly mode
exec_query_with_retry(node2, "INSERT INTO mt VALUES ('2020-02-13', 3)")
exec_query_with_retry(node1, "SYSTEM SYNC REPLICA mt")
exec_query_with_retry(node2, "INSERT INTO mt VALUES ('2020-02-13', 3)", retry_count=60)
exec_query_with_retry(node1, "SYSTEM SYNC REPLICA mt", retry_count=60)
assert node1.query("SELECT COUNT() FROM mt") == "2\n"
assert node2.query("SELECT COUNT() FROM mt") == "2\n"
@ -79,7 +79,8 @@ def test_upgrade_while_mutation(start_cluster):
node3.restart_with_latest_version(signal=9)
exec_query_with_retry(node3, "SYSTEM RESTART REPLICA mt1")
# checks for readonly
exec_query_with_retry(node3, "OPTIMIZE TABLE mt1", retry_count=60)
node3.query("ALTER TABLE mt1 DELETE WHERE id > 100000", settings={"mutations_sync": "2"})
# will delete nothing, but previous async mutation will finish with this query

View File

@ -0,0 +1,37 @@
<test>
<create_query>
CREATE TABLE join_dictionary_source_table (key UInt64, value String)
ENGINE = MergeTree ORDER BY key;
</create_query>
<create_query>
CREATE DICTIONARY join_hashed_dictionary (key UInt64, value String)
PRIMARY KEY key
SOURCE(CLICKHOUSE(DB 'default' TABLE 'join_dictionary_source_table'))
LIFETIME(MIN 0 MAX 1000)
LAYOUT(HASHED());
</create_query>
<fill_query>
INSERT INTO join_dictionary_source_table
SELECT number, toString(number)
FROM numbers(1000000);
</fill_query>
<query>
SELECT COUNT()
FROM join_dictionary_source_table
JOIN join_hashed_dictionary
ON join_dictionary_source_table.key = join_hashed_dictionary.key;
</query>
<query>
SELECT COUNT()
FROM join_dictionary_source_table
JOIN join_hashed_dictionary
ON join_dictionary_source_table.key = toUInt64(join_hashed_dictionary.key);
</query>
<drop_query>DROP DICTIONARY IF EXISTS join_hashed_dictionary;</drop_query>
<drop_query>DROP TABLE IF EXISTS join_dictionary_source_table;</drop_query>
</test>

View File

@ -92,8 +92,8 @@ value vs value
0 1 1 UInt64 Decimal(18, 0) Decimal(38, 0)
0 1 1 UInt64 Decimal(38, 0) Decimal(38, 0)
1970-01-01 1970-01-02 1970-01-02 Date Date Date
2000-01-01 2000-01-01 00:00:01 2000-01-01 00:00:01 Date DateTime(\'Europe/Moscow\') DateTime
2000-01-01 00:00:00 2000-01-02 2000-01-02 00:00:00 DateTime(\'Europe/Moscow\') Date DateTime
2000-01-01 2000-01-01 00:00:01 2000-01-01 00:00:01 Date DateTime(\'Europe/Moscow\') DateTime(\'Europe/Moscow\')
2000-01-01 00:00:00 2000-01-02 2000-01-02 00:00:00 DateTime(\'Europe/Moscow\') Date DateTime(\'Europe/Moscow\')
1970-01-01 03:00:00 1970-01-01 03:00:01 1970-01-01 03:00:01 DateTime(\'Europe/Moscow\') DateTime(\'Europe/Moscow\') DateTime(\'Europe/Moscow\')
column vs value
0 1 1 Int8 Int8 Int8
@ -189,6 +189,6 @@ column vs value
0 1 1 UInt64 Decimal(18, 0) Decimal(38, 0)
0 1 1 UInt64 Decimal(38, 0) Decimal(38, 0)
1970-01-01 1970-01-02 1970-01-02 Date Date Date
2000-01-01 2000-01-01 00:00:01 2000-01-01 00:00:01 Date DateTime(\'Europe/Moscow\') DateTime
2000-01-01 00:00:00 2000-01-02 2000-01-02 00:00:00 DateTime(\'Europe/Moscow\') Date DateTime
2000-01-01 2000-01-01 00:00:01 2000-01-01 00:00:01 Date DateTime(\'Europe/Moscow\') DateTime(\'Europe/Moscow\')
2000-01-01 00:00:00 2000-01-02 2000-01-02 00:00:00 DateTime(\'Europe/Moscow\') Date DateTime(\'Europe/Moscow\')
1970-01-01 03:00:00 1970-01-01 03:00:01 1970-01-01 03:00:01 DateTime(\'Europe/Moscow\') DateTime(\'Europe/Moscow\') DateTime(\'Europe/Moscow\')

View File

@ -0,0 +1,2 @@
1
1

View File

@ -0,0 +1,2 @@
SELECT toDate('2000-01-01') < toDateTime('2000-01-01 00:00:01', 'Europe/Moscow');
SELECT toDate('2000-01-01') < toDateTime64('2000-01-01 00:00:01', 0, 'Europe/Moscow');

View File

@ -0,0 +1,12 @@
Array
Array(DateTime(\'Europe/Moscow\'))
Array(DateTime64(5, \'Europe/Moscow\'))
Array(DateTime64(6, \'Europe/Moscow\'))
If
2000-01-01 00:00:00 DateTime(\'Europe/Moscow\')
2000-01-01 00:00:00 DateTime(\'Europe/Moscow\')
2000-01-01 00:00:00.00000 DateTime64(5, \'Europe/Moscow\')
2000-01-01 00:00:00.00000 DateTime64(5, \'Europe/Moscow\')
Cast
2000-01-01 00:00:00 DateTime(\'UTC\')
2000-01-01 00:00:00.00000 DateTime64(5, \'UTC\')

View File

@ -0,0 +1,24 @@
SELECT 'Array';
SELECT toTypeName([toDate('2000-01-01'), toDateTime('2000-01-01', 'Europe/Moscow')]);
SELECT toTypeName([toDate('2000-01-01'), toDateTime('2000-01-01', 'Europe/Moscow'), toDateTime64('2000-01-01', 5, 'Europe/Moscow')]);
SELECT toTypeName([toDate('2000-01-01'), toDateTime('2000-01-01', 'Europe/Moscow'), toDateTime64('2000-01-01', 5, 'Europe/Moscow'), toDateTime64('2000-01-01', 6, 'Europe/Moscow')]);
DROP TABLE IF EXISTS predicate_table;
CREATE TABLE predicate_table (value UInt8) ENGINE=TinyLog;
INSERT INTO predicate_table VALUES (0), (1);
SELECT 'If';
WITH toDate('2000-01-01') as a, toDateTime('2000-01-01', 'Europe/Moscow') as b
SELECT if(value, b, a) as result, toTypeName(result)
FROM predicate_table;
WITH toDateTime('2000-01-01') as a, toDateTime64('2000-01-01', 5, 'Europe/Moscow') as b
SELECT if(value, b, a) as result, toTypeName(result)
FROM predicate_table;
SELECT 'Cast';
SELECT CAST(toDate('2000-01-01') AS DateTime('UTC')) AS x, toTypeName(x);
SELECT CAST(toDate('2000-01-01') AS DateTime64(5, 'UTC')) AS x, toTypeName(x);

View File

@ -0,0 +1,2 @@
1 2
3 4

View File

@ -0,0 +1,8 @@
SELECT * FROM (
SELECT 1 AS a, 2 AS b FROM system.one
JOIN system.one USING dummy
UNION ALL
SELECT 3 AS a, 4 AS b FROM system.one
)
WHERE a != 10
ORDER BY a, b;

View File

@ -12,7 +12,7 @@ sudo npm install -g purify-css amphtml-validator
sudo apt install wkhtmltopdf
virtualenv build
./build.py --skip-multi-page --skip-single-page --skip-amp --skip-pdf --skip-blog --skip-git-log --skip-docs --skip-test-templates --livereload 8080
./build.py --skip-multi-page --skip-single-page --skip-amp --skip-pdf --skip-blog --skip-git-log --skip-docs --livereload 8080
# Open the web browser and go to http://localhost:8080/
```
@ -20,11 +20,11 @@ virtualenv build
# How to quickly test the blog
```
./build.py --skip-multi-page --skip-single-page --skip-amp --skip-pdf --skip-git-log --skip-docs --skip-test-templates --livereload 8080
./build.py --skip-multi-page --skip-single-page --skip-amp --skip-pdf --skip-git-log --skip-docs --livereload 8080
```
# How to quickly test the ugly annoying broken links in docs
```
./build.py --skip-multi-page --skip-amp --skip-pdf --skip-blog --skip-git-log --skip-test-templates --lang en --livereload 8080
./build.py --skip-multi-page --skip-amp --skip-pdf --skip-blog --skip-git-log --lang en --livereload 8080
```

View File

@ -2,9 +2,7 @@
<div class="container lead">
<h2 id="quick-start" class="display-4 mt-5">Quick start</h2>
<p>System requirements for pre-built packages: Linux, x86_64 with SSE 4.2.</p>
<ul class="nav nav-tabs" id="install-tab" role="tablist">
<ul class="nav nav-tabs mt-1 small" id="install-tab" role="tablist">
<li class="nav-item">
<a class="nav-link active" id="deb-tab" data-toggle="tab" href="#deb" role="tab" aria-controls="deb" aria-selected="true" title="deb packages">Ubuntu or Debian</a>
</li>
@ -14,6 +12,15 @@
<li class="nav-item">
<a class="nav-link" id="tgz-tab" data-toggle="tab" href="#tgz" role="tab" aria-controls="tgz" aria-selected="false" title="tgz packages">Other Linux</a>
</li>
<li class="nav-item">
<a class="nav-link" id="mac-x86-tab" data-toggle="tab" href="#mac-x86" role="tab" aria-controls="mac-x86" aria-selected="false" title="Mac x86 packages">Mac (x86)</a>
</li>
<li class="nav-item">
<a class="nav-link" id="mac-arm-tab" data-toggle="tab" href="#mac-arm" role="tab" aria-controls="mac-arm" aria-selected="false" title="Mac ARM packages">Mac (ARM)</a>
</li>
<li class="nav-item">
<a class="nav-link" id="freebsd-tab" data-toggle="tab" href="#freebsd" role="tab" aria-controls="freebsd" aria-selected="false" title="FreeBSD packages">FreeBSD</a>
</li>
<li class="nav-item">
<a class="nav-link" href="/docs/en/commercial/cloud/" role="tab" aria-controls="cloud" aria-selected="false" title="Cloud Service Providers"><strong>Cloud</strong></a>
</li>
@ -29,6 +36,15 @@
<div class="tab-pane syntax p-3 my-3" id="tgz" role="tabpanel" aria-labelledby="tgz-tab">
<pre>{% include "install/tgz.sh" %}</pre>
</div>
<div class="tab-pane syntax p-3 my-3" id="mac-x86" role="tabpanel" aria-labelledby="mac-x86-tab">
<pre>{% include "install/mac-x86.sh" %}</pre>
</div>
<div class="tab-pane syntax p-3 my-3" id="mac-arm" role="tabpanel" aria-labelledby="mac-arm-tab">
<pre>{% include "install/mac-arm.sh" %}</pre>
</div>
<div class="tab-pane syntax p-3 my-3" id="freebsd" role="tabpanel" aria-labelledby="freebsd-tab">
<pre>{% include "install/freebsd.sh" %}</pre>
</div>
</div>
<p>For other operating systems the easiest way to get started is using