diff --git a/docs/en/sql-reference/functions/date-time-functions.md b/docs/en/sql-reference/functions/date-time-functions.md index 0a7be3142ee..af920ba2482 100644 --- a/docs/en/sql-reference/functions/date-time-functions.md +++ b/docs/en/sql-reference/functions/date-time-functions.md @@ -983,6 +983,8 @@ Result: Adds the time interval or date interval to the provided date or date with time. +If the addition results in a value outside the bounds of the data type, the result is undefined. + **Syntax** ``` sql @@ -1006,13 +1008,13 @@ Aliases: `dateAdd`, `DATE_ADD`. - `year` - `value` — Value of interval to add. [Int](../../sql-reference/data-types/int-uint.md). -- `date` — The date or date with time to which `value` is added. [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md). +- `date` — The date or date with time to which `value` is added. [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md). **Returned value** Date or date with time obtained by adding `value`, expressed in `unit`, to `date`. -Type: [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md). +Type: [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md). **Example** @@ -1028,10 +1030,16 @@ Result: └───────────────────────────────────────────────┘ ``` +**See Also** + +- [addDate](#addDate) + ## date\_sub Subtracts the time interval or date interval from the provided date or date with time. +If the subtraction results in a value outside the bounds of the data type, the result is undefined. + **Syntax** ``` sql @@ -1056,13 +1064,13 @@ Aliases: `dateSub`, `DATE_SUB`. - `year` - `value` — Value of interval to subtract. [Int](../../sql-reference/data-types/int-uint.md). -- `date` — The date or date with time from which `value` is subtracted. [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md). +- `date` — The date or date with time from which `value` is subtracted. [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md). **Returned value** Date or date with time obtained by subtracting `value`, expressed in `unit`, from `date`. -Type: [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md). +Type: [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md). **Example** @@ -1078,10 +1086,15 @@ Result: └────────────────────────────────────────────────┘ ``` +**See Also** +- [subDate](#subDate) + ## timestamp\_add Adds the specified time value with the provided date or date time value. +If the addition results in a value outside the bounds of the data type, the result is undefined. + **Syntax** ``` sql @@ -1092,7 +1105,7 @@ Aliases: `timeStampAdd`, `TIMESTAMP_ADD`. **Arguments** -- `date` — Date or date with time. 
[Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md). +- `date` — Date or date with time. [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md). - `value` — Value of interval to add. [Int](../../sql-reference/data-types/int-uint.md). - `unit` — The type of interval to add. [String](../../sql-reference/data-types/string.md). Possible values: @@ -1110,7 +1123,7 @@ Aliases: `timeStampAdd`, `TIMESTAMP_ADD`. Date or date with time with the specified `value` expressed in `unit` added to `date`. -Type: [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md). +Type: [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md). **Example** @@ -1130,6 +1143,8 @@ Result: Subtracts the time interval from the provided date or date with time. +If the subtraction results in a value outside the bounds of the data type, the result is undefined. + **Syntax** ``` sql @@ -1153,13 +1168,13 @@ Aliases: `timeStampSub`, `TIMESTAMP_SUB`. - `year` - `value` — Value of interval to subtract. [Int](../../sql-reference/data-types/int-uint.md). -- `date` — Date or date with time. [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md). +- `date` — Date or date with time. [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md). **Returned value** Date or date with time obtained by subtracting `value`, expressed in `unit`, from `date`. -Type: [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md). +Type: [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md). **Example** @@ -1175,6 +1190,90 @@ Result: └──────────────────────────────────────────────────────────────┘ ``` +## addDate + +Adds the time interval or date interval to the provided date or date with time. + +If the addition results in a value outside the bounds of the data type, the result is undefined. + +**Syntax** + +``` sql +addDate(date, interval) +``` + +**Arguments** + +- `date` — The date or date with time to which `interval` is added. [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md). +- `interval` — Interval to add. [Interval](../../sql-reference/data-types/special-data-types/interval.md). + +**Returned value** + +Date or date with time obtained by adding `interval` to `date`. + +Type: [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md). 
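+
+For example, adding an interval to a `DateTime64` argument returns a `DateTime64` of the same scale (an illustration of the behavior exercised by the tests in this patch; the shown result assumes the UTC session time zone):
+
+```sql
+SELECT addDate('2022-05-07'::DateTime64, INTERVAL 5 MINUTE); -- 2022-05-07 00:05:00.000
+```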
+ +**Example** + +```sql +SELECT addDate(toDate('2018-01-01'), INTERVAL 3 YEAR); +``` + +Result: + +```text +┌─addDate(toDate('2018-01-01'), toIntervalYear(3))─┐ +│ 2021-01-01 │ +└──────────────────────────────────────────────────┘ +``` + +Alias: `ADDDATE` + +**See Also** +- [date_add](#date_add) + +## subDate + +Subtracts the time interval or date interval from the provided date or date with time. + +If the subtraction results in a value outside the bounds of the data type, the result is undefined. + +**Syntax** + +``` sql +subDate(date, interval) +``` + +**Arguments** + +- `date` — The date or date with time from which `interval` is subtracted. [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md). +- `interval` — Interval to subtract. [Interval](../../sql-reference/data-types/special-data-types/interval.md). + +**Returned value** + +Date or date with time obtained by subtracting `interval` from `date`. + +Type: [Date](../../sql-reference/data-types/date.md), [Date32](../../sql-reference/data-types/date32.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md). + +**Example** + +```sql +SELECT subDate(toDate('2018-01-01'), INTERVAL 3 YEAR); +``` + +Result: + +```text +┌─subDate(toDate('2018-01-01'), toIntervalYear(3))─┐ +│ 2015-01-01 │ +└──────────────────────────────────────────────────┘ +``` + +Alias: `SUBDATE` + +**See Also** +- [date_sub](#date_sub) + ## now Returns the current date and time at the moment of query analysis. The function is a constant expression. @@ -1671,7 +1770,7 @@ monthName(date) **Arguments** -- `date` — Date or date with time. [Date](../../sql-reference/data-types/date.md) or [DateTime](../../sql-reference/data-types/datetime.md). +- `date` — Date or date with time. [Date](../../sql-reference/data-types/date.md), [DateTime](../../sql-reference/data-types/datetime.md) or [DateTime64](../../sql-reference/data-types/datetime64.md). **Returned value** diff --git a/src/Client/ClientBase.cpp b/src/Client/ClientBase.cpp index cc60095884e..9e86f30b691 100644 --- a/src/Client/ClientBase.cpp +++ b/src/Client/ClientBase.cpp @@ -1071,7 +1071,9 @@ void ClientBase::receiveResult(ASTPtr parsed_query, Int32 signals_before_stop, b } catch (const LocalFormatError &) { - local_format_error = std::current_exception(); + /// Remember the first exception. 
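+        /// (Later errors are usually consequences of the cancel below, so the first one is the most informative.)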
+        if (!local_format_error)
+            local_format_error = std::current_exception();
         connection->sendCancel();
     }
 }
diff --git a/src/Columns/ColumnDecimal.cpp b/src/Columns/ColumnDecimal.cpp
index 142ee6c271d..0d82818a431 100644
--- a/src/Columns/ColumnDecimal.cpp
+++ b/src/Columns/ColumnDecimal.cpp
@@ -80,7 +80,7 @@ StringRef ColumnDecimal<T>::serializeValueIntoArena(size_t n, Arena & arena, cha
         res.data = pos;
     }
     memcpy(pos, &data[n], sizeof(T));
-    return StringRef(pos, sizeof(T));
+    return res;
 }
 
 template <typename T>
diff --git a/src/Functions/FunctionsOpDate.cpp b/src/Functions/FunctionsOpDate.cpp
new file mode 100644
index 00000000000..0d8ca2b58cc
--- /dev/null
+++ b/src/Functions/FunctionsOpDate.cpp
@@ -0,0 +1,108 @@
+#include <Functions/IFunction.h>
+
+#include <DataTypes/IDataType.h>
+#include <Functions/FunctionFactory.h>
+#include <Functions/FunctionHelpers.h>
+#include <Interpreters/Context.h>
+
+namespace DB
+{
+namespace ErrorCodes
+{
+    extern const int ILLEGAL_TYPE_OF_ARGUMENT;
+}
+
+namespace
+{
+template <typename Op>
+class FunctionOpDate : public IFunction
+{
+public:
+    static constexpr auto name = Op::name;
+
+    explicit FunctionOpDate(ContextPtr context_) : context(context_) {}
+
+
+    static FunctionPtr create(ContextPtr context) { return std::make_shared<FunctionOpDate<Op>>(context); }
+
+    String getName() const override { return name; }
+
+    bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
+    size_t getNumberOfArguments() const override { return 2; }
+
+    DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
+    {
+        if (!isDateOrDate32(arguments[0].type) && !isDateTime(arguments[0].type) && !isDateTime64(arguments[0].type))
+            throw Exception(
+                ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
+                "Illegal type {} of 1st argument of function {}. Should be a date or a date with time",
+                arguments[0].type->getName(),
+                getName());
+
+        if (!isInterval(arguments[1].type))
+            throw Exception(
+                ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
+                "Illegal type {} of 2nd argument of function {}. Should be an interval",
+                arguments[1].type->getName(),
+                getName());
+
+        auto op = FunctionFactory::instance().get(Op::internal_name, context);
+        auto op_build = op->build(arguments);
+
+        return op_build->getResultType();
+    }
+
+    bool useDefaultImplementationForConstants() const override { return true; }
+    ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {0, 2}; }
+
+    ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t input_rows_count) const override
+    {
+        if (!isDateOrDate32(arguments[0].type) && !isDateTime(arguments[0].type) && !isDateTime64(arguments[0].type))
+            throw Exception(
+                ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
+                "Illegal type {} of 1st argument of function {}. Should be a date or a date with time",
+                arguments[0].type->getName(),
+                getName());
+
+        if (!isInterval(arguments[1].type))
+            throw Exception(
+                ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
+                "Illegal type {} of 2nd argument of function {}. Should be an interval",
+                arguments[1].type->getName(),
+                getName());
+
+        auto op = FunctionFactory::instance().get(Op::internal_name, context);
+        auto op_build = op->build(arguments);
+
+        auto res_type = op_build->getResultType();
+        return op_build->execute(arguments, res_type, input_rows_count);
+    }
+
+private:
+    ContextPtr context;
+};
+
+}
+
+struct AddDate
+{
+    static constexpr auto name = "addDate";
+    static constexpr auto internal_name = "plus";
+};
+
+struct SubDate
+{
+    static constexpr auto name = "subDate";
+    static constexpr auto internal_name = "minus";
+};
+
+using FunctionAddDate = FunctionOpDate<AddDate>;
+using FunctionSubDate = FunctionOpDate<SubDate>;
+
+REGISTER_FUNCTION(AddInterval)
+{
+    factory.registerFunction<FunctionAddDate>({}, FunctionFactory::CaseInsensitive);
+    factory.registerFunction<FunctionSubDate>({}, FunctionFactory::CaseInsensitive);
+}
+
+}
diff --git a/src/Parsers/fuzzers/codegen_fuzzer/update.sh b/src/Parsers/fuzzers/codegen_fuzzer/update.sh
index daee56dcea1..24dd7c8ec69 100755
--- a/src/Parsers/fuzzers/codegen_fuzzer/update.sh
+++ b/src/Parsers/fuzzers/codegen_fuzzer/update.sh
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash
 
 _main() {
diff --git a/src/Processors/Formats/Impl/ArrowColumnToCHColumn.cpp b/src/Processors/Formats/Impl/ArrowColumnToCHColumn.cpp
index 76f39b07a05..2a9892b7219 100644
--- a/src/Processors/Formats/Impl/ArrowColumnToCHColumn.cpp
+++ b/src/Processors/Formats/Impl/ArrowColumnToCHColumn.cpp
@@ -427,8 +427,6 @@ static ColumnPtr readOffsetsFromArrowListColumn(std::shared_ptr<arrow::ChunkedAr
     ColumnArray::Offsets & offsets_data = assert_cast<ColumnVector<UInt64> &>(*offsets_column).getData();
     offsets_data.reserve(arrow_column->length());
 
-    uint64_t start_offset = 0u;
-
     for (int chunk_i = 0, num_chunks = arrow_column->num_chunks(); chunk_i < num_chunks; ++chunk_i)
     {
         arrow::ListArray & list_chunk = dynamic_cast<arrow::ListArray &>(*(arrow_column->chunk(chunk_i)));
@@ -436,21 +434,27 @@ static ColumnPtr readOffsetsFromArrowListColumn(std::shared_ptr<arrow::ChunkedAr
         auto & arrow_offsets = dynamic_cast<arrow::Int32Array &>(*arrow_offsets_array);
 
         /*
-         * It seems like arrow::ListArray::values() (nested column data) might or might not be shared across chunks.
-         * When it is shared, the offsets will be monotonically increasing. Otherwise, the offsets will be zero based.
-         * In order to account for both cases, the starting offset is updated whenever a zero-based offset is found.
-         * More info can be found in: https://lists.apache.org/thread/rrwfb9zo2dc58dhd9rblf20xd7wmy7jm and
-         * https://github.com/ClickHouse/ClickHouse/pull/43297
+         * CH stores the number of elements in each array as its "offsets", while Arrow stores the actual offsets.
+         * That's why CH usually starts reading offsets at i=1 and ignores i=0.
+         * When a column is read in multiple batches, the offsets may be monotonically increasing across batches,
+         * which causes inconsistencies with the batch data length on `DB::ColumnArray`.
+         *
+         * If the offsets are monotonically increasing, `arrow_offsets.Value(0)` is non-zero for the nth batch, where n > 0.
+         * If they are not monotonically increasing, it is always 0.
+         * Therefore, we subtract the previous offset from the current offset to get the corresponding CH "offset".
+         *
+         * The same might happen across multiple chunks. In this case, we need to add the last offset of the previous chunk, hence
+         * `offsets.back()`.
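+         * (Worked example: given chunk 1 with Arrow offsets [0, 2, 5] and chunk 2 with [0, 3],
+         * chunk 1 produces CH offsets [2, 5]; for chunk 2, previous_offset starts at 0 again,
+         * so we append offsets_data.back() + 3 = 5 + 3 = 8.)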
+         * More info can be found in https://lists.apache.org/thread/rrwfb9zo2dc58dhd9rblf20xd7wmy7jm,
+         * https://github.com/ClickHouse/ClickHouse/pull/43297 and https://github.com/ClickHouse/ClickHouse/pull/54370
+         * */
 
-        if (list_chunk.offset() == 0)
-        {
-            start_offset = offsets_data.back();
-        }
+        uint64_t previous_offset = arrow_offsets.Value(0);
 
         for (int64_t i = 1; i < arrow_offsets.length(); ++i)
         {
             auto offset = arrow_offsets.Value(i);
-            offsets_data.emplace_back(start_offset + offset);
+            uint64_t elements = offset - previous_offset;
+            previous_offset = offset;
+            offsets_data.emplace_back(offsets_data.back() + elements);
         }
     }
     return offsets_column;
diff --git a/src/Storages/MergeTree/MergeTreeData.cpp b/src/Storages/MergeTree/MergeTreeData.cpp
index 3337e136c16..4b6d2ea41ed 100644
--- a/src/Storages/MergeTree/MergeTreeData.cpp
+++ b/src/Storages/MergeTree/MergeTreeData.cpp
@@ -4245,7 +4245,7 @@ void MergeTreeData::forcefullyMovePartToDetachedAndRemoveFromMemory(const MergeT
 }
 
 
-void MergeTreeData::tryRemovePartImmediately(DataPartPtr && part)
+bool MergeTreeData::tryRemovePartImmediately(DataPartPtr && part)
 {
     DataPartPtr part_to_delete;
     {
@@ -4271,7 +4271,7 @@ void MergeTreeData::tryRemovePartImmediately(DataPartPtr && part)
             if (!it->unique())
                 LOG_WARNING(log, "Cannot immediately remove part {} because someone using it right now "
                                  "usage counter {}", part_name_with_state, it->use_count());
-            return;
+            return false;
         }
 
         modifyPartState(it, DataPartState::Deleting);
@@ -4296,6 +4296,7 @@ void MergeTreeData::tryRemovePartImmediately(DataPartPtr && part)
 
     removePartsFinally({part_to_delete});
     LOG_TRACE(log, "Removed part {}", part_to_delete->name);
+    return true;
 }
 
 
diff --git a/src/Storages/MergeTree/MergeTreeData.h b/src/Storages/MergeTree/MergeTreeData.h
index 95d8e74f32c..6f9779bde00 100644
--- a/src/Storages/MergeTree/MergeTreeData.h
+++ b/src/Storages/MergeTree/MergeTreeData.h
@@ -671,7 +671,7 @@ public:
     void outdateUnexpectedPartAndCloneToDetached(const DataPartPtr & part);
 
     /// If the part is Obsolete and not used by anybody else, immediately delete it from filesystem and remove from memory.
-    void tryRemovePartImmediately(DataPartPtr && part);
+    bool tryRemovePartImmediately(DataPartPtr && part);
 
     /// Returns old inactive parts that can be deleted. At the same time removes them from the list of parts but not from the disk.
     /// If 'force' - don't wait for old_parts_lifetime.
diff --git a/src/Storages/MergeTree/MergeTreeIndexFullText.cpp b/src/Storages/MergeTree/MergeTreeIndexFullText.cpp
index 754340352dc..ad56e59c1c5 100644
--- a/src/Storages/MergeTree/MergeTreeIndexFullText.cpp
+++ b/src/Storages/MergeTree/MergeTreeIndexFullText.cpp
@@ -337,15 +337,22 @@ bool MergeTreeConditionFullText::extractAtomFromTree(const RPNBuilderTreeNode &
     if (node.tryGetConstant(const_value, const_type))
     {
         /// Check constant like in KeyCondition
-        if (const_value.getType() == Field::Types::UInt64
-            || const_value.getType() == Field::Types::Int64
-            || const_value.getType() == Field::Types::Float64)
-        {
-            /// Zero in all types is represented in memory the same way as in UInt64.
-            out.function = const_value.get<UInt64>()
-                ? RPNElement::ALWAYS_TRUE
-                : RPNElement::ALWAYS_FALSE;
+        if (const_value.getType() == Field::Types::UInt64)
+        {
+            out.function = const_value.get<UInt64>() ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
+            return true;
+        }
+
+        if (const_value.getType() == Field::Types::Int64)
+        {
+            out.function = const_value.get<Int64>() ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
+            return true;
+        }
+
+        if (const_value.getType() == Field::Types::Float64)
+        {
+            out.function = const_value.get<Float64>() != 0.0 ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
             return true;
         }
     }
diff --git a/src/Storages/MergeTree/MergeTreeIndexInverted.cpp b/src/Storages/MergeTree/MergeTreeIndexInverted.cpp
index 325df6ffb6f..55af0d8b2fe 100644
--- a/src/Storages/MergeTree/MergeTreeIndexInverted.cpp
+++ b/src/Storages/MergeTree/MergeTreeIndexInverted.cpp
@@ -377,15 +377,21 @@ bool MergeTreeConditionInverted::traverseAtomAST(const RPNBuilderTreeNode & node
     if (node.tryGetConstant(const_value, const_type))
     {
         /// Check constant like in KeyCondition
-        if (const_value.getType() == Field::Types::UInt64
-            || const_value.getType() == Field::Types::Int64
-            || const_value.getType() == Field::Types::Float64)
+        if (const_value.getType() == Field::Types::UInt64)
         {
-            /// Zero in all types is represented in memory the same way as in UInt64.
-            out.function = const_value.get<UInt64>()
-                ? RPNElement::ALWAYS_TRUE
-                : RPNElement::ALWAYS_FALSE;
+            out.function = const_value.get<UInt64>() ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
+            return true;
+        }
+
+        if (const_value.getType() == Field::Types::Int64)
+        {
+            out.function = const_value.get<Int64>() ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
+            return true;
+        }
+
+        if (const_value.getType() == Field::Types::Float64)
+        {
+            out.function = const_value.get<Float64>() != 0.0 ? RPNElement::ALWAYS_TRUE : RPNElement::ALWAYS_FALSE;
             return true;
         }
     }
diff --git a/src/Storages/MergeTree/MergeTreeSelectAlgorithms.cpp b/src/Storages/MergeTree/MergeTreeSelectAlgorithms.cpp
index 8bc4377cffb..bf97d269dc6 100644
--- a/src/Storages/MergeTree/MergeTreeSelectAlgorithms.cpp
+++ b/src/Storages/MergeTree/MergeTreeSelectAlgorithms.cpp
@@ -1,5 +1,4 @@
 #include
-#include
 
 namespace DB
 {
@@ -9,57 +8,29 @@ namespace ErrorCodes
     extern const int LOGICAL_ERROR;
 }
 
-MergeTreeThreadSelectAlgorithm::TaskResult MergeTreeThreadSelectAlgorithm::getNewTask(IMergeTreeReadPool & pool, MergeTreeReadTask * previous_task)
-{
-    TaskResult res;
-    res.first = pool.getTask(thread_idx, previous_task);
-    res.second = !!res.first;
-    return res;
-}
-
-MergeTreeReadTask::BlockAndProgress MergeTreeThreadSelectAlgorithm::readFromTask(MergeTreeReadTask * task, const MergeTreeReadTask::BlockSizeParams & params)
-{
-    if (!task)
-        return {};
-
-    return task->read(params);
-}
-
-IMergeTreeSelectAlgorithm::TaskResult MergeTreeInOrderSelectAlgorithm::getNewTask(IMergeTreeReadPool & pool, MergeTreeReadTask * previous_task)
+MergeTreeReadTaskPtr MergeTreeInOrderSelectAlgorithm::getNewTask(IMergeTreeReadPool & pool, MergeTreeReadTask * previous_task)
 {
     if (!pool.preservesOrderOfRanges())
         throw Exception(ErrorCodes::LOGICAL_ERROR,
             "MergeTreeInOrderSelectAlgorithm requires read pool that preserves order of ranges, got: {}", pool.getName());
 
-    TaskResult res;
-    res.first = pool.getTask(part_idx, previous_task);
-    res.second = !!res.first;
-    return res;
+    return pool.getTask(part_idx, previous_task);
 }
 
-MergeTreeReadTask::BlockAndProgress MergeTreeInOrderSelectAlgorithm::readFromTask(MergeTreeReadTask * task, const BlockSizeParams & params)
-{
-    if (!task)
-        return {};
-
-    return task->read(params);
-}
-
-IMergeTreeSelectAlgorithm::TaskResult MergeTreeInReverseOrderSelectAlgorithm::getNewTask(IMergeTreeReadPool & pool, MergeTreeReadTask * previous_task)
+MergeTreeReadTaskPtr MergeTreeInReverseOrderSelectAlgorithm::getNewTask(IMergeTreeReadPool & pool, MergeTreeReadTask * previous_task)
 {
     if
(!pool.preservesOrderOfRanges()) throw Exception(ErrorCodes::LOGICAL_ERROR, "MergeTreeInReverseOrderSelectAlgorithm requires read pool that preserves order of ranges, got: {}", pool.getName()); - TaskResult res; - res.first = pool.getTask(part_idx, previous_task); - /// We may have some chunks to return in buffer. - /// Set continue_reading to true but actually don't create a new task. - res.second = !!res.first || !chunks.empty(); - return res; + if (!chunks.empty()) + throw Exception(ErrorCodes::LOGICAL_ERROR, + "Cannot get new task for reading in reverse order because there are {} buffered chunks", chunks.size()); + + return pool.getTask(part_idx, previous_task); } -MergeTreeReadTask::BlockAndProgress MergeTreeInReverseOrderSelectAlgorithm::readFromTask(MergeTreeReadTask * task, const BlockSizeParams & params) +MergeTreeReadTask::BlockAndProgress MergeTreeInReverseOrderSelectAlgorithm::readFromTask(MergeTreeReadTask & task, const BlockSizeParams & params) { MergeTreeReadTask::BlockAndProgress res; @@ -70,11 +41,8 @@ MergeTreeReadTask::BlockAndProgress MergeTreeInReverseOrderSelectAlgorithm::read return res; } - if (!task) - return {}; - - while (!task->isFinished()) - chunks.push_back(task->read(params)); + while (!task.isFinished()) + chunks.push_back(task.read(params)); if (chunks.empty()) return {}; diff --git a/src/Storages/MergeTree/MergeTreeSelectAlgorithms.h b/src/Storages/MergeTree/MergeTreeSelectAlgorithms.h index a6254a90687..afc8032bb99 100644 --- a/src/Storages/MergeTree/MergeTreeSelectAlgorithms.h +++ b/src/Storages/MergeTree/MergeTreeSelectAlgorithms.h @@ -1,6 +1,7 @@ #pragma once #include +#include #include namespace DB @@ -11,15 +12,16 @@ class IMergeTreeReadPool; class IMergeTreeSelectAlgorithm : private boost::noncopyable { public: - /// The pair of {task, continue_reading}. 
-    using TaskResult = std::pair<MergeTreeReadTaskPtr, bool>;
-
     using BlockSizeParams = MergeTreeReadTask::BlockSizeParams;
+    using BlockAndProgress = MergeTreeReadTask::BlockAndProgress;
 
     virtual ~IMergeTreeSelectAlgorithm() = default;
     virtual String getName() const = 0;
 
-    virtual TaskResult getNewTask(IMergeTreeReadPool & pool, MergeTreeReadTask * previous_task) = 0;
-    virtual MergeTreeReadTask::BlockAndProgress readFromTask(MergeTreeReadTask * task, const BlockSizeParams & params) = 0;
+    virtual bool needNewTask(const MergeTreeReadTask & task) const = 0;
+
+    virtual MergeTreeReadTaskPtr getNewTask(IMergeTreeReadPool & pool, MergeTreeReadTask * previous_task) = 0;
+    virtual BlockAndProgress readFromTask(MergeTreeReadTask & task, const BlockSizeParams & params) = 0;
 };
 
 using MergeTreeSelectAlgorithmPtr = std::unique_ptr<IMergeTreeSelectAlgorithm>;
 
@@ -28,9 +30,12 @@ class MergeTreeThreadSelectAlgorithm : public IMergeTreeSelectAlgorithm
 {
 public:
     explicit MergeTreeThreadSelectAlgorithm(size_t thread_idx_) : thread_idx(thread_idx_) {}
+
     String getName() const override { return "Thread"; }
 
-    TaskResult getNewTask(IMergeTreeReadPool & pool, MergeTreeReadTask * previous_task) override;
-    MergeTreeReadTask::BlockAndProgress readFromTask(MergeTreeReadTask * task, const BlockSizeParams & params) override;
+    bool needNewTask(const MergeTreeReadTask & task) const override { return task.isFinished(); }
+
+    MergeTreeReadTaskPtr getNewTask(IMergeTreeReadPool & pool, MergeTreeReadTask * previous_task) override { return pool.getTask(thread_idx, previous_task); }
+    BlockAndProgress readFromTask(MergeTreeReadTask & task, const BlockSizeParams & params) override { return task.read(params); }
 
 private:
     const size_t thread_idx;
@@ -40,9 +45,12 @@ class MergeTreeInOrderSelectAlgorithm : public IMergeTreeSelectAlgorithm
 {
 public:
     explicit MergeTreeInOrderSelectAlgorithm(size_t part_idx_) : part_idx(part_idx_) {}
+
     String getName() const override { return "InOrder"; }
 
-    TaskResult getNewTask(IMergeTreeReadPool & pool, MergeTreeReadTask * previous_task) override;
-    MergeTreeReadTask::BlockAndProgress readFromTask(MergeTreeReadTask * task, const BlockSizeParams & params) override;
+    bool needNewTask(const MergeTreeReadTask & task) const override { return task.isFinished(); }
+
+    MergeTreeReadTaskPtr getNewTask(IMergeTreeReadPool & pool, MergeTreeReadTask * previous_task) override;
+    MergeTreeReadTask::BlockAndProgress readFromTask(MergeTreeReadTask & task, const BlockSizeParams & params) override { return task.read(params); }
 
 private:
     const size_t part_idx;
@@ -52,13 +60,16 @@ class MergeTreeInReverseOrderSelectAlgorithm : public IMergeTreeSelectAlgorithm
 {
 public:
     explicit MergeTreeInReverseOrderSelectAlgorithm(size_t part_idx_) : part_idx(part_idx_) {}
+
     String getName() const override { return "InReverseOrder"; }
 
-    TaskResult getNewTask(IMergeTreeReadPool & pool, MergeTreeReadTask * previous_task) override;
-    MergeTreeReadTask::BlockAndProgress readFromTask(MergeTreeReadTask * task, const BlockSizeParams & params) override;
+    bool needNewTask(const MergeTreeReadTask & task) const override { return chunks.empty() && task.isFinished(); }
+
+    MergeTreeReadTaskPtr getNewTask(IMergeTreeReadPool & pool, MergeTreeReadTask * previous_task) override;
+    BlockAndProgress readFromTask(MergeTreeReadTask & task, const BlockSizeParams & params) override;
 
 private:
     const size_t part_idx;
-    std::vector<MergeTreeReadTask::BlockAndProgress> chunks;
+    std::vector<BlockAndProgress> chunks;
 };
 
 }
diff --git a/src/Storages/MergeTree/MergeTreeSelectProcessor.cpp b/src/Storages/MergeTree/MergeTreeSelectProcessor.cpp
index 975fad1ab6b..95fcde23f8e
100644 --- a/src/Storages/MergeTree/MergeTreeSelectProcessor.cpp +++ b/src/Storages/MergeTree/MergeTreeSelectProcessor.cpp @@ -139,11 +139,10 @@ ChunkAndProgress MergeTreeSelectProcessor::read() { try { - bool continue_reading = true; - if (!task || task->isFinished()) - std::tie(task, continue_reading) = algorithm->getNewTask(*pool, task.get()); + if (!task || algorithm->needNewTask(*task)) + task = algorithm->getNewTask(*pool, task.get()); - if (!continue_reading) + if (!task) break; } catch (const Exception & e) @@ -153,10 +152,10 @@ ChunkAndProgress MergeTreeSelectProcessor::read() throw; } - if (task && !task->getMainRangeReader().isInitialized()) + if (!task->getMainRangeReader().isInitialized()) initializeRangeReaders(); - auto res = algorithm->readFromTask(task.get(), block_size_params); + auto res = algorithm->readFromTask(*task, block_size_params); if (res.row_count) { diff --git a/src/Storages/MergeTree/ReplicatedMergeTreeSink.cpp b/src/Storages/MergeTree/ReplicatedMergeTreeSink.cpp index 5b235322394..75679a5750a 100644 --- a/src/Storages/MergeTree/ReplicatedMergeTreeSink.cpp +++ b/src/Storages/MergeTree/ReplicatedMergeTreeSink.cpp @@ -3,6 +3,7 @@ #include #include #include +#include "Common/Exception.h" #include #include #include @@ -44,6 +45,7 @@ namespace ErrorCodes extern const int LOGICAL_ERROR; extern const int TABLE_IS_READ_ONLY; extern const int QUERY_WAS_CANCELLED; + extern const int CHECKSUM_DOESNT_MATCH; } template @@ -801,8 +803,48 @@ std::pair, bool> ReplicatedMergeTreeSinkImpl:: "Conflict block ids and block number lock should not " "be empty at the same time for async inserts"); - /// Information about the part. - storage.getCommitPartOps(ops, part, block_id_path); + if constexpr (!async_insert) + { + if (!existing_part_name.empty()) + { + LOG_DEBUG(log, "Will check part {} checksums", existing_part_name); + try + { + NameSet unused; + /// if we found part in deduplication hashes part must exists on some replica + storage.checkPartChecksumsAndAddCommitOps(zookeeper, part, ops, existing_part_name, unused); + } + catch (const zkutil::KeeperException &) + { + throw; + } + catch (const Exception & ex) + { + if (ex.code() == ErrorCodes::CHECKSUM_DOESNT_MATCH) + { + LOG_INFO( + log, + "Block with ID {} has the same deduplication hash as other part {} on other replica, but checksums (which " + "include metadata files like columns.txt) doesn't match, will not write it locally", + block_id, + existing_part_name); + return; + } + throw; + } + } + else + { + /// Information about the part. + storage.getCommitPartOps(ops, part, block_id_path); + } + } + else + { + chassert(existing_part_name.empty()); + storage.getCommitPartOps(ops, part, block_id_path); + } + /// It's important to create it outside of lock scope because /// otherwise it can lock parts in destructor and deadlock is possible. 
diff --git a/src/Storages/StorageReplicatedMergeTree.cpp b/src/Storages/StorageReplicatedMergeTree.cpp index 4eda4176cba..7c7e6dbd42c 100644 --- a/src/Storages/StorageReplicatedMergeTree.cpp +++ b/src/Storages/StorageReplicatedMergeTree.cpp @@ -1486,8 +1486,12 @@ void StorageReplicatedMergeTree::syncPinnedPartUUIDs() } } -void StorageReplicatedMergeTree::checkPartChecksumsAndAddCommitOps(const zkutil::ZooKeeperPtr & zookeeper, - const DataPartPtr & part, Coordination::Requests & ops, String part_name, NameSet * absent_replicas_paths) +bool StorageReplicatedMergeTree::checkPartChecksumsAndAddCommitOps( + const ZooKeeperWithFaultInjectionPtr & zookeeper, + const DataPartPtr & part, + Coordination::Requests & ops, + String part_name, + NameSet & absent_replicas_paths) { if (part_name.empty()) part_name = part->name; @@ -1497,20 +1501,24 @@ void StorageReplicatedMergeTree::checkPartChecksumsAndAddCommitOps(const zkutil: Strings replicas = zookeeper->getChildren(fs::path(zookeeper_path) / "replicas"); std::shuffle(replicas.begin(), replicas.end(), thread_local_rng); - bool has_been_already_added = false; + bool part_found = false; + bool part_exists_on_our_replica = false; for (const String & replica : replicas) { String current_part_path = fs::path(zookeeper_path) / "replicas" / replica / "parts" / part_name; - String part_zk_str; if (!zookeeper->tryGet(current_part_path, part_zk_str)) { - if (absent_replicas_paths) - absent_replicas_paths->emplace(current_part_path); - + absent_replicas_paths.emplace(current_part_path); continue; } + else + { + part_found = true; + if (replica == replica_name) + part_exists_on_our_replica = true; + } ReplicatedMergeTreePartHeader replica_part_header; if (part_zk_str.empty()) @@ -1550,20 +1558,13 @@ void StorageReplicatedMergeTree::checkPartChecksumsAndAddCommitOps(const zkutil: } replica_part_header.getChecksums().checkEqual(local_part_header.getChecksums(), true); - - if (replica == replica_name) - has_been_already_added = true; - - /// If we verify checksums in "sequential manner" (i.e. recheck absence of checksums on other replicas when commit) - /// then it is enough to verify checksums on at least one replica since checksums on other replicas must be the same. - if (absent_replicas_paths) - { - absent_replicas_paths->clear(); - break; - } + break; } - if (!has_been_already_added) + if (part_found) + absent_replicas_paths.clear(); + + if (!part_exists_on_our_replica) { const auto storage_settings_ptr = getSettings(); String part_path = fs::path(replica_path) / "parts" / part_name; @@ -1588,6 +1589,7 @@ void StorageReplicatedMergeTree::checkPartChecksumsAndAddCommitOps(const zkutil: LOG_WARNING(log, "checkPartAndAddToZooKeeper: node {} already exists. Will not commit any nodes.", (fs::path(replica_path) / "parts" / part_name).string()); } + return part_found; } MergeTreeData::DataPartsVector StorageReplicatedMergeTree::checkPartChecksumsAndCommit(Transaction & transaction, @@ -1606,14 +1608,14 @@ MergeTreeData::DataPartsVector StorageReplicatedMergeTree::checkPartChecksumsAnd size_t zero_copy_lock_ops_size = ops.size(); /// Checksums are checked here and `ops` is filled. In fact, the part is added to ZK just below, when executing `multi`. 
- checkPartChecksumsAndAddCommitOps(zookeeper, part, ops, part->name, &absent_part_paths_on_replicas); + bool part_found = checkPartChecksumsAndAddCommitOps(std::make_shared(zookeeper), part, ops, part->name, absent_part_paths_on_replicas); /// Do not commit if the part is obsolete, we have just briefly checked its checksums if (transaction.isEmpty()) return {}; /// Will check that the part did not suddenly appear on skipped replicas - if (!absent_part_paths_on_replicas.empty()) + if (!part_found) { Coordination::Requests new_ops; for (const String & part_path : absent_part_paths_on_replicas) @@ -1627,6 +1629,10 @@ MergeTreeData::DataPartsVector StorageReplicatedMergeTree::checkPartChecksumsAnd new_ops.insert(new_ops.end(), ops.begin(), ops.end()); ops = std::move(new_ops); } + else + { + chassert(absent_part_paths_on_replicas.empty()); + } Coordination::Responses responses; Coordination::Error e = zookeeper->tryMulti(ops, responses); @@ -9145,7 +9151,7 @@ std::pair StorageReplicatedMergeTree::unlockSharedDataByID( zookeeper_ptr, fs::path(zc_zookeeper_path).parent_path(), part_info, data_format_version, logger); // parent_not_to_remove == std::nullopt means that we were unable to retrieve parts set - if (has_parent || parent_not_to_remove == std::nullopt) + if (has_parent && parent_not_to_remove == std::nullopt) { LOG_TRACE(logger, "Failed to get mutation parent on {} for part {}, refusing to remove blobs", zookeeper_part_replica_node, part_name); return {false, {}}; diff --git a/src/Storages/StorageReplicatedMergeTree.h b/src/Storages/StorageReplicatedMergeTree.h index 1f37416f881..794991d8e06 100644 --- a/src/Storages/StorageReplicatedMergeTree.h +++ b/src/Storages/StorageReplicatedMergeTree.h @@ -631,8 +631,12 @@ private: * Adds actions to `ops` that add data about the part into ZooKeeper. * Call under lockForShare. 
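+      * Returns true if the part was found on some replica; in that case its checksums have been verified against the local ones.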
*/ - void checkPartChecksumsAndAddCommitOps(const zkutil::ZooKeeperPtr & zookeeper, const DataPartPtr & part, - Coordination::Requests & ops, String part_name = "", NameSet * absent_replicas_paths = nullptr); + bool checkPartChecksumsAndAddCommitOps( + const ZooKeeperWithFaultInjectionPtr & zookeeper, + const DataPartPtr & part, + Coordination::Requests & ops, + String part_name, + NameSet & absent_replicas_paths); String getChecksumsForZooKeeper(const MergeTreeDataPartChecksums & checksums) const; diff --git a/src/Storages/System/StorageSystemLicenses.sh b/src/Storages/System/StorageSystemLicenses.sh index fd5495cd460..79f05d50d1d 100755 --- a/src/Storages/System/StorageSystemLicenses.sh +++ b/src/Storages/System/StorageSystemLicenses.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash ROOT_PATH="$(git rev-parse --show-toplevel)" IFS=$'\t' diff --git a/tests/queries/0_stateless/00990_hasToken_and_tokenbf.sql b/tests/queries/0_stateless/00990_hasToken_and_tokenbf.sql index 8e88af40046..9552acd3c93 100644 --- a/tests/queries/0_stateless/00990_hasToken_and_tokenbf.sql +++ b/tests/queries/0_stateless/00990_hasToken_and_tokenbf.sql @@ -59,3 +59,9 @@ SELECT max(id) FROM bloom_filter WHERE hasToken(s, 'yyy'); -- { serverError 158 SELECT max(id) FROM bloom_filter WHERE hasToken(s, 'zzz') == 1; -- { serverError 158 } DROP TABLE bloom_filter; + +-- AST fuzzer crash, issue #54541 +CREATE TABLE tab (row_id UInt32, str String, INDEX idx str TYPE tokenbf_v1(256, 2, 0)) ENGINE = MergeTree ORDER BY row_id; +INSERT INTO tab VALUES (0, 'a'); +SELECT * FROM tab WHERE str == 'else' AND 1.0; +DROP TABLE tab; diff --git a/tests/queries/0_stateless/02346_full_text_search.reference b/tests/queries/0_stateless/02346_full_text_search.reference index d6e510b9375..0cf74e14427 100644 --- a/tests/queries/0_stateless/02346_full_text_search.reference +++ b/tests/queries/0_stateless/02346_full_text_search.reference @@ -46,3 +46,4 @@ Test inverted(2) on UTF-8 data af inverted 102 clickhouse你好 1 +AST Fuzzer crash, issue #54541 diff --git a/tests/queries/0_stateless/02346_full_text_search.sql b/tests/queries/0_stateless/02346_full_text_search.sql index 18d1ce0fd96..c8536976377 100644 --- a/tests/queries/0_stateless/02346_full_text_search.sql +++ b/tests/queries/0_stateless/02346_full_text_search.sql @@ -235,6 +235,15 @@ SELECT read_rows==2 from system.query_log LIMIT 1; +---------------------------------------------------- +SELECT 'AST Fuzzer crash, issue #54541'; + +DROP TABLE IF EXISTS tab; +CREATE TABLE tab (row_id UInt32, str String, INDEX idx str TYPE inverted) ENGINE = MergeTree ORDER BY row_id; +INSERT INTO tab VALUES (0, 'a'); +SELECT * FROM tab WHERE str == 'b' AND 1.0; + + -- Tests with parameter max_digestion_size_per_segment are flaky in CI, not clear why --> comment out for the time being: -- ---------------------------------------------------- diff --git a/tests/queries/0_stateless/02415_all_new_functions_must_be_documented.reference b/tests/queries/0_stateless/02415_all_new_functions_must_be_documented.reference index dd843058281..69f455773b0 100644 --- a/tests/queries/0_stateless/02415_all_new_functions_must_be_documented.reference +++ b/tests/queries/0_stateless/02415_all_new_functions_must_be_documented.reference @@ -68,6 +68,7 @@ accurateCastOrDefault accurateCastOrNull acos acosh +addDate addDays addHours addMicroseconds @@ -668,6 +669,7 @@ splitByWhitespace sqrt startsWith subBitmap +subDate substring substringIndex substringIndexUTF8 diff --git 
a/tests/queries/0_stateless/02446_parent_zero_copy_locks.reference b/tests/queries/0_stateless/02446_parent_zero_copy_locks.reference new file mode 100644 index 00000000000..c3c7c53f625 --- /dev/null +++ b/tests/queries/0_stateless/02446_parent_zero_copy_locks.reference @@ -0,0 +1,12 @@ +0 +1 1 10 1 +1 2 20 2 +1 3 30 3 +1 4 40 4 +1 5 50 5 +2 1 10 1 +2 2 20 2 +2 3 30 3 +2 4 40 4 +2 5 50 5 +3 0 diff --git a/tests/queries/0_stateless/02446_parent_zero_copy_locks.sql b/tests/queries/0_stateless/02446_parent_zero_copy_locks.sql new file mode 100644 index 00000000000..86eda526c72 --- /dev/null +++ b/tests/queries/0_stateless/02446_parent_zero_copy_locks.sql @@ -0,0 +1,49 @@ +-- Tags: no-replicated-database, no-fasttest +-- Tag no-replicated-database: different number of replicas + +create table rmt1 (n int, m int, k int) engine=ReplicatedMergeTree('/test/02446/{database}/rmt', '1') order by n + settings storage_policy='s3_cache', allow_remote_fs_zero_copy_replication=1, old_parts_lifetime=0, cleanup_delay_period=0, max_cleanup_delay_period=1, cleanup_delay_period_random_add=1, min_bytes_for_wide_part=0; +create table rmt2 (n int, m int, k int) engine=ReplicatedMergeTree('/test/02446/{database}/rmt', '2') order by n + settings storage_policy='s3_cache', allow_remote_fs_zero_copy_replication=1, old_parts_lifetime=0, cleanup_delay_period=0, max_cleanup_delay_period=1, cleanup_delay_period_random_add=1, min_bytes_for_wide_part=0; + +-- FIXME zero-copy locks may remain in ZooKeeper forever if we failed to insert a part. +-- Probably that's why we have to replace repsistent lock with ephemeral sometimes. +-- See also "Replacing persistent lock with ephemeral for path {}. It can happen only in case of local part loss" +-- in StorageReplicatedMergeTree::createZeroCopyLockNode +set insert_keeper_fault_injection_probability=0; + +insert into rmt1 values(1, 1, 1); +insert into rmt2 values(2, 2, 2); + +alter table rmt1 update m = 0 where n=0; +insert into rmt1 values(3, 3, 3); +insert into rmt2 values(4, 4, 4); +select sleepEachRow(0.5) as test_does_not_rely_on_this; + +insert into rmt1 values(5, 5, 5); +alter table rmt2 update m = m * 10 where 1 settings mutations_sync=2; + +system sync replica rmt2; +set optimize_throw_if_noop=1; +optimize table rmt2 final; + +select 1, * from rmt1 order by n; + +system sync replica rmt1; +select 2, * from rmt2 order by n; + +-- a funny way to wait for outdated parts to be removed +select sleep(1), sleepEachRow(0.1) from url('http://localhost:8123/?param_tries={1..10}&query=' || encodeURLComponent( + 'select *, _state from system.parts where database=''' || currentDatabase() || ''' and table like ''rmt%'' and active=0' + ), 'LineAsString', 's String') settings max_threads=1 format Null; + +select *, _state from system.parts where database=currentDatabase() and table like 'rmt%' and active=0; + +-- ensure that old zero copy locks were removed +set allow_unrestricted_reads_from_keeper=1; +select count(), sum(ephemeralOwner) from system.zookeeper where path like '/clickhouse/zero_copy/zero_copy_s3/' || + (select value from system.zookeeper where path='/test/02446/'||currentDatabase()||'/rmt' and name='table_shared_id') || '/%'; + +select * from system.zookeeper where path like '/clickhouse/zero_copy/zero_copy_s3/' || + (select value from system.zookeeper where path='/test/02446/'||currentDatabase()||'/rmt' and name='table_shared_id') || '/%' + and path not like '%/all_0_5_2_6%'; diff --git a/tests/queries/0_stateless/02834_add_sub_date_functions.reference 
b/tests/queries/0_stateless/02834_add_sub_date_functions.reference new file mode 100644 index 00000000000..c37ac34920c --- /dev/null +++ b/tests/queries/0_stateless/02834_add_sub_date_functions.reference @@ -0,0 +1,11 @@ +2022-05-07 00:05:00 +2022-05-07 00:05:00 +2022-05-07 00:05:00.000 +2022-05-07 00:05:00 +2022-05-07 00:05:00.000 +--- +2022-05-06 23:55:00 +2022-05-06 23:55:00 +2022-05-06 23:55:00.000 +2022-05-06 23:55:00 +2022-05-06 23:55:00.000 diff --git a/tests/queries/0_stateless/02834_add_sub_date_functions.sql b/tests/queries/0_stateless/02834_add_sub_date_functions.sql new file mode 100644 index 00000000000..44d9bb8a2aa --- /dev/null +++ b/tests/queries/0_stateless/02834_add_sub_date_functions.sql @@ -0,0 +1,27 @@ +SET session_timezone = 'UTC'; + +SELECT ADDDATE('2022-05-07'::Date, INTERVAL 5 MINUTE); + +SELECT addDate('2022-05-07'::Date, INTERVAL 5 MINUTE); +SELECT addDate('2022-05-07'::Date32, INTERVAL 5 MINUTE); +SELECT addDate('2022-05-07'::DateTime, INTERVAL 5 MINUTE); +SELECT addDate('2022-05-07'::DateTime64, INTERVAL 5 MINUTE); + +SELECT addDate('2022-05-07'::Date); -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH } +SELECT addDate('2022-05-07'::Date, INTERVAL 5 MINUTE, 5); -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH } +SELECT addDate('2022-05-07'::Date, 10); -- { serverError ILLEGAL_TYPE_OF_ARGUMENT } +SELECT addDate('1234', INTERVAL 5 MINUTE); -- { serverError ILLEGAL_TYPE_OF_ARGUMENT } + +SELECT '---'; + +SELECT SUBDATE('2022-05-07'::Date, INTERVAL 5 MINUTE); + +SELECT subDate('2022-05-07'::Date, INTERVAL 5 MINUTE); +SELECT subDate('2022-05-07'::Date32, INTERVAL 5 MINUTE); +SELECT subDate('2022-05-07'::DateTime, INTERVAL 5 MINUTE); +SELECT subDate('2022-05-07'::DateTime64, INTERVAL 5 MINUTE); + +SELECT subDate('2022-05-07'::Date); -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH } +SELECT subDate('2022-05-07'::Date, INTERVAL 5 MINUTE, 5); -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH } +SELECT subDate('2022-05-07'::Date, 10); -- { serverError ILLEGAL_TYPE_OF_ARGUMENT } +SELECT subDate('1234', INTERVAL 5 MINUTE); -- { serverError ILLEGAL_TYPE_OF_ARGUMENT } diff --git a/tests/queries/0_stateless/02842_capn_proto_outfile_without_schema.reference b/tests/queries/0_stateless/02842_capn_proto_outfile_without_schema.reference new file mode 100644 index 00000000000..d00491fd7e5 --- /dev/null +++ b/tests/queries/0_stateless/02842_capn_proto_outfile_without_schema.reference @@ -0,0 +1 @@ +1 diff --git a/tests/queries/0_stateless/02842_capn_proto_outfile_without_schema.sh b/tests/queries/0_stateless/02842_capn_proto_outfile_without_schema.sh new file mode 100755 index 00000000000..f74e662520b --- /dev/null +++ b/tests/queries/0_stateless/02842_capn_proto_outfile_without_schema.sh @@ -0,0 +1,11 @@ +#!/usr/bin/env bash +# Tags: no-fasttest + +CUR_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd) +# shellcheck source=../shell_config.sh +. 
"$CUR_DIR"/../shell_config.sh + + +$CLICKHOUSE_LOCAL -q "select * from numbers(10) into outfile '$CLICKHOUSE_TEST_UNIQUE_NAME.capnp' settings format_capn_proto_use_autogenerated_schema=0" 2>&1 | grep "The format CapnProto requires a schema" -c +rm $CLICKHOUSE_TEST_UNIQUE_NAME.capnp + diff --git a/tests/queries/0_stateless/02874_parquet_multiple_batches_array_inconsistent_offsets.reference b/tests/queries/0_stateless/02874_parquet_multiple_batches_array_inconsistent_offsets.reference new file mode 100644 index 00000000000..ba63f2f7e9c --- /dev/null +++ b/tests/queries/0_stateless/02874_parquet_multiple_batches_array_inconsistent_offsets.reference @@ -0,0 +1,3 @@ +Parquet +e76a749f346078a6a43e0cbd25f0d18a - +400 diff --git a/tests/queries/0_stateless/02874_parquet_multiple_batches_array_inconsistent_offsets.sh b/tests/queries/0_stateless/02874_parquet_multiple_batches_array_inconsistent_offsets.sh new file mode 100755 index 00000000000..83196458a84 --- /dev/null +++ b/tests/queries/0_stateless/02874_parquet_multiple_batches_array_inconsistent_offsets.sh @@ -0,0 +1,129 @@ +#!/usr/bin/env bash +# Tags: no-ubsan, no-fasttest + +CUR_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd) +# shellcheck source=../shell_config.sh +. "$CUR_DIR"/../shell_config.sh + +echo "Parquet" + +# More info on: https://github.com/ClickHouse/ClickHouse/pull/54370 + +# File generated with the below code + +#std::string random_string(size_t length) { +# static const std::string characters = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"; +# +# std::random_device random_device; +# std::mt19937 generator(random_device()); +# std::uniform_int_distribution<> distribution(0, characters.size() - 1); +# +# std::string random_string; +# random_string.reserve(length); +# +# std::generate_n(std::back_inserter(random_string), length, [&]() { +# return characters[distribution(generator)]; +# }); +# +# return random_string; +#} +# +#static const std::string the_string = random_string(9247124u); +# +#std::shared_ptr CreateIntArray(std::size_t length) { +# arrow::MemoryPool* pool = arrow::default_memory_pool(); +# +# auto int_builder_ptr = std::make_shared(pool); +# auto & int_builder = *int_builder_ptr; +# arrow::ListBuilder list_builder(pool, int_builder_ptr); +# +# for (auto i = 0u; i < length; i++) +# { +# if (i % 10 == 0) +# { +# ARROW_CHECK_OK(list_builder.Append()); +# } +# else +# { +# ARROW_CHECK_OK(int_builder.Append(i)); +# } +# } +# +# std::shared_ptr int_list_array; +# ARROW_CHECK_OK(list_builder.Finish(&int_list_array)); +# return int_list_array; +#} +# +#std::shared_ptr CreateStringArray(std::size_t length) { +# arrow::MemoryPool* pool = arrow::default_memory_pool(); +# +# auto str_builder = std::make_shared(arrow::large_utf8(), pool); +# +# for (auto i = 0u; i < length; i++) +# { +# if (i % 10 == 0) +# { +# ARROW_CHECK_OK(str_builder->AppendNull()); +# } +# else +# { +# ARROW_CHECK_OK(str_builder->Append(the_string)); +# } +# } +# +# std::shared_ptr str_array; +# ARROW_CHECK_OK(str_builder->Finish(&str_array)); +# return str_array; +#} +# +#void run() +#{ +# auto schema = arrow::schema({ +# arrow::field("ints", arrow::list(arrow::int64())), +# arrow::field("strings", arrow::utf8()) +# }); +# +# auto l1_length = 2000; +# auto l2_length = 2000; +# +# std::vector> batches; +# +# auto int_array1 = CreateIntArray(l1_length); +# +# auto int_array2 = CreateIntArray(l1_length); +# +# auto str_array1 = CreateStringArray(l2_length); +# +# auto str_array2 = CreateStringArray(l2_length); +# +# 
batches.push_back(arrow::RecordBatch::Make(schema, int_array1->length(), {int_array1, str_array1})); +# +# batches.push_back(arrow::RecordBatch::Make(schema, int_array2->length(), {int_array2, str_array2})); +# +# std::shared_ptr outfile; +# PARQUET_ASSIGN_OR_THROW(outfile, arrow::io::FileOutputStream::Open("generated.parquet")); +# +# parquet::WriterProperties::Builder builder; +# builder.compression(arrow::Compression::GZIP); +# builder.dictionary_pagesize_limit(10*1024*1024); +# builder.data_pagesize(20*1024*1024); +# +# std::shared_ptr props = builder.build(); +# +# std::unique_ptr file_writer; +# PARQUET_ASSIGN_OR_THROW(file_writer, parquet::arrow::FileWriter::Open(*schema, ::arrow::default_memory_pool(), outfile, props)); +# +# for (const auto& batch : batches) { +# PARQUET_THROW_NOT_OK(file_writer->WriteRecordBatch(*batch)); +# } +# +# PARQUET_THROW_NOT_OK(file_writer->Close()); +#} + +DATA_FILE=$CUR_DIR/data_parquet/string_int_list_inconsistent_offset_multiple_batches.parquet +${CLICKHOUSE_CLIENT} --query="DROP TABLE IF EXISTS parquet_load" +${CLICKHOUSE_CLIENT} --query="CREATE TABLE parquet_load (ints Array(Int64), strings Nullable(String)) ENGINE = Memory" +cat "$DATA_FILE" | ${CLICKHOUSE_CLIENT} -q "INSERT INTO parquet_load FORMAT Parquet" +${CLICKHOUSE_CLIENT} --query="SELECT * FROM parquet_load" | md5sum +${CLICKHOUSE_CLIENT} --query="SELECT count() FROM parquet_load" +${CLICKHOUSE_CLIENT} --query="drop table parquet_load" \ No newline at end of file diff --git a/tests/queries/0_stateless/02875_fix_column_decimal_serialization.reference b/tests/queries/0_stateless/02875_fix_column_decimal_serialization.reference new file mode 100644 index 00000000000..3dc4904dbfd --- /dev/null +++ b/tests/queries/0_stateless/02875_fix_column_decimal_serialization.reference @@ -0,0 +1 @@ +11 1 1 8 8 7367 diff --git a/tests/queries/0_stateless/02875_fix_column_decimal_serialization.sql b/tests/queries/0_stateless/02875_fix_column_decimal_serialization.sql new file mode 100644 index 00000000000..2e71e47e9d8 --- /dev/null +++ b/tests/queries/0_stateless/02875_fix_column_decimal_serialization.sql @@ -0,0 +1,17 @@ +CREATE TABLE max_length_alias_14053__fuzz_45 +( + `a` Date, + `b` Nullable(Decimal(76, 45)), + `c.d` Array(Nullable(DateTime64(3))), + `dcount` Int8 ALIAS length(c.d) +) +ENGINE = MergeTree +PARTITION BY toMonday(a) +ORDER BY (a, b) +SETTINGS allow_nullable_key = 1, index_granularity = 8192; + +INSERT INTO max_length_alias_14053__fuzz_45 VALUES ('2020-10-06',7367,['2020-10-06','2020-10-06','2020-10-06','2020-10-06','2020-10-06']),('2020-10-06',7367,['2020-10-06','2020-10-06','2020-10-06']),('2020-10-06',7367,['2020-10-06','2020-10-06']),('2020-10-07',7367,['2020-10-07','2020-10-07','2020-10-07','2020-10-07','2020-10-07']),('2020-10-08',7367,['2020-10-08','2020-10-08','2020-10-08','2020-10-08']),('2020-10-11',7367,['2020-10-11','2020-10-11','2020-10-11','2020-10-11','2020-10-11','2020-10-11','2020-10-11','2020-10-11']),('2020-10-11',7367,['2020-10-11']),('2020-08-26',7367,['2020-08-26','2020-08-26']),('2020-08-28',7367,['2020-08-28','2020-08-28','2020-08-28']),('2020-08-29',7367,['2020-08-29']),('2020-09-22',7367,['2020-09-22','2020-09-22','2020-09-22','2020-09-22','2020-09-22','2020-09-22','2020-09-22']); + +SELECT count(), min(length(c.d)) AS minExpr, min(dcount) AS minAlias, max(length(c.d)) AS maxExpr, max(dcount) AS maxAlias, b FROM max_length_alias_14053__fuzz_45 GROUP BY b; + +DROP TABLE max_length_alias_14053__fuzz_45; diff --git 
a/tests/queries/0_stateless/02882_clickhouse_keeper_client_no_confirmation.reference b/tests/queries/0_stateless/02882_clickhouse_keeper_client_no_confirmation.reference
deleted file mode 100644
index f93821e36f2..00000000000
--- a/tests/queries/0_stateless/02882_clickhouse_keeper_client_no_confirmation.reference
+++ /dev/null
@@ -1 +0,0 @@
-Can't get data for node '/test-keeper-client-default': node doesn't exist
diff --git a/tests/queries/0_stateless/02882_clickhouse_keeper_client_no_confirmation.sh b/tests/queries/0_stateless/02882_clickhouse_keeper_client_no_confirmation.sh
deleted file mode 100755
index 421e1972839..00000000000
--- a/tests/queries/0_stateless/02882_clickhouse_keeper_client_no_confirmation.sh
+++ /dev/null
@@ -1,13 +0,0 @@
-#!/usr/bin/env bash
-
-CUR_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
-# shellcheck source=../shell_config.sh
-. "$CUR_DIR"/../shell_config.sh
-
-path="/test-keeper-client-$CLICKHOUSE_DATABASE"
-
-$CLICKHOUSE_KEEPER_CLIENT -q "rm $path" >& /dev/null
-
-$CLICKHOUSE_KEEPER_CLIENT -q "create $path '' 0"
-$CLICKHOUSE_KEEPER_CLIENT -q "rmr $path"
-$CLICKHOUSE_KEEPER_CLIENT -q "get $path" 2>&1
diff --git a/tests/queries/0_stateless/02882_replicated_fetch_checksums_doesnt_match.reference b/tests/queries/0_stateless/02882_replicated_fetch_checksums_doesnt_match.reference
new file mode 100644
index 00000000000..23dfd352361
--- /dev/null
+++ b/tests/queries/0_stateless/02882_replicated_fetch_checksums_doesnt_match.reference
@@ -0,0 +1,6 @@
+1
+1
+0
+1
+1
+1
diff --git a/tests/queries/0_stateless/02882_replicated_fetch_checksums_doesnt_match.sql b/tests/queries/0_stateless/02882_replicated_fetch_checksums_doesnt_match.sql
new file mode 100644
index 00000000000..8ee9d672659
--- /dev/null
+++ b/tests/queries/0_stateless/02882_replicated_fetch_checksums_doesnt_match.sql
@@ -0,0 +1,42 @@
+DROP TABLE IF EXISTS checksums_r1;
+DROP TABLE IF EXISTS checksums_r2;
+DROP TABLE IF EXISTS checksums_r3;
+
+CREATE TABLE checksums_r1 (column1 UInt32, column2 String) Engine = ReplicatedMergeTree('/tables/{database}/checksums_table', 'r1') ORDER BY tuple();
+
+CREATE TABLE checksums_r2 (column1 UInt32, column2 String) Engine = ReplicatedMergeTree('/tables/{database}/checksums_table', 'r2') ORDER BY tuple();
+
+CREATE TABLE checksums_r3 (column1 UInt32, column2 String) Engine = ReplicatedMergeTree('/tables/{database}/checksums_table', 'r3') ORDER BY tuple();
+
+SYSTEM STOP REPLICATION QUEUES checksums_r2;
+SYSTEM STOP REPLICATION QUEUES checksums_r3;
+
+ALTER TABLE checksums_r1 MODIFY COLUMN column1 Int32 SETTINGS alter_sync=1;
+
+INSERT INTO checksums_r1 VALUES (1, 'hello');
+
+INSERT INTO checksums_r3 VALUES (1, 'hello');
+
+SYSTEM START REPLICATION QUEUES checksums_r2;
+
+SYSTEM SYNC REPLICA checksums_r2;
+
+SELECT count() FROM checksums_r1;
+SELECT count() FROM checksums_r2;
+SELECT count() FROM checksums_r3;
+
+SYSTEM START REPLICATION QUEUES checksums_r3;
+SYSTEM SYNC REPLICA checksums_r3;
+
+SELECT count() FROM checksums_r1;
+SELECT count() FROM checksums_r2;
+SELECT count() FROM checksums_r3;
+
+SYSTEM FLUSH LOGS;
+
+SELECT * FROM system.text_log WHERE event_time >= now() - 30 and level == 'Error' and message like '%CHECKSUM_DOESNT_MATCH%' and message like '%checksums_r%';
+
+DROP TABLE IF EXISTS checksums_r3;
+DROP TABLE IF EXISTS checksums_r2;
+DROP TABLE IF EXISTS checksums_r1;
+
diff --git a/tests/queries/0_stateless/02883_read_in_reverse_order_virtual_column.reference b/tests/queries/0_stateless/02883_read_in_reverse_order_virtual_column.reference
new file mode 100644
index
00000000000..f77195f1f31 --- /dev/null +++ b/tests/queries/0_stateless/02883_read_in_reverse_order_virtual_column.reference @@ -0,0 +1 @@ +198401_1_1_0 diff --git a/tests/queries/0_stateless/02883_read_in_reverse_order_virtual_column.sql b/tests/queries/0_stateless/02883_read_in_reverse_order_virtual_column.sql new file mode 100644 index 00000000000..76821c8797d --- /dev/null +++ b/tests/queries/0_stateless/02883_read_in_reverse_order_virtual_column.sql @@ -0,0 +1,10 @@ +DROP TABLE IF EXISTS t_reverse_order_virt_col; + +CREATE TABLE t_reverse_order_virt_col (`order_0` Decimal(76, 53), `p_time` Date) +ENGINE = MergeTree PARTITION BY toYYYYMM(p_time) +ORDER BY order_0; + +INSERT INTO t_reverse_order_virt_col SELECT number, '1984-01-01' FROM numbers(1000000); +SELECT DISTINCT _part FROM (SELECT _part FROM t_reverse_order_virt_col ORDER BY order_0 DESC); + +DROP TABLE IF EXISTS t_reverse_order_virt_col; diff --git a/tests/queries/0_stateless/data_parquet/string_int_list_inconsistent_offset_multiple_batches.parquet b/tests/queries/0_stateless/data_parquet/string_int_list_inconsistent_offset_multiple_batches.parquet new file mode 100644 index 00000000000..ca0e2cf6762 Binary files /dev/null and b/tests/queries/0_stateless/data_parquet/string_int_list_inconsistent_offset_multiple_batches.parquet differ diff --git a/tests/queries/shell_config.sh b/tests/queries/shell_config.sh index 4f28956b91c..12bc0002191 100644 --- a/tests/queries/shell_config.sh +++ b/tests/queries/shell_config.sh @@ -79,11 +79,6 @@ export CLICKHOUSE_PORT_POSTGRESQL=${CLICKHOUSE_PORT_POSTGRESQL:="9005"} export CLICKHOUSE_PORT_KEEPER=${CLICKHOUSE_PORT_KEEPER:=$(${CLICKHOUSE_EXTRACT_CONFIG} --try --key=keeper_server.tcp_port 2>/dev/null)} 2>/dev/null export CLICKHOUSE_PORT_KEEPER=${CLICKHOUSE_PORT_KEEPER:="9181"} -# keeper-client -[ -x "${CLICKHOUSE_BINARY}-keeper-client" ] && CLICKHOUSE_KEEPER_CLIENT=${CLICKHOUSE_KEEPER_CLIENT:="${CLICKHOUSE_BINARY}-keeper-client"} -[ -x "${CLICKHOUSE_BINARY}" ] && CLICKHOUSE_KEEPER_CLIENT=${CLICKHOUSE_KEEPER_CLIENT:="${CLICKHOUSE_BINARY} keeper-client"} -export CLICKHOUSE_KEEPER_CLIENT=${CLICKHOUSE_KEEPER_CLIENT:="${CLICKHOUSE_BINARY}-keeper-client --port $CLICKHOUSE_PORT_KEEPER"} - export CLICKHOUSE_CLIENT_SECURE=${CLICKHOUSE_CLIENT_SECURE:=$(echo "${CLICKHOUSE_CLIENT}" | sed 's/--secure //' | sed 's/'"--port=${CLICKHOUSE_PORT_TCP}"'//g; s/$/'"--secure --accept-invalid-certificate --port=${CLICKHOUSE_PORT_TCP_SECURE}"'/g')} # Add database and log comment to url params diff --git a/utils/check-style/aspell-ignore/en/aspell-dict.txt b/utils/check-style/aspell-ignore/en/aspell-dict.txt index 0b6d97998c1..08bf2cc2ab0 100644 --- a/utils/check-style/aspell-ignore/en/aspell-dict.txt +++ b/utils/check-style/aspell-ignore/en/aspell-dict.txt @@ -987,6 +987,7 @@ acos acosh activecube activerecord +addDate addDays addHours addMinutes @@ -2255,6 +2256,7 @@ structureToProtobufSchema studentTTest studentttest subBitmap +subDate subarray subarrays subcolumn diff --git a/utils/list-licenses/list-licenses.sh b/utils/list-licenses/list-licenses.sh index cee5cf87a08..f09168a0596 100755 --- a/utils/list-licenses/list-licenses.sh +++ b/utils/list-licenses/list-licenses.sh @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash if [[ "$OSTYPE" == "darwin"* ]]; then # use GNU versions, their presence is ensured in cmake/tools.cmake