Mirror of https://github.com/ClickHouse/ClickHouse.git (synced 2024-11-22 15:42:02 +00:00)
Allow attaching a partition from a table with a different partition expression when the destination partition expression doesn't re-partition (#39507)
* temp commit * temp commit * draft impl for feedback * fix weird style changes * fix weird style changes * fix weird style changes * fix weird style changes * fix weird style changes * aa * aa * Add integ tests and remove partition key restriction * fix small incosistency in partition id * style fix * style fix * style fix * use existing DataPartStorageBuilder instead of new one * Refactor part clone to make it more readable and maintainable * Add MergeTreeDataPartCloner docs * define ErrorCodes::BAD_ARGUMENTS * Rebase * camel case methods * address some comments * yet another rebase? * Move from integ tests to stateless tests * address more comments * add finalize on min_max_idx files * Add sync option to DistinctPartitionExpCloner * just a temp test * revert temp change * Use echoOn to distinguish test queries * remove comment * fix build issue during rebase * atempt to fix build after rebase * finally fix build * clear minmaxidx hyperrectangle before loading it * Fix error on min_max files deletion where it was being assumed that partition expression contained all columns * get it to the state it was previously * add missing include * getting functional? * refactoring and renaming * some more refactoring * extern bad arguments * try to fix style * improvements and docs * remove duplicate includes * fix crash * make tests more stable by ordering * rebase once again.. * fix * make ci happy? * fix rebase issues * docs * rebase, but prolly needs to be improved * refactor out from nasty inheritance to static methods * fix style * work around optional * refactor & integrate some changes * update column_type * add tests by dencrane * set utc * fix ref file * fix tests * use MergeTree instead of SummingMergeTree * mark MergeTreeDataPart::getBlock as const * address a few comments * compute module function name size at compile time * simplify branching in getPartitionAstFieldsCount * remove column_indexes argument * merge getBlock with buildBlock * add some const specifiers * small adjustments * remove no longer needed isNull check * use std::min and max to update global min max idx * add some assertions * forward declare some symbols * fix grammar * forward decl * try to fix build.. * remove IFunction forward decl * Revert "use std::min and max to update global min max idx" This reverts commit b2fe79dda7. * Revert "remove no longer needed isNull check" This reverts commit 129db2610f. * Revert "Revert "remove no longer needed isNull check"" This reverts commit 9416087dd8. * Revert "Revert "use std::min and max to update global min max idx"" This reverts commit 20246d4416. * remove some comments * partial use of MonotonicityCheckMatcher * ranges * remove KeyDescriptionMonotonicityChecker * remove duplication of applyfunction * move functions to anonymous namespace * move functions to cpp * Relax partition compatibility requirements by accepting subset, add tests from partitioned to unpartitioned * updte reference file * Support for partition by a, b, c to partition by a, b * refactoring part 1 * refactoring part 2, use hyperrectangle, still not complete * refactoring part 3, build hyperrectangle with intersection of source & destination min max columns * Support attaching to table with partition expression of multiple expressions * add tests * rename method * remove some code duplication * draft impl for replicatedmergetree, need to dive deeper * ship ref file * fix impl for replicatedmergetree.. * forbid attach empty partition replicatedmergetree * Add replicated merge tree integration tests * add test missing files * fix black * do not check for monotonicity of empty partition * add empty tests & fix replicated * remove no longer needed buildBlockWithMinMaxINdexes * remove column logic in buildHyperrectangle * simplify implementation by using existing methods * further simplify implementation * move all MergeTreeDataPartClone private methods to .cpp file * decrease decomposition * use different namespaces * reduce code duplication * fix style * address a few comments * add chassert to assert arguments size on MonotonicityCheckVisitor * remove deleteMinMaxFiles method * remove useless checks from sanitycheck * add tests for attach partition (not id) * Remove sanityCheckASTPartition and bring back conditional getPartitionIDFromQuery * remove empty block comment * small fixes * fix formatting * add missing include * remove duplicate iuncludes * trigger ci * reduce some code duplication * use updated partition id on replicatedmergetree * fix build * fix build * small refactor * do not use insert increment on fetch part * remove duplicate includes * add one more integ test * black * black * rely on partition exp instead of partition id on replicated part fetch to decide if it is a different partition exp * add one more integ test * add order by clause * fix black --------- Co-authored-by: Alexey Milovidov <milovidov@clickhouse.com>
This commit is contained in:
parent
5804a65262
commit
24b8bbe9fa
@@ -112,7 +112,7 @@ Note that:

For the query to run successfully, the following conditions must be met:

- Both tables must have the same structure.
- Both tables must have the same partition key, the same order by key and the same primary key.
- Both tables must have the same order by key and the same primary key.
- Both tables must have the same indices and projections.
- Both tables must have the same storage policy.
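The dropped "same partition key" requirement above is the core of this change: the destination may now use a different partition expression, as long as every attached part still maps into exactly one destination partition. A minimal sketch of the allowed case (hypothetical tables; assumes the relaxed validation introduced by this commit):

```sql
-- Source partitions by day, destination by month: each daily part falls
-- entirely inside one monthly partition, so no re-partitioning is needed.
CREATE TABLE src (ts DateTime, v UInt64)
    ENGINE = MergeTree PARTITION BY toYYYYMMDD(ts) ORDER BY ts;
CREATE TABLE dst (ts DateTime, v UInt64)
    ENGINE = MergeTree PARTITION BY toYYYYMM(ts) ORDER BY ts;

INSERT INTO src VALUES ('2020-01-03 10:00:00', 1), ('2020-01-03 11:00:00', 2);

-- Attach the daily partition from src into dst's monthly partitioning.
ALTER TABLE dst ATTACH PARTITION 20200103 FROM src;
```

Attaching in the opposite direction (a monthly part into a daily-partitioned table) would have to re-partition rows and should still be rejected.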
@@ -1,13 +1,17 @@

#pragma once

#include <AggregateFunctions/AggregateFunctionFactory.h>
#include <Core/Range.h>
#include <DataTypes/DataTypeFactory.h>
#include <DataTypes/FieldToDataType.h>
#include <Functions/FunctionFactory.h>
#include <IO/WriteHelpers.h>
#include <Interpreters/InDepthNodeVisitor.h>
#include <Interpreters/IdentifierSemantic.h>
#include <Interpreters/InDepthNodeVisitor.h>
#include <Interpreters/applyFunction.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTOrderByElement.h>
#include <Parsers/ASTTablesInSelectQuery.h>
#include <Parsers/IAST.h>

@@ -33,6 +37,8 @@ public:

    ASTIdentifier * identifier = nullptr;
    DataTypePtr arg_data_type = {};

    Range range = Range::createWholeUniverse();

    void reject() { monotonicity.is_monotonic = false; }
    bool isRejected() const { return !monotonicity.is_monotonic; }

@@ -97,13 +103,30 @@ public:

        if (data.isRejected())
            return;

        /// TODO: monotonicity for functions of several arguments
        if (!ast_function.arguments || ast_function.arguments->children.size() != 1)
        /// Monotonicity check only works for functions that contain at most two arguments and one of them must be a constant.
        if (!ast_function.arguments)
        {
            data.reject();
            return;
        }

        auto arguments_size = ast_function.arguments->children.size();

        if (arguments_size == 0 || arguments_size > 2)
        {
            data.reject();
            return;
        }
        else if (arguments_size == 2)
        {
            /// If the function has two arguments, then one of them must be a constant.
            if (!ast_function.arguments->children[0]->as<ASTLiteral>() && !ast_function.arguments->children[1]->as<ASTLiteral>())
            {
                data.reject();
                return;
            }
        }

        if (!data.canOptimize(ast_function))
        {
            data.reject();

@@ -124,14 +147,33 @@ public:
            return;
        }

        ColumnsWithTypeAndName args;
        args.emplace_back(data.arg_data_type, "tmp");
        auto function_base = function->build(args);
        auto function_arguments = getFunctionArguments(ast_function, data);

        auto function_base = function->build(function_arguments);

        if (function_base && function_base->hasInformationAboutMonotonicity())
        {
            bool is_positive = data.monotonicity.is_positive;
            data.monotonicity = function_base->getMonotonicityForRange(*data.arg_data_type, Field(), Field());
            data.monotonicity = function_base->getMonotonicityForRange(*data.arg_data_type, data.range.left, data.range.right);

            auto & key_range = data.range;

            /// If we apply function to open interval, we can get empty intervals in result.
            /// E.g. for ('2020-01-03', '2020-01-20') after applying 'toYYYYMM' we will get ('202001', '202001').
            /// To avoid this we make range left and right included.
            /// Any function that treats NULL specially is not monotonic.
            /// Thus we can safely use isNull() as an -Inf/+Inf indicator here.
            if (!key_range.left.isNull())
            {
                key_range.left = applyFunction(function_base, data.arg_data_type, key_range.left);
                key_range.left_included = true;
            }

            if (!key_range.right.isNull())
            {
                key_range.right = applyFunction(function_base, data.arg_data_type, key_range.right);
                key_range.right_included = true;
            }

            if (!is_positive)
                data.monotonicity.is_positive = !data.monotonicity.is_positive;

@@ -143,13 +185,53 @@ public:

    static bool needChildVisit(const ASTPtr & parent, const ASTPtr &)
    {
        /// Currently we check monotonicity only for single-argument functions.
        /// Although, multi-argument functions with all but one constant arguments can also be monotonic.
        /// Multi-argument functions with all but one constant arguments can be monotonic.
        if (const auto * func = typeid_cast<const ASTFunction *>(parent.get()))
            return func->arguments->children.size() < 2;
            return func->arguments->children.size() <= 2;

        return true;
    }

    static ColumnWithTypeAndName extractLiteralColumnAndTypeFromAstLiteral(const ASTLiteral * literal)
    {
        ColumnWithTypeAndName result;

        result.type = applyVisitor(FieldToDataType(), literal->value);
        result.column = result.type->createColumnConst(0, literal->value);

        return result;
    }

    static ColumnsWithTypeAndName getFunctionArguments(const ASTFunction & ast_function, const Data & data)
    {
        ColumnsWithTypeAndName args;

        auto arguments_size = ast_function.arguments->children.size();

        chassert(arguments_size == 1 || arguments_size == 2);

        if (arguments_size == 2)
        {
            if (ast_function.arguments->children[0]->as<ASTLiteral>())
            {
                const auto * literal = ast_function.arguments->children[0]->as<ASTLiteral>();
                args.push_back(extractLiteralColumnAndTypeFromAstLiteral(literal));
                args.emplace_back(data.arg_data_type, "tmp");
            }
            else
            {
                const auto * literal = ast_function.arguments->children[1]->as<ASTLiteral>();
                args.emplace_back(data.arg_data_type, "tmp");
                args.push_back(extractLiteralColumnAndTypeFromAstLiteral(literal));
            }
        }
        else
        {
            args.emplace_back(data.arg_data_type, "tmp");
        }

        return args;
    }
};

using MonotonicityCheckVisitor = ConstInDepthNodeVisitor<MonotonicityCheckMatcher, false>;
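Besides relaxing the argument-count restriction to two-argument functions with a constant, the matcher now evaluates monotonicity over the part's actual [min, max] range (data.range) instead of the whole universe. A hedged sketch of what the two-argument case enables at the SQL level (hypothetical tables; assumes intDiv reports monotonicity information for a constant divisor):

```sql
-- intDiv(id, N) with a constant divisor is monotonic in id, so the matcher
-- can map a part's [min, max] id range through it.
CREATE TABLE src (id UInt64, v String)
    ENGINE = MergeTree PARTITION BY intDiv(id, 100) ORDER BY id;
CREATE TABLE dst (id UInt64, v String)
    ENGINE = MergeTree PARTITION BY intDiv(id, 1000) ORDER BY id;

INSERT INTO src SELECT number, '' FROM numbers(100, 50);

-- Every id in [100, 149] lands in the single dst partition intDiv(id, 1000) = 0.
ALTER TABLE dst ATTACH PARTITION 1 FROM src;
```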
src/Interpreters/applyFunction.cpp (new file, 43 lines)
@@ -0,0 +1,43 @@

#include <Interpreters/applyFunction.h>

#include <Core/Range.h>
#include <Functions/IFunction.h>

namespace DB
{

static Field applyFunctionForField(const FunctionBasePtr & func, const DataTypePtr & arg_type, const Field & arg_value)
{
    ColumnsWithTypeAndName columns{
        {arg_type->createColumnConst(1, arg_value), arg_type, "x"},
    };

    auto col = func->execute(columns, func->getResultType(), 1);
    return (*col)[0];
}

FieldRef applyFunction(const FunctionBasePtr & func, const DataTypePtr & current_type, const FieldRef & field)
{
    /// Fallback for fields without block reference.
    if (field.isExplicit())
        return applyFunctionForField(func, current_type, field);

    String result_name = "_" + func->getName() + "_" + toString(field.column_idx);
    const auto & columns = field.columns;
    size_t result_idx = columns->size();

    for (size_t i = 0; i < result_idx; ++i)
        if ((*columns)[i].name == result_name)
            result_idx = i;

    if (result_idx == columns->size())
    {
        ColumnsWithTypeAndName args{(*columns)[field.column_idx]};
        field.columns->emplace_back(ColumnWithTypeAndName{nullptr, func->getResultType(), result_name});
        (*columns)[result_idx].column = func->execute(args, (*columns)[result_idx].type, columns->front().column->size());
    }

    return {field.columns, field.row_idx, result_idx};
}

}
src/Interpreters/applyFunction.h (new file, 16 lines)
@@ -0,0 +1,16 @@

#pragma once

#include <memory>

namespace DB
{
struct FieldRef;

class IFunctionBase;
class IDataType;

using DataTypePtr = std::shared_ptr<const IDataType>;
using FunctionBasePtr = std::shared_ptr<const IFunctionBase>;

FieldRef applyFunction(const FunctionBasePtr & func, const DataTypePtr & current_type, const FieldRef & field);
}
@@ -3,6 +3,11 @@

namespace DB
{
String queryToStringNullable(const ASTPtr & query)
{
    return query ? queryToString(query) : "";
}

String queryToString(const ASTPtr & query)
{
    return queryToString(*query);

@@ -6,4 +6,5 @@ namespace DB
{
    String queryToString(const ASTPtr & query);
    String queryToString(const IAST & query);
    String queryToStringNullable(const ASTPtr & query);
}
@@ -81,6 +81,7 @@ void IMergeTreeDataPart::MinMaxIndex::load(const MergeTreeData & data, const Par

    auto minmax_column_types = data.getMinMaxColumnsTypes(partition_key);
    size_t minmax_idx_size = minmax_column_types.size();

    hyperrectangle.clear();
    hyperrectangle.reserve(minmax_idx_size);
    for (size_t i = 0; i < minmax_idx_size; ++i)
    {

@@ -104,6 +105,39 @@ void IMergeTreeDataPart::MinMaxIndex::load(const MergeTreeData & data, const Par
    initialized = true;
}

Block IMergeTreeDataPart::MinMaxIndex::getBlock(const MergeTreeData & data) const
{
    if (!initialized)
        throw Exception(ErrorCodes::LOGICAL_ERROR, "Attempt to get block from uninitialized MinMax index.");

    Block block;

    const auto metadata_snapshot = data.getInMemoryMetadataPtr();
    const auto & partition_key = metadata_snapshot->getPartitionKey();

    const auto minmax_column_names = data.getMinMaxColumnsNames(partition_key);
    const auto minmax_column_types = data.getMinMaxColumnsTypes(partition_key);
    const auto minmax_idx_size = minmax_column_types.size();

    for (size_t i = 0; i < minmax_idx_size; ++i)
    {
        const auto & data_type = minmax_column_types[i];
        const auto & column_name = minmax_column_names[i];

        const auto column = data_type->createColumn();

        const auto min_val = hyperrectangle.at(i).left;
        const auto max_val = hyperrectangle.at(i).right;

        column->insert(min_val);
        column->insert(max_val);

        block.insert(ColumnWithTypeAndName(column->getPtr(), data_type, column_name));
    }

    return block;
}

IMergeTreeDataPart::MinMaxIndex::WrittenFiles IMergeTreeDataPart::MinMaxIndex::store(
    const MergeTreeData & data, IDataPartStorage & part_storage, Checksums & out_checksums) const
{

@@ -185,8 +219,7 @@ void IMergeTreeDataPart::MinMaxIndex::merge(const MinMaxIndex & other)

    if (!initialized)
    {
        hyperrectangle = other.hyperrectangle;
        initialized = true;
        *this = other;
    }
    else
    {
@@ -336,6 +336,7 @@ public:
    }
};


    void load(const MergeTreeData & data, const PartMetadataManagerPtr & manager);
    Block getBlock(const MergeTreeData & data) const;

    using WrittenFiles = std::vector<std::unique_ptr<WriteBufferFromFileBase>>;
@@ -1,36 +1,37 @@

#include <Storages/MergeTree/KeyCondition.h>
#include <Storages/MergeTree/BoolMask.h>
#include <DataTypes/DataTypesNumber.h>
#include <Columns/ColumnConst.h>
#include <Columns/ColumnSet.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <DataTypes/DataTypeNullable.h>
#include <DataTypes/DataTypeNothing.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/FieldToDataType.h>
#include <DataTypes/getLeastSupertype.h>
#include <DataTypes/Utils.h>
#include <Interpreters/TreeRewriter.h>
#include <Interpreters/ExpressionAnalyzer.h>
#include <Interpreters/ExpressionActions.h>
#include <Interpreters/castColumn.h>
#include <Interpreters/misc.h>
#include <Functions/FunctionFactory.h>
#include <Functions/indexHint.h>
#include <DataTypes/getLeastSupertype.h>
#include <Functions/CastOverloadResolver.h>
#include <Functions/FunctionFactory.h>
#include <Functions/IFunction.h>
#include <Common/FieldVisitorToString.h>
#include <Common/MortonUtils.h>
#include <Common/typeid_cast.h>
#include <Columns/ColumnSet.h>
#include <Columns/ColumnConst.h>
#include <Interpreters/convertFieldToType.h>
#include <Functions/indexHint.h>
#include <IO/Operators.h>
#include <IO/WriteBufferFromString.h>
#include <Interpreters/ExpressionActions.h>
#include <Interpreters/ExpressionAnalyzer.h>
#include <Interpreters/Set.h>
#include <Parsers/queryToString.h>
#include <Interpreters/TreeRewriter.h>
#include <Interpreters/applyFunction.h>
#include <Interpreters/castColumn.h>
#include <Interpreters/convertFieldToType.h>
#include <Interpreters/misc.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTSelectQuery.h>
#include <IO/WriteBufferFromString.h>
#include <IO/Operators.h>
#include <Parsers/queryToString.h>
#include <Storages/MergeTree/BoolMask.h>
#include <Storages/MergeTree/KeyCondition.h>
#include <Storages/MergeTree/MergeTreeIndexUtils.h>
#include <Common/FieldVisitorToString.h>
#include <Common/MortonUtils.h>
#include <Common/typeid_cast.h>

#include <algorithm>
#include <cassert>
@@ -836,21 +837,6 @@ bool KeyCondition::getConstant(const ASTPtr & expr, Block & block_with_constants
    return node.tryGetConstant(out_value, out_type);
}


static Field applyFunctionForField(
    const FunctionBasePtr & func,
    const DataTypePtr & arg_type,
    const Field & arg_value)
{
    ColumnsWithTypeAndName columns
    {
        { arg_type->createColumnConst(1, arg_value), arg_type, "x" },
    };

    auto col = func->execute(columns, func->getResultType(), 1);
    return (*col)[0];
}

/// The case when arguments may have types different than in the primary key.
static std::pair<Field, DataTypePtr> applyFunctionForFieldOfUnknownType(
    const FunctionBasePtr & func,

@@ -890,33 +876,6 @@ static std::pair<Field, DataTypePtr> applyBinaryFunctionForFieldOfUnknownType(
    return {std::move(result), std::move(return_type)};
}


static FieldRef applyFunction(const FunctionBasePtr & func, const DataTypePtr & current_type, const FieldRef & field)
{
    /// Fallback for fields without block reference.
    if (field.isExplicit())
        return applyFunctionForField(func, current_type, field);

    String result_name = "_" + func->getName() + "_" + toString(field.column_idx);
    const auto & columns = field.columns;
    size_t result_idx = columns->size();

    for (size_t i = 0; i < result_idx; ++i)
    {
        if ((*columns)[i].name == result_name)
            result_idx = i;
    }

    if (result_idx == columns->size())
    {
        ColumnsWithTypeAndName args{(*columns)[field.column_idx]};
        field.columns->emplace_back(ColumnWithTypeAndName {nullptr, func->getResultType(), result_name});
        (*columns)[result_idx].column = func->execute(args, (*columns)[result_idx].type, columns->front().column->size());
    }

    return {field.columns, field.row_idx, result_idx};
}

/** When table's key has expression with these functions from a column,
 * and when a column in a query is compared with a constant, such as:
 * CREATE TABLE (x String) ORDER BY toDate(x)
@@ -8,21 +8,6 @@

#include <Backups/BackupEntryWrappedWith.h>
#include <Backups/IBackup.h>
#include <Backups/RestorerFromBackup.h>
#include <Common/Config/ConfigHelper.h>
#include <Common/CurrentMetrics.h>
#include <Common/Increment.h>
#include <Common/ProfileEventsScope.h>
#include <Common/SimpleIncrement.h>
#include <Common/Stopwatch.h>
#include <Common/StringUtils/StringUtils.h>
#include <Common/ThreadFuzzer.h>
#include <Common/escapeForFileName.h>
#include <Common/getNumberOfPhysicalCPUCores.h>
#include <Common/noexcept_scope.h>
#include <Common/quoteString.h>
#include <Common/scope_guard_safe.h>
#include <Common/typeid_cast.h>
#include <Storages/MergeTree/RangesInDataPart.h>
#include <Compression/CompressedReadBuffer.h>
#include <Core/QueryProcessingStage.h>
#include <DataTypes/DataTypeEnum.h>

@@ -43,19 +28,20 @@
#include <IO/WriteHelpers.h>
#include <Interpreters/Aggregator.h>
#include <Interpreters/Context.h>
#include <Interpreters/convertFieldToType.h>
#include <Interpreters/evaluateConstantExpression.h>
#include <Interpreters/ExpressionAnalyzer.h>
#include <Interpreters/InterpreterSelectQuery.h>
#include <Interpreters/MergeTreeTransaction.h>
#include <Interpreters/PartLog.h>
#include <Interpreters/TransactionLog.h>
#include <Interpreters/TreeRewriter.h>
#include <Interpreters/convertFieldToType.h>
#include <Interpreters/evaluateConstantExpression.h>
#include <Interpreters/inplaceBlockConversions.h>
#include <Parsers/ASTAlterQuery.h>
#include <Parsers/ASTExpressionList.h>
#include <Parsers/ASTIndexDeclaration.h>
#include <Parsers/ASTHelpers.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/ASTHelpers.h>
#include <Parsers/ASTIndexDeclaration.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTNameTypePair.h>
#include <Parsers/ASTPartition.h>

@@ -64,26 +50,41 @@
#include <Parsers/ExpressionListParsers.h>
#include <Parsers/parseQuery.h>
#include <Parsers/queryToString.h>
#include <Parsers/ASTAlterQuery.h>
#include <Processors/Formats/IInputFormat.h>
#include <Processors/QueryPlan/QueryIdHolder.h>
#include <Processors/QueryPlan/ReadFromMergeTree.h>
#include <Storages/AlterCommands.h>
#include <Storages/BlockNumberColumn.h>
#include <Storages/Freeze.h>
#include <Storages/MergeTree/ActiveDataPartSet.h>
#include <Storages/MergeTree/DataPartStorageOnDiskFull.h>
#include <Storages/MergeTree/MergeTreeDataPartBuilder.h>
#include <Storages/MergeTree/MergeTreeDataPartCloner.h>
#include <Storages/MergeTree/MergeTreeDataPartCompact.h>
#include <Storages/MergeTree/MergeTreeDataPartInMemory.h>
#include <Storages/MergeTree/MergeTreeDataPartWide.h>
#include <Storages/Statistics/Estimator.h>
#include <Storages/MergeTree/MergeTreeSelectProcessor.h>
#include <Storages/MergeTree/RangesInDataPart.h>
#include <Storages/MergeTree/checkDataPart.h>
#include <Storages/MutationCommands.h>
#include <Storages/MergeTree/ActiveDataPartSet.h>
#include <Storages/StorageMergeTree.h>
#include <Storages/StorageReplicatedMergeTree.h>
#include <Storages/VirtualColumnUtils.h>
#include <Common/Config/ConfigHelper.h>
#include <Common/CurrentMetrics.h>
#include <Common/Increment.h>
#include <Common/ProfileEventsScope.h>
#include <Common/SimpleIncrement.h>
#include <Common/Stopwatch.h>
#include <Common/StringUtils/StringUtils.h>
#include <Common/ThreadFuzzer.h>
#include <Common/escapeForFileName.h>
#include <Common/getNumberOfPhysicalCPUCores.h>
#include <Common/noexcept_scope.h>
#include <Common/quoteString.h>
#include <Common/scope_guard_safe.h>
#include <Common/typeid_cast.h>

#include <boost/range/algorithm_ext/erase.hpp>
#include <boost/algorithm/string/join.hpp>
@@ -197,6 +198,50 @@ namespace ErrorCodes
    extern const int LIMIT_EXCEEDED;
}

static size_t getPartitionAstFieldsCount(const ASTPartition & partition_ast, ASTPtr partition_value_ast)
{
    if (partition_ast.fields_count.has_value())
        return *partition_ast.fields_count;

    if (partition_value_ast->as<ASTLiteral>())
        return 1;

    const auto * tuple_ast = partition_value_ast->as<ASTFunction>();

    if (!tuple_ast)
    {
        throw Exception(
            ErrorCodes::INVALID_PARTITION_VALUE, "Expected literal or tuple for partition key, got {}", partition_value_ast->getID());
    }

    if (tuple_ast->name != "tuple")
    {
        if (!isFunctionCast(tuple_ast))
            throw Exception(ErrorCodes::INVALID_PARTITION_VALUE, "Expected tuple for complex partition key, got {}", tuple_ast->name);

        if (tuple_ast->arguments->as<ASTExpressionList>()->children.empty())
            throw Exception(ErrorCodes::INVALID_PARTITION_VALUE, "Expected tuple for complex partition key, got {}", tuple_ast->name);

        auto first_arg = tuple_ast->arguments->as<ASTExpressionList>()->children.at(0);
        if (const auto * inner_tuple = first_arg->as<ASTFunction>(); inner_tuple && inner_tuple->name == "tuple")
        {
            const auto * arguments_ast = tuple_ast->arguments->as<ASTExpressionList>();
            return arguments_ast ? arguments_ast->children.size() : 0;
        }
        else if (const auto * inner_literal_tuple = first_arg->as<ASTLiteral>(); inner_literal_tuple)
        {
            return inner_literal_tuple->value.getType() == Field::Types::Tuple ? inner_literal_tuple->value.safeGet<Tuple>().size() : 1;
        }

        throw Exception(ErrorCodes::INVALID_PARTITION_VALUE, "Expected tuple for complex partition key, got {}", tuple_ast->name);
    }
    else
    {
        const auto * arguments_ast = tuple_ast->arguments->as<ASTExpressionList>();
        return arguments_ast ? arguments_ast->children.size() : 0;
    }
}

static void checkSuspiciousIndices(const ASTFunction * index_function)
{
    std::unordered_set<UInt64> unique_index_expression_hashes;
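getPartitionAstFieldsCount mirrors the shapes a PARTITION clause can take. Illustrative forms below (hypothetical tables t and t2):

```sql
-- One field: a single literal.
ALTER TABLE t DETACH PARTITION 202001;

-- Several fields: a tuple; fields_count follows the tuple arity.
ALTER TABLE t2 DETACH PARTITION (202001, 'eu');

-- By partition id, which bypasses field counting entirely.
ALTER TABLE t DETACH PARTITION ID '202001';
```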
@@ -4854,7 +4899,7 @@ void MergeTreeData::removePartContributionToColumnAndSecondaryIndexSizes(const D
}

void MergeTreeData::checkAlterPartitionIsPossible(
    const PartitionCommands & commands, const StorageMetadataPtr & /*metadata_snapshot*/, const Settings & settings, ContextPtr local_context) const
    const PartitionCommands & commands, const StorageMetadataPtr & /*metadata_snapshot*/, const Settings & settings, ContextPtr) const
{
    for (const auto & command : commands)
    {

@@ -4882,7 +4927,15 @@ void MergeTreeData::checkAlterPartitionIsPossible(
                throw DB::Exception(ErrorCodes::SUPPORT_IS_DISABLED, "Only support DROP/DETACH PARTITION ALL currently");
            }
            else
                getPartitionIDFromQuery(command.partition, local_context);
            {
                // The below `getPartitionIDFromQuery` call will not work for attach / replace because it assumes the partition expressions
                // are the same and deliberately uses this storage. Later on, `MergeTreeData::replaceFrom` is called, and it makes the right
                // call to `getPartitionIDFromQuery` using source storage.
                // Note: `PartitionCommand::REPLACE_PARTITION` is used both for `REPLACE PARTITION` and `ATTACH PARTITION FROM` queries.
                // But not for `ATTACH PARTITION` queries.
                if (command.type != PartitionCommand::REPLACE_PARTITION)
                    getPartitionIDFromQuery(command.partition, getContext());
            }
        }
    }
}
@@ -5616,69 +5669,8 @@ String MergeTreeData::getPartitionIDFromQuery(const ASTPtr & ast, ContextPtr loc
        MergeTreePartInfo::validatePartitionID(partition_ast.id->clone(), format_version);
        return partition_ast.id->as<ASTLiteral>()->value.safeGet<String>();
    }
    size_t partition_ast_fields_count = 0;
    ASTPtr partition_value_ast = partition_ast.value->clone();
    if (!partition_ast.fields_count.has_value())
    {
        if (partition_value_ast->as<ASTLiteral>())
        {
            partition_ast_fields_count = 1;
        }
        else if (const auto * tuple_ast = partition_value_ast->as<ASTFunction>())
        {
            if (tuple_ast->name != "tuple")
            {
                if (isFunctionCast(tuple_ast))
                {
                    if (tuple_ast->arguments->as<ASTExpressionList>()->children.empty())
                    {
                        throw Exception(
                            ErrorCodes::INVALID_PARTITION_VALUE, "Expected tuple for complex partition key, got {}", tuple_ast->name);
                    }
                    auto first_arg = tuple_ast->arguments->as<ASTExpressionList>()->children.at(0);
                    if (const auto * inner_tuple = first_arg->as<ASTFunction>(); inner_tuple && inner_tuple->name == "tuple")
                    {
                        const auto * arguments_ast = tuple_ast->arguments->as<ASTExpressionList>();
                        if (arguments_ast)
                            partition_ast_fields_count = arguments_ast->children.size();
                        else
                            partition_ast_fields_count = 0;
                    }
                    else if (const auto * inner_literal_tuple = first_arg->as<ASTLiteral>(); inner_literal_tuple)
                    {
                        if (inner_literal_tuple->value.getType() == Field::Types::Tuple)
                            partition_ast_fields_count = inner_literal_tuple->value.safeGet<Tuple>().size();
                        else
                            partition_ast_fields_count = 1;
                    }
                    else
                    {
                        throw Exception(
                            ErrorCodes::INVALID_PARTITION_VALUE, "Expected tuple for complex partition key, got {}", tuple_ast->name);
                    }
                }
                else
                    throw Exception(ErrorCodes::INVALID_PARTITION_VALUE, "Expected tuple for complex partition key, got {}", tuple_ast->name);
            }
            else
            {
                const auto * arguments_ast = tuple_ast->arguments->as<ASTExpressionList>();
                if (arguments_ast)
                    partition_ast_fields_count = arguments_ast->children.size();
                else
                    partition_ast_fields_count = 0;
            }
        }
        else
        {
            throw Exception(
                ErrorCodes::INVALID_PARTITION_VALUE, "Expected literal or tuple for partition key, got {}", partition_value_ast->getID());
        }
    }
    else
    {
        partition_ast_fields_count = *partition_ast.fields_count;
    }
    auto partition_ast_fields_count = getPartitionAstFieldsCount(partition_ast, partition_value_ast);

    if (format_version < MERGE_TREE_DATA_MIN_FORMAT_VERSION_WITH_CUSTOM_PARTITIONING)
    {
@@ -7014,23 +7006,35 @@ MergeTreeData & MergeTreeData::checkStructureAndGetMergeTreeData(IStorage & sour
    if (my_snapshot->getColumns().getAllPhysical().sizeOfDifference(src_snapshot->getColumns().getAllPhysical()))
        throw Exception(ErrorCodes::INCOMPATIBLE_COLUMNS, "Tables have different structure");

    auto query_to_string = [] (const ASTPtr & ast)
    {
        return ast ? queryToString(ast) : "";
    };

    if (query_to_string(my_snapshot->getSortingKeyAST()) != query_to_string(src_snapshot->getSortingKeyAST()))
    if (queryToStringNullable(my_snapshot->getSortingKeyAST()) != queryToStringNullable(src_snapshot->getSortingKeyAST()))
        throw Exception(ErrorCodes::BAD_ARGUMENTS, "Tables have different ordering");

    if (query_to_string(my_snapshot->getPartitionKeyAST()) != query_to_string(src_snapshot->getPartitionKeyAST()))
        throw Exception(ErrorCodes::BAD_ARGUMENTS, "Tables have different partition key");

    if (format_version != src_data->format_version)
        throw Exception(ErrorCodes::BAD_ARGUMENTS, "Tables have different format_version");

    if (query_to_string(my_snapshot->getPrimaryKeyAST()) != query_to_string(src_snapshot->getPrimaryKeyAST()))
    if (queryToStringNullable(my_snapshot->getPrimaryKeyAST()) != queryToStringNullable(src_snapshot->getPrimaryKeyAST()))
        throw Exception(ErrorCodes::BAD_ARGUMENTS, "Tables have different primary key");

    const auto is_a_subset_of = [](const auto & lhs, const auto & rhs)
    {
        if (lhs.size() > rhs.size())
            return false;

        const auto rhs_set = NameSet(rhs.begin(), rhs.end());
        for (const auto & lhs_element : lhs)
            if (!rhs_set.contains(lhs_element))
                return false;

        return true;
    };

    if (!is_a_subset_of(my_snapshot->getColumnsRequiredForPartitionKey(), src_snapshot->getColumnsRequiredForPartitionKey()))
    {
        throw Exception(
            ErrorCodes::BAD_ARGUMENTS,
            "Destination table partition expression columns must be a subset of source table partition expression columns");
    }

    const auto check_definitions = [](const auto & my_descriptions, const auto & src_descriptions)
    {
        if (my_descriptions.size() != src_descriptions.size())
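The subset rule above also admits attaching from a partitioned source into a destination that partitions by fewer columns, or none at all. A sketch mirroring the "partition by a, b to partition by a" and "partitioned to unpartitioned" cases from the commit message (hypothetical tables):

```sql
CREATE TABLE src (a UInt32, b UInt32, v String)
    ENGINE = MergeTree PARTITION BY (a, b) ORDER BY a;

-- Destination uses a subset of the source partition columns: allowed.
CREATE TABLE dst (a UInt32, b UInt32, v String)
    ENGINE = MergeTree PARTITION BY a ORDER BY a;

-- Unpartitioned destination is the extreme case of the subset rule: allowed.
CREATE TABLE dst_all (a UInt32, b UInt32, v String)
    ENGINE = MergeTree ORDER BY a;

INSERT INTO src VALUES (1, 2, 'x');
ALTER TABLE dst     ATTACH PARTITION (1, 2) FROM src;
ALTER TABLE dst_all ATTACH PARTITION (1, 2) FROM src;

-- A destination partitioned by a column absent from src's partition key
-- (for example PARTITION BY v) would be rejected with BAD_ARGUMENTS.
```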
@@ -7071,128 +7075,56 @@ std::pair<MergeTreeData::MutableDataPartPtr, scope_guard> MergeTreeData::cloneAn
    const ReadSettings & read_settings,
    const WriteSettings & write_settings)
{
    /// Check that the storage policy contains the disk where the src_part is located.
    bool does_storage_policy_allow_same_disk = false;
    for (const DiskPtr & disk : getStoragePolicy()->getDisks())
    {
        if (disk->getName() == src_part->getDataPartStorage().getDiskName())
        {
            does_storage_policy_allow_same_disk = true;
            break;
        }
    }
    if (!does_storage_policy_allow_same_disk)
        throw Exception(
            ErrorCodes::BAD_ARGUMENTS,
            "Could not clone and load part {} because disk does not belong to storage policy",
            quoteString(src_part->getDataPartStorage().getFullPath()));
    return MergeTreeDataPartCloner::clone(
        this, src_part, metadata_snapshot, dst_part_info, tmp_part_prefix, require_part_metadata, params, read_settings, write_settings);
}

    String dst_part_name = src_part->getNewName(dst_part_info);
    String tmp_dst_part_name = tmp_part_prefix + dst_part_name;
    auto temporary_directory_lock = getTemporaryPartDirectoryHolder(tmp_dst_part_name);
std::pair<MergeTreeData::MutableDataPartPtr, scope_guard> MergeTreeData::cloneAndLoadPartOnSameDiskWithDifferentPartitionKey(
    const MergeTreeData::DataPartPtr & src_part,
    const MergeTreePartition & new_partition,
    const String & partition_id,
    const IMergeTreeDataPart::MinMaxIndex & min_max_index,
    const String & tmp_part_prefix,
    const StorageMetadataPtr & my_metadata_snapshot,
    const IDataPartStorage::ClonePartParams & clone_params,
    ContextPtr local_context,
    Int64 min_block,
    Int64 max_block
)
{
    MergeTreePartInfo dst_part_info(partition_id, min_block, max_block, src_part->info.level);

    /// Why it is needed if we only hardlink files?
    auto reservation = src_part->getDataPartStorage().reserve(src_part->getBytesOnDisk());
    auto src_part_storage = src_part->getDataPartStoragePtr();
    return MergeTreeDataPartCloner::cloneWithDistinctPartitionExpression(
        this,
        src_part,
        my_metadata_snapshot,
        dst_part_info,
        tmp_part_prefix,
        local_context->getReadSettings(),
        local_context->getWriteSettings(),
        new_partition,
        min_max_index,
        false,
        clone_params);
}

    scope_guard src_flushed_tmp_dir_lock;
    MergeTreeData::MutableDataPartPtr src_flushed_tmp_part;
std::pair<MergeTreePartition, IMergeTreeDataPart::MinMaxIndex> MergeTreeData::createPartitionAndMinMaxIndexFromSourcePart(
    const MergeTreeData::DataPartPtr & src_part,
    const StorageMetadataPtr & metadata_snapshot,
    ContextPtr local_context)
{
    const auto & src_data = src_part->storage;

    /// If source part is in memory, flush it to disk and clone it already in on-disk format
    /// Protect tmp dir from removing by cleanup thread with src_flushed_tmp_dir_lock
    /// Construct src_flushed_tmp_part in order to delete part with its directory at destructor
    if (auto src_part_in_memory = asInMemoryPart(src_part))
    {
        auto flushed_part_path = *src_part_in_memory->getRelativePathForPrefix(tmp_part_prefix);
        auto metadata_manager = std::make_shared<PartMetadataManagerOrdinary>(src_part.get());
    IMergeTreeDataPart::MinMaxIndex min_max_index;

        auto tmp_src_part_file_name = fs::path(tmp_dst_part_name).filename();
        src_flushed_tmp_dir_lock = src_part->storage.getTemporaryPartDirectoryHolder(tmp_src_part_file_name);
    min_max_index.load(src_data, metadata_manager);

        auto flushed_part_storage = src_part_in_memory->flushToDisk(flushed_part_path, metadata_snapshot);
    MergeTreePartition new_partition;

        src_flushed_tmp_part = MergeTreeDataPartBuilder(*this, src_part->name, flushed_part_storage)
            .withPartInfo(src_part->info)
            .withPartFormatFromDisk()
            .build();
    new_partition.create(metadata_snapshot, min_max_index.getBlock(src_data), 0u, local_context);

        src_flushed_tmp_part->is_temp = true;
        src_part_storage = flushed_part_storage;
    }

    String with_copy;
    if (params.copy_instead_of_hardlink)
        with_copy = " (copying data)";

    auto dst_part_storage = src_part_storage->freeze(
        relative_data_path,
        tmp_dst_part_name,
        read_settings,
        write_settings,
        /* save_metadata_callback= */ {},
        params);

    if (params.metadata_version_to_write.has_value())
    {
        chassert(!params.keep_metadata_version);
        auto out_metadata = dst_part_storage->writeFile(IMergeTreeDataPart::METADATA_VERSION_FILE_NAME, 4096, getContext()->getWriteSettings());
        writeText(metadata_snapshot->getMetadataVersion(), *out_metadata);
        out_metadata->finalize();
        if (getSettings()->fsync_after_insert)
            out_metadata->sync();
    }

    LOG_DEBUG(log, "Clone{} part {} to {}{}",
        src_flushed_tmp_part ? " flushed" : "",
        src_part_storage->getFullPath(),
        std::string(fs::path(dst_part_storage->getFullRootPath()) / tmp_dst_part_name),
        with_copy);

    auto dst_data_part = MergeTreeDataPartBuilder(*this, dst_part_name, dst_part_storage)
        .withPartFormatFromDisk()
        .build();

    if (!params.copy_instead_of_hardlink && params.hardlinked_files)
    {
        params.hardlinked_files->source_part_name = src_part->name;
        params.hardlinked_files->source_table_shared_id = src_part->storage.getTableSharedID();

        for (auto it = src_part->getDataPartStorage().iterate(); it->isValid(); it->next())
        {
            if (!params.files_to_copy_instead_of_hardlinks.contains(it->name())
                && it->name() != IMergeTreeDataPart::DELETE_ON_DESTROY_MARKER_FILE_NAME_DEPRECATED
                && it->name() != IMergeTreeDataPart::TXN_VERSION_METADATA_FILE_NAME)
            {
                params.hardlinked_files->hardlinks_from_source_part.insert(it->name());
            }
        }

        auto projections = src_part->getProjectionParts();
        for (const auto & [name, projection_part] : projections)
        {
            const auto & projection_storage = projection_part->getDataPartStorage();
            for (auto it = projection_storage.iterate(); it->isValid(); it->next())
            {
                auto file_name_with_projection_prefix = fs::path(projection_storage.getPartDirectory()) / it->name();
                if (!params.files_to_copy_instead_of_hardlinks.contains(file_name_with_projection_prefix)
                    && it->name() != IMergeTreeDataPart::DELETE_ON_DESTROY_MARKER_FILE_NAME_DEPRECATED
                    && it->name() != IMergeTreeDataPart::TXN_VERSION_METADATA_FILE_NAME)
                {
                    params.hardlinked_files->hardlinks_from_source_part.insert(file_name_with_projection_prefix);
                }
            }
        }
    }

    /// We should write version metadata on part creation to distinguish it from parts that were created without transaction.
    TransactionID tid = params.txn ? params.txn->tid : Tx::PrehistoricTID;
    dst_data_part->version.setCreationTID(tid, nullptr);
    dst_data_part->storeVersionMetadata();

    dst_data_part->is_temp = true;

    dst_data_part->loadColumnsChecksumsIndexes(require_part_metadata, true);
    dst_data_part->modification_time = dst_part_storage->getLastModified().epochTime();
    return std::make_pair(dst_data_part, std::move(temporary_directory_lock));
    return {new_partition, min_max_index};
}

String MergeTreeData::getFullPathOnDisk(const DiskPtr & disk) const
@@ -231,6 +231,7 @@ public:
    }
};


using DataParts = std::set<DataPartPtr, LessDataPart>;
using MutableDataParts = std::set<MutableDataPartPtr, LessDataPart>;
using DataPartsVector = std::vector<DataPartPtr>;

@@ -848,6 +849,23 @@ public:
        const ReadSettings & read_settings,
        const WriteSettings & write_settings);

    std::pair<MergeTreeData::MutableDataPartPtr, scope_guard> cloneAndLoadPartOnSameDiskWithDifferentPartitionKey(
        const MergeTreeData::DataPartPtr & src_part,
        const MergeTreePartition & new_partition,
        const String & partition_id,
        const IMergeTreeDataPart::MinMaxIndex & min_max_index,
        const String & tmp_part_prefix,
        const StorageMetadataPtr & my_metadata_snapshot,
        const IDataPartStorage::ClonePartParams & clone_params,
        ContextPtr local_context,
        Int64 min_block,
        Int64 max_block);

    static std::pair<MergeTreePartition, IMergeTreeDataPart::MinMaxIndex> createPartitionAndMinMaxIndexFromSourcePart(
        const MergeTreeData::DataPartPtr & src_part,
        const StorageMetadataPtr & metadata_snapshot,
        ContextPtr local_context);

    virtual std::vector<MergeTreeMutationStatus> getMutationsStatus() const = 0;

    /// Returns true if table can create new parts with adaptive granularity
src/Storages/MergeTree/MergeTreeDataPartCloner.cpp (new file, 320 lines)
@@ -0,0 +1,320 @@

#include <Interpreters/MergeTreeTransaction.h>
#include <Storages/MergeTree/MergeTreeData.h>
#include <Storages/MergeTree/MergeTreeDataPartBuilder.h>
#include <Storages/MergeTree/MergeTreeDataPartCloner.h>
#include <Common/escapeForFileName.h>
#include <Common/logger_useful.h>

namespace DB
{

namespace ErrorCodes
{
    extern const int BAD_ARGUMENTS;
}

static Poco::Logger * log = &Poco::Logger::get("MergeTreeDataPartCloner");

namespace DistinctPartitionExpression
{
std::unique_ptr<WriteBufferFromFileBase> updatePartitionFile(
    const MergeTreeData & merge_tree_data,
    const MergeTreePartition & partition,
    const MergeTreeData::MutableDataPartPtr & dst_part,
    IDataPartStorage & storage)
{
    storage.removeFile("partition.dat");
    // Leverage already implemented MergeTreePartition::store to create & store partition.dat.
    // Checksum is re-calculated later.
    return partition.store(merge_tree_data, storage, dst_part->checksums);
}

IMergeTreeDataPart::MinMaxIndex::WrittenFiles updateMinMaxFiles(
    const MergeTreeData & merge_tree_data,
    const MergeTreeData::MutableDataPartPtr & dst_part,
    IDataPartStorage & storage,
    const StorageMetadataPtr & metadata_snapshot)
{
    for (const auto & column_name : MergeTreeData::getMinMaxColumnsNames(metadata_snapshot->partition_key))
    {
        auto file = "minmax_" + escapeForFileName(column_name) + ".idx";
        storage.removeFile(file);
    }

    return dst_part->minmax_idx->store(merge_tree_data, storage, dst_part->checksums);
}

void finalizeNewFiles(const std::vector<std::unique_ptr<WriteBufferFromFileBase>> & files, bool sync_new_files)
{
    for (const auto & file : files)
    {
        file->finalize();
        if (sync_new_files)
            file->sync();
    }
}

void updateNewPartFiles(
    const MergeTreeData & merge_tree_data,
    const MergeTreeData::MutableDataPartPtr & dst_part,
    const MergeTreePartition & new_partition,
    const IMergeTreeDataPart::MinMaxIndex & new_min_max_index,
    const StorageMetadataPtr & src_metadata_snapshot,
    bool sync_new_files)
{
    auto & storage = dst_part->getDataPartStorage();

    *dst_part->minmax_idx = new_min_max_index;

    auto partition_file = updatePartitionFile(merge_tree_data, new_partition, dst_part, storage);

    auto min_max_files = updateMinMaxFiles(merge_tree_data, dst_part, storage, src_metadata_snapshot);

    IMergeTreeDataPart::MinMaxIndex::WrittenFiles written_files;

    if (partition_file)
        written_files.emplace_back(std::move(partition_file));

    written_files.insert(written_files.end(), std::make_move_iterator(min_max_files.begin()), std::make_move_iterator(min_max_files.end()));

    finalizeNewFiles(written_files, sync_new_files);

    // MergeTreeDataPartCloner::finalize_part calls IMergeTreeDataPart::loadColumnsChecksumsIndexes, which will re-create
    // the checksum file if it doesn't exist. Relying on that is cumbersome, but this refactoring is simply a code extraction
    // with small improvements. It can be further improved in the future.
    storage.removeFile("checksums.txt");
}
}

namespace
{
bool doesStoragePolicyAllowSameDisk(MergeTreeData * merge_tree_data, const MergeTreeData::DataPartPtr & src_part)
{
    for (const DiskPtr & disk : merge_tree_data->getStoragePolicy()->getDisks())
        if (disk->getName() == src_part->getDataPartStorage().getDiskName())
            return true;
    return false;
}

DataPartStoragePtr flushPartStorageToDiskIfInMemory(
    MergeTreeData * merge_tree_data,
    const MergeTreeData::DataPartPtr & src_part,
    const StorageMetadataPtr & metadata_snapshot,
    const String & tmp_part_prefix,
    const String & tmp_dst_part_name,
    scope_guard & src_flushed_tmp_dir_lock,
    MergeTreeData::MutableDataPartPtr src_flushed_tmp_part)
{
    if (auto src_part_in_memory = asInMemoryPart(src_part))
    {
        auto flushed_part_path = src_part_in_memory->getRelativePathForPrefix(tmp_part_prefix);
        auto tmp_src_part_file_name = fs::path(tmp_dst_part_name).filename();

        src_flushed_tmp_dir_lock = src_part->storage.getTemporaryPartDirectoryHolder(tmp_src_part_file_name);

        auto flushed_part_storage = src_part_in_memory->flushToDisk(*flushed_part_path, metadata_snapshot);

        src_flushed_tmp_part = MergeTreeDataPartBuilder(*merge_tree_data, src_part->name, flushed_part_storage)
            .withPartInfo(src_part->info)
            .withPartFormatFromDisk()
            .build();

        src_flushed_tmp_part->is_temp = true;

        return flushed_part_storage;
    }

    return src_part->getDataPartStoragePtr();
}

std::shared_ptr<IDataPartStorage> hardlinkAllFiles(
    MergeTreeData * merge_tree_data,
    const DB::ReadSettings & read_settings,
    const DB::WriteSettings & write_settings,
    const DataPartStoragePtr & storage,
    const String & path,
    const DB::IDataPartStorage::ClonePartParams & params)
{
    return storage->freeze(
        merge_tree_data->getRelativeDataPath(),
        path,
        read_settings,
        write_settings,
        /*save_metadata_callback=*/{},
        params);
}

std::pair<MergeTreeData::MutableDataPartPtr, scope_guard> cloneSourcePart(
    MergeTreeData * merge_tree_data,
    const MergeTreeData::DataPartPtr & src_part,
    const StorageMetadataPtr & metadata_snapshot,
    const MergeTreePartInfo & dst_part_info,
    const String & tmp_part_prefix,
    const ReadSettings & read_settings,
    const WriteSettings & write_settings,
    const DB::IDataPartStorage::ClonePartParams & params)
{
    const auto dst_part_name = src_part->getNewName(dst_part_info);

    const auto tmp_dst_part_name = tmp_part_prefix + dst_part_name;

    auto temporary_directory_lock = merge_tree_data->getTemporaryPartDirectoryHolder(tmp_dst_part_name);

    src_part->getDataPartStorage().reserve(src_part->getBytesOnDisk());

    scope_guard src_flushed_tmp_dir_lock;
    MergeTreeData::MutableDataPartPtr src_flushed_tmp_part;

    auto src_part_storage = flushPartStorageToDiskIfInMemory(
        merge_tree_data, src_part, metadata_snapshot, tmp_part_prefix, tmp_dst_part_name, src_flushed_tmp_dir_lock, src_flushed_tmp_part);

    auto dst_part_storage = hardlinkAllFiles(merge_tree_data, read_settings, write_settings, src_part_storage, tmp_dst_part_name, params);

    if (params.metadata_version_to_write.has_value())
    {
        chassert(!params.keep_metadata_version);
        auto out_metadata = dst_part_storage->writeFile(
            IMergeTreeDataPart::METADATA_VERSION_FILE_NAME, 4096, merge_tree_data->getContext()->getWriteSettings());
        writeText(metadata_snapshot->getMetadataVersion(), *out_metadata);
        out_metadata->finalize();
        if (merge_tree_data->getSettings()->fsync_after_insert)
            out_metadata->sync();
    }

    LOG_DEBUG(
        log,
        "Clone {} part {} to {}{}",
        src_flushed_tmp_part ? "flushed" : "",
        src_part_storage->getFullPath(),
        std::string(fs::path(dst_part_storage->getFullRootPath()) / tmp_dst_part_name),
        false);


    auto part = MergeTreeDataPartBuilder(*merge_tree_data, dst_part_name, dst_part_storage).withPartFormatFromDisk().build();

    return std::make_pair(part, std::move(temporary_directory_lock));
}

void handleHardLinkedParameterFiles(const MergeTreeData::DataPartPtr & src_part, const DB::IDataPartStorage::ClonePartParams & params)
{
    const auto & hardlinked_files = params.hardlinked_files;

    hardlinked_files->source_part_name = src_part->name;
    hardlinked_files->source_table_shared_id = src_part->storage.getTableSharedID();

    for (auto it = src_part->getDataPartStorage().iterate(); it->isValid(); it->next())
    {
        if (!params.files_to_copy_instead_of_hardlinks.contains(it->name())
            && it->name() != IMergeTreeDataPart::DELETE_ON_DESTROY_MARKER_FILE_NAME_DEPRECATED
            && it->name() != IMergeTreeDataPart::TXN_VERSION_METADATA_FILE_NAME)
        {
            hardlinked_files->hardlinks_from_source_part.insert(it->name());
        }
    }
}

void handleProjections(const MergeTreeData::DataPartPtr & src_part, const DB::IDataPartStorage::ClonePartParams & params)
{
    auto projections = src_part->getProjectionParts();
    for (const auto & [name, projection_part] : projections)
    {
        const auto & projection_storage = projection_part->getDataPartStorage();
        for (auto it = projection_storage.iterate(); it->isValid(); it->next())
        {
            auto file_name_with_projection_prefix = fs::path(projection_storage.getPartDirectory()) / it->name();
            if (!params.files_to_copy_instead_of_hardlinks.contains(file_name_with_projection_prefix)
                && it->name() != IMergeTreeDataPart::DELETE_ON_DESTROY_MARKER_FILE_NAME_DEPRECATED
                && it->name() != IMergeTreeDataPart::TXN_VERSION_METADATA_FILE_NAME)
            {
                params.hardlinked_files->hardlinks_from_source_part.insert(file_name_with_projection_prefix);
            }
        }
    }
}

MergeTreeData::MutableDataPartPtr finalizePart(
    const MergeTreeData::MutableDataPartPtr & dst_part, const DB::IDataPartStorage::ClonePartParams & params, bool require_part_metadata)
{
    /// We should write version metadata on part creation to distinguish it from parts that were created without transaction.
    TransactionID tid = params.txn ? params.txn->tid : Tx::PrehistoricTID;
    dst_part->version.setCreationTID(tid, nullptr);
    dst_part->storeVersionMetadata();

    dst_part->is_temp = true;

    dst_part->loadColumnsChecksumsIndexes(require_part_metadata, true);

    dst_part->modification_time = dst_part->getDataPartStorage().getLastModified().epochTime();

    return dst_part;
}

std::pair<MergeTreeDataPartCloner::MutableDataPartPtr, scope_guard> cloneAndHandleHardlinksAndProjections(
    MergeTreeData * merge_tree_data,
    const DataPartPtr & src_part,
    const StorageMetadataPtr & metadata_snapshot,
    const MergeTreePartInfo & dst_part_info,
    const String & tmp_part_prefix,
    const ReadSettings & read_settings,
    const WriteSettings & write_settings,
    const IDataPartStorage::ClonePartParams & params)
{
    if (!doesStoragePolicyAllowSameDisk(merge_tree_data, src_part))
        throw Exception(
            ErrorCodes::BAD_ARGUMENTS,
            "Could not clone and load part {} because disk does not belong to storage policy",
            quoteString(src_part->getDataPartStorage().getFullPath()));

    auto [destination_part, temporary_directory_lock] = cloneSourcePart(
        merge_tree_data, src_part, metadata_snapshot, dst_part_info, tmp_part_prefix, read_settings, write_settings, params);

    if (!params.copy_instead_of_hardlink && params.hardlinked_files)
    {
        handleHardLinkedParameterFiles(src_part, params);
        handleProjections(src_part, params);
    }

    return std::make_pair(destination_part, std::move(temporary_directory_lock));
}
}

std::pair<MergeTreeDataPartCloner::MutableDataPartPtr, scope_guard> MergeTreeDataPartCloner::clone(
    MergeTreeData * merge_tree_data,
    const DataPartPtr & src_part,
    const StorageMetadataPtr & metadata_snapshot,
    const MergeTreePartInfo & dst_part_info,
    const String & tmp_part_prefix,
    bool require_part_metadata,
    const IDataPartStorage::ClonePartParams & params,
    const ReadSettings & read_settings,
    const WriteSettings & write_settings)
{
    auto [destination_part, temporary_directory_lock] = cloneAndHandleHardlinksAndProjections(
        merge_tree_data, src_part, metadata_snapshot, dst_part_info, tmp_part_prefix, read_settings, write_settings, params);

    return std::make_pair(finalizePart(destination_part, params, require_part_metadata), std::move(temporary_directory_lock));
}

std::pair<MergeTreeDataPartCloner::MutableDataPartPtr, scope_guard> MergeTreeDataPartCloner::cloneWithDistinctPartitionExpression(
    MergeTreeData * merge_tree_data,
    const DataPartPtr & src_part,
    const StorageMetadataPtr & metadata_snapshot,
    const MergeTreePartInfo & dst_part_info,
    const String & tmp_part_prefix,
    const ReadSettings & read_settings,
    const WriteSettings & write_settings,
    const MergeTreePartition & new_partition,
    const IMergeTreeDataPart::MinMaxIndex & new_min_max_index,
    bool sync_new_files,
    const IDataPartStorage::ClonePartParams & params)
{
    auto [destination_part, temporary_directory_lock] = cloneAndHandleHardlinksAndProjections(
        merge_tree_data, src_part, metadata_snapshot, dst_part_info, tmp_part_prefix, read_settings, write_settings, params);

    DistinctPartitionExpression::updateNewPartFiles(
        *merge_tree_data, destination_part, new_partition, new_min_max_index, src_part->storage.getInMemoryMetadataPtr(), sync_new_files);

    return std::make_pair(finalizePart(destination_part, params, false), std::move(temporary_directory_lock));
}

}
src/Storages/MergeTree/MergeTreeDataPartCloner.h (new file, 43 lines)
@@ -0,0 +1,43 @@
#pragma once

namespace DB
{

struct StorageInMemoryMetadata;
using StorageMetadataPtr = std::shared_ptr<const StorageInMemoryMetadata>;
struct MergeTreePartition;
class IMergeTreeDataPart;

class MergeTreeDataPartCloner
{
public:
    using DataPart = IMergeTreeDataPart;
    using MutableDataPartPtr = std::shared_ptr<DataPart>;
    using DataPartPtr = std::shared_ptr<const DataPart>;

    static std::pair<MutableDataPartPtr, scope_guard> clone(
        MergeTreeData * merge_tree_data,
        const DataPartPtr & src_part,
        const StorageMetadataPtr & metadata_snapshot,
        const MergeTreePartInfo & dst_part_info,
        const String & tmp_part_prefix,
        bool require_part_metadata,
        const IDataPartStorage::ClonePartParams & params,
        const ReadSettings & read_settings,
        const WriteSettings & write_settings);

    static std::pair<MutableDataPartPtr, scope_guard> cloneWithDistinctPartitionExpression(
        MergeTreeData * merge_tree_data,
        const DataPartPtr & src_part,
        const StorageMetadataPtr & metadata_snapshot,
        const MergeTreePartInfo & dst_part_info,
        const String & tmp_part_prefix,
        const ReadSettings & read_settings,
        const WriteSettings & write_settings,
        const MergeTreePartition & new_partition,
        const IMergeTreeDataPart::MinMaxIndex & new_min_max_index,
        bool sync_new_files,
        const IDataPartStorage::ClonePartParams & params);
};

}
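For orientation, a minimal usage sketch of the two entry points declared above. The variable names (dest_storage, src_part, clone_params and so on) are assumptions standing in for the real call sites in StorageMergeTree.cpp further below; this is not part of the diff itself.

/// Illustrative sketch only; surrounding variables are assumed to be in scope.
/// Same partition expression on both tables: a plain clone is enough.
auto [dst_part, tmp_lock] = MergeTreeDataPartCloner::clone(
    &dest_storage, src_part, metadata_snapshot, dst_part_info, "tmp_replace_from_",
    /*require_part_metadata=*/ false, clone_params, read_settings, write_settings);

/// Different partition expression: the clone additionally receives a recomputed
/// partition and min/max index, and its partition/minmax files are rewritten.
auto [dst_part2, tmp_lock2] = MergeTreeDataPartCloner::cloneWithDistinctPartitionExpression(
    &dest_storage, src_part, metadata_snapshot, dst_part_info, "tmp_replace_from_",
    read_settings, write_settings, new_partition, new_min_max_index,
    /*sync_new_files=*/ true, clone_params);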
src/Storages/MergeTree/MergeTreePartition.cpp
@@ -467,6 +467,45 @@ void MergeTreePartition::create(const StorageMetadataPtr & metadata_snapshot, Block block, size_t row, ContextPtr context)
    }
}

void MergeTreePartition::createAndValidateMinMaxPartitionIds(
    const StorageMetadataPtr & metadata_snapshot, Block block_with_min_max_partition_ids, ContextPtr context)
{
    if (!metadata_snapshot->hasPartitionKey())
        return;

    auto partition_key_names_and_types = executePartitionByExpression(metadata_snapshot, block_with_min_max_partition_ids, context);
    value.resize(partition_key_names_and_types.size());

    /// Executing the partition_by expression adds new columns to the passed block according to the partition functions.
    /// The block is passed by reference and is used afterwards. `moduloLegacy` needs to be substituted back
    /// with just `modulo`, because it was a temporary substitution.
    static constexpr std::string_view modulo_legacy_function_name = "moduloLegacy";

    size_t i = 0;
    for (const auto & element : partition_key_names_and_types)
    {
        auto & partition_column = block_with_min_max_partition_ids.getByName(element.name);

        if (element.name.starts_with(modulo_legacy_function_name))
            partition_column.name.replace(0, modulo_legacy_function_name.size(), "modulo");

        Field extracted_min_partition_id_field;
        Field extracted_max_partition_id_field;

        partition_column.column->get(0, extracted_min_partition_id_field);
        partition_column.column->get(1, extracted_max_partition_id_field);

        if (extracted_min_partition_id_field != extracted_max_partition_id_field)
        {
            throw Exception(
                ErrorCodes::INVALID_PARTITION_VALUE,
                "Cannot create the partition. A partition cannot contain values that have different partition ids");
        }

        partition_column.column->get(0u, value[i++]);
    }
}

NamesAndTypesList MergeTreePartition::executePartitionByExpression(const StorageMetadataPtr & metadata_snapshot, Block & block, ContextPtr context)
{
    auto adjusted_partition_key = adjustPartitionKey(metadata_snapshot, context);
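The block passed to createAndValidateMinMaxPartitionIds carries exactly two rows per partition column: the global minimum and the global maximum of the source parts. The attach is only safe when both rows land in the same destination partition: toYYYYMM over 2010-03-02 and 2010-03-03 yields 201003 twice and passes, while toYYYYMMDD yields 20100302 and 20100303 and is rejected (the serverError 248 cases in the stateless tests below). A self-contained sketch of that reduction, using a hypothetical integer stand-in for the real partition function:

// Self-contained illustration (not ClickHouse code): the validation boils down to
// "apply the destination partition function to the min row and the max row, require equality".
#include <cassert>

int toYYYYMM(int yyyymmdd) { return yyyymmdd / 100; }  // e.g. 20100302 -> 201003

bool sameDestinationPartition(int min_key, int max_key, int (*partition_fn)(int))
{
    return partition_fn(min_key) == partition_fn(max_key);
}

int main()
{
    assert(sameDestinationPartition(20100302, 20100303, toYYYYMM));  // both map to 201003: allowed
    // An identity partition function would map them to different partitions: rejected.
}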
src/Storages/MergeTree/MergeTreePartition.h
@@ -1,11 +1,12 @@
#pragma once

#include <Disks/IDisk.h>
#include <IO/WriteBuffer.h>
#include <Storages/KeyDescription.h>
#include <Storages/MergeTree/IPartMetadataManager.h>
#include <Core/Field.h>
#include <Storages/MergeTree/PartMetadataManagerOrdinary.h>
#include <base/types.h>

namespace DB
{
@@ -51,6 +52,11 @@ public:

    void create(const StorageMetadataPtr & metadata_snapshot, Block block, size_t row, ContextPtr context);

    /// Copy of MergeTreePartition::create, but also validates if min max partition keys are equal. If they are different,
    /// it means the partition can't be created because the data doesn't belong to the same partition.
    void createAndValidateMinMaxPartitionIds(
        const StorageMetadataPtr & metadata_snapshot, Block block_with_min_max_partition_ids, ContextPtr context);

    static void appendFiles(const MergeTreeData & storage, Strings & files);

    /// Adjust partition key and execute its expression on block. Return sample block according to used expression.
src/Storages/MergeTree/MergeTreePartitionCompatibilityVerifier.cpp (new file)
@@ -0,0 +1,91 @@
#include <Interpreters/MonotonicityCheckVisitor.h>
#include <Interpreters/getTableExpressions.h>
#include <Storages/MergeTree/MergeTreeData.h>
#include <Storages/MergeTree/MergeTreePartitionCompatibilityVerifier.h>
#include <Storages/MergeTree/MergeTreePartitionGlobalMinMaxIdxCalculator.h>

namespace DB
{

namespace ErrorCodes
{
    extern const int BAD_ARGUMENTS;
}

namespace
{
bool isDestinationPartitionExpressionMonotonicallyIncreasing(
    const std::vector<Range> & hyperrectangle, const MergeTreeData & destination_storage)
{
    auto destination_table_metadata = destination_storage.getInMemoryMetadataPtr();

    auto key_description = destination_table_metadata->getPartitionKey();
    auto definition_ast = key_description.definition_ast->clone();

    auto table_identifier = std::make_shared<ASTIdentifier>(destination_storage.getStorageID().getTableName());
    auto table_with_columns
        = TableWithColumnNamesAndTypes{DatabaseAndTableWithAlias(table_identifier), destination_table_metadata->getColumns().getOrdinary()};

    auto expression_list = extractKeyExpressionList(definition_ast);

    MonotonicityCheckVisitor::Data data{{table_with_columns}, destination_storage.getContext(), /*group_by_function_hashes*/ {}};

    for (auto i = 0u; i < expression_list->children.size(); i++)
    {
        data.range = hyperrectangle[i];

        MonotonicityCheckVisitor(data).visit(expression_list->children[i]);

        if (!data.monotonicity.is_monotonic || !data.monotonicity.is_positive)
            return false;
    }

    return true;
}

bool isExpressionDirectSubsetOf(const ASTPtr source, const ASTPtr destination)
{
    auto source_expression_list = extractKeyExpressionList(source);
    auto destination_expression_list = extractKeyExpressionList(destination);

    std::unordered_set<std::string> source_columns;

    for (auto i = 0u; i < source_expression_list->children.size(); ++i)
        source_columns.insert(source_expression_list->children[i]->getColumnName());

    for (auto i = 0u; i < destination_expression_list->children.size(); ++i)
        if (!source_columns.contains(destination_expression_list->children[i]->getColumnName()))
            return false;

    return true;
}
}

void MergeTreePartitionCompatibilityVerifier::verify(
    const MergeTreeData & source_storage, const MergeTreeData & destination_storage, const DataPartsVector & source_parts)
{
    const auto source_metadata = source_storage.getInMemoryMetadataPtr();
    const auto destination_metadata = destination_storage.getInMemoryMetadataPtr();

    const auto source_partition_key_ast = source_metadata->getPartitionKeyAST();
    const auto destination_partition_key_ast = destination_metadata->getPartitionKeyAST();

    // If destination partition expression columns are a subset of source partition expression columns,
    // there is no need to check for monotonicity.
    if (isExpressionDirectSubsetOf(source_partition_key_ast, destination_partition_key_ast))
        return;

    const auto src_global_min_max_indexes = MergeTreePartitionGlobalMinMaxIdxCalculator::calculate(source_parts, destination_storage);

    assert(!src_global_min_max_indexes.hyperrectangle.empty());

    if (!isDestinationPartitionExpressionMonotonicallyIncreasing(src_global_min_max_indexes.hyperrectangle, destination_storage))
        throw DB::Exception(ErrorCodes::BAD_ARGUMENTS, "Destination table partition expression is not monotonically increasing");

    MergeTreePartition().createAndValidateMinMaxPartitionIds(
        destination_storage.getInMemoryMetadataPtr(),
        src_global_min_max_indexes.getBlock(destination_storage),
        destination_storage.getContext());
}

}
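MonotonicityCheckVisitor consults IFunction monotonicity metadata rather than evaluating anything, but the property it certifies can be sketched by brute force on a toy range. The following standalone illustration is an assumption-level analogy, not the real visitor:

// Standalone illustration of the property being checked (the real visitor uses
// function monotonicity traits, not sampling).
#include <cassert>
#include <functional>

bool looksMonotonicallyIncreasingOnRange(const std::function<long(long)> & f, long lo, long hi)
{
    long prev = f(lo);
    for (long x = lo + 1; x <= hi; ++x)  // exhaustive only because the demo range is tiny
    {
        long cur = f(x);
        if (cur < prev)
            return false;
        prev = cur;
    }
    return true;
}

int main()
{
    // intDiv(A, 6) is monotonically increasing, so it may re-bucket a source range.
    assert(looksMonotonicallyIncreasingOnRange([](long a) { return a / 6; }, 0, 100));
    // A % 3 is not monotonic, so such a destination expression is rejected.
    assert(!looksMonotonicallyIncreasingOnRange([](long a) { return a % 3; }, 0, 100));
}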
src/Storages/MergeTree/MergeTreePartitionCompatibilityVerifier.h (new file)
@@ -0,0 +1,30 @@
#pragma once

#include <Core/Field.h>
#include <Storages/MergeTree/IMergeTreeDataPart.h>

namespace DB
{

/*
 * Verifies that source and destination partitions are compatible.
 * To be compatible, one of the following criteria must be met:
 * 1. Destination partition expression columns are a subset of source partition columns; or
 * 2. Destination partition expression is monotonically increasing on the source global min_max idx range AND the computed
 * partition id for the source global min_max idx range is the same.
 *
 * If not, an exception is thrown.
 * */

class MergeTreePartitionCompatibilityVerifier
{
public:
    using DataPart = IMergeTreeDataPart;
    using DataPartPtr = std::shared_ptr<const DataPart>;
    using DataPartsVector = std::vector<DataPartPtr>;

    static void
    verify(const MergeTreeData & source_storage, const MergeTreeData & destination_storage, const DataPartsVector & source_parts);
};

}
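Criterion 1 compares expression elements by column name, so it covers both plain column tuples and identical function applications. A standalone sketch of that subset check, with assumed names:

// Standalone sketch of criterion 1 (assumed names; the real check walks parsed key ASTs).
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

bool isDirectSubset(const std::vector<std::string> & source, const std::vector<std::string> & destination)
{
    std::unordered_set<std::string> source_columns(source.begin(), source.end());
    for (const auto & column : destination)
        if (!source_columns.contains(column))
            return false;
    return true;
}

int main()
{
    // PARTITION BY (a, b, c) -> PARTITION BY (a, b): every destination element exists in the source.
    assert(isDirectSubset({"a", "b", "c"}, {"a", "b"}));
    // PARTITION BY toString(category) -> PARTITION BY toString(productName): rejected.
    assert(!isDirectSubset({"toString(category)"}, {"toString(productName)"}));
}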
src/Storages/MergeTree/MergeTreePartitionGlobalMinMaxIdxCalculator.cpp (new file)
@@ -0,0 +1,25 @@
#include <Storages/MergeTree/MergeTreePartitionGlobalMinMaxIdxCalculator.h>

namespace DB
{

IMergeTreeDataPart::MinMaxIndex
MergeTreePartitionGlobalMinMaxIdxCalculator::calculate(const DataPartsVector & parts, const MergeTreeData & storage)
{
    IMergeTreeDataPart::MinMaxIndex global_min_max_indexes;

    for (const auto & part : parts)
    {
        auto metadata_manager = std::make_shared<PartMetadataManagerOrdinary>(part.get());

        auto local_min_max_index = MergeTreeData::DataPart::MinMaxIndex();

        local_min_max_index.load(storage, metadata_manager);

        global_min_max_indexes.merge(local_min_max_index);
    }

    return global_min_max_indexes;
}

}
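The merge step reduces all per-part min/max indexes to one global hyperrectangle; element-wise it is just the union of [min, max] ranges. A standalone sketch with a plain-int stand-in for Field ranges:

// Standalone sketch of the merge (hypothetical ints instead of Field-based ranges).
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

using Range = std::pair<int, int>;  // [min, max]

Range mergeRanges(const std::vector<Range> & per_part)
{
    Range global = per_part.front();
    for (const auto & [lo, hi] : per_part)
    {
        global.first = std::min(global.first, lo);
        global.second = std::max(global.second, hi);
    }
    return global;
}

int main()
{
    // Three parts covering A in [1,3], [2,8] and [5,6] yield a global range of [1,8].
    assert(mergeRanges({{1, 3}, {2, 8}, {5, 6}}) == Range(1, 8));
}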
src/Storages/MergeTree/MergeTreePartitionGlobalMinMaxIdxCalculator.h (new file)
@@ -0,0 +1,24 @@
#pragma once

#include <utility>

#include <Core/Field.h>
#include <Storages/MergeTree/MergeTreeData.h>

namespace DB
{

/*
 * Calculates global min max indexes for a given set of parts on a given storage.
 * */
class MergeTreePartitionGlobalMinMaxIdxCalculator
{
    using DataPart = IMergeTreeDataPart;
    using DataPartPtr = std::shared_ptr<const DataPart>;
    using DataPartsVector = std::vector<DataPartPtr>;

public:
    static IMergeTreeDataPart::MinMaxIndex calculate(const DataPartsVector & parts, const MergeTreeData & storage);
};

}
src/Storages/StorageMergeTree.cpp
@@ -5,9 +5,9 @@
#include <optional>
#include <ranges>

#include <Backups/BackupEntriesCollector.h>
#include <Databases/IDatabase.h>
#include <IO/copyData.h>
#include "Common/Exception.h"
#include <Common/MemoryTracker.h>
#include <Common/escapeForFileName.h>
@@ -20,25 +20,30 @@
#include <Interpreters/TransactionLog.h>
#include <Interpreters/ClusterProxy/executeQuery.h>
#include <Interpreters/ClusterProxy/SelectStreamFactory.h>
#include <Interpreters/InterpreterAlterQuery.h>
#include <Interpreters/InterpreterSelectQueryAnalyzer.h>
#include <Parsers/ASTCheckQuery.h>
#include <Parsers/ASTFunction.h>
#include <Parsers/ASTLiteral.h>
#include <Parsers/ASTPartition.h>
#include <Parsers/ASTSetQuery.h>
#include <Parsers/formatAST.h>
#include <Parsers/queryToString.h>
#include <Planner/Utils.h>
#include <Storages/MergeTree/MergeTreeData.h>
#include <Storages/MergeTree/ActiveDataPartSet.h>
#include <Storages/AlterCommands.h>
#include <Storages/MergeTree/MergeList.h>
#include <Storages/MergeTree/MergePlainMergeTreeTask.h>
#include <Storages/MergeTree/MergeTreeDataPartInMemory.h>
#include <Storages/MergeTree/MergeTreePartitionCompatibilityVerifier.h>
#include <Storages/MergeTree/MergeTreeSink.h>
#include <Storages/MergeTree/PartMetadataManagerOrdinary.h>
#include <Storages/MergeTree/PartitionPruner.h>
#include <Storages/MergeTree/checkDataPart.h>
#include <Storages/PartitionCommands.h>
#include <base/sort.h>
#include <Storages/buildQueryTreeForShard.h>
#include <QueryPipeline/Pipe.h>
#include <Processors/QueryPlan/QueryPlan.h>
#include <Processors/QueryPlan/BuildQueryPipelineSettings.h>
@@ -2039,41 +2044,73 @@ void StorageMergeTree::replacePartitionFrom(const StoragePtr & source_table, const ASTPtr & partition, bool replace, ContextPtr local_context)
    ProfileEventsScope profile_events_scope;

    MergeTreeData & src_data = checkStructureAndGetMergeTreeData(source_table, source_metadata_snapshot, my_metadata_snapshot);
    String partition_id = src_data.getPartitionIDFromQuery(partition, local_context);

    DataPartsVector src_parts = src_data.getVisibleDataPartsVectorInPartition(local_context, partition_id);

    bool attach_empty_partition = !replace && src_parts.empty();
    if (attach_empty_partition)
        return;

    MutableDataPartsVector dst_parts;
    std::vector<scope_guard> dst_parts_locks;

    static const String TMP_PREFIX = "tmp_replace_from_";

    const auto my_partition_expression = my_metadata_snapshot->getPartitionKeyAST();
    const auto src_partition_expression = source_metadata_snapshot->getPartitionKeyAST();
    const auto is_partition_exp_different = queryToStringNullable(my_partition_expression) != queryToStringNullable(src_partition_expression);

    if (is_partition_exp_different && !src_parts.empty())
        MergeTreePartitionCompatibilityVerifier::verify(src_data, /* destination_storage */ *this, src_parts);

    for (DataPartPtr & src_part : src_parts)
    {
        if (!canReplacePartition(src_part))
            throw Exception(ErrorCodes::BAD_ARGUMENTS,
                            "Cannot replace partition '{}' because part '{}' has inconsistent granularity with table",
                            partition_id, src_part->name);

        IDataPartStorage::ClonePartParams clone_params{.txn = local_context->getCurrentTransaction()};

        /// This will generate unique name in scope of current server process.
        auto index = insert_increment.get();

        if (is_partition_exp_different)
        {
            auto [new_partition, new_min_max_index] = createPartitionAndMinMaxIndexFromSourcePart(
                src_part, my_metadata_snapshot, local_context);

            auto [dst_part, part_lock] = cloneAndLoadPartOnSameDiskWithDifferentPartitionKey(
                src_part,
                new_partition,
                new_partition.getID(*this),
                new_min_max_index,
                TMP_PREFIX,
                my_metadata_snapshot,
                clone_params,
                local_context,
                index,
                index);

            dst_parts.emplace_back(std::move(dst_part));
            dst_parts_locks.emplace_back(std::move(part_lock));
        }
        else
        {
            MergeTreePartInfo dst_part_info(partition_id, index, index, src_part->info.level);

            auto [dst_part, part_lock] = cloneAndLoadDataPartOnSameDisk(
                src_part,
                TMP_PREFIX,
                dst_part_info,
                my_metadata_snapshot,
                clone_params,
                local_context->getReadSettings(),
                local_context->getWriteSettings());
            dst_parts.emplace_back(std::move(dst_part));
            dst_parts_locks.emplace_back(std::move(part_lock));
        }
    }

    MergeTreePartInfo drop_range;
    if (replace)
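The gate for the new code path is a textual comparison of the serialized partition key ASTs; queryToStringNullable maps a missing PARTITION BY clause to an empty value so that two unpartitioned tables still compare equal. A standalone sketch of that comparison (the names and the std::optional stand-in are assumptions):

// Standalone sketch: comparing two serialized, possibly-absent partition expressions.
#include <cassert>
#include <optional>
#include <string>

using SerializedAST = std::optional<std::string>;  // std::nullopt models a missing PARTITION BY

bool partitionExpressionsDiffer(const SerializedAST & destination, const SerializedAST & source)
{
    return destination != source;  // nullopt == nullopt, so two unpartitioned tables compare equal
}

int main()
{
    assert(partitionExpressionsDiffer("toYYYYMM(timestamp)", "toYYYYMMDD(timestamp)"));
    assert(!partitionExpressionsDiffer("toYYYYMM(timestamp)", "toYYYYMM(timestamp)"));
    assert(partitionExpressionsDiffer(std::nullopt, "toYYYYMM(timestamp)"));  // partitioned -> unpartitioned
}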
src/Storages/StorageReplicatedMergeTree.cpp
@@ -26,22 +26,21 @@

#include <base/sort.h>

#include <Storages/AlterCommands.h>
#include <Storages/ColumnsDescription.h>
#include <Storages/Freeze.h>
#include <Storages/MergeTree/AsyncBlockIDsCache.h>
#include <Storages/MergeTree/DataPartStorageOnDiskFull.h>
#include <Storages/MergeTree/IMergeTreeDataPart.h>
#include <Storages/MergeTree/LeaderElection.h>
#include <Storages/MergeTree/MergeFromLogEntryTask.h>
#include <Storages/MergeTree/MergeList.h>
#include <Storages/MergeTree/MergeTreeBackgroundExecutor.h>
#include <Storages/MergeTree/MergeTreeDataFormatVersion.h>
#include <Storages/MergeTree/MergeTreePartInfo.h>
#include <Storages/MergeTree/MergeTreePartitionCompatibilityVerifier.h>
#include <Storages/MergeTree/MergeTreeReaderCompact.h>
#include <Storages/MergeTree/MergedBlockOutputStream.h>
#include <Storages/MergeTree/MutateFromLogEntryTask.h>
#include <Storages/MergeTree/PinnedPartUUIDs.h>
#include <Storages/MergeTree/ReplicatedMergeTreeAddress.h>
@@ -53,9 +52,11 @@
#include <Storages/MergeTree/ReplicatedMergeTreeSink.h>
#include <Storages/MergeTree/ReplicatedMergeTreeTableMetadata.h>
#include <Storages/MergeTree/ZeroCopyLock.h>
#include <Storages/MergeTree/extractZkPathFromCreateQuery.h>
#include <Storages/PartitionCommands.h>
#include <Storages/StorageReplicatedMergeTree.h>
#include <Storages/VirtualColumnUtils.h>
#include <Storages/buildQueryTreeForShard.h>

#include <Databases/DatabaseOnDisk.h>
#include <Databases/DatabaseReplicated.h>
@@ -2713,16 +2714,48 @@ bool StorageReplicatedMergeTree::executeReplaceRange(LogEntry & entry)
                .copy_instead_of_hardlink = storage_settings_ptr->always_use_copy_instead_of_hardlinks || ((our_zero_copy_enabled || source_zero_copy_enabled) && part_desc->src_table_part->isStoredOnRemoteDiskWithZeroCopySupport()),
                .metadata_version_to_write = metadata_snapshot->getMetadataVersion()
            };

            const auto my_partition_expression = metadata_snapshot->getPartitionKeyAST();
            const auto src_partition_expression = source_table->getInMemoryMetadataPtr()->getPartitionKeyAST();

            const auto is_partition_exp_different = queryToStringNullable(my_partition_expression) != queryToStringNullable(src_partition_expression);

            if (is_partition_exp_different)
            {
                auto [new_partition, new_min_max_index] = createPartitionAndMinMaxIndexFromSourcePart(
                    part_desc->src_table_part, metadata_snapshot, getContext());

                auto partition_id = new_partition.getID(*this);

                auto [res_part, temporary_part_lock] = cloneAndLoadPartOnSameDiskWithDifferentPartitionKey(
                    part_desc->src_table_part,
                    new_partition,
                    partition_id,
                    new_min_max_index,
                    TMP_PREFIX + "clone_",
                    metadata_snapshot,
                    clone_params,
                    getContext(),
                    part_desc->new_part_info.min_block,
                    part_desc->new_part_info.max_block);

                part_desc->res_part = std::move(res_part);
                part_desc->temporary_part_lock = std::move(temporary_part_lock);
            }
            else
            {
                auto [res_part, temporary_part_lock] = cloneAndLoadDataPartOnSameDisk(
                    part_desc->src_table_part,
                    TMP_PREFIX + "clone_",
                    part_desc->new_part_info,
                    metadata_snapshot,
                    clone_params,
                    getContext()->getReadSettings(),
                    getContext()->getWriteSettings());

                part_desc->res_part = std::move(res_part);
                part_desc->temporary_part_lock = std::move(temporary_part_lock);
            }
        }
        else if (!part_desc->replica.empty())
        {
@@ -7852,11 +7885,22 @@ void StorageReplicatedMergeTree::replacePartitionFrom(
    ProfileEventsScope profile_events_scope;

    MergeTreeData & src_data = checkStructureAndGetMergeTreeData(source_table, source_metadata_snapshot, metadata_snapshot);
    String partition_id = src_data.getPartitionIDFromQuery(partition, query_context);

    /// NOTE: Some covered parts may be missing in src_all_parts if corresponding log entries are not executed yet.
    DataPartsVector src_all_parts = src_data.getVisibleDataPartsVectorInPartition(query_context, partition_id);

    bool attach_empty_partition = !replace && src_all_parts.empty();
    if (attach_empty_partition)
        return;

    const auto my_partition_expression = metadata_snapshot->getPartitionKeyAST();
    const auto src_partition_expression = source_metadata_snapshot->getPartitionKeyAST();
    const auto is_partition_exp_different = queryToStringNullable(my_partition_expression) != queryToStringNullable(src_partition_expression);

    if (is_partition_exp_different && !src_all_parts.empty())
        MergeTreePartitionCompatibilityVerifier::verify(src_data, /* destination_storage */ *this, src_all_parts);

    LOG_DEBUG(log, "Cloning {} parts", src_all_parts.size());

    static const String TMP_PREFIX = "tmp_replace_from_";
@@ -7911,6 +7955,18 @@ void StorageReplicatedMergeTree::replacePartitionFrom(
                            "Cannot replace partition '{}' because part '{}"
                            "' has inconsistent granularity with table", partition_id, src_part->name);

        IMergeTreeDataPart::MinMaxIndex min_max_index = *src_part->minmax_idx;
        MergeTreePartition merge_tree_partition = src_part->partition;

        if (is_partition_exp_different)
        {
            auto [new_partition, new_min_max_index] = createPartitionAndMinMaxIndexFromSourcePart(src_part, metadata_snapshot, query_context);

            merge_tree_partition = new_partition;
            min_max_index = new_min_max_index;
            partition_id = merge_tree_partition.getID(*this);
        }

        String hash_hex = src_part->checksums.getTotalChecksumHex();
        const bool is_duplicated_part = replaced_parts.contains(hash_hex);
        replaced_parts.insert(hash_hex);
@@ -7929,27 +7985,52 @@ void StorageReplicatedMergeTree::replacePartitionFrom(
            continue;
        }

        bool zero_copy_enabled = storage_settings_ptr->allow_remote_fs_zero_copy_replication
            || dynamic_cast<const MergeTreeData *>(source_table.get())->getSettings()->allow_remote_fs_zero_copy_replication;

        UInt64 index = lock->getNumber();

        IDataPartStorage::ClonePartParams clone_params
        {
            .copy_instead_of_hardlink = storage_settings_ptr->always_use_copy_instead_of_hardlinks || (zero_copy_enabled && src_part->isStoredOnRemoteDiskWithZeroCopySupport()),
            .metadata_version_to_write = metadata_snapshot->getMetadataVersion()
        };

        if (is_partition_exp_different)
        {
            auto [dst_part, part_lock] = cloneAndLoadPartOnSameDiskWithDifferentPartitionKey(
                src_part,
                merge_tree_partition,
                partition_id,
                min_max_index,
                TMP_PREFIX,
                metadata_snapshot,
                clone_params,
                query_context,
                index,
                index);

            dst_parts.emplace_back(dst_part);
            dst_parts_locks.emplace_back(std::move(part_lock));
        }
        else
        {
            MergeTreePartInfo dst_part_info(partition_id, index, index, src_part->info.level);

            auto [dst_part, part_lock] = cloneAndLoadDataPartOnSameDisk(
                src_part,
                TMP_PREFIX,
                dst_part_info,
                metadata_snapshot,
                clone_params,
                query_context->getReadSettings(),
                query_context->getWriteSettings());

            dst_parts.emplace_back(dst_part);
            dst_parts_locks.emplace_back(std::move(part_lock));
        }

        src_parts.emplace_back(src_part);
        ephemeral_locks.emplace_back(std::move(*lock));
        block_id_paths.emplace_back(block_id_path);
        part_checksums.emplace_back(hash_hex);
configs/remote_servers.xml (new file)
@@ -0,0 +1,17 @@
<clickhouse>
    <remote_servers>
        <test_cluster>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>replica1</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>replica2</host>
                    <port>9000</port>
                </replica>
            </shard>
        </test_cluster>
    </remote_servers>
</clickhouse>
@@ -0,0 +1,214 @@
import pytest
from helpers.cluster import ClickHouseCluster
from helpers.test_tools import assert_eq_with_retry

cluster = ClickHouseCluster(__file__)

replica1 = cluster.add_instance(
    "replica1", with_zookeeper=True, main_configs=["configs/remote_servers.xml"]
)
replica2 = cluster.add_instance(
    "replica2", with_zookeeper=True, main_configs=["configs/remote_servers.xml"]
)


@pytest.fixture(scope="module")
def start_cluster():
    try:
        cluster.start()
        yield cluster
    except Exception as ex:
        print(ex)
    finally:
        cluster.shutdown()


def cleanup(nodes):
    for node in nodes:
        node.query("DROP TABLE IF EXISTS source SYNC")
        node.query("DROP TABLE IF EXISTS destination SYNC")


def create_table(node, table_name, replicated):
    replica = node.name
    engine = (
        f"ReplicatedMergeTree('/clickhouse/tables/1/{table_name}', '{replica}')"
        if replicated
        else "MergeTree()"
    )
    partition_expression = (
        "toYYYYMMDD(timestamp)" if table_name == "source" else "toYYYYMM(timestamp)"
    )
    node.query_with_retry(
        """
        CREATE TABLE {table_name}(timestamp DateTime)
        ENGINE = {engine}
        ORDER BY tuple() PARTITION BY {partition_expression}
        SETTINGS cleanup_delay_period=1, cleanup_delay_period_random_add=1, max_cleanup_delay_period=1;
        """.format(
            table_name=table_name,
            engine=engine,
            partition_expression=partition_expression,
        )
    )


def test_both_replicated(start_cluster):
    for node in [replica1, replica2]:
        create_table(node, "source", True)
        create_table(node, "destination", True)

    replica1.query("INSERT INTO source VALUES ('2010-03-02 02:01:01')")
    replica1.query("SYSTEM SYNC REPLICA source")
    replica1.query("SYSTEM SYNC REPLICA destination")
    replica1.query(
        "ALTER TABLE destination ATTACH PARTITION ID '20100302' FROM source"
    )

    assert_eq_with_retry(
        replica1, "SELECT * FROM destination", "2010-03-02 02:01:01\n"
    )
    assert_eq_with_retry(
        replica1,
        "SELECT * FROM destination",
        replica2.query("SELECT * FROM destination"),
    )

    cleanup([replica1, replica2])


def test_only_destination_replicated(start_cluster):
    create_table(replica1, "source", False)
    create_table(replica1, "destination", True)
    create_table(replica2, "destination", True)

    replica1.query("INSERT INTO source VALUES ('2010-03-02 02:01:01')")
    replica1.query("SYSTEM SYNC REPLICA destination")
    replica1.query(
        "ALTER TABLE destination ATTACH PARTITION ID '20100302' FROM source"
    )

    assert_eq_with_retry(
        replica1, "SELECT * FROM destination", "2010-03-02 02:01:01\n"
    )
    assert_eq_with_retry(
        replica1,
        "SELECT * FROM destination",
        replica2.query("SELECT * FROM destination"),
    )

    cleanup([replica1, replica2])


def test_both_replicated_partitioned_to_unpartitioned(start_cluster):
    def create_tables(nodes):
        for node in nodes:
            source_engine = (
                f"ReplicatedMergeTree('/clickhouse/tables/1/source', '{node.name}')"
            )
            node.query(
                """
                CREATE TABLE source(timestamp DateTime)
                ENGINE = {engine}
                ORDER BY tuple() PARTITION BY toYYYYMMDD(timestamp)
                SETTINGS cleanup_delay_period=1, cleanup_delay_period_random_add=1, max_cleanup_delay_period=1;
                """.format(
                    engine=source_engine,
                )
            )

            destination_engine = f"ReplicatedMergeTree('/clickhouse/tables/1/destination', '{node.name}')"
            node.query(
                """
                CREATE TABLE destination(timestamp DateTime)
                ENGINE = {engine}
                ORDER BY tuple() PARTITION BY tuple()
                SETTINGS cleanup_delay_period=1, cleanup_delay_period_random_add=1, max_cleanup_delay_period=1;
                """.format(
                    engine=destination_engine,
                )
            )

    create_tables([replica1, replica2])

    replica1.query("INSERT INTO source VALUES ('2010-03-02 02:01:01')")
    replica1.query("INSERT INTO source VALUES ('2010-03-03 02:01:01')")
    replica1.query("SYSTEM SYNC REPLICA source")
    replica1.query("SYSTEM SYNC REPLICA destination")

    replica1.query(
        "ALTER TABLE destination ATTACH PARTITION ID '20100302' FROM source"
    )
    replica1.query(
        "ALTER TABLE destination ATTACH PARTITION ID '20100303' FROM source"
    )

    assert_eq_with_retry(
        replica1,
        "SELECT * FROM destination ORDER BY timestamp",
        "2010-03-02 02:01:01\n2010-03-03 02:01:01\n",
    )
    assert_eq_with_retry(
        replica1,
        "SELECT * FROM destination ORDER BY timestamp",
        replica2.query("SELECT * FROM destination ORDER BY timestamp"),
    )

    cleanup([replica1, replica2])


def test_both_replicated_different_exp_same_id(start_cluster):
    def create_tables(nodes):
        for node in nodes:
            source_engine = (
                f"ReplicatedMergeTree('/clickhouse/tables/1/source', '{node.name}')"
            )
            node.query(
                """
                CREATE TABLE source(a UInt16,b UInt16,c UInt16,extra UInt64,Path String,Time DateTime,Value Float64,Timestamp Int64,sign Int8)
                ENGINE = {engine}
                ORDER BY tuple() PARTITION BY a % 3
                SETTINGS cleanup_delay_period=1, cleanup_delay_period_random_add=1, max_cleanup_delay_period=1;
                """.format(
                    engine=source_engine,
                )
            )

            destination_engine = f"ReplicatedMergeTree('/clickhouse/tables/1/destination', '{node.name}')"
            node.query(
                """
                CREATE TABLE destination(a UInt16,b UInt16,c UInt16,extra UInt64,Path String,Time DateTime,Value Float64,Timestamp Int64,sign Int8)
                ENGINE = {engine}
                ORDER BY tuple() PARTITION BY a
                SETTINGS cleanup_delay_period=1, cleanup_delay_period_random_add=1, max_cleanup_delay_period=1;
                """.format(
                    engine=destination_engine,
                )
            )

    create_tables([replica1, replica2])

    replica1.query(
        "INSERT INTO source (a, b, c, extra, sign) VALUES (1, 5, 9, 1000, 1)"
    )
    replica1.query(
        "INSERT INTO source (a, b, c, extra, sign) VALUES (2, 6, 10, 1000, 1)"
    )
    replica1.query("SYSTEM SYNC REPLICA source")
    replica1.query("SYSTEM SYNC REPLICA destination")

    replica1.query("ALTER TABLE destination ATTACH PARTITION 1 FROM source")
    replica1.query("ALTER TABLE destination ATTACH PARTITION 2 FROM source")

    assert_eq_with_retry(
        replica1,
        "SELECT * FROM destination ORDER BY a",
        "1\t5\t9\t1000\t\t1970-01-01 00:00:00\t0\t0\t1\n2\t6\t10\t1000\t\t1970-01-01 00:00:00\t0\t0\t1\n",
    )
    assert_eq_with_retry(
        replica1,
        "SELECT * FROM destination ORDER BY a",
        replica2.query("SELECT * FROM destination ORDER BY a"),
    )

    cleanup([replica1, replica2])
@ -0,0 +1,467 @@
|
||||
-- { echoOn }
|
||||
-- Should be allowed since destination partition expr is monotonically increasing and compatible
|
||||
DROP TABLE IF EXISTS source;
|
||||
DROP TABLE IF EXISTS destination;
|
||||
CREATE TABLE source (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMMDD(timestamp);
|
||||
CREATE TABLE destination (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);
|
||||
INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01'), ('2010-03-02 02:01:03');
|
||||
ALTER TABLE destination ATTACH PARTITION ID '20100302' FROM source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
201003
|
||||
TRUNCATE TABLE destination;
|
||||
ALTER TABLE destination ATTACH PARTITION '20100302' FROM source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
201003
|
||||
-- Should be allowed since destination partition expr is monotonically increasing and compatible. Note that even though
|
||||
-- the destination partition expression is more granular, the data would still fall in the same partition. Thus, it is valid
|
||||
DROP TABLE IF EXISTS source;
|
||||
DROP TABLE IF EXISTS destination;
|
||||
CREATE TABLE source (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);
|
||||
CREATE TABLE destination (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMMDD(timestamp);
|
||||
INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01'), ('2010-03-02 02:01:03');
|
||||
ALTER TABLE destination ATTACH PARTITION ID '201003' FROM source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
20100302
|
||||
TRUNCATE TABLE destination;
|
||||
ALTER TABLE destination ATTACH PARTITION '201003' FROM source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
20100302
|
||||
-- Should be allowed since destination partition expr is monotonically increasing and compatible for those specific values
|
||||
DROP TABLE IF EXISTS source;
|
||||
DROP TABLE IF EXISTS destination;
|
||||
CREATE TABLE source (timestamp DateTime, A Int64) engine=MergeTree ORDER BY timestamp PARTITION BY intDiv(A, 6);
|
||||
CREATE TABLE destination (timestamp DateTime, A Int64) engine=MergeTree ORDER BY timestamp PARTITION BY A;
|
||||
INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01', 1), ('2010-03-02 02:01:03', 1);
|
||||
ALTER TABLE destination ATTACH PARTITION ID '0' FROM source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
2010-03-02 02:01:01 1
|
||||
2010-03-02 02:01:03 1
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
2010-03-02 02:01:01 1
|
||||
2010-03-02 02:01:03 1
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
1
|
||||
TRUNCATE TABLE destination;
|
||||
ALTER TABLE destination ATTACH PARTITION 0 FROM source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
2010-03-02 02:01:01 1
|
||||
2010-03-02 02:01:03 1
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
2010-03-02 02:01:01 1
|
||||
2010-03-02 02:01:03 1
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
1
|
||||
-- Should be allowed because dst partition exp is monot inc and data is not split
|
||||
DROP TABLE IF EXISTS source;
|
||||
DROP TABLE IF EXISTS destination;
|
||||
CREATE TABLE source (productName String, category String) engine=MergeTree ORDER BY tuple() PARTITION BY cityHash64(category);
|
||||
CREATE TABLE destination (productName String, category String) engine=MergeTree ORDER BY tuple() PARTITION BY toString(category);
|
||||
INSERT INTO TABLE source VALUES ('spaghetti', 'food'), ('mop', 'general');
|
||||
INSERT INTO TABLE source VALUES ('rice', 'food');
|
||||
ALTER TABLE destination ATTACH PARTITION ID '17908065610379824077' from source;
|
||||
SELECT * FROM source ORDER BY productName;
|
||||
mop general
|
||||
rice food
|
||||
spaghetti food
|
||||
SELECT * FROM destination ORDER BY productName;
|
||||
rice food
|
||||
spaghetti food
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
59532f3c39a412a413f0f014c7750a9d
|
||||
59532f3c39a412a413f0f014c7750a9d
|
||||
TRUNCATE TABLE destination;
|
||||
ALTER TABLE destination ATTACH PARTITION '17908065610379824077' from source;
|
||||
SELECT * FROM source ORDER BY productName;
|
||||
mop general
|
||||
rice food
|
||||
spaghetti food
|
||||
SELECT * FROM destination ORDER BY productName;
|
||||
rice food
|
||||
spaghetti food
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
59532f3c39a412a413f0f014c7750a9d
|
||||
59532f3c39a412a413f0f014c7750a9d
|
||||
-- Should be allowed, extra test case to validate https://github.com/ClickHouse/ClickHouse/pull/39507#issuecomment-1747574133
|
||||
|
||||
DROP TABLE IF EXISTS source;
|
||||
DROP TABLE IF EXISTS destination;
|
||||
CREATE TABLE source (timestamp Int64) engine=MergeTree ORDER BY (timestamp) PARTITION BY intDiv(timestamp, 86400000);
|
||||
CREATE TABLE destination (timestamp Int64) engine=MergeTree ORDER BY (timestamp) PARTITION BY toYear(toDateTime(intDiv(timestamp, 1000)));
|
||||
INSERT INTO TABLE source VALUES (1267495261123);
|
||||
ALTER TABLE destination ATTACH PARTITION ID '14670' FROM source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
1267495261123
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
1267495261123
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
2010
|
||||
TRUNCATE TABLE destination;
|
||||
ALTER TABLE destination ATTACH PARTITION '14670' from source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
1267495261123
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
1267495261123
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
2010
|
||||
-- Should be allowed, extra test case to validate https://github.com/ClickHouse/ClickHouse/pull/39507#issuecomment-1747511726
|
||||
|
||||
DROP TABLE IF EXISTS source;
|
||||
DROP TABLE IF EXISTS destination;
|
||||
CREATE TABLE source (timestamp DateTime('UTC'), key Int64, f Float64) engine=MergeTree ORDER BY (key, timestamp) PARTITION BY toYear(timestamp);
|
||||
CREATE TABLE destination (timestamp DateTime('UTC'), key Int64, f Float64) engine=MergeTree ORDER BY (key, timestamp) PARTITION BY (intDiv(toUInt32(timestamp),86400));
|
||||
INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01',1,1),('2010-03-02 02:01:01',1,1),('2011-02-02 02:01:03',1,1);
|
||||
ALTER TABLE destination ATTACH PARTITION ID '2010' FROM source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
2010-03-02 02:01:01 1 1
|
||||
2010-03-02 02:01:01 1 1
|
||||
2011-02-02 02:01:03 1 1
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
2010-03-02 02:01:01 1 1
|
||||
2010-03-02 02:01:01 1 1
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
14670
|
||||
TRUNCATE TABLE destination;
|
||||
ALTER TABLE destination ATTACH PARTITION '2010' from source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
2010-03-02 02:01:01 1 1
|
||||
2010-03-02 02:01:01 1 1
|
||||
2011-02-02 02:01:03 1 1
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
2010-03-02 02:01:01 1 1
|
||||
2010-03-02 02:01:01 1 1
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
14670
|
||||
-- Should be allowed, partitioned table to unpartitioned. Since the destination is unpartitioned, parts would ultimately
|
||||
-- fall into the same partition.
|
||||
-- Destination partition by expression is omitted, which causes StorageMetadata::getPartitionKeyAST() to be nullptr.
|
||||
DROP TABLE IF EXISTS source;
|
||||
DROP TABLE IF EXISTS destination;
|
||||
CREATE TABLE source (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);
|
||||
CREATE TABLE destination (timestamp DateTime) engine=MergeTree ORDER BY tuple();
|
||||
INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01'), ('2010-03-02 02:01:03');
|
||||
ALTER TABLE destination ATTACH PARTITION ID '201003' FROM source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
all
|
||||
TRUNCATE TABLE destination;
|
||||
ALTER TABLE destination ATTACH PARTITION '201003' from source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
all
|
||||
-- Same as above, but destination partition by expression is explicitly defined. Test case required to validate that
|
||||
-- partition by tuple() is accepted.
|
||||
DROP TABLE IF EXISTS source;
|
||||
DROP TABLE IF EXISTS destination;
|
||||
CREATE TABLE source (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);
|
||||
CREATE TABLE destination (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY tuple();
|
||||
INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01'), ('2010-03-02 02:01:03');
|
||||
ALTER TABLE destination ATTACH PARTITION ID '201003' FROM source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
all
|
||||
TRUNCATE TABLE destination;
|
||||
ALTER TABLE destination ATTACH PARTITION '201003' from source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
all
|
||||
-- Should be allowed because the destination partition expression columns are a subset of the source partition expression columns
|
||||
-- Columns in this case refer to the expression elements, not to the actual table columns
|
||||
DROP TABLE IF EXISTS source;
|
||||
DROP TABLE IF EXISTS destination;
|
||||
CREATE TABLE source (a Int, b Int, c Int) engine=MergeTree ORDER BY tuple() PARTITION BY (a, b, c);
|
||||
CREATE TABLE destination (a Int, b Int, c Int) engine=MergeTree ORDER BY tuple() PARTITION BY (a, b);
|
||||
INSERT INTO TABLE source VALUES (1, 2, 3), (1, 2, 4);
|
||||
ALTER TABLE destination ATTACH PARTITION ID '1-2-3' FROM source;
|
||||
SELECT * FROM source ORDER BY (a, b, c);
|
||||
1 2 3
|
||||
1 2 4
|
||||
SELECT * FROM destination ORDER BY (a, b, c);
|
||||
1 2 3
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
1-2
|
||||
TRUNCATE TABLE destination;
|
||||
ALTER TABLE destination ATTACH PARTITION (1, 2, 3) from source;
|
||||
SELECT * FROM source ORDER BY (a, b, c);
|
||||
1 2 3
|
||||
1 2 4
|
||||
SELECT * FROM destination ORDER BY (a, b, c);
|
||||
1 2 3
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
1-2
|
||||
-- Should be allowed because the destination partition expression columns are a subset of the source partition expression columns
|
||||
-- Columns in this case refer to the expression elements, not to the actual table columns
|
||||
DROP TABLE IF EXISTS source;
|
||||
DROP TABLE IF EXISTS destination;
|
||||
CREATE TABLE source (a Int, b Int, c Int) engine=MergeTree ORDER BY tuple() PARTITION BY (a, b, c);
|
||||
CREATE TABLE destination (a Int, b Int, c Int) engine=MergeTree ORDER BY tuple() PARTITION BY a;
|
||||
INSERT INTO TABLE source VALUES (1, 2, 3), (1, 2, 4);
|
||||
ALTER TABLE destination ATTACH PARTITION ID '1-2-3' FROM source;
|
||||
SELECT * FROM source ORDER BY (a, b, c);
|
||||
1 2 3
|
||||
1 2 4
|
||||
SELECT * FROM destination ORDER BY (a, b, c);
|
||||
1 2 3
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
1
|
||||
TRUNCATE TABLE destination;
|
||||
ALTER TABLE destination ATTACH PARTITION (1, 2, 3) from source;
|
||||
SELECT * FROM source ORDER BY (a, b, c);
|
||||
1 2 3
|
||||
1 2 4
|
||||
SELECT * FROM destination ORDER BY (a, b, c);
|
||||
1 2 3
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
1
|
||||
-- Should be allowed. Special test case, tricky to explain. First column of source partition expression is
|
||||
-- timestamp, while first column of destination partition expression is `A`. One of the previous implementations
|
||||
-- would not match the columns, which could lead to `timestamp` min max being used to calculate monotonicity of `A`.
|
||||
DROP TABLE IF EXISTS source;
|
||||
DROP TABLE IF EXISTS destination;
|
||||
CREATE TABLE source (`timestamp` DateTime, `A` Int64) ENGINE = MergeTree PARTITION BY tuple(toYYYYMM(timestamp), intDiv(A, 6)) ORDER BY timestamp;
|
||||
CREATE TABLE destination (`timestamp` DateTime, `A` Int64) ENGINE = MergeTree PARTITION BY A ORDER BY timestamp;
|
||||
INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01', 5);
|
||||
ALTER TABLE destination ATTACH PARTITION ID '201003-0' FROM source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
2010-03-02 02:01:01 5
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
2010-03-02 02:01:01 5
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
5
|
||||
TRUNCATE TABLE destination;
|
||||
ALTER TABLE destination ATTACH PARTITION (201003, 0) from source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
2010-03-02 02:01:01 5
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
2010-03-02 02:01:01 5
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
5
|
||||
-- Should be allowed. Destination partition expression contains multiple expressions, but all of them are monotonically
|
||||
-- increasing in the source partition min max indexes.
|
||||
DROP TABLE IF EXISTS source;
|
||||
DROP TABLE IF EXISTS destination;
|
||||
CREATE TABLE source (A Int, B Int) ENGINE = MergeTree PARTITION BY tuple(A, B) ORDER BY tuple();
|
||||
CREATE TABLE destination (A Int, B Int) ENGINE = MergeTree PARTITION BY tuple(intDiv(A, 2), intDiv(B, 2)) ORDER BY tuple();
|
||||
INSERT INTO TABLE source VALUES (6, 12);
|
||||
ALTER TABLE destination ATTACH PARTITION ID '6-12' FROM source;
|
||||
SELECT * FROM source ORDER BY A;
|
||||
6 12
|
||||
SELECT * FROM destination ORDER BY A;
|
||||
6 12
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
3-6
|
||||
TRUNCATE TABLE destination;
|
||||
ALTER TABLE destination ATTACH PARTITION (6, 12) from source;
|
||||
SELECT * FROM source ORDER BY A;
|
||||
6 12
|
||||
SELECT * FROM destination ORDER BY A;
|
||||
6 12
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
3-6
|
||||
-- Should be allowed. The same scenario as above, but partition expressions inverted.
|
||||
DROP TABLE IF EXISTS source;
|
||||
DROP TABLE IF EXISTS destination;
|
||||
CREATE TABLE source (A Int, B Int) ENGINE = MergeTree PARTITION BY tuple(intDiv(A, 2), intDiv(B, 2)) ORDER BY tuple();
|
||||
CREATE TABLE destination (A Int, B Int) ENGINE = MergeTree PARTITION BY tuple(A, B) ORDER BY tuple();
|
||||
INSERT INTO TABLE source VALUES (6, 12);
|
||||
ALTER TABLE destination ATTACH PARTITION ID '3-6' FROM source;
|
||||
SELECT * FROM source ORDER BY A;
|
||||
6 12
|
||||
SELECT * FROM destination ORDER BY A;
|
||||
6 12
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
6-12
|
||||
TRUNCATE TABLE destination;
|
||||
ALTER TABLE destination ATTACH PARTITION (3, 6) from source;
|
||||
SELECT * FROM source ORDER BY A;
|
||||
6 12
|
||||
SELECT * FROM destination ORDER BY A;
|
||||
6 12
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
6-12
|
||||
-- Should be allowed, it is a local operation, no different than regular attach. Replicated to replicated.
|
||||
DROP TABLE IF EXISTS source;
|
||||
DROP TABLE IF EXISTS destination;
|
||||
CREATE TABLE
|
||||
source(timestamp DateTime)
|
||||
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{database}/test/source_replicated_to_replicated_distinct_expression', '1')
|
||||
PARTITION BY toYYYYMMDD(timestamp)
|
||||
ORDER BY tuple();
|
||||
CREATE TABLE
|
||||
destination(timestamp DateTime)
|
||||
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{database}/test/destination_replicated_to_replicated_distinct_expression', '1')
|
||||
PARTITION BY toYYYYMM(timestamp)
|
||||
ORDER BY tuple();
|
||||
INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01'), ('2010-03-02 02:01:03');
|
||||
ALTER TABLE destination ATTACH PARTITION ID '20100302' FROM source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
201003
|
||||
TRUNCATE TABLE destination;
|
||||
ALTER TABLE destination ATTACH PARTITION '20100302' from source;
|
||||
SELECT * FROM source ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT * FROM destination ORDER BY timestamp;
|
||||
2010-03-02 02:01:01
|
||||
2010-03-02 02:01:03
|
||||
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
|
||||
201003
|
||||
-- Should be allowed, it is a local operation, no different than regular attach. Non-replicated to replicated
DROP TABLE IF EXISTS source SYNC;
DROP TABLE IF EXISTS destination SYNC;
CREATE TABLE source(timestamp DateTime) ENGINE = MergeTree() PARTITION BY toYYYYMMDD(timestamp) ORDER BY tuple();
CREATE TABLE
destination(timestamp DateTime)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{database}/test/destination_non_replicated_to_replicated_distinct_expression', '1')
PARTITION BY toYYYYMM(timestamp)
ORDER BY tuple();
INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01'), ('2010-03-02 02:01:03');
ALTER TABLE destination ATTACH PARTITION ID '20100302' FROM source;
SELECT * FROM source ORDER BY timestamp;
2010-03-02 02:01:01
2010-03-02 02:01:03
SELECT * FROM destination ORDER BY timestamp;
2010-03-02 02:01:01
2010-03-02 02:01:03
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
201003
TRUNCATE TABLE destination;
ALTER TABLE destination ATTACH PARTITION '20100302' from source;
SELECT * FROM source ORDER BY timestamp;
2010-03-02 02:01:01
2010-03-02 02:01:03
SELECT * FROM destination ORDER BY timestamp;
2010-03-02 02:01:01
2010-03-02 02:01:03
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
201003
-- Should not be allowed because data would be split into two different partitions
DROP TABLE IF EXISTS source SYNC;
DROP TABLE IF EXISTS destination SYNC;
CREATE TABLE source (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);
CREATE TABLE destination (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMMDD(timestamp);
INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01'), ('2010-03-03 02:01:03');
ALTER TABLE destination ATTACH PARTITION ID '201003' FROM source; -- { serverError 248 }
ALTER TABLE destination ATTACH PARTITION '201003' from source; -- { serverError 248 }
-- Should not be allowed because data would be split into two different partitions
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;
CREATE TABLE source (timestamp DateTime, A Int64) engine=MergeTree ORDER BY timestamp PARTITION BY intDiv(A, 6);
CREATE TABLE destination (timestamp DateTime, A Int64) engine=MergeTree ORDER BY timestamp PARTITION BY A;
INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01', 1), ('2010-03-02 02:01:03', 2);
ALTER TABLE destination ATTACH PARTITION ID '0' FROM source; -- { serverError 248 }
ALTER TABLE destination ATTACH PARTITION 0 FROM source; -- { serverError 248 }
-- Should not be allowed because dst partition exp takes more than two arguments, so it's not considered monotonically increasing
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;
CREATE TABLE source (productName String, category String) engine=MergeTree ORDER BY tuple() PARTITION BY toString(category);
CREATE TABLE destination (productName String, category String) engine=MergeTree ORDER BY tuple() PARTITION BY substring(category, 1, 2);
INSERT INTO TABLE source VALUES ('spaghetti', 'food'), ('mop', 'general');
INSERT INTO TABLE source VALUES ('rice', 'food');
ALTER TABLE destination ATTACH PARTITION ID '4590ba78048910b74a47d5bfb308abed' from source; -- { serverError 36 }
ALTER TABLE destination ATTACH PARTITION 'food' from source; -- { serverError 36 }
-- Should not be allowed because dst partition exp depends on a different set of columns
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;
CREATE TABLE source (productName String, category String) engine=MergeTree ORDER BY tuple() PARTITION BY toString(category);
CREATE TABLE destination (productName String, category String) engine=MergeTree ORDER BY tuple() PARTITION BY toString(productName);
INSERT INTO TABLE source VALUES ('spaghetti', 'food'), ('mop', 'general');
INSERT INTO TABLE source VALUES ('rice', 'food');
ALTER TABLE destination ATTACH PARTITION ID '4590ba78048910b74a47d5bfb308abed' from source; -- { serverError 36 }
ALTER TABLE destination ATTACH PARTITION 'food' from source; -- { serverError 36 }
-- Should not be allowed because dst partition exp is not monotonically increasing
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;
CREATE TABLE source (productName String) engine=MergeTree ORDER BY tuple() PARTITION BY left(productName, 2);
CREATE TABLE destination (productName String) engine=MergeTree ORDER BY tuple() PARTITION BY cityHash64(productName);
INSERT INTO TABLE source VALUES ('bread'), ('mop');
INSERT INTO TABLE source VALUES ('broccoli');
ALTER TABLE destination ATTACH PARTITION ID '4589453b7ee96ce9de1265bd57674496' from source; -- { serverError 36 }
ALTER TABLE destination ATTACH PARTITION 'br' from source; -- { serverError 36 }
-- Empty/non-existent partition, same partition expression. Nothing should happen
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;
CREATE TABLE source (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);
CREATE TABLE destination (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);
ALTER TABLE destination ATTACH PARTITION ID '1' FROM source;
ALTER TABLE destination ATTACH PARTITION 1 FROM source;
SELECT * FROM destination;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
-- Empty/non-existent partition, different partition expression. Nothing should happen
-- https://github.com/ClickHouse/ClickHouse/pull/39507#discussion_r1399839045
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;
CREATE TABLE source (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMMDD(timestamp);
CREATE TABLE destination (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);
ALTER TABLE destination ATTACH PARTITION ID '1' FROM source;
ALTER TABLE destination ATTACH PARTITION 1 FROM source;
SELECT * FROM destination;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
-- Replace instead of attach. Empty/non-existent partition, same partition expression. Nothing should happen
-- https://github.com/ClickHouse/ClickHouse/pull/39507#discussion_r1399839045
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;
CREATE TABLE source (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);
CREATE TABLE destination (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);
ALTER TABLE destination REPLACE PARTITION '1' FROM source;
SELECT * FROM destination;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
-- Replace instead of attach. Empty/non-existent partition to non-empty partition, same partition id.
-- https://github.com/ClickHouse/ClickHouse/pull/39507#discussion_r1399839045
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;
CREATE TABLE source (A Int) engine=MergeTree ORDER BY tuple() PARTITION BY A;
CREATE TABLE destination (A Int) engine=MergeTree ORDER BY tuple() PARTITION BY A;
INSERT INTO TABLE destination VALUES (1);
ALTER TABLE destination REPLACE PARTITION '1' FROM source;
SELECT * FROM destination;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;
@@ -0,0 +1,485 @@
-- { echoOn }
-- Should be allowed since destination partition expr is monotonically increasing and compatible
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMMDD(timestamp);
CREATE TABLE destination (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);

INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01'), ('2010-03-02 02:01:03');

ALTER TABLE destination ATTACH PARTITION ID '20100302' FROM source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

TRUNCATE TABLE destination;

ALTER TABLE destination ATTACH PARTITION '20100302' FROM source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

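-- Note (editor): a hedged reading of the ids in this scenario: for integer-valued date expressions the
-- partition id appears to be the key value itself, so the source part carries id '20100302' (toYYYYMMDD)
-- and, once attached, lands in destination partition '201003' (toYYYYMM), as the reference file shows.
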
-- Should be allowed since destination partition expr is monotonically increasing and compatible. Note that even though
-- the destination partition expression is more granular, the data would still fall in the same partition. Thus, it is valid
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);
CREATE TABLE destination (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMMDD(timestamp);

INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01'), ('2010-03-02 02:01:03');

ALTER TABLE destination ATTACH PARTITION ID '201003' FROM source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

TRUNCATE TABLE destination;

ALTER TABLE destination ATTACH PARTITION '201003' FROM source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

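-- Note (editor): presumably this passes because both inserted rows fall on 2010-03-02; the finer-grained
-- toYYYYMMDD key maps the whole attached part to the single partition '20100302', so nothing is split.
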
-- Should be allowed since destination partition expr is monotonically increasing and compatible for those specific values
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (timestamp DateTime, A Int64) engine=MergeTree ORDER BY timestamp PARTITION BY intDiv(A, 6);

CREATE TABLE destination (timestamp DateTime, A Int64) engine=MergeTree ORDER BY timestamp PARTITION BY A;

INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01', 1), ('2010-03-02 02:01:03', 1);

ALTER TABLE destination ATTACH PARTITION ID '0' FROM source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

TRUNCATE TABLE destination;

ALTER TABLE destination ATTACH PARTITION 0 FROM source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

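-- Note (editor): intDiv(1, 6) = 0, hence the source part id '0'; since both rows have A = 1, the
-- destination key A maps the part to the single partition '1'. Contrast this with the rejected case
-- further down, where A takes two distinct values within the same source part.
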
-- Should be allowed because dst partition exp is monot inc and data is not split
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (productName String, category String) engine=MergeTree ORDER BY tuple() PARTITION BY cityHash64(category);
CREATE TABLE destination (productName String, category String) engine=MergeTree ORDER BY tuple() PARTITION BY toString(category);

INSERT INTO TABLE source VALUES ('spaghetti', 'food'), ('mop', 'general');
INSERT INTO TABLE source VALUES ('rice', 'food');

ALTER TABLE destination ATTACH PARTITION ID '17908065610379824077' from source;

SELECT * FROM source ORDER BY productName;
SELECT * FROM destination ORDER BY productName;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

TRUNCATE TABLE destination;

ALTER TABLE destination ATTACH PARTITION '17908065610379824077' from source;

SELECT * FROM source ORDER BY productName;
SELECT * FROM destination ORDER BY productName;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

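-- Note (editor): '17908065610379824077' is presumably cityHash64('food'). Only the destination
-- expression needs to behave monotonically over the part's min-max range; the source key being a hash
-- is fine, since toString(category) is constant over a part that holds a single category value.
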
-- Should be allowed, extra test case to validate https://github.com/ClickHouse/ClickHouse/pull/39507#issuecomment-1747574133

DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (timestamp Int64) engine=MergeTree ORDER BY (timestamp) PARTITION BY intDiv(timestamp, 86400000);
CREATE TABLE destination (timestamp Int64) engine=MergeTree ORDER BY (timestamp) PARTITION BY toYear(toDateTime(intDiv(timestamp, 1000)));

INSERT INTO TABLE source VALUES (1267495261123);

ALTER TABLE destination ATTACH PARTITION ID '14670' FROM source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

TRUNCATE TABLE destination;

ALTER TABLE destination ATTACH PARTITION '14670' from source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

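-- Note (editor): the arithmetic behind the ids, for the record: intDiv(1267495261123, 86400000) = 14670
-- (the source part id, a day number), and intDiv(1267495261123, 1000) = 1267495261 seconds, i.e.
-- 2010-03-02, so toYear(...) presumably yields destination partition '2010'.
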
-- Should be allowed, extra test case to validate https://github.com/ClickHouse/ClickHouse/pull/39507#issuecomment-1747511726

DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (timestamp DateTime('UTC'), key Int64, f Float64) engine=MergeTree ORDER BY (key, timestamp) PARTITION BY toYear(timestamp);
CREATE TABLE destination (timestamp DateTime('UTC'), key Int64, f Float64) engine=MergeTree ORDER BY (key, timestamp) PARTITION BY (intDiv(toUInt32(timestamp),86400));

INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01',1,1),('2010-03-02 02:01:01',1,1),('2011-02-02 02:01:03',1,1);

ALTER TABLE destination ATTACH PARTITION ID '2010' FROM source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

TRUNCATE TABLE destination;

ALTER TABLE destination ATTACH PARTITION '2010' from source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

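-- Note (editor): the two rows of source partition '2010' carry the same timestamp, so
-- intDiv(toUInt32(timestamp), 86400) maps them to one day-sized destination partition; the 2011 row is
-- not copied, because it belongs to a different source partition.
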
-- Should be allowed, partitioned table to unpartitioned. Since the destination is unpartitioned, parts would ultimately
-- fall into the same partition.
-- Destination partition by expression is omitted, which causes StorageMetadata::getPartitionKeyAST() to be nullptr.
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;
CREATE TABLE source (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);
CREATE TABLE destination (timestamp DateTime) engine=MergeTree ORDER BY tuple();

INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01'), ('2010-03-02 02:01:03');

ALTER TABLE destination ATTACH PARTITION ID '201003' FROM source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

TRUNCATE TABLE destination;

ALTER TABLE destination ATTACH PARTITION '201003' from source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

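-- Note (editor): a MergeTree table with no PARTITION BY clause keeps all parts in a single partition
-- whose id is presumably 'all', which is what the partition_id query above should print in the reference.
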
-- Same as above, but destination partition by expression is explicitly defined. Test case required to validate that
-- partition by tuple() is accepted.
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;
CREATE TABLE source (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);
CREATE TABLE destination (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY tuple();

INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01'), ('2010-03-02 02:01:03');

ALTER TABLE destination ATTACH PARTITION ID '201003' FROM source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

TRUNCATE TABLE destination;

ALTER TABLE destination ATTACH PARTITION '201003' from source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

-- Should be allowed because the destination partition expression columns are a subset of the source partition expression columns
-- Columns in this case refer to the expression elements, not to the actual table columns
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;
CREATE TABLE source (a Int, b Int, c Int) engine=MergeTree ORDER BY tuple() PARTITION BY (a, b, c);
CREATE TABLE destination (a Int, b Int, c Int) engine=MergeTree ORDER BY tuple() PARTITION BY (a, b);

INSERT INTO TABLE source VALUES (1, 2, 3), (1, 2, 4);

ALTER TABLE destination ATTACH PARTITION ID '1-2-3' FROM source;

SELECT * FROM source ORDER BY (a, b, c);
SELECT * FROM destination ORDER BY (a, b, c);
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

TRUNCATE TABLE destination;

ALTER TABLE destination ATTACH PARTITION (1, 2, 3) from source;

SELECT * FROM source ORDER BY (a, b, c);
SELECT * FROM destination ORDER BY (a, b, c);
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

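-- Note (editor): only the part belonging to source partition (1, 2, 3) is attached; the (1, 2, 4) row
-- sits in a different partition and is not copied. Under the coarser destination key the attached part
-- presumably gets id '1-2'.
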
-- Should be allowed because the destination partition expression columns are a subset of the source partition expression columns
-- Columns in this case refer to the expression elements, not to the actual table columns
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;
CREATE TABLE source (a Int, b Int, c Int) engine=MergeTree ORDER BY tuple() PARTITION BY (a, b, c);
CREATE TABLE destination (a Int, b Int, c Int) engine=MergeTree ORDER BY tuple() PARTITION BY a;

INSERT INTO TABLE source VALUES (1, 2, 3), (1, 2, 4);

ALTER TABLE destination ATTACH PARTITION ID '1-2-3' FROM source;

SELECT * FROM source ORDER BY (a, b, c);
SELECT * FROM destination ORDER BY (a, b, c);
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

TRUNCATE TABLE destination;

ALTER TABLE destination ATTACH PARTITION (1, 2, 3) from source;

SELECT * FROM source ORDER BY (a, b, c);
SELECT * FROM destination ORDER BY (a, b, c);
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

-- Should be allowed. Special test case, tricky to explain. First column of source partition expression is
-- timestamp, while first column of destination partition expression is `A`. One of the previous implementations
-- would not match the columns, which could lead to `timestamp` min max being used to calculate monotonicity of `A`.
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (`timestamp` DateTime, `A` Int64) ENGINE = MergeTree PARTITION BY tuple(toYYYYMM(timestamp), intDiv(A, 6)) ORDER BY timestamp;
CREATE TABLE destination (`timestamp` DateTime, `A` Int64) ENGINE = MergeTree PARTITION BY A ORDER BY timestamp;

INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01', 5);

ALTER TABLE destination ATTACH PARTITION ID '201003-0' FROM source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

TRUNCATE TABLE destination;

ALTER TABLE destination ATTACH PARTITION (201003, 0) from source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

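-- Note (editor): intDiv(5, 6) = 0, hence the source part id '201003-0'. The destination key is A alone,
-- and the part's min-max index for A is the single point [5, 5], so it presumably maps to partition '5';
-- the point of the test is that A's range, not timestamp's, must drive that decision.
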
-- Should be allowed. Destination partition expression contains multiple expressions, but all of them are monotonically
-- increasing in the source partition min max indexes.
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (A Int, B Int) ENGINE = MergeTree PARTITION BY tuple(A, B) ORDER BY tuple();
CREATE TABLE destination (A Int, B Int) ENGINE = MergeTree PARTITION BY tuple(intDiv(A, 2), intDiv(B, 2)) ORDER BY tuple();

INSERT INTO TABLE source VALUES (6, 12);

ALTER TABLE destination ATTACH PARTITION ID '6-12' FROM source;

SELECT * FROM source ORDER BY A;
SELECT * FROM destination ORDER BY A;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

TRUNCATE TABLE destination;

ALTER TABLE destination ATTACH PARTITION (6, 12) from source;

SELECT * FROM source ORDER BY A;
SELECT * FROM destination ORDER BY A;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

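-- Note (editor): intDiv(6, 2) = 3 and intDiv(12, 2) = 6, so the attached part presumably becomes
-- destination partition '3-6'; with single-point min-max ranges for both A and B, neither expression in
-- the destination tuple can split the part.
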
-- Should be allowed. The same scenario as above, but partition expressions inverted.
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (A Int, B Int) ENGINE = MergeTree PARTITION BY tuple(intDiv(A, 2), intDiv(B, 2)) ORDER BY tuple();
CREATE TABLE destination (A Int, B Int) ENGINE = MergeTree PARTITION BY tuple(A, B) ORDER BY tuple();

INSERT INTO TABLE source VALUES (6, 12);

ALTER TABLE destination ATTACH PARTITION ID '3-6' FROM source;

SELECT * FROM source ORDER BY A;
SELECT * FROM destination ORDER BY A;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

TRUNCATE TABLE destination;

ALTER TABLE destination ATTACH PARTITION (3, 6) from source;

SELECT * FROM source ORDER BY A;
SELECT * FROM destination ORDER BY A;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

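-- Note (editor): the inverse direction also holds for a single-point part: source id '3-6' comes from
-- intDiv(6, 2) and intDiv(12, 2), and tuple(A, B) is constant over the part's min-max index, so it maps
-- whole to destination partition '6-12', matching the reference output earlier in this file.
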
-- Should be allowed, it is a local operation, no different than regular attach. Replicated to replicated.
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;
CREATE TABLE
source(timestamp DateTime)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{database}/test/source_replicated_to_replicated_distinct_expression', '1')
PARTITION BY toYYYYMMDD(timestamp)
ORDER BY tuple();

CREATE TABLE
destination(timestamp DateTime)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{database}/test/destination_replicated_to_replicated_distinct_expression', '1')
PARTITION BY toYYYYMM(timestamp)
ORDER BY tuple();

INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01'), ('2010-03-02 02:01:03');

ALTER TABLE destination ATTACH PARTITION ID '20100302' FROM source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

TRUNCATE TABLE destination;

ALTER TABLE destination ATTACH PARTITION '20100302' from source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

-- Should be allowed, it is a local operation, no different than regular attach. Non-replicated to replicated
DROP TABLE IF EXISTS source SYNC;
DROP TABLE IF EXISTS destination SYNC;
CREATE TABLE source(timestamp DateTime) ENGINE = MergeTree() PARTITION BY toYYYYMMDD(timestamp) ORDER BY tuple();

CREATE TABLE
destination(timestamp DateTime)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{database}/test/destination_non_replicated_to_replicated_distinct_expression', '1')
PARTITION BY toYYYYMM(timestamp)
ORDER BY tuple();

INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01'), ('2010-03-02 02:01:03');

ALTER TABLE destination ATTACH PARTITION ID '20100302' FROM source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

TRUNCATE TABLE destination;

ALTER TABLE destination ATTACH PARTITION '20100302' from source;

SELECT * FROM source ORDER BY timestamp;
SELECT * FROM destination ORDER BY timestamp;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

-- Should not be allowed because data would be split into two different partitions
DROP TABLE IF EXISTS source SYNC;
DROP TABLE IF EXISTS destination SYNC;

CREATE TABLE source (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);
CREATE TABLE destination (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMMDD(timestamp);

INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01'), ('2010-03-03 02:01:03');

ALTER TABLE destination ATTACH PARTITION ID '201003' FROM source; -- { serverError 248 }
ALTER TABLE destination ATTACH PARTITION '201003' from source; -- { serverError 248 }

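-- Note (editor): the two rows fall on different days (20100302 vs 20100303), so the single source part
-- would have to be split across two destination partitions, which ATTACH ... FROM refuses to do;
-- code 248 presumably corresponds to INVALID_PARTITION_VALUE.
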
-- Should not be allowed because data would be split into two different partitions
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (timestamp DateTime, A Int64) engine=MergeTree ORDER BY timestamp PARTITION BY intDiv(A, 6);

CREATE TABLE destination (timestamp DateTime, A Int64) engine=MergeTree ORDER BY timestamp PARTITION BY A;

INSERT INTO TABLE source VALUES ('2010-03-02 02:01:01', 1), ('2010-03-02 02:01:03', 2);

ALTER TABLE destination ATTACH PARTITION ID '0' FROM source; -- { serverError 248 }
ALTER TABLE destination ATTACH PARTITION 0 FROM source; -- { serverError 248 }

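-- Note (editor): the same failure mode expressed with integers: intDiv(1, 6) = intDiv(2, 6) = 0, so both
-- rows share source part '0', yet the destination key A would need partitions '1' and '2', splitting the part.
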
-- Should not be allowed because dst partition exp takes more than two arguments, so it's not considered monotonically increasing
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (productName String, category String) engine=MergeTree ORDER BY tuple() PARTITION BY toString(category);
CREATE TABLE destination (productName String, category String) engine=MergeTree ORDER BY tuple() PARTITION BY substring(category, 1, 2);

INSERT INTO TABLE source VALUES ('spaghetti', 'food'), ('mop', 'general');
INSERT INTO TABLE source VALUES ('rice', 'food');

ALTER TABLE destination ATTACH PARTITION ID '4590ba78048910b74a47d5bfb308abed' from source; -- { serverError 36 }
ALTER TABLE destination ATTACH PARTITION 'food' from source; -- { serverError 36 }

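-- Note (editor): for string-valued partition keys the id is a hash, so '4590ba78048910b74a47d5bfb308abed'
-- presumably denotes the 'food' partition. Code 36 presumably maps to BAD_ARGUMENTS: the destination
-- expression is rejected while validating monotonicity, before any data movement is attempted.
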
-- Should not be allowed because dst partition exp depends on a different set of columns
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (productName String, category String) engine=MergeTree ORDER BY tuple() PARTITION BY toString(category);
CREATE TABLE destination (productName String, category String) engine=MergeTree ORDER BY tuple() PARTITION BY toString(productName);

INSERT INTO TABLE source VALUES ('spaghetti', 'food'), ('mop', 'general');
INSERT INTO TABLE source VALUES ('rice', 'food');

ALTER TABLE destination ATTACH PARTITION ID '4590ba78048910b74a47d5bfb308abed' from source; -- { serverError 36 }
ALTER TABLE destination ATTACH PARTITION 'food' from source; -- { serverError 36 }

-- Should not be allowed because dst partition exp is not monotonically increasing
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (productName String) engine=MergeTree ORDER BY tuple() PARTITION BY left(productName, 2);
CREATE TABLE destination (productName String) engine=MergeTree ORDER BY tuple() PARTITION BY cityHash64(productName);

INSERT INTO TABLE source VALUES ('bread'), ('mop');
INSERT INTO TABLE source VALUES ('broccoli');

ALTER TABLE destination ATTACH PARTITION ID '4589453b7ee96ce9de1265bd57674496' from source; -- { serverError 36 }
ALTER TABLE destination ATTACH PARTITION 'br' from source; -- { serverError 36 }

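-- Note (editor): 'bread' and 'broccoli' share the source partition 'br', but cityHash64 is not
-- order-preserving, so the destination value cannot be derived from the part's min-max index and the
-- attach is rejected up front, same as the substring case above.
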
-- Empty/non-existent partition, same partition expression. Nothing should happen
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);
CREATE TABLE destination (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);

ALTER TABLE destination ATTACH PARTITION ID '1' FROM source;
ALTER TABLE destination ATTACH PARTITION 1 FROM source;

SELECT * FROM destination;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

-- Empty/non-existent partition, different partition expression. Nothing should happen
-- https://github.com/ClickHouse/ClickHouse/pull/39507#discussion_r1399839045
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMMDD(timestamp);
CREATE TABLE destination (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);

ALTER TABLE destination ATTACH PARTITION ID '1' FROM source;
ALTER TABLE destination ATTACH PARTITION 1 FROM source;

SELECT * FROM destination;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

-- Replace instead of attach. Empty/non-existent partition, same partition expression. Nothing should happen
-- https://github.com/ClickHouse/ClickHouse/pull/39507#discussion_r1399839045
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);
CREATE TABLE destination (timestamp DateTime) engine=MergeTree ORDER BY tuple() PARTITION BY toYYYYMM(timestamp);

ALTER TABLE destination REPLACE PARTITION '1' FROM source;

SELECT * FROM destination;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;

-- Replace instead of attach. Empty/non-existent partition to non-empty partition, same partition id.
-- https://github.com/ClickHouse/ClickHouse/pull/39507#discussion_r1399839045
DROP TABLE IF EXISTS source;
DROP TABLE IF EXISTS destination;

CREATE TABLE source (A Int) engine=MergeTree ORDER BY tuple() PARTITION BY A;
CREATE TABLE destination (A Int) engine=MergeTree ORDER BY tuple() PARTITION BY A;

INSERT INTO TABLE destination VALUES (1);

ALTER TABLE destination REPLACE PARTITION '1' FROM source;

SELECT * FROM destination;
SELECT partition_id FROM system.parts where table='destination' AND database = currentDatabase() AND active = 1;