Merge branch 'master' into keeper-some-improvement2

Antonio Andelic 2024-09-07 11:32:44 +02:00
commit 5a86371b02
53 changed files with 827 additions and 170 deletions

contrib/curl vendored

@ -1 +1 @@
Subproject commit de7b3e89218467159a7af72d58cea8425946e97d
Subproject commit 83bedbd730d62b83744cc26fa0433d3f6e2e4cd6

@ -34,7 +34,7 @@ RUN arch=${TARGETARCH:-amd64} \
# lts / testing / prestable / etc
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="https://packages.clickhouse.com/tgz/${REPO_CHANNEL}"
ARG VERSION="24.8.3.59"
ARG VERSION="24.8.4.13"
ARG PACKAGES="clickhouse-keeper"
ARG DIRECT_DOWNLOAD_URLS=""

@ -32,7 +32,7 @@ RUN arch=${TARGETARCH:-amd64} \
# lts / testing / prestable / etc
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="https://packages.clickhouse.com/tgz/${REPO_CHANNEL}"
ARG VERSION="24.8.3.59"
ARG VERSION="24.8.4.13"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
ARG DIRECT_DOWNLOAD_URLS=""

@ -28,7 +28,7 @@ RUN sed -i "s|http://archive.ubuntu.com|${apt_archive}|g" /etc/apt/sources.list
ARG REPO_CHANNEL="stable"
ARG REPOSITORY="deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg] https://packages.clickhouse.com/deb ${REPO_CHANNEL} main"
ARG VERSION="24.8.3.59"
ARG VERSION="24.8.4.13"
ARG PACKAGES="clickhouse-client clickhouse-server clickhouse-common-static"
#docker-official-library:off

@ -0,0 +1,17 @@
---
sidebar_position: 1
sidebar_label: 2024
---
# 2024 Changelog
### ClickHouse release v24.3.11.7-lts (28795d0a47e) FIXME as compared to v24.3.10.33-lts (37b6502ebf0)
#### Bug Fix (user-visible misbehavior in an official stable release)
* Backported in [#67479](https://github.com/ClickHouse/ClickHouse/issues/67479): In rare cases ClickHouse could consider parts as broken because of some unexpected projections on disk. Now it's fixed. [#66898](https://github.com/ClickHouse/ClickHouse/pull/66898) ([alesapin](https://github.com/alesapin)).
* Backported in [#69243](https://github.com/ClickHouse/ClickHouse/issues/69243): The `UNION` clause in subqueries wasn't handled correctly in queries with parallel replicas and led to the LOGICAL_ERROR `Duplicate announcement received for replica`. [#69146](https://github.com/ClickHouse/ClickHouse/pull/69146) ([Igor Nikonov](https://github.com/devcrafter)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Backported in [#69221](https://github.com/ClickHouse/ClickHouse/issues/69221): Disable memory test with sanitizer. [#69193](https://github.com/ClickHouse/ClickHouse/pull/69193) ([alesapin](https://github.com/alesapin)).

@ -0,0 +1,18 @@
---
sidebar_position: 1
sidebar_label: 2024
---
# 2024 Changelog
### ClickHouse release v24.5.8.10-stable (f11729638ea) FIXME as compared to v24.5.7.31-stable (6c185e9aec1)
#### Bug Fix (user-visible misbehavior in an official stable release)
* Backported in [#69295](https://github.com/ClickHouse/ClickHouse/issues/69295): TODO. [#68744](https://github.com/ClickHouse/ClickHouse/pull/68744) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#69245](https://github.com/ClickHouse/ClickHouse/issues/69245): The `UNION` clause in subqueries wasn't handled correctly in queries with parallel replicas and led to the LOGICAL_ERROR `Duplicate announcement received for replica`. [#69146](https://github.com/ClickHouse/ClickHouse/pull/69146) ([Igor Nikonov](https://github.com/devcrafter)).
* Fix crash when using `s3` table function with GLOB paths and filters. [#69176](https://github.com/ClickHouse/ClickHouse/pull/69176) ([János Benjamin Antal](https://github.com/antaljanosbenjamin)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Backported in [#69223](https://github.com/ClickHouse/ClickHouse/issues/69223): Disable memory test with sanitizer. [#69193](https://github.com/ClickHouse/ClickHouse/pull/69193) ([alesapin](https://github.com/alesapin)).

@ -0,0 +1,16 @@
---
sidebar_position: 1
sidebar_label: 2024
---
# 2024 Changelog
### ClickHouse release v24.6.6.6-stable (a4c4580e639) FIXME as compared to v24.6.5.30-stable (e6e196c92d6)
#### Bug Fix (user-visible misbehavior in an official stable release)
* Backported in [#69197](https://github.com/ClickHouse/ClickHouse/issues/69197): TODO. [#68744](https://github.com/ClickHouse/ClickHouse/pull/68744) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Backported in [#69225](https://github.com/ClickHouse/ClickHouse/issues/69225): Disable memory test with sanitizer. [#69193](https://github.com/ClickHouse/ClickHouse/pull/69193) ([alesapin](https://github.com/alesapin)).

@ -0,0 +1,17 @@
---
sidebar_position: 1
sidebar_label: 2024
---
# 2024 Changelog
### ClickHouse release v24.7.6.8-stable (7779883593a) FIXME as compared to v24.7.5.37-stable (f2533ca97be)
#### Bug Fix (user-visible misbehavior in an official stable release)
* Backported in [#69198](https://github.com/ClickHouse/ClickHouse/issues/69198): TODO. [#68744](https://github.com/ClickHouse/ClickHouse/pull/68744) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#69249](https://github.com/ClickHouse/ClickHouse/issues/69249): The `UNION` clause in subqueries wasn't handled correctly in queries with parallel replicas and led to the LOGICAL_ERROR `Duplicate announcement received for replica`. [#69146](https://github.com/ClickHouse/ClickHouse/pull/69146) ([Igor Nikonov](https://github.com/devcrafter)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Backported in [#69227](https://github.com/ClickHouse/ClickHouse/issues/69227): Disable memory test with sanitizer. [#69193](https://github.com/ClickHouse/ClickHouse/pull/69193) ([alesapin](https://github.com/alesapin)).

@ -0,0 +1,22 @@
---
sidebar_position: 1
sidebar_label: 2024
---
# 2024 Changelog
### ClickHouse release v24.8.4.13-lts (53195bc189b) FIXME as compared to v24.8.3.59-lts (e729b9fa40e)
#### Improvement
* Backported in [#68699](https://github.com/ClickHouse/ClickHouse/issues/68699): Delete the old named-collections code in dictionaries and replace it with the new implementation, which allows using DDL-created named collections in dictionaries. Closes [#60936](https://github.com/ClickHouse/ClickHouse/issues/60936), closes [#36890](https://github.com/ClickHouse/ClickHouse/issues/36890). [#68412](https://github.com/ClickHouse/ClickHouse/pull/68412) ([Kseniia Sumarokova](https://github.com/kssenii)).
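
For context, the feature this entry refers to looks roughly like the following. This is a minimal sketch, not taken from the commit: the collection name, credentials, and source table are hypothetical, and the exact source parameters depend on the dictionary source type.

```sql
-- Create a named collection via DDL instead of server configuration.
CREATE NAMED COLLECTION src_creds AS
    host = 'localhost',
    port = 9000,
    user = 'default',
    password = '';

-- A dictionary can then reference the collection by NAME.
CREATE DICTIONARY dict (id UInt64, value String)
PRIMARY KEY id
SOURCE(CLICKHOUSE(NAME src_creds DB 'default' TABLE 'src'))
LAYOUT(FLAT())
LIFETIME(MIN 0 MAX 60);
```
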
#### Bug Fix (user-visible misbehavior in an official stable release)
* Backported in [#69231](https://github.com/ClickHouse/ClickHouse/issues/69231): Fix parsing error when null should be inserted as default in some cases during JSON type parsing. [#68955](https://github.com/ClickHouse/ClickHouse/pull/68955) ([Kruglov Pavel](https://github.com/Avogar)).
* Backported in [#69251](https://github.com/ClickHouse/ClickHouse/issues/69251): The `UNION` clause in subqueries wasn't handled correctly in queries with parallel replicas and led to the LOGICAL_ERROR `Duplicate announcement received for replica`. [#69146](https://github.com/ClickHouse/ClickHouse/pull/69146) ([Igor Nikonov](https://github.com/devcrafter)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Backported in [#69189](https://github.com/ClickHouse/ClickHouse/issues/69189): Don't create Object type if use_json_alias_for_old_object_type=1 but allow_experimental_object_type=0. [#69150](https://github.com/ClickHouse/ClickHouse/pull/69150) ([Kruglov Pavel](https://github.com/Avogar)).
* Backported in [#69229](https://github.com/ClickHouse/ClickHouse/issues/69229): Disable memory test with sanitizer. [#69193](https://github.com/ClickHouse/ClickHouse/pull/69193) ([alesapin](https://github.com/alesapin)).
* Backported in [#69219](https://github.com/ClickHouse/ClickHouse/issues/69219): Disable perf-like test with sanitizers. [#69194](https://github.com/ClickHouse/ClickHouse/pull/69194) ([alesapin](https://github.com/alesapin)).

@ -989,7 +989,11 @@ ALTER TABLE tab DROP STATISTICS a;
These lightweight statistics aggregate information about the distribution of values in columns. Statistics are stored in every part and updated on every insert.
They can be used for PREWHERE optimization only if `allow_statistics_optimize = 1` is set.
#### Available Types of Column Statistics {#available-types-of-column-statistics}
### Available Types of Column Statistics {#available-types-of-column-statistics}
- `MinMax`
The minimum and maximum column values, which allow estimating the selectivity of range filters on numeric columns.
- `TDigest`
@ -1003,6 +1007,27 @@ They can be used for prewhere optimization only if we enable `set allow_statisti
[Count-min](https://en.wikipedia.org/wiki/Count%E2%80%93min_sketch) sketches which provide an approximate count of the frequency of each value in a column.
### Supported Data Types {#supported-data-types}
| Statistic type | (U)Int* | Float* | Decimal(*) | Date* | Boolean | Enum* | (Fixed)String |
|-----------|---------|--------|------------|-------|---------|-------|------------------|
| count_min | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
| MinMax | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✗ |
| TDigest | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✗ |
| Uniq | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ | ✔ |
### Supported Operations {#supported-operations}
| Statistic type | Equality filters (==) | Range filters (>, >=, <, <=) |
|-----------|-----------------------|------------------------------|
| count_min | ✔ | ✗ |
| MinMax | ✗ | ✔ |
| TDigest | ✗ | ✔ |
| Uniq | ✔ | ✗ |
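
As a quick illustration, here is a minimal sketch mirroring the statistics tests added in this commit (it assumes the experimental gate `allow_experimental_statistics` is enabled):

```sql
SET allow_experimental_statistics = 1;
SET allow_statistics_optimize = 1;

CREATE TABLE tab
(
    a UInt64 STATISTICS(count_min, minmax, tdigest, uniq),
    pk String
) Engine = MergeTree() ORDER BY pk;

-- Range predicates (a < 7) can be estimated via MinMax/TDigest;
-- equality predicates (a = 7) via Uniq/count_min.
SELECT count(*) FROM tab WHERE a < 7;
```
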
## Column-level Settings {#column-level-settings}
Certain MergeTree settings can be overridden at column level:

@ -1896,6 +1896,21 @@ void ClientBase::processParsedSingleQuery(const String & full_query, const Strin
/// Temporarily apply query settings to context.
std::optional<Settings> old_settings;
SCOPE_EXIT_SAFE({
try
{
/// We need to park ParallelFormatting threads,
/// because they can use settings from the global context,
/// and that can lead to a data race with `setSettings`.
resetOutput();
}
catch (...)
{
if (!have_error)
{
client_exception = std::make_unique<Exception>(getCurrentExceptionMessageAndPattern(print_stack_trace), getCurrentExceptionCode());
have_error = true;
}
}
if (old_settings)
client_context->setSettings(*old_settings);
});

@ -781,14 +781,14 @@ InterpreterCreateQuery::TableProperties InterpreterCreateQuery::getTableProperti
const auto & settings = getContext()->getSettingsRef();
if (index_desc.type == FULL_TEXT_INDEX_NAME && !settings.allow_experimental_full_text_index)
throw Exception(ErrorCodes::SUPPORT_IS_DISABLED, "Experimental full-text index feature is not enabled (the setting 'allow_experimental_full_text_index')");
throw Exception(ErrorCodes::SUPPORT_IS_DISABLED, "Experimental full-text index feature is disabled. Turn on setting 'allow_experimental_full_text_index'");
/// ----
/// Temporary check during a transition period. Please remove at the end of 2024.
if (index_desc.type == INVERTED_INDEX_NAME && !settings.allow_experimental_inverted_index)
throw Exception(ErrorCodes::ILLEGAL_INDEX, "Please use index type 'full_text' instead of 'inverted'");
/// ----
if (index_desc.type == "vector_similarity" && !settings.allow_experimental_vector_similarity_index)
throw Exception(ErrorCodes::INCORRECT_QUERY, "Vector similarity index is disabled. Turn on allow_experimental_vector_similarity_index");
throw Exception(ErrorCodes::SUPPORT_IS_DISABLED, "Experimental vector similarity index is disabled. Turn on setting 'allow_experimental_vector_similarity_index'");
properties.indices.push_back(index_desc);
}
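
The gates these messages point at are exercised by the test added later in this commit; enabling the setting makes the DDL pass (statements taken from that test):

```sql
SET allow_experimental_vector_similarity_index = 1;
CREATE TABLE tab (id UInt32, vec Array(Float32), INDEX idx vec TYPE vector_similarity('hnsw', 'L2Distance')) ENGINE = MergeTree ORDER BY tuple();
```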

@ -198,6 +198,7 @@ FiltersForTableExpressionMap collectFiltersForAnalysis(const QueryTreeNodePtr &
auto & result_query_plan = planner.getQueryPlan();
auto optimization_settings = QueryPlanOptimizationSettings::fromContext(query_context);
optimization_settings.build_sets = false; // no need to build sets to collect filters
result_query_plan.optimize(optimization_settings);
FiltersForTableExpressionMap res;

@ -16,7 +16,7 @@ void optimizeTreeFirstPass(const QueryPlanOptimizationSettings & settings, Query
void optimizeTreeSecondPass(const QueryPlanOptimizationSettings & optimization_settings, QueryPlan::Node & root, QueryPlan::Nodes & nodes);
/// Third pass is used to apply filters such as key conditions and skip indexes to the storages that support them.
/// After that, it adds a CreateSetsStep for the subqueries that have not been used in the filters.
void optimizeTreeThirdPass(QueryPlan & plan, QueryPlan::Node & root, QueryPlan::Nodes & nodes);
void addStepsToBuildSets(QueryPlan & plan, QueryPlan::Node & root, QueryPlan::Nodes & nodes);
/// Optimization (first pass) is a function applied to QueryPlan::Node.
/// It can read and update subtree of specified node.

@ -75,6 +75,8 @@ struct QueryPlanOptimizationSettings
String force_projection_name;
bool optimize_use_implicit_projections = false;
bool build_sets = true;
static QueryPlanOptimizationSettings fromSettings(const Settings & from);
static QueryPlanOptimizationSettings fromContext(ContextPtr from);
};

@ -216,7 +216,7 @@ void optimizeTreeSecondPass(const QueryPlanOptimizationSettings & optimization_s
optimization_settings.force_projection_name);
}
void optimizeTreeThirdPass(QueryPlan & plan, QueryPlan::Node & root, QueryPlan::Nodes & nodes)
void addStepsToBuildSets(QueryPlan & plan, QueryPlan::Node & root, QueryPlan::Nodes & nodes)
{
Stack stack;
stack.push_back({.node = &root});

@ -504,7 +504,8 @@ void QueryPlan::optimize(const QueryPlanOptimizationSettings & optimization_sett
QueryPlanOptimizations::optimizeTreeFirstPass(optimization_settings, *root, nodes);
QueryPlanOptimizations::optimizeTreeSecondPass(optimization_settings, *root, nodes);
QueryPlanOptimizations::optimizeTreeThirdPass(*this, *root, nodes);
if (optimization_settings.build_sets)
QueryPlanOptimizations::addStepsToBuildSets(*this, *root, nodes);
updateDataStreams(*root);
}

@ -1142,6 +1142,16 @@ bool AlterCommands::hasFullTextIndex(const StorageInMemoryMetadata & metadata)
return false;
}
bool AlterCommands::hasVectorSimilarityIndex(const StorageInMemoryMetadata & metadata)
{
for (const auto & index : metadata.secondary_indices)
{
if (index.type == "vector_similarity")
return true;
}
return false;
}
void AlterCommands::apply(StorageInMemoryMetadata & metadata, ContextPtr context) const
{
if (!prepared)

@ -237,6 +237,9 @@ public:
/// Check if commands have any full-text index
static bool hasFullTextIndex(const StorageInMemoryMetadata & metadata);
/// Check if commands have any vector similarity index
static bool hasVectorSimilarityIndex(const StorageInMemoryMetadata & metadata);
};
}

@ -3230,6 +3230,10 @@ void MergeTreeData::checkAlterIsPossible(const AlterCommands & commands, Context
throw Exception(ErrorCodes::SUPPORT_IS_DISABLED,
"Experimental full-text index feature is not enabled (turn on setting 'allow_experimental_full_text_index')");
if (AlterCommands::hasVectorSimilarityIndex(new_metadata) && !settings.allow_experimental_vector_similarity_index)
throw Exception(ErrorCodes::SUPPORT_IS_DISABLED,
"Experimental vector similarity index is disabled (turn on setting 'allow_experimental_vector_similarity_index')");
for (const auto & disk : getDisks())
if (!disk->supportsHardLinks() && !commands.isSettingsAlter() && !commands.isCommentAlter())
throw Exception(

@ -195,7 +195,7 @@ void MergeTreeIndexGranuleVectorSimilarity::serializeBinary(WriteBuffer & ostr)
LOG_TRACE(logger, "Start writing vector similarity index");
if (empty())
throw Exception(ErrorCodes::LOGICAL_ERROR, "Attempt to write empty minmax index {}", backQuote(index_name));
throw Exception(ErrorCodes::LOGICAL_ERROR, "Attempt to write empty vector similarity index {}", backQuote(index_name));
writeIntBinary(FILE_FORMAT_VERSION, ostr);

@ -40,10 +40,12 @@ void extractReferenceVectorFromLiteral(std::vector<Float64> & reference_vector,
}
}
VectorSimilarityCondition::Info::DistanceFunction stringToDistanceFunction(std::string_view distance_function)
VectorSimilarityCondition::Info::DistanceFunction stringToDistanceFunction(const String & distance_function)
{
if (distance_function == "L2Distance")
return VectorSimilarityCondition::Info::DistanceFunction::L2;
else if (distance_function == "cosineDistance")
return VectorSimilarityCondition::Info::DistanceFunction::Cosine;
else
return VectorSimilarityCondition::Info::DistanceFunction::Unknown;
}
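
The new `Cosine` branch corresponds to queries of the form added in the tests below, so the vector similarity index can now serve `cosineDistance` in addition to `L2Distance`:

```sql
WITH [0.0, 2.0] AS reference_vec
SELECT id, vec, cosineDistance(vec, reference_vec)
FROM tab
ORDER BY cosineDistance(vec, reference_vec)
LIMIT 3;
```
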
@ -57,7 +59,7 @@ VectorSimilarityCondition::VectorSimilarityCondition(const SelectQueryInfo & que
, index_is_useful(checkQueryStructure(query_info))
{}
bool VectorSimilarityCondition::alwaysUnknownOrTrue(String distance_function) const
bool VectorSimilarityCondition::alwaysUnknownOrTrue(const String & distance_function) const
{
if (!index_is_useful)
return true; /// query isn't supported

@ -57,7 +57,8 @@ public:
enum class DistanceFunction : uint8_t
{
Unknown,
L2
L2,
Cosine
};
std::vector<Float64> reference_vector;
@ -68,7 +69,7 @@ public:
};
/// Returns false if the query can be sped up by an ANN index, true otherwise.
bool alwaysUnknownOrTrue(String distance_function) const;
bool alwaysUnknownOrTrue(const String & distance_function) const;
std::vector<Float64> getReferenceVector() const;
size_t getDimensions() const;
@ -141,18 +142,12 @@ private:
/// Traverses the AST of ORDERBY section
void traverseOrderByAST(const ASTPtr & node, RPN & rpn);
/// Returns true and stores ANNExpr if the query has valid WHERE section
static bool matchRPNWhere(RPN & rpn, Info & info);
/// Returns true and stores ANNExpr if the query has valid ORDERBY section
static bool matchRPNOrderBy(RPN & rpn, Info & info);
/// Returns true and stores Length if we have valid LIMIT clause in query
static bool matchRPNLimit(RPNElement & rpn, UInt64 & limit);
/// Matches dist function, reference vector, column name
static bool matchMainParts(RPN::iterator & iter, const RPN::iterator & end, Info & info);
/// Gets float or int from AST node
static float getFloatOrIntLiteralOrPanic(const RPN::iterator& iter);

@ -9,6 +9,7 @@
#include <Storages/ColumnsDescription.h>
#include <Storages/Statistics/ConditionSelectivityEstimator.h>
#include <Storages/Statistics/StatisticsCountMinSketch.h>
#include <Storages/Statistics/StatisticsMinMax.h>
#include <Storages/Statistics/StatisticsTDigest.h>
#include <Storages/Statistics/StatisticsUniq.h>
#include <Storages/StatisticsDescription.h>
@ -101,6 +102,8 @@ Float64 ColumnStatistics::estimateLess(const Field & val) const
{
if (stats.contains(StatisticsType::TDigest))
return stats.at(StatisticsType::TDigest)->estimateLess(val);
if (stats.contains(StatisticsType::MinMax))
return stats.at(StatisticsType::MinMax)->estimateLess(val);
return rows * ConditionSelectivityEstimator::default_cond_range_factor;
}
@ -121,6 +124,14 @@ Float64 ColumnStatistics::estimateEqual(const Field & val) const
if (stats.contains(StatisticsType::CountMinSketch))
return stats.at(StatisticsType::CountMinSketch)->estimateEqual(val);
#endif
if (stats.contains(StatisticsType::Uniq))
{
UInt64 cardinality = stats.at(StatisticsType::Uniq)->estimateCardinality();
if (cardinality == 0 || rows == 0)
return 0;
return 1.0 / cardinality * rows; /// assume uniform distribution
}
return rows * ConditionSelectivityEstimator::default_cond_equal_factor;
}
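
As a worked example of the uniform-distribution assumption in the new `Uniq` branch (illustrative numbers, not from the source): with 10000 rows and an estimated cardinality of 1000, the equality estimate is

$$ \frac{1}{\text{cardinality}} \cdot \text{rows} = \frac{10000}{1000} = 10 \ \text{rows}. $$
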
@ -198,6 +209,9 @@ void MergeTreeStatisticsFactory::registerValidator(StatisticsType stats_type, Va
MergeTreeStatisticsFactory::MergeTreeStatisticsFactory()
{
registerValidator(StatisticsType::MinMax, minMaxStatisticsValidator);
registerCreator(StatisticsType::MinMax, minMaxStatisticsCreator);
registerValidator(StatisticsType::TDigest, tdigestStatisticsValidator);
registerCreator(StatisticsType::TDigest, tdigestStatisticsCreator);
@ -234,7 +248,7 @@ ColumnStatisticsPtr MergeTreeStatisticsFactory::get(const ColumnDescription & co
{
auto it = creators.find(type);
if (it == creators.end())
throw Exception(ErrorCodes::INCORRECT_QUERY, "Unknown statistic type '{}'. Available types: 'tdigest' 'uniq' and 'count_min'", type);
throw Exception(ErrorCodes::INCORRECT_QUERY, "Unknown statistic type '{}'. Available types: 'count_min', 'minmax', 'tdigest' and 'uniq'", type);
auto stat_ptr = (it->second)(desc, column_desc.type);
column_stat->stats[type] = stat_ptr;
}

@ -1,4 +1,3 @@
#include <Storages/Statistics/StatisticsCountMinSketch.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <DataTypes/DataTypeNullable.h>

@ -0,0 +1,86 @@
#include <Storages/Statistics/StatisticsMinMax.h>
#include <DataTypes/DataTypeLowCardinality.h>
#include <DataTypes/DataTypeNullable.h>
#include <IO/ReadHelpers.h>
#include <IO/WriteHelpers.h>
#include <algorithm>
namespace DB
{
namespace ErrorCodes
{
extern const int ILLEGAL_STATISTICS;
}
StatisticsMinMax::StatisticsMinMax(const SingleStatisticsDescription & description, const DataTypePtr & data_type_)
: IStatistics(description)
, data_type(data_type_)
{
}
void StatisticsMinMax::update(const ColumnPtr & column)
{
for (size_t row = 0; row < column->size(); ++row)
{
if (column->isNullAt(row))
continue;
auto value = column->getFloat64(row);
min = std::min(value, min);
max = std::max(value, max);
}
row_count += column->size();
}
void StatisticsMinMax::serialize(WriteBuffer & buf)
{
writeIntBinary(row_count, buf);
writeFloatBinary(min, buf);
writeFloatBinary(max, buf);
}
void StatisticsMinMax::deserialize(ReadBuffer & buf)
{
readIntBinary(row_count, buf);
readFloatBinary(min, buf);
readFloatBinary(max, buf);
}
Float64 StatisticsMinMax::estimateLess(const Field & val) const
{
if (row_count == 0)
return 0;
auto val_as_float = StatisticsUtils::tryConvertToFloat64(val, data_type);
if (!val_as_float.has_value())
return 0;
if (val_as_float < min)
return 0;
if (val_as_float > max)
return row_count;
if (min == max)
return (val_as_float != max) ? 0 : row_count;
return ((*val_as_float - min) / (max - min)) * row_count;
}
void minMaxStatisticsValidator(const SingleStatisticsDescription & /*description*/, const DataTypePtr & data_type)
{
auto inner_data_type = removeNullable(data_type);
inner_data_type = removeLowCardinalityAndNullable(inner_data_type);
if (!inner_data_type->isValueRepresentedByNumber())
throw Exception(ErrorCodes::ILLEGAL_STATISTICS, "Statistics of type 'minmax' do not support type {}", data_type->getName());
}
StatisticsPtr minMaxStatisticsCreator(const SingleStatisticsDescription & description, const DataTypePtr & data_type)
{
return std::make_shared<StatisticsMinMax>(description, data_type);
}
}
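
To make the linear interpolation in `estimateLess` concrete, a worked example with illustrative numbers: with `min = 0`, `max = 1000`, `row_count = 10000`, and a predicate value of 250, the estimate is

$$ \frac{250 - 0}{1000 - 0} \cdot 10000 = 2500 \ \text{rows}. $$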

@ -0,0 +1,33 @@
#pragma once
#include <Storages/Statistics/Statistics.h>
#include <DataTypes/IDataType.h>
namespace DB
{
class StatisticsMinMax : public IStatistics
{
public:
StatisticsMinMax(const SingleStatisticsDescription & statistics_description, const DataTypePtr & data_type_);
void update(const ColumnPtr & column) override;
void serialize(WriteBuffer & buf) override;
void deserialize(ReadBuffer & buf) override;
Float64 estimateLess(const Field & val) const override;
private:
Float64 min = std::numeric_limits<Float64>::max();
Float64 max = std::numeric_limits<Float64>::min();
UInt64 row_count = 0;
DataTypePtr data_type;
};
void minMaxStatisticsValidator(const SingleStatisticsDescription & description, const DataTypePtr & data_type);
StatisticsPtr minMaxStatisticsCreator(const SingleStatisticsDescription & description, const DataTypePtr & data_type);
}

@ -56,7 +56,7 @@ void uniqStatisticsValidator(const SingleStatisticsDescription & /*description*/
{
DataTypePtr inner_data_type = removeNullable(data_type);
inner_data_type = removeLowCardinalityAndNullable(inner_data_type);
if (!inner_data_type->isValueRepresentedByNumber())
if (!inner_data_type->isValueRepresentedByNumber() && !isStringOrFixedString(inner_data_type))
throw Exception(ErrorCodes::ILLEGAL_STATISTICS, "Statistics of type 'uniq' do not support type {}", data_type->getName());
}
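
With this relaxation, declarations like these (taken from the updated statistics tests in this commit) are now accepted:

```sql
CREATE TABLE tab (col String STATISTICS(uniq)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col FixedString(1) STATISTICS(uniq)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
```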

@ -50,7 +50,9 @@ static StatisticsType stringToStatisticsType(String type)
return StatisticsType::Uniq;
if (type == "count_min")
return StatisticsType::CountMinSketch;
throw Exception(ErrorCodes::INCORRECT_QUERY, "Unknown statistics type: {}. Supported statistics types are 'tdigest', 'uniq' and 'count_min'.", type);
if (type == "minmax")
return StatisticsType::MinMax;
throw Exception(ErrorCodes::INCORRECT_QUERY, "Unknown statistics type: {}. Supported statistics types are 'count_min', 'minmax', 'tdigest' and 'uniq'.", type);
}
String SingleStatisticsDescription::getTypeName() const
@ -63,8 +65,10 @@ String SingleStatisticsDescription::getTypeName() const
return "Uniq";
case StatisticsType::CountMinSketch:
return "count_min";
case StatisticsType::MinMax:
return "minmax";
default:
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown statistics type: {}. Supported statistics types are 'tdigest', 'uniq' and 'count_min'.", type);
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unknown statistics type: {}. Supported statistics types are 'count_min', 'minmax', 'tdigest' and 'uniq'.", type);
}
}

@ -14,6 +14,7 @@ enum class StatisticsType : UInt8
TDigest = 0,
Uniq = 1,
CountMinSketch = 2,
MinMax = 3,
Max = 63,
};

@ -410,7 +410,9 @@ class CI:
num_batches=6,
),
JobNames.INTEGRATION_TEST_TSAN: CommonJobConfigs.INTEGRATION_TEST.with_properties(
required_builds=[BuildNames.PACKAGE_TSAN], num_batches=6
required_builds=[BuildNames.PACKAGE_TSAN],
num_batches=6,
timeout=9000, # the job timed out with default value (7200)
),
JobNames.INTEGRATION_TEST_ARM: CommonJobConfigs.INTEGRATION_TEST.with_properties(
required_builds=[BuildNames.PACKAGE_AARCH64],

@ -343,6 +343,13 @@ def test_increment_backup_without_changes():
def test_incremental_backup_overflow():
if (
instance.is_built_with_thread_sanitizer()
or instance.is_built_with_memory_sanitizer()
or instance.is_built_with_address_sanitizer()
):
pytest.skip("The test is slow in builds with sanitizer")
backup_name = new_backup_name()
incremental_backup_name = new_backup_name()

@ -154,6 +154,13 @@ def test_aggregate_states(start_cluster):
def test_string_functions(start_cluster):
if (
upstream.is_built_with_thread_sanitizer()
or upstream.is_built_with_memory_sanitizer()
or upstream.is_built_with_address_sanitizer()
):
pytest.skip("The test is slow in builds with sanitizer")
functions = backward.query(
"""
SELECT if(NOT empty(alias_to), alias_to, name)

@ -5,6 +5,7 @@
<access_key_id>minio</access_key_id>
<secret_access_key>minio123</secret_access_key>
</s3_snapshot>
<use_cluster>false</use_cluster>
<tcp_port>9181</tcp_port>
<server_id>1</server_id>
<log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>

@ -5,6 +5,7 @@
<access_key_id>minio</access_key_id>
<secret_access_key>minio123</secret_access_key>
</s3_snapshot>
<use_cluster>false</use_cluster>
<tcp_port>9181</tcp_port>
<server_id>2</server_id>
<log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>

@ -5,6 +5,7 @@
<access_key_id>minio</access_key_id>
<secret_access_key>minio123</secret_access_key>
</s3_snapshot>
<use_cluster>false</use_cluster>
<tcp_port>9181</tcp_port>
<server_id>3</server_id>
<log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>

@ -2,6 +2,9 @@ import pytest
from helpers.cluster import ClickHouseCluster
from time import sleep
from retry import retry
from multiprocessing.dummy import Pool
import helpers.keeper_utils as keeper_utils
from minio.deleteobjects import DeleteObject
from kazoo.client import KazooClient
@ -75,7 +78,18 @@ def wait_node(node):
raise Exception("Can't wait node", node.name, "to become ready")
def delete_keeper_snapshots_logs(nodex):
nodex.exec_in_container(
[
"bash",
"-c",
"rm -rf /var/lib/clickhouse/coordination/log /var/lib/clickhouse/coordination/snapshots",
]
)
def test_s3_upload(started_cluster):
node1_zk = get_fake_zk(node1.name)
# snapshot_distance is set to 50 in the configs
@ -89,6 +103,11 @@ def test_s3_upload(started_cluster):
for obj in list(cluster.minio_client.list_objects("snapshots"))
]
def delete_s3_snapshots():
snapshots = cluster.minio_client.list_objects("snapshots")
for s in snapshots:
cluster.minio_client.remove_object("snapshots", s.object_name)
# Keeper sends snapshots asynchronously, hence we need to retry.
@retry(AssertionError, tries=10, delay=2)
def _check_snapshots():
@ -125,3 +144,26 @@ def test_s3_upload(started_cluster):
)
destroy_zk_client(node2_zk)
node2.stop_clickhouse()
delete_keeper_snapshots_logs(node2)
node3.stop_clickhouse()
delete_keeper_snapshots_logs(node3)
delete_keeper_snapshots_logs(node1)
p = Pool(3)
waiters = []
def start_clickhouse(node):
node.start_clickhouse()
waiters.append(p.apply_async(start_clickhouse, args=(node1,)))
waiters.append(p.apply_async(start_clickhouse, args=(node2,)))
waiters.append(p.apply_async(start_clickhouse, args=(node3,)))
delete_s3_snapshots() # for next iteration
for waiter in waiters:
waiter.wait()
keeper_utils.wait_until_connected(cluster, node1)
keeper_utils.wait_until_connected(cluster, node2)
keeper_utils.wait_until_connected(cluster, node3)

@ -629,5 +629,6 @@ def test_roles_cache():
check()
instance.query("DROP USER " + ", ".join(users))
instance.query("DROP ROLE " + ", ".join(roles))
if roles:
instance.query("DROP ROLE " + ", ".join(roles))
instance.query("DROP TABLE tbl")

@ -10,6 +10,8 @@ import string
import ast
import math
from multiprocessing.dummy import Pool
import avro.schema
import avro.io
import avro.datafile
@ -998,7 +1000,11 @@ def test_kafka_formats(kafka_cluster, create_query_generator):
}
topic_postfix = str(hash(create_query_generator))
for format_name, format_opts in list(all_formats.items()):
p = Pool(10)
results = []
def run_for_format(format_name, format_opts):
logging.debug(f"Set up {format_name}")
topic_name = f"format_tests_{format_name}-{topic_postfix}"
data_sample = format_opts["data_sample"]
@ -1034,6 +1040,13 @@ def test_kafka_formats(kafka_cluster, create_query_generator):
),
)
)
for format_name, format_opts in list(all_formats.items()):
results.append(p.apply_async(run_for_format, args=(format_name, format_opts)))
for result in results:
result.get()
raw_expected = """\
0 0 AM 0.5 1 {topic_name} 0 {offset_0}
1 0 AM 0.5 1 {topic_name} 0 {offset_1}
@ -1064,7 +1077,9 @@ def test_kafka_formats(kafka_cluster, create_query_generator):
)
assert result_checker(res)
for format_name, format_opts in list(all_formats.items()):
results = []
def run_for_format2(format_name, format_opts):
logging.debug(("Checking {}".format(format_name)))
topic_name = f"format_tests_{format_name}-{topic_postfix}"
# shift offsets by 1 if format supports empty value
@ -1088,6 +1103,12 @@ def test_kafka_formats(kafka_cluster, create_query_generator):
)
kafka_delete_topic(get_admin_client(kafka_cluster), topic_name)
for format_name, format_opts in list(all_formats.items()):
results.append(p.apply_async(run_for_format2, args=(format_name, format_opts)))
for result in results:
result.get()
# Since everything is async and shaky when receiving messages from Kafka,
# we may want to try and check results multiple times in a loop.
@ -4237,7 +4258,11 @@ def test_kafka_formats_with_broken_message(kafka_cluster, create_query_generator
topic_name_prefix = "format_tests_4_stream_"
topic_name_postfix = get_topic_postfix(create_query_generator)
for format_name, format_opts in list(all_formats.items()):
p = Pool(10)
results = []
def run_for_format(format_name, format_opts):
logging.debug(f"Set up {format_name}")
topic_name = f"{topic_name_prefix}{format_name}{topic_name_postfix}"
data_sample = format_opts["data_sample"]
@ -4278,6 +4303,12 @@ def test_kafka_formats_with_broken_message(kafka_cluster, create_query_generator
"""
)
for format_name, format_opts in list(all_formats.items()):
results.append(p.apply_async(run_for_format, args=(format_name, format_opts)))
for result in results:
result.get()
raw_expected = """\
0 0 AM 0.5 1 {topic_name} 0 {offset_0}
1 0 AM 0.5 1 {topic_name} 0 {offset_1}
@ -4308,7 +4339,9 @@ def test_kafka_formats_with_broken_message(kafka_cluster, create_query_generator
)
assert result_checker(res)
for format_name, format_opts in list(all_formats.items()):
results = []
def run_for_format2(format_name, format_opts):
logging.debug(f"Checking {format_name}")
topic_name = f"{topic_name_prefix}{format_name}{topic_name_postfix}"
# shift offsets by 1 if format supports empty value
@ -4348,6 +4381,12 @@ def test_kafka_formats_with_broken_message(kafka_cluster, create_query_generator
), "Proper error for format: {}".format(format_name)
kafka_delete_topic(admin_client, topic_name)
for format_name, format_opts in list(all_formats.items()):
results.append(p.apply_async(run_for_format2, args=(format_name, format_opts)))
for result in results:
result.get()
@pytest.mark.parametrize(
"create_query_generator",
@ -4831,27 +4870,13 @@ def test_max_rows_per_message(kafka_cluster, create_query_generator):
def test_row_based_formats(kafka_cluster, create_query_generator):
admin_client = get_admin_client(kafka_cluster)
for format_name in [
"TSV",
"TSVWithNamesAndTypes",
"TSKV",
"CSV",
"CSVWithNamesAndTypes",
"CustomSeparatedWithNamesAndTypes",
"Values",
"JSON",
"JSONEachRow",
"JSONCompactEachRow",
"JSONCompactEachRowWithNamesAndTypes",
"JSONObjectEachRow",
"Avro",
"RowBinary",
"RowBinaryWithNamesAndTypes",
"MsgPack",
]:
p = Pool(10)
def run_for_format_name(format_name):
logging.debug(f"Checking {format_name}")
topic_name = format_name + get_topic_postfix(create_query_generator)
view_name = f"kafka_view_{format_name}"
table_name = f"kafka_{format_name}"
with kafka_topic(admin_client, topic_name):
@ -4870,12 +4895,12 @@ def test_row_based_formats(kafka_cluster, create_query_generator):
instance.query(
f"""
DROP TABLE IF EXISTS test.view;
DROP TABLE IF EXISTS test.{view_name};
DROP TABLE IF EXISTS test.{table_name};
{create_query};
CREATE MATERIALIZED VIEW test.view ENGINE=MergeTree ORDER BY (key, value) AS
CREATE MATERIALIZED VIEW test.{view_name} ENGINE=MergeTree ORDER BY (key, value) AS
SELECT key, value FROM test.{table_name};
INSERT INTO test.{table_name} SELECT number * 10 as key, number * 100 as value FROM numbers({num_rows});
@ -4886,18 +4911,44 @@ def test_row_based_formats(kafka_cluster, create_query_generator):
kafka_cluster, topic_name, message_count, need_decode=False
)
assert len(messages) == message_count
assert (
len(messages) == message_count
), f"Invalid message count for {format_name}"
instance.query_with_retry(
"SELECT count() FROM test.view",
f"SELECT count() FROM test.{view_name}",
check_callback=lambda res: int(res) == num_rows,
)
result = instance.query("SELECT * FROM test.view")
result = instance.query(f"SELECT * FROM test.{view_name}")
expected = ""
for i in range(num_rows):
expected += str(i * 10) + "\t" + str(i * 100) + "\n"
assert result == expected
assert result == expected, f"Invalid result for {format_name}"
results = []
for format_name in [
"TSV",
"TSVWithNamesAndTypes",
"TSKV",
"CSV",
"CSVWithNamesAndTypes",
"CustomSeparatedWithNamesAndTypes",
"Values",
"JSON",
"JSONEachRow",
"JSONCompactEachRow",
"JSONCompactEachRowWithNamesAndTypes",
"JSONObjectEachRow",
"Avro",
"RowBinary",
"RowBinaryWithNamesAndTypes",
"MsgPack",
]:
results.append(p.apply_async(run_for_format_name, args=(format_name,)))
for result in results:
result.get()
@pytest.mark.parametrize(
@ -4955,16 +5006,12 @@ def test_block_based_formats_2(kafka_cluster, create_query_generator):
num_rows = 100
message_count = 9
for format_name in [
"JSONColumns",
"Native",
"Arrow",
"Parquet",
"ORC",
"JSONCompactColumns",
]:
p = Pool(10)
def run_for_format_name(format_name):
topic_name = format_name + get_topic_postfix(create_query_generator)
table_name = f"kafka_{format_name}"
view_name = f"kafka_view_{format_name}"
logging.debug(f"Checking format {format_name}")
with kafka_topic(admin_client, topic_name):
create_query = create_query_generator(
@ -4977,12 +5024,12 @@ def test_block_based_formats_2(kafka_cluster, create_query_generator):
instance.query(
f"""
DROP TABLE IF EXISTS test.view;
DROP TABLE IF EXISTS test.{view_name};
DROP TABLE IF EXISTS test.{table_name};
{create_query};
CREATE MATERIALIZED VIEW test.view ENGINE=MergeTree ORDER BY (key, value) AS
CREATE MATERIALIZED VIEW test.{view_name} ENGINE=MergeTree ORDER BY (key, value) AS
SELECT key, value FROM test.{table_name};
INSERT INTO test.{table_name} SELECT number * 10 as key, number * 100 as value FROM numbers({num_rows}) settings max_block_size=12, optimize_trivial_insert_select=0;
@ -4991,22 +5038,38 @@ def test_block_based_formats_2(kafka_cluster, create_query_generator):
messages = kafka_consume_with_retry(
kafka_cluster, topic_name, message_count, need_decode=False
)
assert len(messages) == message_count
assert (
len(messages) == message_count
), f"Invalid message count for {format_name}"
rows = int(
instance.query_with_retry(
"SELECT count() FROM test.view",
f"SELECT count() FROM test.{view_name}",
check_callback=lambda res: int(res) == num_rows,
)
)
assert rows == num_rows
assert rows == num_rows, f"Invalid row count for {format_name}"
result = instance.query("SELECT * FROM test.view ORDER by key")
result = instance.query(f"SELECT * FROM test.{view_name} ORDER by key")
expected = ""
for i in range(num_rows):
expected += str(i * 10) + "\t" + str(i * 100) + "\n"
assert result == expected
assert result == expected, f"Invalid result for {format_name}"
results = []
for format_name in [
"JSONColumns",
"Native",
"Arrow",
"Parquet",
"ORC",
"JSONCompactColumns",
]:
results.append(p.apply_async(run_for_format_name, args=(format_name,)))
for result in results:
result.get()
def test_system_kafka_consumers(kafka_cluster):
@ -5300,6 +5363,54 @@ def test_formats_errors(kafka_cluster):
bootstrap_servers="localhost:{}".format(kafka_cluster.kafka_port)
)
p = Pool(10)
def run_for_format_name(format_name):
with kafka_topic(admin_client, format_name):
table_name = f"kafka_{format_name}"
view_name = f"kafka_view_{format_name}"
instance.query(
f"""
DROP TABLE IF EXISTS test.{view_name};
DROP TABLE IF EXISTS test.{table_name};
CREATE TABLE test.{table_name} (key UInt64, value UInt64)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka1:19092',
kafka_topic_list = '{format_name}',
kafka_group_name = '{format_name}',
kafka_format = '{format_name}',
kafka_max_rows_per_message = 5,
format_template_row='template_row.format',
format_regexp='id: (.+?)',
input_format_with_names_use_header=0,
format_schema='key_value_message:Message';
CREATE MATERIALIZED VIEW test.{view_name} ENGINE=MergeTree ORDER BY (key, value) AS
SELECT key, value FROM test.{table_name};
"""
)
kafka_produce(
kafka_cluster,
format_name,
["Broken message\nBroken message\nBroken message\n"],
)
num_errors = int(
instance.query_with_retry(
f"SELECT length(exceptions.text) from system.kafka_consumers where database = 'test' and table = '{table_name}'",
check_callback=lambda res: int(res) > 0,
)
)
assert num_errors > 0, f"No errors for {format_name}"
instance.query(f"DROP TABLE test.{table_name}")
instance.query(f"DROP TABLE test.{view_name}")
results = []
for format_name in [
"Template",
"Regexp",
@ -5342,48 +5453,10 @@ def test_formats_errors(kafka_cluster):
"HiveText",
"MySQLDump",
]:
with kafka_topic(admin_client, format_name):
table_name = f"kafka_{format_name}"
results.append(p.apply_async(run_for_format_name, args=(format_name,)))
instance.query(
f"""
DROP TABLE IF EXISTS test.view;
DROP TABLE IF EXISTS test.{table_name};
CREATE TABLE test.{table_name} (key UInt64, value UInt64)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka1:19092',
kafka_topic_list = '{format_name}',
kafka_group_name = '{format_name}',
kafka_format = '{format_name}',
kafka_max_rows_per_message = 5,
format_template_row='template_row.format',
format_regexp='id: (.+?)',
input_format_with_names_use_header=0,
format_schema='key_value_message:Message';
CREATE MATERIALIZED VIEW test.view ENGINE=MergeTree ORDER BY (key, value) AS
SELECT key, value FROM test.{table_name};
"""
)
kafka_produce(
kafka_cluster,
format_name,
["Broken message\nBroken message\nBroken message\n"],
)
num_errors = int(
instance.query_with_retry(
f"SELECT length(exceptions.text) from system.kafka_consumers where database = 'test' and table = '{table_name}'",
check_callback=lambda res: int(res) > 0,
)
)
assert num_errors > 0
instance.query(f"DROP TABLE test.{table_name}")
instance.query("DROP TABLE test.view")
for result in results:
result.get()
@pytest.mark.parametrize(

@ -0,0 +1,32 @@
-- Tags: no-fasttest, no-ordinary-database
-- Tests that CREATE TABLE and ADD INDEX respect setting 'allow_experimental_vector_similarity_index'.
DROP TABLE IF EXISTS tab;
-- Test CREATE TABLE
SET allow_experimental_vector_similarity_index = 0;
CREATE TABLE tab (id UInt32, vec Array(Float32), INDEX idx vec TYPE vector_similarity('hnsw', 'L2Distance')) ENGINE = MergeTree ORDER BY tuple(); -- { serverError SUPPORT_IS_DISABLED }
SET allow_experimental_vector_similarity_index = 1;
CREATE TABLE tab (id UInt32, vec Array(Float32), INDEX idx vec TYPE vector_similarity('hnsw', 'L2Distance')) ENGINE = MergeTree ORDER BY tuple();
DROP TABLE tab;
-- Test ADD INDEX
CREATE TABLE tab (id UInt32, vec Array(Float32)) ENGINE = MergeTree ORDER BY tuple();
SET allow_experimental_vector_similarity_index = 0;
ALTER TABLE tab ADD INDEX idx vec TYPE vector_similarity('hnsw', 'L2Distance'); -- { serverError SUPPORT_IS_DISABLED }
SET allow_experimental_vector_similarity_index = 1;
ALTER TABLE tab ADD INDEX idx vec TYPE vector_similarity('hnsw', 'L2Distance');
-- Other index DDL must work regardless of the setting
SET allow_experimental_vector_similarity_index = 0;
ALTER TABLE tab MATERIALIZE INDEX idx;
-- ALTER TABLE tab CLEAR INDEX idx; -- <-- Should work but doesn't w/o enabled setting. Unexpected but not terrible.
ALTER TABLE tab DROP INDEX idx;
DROP TABLE tab;

@ -41,6 +41,21 @@ Special cases
6 [1,9.3] 0.005731362878640178
1 [2,3.2] 0.15200169244542905
7 [5.5,4.7] 0.3503476876550442
Expression (Projection)
Limit (preliminary LIMIT (without OFFSET))
Sorting (Sorting for ORDER BY)
Expression (Before ORDER BY)
ReadFromMergeTree (default.tab)
Indexes:
PrimaryKey
Condition: true
Parts: 1/1
Granules: 4/4
Skip
Name: idx
Description: vector_similarity GRANULARITY 2
Parts: 1/1
Granules: 2/4
-- Setting "max_limit_for_ann_queries"
Expression (Projection)
Limit (preliminary LIMIT (without OFFSET))

@ -63,6 +63,13 @@ FROM tab
ORDER BY cosineDistance(vec, reference_vec)
LIMIT 3;
EXPLAIN indexes = 1
WITH [0.0, 2.0] AS reference_vec
SELECT id, vec, cosineDistance(vec, reference_vec)
FROM tab
ORDER BY cosineDistance(vec, reference_vec)
LIMIT 3;
SELECT '-- Setting "max_limit_for_ann_queries"';
EXPLAIN indexes=1
WITH [0.0, 2.0] as reference_vec

@ -0,0 +1,77 @@
#!/usr/bin/env bash
# Tags: no-shared-merge-tree
# Tag no-shared-merge-tree: depend on events with local disk
CUR_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
# shellcheck source=../shell_config.sh
. "$CUR_DIR"/../shell_config.sh
${CLICKHOUSE_CLIENT} --query "
DROP TABLE IF EXISTS test;
CREATE TABLE test (key UInt64, val UInt64) engine = MergeTree Order by key PARTITION BY key >= 128;
SET max_block_size = 64, max_insert_block_size = 64, min_insert_block_size_rows = 64;
INSERT INTO test SELECT number AS key, sipHash64(number) AS val FROM numbers(512);
"
${CLICKHOUSE_CLIENT} --query "
SYSTEM FLUSH LOGS;
SELECT
if(count(DISTINCT query_id) == 1, 'Ok', 'Error: ' || toString(count(DISTINCT query_id))),
if(count() == 512 / 64, 'Ok', 'Error: ' || toString(count())), -- 512 rows inserted, 64 rows per block
if(SUM(ProfileEvents['MergeTreeDataWriterRows']) == 512, 'Ok', 'Error: ' || toString(SUM(ProfileEvents['MergeTreeDataWriterRows']))),
if(SUM(ProfileEvents['MergeTreeDataWriterUncompressedBytes']) >= 1024, 'Ok', 'Error: ' || toString(SUM(ProfileEvents['MergeTreeDataWriterUncompressedBytes']))),
if(SUM(ProfileEvents['MergeTreeDataWriterCompressedBytes']) >= 1024, 'Ok', 'Error: ' || toString(SUM(ProfileEvents['MergeTreeDataWriterCompressedBytes']))),
if(SUM(ProfileEvents['MergeTreeDataWriterBlocks']) >= 8, 'Ok', 'Error: ' || toString(SUM(ProfileEvents['MergeTreeDataWriterBlocks'])))
FROM system.part_log
WHERE event_time > now() - INTERVAL 10 MINUTE
AND database == currentDatabase() AND table == 'test'
AND event_type == 'NewPart';
"
${CLICKHOUSE_CLIENT} --query "OPTIMIZE TABLE test FINAL;"
${CLICKHOUSE_CLIENT} --query "
SYSTEM FLUSH LOGS;
SELECT
if(count() > 2, 'Ok', 'Error: ' || toString(count())),
if(SUM(ProfileEvents['MergedRows']) >= 512, 'Ok', 'Error: ' || toString(SUM(ProfileEvents['MergedRows'])))
FROM system.part_log
WHERE event_time > now() - INTERVAL 10 MINUTE
AND database == currentDatabase() AND table == 'test'
AND event_type == 'MergeParts';
"
${CLICKHOUSE_CLIENT} --query "
ALTER TABLE test UPDATE val = 0 WHERE key % 2 == 0 SETTINGS mutations_sync = 2
"
# The mutation query may return before the entry is added to the system.part_log table.
# Retry SYSTEM FLUSH LOGS until all entries are fully flushed.
for _ in {1..10}; do
${CLICKHOUSE_CLIENT} --query "SYSTEM FLUSH LOGS"
res=$(${CLICKHOUSE_CLIENT} --query "
SELECT count() FROM system.part_log
WHERE event_time > now() - INTERVAL 10 MINUTE
AND database == currentDatabase() AND table == 'test'
AND event_type == 'MutatePart';"
)
if [[ $res -eq 2 ]]; then
break
fi
sleep 2.0
done
${CLICKHOUSE_CLIENT} --query "
SELECT
if(count() == 2, 'Ok', 'Error: ' || toString(count())),
if(SUM(ProfileEvents['MutatedRows']) == 512, 'Ok', 'Error: ' || toString(SUM(ProfileEvents['MutatedRows']))),
if(SUM(ProfileEvents['FileOpen']) > 1, 'Ok', 'Error: ' || toString(SUM(ProfileEvents['FileOpen'])))
FROM system.part_log
WHERE event_time > now() - INTERVAL 10 MINUTE
AND database == currentDatabase() AND table == 'test'
AND event_type == 'MutatePart';
"
${CLICKHOUSE_CLIENT} --query "DROP TABLE test"

@ -1,50 +0,0 @@
DROP TABLE IF EXISTS test;
CREATE TABLE test (key UInt64, val UInt64) engine = MergeTree Order by key PARTITION BY key >= 128;
SET max_block_size = 64, max_insert_block_size = 64, min_insert_block_size_rows = 64;
INSERT INTO test SELECT number AS key, sipHash64(number) AS val FROM numbers(512);
SYSTEM FLUSH LOGS;
SELECT
if(count(DISTINCT query_id) == 1, 'Ok', 'Error: ' || toString(count(DISTINCT query_id))),
if(count() == 512 / 64, 'Ok', 'Error: ' || toString(count())), -- 512 rows inserted, 64 rows per block
if(SUM(ProfileEvents['MergeTreeDataWriterRows']) == 512, 'Ok', 'Error: ' || toString(SUM(ProfileEvents['MergeTreeDataWriterRows']))),
if(SUM(ProfileEvents['MergeTreeDataWriterUncompressedBytes']) >= 1024, 'Ok', 'Error: ' || toString(SUM(ProfileEvents['MergeTreeDataWriterUncompressedBytes']))),
if(SUM(ProfileEvents['MergeTreeDataWriterCompressedBytes']) >= 1024, 'Ok', 'Error: ' || toString(SUM(ProfileEvents['MergeTreeDataWriterCompressedBytes']))),
if(SUM(ProfileEvents['MergeTreeDataWriterBlocks']) >= 8, 'Ok', 'Error: ' || toString(SUM(ProfileEvents['MergeTreeDataWriterBlocks'])))
FROM system.part_log
WHERE event_time > now() - INTERVAL 10 MINUTE
AND database == currentDatabase() AND table == 'test'
AND event_type == 'NewPart'
;
OPTIMIZE TABLE test FINAL;
SYSTEM FLUSH LOGS;
SELECT
if(count() > 2, 'Ok', 'Error: ' || toString(count())),
if(SUM(ProfileEvents['MergedRows']) >= 512, 'Ok', 'Error: ' || toString(SUM(ProfileEvents['MergedRows'])))
FROM system.part_log
WHERE event_time > now() - INTERVAL 10 MINUTE
AND database == currentDatabase() AND table == 'test'
AND event_type == 'MergeParts'
;
ALTER TABLE test UPDATE val = 0 WHERE key % 2 == 0 SETTINGS mutations_sync = 2;
SYSTEM FLUSH LOGS;
SELECT
if(count() == 2, 'Ok', 'Error: ' || toString(count())),
if(SUM(ProfileEvents['MutatedRows']) == 512, 'Ok', 'Error: ' || toString(SUM(ProfileEvents['MutatedRows']))),
if(SUM(ProfileEvents['FileOpen']) > 1, 'Ok', 'Error: ' || toString(SUM(ProfileEvents['FileOpen'])))
FROM system.part_log
WHERE event_time > now() - INTERVAL 10 MINUTE
AND database == currentDatabase() AND table == 'test'
AND event_type == 'MutatePart'
;
DROP TABLE test;

@ -0,0 +1,5 @@
Test create statistics:
CREATE TABLE default.tab\n(\n `a` LowCardinality(Int64) STATISTICS(tdigest, uniq, count_min, minmax),\n `b` LowCardinality(Nullable(String)) STATISTICS(uniq, count_min),\n `c` LowCardinality(Nullable(Int64)) STATISTICS(tdigest, uniq, count_min, minmax),\n `d` DateTime STATISTICS(tdigest, uniq, count_min, minmax),\n `pk` String\n)\nENGINE = MergeTree\nORDER BY pk\nSETTINGS index_granularity = 8192
Test materialize and drop statistics:
CREATE TABLE default.tab\n(\n `a` LowCardinality(Int64),\n `b` LowCardinality(Nullable(String)) STATISTICS(uniq, count_min),\n `c` LowCardinality(Nullable(Int64)),\n `d` DateTime,\n `pk` String\n)\nENGINE = MergeTree\nORDER BY pk\nSETTINGS index_granularity = 8192
CREATE TABLE default.tab\n(\n `a` LowCardinality(Int64),\n `b` LowCardinality(Nullable(String)),\n `c` LowCardinality(Nullable(Int64)),\n `d` DateTime,\n `pk` String\n)\nENGINE = MergeTree\nORDER BY pk\nSETTINGS index_granularity = 8192

@ -0,0 +1,35 @@
-- Tags: no-fasttest
DROP TABLE IF EXISTS tab SYNC;
SET allow_experimental_statistics = 1;
SET allow_statistics_optimize = 1;
SET allow_suspicious_low_cardinality_types=1;
SET mutations_sync = 2;
SELECT 'Test create statistics:';
CREATE TABLE tab
(
a LowCardinality(Int64) STATISTICS(count_min, minmax, tdigest, uniq),
b LowCardinality(Nullable(String)) STATISTICS(count_min, uniq),
c LowCardinality(Nullable(Int64)) STATISTICS(count_min, minmax, tdigest, uniq),
d DateTime STATISTICS(count_min, minmax, tdigest, uniq),
pk String,
) Engine = MergeTree() ORDER BY pk;
INSERT INTO tab select number, number, number, toDateTime(number), generateUUIDv4() FROM system.numbers LIMIT 10000;
SHOW CREATE TABLE tab;
SELECT 'Test materialize and drop statistics:';
ALTER TABLE tab DROP STATISTICS a, b, c, d;
ALTER TABLE tab ADD STATISTICS b TYPE count_min, uniq;
ALTER TABLE tab MATERIALIZE STATISTICS b;
SHOW CREATE TABLE tab;
ALTER TABLE tab DROP STATISTICS b;
SHOW CREATE TABLE tab;
DROP TABLE IF EXISTS tab SYNC;

@ -7,6 +7,7 @@ SET mutations_sync = 1;
DROP TABLE IF EXISTS tab;
SET allow_experimental_statistics = 0;
-- Error case: Can't create statistics when allow_experimental_statistics = 0
CREATE TABLE tab (col Float64 STATISTICS(tdigest)) Engine = MergeTree() ORDER BY tuple(); -- { serverError INCORRECT_QUERY }
@ -46,7 +47,7 @@ CREATE TABLE tab (col Map(UInt64, UInt64) STATISTICS(tdigest)) Engine = MergeTre
CREATE TABLE tab (col UUID STATISTICS(tdigest)) Engine = MergeTree() ORDER BY tuple(); -- { serverError ILLEGAL_STATISTICS }
CREATE TABLE tab (col IPv6 STATISTICS(tdigest)) Engine = MergeTree() ORDER BY tuple(); -- { serverError ILLEGAL_STATISTICS }
-- uniq requires data_type.isValueRepresentedByInteger
-- uniq requires data_type.isValueRepresentedByInteger or (Fixed)String
-- These types work:
CREATE TABLE tab (col UInt8 STATISTICS(uniq)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col UInt256 STATISTICS(uniq)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
@ -61,9 +62,9 @@ CREATE TABLE tab (col IPv4 STATISTICS(uniq)) Engine = MergeTree() ORDER BY tuple
CREATE TABLE tab (col Nullable(UInt8) STATISTICS(uniq)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col LowCardinality(UInt8) STATISTICS(uniq)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col LowCardinality(Nullable(UInt8)) STATISTICS(uniq)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col String STATISTICS(uniq)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col FixedString(1) STATISTICS(uniq)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
-- These types don't work:
CREATE TABLE tab (col String STATISTICS(uniq)) Engine = MergeTree() ORDER BY tuple(); -- { serverError ILLEGAL_STATISTICS }
CREATE TABLE tab (col FixedString(1) STATISTICS(uniq)) Engine = MergeTree() ORDER BY tuple(); -- { serverError ILLEGAL_STATISTICS }
CREATE TABLE tab (col Array(Float64) STATISTICS(uniq)) Engine = MergeTree() ORDER BY tuple(); -- { serverError ILLEGAL_STATISTICS }
CREATE TABLE tab (col Tuple(Float64, Float64) STATISTICS(uniq)) Engine = MergeTree() ORDER BY tuple(); -- { serverError ILLEGAL_STATISTICS }
CREATE TABLE tab (col Map(UInt64, UInt64) STATISTICS(uniq)) Engine = MergeTree() ORDER BY tuple(); -- { serverError ILLEGAL_STATISTICS }
@ -94,6 +95,30 @@ CREATE TABLE tab (col Map(UInt64, UInt64) STATISTICS(count_min)) Engine = MergeT
CREATE TABLE tab (col UUID STATISTICS(count_min)) Engine = MergeTree() ORDER BY tuple(); -- { serverError ILLEGAL_STATISTICS }
CREATE TABLE tab (col IPv6 STATISTICS(count_min)) Engine = MergeTree() ORDER BY tuple(); -- { serverError ILLEGAL_STATISTICS }
-- minmax requires data_type.isValueRepresentedByInteger
-- These types work:
CREATE TABLE tab (col UInt8 STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col UInt256 STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col Float32 STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col Decimal32(3) STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col Date STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col Date32 STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col DateTime STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col DateTime64 STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col Enum('hello', 'world') STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col IPv4 STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col Nullable(UInt8) STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col LowCardinality(UInt8) STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
CREATE TABLE tab (col LowCardinality(Nullable(UInt8)) STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); DROP TABLE tab;
-- These types don't work:
CREATE TABLE tab (col String STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); -- { serverError ILLEGAL_STATISTICS }
CREATE TABLE tab (col FixedString(1) STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); -- { serverError ILLEGAL_STATISTICS }
CREATE TABLE tab (col Array(Float64) STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); -- { serverError ILLEGAL_STATISTICS }
CREATE TABLE tab (col Tuple(Float64, Float64) STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); -- { serverError ILLEGAL_STATISTICS }
CREATE TABLE tab (col Map(UInt64, UInt64) STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); -- { serverError ILLEGAL_STATISTICS }
CREATE TABLE tab (col UUID STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); -- { serverError ILLEGAL_STATISTICS }
CREATE TABLE tab (col IPv6 STATISTICS(minmax)) Engine = MergeTree() ORDER BY tuple(); -- { serverError ILLEGAL_STATISTICS }
-- CREATE TABLE was easy, ALTER is more fun
CREATE TABLE tab
@ -173,6 +198,13 @@ ALTER TABLE tab MODIFY STATISTICS f64 TYPE count_min; ALTER TABLE tab DROP STATI
-- Doesn't work:
ALTER TABLE tab ADD STATISTICS a TYPE count_min; -- { serverError ILLEGAL_STATISTICS }
ALTER TABLE tab MODIFY STATISTICS a TYPE count_min; -- { serverError ILLEGAL_STATISTICS }
-- minmax
-- Works:
ALTER TABLE tab ADD STATISTICS f64 TYPE minmax; ALTER TABLE tab DROP STATISTICS f64;
ALTER TABLE tab MODIFY STATISTICS f64 TYPE minmax; ALTER TABLE tab DROP STATISTICS f64;
-- Doesn't work:
ALTER TABLE tab ADD STATISTICS a TYPE minmax; -- { serverError ILLEGAL_STATISTICS }
ALTER TABLE tab MODIFY STATISTICS a TYPE minmax; -- { serverError ILLEGAL_STATISTICS }
-- Any data type changes on columns with statistics are disallowed, for simplicity even if the new data type is compatible with all existing
-- statistics objects (e.g. tdigest can be created on Float64 and UInt64)

@ -3,10 +3,13 @@ u64 and =
10
10
10
10
0
0
0
0
0
10
10
10
10
@ -16,10 +19,13 @@ u64 and <
70
70
70
70
80
80
80
80
80
70
70
70
70
@ -29,6 +35,8 @@ f64 and =
10
10
10
10
0
0
0
0
@ -37,6 +45,8 @@ f64 and =
10
10
10
10
0
0
0
0
@ -46,6 +56,8 @@ f64 and <
70
70
70
70
80
80
80
80
@ -54,6 +66,8 @@ f64 and <
70
70
70
70
80
80
80
80
@ -63,6 +77,8 @@ dt and =
0
0
0
0
10
10
10
10
@ -72,6 +88,8 @@ dt and <
10000
10000
10000
10000
70
70
70
70
@ -89,6 +107,10 @@ b and =
5000
5000
5000
5000
5000
5000
0
0
0
0
@ -96,3 +118,4 @@ b and =
s and =
10
10
10

@ -12,46 +12,56 @@ CREATE TABLE tab
(
u64 UInt64,
u64_tdigest UInt64 STATISTICS(tdigest),
u64_minmax UInt64 STATISTICS(minmax),
u64_count_min UInt64 STATISTICS(count_min),
u64_uniq UInt64 STATISTICS(uniq),
f64 Float64,
f64_tdigest Float64 STATISTICS(tdigest),
f64_minmax Float64 STATISTICS(minmax),
f64_count_min Float64 STATISTICS(count_min),
f64_uniq Float64 STATISTICS(uniq),
dt DateTime,
dt_tdigest DateTime STATISTICS(tdigest),
dt_minmax DateTime STATISTICS(minmax),
dt_count_min DateTime STATISTICS(count_min),
dt_uniq DateTime STATISTICS(uniq),
b Bool,
b_tdigest Bool STATISTICS(tdigest),
b_minmax Bool STATISTICS(minmax),
b_count_min Bool STATISTICS(count_min),
b_uniq Bool STATISTICS(uniq),
s String,
-- s_tdigest String STATISTICS(tdigest), -- not supported by tdigest
s_count_min String STATISTICS(count_min)
-- s_uniq String STATISTICS(uniq), -- not supported by uniq
-- s_minmax String STATISTICS(minmax), -- not supported by minmax
s_count_min String STATISTICS(count_min),
s_uniq String STATISTICS(uniq)
) Engine = MergeTree() ORDER BY tuple()
SETTINGS min_bytes_for_wide_part = 0;
INSERT INTO tab
-- SELECT number % 10000, number % 1000, -(number % 100) FROM system.numbers LIMIT 10000;
SELECT number % 1000,
SELECT number % 1000, -- u64
number % 1000,
number % 1000,
number % 1000,
number % 1000,
number % 1000, -- f64
number % 1000,
number % 1000,
number % 1000,
number % 1000,
number % 1000, -- dt
number % 1000,
number % 1000,
number % 1000,
number % 1000,
number % 2, -- b
number % 2,
number % 2,
number % 2,
number % 2,
toString(number % 1000),
toString(number % 1000),
toString(number % 1000)
FROM system.numbers LIMIT 10000;
@ -61,21 +71,25 @@ SELECT 'u64 and =';
SELECT count(*) FROM tab WHERE u64 = 7;
SELECT count(*) FROM tab WHERE u64_tdigest = 7;
SELECT count(*) FROM tab WHERE u64_minmax = 7;
SELECT count(*) FROM tab WHERE u64_count_min = 7;
SELECT count(*) FROM tab WHERE u64_uniq = 7;
SELECT count(*) FROM tab WHERE u64 = 7.7;
SELECT count(*) FROM tab WHERE u64_tdigest = 7.7;
SELECT count(*) FROM tab WHERE u64_minmax = 7.7;
SELECT count(*) FROM tab WHERE u64_count_min = 7.7;
SELECT count(*) FROM tab WHERE u64_uniq = 7.7;
SELECT count(*) FROM tab WHERE u64 = '7';
SELECT count(*) FROM tab WHERE u64_tdigest = '7';
SELECT count(*) FROM tab WHERE u64_minmax = '7';
SELECT count(*) FROM tab WHERE u64_count_min = '7';
SELECT count(*) FROM tab WHERE u64_uniq = '7';
SELECT count(*) FROM tab WHERE u64 = '7.7'; -- { serverError TYPE_MISMATCH }
SELECT count(*) FROM tab WHERE u64_tdigest = '7.7'; -- { serverError TYPE_MISMATCH }
SELECT count(*) FROM tab WHERE u64_minmax = '7.7'; -- { serverError TYPE_MISMATCH }
SELECT count(*) FROM tab WHERE u64_count_min = '7.7'; -- { serverError TYPE_MISMATCH }
SELECT count(*) FROM tab WHERE u64_uniq = '7.7'; -- { serverError TYPE_MISMATCH }
@ -83,21 +97,25 @@ SELECT 'u64 and <';
SELECT count(*) FROM tab WHERE u64 < 7;
SELECT count(*) FROM tab WHERE u64_tdigest < 7;
SELECT count(*) FROM tab WHERE u64_minmax < 7;
SELECT count(*) FROM tab WHERE u64_count_min < 7;
SELECT count(*) FROM tab WHERE u64_uniq < 7;
SELECT count(*) FROM tab WHERE u64 < 7.7;
SELECT count(*) FROM tab WHERE u64_tdigest < 7.7;
SELECT count(*) FROM tab WHERE u64_minmax < 7.7;
SELECT count(*) FROM tab WHERE u64_count_min < 7.7;
SELECT count(*) FROM tab WHERE u64_uniq < 7.7;
SELECT count(*) FROM tab WHERE u64 < '7';
SELECT count(*) FROM tab WHERE u64_tdigest < '7';
SELECT count(*) FROM tab WHERE u64_minmax < '7';
SELECT count(*) FROM tab WHERE u64_count_min < '7';
SELECT count(*) FROM tab WHERE u64_uniq < '7';
SELECT count(*) FROM tab WHERE u64 < '7.7'; -- { serverError TYPE_MISMATCH }
SELECT count(*) FROM tab WHERE u64_tdigest < '7.7'; -- { serverError TYPE_MISMATCH }
SELECT count(*) FROM tab WHERE u64_minmax < '7.7'; -- { serverError TYPE_MISMATCH }
SELECT count(*) FROM tab WHERE u64_count_min < '7.7'; -- { serverError TYPE_MISMATCH }
SELECT count(*) FROM tab WHERE u64_uniq < '7.7'; -- { serverError TYPE_MISMATCH }
@ -107,21 +125,25 @@ SELECT 'f64 and =';
SELECT count(*) FROM tab WHERE f64 = 7;
SELECT count(*) FROM tab WHERE f64_tdigest = 7;
SELECT count(*) FROM tab WHERE f64_minmax = 7;
SELECT count(*) FROM tab WHERE f64_count_min = 7;
SELECT count(*) FROM tab WHERE f64_uniq = 7;
SELECT count(*) FROM tab WHERE f64 = 7.7;
SELECT count(*) FROM tab WHERE f64_tdigest = 7.7;
SELECT count(*) FROM tab WHERE f64_minmax = 7.7;
SELECT count(*) FROM tab WHERE f64_count_min = 7.7;
SELECT count(*) FROM tab WHERE f64_uniq = 7.7;
SELECT count(*) FROM tab WHERE f64 = '7';
SELECT count(*) FROM tab WHERE f64_tdigest = '7';
SELECT count(*) FROM tab WHERE f64_minmax = '7';
SELECT count(*) FROM tab WHERE f64_count_min = '7';
SELECT count(*) FROM tab WHERE f64_uniq = '7';
SELECT count(*) FROM tab WHERE f64 = '7.7';
SELECT count(*) FROM tab WHERE f64_tdigest = '7.7';
SELECT count(*) FROM tab WHERE f64_minmax = '7.7';
SELECT count(*) FROM tab WHERE f64_count_min = '7.7';
SELECT count(*) FROM tab WHERE f64_uniq = '7.7';
@ -129,21 +151,25 @@ SELECT 'f64 and <';
SELECT count(*) FROM tab WHERE f64 < 7;
SELECT count(*) FROM tab WHERE f64_tdigest < 7;
SELECT count(*) FROM tab WHERE f64_minmax < 7;
SELECT count(*) FROM tab WHERE f64_count_min < 7;
SELECT count(*) FROM tab WHERE f64_uniq < 7;
SELECT count(*) FROM tab WHERE f64 < 7.7;
SELECT count(*) FROM tab WHERE f64_tdigest < 7.7;
SELECT count(*) FROM tab WHERE f64_minmax < 7.7;
SELECT count(*) FROM tab WHERE f64_count_min < 7.7;
SELECT count(*) FROM tab WHERE f64_uniq < 7.7;
SELECT count(*) FROM tab WHERE f64 < '7';
SELECT count(*) FROM tab WHERE f64_tdigest < '7';
SELECT count(*) FROM tab WHERE f64_minmax < '7';
SELECT count(*) FROM tab WHERE f64_count_min < '7';
SELECT count(*) FROM tab WHERE f64_uniq < '7';
SELECT count(*) FROM tab WHERE f64 < '7.7';
SELECT count(*) FROM tab WHERE f64_tdigest < '7.7';
SELECT count(*) FROM tab WHERE f64_minmax < '7.7';
SELECT count(*) FROM tab WHERE f64_count_min < '7.7';
SELECT count(*) FROM tab WHERE f64_uniq < '7.7';
@ -153,11 +179,13 @@ SELECT 'dt and =';
SELECT count(*) FROM tab WHERE dt = '2024-08-08 11:12:13';
SELECT count(*) FROM tab WHERE dt_tdigest = '2024-08-08 11:12:13';
SELECT count(*) FROM tab WHERE dt_minmax = '2024-08-08 11:12:13';
SELECT count(*) FROM tab WHERE dt_count_min = '2024-08-08 11:12:13';
SELECT count(*) FROM tab WHERE dt_uniq = '2024-08-08 11:12:13';
SELECT count(*) FROM tab WHERE dt = 7;
SELECT count(*) FROM tab WHERE dt_tdigest = 7;
SELECT count(*) FROM tab WHERE dt_minmax = 7;
SELECT count(*) FROM tab WHERE dt_count_min = 7;
SELECT count(*) FROM tab WHERE dt_uniq = 7;
@ -165,11 +193,13 @@ SELECT 'dt and <';
SELECT count(*) FROM tab WHERE dt < '2024-08-08 11:12:13';
SELECT count(*) FROM tab WHERE dt_tdigest < '2024-08-08 11:12:13';
SELECT count(*) FROM tab WHERE dt_minmax < '2024-08-08 11:12:13';
SELECT count(*) FROM tab WHERE dt_count_min < '2024-08-08 11:12:13';
SELECT count(*) FROM tab WHERE dt_uniq < '2024-08-08 11:12:13';
SELECT count(*) FROM tab WHERE dt < 7;
SELECT count(*) FROM tab WHERE dt_tdigest < 7;
SELECT count(*) FROM tab WHERE dt_minmax < 7;
SELECT count(*) FROM tab WHERE dt_count_min < 7;
SELECT count(*) FROM tab WHERE dt_uniq < 7;
@ -179,21 +209,25 @@ SELECT 'b and =';
SELECT count(*) FROM tab WHERE b = true;
SELECT count(*) FROM tab WHERE b_tdigest = true;
SELECT count(*) FROM tab WHERE b_minmax = true;
SELECT count(*) FROM tab WHERE b_count_min = true;
SELECT count(*) FROM tab WHERE b_uniq = true;
SELECT count(*) FROM tab WHERE b = 'true';
SELECT count(*) FROM tab WHERE b_tdigest = 'true';
SELECT count(*) FROM tab WHERE b_minmax = 'true';
SELECT count(*) FROM tab WHERE b_count_min = 'true';
SELECT count(*) FROM tab WHERE b_uniq = 'true';
SELECT count(*) FROM tab WHERE b = 1;
SELECT count(*) FROM tab WHERE b_tdigest = 1;
SELECT count(*) FROM tab WHERE b_minmax = 1;
SELECT count(*) FROM tab WHERE b_count_min = 1;
SELECT count(*) FROM tab WHERE b_uniq = 1;
SELECT count(*) FROM tab WHERE b = 1.1;
SELECT count(*) FROM tab WHERE b_tdigest = 1.1;
SELECT count(*) FROM tab WHERE b_minmax = 1.1;
SELECT count(*) FROM tab WHERE b_count_min = 1.1;
SELECT count(*) FROM tab WHERE b_uniq = 1.1;
@ -203,12 +237,14 @@ SELECT 's and =';
SELECT count(*) FROM tab WHERE s = 7; -- { serverError NO_COMMON_TYPE }
-- SELECT count(*) FROM tab WHERE s_tdigest = 7; -- not supported
-- SELECT count(*) FROM tab WHERE s_minmax = 7; -- not supported
SELECT count(*) FROM tab WHERE s_count_min = 7; -- { serverError NO_COMMON_TYPE }
-- SELECT count(*) FROM tab WHERE s_uniq = 7; -- not supported
SELECT count(*) FROM tab WHERE s_uniq = 7; -- { serverError NO_COMMON_TYPE }
SELECT count(*) FROM tab WHERE s = '7';
-- SELECT count(*) FROM tab WHERE s_tdigest = '7'; -- not supported
-- SELECT count(*) FROM tab WHERE s_minmax = '7'; -- not supported
SELECT count(*) FROM tab WHERE s_count_min = '7';
-- SELECT count(*) FROM tab WHERE s_uniq = '7'; -- not supported
SELECT count(*) FROM tab WHERE s_uniq = '7';
DROP TABLE tab;

View File

@ -0,0 +1,18 @@
SELECT
is_initial_query,
count() AS c,
replaceRegexpAll(query, '_data_(\\d+)_(\\d+)', '_data_') AS query
FROM system.query_log
WHERE (event_date >= yesterday()) AND (type = 'QueryFinish') AND (ignore(54, 0, ignore('QueryFinish', 11, toLowCardinality(toLowCardinality(11)), 11, 11, 11), 'QueryFinish', materialize(11), toUInt128(11)) IN (
SELECT query_id
FROM system.query_log
WHERE (current_database = currentDatabase()) AND (event_date >= yesterday()) AND (type = 'QueryFinish') AND (query LIKE '-- Parallel inner query alone%')
))
GROUP BY
is_initial_query,
query
ORDER BY
is_initial_query ASC,
c ASC,
query ASC
SETTINGS allow_experimental_parallel_reading_from_replicas=1, max_parallel_replicas=3, cluster_for_parallel_replicas='test_cluster_one_shard_three_replicas_localhost', parallel_replicas_for_non_replicated_merge_tree=1, parallel_replicas_min_number_of_rows_per_replica=10;

View File

@ -1,16 +1,22 @@
v24.8.4.13-lts 2024-09-06
v24.8.3.59-lts 2024-09-03
v24.8.2.3-lts 2024-08-22
v24.8.1.2684-lts 2024-08-21
v24.7.6.8-stable 2024-09-06
v24.7.5.37-stable 2024-09-03
v24.7.4.51-stable 2024-08-23
v24.7.3.47-stable 2024-09-04
v24.7.3.42-stable 2024-08-08
v24.7.2.13-stable 2024-08-01
v24.7.1.2915-stable 2024-07-30
v24.6.6.6-stable 2024-09-06
v24.6.5.30-stable 2024-09-03
v24.6.4.42-stable 2024-08-23
v24.6.3.95-stable 2024-08-06
v24.6.3.38-stable 2024-09-04
v24.6.2.17-stable 2024-07-05
v24.6.1.4423-stable 2024-07-01
v24.5.8.10-stable 2024-09-06
v24.5.7.31-stable 2024-09-03
v24.5.6.45-stable 2024-08-23
v24.5.5.78-stable 2024-08-05
@ -22,6 +28,7 @@ v24.4.4.113-stable 2024-08-02
v24.4.3.25-stable 2024-06-14
v24.4.2.141-stable 2024-06-07
v24.4.1.2088-stable 2024-05-01
v24.3.11.7-lts 2024-09-06
v24.3.10.33-lts 2024-09-03
v24.3.9.5-lts 2024-08-22
v24.3.8.13-lts 2024-08-20
