Mirror of https://github.com/ClickHouse/ClickHouse.git (synced 2024-10-04 15:40:49 +00:00)
Commit b5786fec29: Merge branch 'master' of github.com:yandex/ClickHouse
.github/label-pr.yml (new file, 2 lines)
@@ -0,0 +1,2 @@
+- regExp: ".*\\.md$"
+  labels: ["documentation", "pr-documentation"]
.github/main.workflow (new file, 9 lines)
@@ -0,0 +1,9 @@
+workflow "Main workflow" {
+  resolves = ["Label PR"]
+  on = "pull_request"
+}
+
+action "Label PR" {
+  uses = "decathlon/pull-request-labeler-action@v1.0.0"
+  secrets = ["GITHUB_TOKEN"]
+}
CHANGELOG.md (53 lines changed)

@@ -1,3 +1,54 @@
+## ClickHouse release 19.13.2.19, 2019-08-14
+
+### New Feature
+* Sampling profiler on query level. [Example](https://gist.github.com/alexey-milovidov/92758583dd41c24c360fdb8d6a4da194). [#4247](https://github.com/yandex/ClickHouse/issues/4247) ([laplab](https://github.com/laplab)) [#6124](https://github.com/yandex/ClickHouse/pull/6124) ([alexey-milovidov](https://github.com/alexey-milovidov)) [#6250](https://github.com/yandex/ClickHouse/pull/6250) [#6283](https://github.com/yandex/ClickHouse/pull/6283) [#6386](https://github.com/yandex/ClickHouse/pull/6386)
+* Allow specifying a list of columns with the `COLUMNS('regexp')` expression, which works like a more sophisticated variant of the `*` asterisk (see the query sketch after this list). [#5951](https://github.com/yandex/ClickHouse/pull/5951) ([mfridental](https://github.com/mfridental)), ([alexey-milovidov](https://github.com/alexey-milovidov))
+* `CREATE TABLE AS table_function()` is now possible. [#6057](https://github.com/yandex/ClickHouse/pull/6057) ([dimarub2000](https://github.com/dimarub2000))
+* The Adam optimizer for stochastic gradient descent is used by default in the `stochasticLinearRegression()` and `stochasticLogisticRegression()` aggregate functions, because it shows good quality with almost no tuning. [#6000](https://github.com/yandex/ClickHouse/pull/6000) ([Quid37](https://github.com/Quid37))
+* Added functions for working with the custom week number. [#5212](https://github.com/yandex/ClickHouse/pull/5212) ([Andy Yang](https://github.com/andyyzh))
+* `RENAME` queries now work with all storages. [#5953](https://github.com/yandex/ClickHouse/pull/5953) ([Ivan](https://github.com/abyss7))
+* The client now receives logs from the server at any desired level by setting `send_logs_level`, regardless of the log level specified in the server settings. [#5964](https://github.com/yandex/ClickHouse/pull/5964) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov))
+
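The three query-facing items above are easy to try interactively. The sketch below is illustrative only: the table and column names (`hits`, `metric_*`, `all_numbers`) are hypothetical and not taken from the changelog, and the `CREATE TABLE ... AS table_function()` form assumes the syntax introduced by #6057.

```sql
-- COLUMNS('regexp') expands to every column whose name matches the regular expression.
SELECT COLUMNS('^metric_') FROM hits LIMIT 10;

-- CREATE TABLE ... AS table_function() creates a table backed by the table function.
CREATE TABLE all_numbers AS numbers(100);
SELECT count() FROM all_numbers;

-- send_logs_level streams server logs of the chosen level back to the client session.
SET send_logs_level = 'trace';
```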
+### Experimental features
+* New query processing pipeline. Use the `experimental_use_processors=1` setting to enable it. Use it at your own risk. [#4914](https://github.com/yandex/ClickHouse/pull/4914) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
+
+### Bug Fix
+* Kafka integration has been fixed in this version.
+* Fixed `DoubleDelta` encoding of `Int64` for large `DoubleDelta` values, improved `DoubleDelta` encoding for random data for `Int32`. [#5998](https://github.com/yandex/ClickHouse/pull/5998) ([Vasily Nemkov](https://github.com/Enmk))
+* Fixed overestimation of `max_rows_to_read` if the setting `merge_tree_uniform_read_distribution` is set to 0. [#6019](https://github.com/yandex/ClickHouse/pull/6019) ([alexey-milovidov](https://github.com/alexey-milovidov))
+
+### Improvement
+* The setting `input_format_defaults_for_omitted_fields` is enabled by default. It enables calculation of complex default expressions for omitted fields in the `JSONEachRow` and `CSV*` formats. This is the expected behaviour, but it may lead to a negligible performance difference or subtle incompatibilities. [#6043](https://github.com/yandex/ClickHouse/pull/6043) ([Artem Zuikov](https://github.com/4ertus2)), [#5625](https://github.com/yandex/ClickHouse/pull/5625) ([akuzm](https://github.com/akuzm))
+* Throw an exception if a `config.d` file doesn't have the same root element as the main config file. [#6123](https://github.com/yandex/ClickHouse/pull/6123) ([dimarub2000](https://github.com/dimarub2000))
+
+### Performance Improvement
+* Optimize `count()`. Now it uses the smallest column (if possible). [#6028](https://github.com/yandex/ClickHouse/pull/6028) ([Amos Bird](https://github.com/amosbird))
+
+### Build/Testing/Packaging Improvement
+* Report memory usage in performance tests. [#5899](https://github.com/yandex/ClickHouse/pull/5899) ([akuzm](https://github.com/akuzm))
+* Fix build with external `libcxx`. [#6010](https://github.com/yandex/ClickHouse/pull/6010) ([Ivan](https://github.com/abyss7))
+* Fix shared build with the `rdkafka` library. [#6101](https://github.com/yandex/ClickHouse/pull/6101) ([Ivan](https://github.com/abyss7))
+
+## ClickHouse release 19.11.7.40, 2019-08-14
+
+### Bug fix
+* Kafka integration has been fixed in this version.
+* Fix segfault when using `arrayReduce` for constant arguments. [#6326](https://github.com/yandex/ClickHouse/pull/6326) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fixed `toFloat()` monotonicity. [#6374](https://github.com/yandex/ClickHouse/pull/6374) ([dimarub2000](https://github.com/dimarub2000))
+* Fix segfault with enabled `optimize_skip_unused_shards` and missing sharding key. [#6384](https://github.com/yandex/ClickHouse/pull/6384) ([CurtizJ](https://github.com/CurtizJ))
+* Fixed logic of `arrayEnumerateUniqRanked` function. [#6423](https://github.com/yandex/ClickHouse/pull/6423) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Removed extra verbose logging from MySQL handler. [#6389](https://github.com/yandex/ClickHouse/pull/6389) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fix wrong behavior and possible segfaults in `topK` and `topKWeighted` aggregate functions. [#6404](https://github.com/yandex/ClickHouse/pull/6404) ([CurtizJ](https://github.com/CurtizJ))
+* Do not expose virtual columns in the `system.columns` table. This is required for backward compatibility. [#6406](https://github.com/yandex/ClickHouse/pull/6406) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fix bug with memory allocation for string fields in the complex key cache dictionary. [#6447](https://github.com/yandex/ClickHouse/pull/6447) ([alesapin](https://github.com/alesapin))
+* Fix bug with enabling adaptive granularity when creating a new replica for a `Replicated*MergeTree` table. [#6452](https://github.com/yandex/ClickHouse/pull/6452) ([alesapin](https://github.com/alesapin))
+* Fix infinite loop when reading Kafka messages. [#6354](https://github.com/yandex/ClickHouse/pull/6354) ([abyss7](https://github.com/abyss7))
+* Fixed the possibility of a fabricated query causing a server crash due to stack overflow in the SQL parser, and the possibility of stack overflow in `Merge` and `Distributed` tables. [#6433](https://github.com/yandex/ClickHouse/pull/6433) ([alexey-milovidov](https://github.com/alexey-milovidov))
+* Fixed Gorilla encoding error on small sequences. [#6444](https://github.com/yandex/ClickHouse/pull/6444) ([Enmk](https://github.com/Enmk))
+
+### Improvement
+* Allow the user to override `poll_interval` and `idle_connection_timeout` settings on connection. [#6230](https://github.com/yandex/ClickHouse/pull/6230) ([alexey-milovidov](https://github.com/alexey-milovidov))
+
 ## ClickHouse release 19.11.5.28, 2019-08-05

 ### Bug fix
@@ -299,7 +350,7 @@ It allows to set commit mode: after every batch of messages is handled, or after
 * Renamed functions `leastSqr` to `simpleLinearRegression`, `LinearRegression` to `linearRegression`, `LogisticRegression` to `logisticRegression`. [#5391](https://github.com/yandex/ClickHouse/pull/5391) ([Nikolai Kochetov](https://github.com/KochetovNicolai))

 ### Performance Improvements
-* Paralellize processing of parts in alter modify query. [#4639](https://github.com/yandex/ClickHouse/pull/4639) ([Ivan Kush](https://github.com/IvanKush))
+* Parallelize processing of parts of non-replicated MergeTree tables in ALTER MODIFY query. [#4639](https://github.com/yandex/ClickHouse/pull/4639) ([Ivan Kush](https://github.com/IvanKush))
 * Optimizations in regular expressions extraction. [#5193](https://github.com/yandex/ClickHouse/pull/5193) [#5191](https://github.com/yandex/ClickHouse/pull/5191) ([Danila Kutenin](https://github.com/danlark1))
 * Do not add right join key column to join result if it's used only in the join ON section. [#5260](https://github.com/yandex/ClickHouse/pull/5260) ([Artem Zuikov](https://github.com/4ertus2))
 * Freeze the Kafka buffer after the first empty response. It avoids multiple invocations of `ReadBuffer::next()` for empty result in some row-parsing streams. [#5283](https://github.com/yandex/ClickHouse/pull/5283) ([Ivan](https://github.com/abyss7))
@@ -5,7 +5,6 @@ set(CLICKHOUSE_ODBC_BRIDGE_SOURCES
     ${CMAKE_CURRENT_SOURCE_DIR}/IdentifierQuoteHandler.cpp
     ${CMAKE_CURRENT_SOURCE_DIR}/MainHandler.cpp
     ${CMAKE_CURRENT_SOURCE_DIR}/ODBCBlockInputStream.cpp
-    ${CMAKE_CURRENT_SOURCE_DIR}/odbc-bridge.cpp
    ${CMAKE_CURRENT_SOURCE_DIR}/ODBCBridge.cpp
     ${CMAKE_CURRENT_SOURCE_DIR}/PingHandler.cpp
     ${CMAKE_CURRENT_SOURCE_DIR}/validateODBCConnectionString.cpp
@@ -24,6 +24,12 @@ template <typename Value, bool FloatReturn> using FuncQuantilesDeterministic = A
 template <typename Value, bool _> using FuncQuantileExact = AggregateFunctionQuantile<Value, QuantileExact<Value>, NameQuantileExact, false, void, false>;
 template <typename Value, bool _> using FuncQuantilesExact = AggregateFunctionQuantile<Value, QuantileExact<Value>, NameQuantilesExact, false, void, true>;

+template <typename Value, bool _> using FuncQuantileExactExclusive = AggregateFunctionQuantile<Value, QuantileExactExclusive<Value>, NameQuantileExactExclusive, false, Float64, false>;
+template <typename Value, bool _> using FuncQuantilesExactExclusive = AggregateFunctionQuantile<Value, QuantileExactExclusive<Value>, NameQuantilesExactExclusive, false, Float64, true>;
+
+template <typename Value, bool _> using FuncQuantileExactInclusive = AggregateFunctionQuantile<Value, QuantileExactInclusive<Value>, NameQuantileExactInclusive, false, Float64, false>;
+template <typename Value, bool _> using FuncQuantilesExactInclusive = AggregateFunctionQuantile<Value, QuantileExactInclusive<Value>, NameQuantilesExactInclusive, false, Float64, true>;
+
 template <typename Value, bool _> using FuncQuantileExactWeighted = AggregateFunctionQuantile<Value, QuantileExactWeighted<Value>, NameQuantileExactWeighted, true, void, false>;
 template <typename Value, bool _> using FuncQuantilesExactWeighted = AggregateFunctionQuantile<Value, QuantileExactWeighted<Value>, NameQuantilesExactWeighted, true, void, true>;

@@ -92,6 +98,12 @@ void registerAggregateFunctionsQuantile(AggregateFunctionFactory & factory)
     factory.registerFunction(NameQuantileExact::name, createAggregateFunctionQuantile<FuncQuantileExact>);
     factory.registerFunction(NameQuantilesExact::name, createAggregateFunctionQuantile<FuncQuantilesExact>);

+    factory.registerFunction(NameQuantileExactExclusive::name, createAggregateFunctionQuantile<FuncQuantileExactExclusive>);
+    factory.registerFunction(NameQuantilesExactExclusive::name, createAggregateFunctionQuantile<FuncQuantilesExactExclusive>);
+
+    factory.registerFunction(NameQuantileExactInclusive::name, createAggregateFunctionQuantile<FuncQuantileExactInclusive>);
+    factory.registerFunction(NameQuantilesExactInclusive::name, createAggregateFunctionQuantile<FuncQuantilesExactInclusive>);
+
     factory.registerFunction(NameQuantileExactWeighted::name, createAggregateFunctionQuantile<FuncQuantileExactWeighted>);
     factory.registerFunction(NameQuantilesExactWeighted::name, createAggregateFunctionQuantile<FuncQuantilesExactWeighted>);
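A hedged sketch of how the newly registered functions are called from SQL. The data set below is made up; only the function names come from the `Name*` structs registered above.

```sql
SELECT
    quantileExactExclusive(0.6)(x) AS p60_exclusive,   -- spreadsheet-style PERCENTILE.EXC
    quantileExactInclusive(0.6)(x) AS p60_inclusive,   -- spreadsheet-style PERCENTILE.INC
    quantilesExactExclusive(0.25, 0.5, 0.75)(x) AS quartiles
FROM (SELECT arrayJoin([10, 20, 30, 40, 50]) AS x);
```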
@@ -199,8 +199,15 @@ struct NameQuantileDeterministic { static constexpr auto name = "quantileDetermi
 struct NameQuantilesDeterministic { static constexpr auto name = "quantilesDeterministic"; };

 struct NameQuantileExact { static constexpr auto name = "quantileExact"; };
-struct NameQuantileExactWeighted { static constexpr auto name = "quantileExactWeighted"; };
 struct NameQuantilesExact { static constexpr auto name = "quantilesExact"; };

+struct NameQuantileExactExclusive { static constexpr auto name = "quantileExactExclusive"; };
+struct NameQuantilesExactExclusive { static constexpr auto name = "quantilesExactExclusive"; };
+
+struct NameQuantileExactInclusive { static constexpr auto name = "quantileExactInclusive"; };
+struct NameQuantilesExactInclusive { static constexpr auto name = "quantilesExactInclusive"; };
+
+struct NameQuantileExactWeighted { static constexpr auto name = "quantileExactWeighted"; };
 struct NameQuantilesExactWeighted { static constexpr auto name = "quantilesExactWeighted"; };

 struct NameQuantileTiming { static constexpr auto name = "quantileTiming"; };
@@ -17,8 +17,8 @@ namespace
 template <template <typename> class Data>
 AggregateFunctionPtr createAggregateFunctionWindowFunnel(const std::string & name, const DataTypes & arguments, const Array & params)
 {
-    if (params.size() != 1)
-        throw Exception{"Aggregate function " + name + " requires exactly one parameter.", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH};
+    if (params.size() < 1)
+        throw Exception{"Aggregate function " + name + " requires at least one parameter: <window>, [option, [option, ...]]", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH};

     if (arguments.size() < 2)
         throw Exception("Aggregate function " + name + " requires one timestamp argument and at least one event condition.", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);

@@ -139,6 +139,7 @@ class AggregateFunctionWindowFunnel final
 private:
     UInt64 window;
     UInt8 events_size;
+    UInt8 strict;


     // Loop through the entire events_list, update the event timestamp value

@@ -165,6 +166,10 @@ private:

             if (event_idx == 0)
                 events_timestamp[0] = timestamp;
+            else if (strict && events_timestamp[event_idx] >= 0)
+            {
+                return event_idx + 1;
+            }
             else if (events_timestamp[event_idx - 1] >= 0 && timestamp <= events_timestamp[event_idx - 1] + window)
             {
                 events_timestamp[event_idx] = events_timestamp[event_idx - 1];

@@ -191,8 +196,17 @@ public:
     {
         events_size = arguments.size() - 1;
         window = params.at(0).safeGet<UInt64>();
-    }
+
+        strict = 0;
+        for (size_t i = 1; i < params.size(); ++i)
+        {
+            String option = params.at(i).safeGet<String>();
+            if (option.compare("strict") == 0)
+                strict = 1;
+            else
+                throw Exception{"Aggregate function " + getName() + " doesn't support a parameter: " + option, ErrorCodes::BAD_ARGUMENTS};
+        }
+    }

     DataTypePtr getReturnType() const override
     {
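A sketch of the new optional parameter parsed in the constructor above. The table, column, and event values are hypothetical; the only syntax taken from this diff is that `windowFunnel` now accepts extra string options after the window, of which `'strict'` is the one recognized.

```sql
-- Plain funnel versus the strict variant over the same event stream.
SELECT
    windowFunnel(3600)(ts, event = 'view', event = 'cart', event = 'purchase') AS relaxed_depth,
    windowFunnel(3600, 'strict')(ts, event = 'view', event = 'cart', event = 'purchase') AS strict_depth
FROM events
GROUP BY user_id;
```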
@@ -14,6 +14,7 @@ namespace DB
 namespace ErrorCodes
 {
     extern const int NOT_IMPLEMENTED;
+    extern const int BAD_ARGUMENTS;
 }

 /** Calculates quantile by collecting all values into array

@@ -106,16 +107,134 @@ struct QuantileExact
             result[i] = Value();
         }
     }
+};

-    /// The same, but in the case of an empty state, NaN is returned.
-    Float64 getFloat(Float64) const
+/// QuantileExactExclusive is equivalent to Excel PERCENTILE.EXC, R-6, SAS-4, SciPy-(0,0)
+template <typename Value>
+struct QuantileExactExclusive : public QuantileExact<Value>
+{
+    using QuantileExact<Value>::array;
+
+    /// Get the value of the `level` quantile. The level must be between 0 and 1 excluding bounds.
+    Float64 getFloat(Float64 level)
     {
-        throw Exception("Method getFloat is not implemented for QuantileExact", ErrorCodes::NOT_IMPLEMENTED);
+        if (!array.empty())
+        {
+            if (level == 0. || level == 1.)
+                throw Exception("QuantileExactExclusive cannot interpolate for the percentiles 1 and 0", ErrorCodes::BAD_ARGUMENTS);
+
+            Float64 h = level * (array.size() + 1);
+            auto n = static_cast<size_t>(h);
+
+            if (n >= array.size())
+                return array[array.size() - 1];
+            else if (n < 1)
+                return array[0];
+
+            std::nth_element(array.begin(), array.begin() + n - 1, array.end());
+            auto nth_element = std::min_element(array.begin() + n, array.end());
+
+            return array[n - 1] + (h - n) * (*nth_element - array[n - 1]);
+        }
+
+        return std::numeric_limits<Float64>::quiet_NaN();
     }

-    void getManyFloat(const Float64 *, const size_t *, size_t, Float64 *) const
+    void getManyFloat(const Float64 * levels, const size_t * indices, size_t size, Float64 * result)
     {
-        throw Exception("Method getManyFloat is not implemented for QuantileExact", ErrorCodes::NOT_IMPLEMENTED);
+        if (!array.empty())
+        {
+            size_t prev_n = 0;
+            for (size_t i = 0; i < size; ++i)
+            {
+                auto level = levels[indices[i]];
+                if (level == 0. || level == 1.)
+                    throw Exception("QuantileExactExclusive cannot interpolate for the percentiles 1 and 0", ErrorCodes::BAD_ARGUMENTS);
+
+                Float64 h = level * (array.size() + 1);
+                auto n = static_cast<size_t>(h);
+
+                if (n >= array.size())
+                    result[indices[i]] = array[array.size() - 1];
+                else if (n < 1)
+                    result[indices[i]] = array[0];
+                else
+                {
+                    std::nth_element(array.begin() + prev_n, array.begin() + n - 1, array.end());
+                    auto nth_element = std::min_element(array.begin() + n, array.end());
+
+                    result[indices[i]] = array[n - 1] + (h - n) * (*nth_element - array[n - 1]);
+                    prev_n = n - 1;
+                }
+            }
+        }
+        else
+        {
+            for (size_t i = 0; i < size; ++i)
+                result[i] = std::numeric_limits<Float64>::quiet_NaN();
+        }
+    }
+};
+
+/// QuantileExactInclusive is equivalent to Excel PERCENTILE and PERCENTILE.INC, R-7, SciPy-(1,1)
+template <typename Value>
+struct QuantileExactInclusive : public QuantileExact<Value>
+{
+    using QuantileExact<Value>::array;
+
+    /// Get the value of the `level` quantile. The level must be between 0 and 1 including bounds.
+    Float64 getFloat(Float64 level)
+    {
+        if (!array.empty())
+        {
+            Float64 h = level * (array.size() - 1) + 1;
+            auto n = static_cast<size_t>(h);
+
+            if (n >= array.size())
+                return array[array.size() - 1];
+            else if (n < 1)
+                return array[0];
+
+            std::nth_element(array.begin(), array.begin() + n - 1, array.end());
+            auto nth_element = std::min_element(array.begin() + n, array.end());
+
+            return array[n - 1] + (h - n) * (*nth_element - array[n - 1]);
+        }
+
+        return std::numeric_limits<Float64>::quiet_NaN();
+    }
+
+    void getManyFloat(const Float64 * levels, const size_t * indices, size_t size, Float64 * result)
+    {
+        if (!array.empty())
+        {
+            size_t prev_n = 0;
+            for (size_t i = 0; i < size; ++i)
+            {
+                auto level = levels[indices[i]];
+
+                Float64 h = level * (array.size() - 1) + 1;
+                auto n = static_cast<size_t>(h);
+
+                if (n >= array.size())
+                    result[indices[i]] = array[array.size() - 1];
+                else if (n < 1)
+                    result[indices[i]] = array[0];
+                else
+                {
+                    std::nth_element(array.begin() + prev_n, array.begin() + n - 1, array.end());
+                    auto nth_element = std::min_element(array.begin() + n, array.end());
+
+                    result[indices[i]] = array[n - 1] + (h - n) * (*nth_element - array[n - 1]);
+                    prev_n = n - 1;
+                }
+            }
+        }
+        else
+        {
+            for (size_t i = 0; i < size; ++i)
+                result[i] = std::numeric_limits<Float64>::quiet_NaN();
+        }
     }
 };
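The rank arithmetic in both `getFloat` implementations above follows the standard exclusive (R-6) and inclusive (R-7) definitions. Writing N for `array.size()` and p for `level`, the code computes:

```latex
% exclusive (QuantileExactExclusive):  h = p (N + 1)
% inclusive (QuantileExactInclusive):  h = p (N - 1) + 1
% with n = \lfloor h \rfloor clamped to [1, N], the result interpolates adjacent order statistics:
\[
Q(p) = x_{(n)} + (h - n)\,\bigl(x_{(n+1)} - x_{(n)}\bigr)
\]
```

In the code, `array[n - 1]` is the order statistic x_(n) after `std::nth_element`, and `*nth_element` (a `std::min_element` over the remaining tail) is x_(n+1).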
@@ -348,10 +348,9 @@ bool OPTIMIZE(1) CSVRowInputStream::parseRowAndPrintDiagnosticInfo(MutableColumn
         const auto & current_column_type = data_types[table_column];
         const bool is_last_file_column =
             file_column + 1 == column_indexes_for_input_fields.size();
-        const bool at_delimiter = *istr.position() == delimiter;
+        const bool at_delimiter = !istr.eof() && *istr.position() == delimiter;
         const bool at_last_column_line_end = is_last_file_column
-            && (*istr.position() == '\n' || *istr.position() == '\r'
-                || istr.eof());
+            && (istr.eof() || *istr.position() == '\n' || *istr.position() == '\r');

         out << "Column " << file_column << ", " << std::string((file_column < 10 ? 2 : file_column < 100 ? 1 : 0), ' ')
             << "name: " << header.safeGetByPosition(table_column).name << ", " << std::string(max_length_of_column_name - header.safeGetByPosition(table_column).name.size(), ' ')

@@ -514,10 +513,9 @@ void CSVRowInputStream::updateDiagnosticInfo()

 bool CSVRowInputStream::readField(IColumn & column, const DataTypePtr & type, bool is_last_file_column, size_t column_idx)
 {
-    const bool at_delimiter = *istr.position() == format_settings.csv.delimiter;
+    const bool at_delimiter = !istr.eof() || *istr.position() == format_settings.csv.delimiter;
     const bool at_last_column_line_end = is_last_file_column
-        && (*istr.position() == '\n' || *istr.position() == '\r'
-            || istr.eof());
+        && (istr.eof() || *istr.position() == '\n' || *istr.position() == '\r');

     if (format_settings.csv.empty_as_default
         && (at_delimiter || at_last_column_line_end))
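A rough illustration of what the eof-aware checks above enable, assuming the empty-as-default behaviour referenced by `format_settings.csv.empty_as_default`. The table, shell command, and setting interplay are assumptions for illustration, not part of the diff.

```sql
CREATE TABLE csv_demo (id UInt32, note String DEFAULT 'none') ENGINE = Memory;
-- A CSV row whose last field is empty and ends exactly at end-of-input, e.g. piped as:
--   echo -n '1,' | clickhouse-client --query "INSERT INTO csv_demo FORMAT CSV"
-- should fall back to the column default instead of reading past the end of the buffer.
SELECT * FROM csv_demo;   -- expected: 1, 'none'
```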
@@ -320,13 +320,7 @@ ColumnPtr Set::execute(const Block & block, bool negative) const
         return res;
     }

-    if (data_types.size() != num_key_columns)
-    {
-        std::stringstream message;
-        message << "Number of columns in section IN doesn't match. "
-            << num_key_columns << " at left, " << data_types.size() << " at right.";
-        throw Exception(message.str(), ErrorCodes::NUMBER_OF_COLUMNS_DOESNT_MATCH);
-    }
+    checkColumnsNumber(num_key_columns);

     /// Remember the columns we will work with. Also check that the data types are correct.
     ColumnRawPtrs key_columns;

@@ -337,11 +331,7 @@ ColumnPtr Set::execute(const Block & block, bool negative) const

     for (size_t i = 0; i < num_key_columns; ++i)
     {
-        if (!removeNullable(data_types[i])->equals(*removeNullable(block.safeGetByPosition(i).type)))
-            throw Exception("Types of column " + toString(i + 1) + " in section IN don't match: "
-                + data_types[i]->getName() + " on the right, " + block.safeGetByPosition(i).type->getName() +
-                " on the left.", ErrorCodes::TYPE_MISMATCH);
+        checkTypesEqual(i, block.safeGetByPosition(i).type);

         materialized_columns.emplace_back(block.safeGetByPosition(i).column->convertToFullColumnIfConst());
         key_columns.emplace_back() = materialized_columns.back().get();
     }

@@ -421,6 +411,24 @@ void Set::executeOrdinary(
     }
 }

+void Set::checkColumnsNumber(size_t num_key_columns) const
+{
+    if (data_types.size() != num_key_columns)
+    {
+        std::stringstream message;
+        message << "Number of columns in section IN doesn't match. "
+            << num_key_columns << " at left, " << data_types.size() << " at right.";
+        throw Exception(message.str(), ErrorCodes::NUMBER_OF_COLUMNS_DOESNT_MATCH);
+    }
+}
+
+void Set::checkTypesEqual(size_t set_type_idx, const DataTypePtr & other_type) const
+{
+    if (!removeNullable(data_types[set_type_idx])->equals(*removeNullable(other_type)))
+        throw Exception("Types of column " + toString(set_type_idx + 1) + " in section IN don't match: "
+            + data_types[set_type_idx]->getName() + " on the right, " + other_type->getName() +
+            " on the left.", ErrorCodes::TYPE_MISMATCH);
+}
+
 MergeTreeSetIndex::MergeTreeSetIndex(const Columns & set_elements, std::vector<KeyTuplePositionMapping> && index_mapping_)
     : indexes_mapping(std::move(index_mapping_))
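The two helpers factored out above are what produce the familiar IN-clause errors. A quick sketch (the queries are illustrative; the error names come from the code):

```sql
SELECT (1, 2) IN (SELECT 1, 2);    -- OK: column counts and types match
SELECT (1, 2) IN (SELECT 1);       -- Set::checkColumnsNumber -> NUMBER_OF_COLUMNS_DOESNT_MATCH
SELECT (1, 'a') IN (SELECT 1, 2);  -- Set::checkTypesEqual    -> TYPE_MISMATCH
```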
@@ -70,6 +70,9 @@ public:
     bool hasExplicitSetElements() const { return fill_set_elements; }
     Columns getSetElements() const { return { set_elements.begin(), set_elements.end() }; }

+    void checkColumnsNumber(size_t num_key_columns) const;
+    void checkTypesEqual(size_t set_type_idx, const DataTypePtr & other_type) const;
+
 private:
     size_t keys_size = 0;
     Sizes key_sizes;
@@ -349,10 +349,9 @@ bool OPTIMIZE(1) CSVRowInputFormat::parseRowAndPrintDiagnosticInfo(MutableColumn
         const auto & current_column_type = data_types[table_column];
         const bool is_last_file_column =
             file_column + 1 == column_indexes_for_input_fields.size();
-        const bool at_delimiter = *in.position() == delimiter;
+        const bool at_delimiter = !in.eof() && *in.position() == delimiter;
         const bool at_last_column_line_end = is_last_file_column
-            && (*in.position() == '\n' || *in.position() == '\r'
-                || in.eof());
+            && (in.eof() || *in.position() == '\n' || *in.position() == '\r');

         auto & header = getPort().getHeader();
         out << "Column " << file_column << ", " << std::string((file_column < 10 ? 2 : file_column < 100 ? 1 : 0), ' ')

@@ -516,10 +515,9 @@ void CSVRowInputFormat::updateDiagnosticInfo()

 bool CSVRowInputFormat::readField(IColumn & column, const DataTypePtr & type, bool is_last_file_column, size_t column_idx)
 {
-    const bool at_delimiter = *in.position() == format_settings.csv.delimiter;
+    const bool at_delimiter = !in.eof() && *in.position() == format_settings.csv.delimiter;
     const bool at_last_column_line_end = is_last_file_column
-        && (*in.position() == '\n' || *in.position() == '\r'
-            || in.eof());
+        && (in.eof() || *in.position() == '\n' || *in.position() == '\r');

     if (format_settings.csv.empty_as_default
         && (at_delimiter || at_last_column_line_end))
@@ -25,25 +25,26 @@ IMergedBlockOutputStream::IMergedBlockOutputStream(
     size_t aio_threshold_,
     bool blocks_are_granules_size_,
     const std::vector<MergeTreeIndexPtr> & indices_to_recalc,
-    const MergeTreeIndexGranularity & index_granularity_)
+    const MergeTreeIndexGranularity & index_granularity_,
+    const MergeTreeIndexGranularityInfo * index_granularity_info_)
     : storage(storage_)
     , part_path(part_path_)
     , min_compress_block_size(min_compress_block_size_)
     , max_compress_block_size(max_compress_block_size_)
     , aio_threshold(aio_threshold_)
-    , marks_file_extension(storage.canUseAdaptiveGranularity() ? getAdaptiveMrkExtension() : getNonAdaptiveMrkExtension())
+    , can_use_adaptive_granularity(index_granularity_info_ ? index_granularity_info_->is_adaptive : storage.canUseAdaptiveGranularity())
+    , marks_file_extension(can_use_adaptive_granularity ? getAdaptiveMrkExtension() : getNonAdaptiveMrkExtension())
     , blocks_are_granules_size(blocks_are_granules_size_)
     , index_granularity(index_granularity_)
     , compute_granularity(index_granularity.empty())
     , codec(std::move(codec_))
     , skip_indices(indices_to_recalc)
-    , with_final_mark(storage.settings.write_final_mark && storage.canUseAdaptiveGranularity())
+    , with_final_mark(storage.settings.write_final_mark && can_use_adaptive_granularity)
 {
     if (blocks_are_granules_size && !index_granularity.empty())
         throw Exception("Can't take information about index granularity from blocks, when non empty index_granularity array specified", ErrorCodes::LOGICAL_ERROR);
 }


 void IMergedBlockOutputStream::addStreams(
     const String & path,
     const String & name,

@@ -145,7 +146,7 @@ void IMergedBlockOutputStream::fillIndexGranularity(const Block & block)
         blocks_are_granules_size,
         index_offset,
         index_granularity,
-        storage.canUseAdaptiveGranularity());
+        can_use_adaptive_granularity);
 }

 void IMergedBlockOutputStream::writeSingleMark(

@@ -176,7 +177,7 @@ void IMergedBlockOutputStream::writeSingleMark(

         writeIntBinary(stream.plain_hashing.count(), stream.marks);
         writeIntBinary(stream.compressed.offset(), stream.marks);
-        if (storage.canUseAdaptiveGranularity())
+        if (can_use_adaptive_granularity)
             writeIntBinary(number_of_rows, stream.marks);
     }, path);
 }

@@ -362,7 +363,7 @@ void IMergedBlockOutputStream::calculateAndSerializeSkipIndices(
             writeIntBinary(stream.compressed.offset(), stream.marks);
             /// Actually this numbers is redundant, but we have to store them
             /// to be compatible with normal .mrk2 file format
-            if (storage.canUseAdaptiveGranularity())
+            if (can_use_adaptive_granularity)
                 writeIntBinary(1UL, stream.marks);

         ++skip_index_current_mark;
@@ -1,6 +1,7 @@
 #pragma once

 #include <Storages/MergeTree/MergeTreeIndexGranularity.h>
+#include <Storages/MergeTree/MergeTreeIndexGranularityInfo.h>
 #include <IO/WriteBufferFromFile.h>
 #include <Compression/CompressedWriteBuffer.h>
 #include <IO/HashingWriteBuffer.h>

@@ -23,7 +24,8 @@ public:
         size_t aio_threshold_,
         bool blocks_are_granules_size_,
         const std::vector<MergeTreeIndexPtr> & indices_to_recalc,
-        const MergeTreeIndexGranularity & index_granularity_);
+        const MergeTreeIndexGranularity & index_granularity_,
+        const MergeTreeIndexGranularityInfo * index_granularity_info_ = nullptr);

     using WrittenOffsetColumns = std::set<std::string>;

@@ -141,6 +143,7 @@ protected:
     size_t current_mark = 0;
     size_t skip_index_mark = 0;

+    const bool can_use_adaptive_granularity;
     const std::string marks_file_extension;
     const bool blocks_are_granules_size;
@@ -546,11 +546,13 @@ bool KeyCondition::tryPrepareSetIndex(
         }
     };

+    size_t left_args_count = 1;
     const auto * left_arg_tuple = left_arg->as<ASTFunction>();
     if (left_arg_tuple && left_arg_tuple->name == "tuple")
     {
         const auto & tuple_elements = left_arg_tuple->arguments->children;
-        for (size_t i = 0; i < tuple_elements.size(); ++i)
+        left_args_count = tuple_elements.size();
+        for (size_t i = 0; i < left_args_count; ++i)
             get_key_tuple_position_mapping(tuple_elements[i], i);
     }
     else

@@ -577,6 +579,10 @@ bool KeyCondition::tryPrepareSetIndex(
     if (!prepared_set->hasExplicitSetElements())
         return false;

+    prepared_set->checkColumnsNumber(left_args_count);
+    for (size_t i = 0; i < indexes_mapping.size(); ++i)
+        prepared_set->checkTypesEqual(indexes_mapping[i].tuple_index, removeLowCardinality(data_types[i]));
+
     out.set_index = std::make_shared<MergeTreeSetIndex>(prepared_set->getSetElements(), std::move(indexes_mapping));

     return true;
@@ -1581,7 +1581,8 @@ void MergeTreeData::alterDataPart(
         true /* skip_offsets */,
         {},
         unused_written_offsets,
-        part->index_granularity);
+        part->index_granularity,
+        &part->index_granularity_info);

     in.readPrefix();
     out.writePrefix();
@@ -934,6 +934,7 @@ MergeTreeData::MutableDataPartPtr MergeTreeDataMergerMutator::mutatePartToTempor
     new_data_part->relative_path = "tmp_mut_" + future_part.name;
     new_data_part->is_temp = true;
     new_data_part->ttl_infos = source_part->ttl_infos;
+    new_data_part->index_granularity_info = source_part->index_granularity_info;

     String new_part_tmp_path = new_data_part->getFullPath();

@@ -1069,7 +1070,8 @@ MergeTreeData::MutableDataPartPtr MergeTreeDataMergerMutator::mutatePartToTempor
             /* skip_offsets = */ false,
             std::vector<MergeTreeIndexPtr>(indices_to_recalc.begin(), indices_to_recalc.end()),
             unused_written_offsets,
-            source_part->index_granularity
+            source_part->index_granularity,
+            &source_part->index_granularity_info
         );

         in->readPrefix();
@@ -8,14 +8,16 @@ MergedColumnOnlyOutputStream::MergedColumnOnlyOutputStream(
     CompressionCodecPtr default_codec_, bool skip_offsets_,
     const std::vector<MergeTreeIndexPtr> & indices_to_recalc_,
     WrittenOffsetColumns & already_written_offset_columns_,
-    const MergeTreeIndexGranularity & index_granularity_)
+    const MergeTreeIndexGranularity & index_granularity_,
+    const MergeTreeIndexGranularityInfo * index_granularity_info_)
     : IMergedBlockOutputStream(
         storage_, part_path_, storage_.global_context.getSettings().min_compress_block_size,
         storage_.global_context.getSettings().max_compress_block_size, default_codec_,
         storage_.global_context.getSettings().min_bytes_to_use_direct_io,
         false,
         indices_to_recalc_,
-        index_granularity_),
+        index_granularity_,
+        index_granularity_info_),
     header(header_), sync(sync_), skip_offsets(skip_offsets_),
     already_written_offset_columns(already_written_offset_columns_)
 {
@@ -17,7 +17,8 @@ public:
         CompressionCodecPtr default_codec_, bool skip_offsets_,
         const std::vector<MergeTreeIndexPtr> & indices_to_recalc_,
         WrittenOffsetColumns & already_written_offset_columns_,
-        const MergeTreeIndexGranularity & index_granularity_);
+        const MergeTreeIndexGranularity & index_granularity_,
+        const MergeTreeIndexGranularityInfo * index_granularity_info_ = nullptr);

     Block getHeader() const override { return header; }
     void write(const Block & block) override;
@@ -176,6 +176,22 @@ IColumn::Selector createSelector(const ClusterPtr cluster, const ColumnWithTypeA
     throw Exception{"Sharding key expression does not evaluate to an integer type", ErrorCodes::TYPE_MISMATCH};
 }

+std::string makeFormattedListOfShards(const ClusterPtr & cluster)
+{
+    std::ostringstream os;
+
+    bool head = true;
+    os << "[";
+    for (const auto & shard_info : cluster->getShardsInfo())
+    {
+        (head ? os : os << ", ") << shard_info.shard_num;
+        head = false;
+    }
+    os << "]";
+
+    return os.str();
+}
+
 }

@@ -312,10 +328,23 @@ BlockInputStreams StorageDistributed::read(

     if (settings.optimize_skip_unused_shards)
     {
-        auto smaller_cluster = skipUnusedShards(cluster, query_info);
-
-        if (smaller_cluster)
-            cluster = smaller_cluster;
+        if (has_sharding_key)
+        {
+            auto smaller_cluster = skipUnusedShards(cluster, query_info);
+
+            if (smaller_cluster)
+            {
+                cluster = smaller_cluster;
+                LOG_DEBUG(log, "Reading from " << database_name << "." << table_name << ": "
+                    "Skipping irrelevant shards - the query will be sent to the following shards of the cluster (shard numbers): "
+                    " " << makeFormattedListOfShards(cluster));
+            }
+            else
+            {
+                LOG_DEBUG(log, "Reading from " << database_name << "." << table_name << ": "
+                    "Unable to figure out irrelevant shards from WHERE/PREWHERE clauses - the query will be sent to all shards of the cluster");
+            }
+        }
     }

     return ClusterProxy::executeQuery(
@@ -488,15 +517,32 @@ void StorageDistributed::ClusterNodeData::shutdownAndDropAllData()
 }

 /// Returns a new cluster with fewer shards if constant folding for `sharding_key_expr` is possible
-/// using constraints from "WHERE" condition, otherwise returns `nullptr`
+/// using constraints from "PREWHERE" and "WHERE" conditions, otherwise returns `nullptr`
 ClusterPtr StorageDistributed::skipUnusedShards(ClusterPtr cluster, const SelectQueryInfo & query_info)
 {
+    if (!has_sharding_key)
+    {
+        throw Exception("Internal error: cannot determine shards of a distributed table if no sharding expression is supplied", ErrorCodes::LOGICAL_ERROR);
+    }
+
     const auto & select = query_info.query->as<ASTSelectQuery &>();

-    if (!select.where() || !sharding_key_expr)
+    if (!select.prewhere() && !select.where())
+    {
         return nullptr;
+    }

-    const auto & blocks = evaluateExpressionOverConstantCondition(select.where(), sharding_key_expr);
+    ASTPtr condition_ast;
+    if (select.prewhere() && select.where())
+    {
+        condition_ast = makeASTFunction("and", select.prewhere()->clone(), select.where()->clone());
+    }
+    else
+    {
+        condition_ast = select.prewhere() ? select.prewhere()->clone() : select.where()->clone();
+    }
+
+    const auto blocks = evaluateExpressionOverConstantCondition(condition_ast, sharding_key_expr);

     // Can't get definite answer if we can skip any shards
     if (!blocks)
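A sketch of the behaviour this hunk extends: with `optimize_skip_unused_shards`, the sharding-key constraint may now come from PREWHERE as well as WHERE. The cluster, database, and table names below are hypothetical.

```sql
SET optimize_skip_unused_shards = 1;
CREATE TABLE dist AS local_table
    ENGINE = Distributed(test_cluster, default, local_table, user_id);
-- Only the shard owning user_id = 42 is queried; a PREWHERE clause alone is now enough.
SELECT count() FROM dist PREWHERE user_id = 42;
```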
@@ -1956,10 +1956,37 @@ void StorageReplicatedMergeTree::cloneReplica(const String & source_replica, Coo
     }

     /// Add to the queue jobs to receive all the active parts that the reference/master replica has.
-    Strings parts = zookeeper->getChildren(source_path + "/parts");
-    ActiveDataPartSet active_parts_set(format_version, parts);
+    Strings source_replica_parts = zookeeper->getChildren(source_path + "/parts");
+    ActiveDataPartSet active_parts_set(format_version, source_replica_parts);

     Strings active_parts = active_parts_set.getParts();
+
+    /// Remove local parts if source replica does not have them, because such parts will never be fetched by other replicas.
+    Strings local_parts_in_zk = zookeeper->getChildren(replica_path + "/parts");
+    Strings parts_to_remove_from_zk;
+    for (const auto & part : local_parts_in_zk)
+    {
+        if (active_parts_set.getContainingPart(part).empty())
+        {
+            queue.remove(zookeeper, part);
+            parts_to_remove_from_zk.emplace_back(part);
+            LOG_WARNING(log, "Source replica does not have part " << part << ". Removing it from ZooKeeper.");
+        }
+    }
+    tryRemovePartsFromZooKeeperWithRetries(parts_to_remove_from_zk);
+
+    auto local_active_parts = getDataParts();
+    DataPartsVector parts_to_remove_from_working_set;
+    for (const auto & part : local_active_parts)
+    {
+        if (active_parts_set.getContainingPart(part->name).empty())
+        {
+            parts_to_remove_from_working_set.emplace_back(part);
+            LOG_WARNING(log, "Source replica does not have part " << part->name << ". Removing it from working set.");
+        }
+    }
+    removePartsFromWorkingSet(parts_to_remove_from_working_set, true);
+
     for (const String & name : active_parts)
     {
         LogEntry log_entry;
@@ -148,10 +148,9 @@ StoragesInfoStream::StoragesInfoStream(const SelectQueryInfo & query_info, const

 StoragesInfo StoragesInfoStream::next()
 {
-    StoragesInfo info;
-
     while (next_row < rows)
     {
+        StoragesInfo info;
+
         info.database = (*database_column)[next_row].get<String>();
         info.table = (*table_column)[next_row].get<String>();

@@ -198,10 +197,10 @@ StoragesInfo StoragesInfoStream::next()
         if (!info.data)
             throw Exception("Unknown engine " + info.engine, ErrorCodes::LOGICAL_ERROR);

-        break;
+        return info;
     }

-    return info;
+    return {};
 }

 BlockInputStreams StorageSystemPartsBase::read(
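The stream fixed above feeds `system.parts` and related tables; a minimal query that exercises it:

```sql
SELECT database, table, name, active FROM system.parts LIMIT 5;
```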
@@ -288,6 +288,17 @@ def test_mixed_granularity_single_node(start_dynamic_cluster, node):

     node.exec_in_container(["bash", "-c", "find {p} -name '*.mrk' | grep '.*'".format(p=path_to_old_part)]) # check that we have non adaptive files

+    node.query("ALTER TABLE table_with_default_granularity UPDATE dummy = dummy + 1 WHERE 1")
+    # still works
+    assert node.query("SELECT count() from table_with_default_granularity") == '6\n'
+
+    node.query("ALTER TABLE table_with_default_granularity MODIFY COLUMN dummy String")
+    node.query("ALTER TABLE table_with_default_granularity ADD COLUMN dummy2 Float64")
+
+    # still works
+    assert node.query("SELECT count() from table_with_default_granularity") == '6\n'
+
+
 def test_version_update_two_nodes(start_dynamic_cluster):
     node11.query("INSERT INTO table_with_default_granularity VALUES (toDate('2018-10-01'), 1, 333), (toDate('2018-10-02'), 2, 444)")
     node12.query("SYSTEM SYNC REPLICA table_with_default_granularity")
@@ -0,0 +1,19 @@
+<yandex>
+    <remote_servers>
+        <test_cluster>
+            <shard>
+                <internal_replication>true</internal_replication>
+                <replica>
+                    <default_database>shard_0</default_database>
+                    <host>node1</host>
+                    <port>9000</port>
+                </replica>
+                <replica>
+                    <default_database>shard_0</default_database>
+                    <host>node2</host>
+                    <port>9000</port>
+                </replica>
+            </shard>
+        </test_cluster>
+    </remote_servers>
+</yandex>
@@ -0,0 +1,60 @@
import pytest

from helpers.cluster import ClickHouseCluster
from helpers.network import PartitionManager
from helpers.test_tools import assert_eq_with_retry


def fill_nodes(nodes, shard):
    for node in nodes:
        node.query(
        '''
            CREATE DATABASE test;

            CREATE TABLE test_table(date Date, id UInt32)
            ENGINE = ReplicatedMergeTree('/clickhouse/tables/test{shard}/replicated', '{replica}')
            ORDER BY id PARTITION BY toYYYYMM(date)
            SETTINGS min_replicated_logs_to_keep=3, max_replicated_logs_to_keep=5, cleanup_delay_period=0, cleanup_delay_period_random_add=0;
        '''.format(shard=shard, replica=node.name))


cluster = ClickHouseCluster(__file__)
node1 = cluster.add_instance('node1', main_configs=['configs/remote_servers.xml'], with_zookeeper=True)
node2 = cluster.add_instance('node2', main_configs=['configs/remote_servers.xml'], with_zookeeper=True)


@pytest.fixture(scope="module")
def start_cluster():
    try:
        cluster.start()
        fill_nodes([node1, node2], 1)
        yield cluster
    except Exception as ex:
        print ex
    finally:
        cluster.shutdown()


def test_inconsistent_parts_if_drop_while_replica_not_active(start_cluster):
    with PartitionManager() as pm:
        # insert into all replicas
        for i in range(50):
            node1.query("INSERT INTO test_table VALUES ('2019-08-16', {})".format(i))
        assert_eq_with_retry(node2, "SELECT count(*) FROM test_table", node1.query("SELECT count(*) FROM test_table"))

        # disable network on the first replica
        pm.partition_instances(node1, node2)
        pm.drop_instance_zk_connections(node1)

        # drop all parts on the second replica
        node2.query_with_retry("ALTER TABLE test_table DROP PARTITION 201908")
        assert_eq_with_retry(node2, "SELECT count(*) FROM test_table", "0")

        # insert into the second replica
        # DROP_RANGE will be removed from the replication log and the first replica will be lost
        for i in range(50):
            node2.query("INSERT INTO test_table VALUES ('2019-08-16', {})".format(50 + i))

        # the first replica will be cloned from the second
        pm.heal_all()
        assert_eq_with_retry(node1, "SELECT count(*) FROM test_table", node2.query("SELECT count(*) FROM test_table"))
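After `pm.heal_all()` the first replica is expected to re-clone its data from the second one. A query of the kind below (illustrative, not part of the test) is how that recovery is usually inspected on a live server:

    SELECT table, is_readonly, queue_size, absolute_delay
    FROM system.replicas
    WHERE table = 'test_table';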
@@ -16,3 +16,5 @@
 1
 1
 1
+1
+1
@@ -9,7 +9,6 @@ select 3 = windowFunnel(10000)(timestamp, event = 1000, event = 1001, event = 10
 select 4 = windowFunnel(10000)(timestamp, event = 1000, event = 1001, event = 1002, event = 1008) from funnel_test;
-

 select 1 = windowFunnel(1)(timestamp, event = 1000) from funnel_test;
 select 3 = windowFunnel(2)(timestamp, event = 1003, event = 1004, event = 1005, event = 1006, event = 1007) from funnel_test;
 select 4 = windowFunnel(3)(timestamp, event = 1003, event = 1004, event = 1005, event = 1006, event = 1007) from funnel_test;
@@ -39,6 +38,16 @@ select 1 = windowFunnel(10000)(timestamp, event = 1008, event = 1001) from funne
 select 5 = windowFunnel(4)(timestamp, event = 1003, event = 1004, event = 1005, event = 1006, event = 1007) from funnel_test_u64;
 select 4 = windowFunnel(4)(timestamp, event <= 1007, event >= 1002, event <= 1006, event >= 1004) from funnel_test_u64;
+
+drop table if exists funnel_test_strict;
+create table funnel_test_strict (timestamp UInt32, event UInt32) engine=Memory;
+insert into funnel_test_strict values (00,1000),(10,1001),(20,1002),(30,1003),(40,1004),(50,1005),(51,1005),(60,1006),(70,1007),(80,1008);
+
+select 6 = windowFunnel(10000, 'strict')(timestamp, event = 1000, event = 1001, event = 1002, event = 1003, event = 1004, event = 1005, event = 1006) from funnel_test_strict;
+select 7 = windowFunnel(10000)(timestamp, event = 1000, event = 1001, event = 1002, event = 1003, event = 1004, event = 1005, event = 1006) from funnel_test_strict;
+
 drop table funnel_test;
 drop table funnel_test2;
 drop table funnel_test_u64;
+drop table funnel_test_strict;
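The two expected values above differ because of the duplicated event `1005` inserted at timestamps 50 and 51: in the default mode the funnel still reaches all seven conditions, while the new `'strict'` mode appears to stop advancing once an already-matched event value repeats, giving 6. A minimal sketch of the same effect on a hypothetical table (the result comments follow that reading of the test, not documentation):

    create table funnel_strict_demo (timestamp UInt32, event UInt32) engine = Memory;
    insert into funnel_strict_demo values (0, 1), (1, 2), (2, 2), (3, 3);
    select windowFunnel(10)(timestamp, event = 1, event = 2, event = 3) from funnel_strict_demo;           -- expected 3
    select windowFunnel(10, 'strict')(timestamp, event = 1, event = 2, event = 3) from funnel_strict_demo; -- expected 2
    drop table funnel_strict_demo;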
@@ -0,0 +1,15 @@
OK
OK
1
OK
0
1
4
4
2
4
OK
OK
OK
OK
OK
@@ -0,0 +1,100 @@
#!/usr/bin/env bash

CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
. $CURDIR/../shell_config.sh

${CLICKHOUSE_CLIENT} --query "DROP TABLE IF EXISTS distributed_00754;"
${CLICKHOUSE_CLIENT} --query "DROP TABLE IF EXISTS mergetree_00754;"

${CLICKHOUSE_CLIENT} --query "
    CREATE TABLE mergetree_00754 (a Int64, b Int64, c String) ENGINE = MergeTree ORDER BY (a, b);
"
${CLICKHOUSE_CLIENT} --query "
    CREATE TABLE distributed_00754 AS mergetree_00754
    ENGINE = Distributed(test_unavailable_shard, ${CLICKHOUSE_DATABASE}, mergetree_00754, jumpConsistentHash(a+b, 2));
"

${CLICKHOUSE_CLIENT} --query "INSERT INTO mergetree_00754 VALUES (0, 0, 'Hello');"
${CLICKHOUSE_CLIENT} --query "INSERT INTO mergetree_00754 VALUES (1, 0, 'World');"
${CLICKHOUSE_CLIENT} --query "INSERT INTO mergetree_00754 VALUES (0, 1, 'Hello');"
${CLICKHOUSE_CLIENT} --query "INSERT INTO mergetree_00754 VALUES (1, 1, 'World');"

# Should fail because the second shard is unavailable
${CLICKHOUSE_CLIENT} --query "SELECT count(*) FROM distributed_00754;" 2>&1 \
| fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'

# Should fail without setting `optimize_skip_unused_shards` = 1
${CLICKHOUSE_CLIENT} --query "SELECT count(*) FROM distributed_00754 PREWHERE a = 0 AND b = 0;" 2>&1 \
| fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'

# Should pass now
${CLICKHOUSE_CLIENT} -n --query="
    SET optimize_skip_unused_shards = 1;
    SELECT count(*) FROM distributed_00754 PREWHERE a = 0 AND b = 0;
"

# Should still fail because of matching unavailable shard
${CLICKHOUSE_CLIENT} -n --query="
    SET optimize_skip_unused_shards = 1;
    SELECT count(*) FROM distributed_00754 PREWHERE a = 2 AND b = 2;
" 2>&1 \
| fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'

# Try more complex expressions for constant folding - all should pass.

${CLICKHOUSE_CLIENT} -n --query="
    SET optimize_skip_unused_shards = 1;
    SELECT count(*) FROM distributed_00754 PREWHERE a = 1 AND a = 0 WHERE b = 0;
"

${CLICKHOUSE_CLIENT} -n --query="
    SET optimize_skip_unused_shards = 1;
    SELECT count(*) FROM distributed_00754 PREWHERE a = 1 WHERE b = 1 AND length(c) = 5;
"

${CLICKHOUSE_CLIENT} -n --query="
    SET optimize_skip_unused_shards = 1;
    SELECT count(*) FROM distributed_00754 PREWHERE a IN (0, 1) AND b IN (0, 1) WHERE c LIKE '%l%';
"

${CLICKHOUSE_CLIENT} -n --query="
    SET optimize_skip_unused_shards = 1;
    SELECT count(*) FROM distributed_00754 PREWHERE a IN (0, 1) WHERE b IN (0, 1) AND c LIKE '%l%';
"

${CLICKHOUSE_CLIENT} -n --query="
    SET optimize_skip_unused_shards = 1;
    SELECT count(*) FROM distributed_00754 PREWHERE a = 0 AND b = 0 OR a = 1 AND b = 1 WHERE c LIKE '%l%';
"

${CLICKHOUSE_CLIENT} -n --query="
    SET optimize_skip_unused_shards = 1;
    SELECT count(*) FROM distributed_00754 PREWHERE (a = 0 OR a = 1) WHERE (b = 0 OR b = 1);
"

# These should fail.

${CLICKHOUSE_CLIENT} -n --query="
    SET optimize_skip_unused_shards = 1;
    SELECT count(*) FROM distributed_00754 PREWHERE a = 0 AND b <= 1;
" 2>&1 \
| fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'

${CLICKHOUSE_CLIENT} -n --query="
    SET optimize_skip_unused_shards = 1;
    SELECT count(*) FROM distributed_00754 PREWHERE a = 0 WHERE c LIKE '%l%';
" 2>&1 \
| fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'

${CLICKHOUSE_CLIENT} -n --query="
    SET optimize_skip_unused_shards = 1;
    SELECT count(*) FROM distributed_00754 PREWHERE a = 0 OR a = 1 AND b = 0;
" 2>&1 \
| fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'

${CLICKHOUSE_CLIENT} -n --query="
    SET optimize_skip_unused_shards = 1;
    SELECT count(*) FROM distributed_00754 PREWHERE a = 0 AND b = 0 OR a = 2 AND b = 2;
" 2>&1 \
| fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'

${CLICKHOUSE_CLIENT} -n --query="
    SET optimize_skip_unused_shards = 1;
    SELECT count(*) FROM distributed_00754 PREWHERE a = 0 AND b = 0 OR c LIKE '%l%';
" 2>&1 \
| fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'
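The pass/fail split above follows from how `optimize_skip_unused_shards` prunes shards: the sharding key of `distributed_00754` is `jumpConsistentHash(a + b, 2)`, so a shard can be skipped only when the `PREWHERE`/`WHERE` conditions pin `a` and `b` to concrete values over which the key expression can be constant-folded; range conditions such as `b <= 1`, or disjunctions that also admit rows on the unavailable shard, cannot be pruned and the query still fails. A hedged sketch of checking the mapping by hand (the exact shard numbers depend on the hash):

    SELECT jumpConsistentHash(0 + 0, 2) AS shard_for_0_0,
           jumpConsistentHash(2 + 2, 2) AS shard_for_2_2;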
@@ -0,0 +1,6 @@
[249.25,499.5,749.75,899.9,949.9499999999999,989.99,998.999]
[249.75,499.5,749.25,899.1,949.05,989.01,998.001]
[250,500,750,900,950,990,999]
599.6
599.4
600
@@ -0,0 +1,12 @@
DROP TABLE IF EXISTS num;
CREATE TABLE num AS numbers(1000);

SELECT quantilesExactExclusive(0.25, 0.5, 0.75, 0.9, 0.95, 0.99, 0.999)(x) FROM (SELECT number AS x FROM num);
SELECT quantilesExactInclusive(0.25, 0.5, 0.75, 0.9, 0.95, 0.99, 0.999)(x) FROM (SELECT number AS x FROM num);
SELECT quantilesExact(0.25, 0.5, 0.75, 0.9, 0.95, 0.99, 0.999)(x) FROM (SELECT number AS x FROM num);

SELECT quantileExactExclusive(0.6)(x) FROM (SELECT number AS x FROM num);
SELECT quantileExactInclusive(0.6)(x) FROM (SELECT number AS x FROM num);
SELECT quantileExact(0.6)(x) FROM (SELECT number AS x FROM num);

DROP TABLE num;
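The new exclusive/inclusive results in the reference file can be reproduced by hand, assuming the usual exclusive/inclusive interpolation rules over the sorted values 0..999 (N = 1000, p = 0.6): the exclusive variant uses rank h = p*(N+1) = 600.6, giving 599 + 0.6 = 599.6; the inclusive variant uses h = p*(N-1)+1 = 600.4, giving 599 + 0.4 = 599.4; plain `quantileExact` returns the single element 600. A compact check (illustrative):

    SELECT quantileExactExclusive(0.6)(number) AS excl,   -- 599.6
           quantileExactInclusive(0.6)(number) AS incl,   -- 599.4
           quantileExact(0.6)(number)          AS exact   -- 600
    FROM numbers(1000);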
@@ -0,0 +1,7 @@
OK1
OK2
OK3
OK4
OK5
2019-08-11 world
2019-08-12 hello
dbms/tests/queries/0_stateless/00981_in_subquery_with_tuple.sh (new executable file, 21 lines)
@@ -0,0 +1,21 @@
#!/usr/bin/env bash

CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
. $CURDIR/../shell_config.sh

$CLICKHOUSE_CLIENT --query="DROP TABLE IF EXISTS bug";
$CLICKHOUSE_CLIENT --query="CREATE TABLE bug (d Date, s String) ENGINE = MergeTree(d, s, 8192)";
$CLICKHOUSE_CLIENT --query="INSERT INTO bug VALUES ('2019-08-09', 'hello'), ('2019-08-10', 'world'), ('2019-08-11', 'world'), ('2019-08-12', 'hello')";

#SET force_primary_key = 1;

$CLICKHOUSE_CLIENT --query="SELECT * FROM bug WHERE (s, d) IN (SELECT (s, max(d)) FROM bug GROUP BY s) ORDER BY d" 2>&1 | grep "Number of columns in section IN doesn't match" > /dev/null && echo "OK1";
$CLICKHOUSE_CLIENT --query="SELECT * FROM bug WHERE (s, d, s) IN (SELECT s, max(d) FROM bug GROUP BY s)" 2>&1 | grep "Number of columns in section IN doesn't match" > /dev/null && echo "OK2";
$CLICKHOUSE_CLIENT --query="SELECT * FROM bug WHERE (s, d) IN (SELECT s, max(d), s FROM bug GROUP BY s)" 2>&1 | grep "Number of columns in section IN doesn't match" > /dev/null && echo "OK3";

$CLICKHOUSE_CLIENT --query="SELECT * FROM bug WHERE (s, toDateTime(d)) IN (SELECT s, max(d) FROM bug GROUP BY s)" 2>&1 | grep "Types of column 2 in section IN don't match" > /dev/null && echo "OK4";
$CLICKHOUSE_CLIENT --query="SELECT * FROM bug WHERE (s, d) IN (SELECT s, toDateTime(max(d)) FROM bug GROUP BY s)" 2>&1 | grep "Types of column 2 in section IN don't match" > /dev/null && echo "OK5";

$CLICKHOUSE_CLIENT --query="SELECT * FROM bug WHERE (s, d) IN (SELECT s, max(d) FROM bug GROUP BY s) ORDER BY d";

$CLICKHOUSE_CLIENT --query="DROP TABLE bug";
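The `OK1`..`OK3` cases fail on purpose: `(s, d)` on the left side of `IN` is two columns, so the subquery must also return exactly two columns, and `(SELECT (s, max(d)) ...)` returns a single tuple column instead. `OK4`/`OK5` then check that the column types must match as well. A stripped-down sketch of the well-formed variant, reusing the test's `bug` table:

    -- two columns on both sides of IN, matching types: accepted
    SELECT * FROM bug WHERE (s, d) IN (SELECT s, max(d) FROM bug GROUP BY s) ORDER BY d;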
@@ -7,9 +7,9 @@
     "docker/test/performance": "yandex/clickhouse-performance-test",
     "docker/test/pvs": "yandex/clickhouse-pvs-test",
     "docker/test/stateful": "yandex/clickhouse-stateful-test",
-    "docker/test/stateful_with_coverage": "yandex/clickhouse-stateful-with-coverage-test",
+    "docker/test/stateful_with_coverage": "yandex/clickhouse-stateful-test-with-coverage",
     "docker/test/stateless": "yandex/clickhouse-stateless-test",
-    "docker/test/stateless_with_coverage": "yandex/clickhouse-stateless-with-coverage-test",
+    "docker/test/stateless_with_coverage": "yandex/clickhouse-stateless-test-with-coverage",
     "docker/test/unit": "yandex/clickhouse-unit-test",
     "docker/test/stress": "yandex/clickhouse-stress-test",
     "dbms/tests/integration/image": "yandex/clickhouse-integration-tests-runner"
@@ -1,7 +1,7 @@
 # docker build -t yandex/clickhouse-stateful-test .
 FROM yandex/clickhouse-stateless-test

-RUN echo "deb [trusted=yes] http://apt.llvm.org/bionic/ llvm-toolchain-bionic main" >> /etc/apt/sources.list
+RUN echo "deb [trusted=yes] http://apt.llvm.org/bionic/ llvm-toolchain-bionic-9 main" >> /etc/apt/sources.list

 RUN apt-get update -y \
     && env DEBIAN_FRONTEND=noninteractive \
@@ -1,7 +1,7 @@
 # docker build -t yandex/clickhouse-stateless-with-coverage-test .
 FROM yandex/clickhouse-deb-builder

-RUN echo "deb [trusted=yes] http://apt.llvm.org/bionic/ llvm-toolchain-bionic main" >> /etc/apt/sources.list
+RUN echo "deb [trusted=yes] http://apt.llvm.org/bionic/ llvm-toolchain-bionic-9 main" >> /etc/apt/sources.list

 RUN apt-get update -y \
     && env DEBIAN_FRONTEND=noninteractive \
@@ -1,15 +1,14 @@
 # Roadmap

-## Q2 2019
+## Q3 2019

 - DDL for dictionaries
 - Integration with S3-like object stores
 - Multiple storages for hot/cold data, JBOD support

-## Q3 2019
+## Q4 2019

-- JOIN execution improvements:
-    - Distributed join not limited by memory
+- JOIN not limited by available memory
 - Resource pools for more precise distribution of cluster capacity between users
 - Fine-grained authorization
 - Integration with external authentication services