Merge branch 'master' into better-local-object-storage

Kseniia Sumarokova 2023-04-28 21:05:05 +02:00 committed by GitHub
commit d4aa96e262
32 changed files with 1043 additions and 1445 deletions

View File

@ -80,11 +80,9 @@ def process_test_log(log_path, broken_tests):
test_results.append(
(
test_name,
"FAIL",
"SKIPPED",
test_time,
[
"Test is expected to fail! Please, update broken_tests.txt!\n"
],
["This test passed. Update broken_tests.txt.\n"],
)
)
else:

View File

@ -72,7 +72,7 @@ cmake -S . -B build
cmake --build build # or: `cd build; ninja`
```
To create an executable, run `cmake --build --target clickhouse` (or: `cd build; ninja clickhouse`).
To create an executable, run `cmake --build build --target clickhouse` (or: `cd build; ninja clickhouse`).
This will create executable `build/programs/clickhouse` which can be used with `client` or `server` arguments.
## Building on Any Linux {#how-to-build-clickhouse-on-any-linux}

View File

@ -39,7 +39,7 @@ Samples must belong to continuous, one-dimensional probability distributions.
Which in fact means that F(x) <= G(x) for all x. And the alternative in this case is that F(x) > G(x) for at least one x.
- `computation_method` — the method used to compute p-value. (Optional, default: `'auto'`.) [String](../../../sql-reference/data-types/string.md).
- `'exact'` - calculation is performed using precise probability distribution of the test statistics. Compute intensive and wasteful except for small samples.
- `'asymp'` - calculation is performed using an approximation. For large sample sizes, the exact and asymptotic p-values are very similar.
- `'asymp'` (`'asymptotic'`) - calculation is performed using an approximation. For large sample sizes, the exact and asymptotic p-values are very similar.
- `'auto'` - the `'exact'` method is used when a maximum number of samples is less than 10'000.

View File

@ -0,0 +1,117 @@
---
slug: /ru/sql-reference/aggregate-functions/reference/kolmogorovsmirnovtest
sidebar_position: 300
sidebar_label: kolmogorovSmirnovTest
---
# kolmogorovSmirnovTest {#kolmogorovSmirnovTest}
Performs the Kolmogorov-Smirnov statistical test on two independent samples.
**Syntax**
``` sql
kolmogorovSmirnovTest([alternative, computation_method])(sample_data, sample_index)
```
Sample values are taken from the `sample_data` column. If `sample_index` equals 0, the value in that row belongs to the first sample. In all other cases the value belongs to the second sample.
Samples must belong to continuous, one-dimensional distributions.
**Arguments**
- `sample_data` - sample data. [Integer](../../../sql-reference/data-types/int-uint.md), [Float](../../../sql-reference/data-types/float.md) or [Decimal](../../../sql-reference/data-types/decimal.md).
- `sample_index` - sample indices. [Integer](../../../sql-reference/data-types/int-uint.md).
**Parameters**
- `alternative` - the alternative hypothesis. (Optional, default: `'two-sided'`.) [String](../../../sql-reference/data-types/string.md).
Let F(x) and G(x) be the distribution functions of the first and second samples, respectively.
- `'two-sided'`
The null hypothesis is that the samples come from the same distribution, that is, F(x) = G(x) for all x.
The alternative is that the samples belong to different distributions.
- `'greater'`
The null hypothesis is that values in the first sample are stochastically smaller than values in the second sample,
that is, the distribution function of the first sample lies above, and hence to the left of, the distribution function of the second sample.
This means that F(x) >= G(x) for all x, and the alternative in this case is that F(x) < G(x) for at least one x.
- `'less'`
The null hypothesis is that values in the first sample are stochastically greater than values in the second sample,
that is, the distribution function of the first sample lies below, and hence to the right of, the distribution function of the second sample.
This means that F(x) <= G(x) for all x, and the alternative in this case is that F(x) > G(x) for at least one x.
- `computation_method` - the method used to compute the p-value. (Optional, default: `'auto'`.) [String](../../../sql-reference/data-types/string.md).
- `'exact'` - the calculation uses the exact distribution of the test statistic. Computationally intensive and wasteful for large samples.
- `'asymp'` (`'asymptotic'`) - the calculation uses an approximation. For large samples the approximate and exact results are nearly identical.
- `'auto'` - the value is computed exactly (with the `'exact'` method) if the larger of the two sample sizes does not exceed 10'000.
**Returned values**
[Tuple](../../../sql-reference/data-types/tuple.md) with two elements:
- the calculated statistic. [Float64](../../../sql-reference/data-types/float.md).
- the calculated p-value. [Float64](../../../sql-reference/data-types/float.md).
**Example**
Query:
``` sql
SELECT kolmogorovSmirnovTest('less', 'exact')(value, num)
FROM
(
SELECT
randNormal(0, 10) AS value,
0 AS num
FROM numbers(10000)
UNION ALL
SELECT
randNormal(0, 10) AS value,
1 AS num
FROM numbers(10000)
)
```
Result:
``` text
┌─kolmogorovSmirnovTest('less', 'exact')(value, num)─┐
│ (0.009899999999999996,0.37528595205132287) │
└────────────────────────────────────────────────────┘
```
Note:
The p-value is greater than 0.05 (for a 95% confidence level), so the null hypothesis is not rejected.
Query:
``` sql
SELECT kolmogorovSmirnovTest('two-sided', 'exact')(value, num)
FROM
(
SELECT
randStudentT(10) AS value,
0 AS num
FROM numbers(100)
UNION ALL
SELECT
randNormal(0, 10) AS value,
1 AS num
FROM numbers(100)
)
```
Result:
``` text
┌─kolmogorovSmirnovTest('two-sided', 'exact')(value, num)─┐
│ (0.4100000000000002,6.61735760482795e-8) │
└─────────────────────────────────────────────────────────┘
```
Note:
The p-value is less than 0.05 (for a 95% confidence level), so the null hypothesis is rejected.
**See Also**
- [Kolmogorov-Smirnov goodness-of-fit test](https://ru.wikipedia.org/wiki/%D0%9A%D1%80%D0%B8%D1%82%D0%B5%D1%80%D0%B8%D0%B9_%D1%81%D0%BE%D0%B3%D0%BB%D0%B0%D1%81%D0%B8%D1%8F_%D0%9A%D0%BE%D0%BB%D0%BC%D0%BE%D0%B3%D0%BE%D1%80%D0%BE%D0%B2%D0%B0)

View File

@ -91,9 +91,9 @@ struct KolmogorovSmirnov : public StatisticalSample<Float64, Float64>
UInt64 ny_g = n2 / g;
if (method == "auto")
method = std::max(n1, n2) <= 10000 ? "exact" : "asymp";
method = std::max(n1, n2) <= 10000 ? "exact" : "asymptotic";
else if (method == "exact" && nx_g >= std::numeric_limits<Int32>::max() / ny_g)
method = "asymp";
method = "asymptotic";
Float64 p_value = std::numeric_limits<Float64>::infinity();
@ -143,7 +143,7 @@ struct KolmogorovSmirnov : public StatisticalSample<Float64, Float64>
}
p_value = c[n1];
}
else if (method == "asymp")
else if (method == "asymp" || method == "asymptotic")
{
Float64 n = std::min(n1, n2);
Float64 m = std::max(n1, n2);
@ -242,9 +242,9 @@ public:
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Aggregate function {} require second parameter to be a String", getName());
method = params[1].get<String>();
if (method != "auto" && method != "exact" && method != "asymp")
if (method != "auto" && method != "exact" && method != "asymp" && method != "asymptotic")
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Unknown method in aggregate function {}. "
"It must be one of: 'auto', 'exact', 'asymp'", getName());
"It must be one of: 'auto', 'exact', 'asymp' (or 'asymptotic')", getName());
}
String getName() const override
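
The method resolution touched by the hunks above can be paraphrased in a few lines. The following is an illustrative Python sketch, not ClickHouse code: `resolve_method` is a hypothetical name, `g` is assumed to be the greatest common divisor of the two sample sizes (its definition lies outside the hunk), and normalizing `'asymp'` to `'asymptotic'` is a simplification, since the C++ branch simply accepts both spellings.

```python
import math

INT32_MAX = 2**31 - 1  # mirrors std::numeric_limits<Int32>::max()
KNOWN_METHODS = ("auto", "exact", "asymp", "asymptotic")


def resolve_method(method: str, n1: int, n2: int) -> str:
    """Return the computation method that would actually run.

    n1 and n2 are the (positive) sizes of the two samples.
    """
    if method not in KNOWN_METHODS:
        raise ValueError(
            "Unknown method. It must be one of: "
            "'auto', 'exact', 'asymp' (or 'asymptotic')"
        )
    if n1 <= 0 or n2 <= 0:
        raise ValueError("both samples must be non-empty")

    # Assumption: g is gcd(n1, n2); the hunk only shows `ny_g = n2 / g`.
    g = math.gcd(n1, n2)
    nx_g = n1 // g
    ny_g = n2 // g

    if method == "auto":
        return "exact" if max(n1, n2) <= 10000 else "asymptotic"
    if method == "exact" and nx_g >= INT32_MAX // ny_g:
        # Presumably an overflow guard: fall back to the approximation.
        return "asymptotic"
    if method == "asymp":
        return "asymptotic"  # short alias, normalized here for simplicity
    return method
```

With this reading, `resolve_method("auto", 100, 100)` picks the exact method, while any sample larger than 10000 rows switches `'auto'` to the asymptotic one.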

View File

@ -4081,12 +4081,12 @@ ProjectionNames QueryAnalyzer::resolveMatcher(QueryTreeNodePtr & matcher_node, I
if (apply_transformer_was_used || replace_transformer_was_used)
continue;
replace_transformer_was_used = true;
auto replace_expression = replace_transformer->findReplacementExpression(column_name);
if (!replace_expression)
continue;
replace_transformer_was_used = true;
if (replace_transformer->isStrict())
strict_transformer_to_used_column_names[replace_transformer].insert(column_name);
@ -6679,7 +6679,9 @@ void QueryAnalyzer::resolveQuery(const QueryTreeNodePtr & query_node, Identifier
bool is_rollup_or_cube = query_node_typed.isGroupByWithRollup() || query_node_typed.isGroupByWithCube();
if (query_node_typed.isGroupByWithGroupingSets() && query_node_typed.isGroupByWithTotals())
if (query_node_typed.isGroupByWithGroupingSets()
&& query_node_typed.isGroupByWithTotals()
&& query_node_typed.getGroupBy().getNodes().size() != 1)
throw Exception(ErrorCodes::NOT_IMPLEMENTED, "WITH TOTALS and GROUPING SETS are not supported together");
if (query_node_typed.isGroupByWithGroupingSets() && is_rollup_or_cube)

View File

@ -67,8 +67,15 @@ AsynchronousMetrics::AsynchronousMetrics(
openFileIfExists("/proc/uptime", uptime);
openFileIfExists("/proc/net/dev", net_dev);
openFileIfExists("/sys/fs/cgroup/memory/memory.limit_in_bytes", cgroupmem_limit_in_bytes);
openFileIfExists("/sys/fs/cgroup/memory/memory.usage_in_bytes", cgroupmem_usage_in_bytes);
/// CGroups v2
openFileIfExists("/sys/fs/cgroup/memory.max", cgroupmem_limit_in_bytes);
openFileIfExists("/sys/fs/cgroup/memory.current", cgroupmem_usage_in_bytes);
/// CGroups v1
if (!cgroupmem_limit_in_bytes)
openFileIfExists("/sys/fs/cgroup/memory/memory.limit_in_bytes", cgroupmem_limit_in_bytes);
if (!cgroupmem_usage_in_bytes)
openFileIfExists("/sys/fs/cgroup/memory/memory.usage_in_bytes", cgroupmem_usage_in_bytes);
openSensors();
openBlockDevices();
@ -900,33 +907,25 @@ void AsynchronousMetrics::update(TimePoint update_time)
if (cgroupmem_limit_in_bytes && cgroupmem_usage_in_bytes)
{
try {
try
{
cgroupmem_limit_in_bytes->rewind();
cgroupmem_usage_in_bytes->rewind();
uint64_t cgroup_mem_limit_in_bytes = 0;
uint64_t cgroup_mem_usage_in_bytes = 0;
uint64_t limit = 0;
uint64_t usage = 0;
readText(cgroup_mem_limit_in_bytes, *cgroupmem_limit_in_bytes);
readText(cgroup_mem_usage_in_bytes, *cgroupmem_usage_in_bytes);
tryReadText(limit, *cgroupmem_limit_in_bytes);
tryReadText(usage, *cgroupmem_usage_in_bytes);
if (cgroup_mem_limit_in_bytes && cgroup_mem_usage_in_bytes)
{
new_values["CgroupMemoryTotal"] = { cgroup_mem_limit_in_bytes, "The total amount of memory in cgroup, in bytes." };
new_values["CgroupMemoryUsed"] = { cgroup_mem_usage_in_bytes, "The amount of memory used in cgroup, in bytes." };
}
else
{
LOG_DEBUG(log, "Cannot read statistics about the cgroup memory total and used. Total got '{}', Used got '{}'.",
cgroup_mem_limit_in_bytes, cgroup_mem_usage_in_bytes);
}
new_values["CGroupMemoryTotal"] = { limit, "The total amount of memory in cgroup, in bytes. If stated zero, the limit is the same as OSMemoryTotal." };
new_values["CGroupMemoryUsed"] = { usage, "The amount of memory used in cgroup, in bytes." };
}
catch (...)
{
tryLogCurrentException(__PRETTY_FUNCTION__);
}
}
if (meminfo)
{
try
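
The lookup order introduced in this file (cgroups v2 files first, v1 files as a fallback) can be sketched as follows. This is a hedged Python illustration, not the ClickHouse implementation: the function names are hypothetical, the `root` parameter exists only so the sketch can be exercised outside `/sys/fs/cgroup`, and treating an unparsable value such as `max` as zero is an assumption inferred from the metric description ("If stated zero, the limit is the same as OSMemoryTotal").

```python
from pathlib import Path

# Relative paths taken from the diff; v2 is tried before v1.
LIMIT_CANDIDATES = ["memory.max", "memory/memory.limit_in_bytes"]
USAGE_CANDIDATES = ["memory.current", "memory/memory.usage_in_bytes"]


def first_existing(root: Path, candidates):
    """Return the first candidate file that exists under root, or None."""
    for relative in candidates:
        path = root / relative
        if path.is_file():
            return path
    return None


def read_uint_or_zero(path: Path) -> int:
    """Parse an integer; yield 0 when the content is not a number.

    Assumption: this imitates a 'try read' that leaves the value at zero,
    e.g. when cgroups v2 reports the literal string 'max' for no limit.
    """
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return 0


def read_cgroup_memory(root: Path = Path("/sys/fs/cgroup")):
    """Return the two metrics, or None when either file is unavailable."""
    limit_path = first_existing(root, LIMIT_CANDIDATES)
    usage_path = first_existing(root, USAGE_CANDIDATES)
    if limit_path is None or usage_path is None:
        return None
    return {
        "CGroupMemoryTotal": read_uint_or_zero(limit_path),
        "CGroupMemoryUsed": read_uint_or_zero(usage_path),
    }
```

Resolving the limit and usage files independently follows the diff, where each v1 path is opened only if the corresponding v2 file was not found.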

View File

@ -24,6 +24,9 @@ public:
explicit FunctionCaseWithExpression(ContextPtr context_) : context(context_) {}
bool isVariadic() const override { return true; }
bool useDefaultImplementationForConstants() const override { return false; }
bool useDefaultImplementationForNulls() const override { return false; }
bool useDefaultImplementationForNothing() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
size_t getNumberOfArguments() const override { return 0; }
String getName() const override { return name; }

File diff suppressed because it is too large.

View File

@ -655,11 +655,8 @@ bool FileCache::tryReserve(FileSegment & file_segment, size_t size)
{
auto locked_key = deletion_info.getMetadata().tryLock();
if (!locked_key)
{
/// key could become invalid after we released the key lock above, just skip it.
chassert(locked_key->getKeyState() != KeyMetadata::KeyState::ACTIVE);
continue;
}
continue; /// key could become invalid after we released the key lock above, just skip it.
for (auto it = deletion_info.begin(); it != deletion_info.end();)
{
chassert((*it)->releasable());

View File

@ -219,7 +219,7 @@ bool ExecutingGraph::updateNode(uint64_t pid, Queue & queue, Queue & async_queue
std::stack<uint64_t> updated_processors;
updated_processors.push(pid);
UpgradableMutex::ReadGuard read_lock(nodes_mutex);
std::shared_lock read_lock(nodes_mutex);
while (!updated_processors.empty() || !updated_edges.empty())
{
@ -382,11 +382,14 @@ bool ExecutingGraph::updateNode(uint64_t pid, Queue & queue, Queue & async_queue
if (need_expand_pipeline)
{
// We do not need to upgrade lock atomically, so we can safely release shared_lock and acquire unique_lock
read_lock.unlock();
{
UpgradableMutex::WriteGuard lock(read_lock);
std::unique_lock lock(nodes_mutex);
if (!expandPipeline(updated_processors, pid))
return false;
}
read_lock.lock();
/// Add itself back to be prepared again.
updated_processors.push(pid);
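
The locking pattern in this hunk (release the shared lock, take the exclusive lock for the expansion, then re-acquire the shared lock) can be illustrated outside C++. The sketch below is Python and rests on several assumptions: the Python standard library has no shared mutex, so `SharedMutex` here is a minimal stand-in written for this example (no fairness guarantees, writers can starve), and `update_node` with its `nodes` list is a hypothetical simplification of the real function.

```python
import threading


class SharedMutex:
    """Minimal readers-writer lock, for illustration only."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def lock_shared(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def unlock_shared(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def lock(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def unlock(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


def update_node(mutex, nodes, need_expand):
    """Read under a shared lock; expand under an exclusive lock if needed."""
    mutex.lock_shared()
    try:
        seen = len(nodes)
        if need_expand:
            # The upgrade is not atomic: other threads may run between
            # unlock_shared() and lock(), so state is re-read afterwards.
            mutex.unlock_shared()
            try:
                mutex.lock()
                try:
                    nodes.append(f"node{len(nodes)}")
                finally:
                    mutex.unlock()
            finally:
                mutex.lock_shared()
            seen = len(nodes)
        return seen
    finally:
        mutex.unlock_shared()
```

Because the upgrade is not atomic, anything read before the shared lock was released has to be treated as stale once it is re-acquired, which is why the sketch re-reads `len(nodes)`.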

View File

@ -2,7 +2,7 @@
#include <Processors/Port.h>
#include <Processors/IProcessor.h>
#include <Processors/Executors/UpgradableLock.h>
#include <Common/SharedMutex.h>
#include <mutex>
#include <queue>
#include <stack>
@ -156,7 +156,7 @@ private:
std::vector<bool> source_processors;
std::mutex processors_mutex;
UpgradableMutex nodes_mutex;
SharedMutex nodes_mutex;
const bool profile_processors;
bool cancelled = false;

View File

@ -1,175 +0,0 @@
#pragma once
#include <atomic>
#include <cassert>
#include <list>
#include <mutex>
#include <condition_variable>
namespace DB
{
/// RWLock which allows to upgrade read lock to write lock.
/// Read locks should be fast if there is no write lock.
///
/// Newly created write lock waits for all active read locks.
/// Newly created read lock waits for all write locks. Starvation is possible.
///
/// Mutex must live longer than locks.
/// Read lock must live longer than corresponding write lock.
///
/// For every write lock, a new internal state is created inside mutex.
/// This state is not deallocated until the destruction of mutex itself.
///
/// Usage example:
///
/// UpgradableMutex mutex;
/// {
/// UpgradableMutex::ReadLock read_lock(mutex);
/// ...
/// {
/// UpgradableMutex::WriteLock write_lock(read_lock);
/// ...
/// }
/// ...
/// }
class UpgradableMutex
{
private:
/// Implementation idea
///
/// ----------- (read scope)
/// ++num_readers
/// ** wait for active writer (in loop, starvation is possible here) **
///
/// =========== (write scope)
/// ** create new State **
/// ** wait for active writer (in loop, starvation is possible here) **
/// ** wait for all active readers **
///
/// ** notify all waiting readers for the current state.
/// =========== (end write scope)
///
/// --num_readers
/// ** notify current active writer **
/// ----------- (end read scope)
struct State
{
size_t num_waiting = 0;
bool is_done = false;
std::mutex mutex;
std::condition_variable read_condvar;
std::condition_variable write_condvar;
void wait() noexcept
{
std::unique_lock lock(mutex);
++num_waiting;
write_condvar.notify_one();
while (!is_done)
read_condvar.wait(lock);
}
void lock(std::atomic_size_t & num_readers_) noexcept
{
/// Note : num_locked is an atomic
/// which can change it's value without locked mutex.
/// We support an invariant that after changing num_locked value,
/// UpgradableMutex::write_state is checked, and in case of active
/// write lock, we always notify it's write condvar.
std::unique_lock lock(mutex);
++num_waiting;
while (num_waiting < num_readers_.load())
write_condvar.wait(lock);
}
void unlock() noexcept
{
{
std::unique_lock lock(mutex);
is_done = true;
}
read_condvar.notify_all();
}
};
std::atomic_size_t num_readers = 0;
std::list<State> states;
std::mutex states_mutex;
std::atomic<State *> write_state{nullptr};
void lock() noexcept
{
++num_readers;
while (auto * state = write_state.load())
state->wait();
}
void unlock() noexcept
{
--num_readers;
while (auto * state = write_state.load())
state->write_condvar.notify_one();
}
State * allocState()
{
std::lock_guard guard(states_mutex);
return &states.emplace_back();
}
void upgrade(State & state) noexcept
{
State * expected = nullptr;
/// Only change nullptr -> state is possible.
while (!write_state.compare_exchange_strong(expected, &state))
{
expected->wait();
expected = nullptr;
}
state.lock(num_readers);
}
void degrade(State & state) noexcept
{
State * my = write_state.exchange(nullptr);
if (&state != my)
std::terminate();
state.unlock();
}
public:
class ReadGuard
{
public:
explicit ReadGuard(UpgradableMutex & lock_) : lock(lock_) { lock.lock(); }
~ReadGuard() { lock.unlock(); }
UpgradableMutex & lock;
};
class WriteGuard
{
public:
explicit WriteGuard(ReadGuard & read_guard_) : read_guard(read_guard_)
{
state = read_guard.lock.allocState();
read_guard.lock.upgrade(*state);
}
~WriteGuard()
{
if (state)
read_guard.lock.degrade(*state);
}
private:
ReadGuard & read_guard;
State * state = nullptr;
};
};
}

View File

@ -7617,7 +7617,7 @@ bool StorageReplicatedMergeTree::waitForProcessingQueue(UInt64 max_wait_millisec
background_operations_assignee.trigger();
std::unordered_set<String> wait_for_ids;
bool was_interrupted = false;
std::atomic_bool was_interrupted = false;
Poco::Event target_entry_event;
auto callback = [this, &target_entry_event, &wait_for_ids, &was_interrupted, sync_mode]

View File

@ -123,11 +123,7 @@
02713_array_low_cardinality_string
02707_skip_index_with_in
02707_complex_query_fails_analyzer
02699_polygons_sym_difference_rollup
02680_mysql_ast_logical_err
02677_analyzer_bitmap_has_any
02661_quantile_approx
02540_duplicate_primary_key2
02516_join_with_totals_and_subquery_bug
02324_map_combinator_bug
02241_join_rocksdb_bs

View File

@ -11,6 +11,7 @@ import shutil
import sys
import os
import os.path
import platform
import signal
import re
import copy
@ -542,7 +543,10 @@ class SettingsRandomizer:
0.2, 0.5, 1, 10 * 1024 * 1024 * 1024
),
"local_filesystem_read_method": lambda: random.choice(
# Allow to use uring only when running on Linux
["read", "pread", "mmap", "pread_threadpool", "io_uring"]
if platform.system().lower() == "linux"
else ["read", "pread", "mmap", "pread_threadpool"]
),
"remote_filesystem_read_method": lambda: random.choice(["read", "threadpool"]),
"local_filesystem_read_prefetch": lambda: random.randint(0, 1),

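
The platform check added to the settings randomizer can be isolated into a small helper. This is a sketch under assumptions: the helper names and the optional `system` parameter are hypothetical and exist only to make the choice testable without changing the host OS.

```python
import platform
import random

BASE_METHODS = ["read", "pread", "mmap", "pread_threadpool"]


def local_read_methods(system=None):
    """Return the read methods eligible for randomization on this OS."""
    name = platform.system() if system is None else system
    if name.lower() == "linux":
        # io_uring is a Linux-only interface, so it is offered only there.
        return BASE_METHODS + ["io_uring"]
    return list(BASE_METHODS)


def random_local_read_method(system=None, rng=random):
    """Pick one eligible method at random."""
    return rng.choice(local_read_methods(system))
```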
View File

@ -1,4 +1,4 @@
SELECT transform(1, [1], [toDecimal32(1, 2)]); -- { serverError 44 }
SELECT transform(1, [1], [toDecimal32(1, 2)]);
SELECT transform(toDecimal32(number, 2), [toDecimal32(3, 2)], [toDecimal32(30, 2)]) FROM system.numbers LIMIT 10;
SELECT transform(toDecimal32(number, 2), [toDecimal32(3, 2)], [toDecimal32(30, 2)], toDecimal32(1000, 2)) FROM system.numbers LIMIT 10;
SELECT transform(number, [3, 5, 11], [toDecimal32(30, 2), toDecimal32(50, 2), toDecimal32(70,2)], toDecimal32(1000, 2)) FROM system.numbers LIMIT 10;

View File

@ -1,2 +1,2 @@
WITH 2 AS `b.c`, [4, 5] AS a, 6 AS u, 3 AS v, 2 AS d, TRUE AS e, 1 AS f, 0 AS g, 2 AS h, 'Hello' AS i, 'World' AS j, TIMESTAMP '2022-02-02 02:02:02' AS w, [] AS k, (1, 2) AS l, 2 AS m, 3 AS n, [] AS o, [1] AS p, 1 AS q, q AS r, 1 AS s, 1 AS t
WITH 2 AS `b.c`, [4, 5] AS a, 6 AS u, 3 AS v, 2 AS d, TRUE AS e, 1 AS f, 0 AS g, 2 AS h, 'Hello' AS i, 'World' AS j, 'hi' AS w, NULL AS k, (1, 2) AS l, 2 AS m, 3 AS n, [] AS o, [1] AS p, 1 AS q, q AS r, 1 AS s, 1 AS t
SELECT INTERVAL CASE CASE WHEN NOT -a[`b.c`] * u DIV v + d IS NOT NULL AND e OR f BETWEEN g AND h THEN i ELSE j END WHEN w THEN k END || [l, (m, n)] MINUTE IS NULL OR NOT o::Array(INT) = p <> q < r > s != t AS upyachka;

View File

@ -405,16 +405,6 @@ QUERY id: 0
TABLE id: 7, table_name: system.numbers
LIMIT
CONSTANT id: 17, constant_value: UInt64_10, constant_value_type: UInt64
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
SELECT transform(number, [NULL], _CAST([\'google\', \'censor.net\', \'yahoo\'], \'Array(Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2, \\\'other\\\' = 3, \\\'yahoo\\\' = 4))\'), _CAST(\'other\', \'Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2, \\\'other\\\' = 3, \\\'yahoo\\\' = 4)\'))
FROM
(
@ -424,56 +414,38 @@ FROM
)
QUERY id: 0
PROJECTION COLUMNS
transform(number, [NULL], [\'google\', \'censor.net\', \'yahoo\'], \'other\') Nullable(Nothing)
transform(number, [NULL], [\'google\', \'censor.net\', \'yahoo\'], \'other\') String
PROJECTION
LIST id: 1, nodes: 1
FUNCTION id: 2, function_name: transform, function_type: ordinary, result_type: Nullable(Nothing)
FUNCTION id: 2, function_name: toString, function_type: ordinary, result_type: String
ARGUMENTS
LIST id: 3, nodes: 4
COLUMN id: 4, column_name: number, result_type: Nullable(Nothing), source_id: 5
CONSTANT id: 6, constant_value: Array_[NULL], constant_value_type: Array(Nullable(Nothing))
CONSTANT id: 7, constant_value: Array_[\'google\', \'censor.net\', \'yahoo\'], constant_value_type: Array(String)
CONSTANT id: 8, constant_value: \'other\', constant_value_type: String
LIST id: 3, nodes: 1
FUNCTION id: 4, function_name: transform, function_type: ordinary, result_type: Enum8(\'censor.net\' = 1, \'google\' = 2, \'other\' = 3, \'yahoo\' = 4)
ARGUMENTS
LIST id: 5, nodes: 4
COLUMN id: 6, column_name: number, result_type: Nullable(Nothing), source_id: 7
CONSTANT id: 8, constant_value: Array_[NULL], constant_value_type: Array(Nullable(Nothing))
FUNCTION id: 9, function_name: _CAST, function_type: ordinary, result_type: Array(Enum8(\'censor.net\' = 1, \'google\' = 2, \'other\' = 3, \'yahoo\' = 4))
ARGUMENTS
LIST id: 10, nodes: 2
CONSTANT id: 11, constant_value: Array_[\'google\', \'censor.net\', \'yahoo\'], constant_value_type: Array(String)
CONSTANT id: 12, constant_value: \'Array(Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2, \\\'other\\\' = 3, \\\'yahoo\\\' = 4))\', constant_value_type: String
FUNCTION id: 13, function_name: _CAST, function_type: ordinary, result_type: Enum8(\'censor.net\' = 1, \'google\' = 2, \'other\' = 3, \'yahoo\' = 4)
ARGUMENTS
LIST id: 14, nodes: 2
CONSTANT id: 15, constant_value: \'other\', constant_value_type: String
CONSTANT id: 16, constant_value: \'Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2, \\\'other\\\' = 3, \\\'yahoo\\\' = 4)\', constant_value_type: String
JOIN TREE
QUERY id: 5, is_subquery: 1
QUERY id: 7, is_subquery: 1
PROJECTION COLUMNS
number Nullable(Nothing)
PROJECTION
LIST id: 9, nodes: 1
CONSTANT id: 10, constant_value: NULL, constant_value_type: Nullable(Nothing)
LIST id: 17, nodes: 1
CONSTANT id: 18, constant_value: NULL, constant_value_type: Nullable(Nothing)
JOIN TREE
TABLE id: 11, table_name: system.numbers
TABLE id: 19, table_name: system.numbers
LIMIT
CONSTANT id: 12, constant_value: UInt64_10, constant_value_type: UInt64
\N
\N
\N
\N
\N
\N
\N
\N
\N
\N
SELECT transform(number, NULL, _CAST([\'google\', \'censor.net\', \'yahoo\'], \'Array(Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2, \\\'other\\\' = 3, \\\'yahoo\\\' = 4))\'), _CAST(\'other\', \'Enum8(\\\'censor.net\\\' = 1, \\\'google\\\' = 2, \\\'other\\\' = 3, \\\'yahoo\\\' = 4)\'))
FROM system.numbers
LIMIT 10
QUERY id: 0
PROJECTION COLUMNS
transform(number, NULL, [\'google\', \'censor.net\', \'yahoo\'], \'other\') Nullable(Nothing)
PROJECTION
LIST id: 1, nodes: 1
FUNCTION id: 2, function_name: transform, function_type: ordinary, result_type: Nullable(Nothing)
ARGUMENTS
LIST id: 3, nodes: 4
COLUMN id: 4, column_name: number, result_type: UInt64, source_id: 5
CONSTANT id: 6, constant_value: NULL, constant_value_type: Nullable(Nothing)
CONSTANT id: 7, constant_value: Array_[\'google\', \'censor.net\', \'yahoo\'], constant_value_type: Array(String)
CONSTANT id: 8, constant_value: \'other\', constant_value_type: String
JOIN TREE
TABLE id: 5, table_name: system.numbers
LIMIT
CONSTANT id: 9, constant_value: UInt64_10, constant_value_type: UInt64
CONSTANT id: 20, constant_value: UInt64_10, constant_value_type: UInt64
other
other
google

View File

@ -33,13 +33,13 @@ SELECT transform(number, [2, 4, 6], ['google', 'censor.net', 'yahoo'], 'other')
EXPLAIN SYNTAX SELECT transform(number, [2, 4, 6], ['google', 'censor.net', 'yahoo'], 'other') as value, value FROM system.numbers LIMIT 10;
EXPLAIN QUERY TREE run_passes = 1 SELECT transform(number, [2, 4, 6], ['google', 'censor.net', 'yahoo'], 'other') as value, value FROM system.numbers LIMIT 10;
SELECT transform(number, [NULL], ['google', 'censor.net', 'yahoo'], 'other') FROM (SELECT NULL as number FROM system.numbers LIMIT 10);
SELECT transform(number, [NULL], ['google', 'censor.net', 'yahoo'], 'other') FROM (SELECT NULL as number FROM system.numbers LIMIT 10); -- { serverError 36 }
EXPLAIN SYNTAX SELECT transform(number, [NULL], ['google', 'censor.net', 'yahoo'], 'other') FROM (SELECT NULL as number FROM system.numbers LIMIT 10);
EXPLAIN QUERY TREE run_passes = 1 SELECT transform(number, [NULL], ['google', 'censor.net', 'yahoo'], 'other') FROM (SELECT NULL as number FROM system.numbers LIMIT 10);
SELECT transform(number, NULL, ['google', 'censor.net', 'yahoo'], 'other') FROM system.numbers LIMIT 10;
EXPLAIN SYNTAX SELECT transform(number, NULL, ['google', 'censor.net', 'yahoo'], 'other') FROM system.numbers LIMIT 10;
EXPLAIN QUERY TREE run_passes = 1 SELECT transform(number, NULL, ['google', 'censor.net', 'yahoo'], 'other') FROM system.numbers LIMIT 10;
SELECT transform(number, NULL, ['google', 'censor.net', 'yahoo'], 'other') FROM system.numbers LIMIT 10; -- { serverError 43 }
EXPLAIN SYNTAX SELECT transform(number, NULL, ['google', 'censor.net', 'yahoo'], 'other') FROM system.numbers LIMIT 10; -- { serverError 43 }
EXPLAIN QUERY TREE run_passes = 1 SELECT transform(number, NULL, ['google', 'censor.net', 'yahoo'], 'other') FROM system.numbers LIMIT 10; -- { serverError 43 }
SET optimize_if_transform_strings_to_enum = 0;

View File

@ -0,0 +1,3 @@
2
1 Z
1 Z

View File

@ -0,0 +1,14 @@
SELECT CASE 1 WHEN 1 THEN 2 END;
SELECT id,
CASE id
WHEN 1 THEN 'Z'
END x
FROM (SELECT 1 as id);
SELECT id,
CASE id
WHEN 1 THEN 'Z'
ELSE 'X'
END x
FROM (SELECT 1 as id);

View File

@ -0,0 +1,32 @@
1
1
1
1
9
9
\N
7
1
9
7
b
b
b
b
a
a
\N
c
sep1
80000
80000
sep2
80000
80000
sep3
1
sep4
8000
sep5
8000
sep6

View File

@ -0,0 +1,35 @@
select transform(2, [1,2], [9,1], materialize(null));
select transform(2, [1,2], [9,1], materialize(7));
select transform(2, [1,2], [9,1], null);
select transform(2, [1,2], [9,1], 7);
select transform(1, [1,2], [9,1], null);
select transform(1, [1,2], [9,1], 7);
select transform(5, [1,2], [9,1], null);
select transform(5, [1,2], [9,1], 7);
select transform(2, [1,2], [9,1]);
select transform(1, [1,2], [9,1]);
select transform(7, [1,2], [9,1]);
select transform(2, [1,2], ['a','b'], materialize(null));
select transform(2, [1,2], ['a','b'], materialize('c'));
select transform(2, [1,2], ['a','b'], null);
select transform(2, [1,2], ['a','b'], 'c');
select transform(1, [1,2], ['a','b'], null);
select transform(1, [1,2], ['a','b'], 'c');
select transform(5, [1,2], ['a','b'], null);
select transform(5, [1,2], ['a','b'], 'c');
select 'sep1';
SELECT transform(number, [2], [toDecimal32(1, 1)], materialize(80000)) as x FROM numbers(2);
select 'sep2';
SELECT transform(number, [2], [toDecimal32(1, 1)], 80000) as x FROM numbers(2);
select 'sep3';
SELECT transform(toDecimal32(2, 1), [toDecimal32(2, 1)], [1]);
select 'sep4';
SELECT transform(8000, [1], [toDecimal32(2, 1)]);
select 'sep5';
SELECT transform(toDecimal32(8000,0), [1], [toDecimal32(2, 1)]);
select 'sep6';
SELECT transform(-9223372036854775807, [-1], [toDecimal32(1024, 3)]) FROM system.numbers LIMIT 7; -- { serverError BAD_ARGUMENTS }
SELECT [NULL, NULL, NULL, NULL], transform(number, [2147483648], [toDecimal32(1, 2)]) AS x FROM numbers(257) WHERE materialize(10); -- { serverError BAD_ARGUMENTS }
SELECT transform(-2147483649, [1], [toDecimal32(1, 2)]) GROUP BY [1] WITH TOTALS; -- { serverError BAD_ARGUMENTS }
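
The first block of queries above exercises the lookup semantics of `transform`. A rough Python model of those semantics, ignoring all type conversion (which is what the Decimal and `BAD_ARGUMENTS` cases actually test), might look like the sketch below. The `_MISSING` sentinel is an assumption of this sketch, used to tell "no default argument" apart from a `NULL` default.

```python
_MISSING = object()  # marks the three-argument form (no default given)


def transform(x, array_from, array_to, default=_MISSING):
    """Map x through parallel arrays; types are not modelled."""
    if len(array_from) != len(array_to):
        raise ValueError("array_from and array_to must have equal length")
    for source, target in zip(array_from, array_to):
        if x == source:
            return target
    # No match: the three-argument form returns x itself,
    # the four-argument form returns the default (None models NULL).
    return x if default is _MISSING else default
```

Under this model, `transform(5, [1, 2], [9, 1], None)` yields `None` and `transform(7, [1, 2], [9, 1])` yields `7`, matching the `\N` and `7` rows of the reference output.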

View File

@ -0,0 +1,72 @@
google
other
yahoo
yandex
#1
20
21
22
29
#2
0
1
3
5
7
8
9
20
21
29
#3
20
21
22
29
#4
google
other
yahoo
yandex
#5
0
1
3
5
7
8
9
google
yahoo
yandex
----
google
other
yahoo
yandex
#1
20
21
22
29
#3
20
21
22
29
#4
google
other
yahoo
yandex
----
2000
2100
2200
2900
#1
2000
2100
2200
2900
----

View File

@ -0,0 +1,25 @@
SELECT transform(number, [2, 4, 6], ['google', 'yandex', 'yahoo'], 'other') as x FROM numbers(10) GROUP BY x ORDER BY x;
SELECT '#1';
SELECT transform(number, [2, 4, 6], [29, 20, 21], 22) as x FROM numbers(10) GROUP BY x ORDER BY x;
SELECT '#2';
SELECT transform(number, [2, 4, 6], [29, 20, 21]) as x FROM numbers(10) GROUP BY x ORDER BY x;
SELECT '#3';
SELECT transform(toString(number), ['2', '4', '6'], [29, 20, 21], 22) as x FROM numbers(10) GROUP BY x ORDER BY x;
SELECT '#4';
SELECT transform(toString(number), ['2', '4', '6'], ['google', 'yandex', 'yahoo'], 'other') as x FROM numbers(10) GROUP BY x ORDER BY x;
SELECT '#5';
SELECT transform(toString(number), ['2', '4', '6'], ['google', 'yandex', 'yahoo']) as x FROM numbers(10) GROUP BY x ORDER BY x;
SELECT '----';
SELECT transform(number, [2, 4, 6], ['google', 'yandex', 'yahoo'], materialize('other')) as x FROM numbers(10) GROUP BY x ORDER BY x;
SELECT '#1';
SELECT transform(number, [2, 4, 6], [29, 20, 21], materialize(22)) as x FROM numbers(10) GROUP BY x ORDER BY x;
SELECT '#3';
SELECT transform(toString(number), ['2', '4', '6'], [29, 20, 21], materialize(22)) as x FROM numbers(10) GROUP BY x ORDER BY x;
SELECT '#4';
SELECT transform(toString(number), ['2', '4', '6'], ['google', 'yandex', 'yahoo'], materialize('other')) as x FROM numbers(10) GROUP BY x ORDER BY x;
SELECT '----';
SELECT transform(number, [2, 4, 6], [2900, 2000, 2100], 2200) as x FROM numbers(10) GROUP BY x ORDER BY x;
SELECT '#1';
SELECT transform(number, [2, 4, 6], [2900, 2000, 2100], materialize(2200)) as x FROM numbers(10) GROUP BY x ORDER BY x;
SELECT '----';
SELECT transform(number, [1], [null]) FROM system.numbers LIMIT 1; -- { serverError ILLEGAL_TYPE_OF_ARGUMENT }

View File

@ -19,8 +19,10 @@ select quantilesGK(1000, 100/1000, 200/1000, 250/1000, 314/1000, 777/1000)(numbe
[99,199,249,313,776]
select quantilesGK(10000, 100/1000, 200/1000, 250/1000, 314/1000, 777/1000)(number + 1) from numbers(1000);
[100,200,250,314,777]
select medianGK()(number) from numbers(10); -- { serverError BAD_ARGUMENTS }
select quantileGK()(number) from numbers(10); -- { serverError BAD_ARGUMENTS }
select medianGK()(number) from numbers(10) SETTINGS allow_experimental_analyzer = 0; -- { serverError BAD_ARGUMENTS }
select medianGK()(number) from numbers(10) SETTINGS allow_experimental_analyzer = 1; -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH }
select quantileGK()(number) from numbers(10) SETTINGS allow_experimental_analyzer = 0; -- { serverError BAD_ARGUMENTS }
select quantileGK()(number) from numbers(10) SETTINGS allow_experimental_analyzer = 1; -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH }
select medianGK(100)(number) from numbers(10);
4
select quantileGK(100)(number) from numbers(10);
@ -31,7 +33,8 @@ select quantileGK(100, 0.5, 0.75)(number) from numbers(10); -- { serverError NUM
select quantileGK('abc', 0.5)(number) from numbers(10); -- { serverError ILLEGAL_TYPE_OF_ARGUMENT }
select quantileGK(1.23, 0.5)(number) from numbers(10); -- { serverError ILLEGAL_TYPE_OF_ARGUMENT }
select quantileGK(-100, 0.5)(number) from numbers(10); -- { serverError BAD_ARGUMENTS }
select quantilesGK()(number) from numbers(10); -- { serverError BAD_ARGUMENTS }
select quantilesGK()(number) from numbers(10) SETTINGS allow_experimental_analyzer = 0; -- { serverError BAD_ARGUMENTS }
select quantilesGK()(number) from numbers(10) SETTINGS allow_experimental_analyzer = 1; -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH }
select quantilesGK(100)(number) from numbers(10); -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH }
select quantilesGK(100, 0.5)(number) from numbers(10);
[4]

View File

@ -1,3 +1,5 @@
set allow_experimental_analyzer = 1;
-- { echoOn }
with arrayJoin([0, 1, 2, 10]) as x select quantilesGK(100, 0.5, 0.4, 0.1)(x);
with arrayJoin([0, 6, 7, 9, 10]) as x select quantileGK(100, 0.5)(x);
@ -14,8 +16,12 @@ select quantilesGK(1000, 100/1000, 200/1000, 250/1000, 314/1000, 777/1000)(numbe
select quantilesGK(10000, 100/1000, 200/1000, 250/1000, 314/1000, 777/1000)(number + 1) from numbers(1000);
select medianGK()(number) from numbers(10); -- { serverError BAD_ARGUMENTS }
select quantileGK()(number) from numbers(10); -- { serverError BAD_ARGUMENTS }
select medianGK()(number) from numbers(10) SETTINGS allow_experimental_analyzer = 0; -- { serverError BAD_ARGUMENTS }
select medianGK()(number) from numbers(10) SETTINGS allow_experimental_analyzer = 1; -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH }
select quantileGK()(number) from numbers(10) SETTINGS allow_experimental_analyzer = 0; -- { serverError BAD_ARGUMENTS }
select quantileGK()(number) from numbers(10) SETTINGS allow_experimental_analyzer = 1; -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH }
select medianGK(100)(number) from numbers(10);
select quantileGK(100)(number) from numbers(10);
select quantileGK(100, 0.5)(number) from numbers(10);
@ -24,7 +30,9 @@ select quantileGK('abc', 0.5)(number) from numbers(10); -- { serverError ILLEGAL
select quantileGK(1.23, 0.5)(number) from numbers(10); -- { serverError ILLEGAL_TYPE_OF_ARGUMENT }
select quantileGK(-100, 0.5)(number) from numbers(10); -- { serverError BAD_ARGUMENTS }
select quantilesGK()(number) from numbers(10); -- { serverError BAD_ARGUMENTS }
select quantilesGK()(number) from numbers(10) SETTINGS allow_experimental_analyzer = 0; -- { serverError BAD_ARGUMENTS }
select quantilesGK()(number) from numbers(10) SETTINGS allow_experimental_analyzer = 1; -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH }
select quantilesGK(100)(number) from numbers(10); -- { serverError NUMBER_OF_ARGUMENTS_DOESNT_MATCH }
select quantilesGK(100, 0.5)(number) from numbers(10);
select quantilesGK('abc', 0.5, 0.75)(number) from numbers(10); -- { serverError ILLEGAL_TYPE_OF_ARGUMENT }

View File

@ -18,7 +18,7 @@ FROM
bitmapHasAny(bitmapBuild([toUInt64(1)]), (
SELECT groupBitmapState(toUInt64(2))
)) has2
); -- { serverError 43 }
) SETTINGS allow_experimental_analyzer = 0; -- { serverError 43 }
SELECT '--------------';

View File

@ -2,6 +2,8 @@
[]
[[(2147483647,0),(10.0001,65535),(1,255),(1023,2147483646)]] [[[(2147483647,0),(10.0001,65535),(1023,2147483646),(2147483647,0)]]]
[[(2147483647,0),(10.0001,65535),(1,255),(1023,2147483646)]] []
[[(2147483647,0),(10.0001,65535),(1,255),(1023,2147483646)]] [[[(2147483647,0),(10.0001,65535),(1023,2147483646),(2147483647,0)]]]
[[(2147483647,0),(10.0001,65535),(1,255),(1023,2147483646)]] [[[(2147483647,0),(10.0001,65535),(1023,2147483646),(2147483647,0)]]]
[[[(100.0001,1000.0001),(1000.0001,1.1920928955078125e-7),(20,-20),(20,20),(10,10),(-20,20),(100.0001,1000.0001)]]]
[[[(100.0001,1000.0001),(1000.0001,1.1920928955078125e-7),(20,-20),(20,20),(10,10),(-20,20),(100.0001,1000.0001)]]]
[(9223372036854775807,1.1754943508222875e-38)] [[(1,1.0001)]] \N []

View File

@ -1,5 +1,5 @@
SELECT polygonsSymDifferenceCartesian([[[(1., 1.)]] AS x], [x]) GROUP BY x WITH ROLLUP;
SELECT [[(2147483647, 0.), (10.0001, 65535), (1, 255), (1023, 2147483646)]], polygonsSymDifferenceCartesian([[[(2147483647, 0.), (10.0001, 65535), (1023, 2147483646)]]], [[[(1000.0001, 10.0001)]]]) GROUP BY [[(2147483647, 0.), (10.0001, 65535), (1023, 2147483646)]] WITH ROLLUP;
SELECT [[(2147483647, 0.), (10.0001, 65535), (1, 255), (1023, 2147483646)]], polygonsSymDifferenceCartesian([[[(2147483647, 0.), (10.0001, 65535), (1023, 2147483646)]]], [[[(1000.0001, 10.0001)]]]) GROUP BY [[(2147483647, 0.), (10.0001, 65535), (1023, 2147483646)]] WITH ROLLUP SETTINGS allow_experimental_analyzer=0;
SELECT [[(2147483647, 0.), (10.0001, 65535), (1, 255), (1023, 2147483646)]], polygonsSymDifferenceCartesian([[[(2147483647, 0.), (10.0001, 65535), (1023, 2147483646)]]], [[[(1000.0001, 10.0001)]]]) GROUP BY [[(2147483647, 0.), (10.0001, 65535), (1023, 2147483646)]] WITH ROLLUP SETTINGS allow_experimental_analyzer=1;
SELECT polygonsSymDifferenceCartesian([[[(100.0001, 1000.0001), (-20., 20.), (10., 10.), (20., 20.), (20., -20.), (1000.0001, 1.1920928955078125e-7)]],[[(0.0001, 100000000000000000000.)]] AS x],[x]) GROUP BY x WITH ROLLUP;
SELECT [(9223372036854775807, 1.1754943508222875e-38)], x, NULL, polygonsSymDifferenceCartesian([[[(1.1754943508222875e-38, 1.1920928955078125e-7), (0.5, 0.5)]], [[(1.1754943508222875e-38, 1.1920928955078125e-7), (1.1754943508222875e-38, 1.1920928955078125e-7)], [(0., 1.0001)]], [[(1., 1.0001)]] AS x], [[[(3.4028234663852886e38, 0.9999)]]]) GROUP BY GROUPING SETS ((x)) WITH TOTALS