ClickHouse/tests/queries
Azat Khuzhin 7b209694d5 Fix optimize_distributed_group_by_sharding_key for multiple columns
Previously we incorrectly checked that the columns from GROUP BY were a
subset of the columns from the sharding key. This is not right; consider
the following example:

    select k1, any(k2), sum(v) from remote('127.{1,2}', view(select 1 k1, 2 k2, 3 v), cityHash64(k1, k2)) group by k1

Here the columns from GROUP BY are a subset of the columns from the
sharding key, but the optimization cannot be applied, since there is no
guarantee that a given value of k1 is stored on only one shard.

So instead we should check that GROUP BY contains all columns that are
required for calculating the sharding key expression, i.e.:

    select k1, k2, sum(v) from remote('127.{1,2}', view(select 1 k1, 2 k2, 3 v), cityHash64(k1, k2)) group by k1, k2
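
The corrected condition can be sketched as a set comparison. This is only an illustration of the rule described above, not the code from the commit: the function name and the representation of columns as plain strings are assumptions.

```python
def can_skip_final_merge(group_by_columns, sharding_key_columns):
    """Hypothetical helper: True when GROUP BY covers every column
    needed to compute the sharding key expression."""
    # The old (incorrect) check was the reverse direction:
    #   set(group_by_columns) <= set(sharding_key_columns)
    return set(sharding_key_columns) <= set(group_by_columns)


# Sharding key is cityHash64(k1, k2), so it needs both k1 and k2.
print(can_skip_final_merge(["k1"], ["k1", "k2"]))        # False
print(can_skip_final_merge(["k1", "k2"], ["k1", "k2"]))  # True
```

With `group by k1` alone, rows sharing k1 but differing in k2 may hash to different shards, so the per-shard results still have to be merged.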
2021-07-15 09:09:58 +03:00
..
0_stateless Fix optimize_distributed_group_by_sharding_key for multiple columns 2021-07-15 09:09:58 +03:00
1_stateful Compile AggregateFunctionBitwise 2021-07-10 01:51:34 +03:00
bugs
__init__.py
conftest.py
pytest.ini
query_test.py
server.py
shell_config.sh
skip_list.json Merge pull request #23140 from amosbird/fixrandomoneshardinsert 2021-07-13 11:47:53 +03:00