Merge branch 'master' into correctly_send_close_request

alesapin 2020-11-18 17:05:57 +03:00
commit 8e7f7b74d0
214 changed files with 4173 additions and 2915 deletions


@@ -1,3 +1,12 @@
## ClickHouse release 20.11
### ClickHouse release v20.11.3.3-stable, 2020-11-13
#### Bug Fix
* Fix rare silent crashes when query profiler is on and ClickHouse is installed on OS with glibc version that has (supposedly) broken asynchronous unwind tables for some functions. This fixes [#15301](https://github.com/ClickHouse/ClickHouse/issues/15301). This fixes [#13098](https://github.com/ClickHouse/ClickHouse/issues/13098). [#16846](https://github.com/ClickHouse/ClickHouse/pull/16846) ([alexey-milovidov](https://github.com/alexey-milovidov)).
### ClickHouse release v20.11.2.1, 2020-11-11
#### Backward Incompatible Change
@@ -119,6 +128,24 @@
## ClickHouse release 20.10
### ClickHouse release v20.10.4.1-stable, 2020-11-13
#### Bug Fix
* Fix rare silent crashes when query profiler is on and ClickHouse is installed on OS with glibc version that has (supposedly) broken asynchronous unwind tables for some functions. This fixes [#15301](https://github.com/ClickHouse/ClickHouse/issues/15301). This fixes [#13098](https://github.com/ClickHouse/ClickHouse/issues/13098). [#16846](https://github.com/ClickHouse/ClickHouse/pull/16846) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix `IN` operator over several columns and tuples with enabled `transform_null_in` setting. Fixes [#15310](https://github.com/ClickHouse/ClickHouse/issues/15310). [#16722](https://github.com/ClickHouse/ClickHouse/pull/16722) ([Anton Popov](https://github.com/CurtizJ)).
* Fix `optimize_read_in_order`/`optimize_aggregation_in_order` with `max_threads > 0` and an expression in `ORDER BY`. [#16637](https://github.com/ClickHouse/ClickHouse/pull/16637) ([Azat Khuzhin](https://github.com/azat)).
* Now, when parsing AVRO input, `LowCardinality` is removed from the type. Fixes [#16188](https://github.com/ClickHouse/ClickHouse/issues/16188). [#16521](https://github.com/ClickHouse/ClickHouse/pull/16521) ([Mike](https://github.com/myrrc)).
* Fix rapid growth of metadata when using MySQL Master -> MySQL Slave -> ClickHouse MaterializeMySQL Engine, and `slave_parallel_worker` enabled on MySQL Slave, by properly shrinking GTID sets. This fixes [#15951](https://github.com/ClickHouse/ClickHouse/issues/15951). [#16504](https://github.com/ClickHouse/ClickHouse/pull/16504) ([TCeason](https://github.com/TCeason)).
* Fix DROP TABLE for Distributed (racy with INSERT). [#16409](https://github.com/ClickHouse/ClickHouse/pull/16409) ([Azat Khuzhin](https://github.com/azat)).
* Fix processing of very large entries in replication queue. Very large entries may appear in ALTER queries if table structure is extremely large (near 1 MB). This fixes [#16307](https://github.com/ClickHouse/ClickHouse/issues/16307). [#16332](https://github.com/ClickHouse/ClickHouse/pull/16332) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix a bug with the MySQL database engine. When a MySQL server used as a database engine was down, some queries raised an exception because they tried to fetch tables from the disabled server, although this was unnecessary. For example, the query `SELECT ... FROM system.parts` should work only with MergeTree tables and not touch the MySQL database at all. [#16032](https://github.com/ClickHouse/ClickHouse/pull/16032) ([Kruglov Pavel](https://github.com/Avogar)).
#### Improvement
* Workaround for using S3 with an nginx server as a proxy. Nginx currently does not accept URLs with an empty path like `http://domain.com?delete`, but vanilla aws-sdk-cpp produces URLs of this kind. This commit uses a patched aws-sdk-cpp version which puts `/` as the path in such cases, e.g. `http://domain.com/?delete` (sketched below). [#16813](https://github.com/ClickHouse/ClickHouse/pull/16813) ([ianton-ru](https://github.com/ianton-ru)).
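A minimal sketch of the URL tweak described in the item above. It is illustrative only: the real fix lives in the patched aws-sdk-cpp, and `normalizeEmptyPath` is a made-up helper name, not ClickHouse or SDK code.

```cpp
#include <iostream>
#include <string>

// Hypothetical helper mirroring the behaviour described above; the actual
// change is inside the patched aws-sdk-cpp, not ClickHouse code.
std::string normalizeEmptyPath(const std::string & url)
{
    const size_t scheme_end = url.find("://");
    const size_t host_begin = (scheme_end == std::string::npos) ? 0 : scheme_end + 3;
    const size_t stop = url.find_first_of("/?", host_begin);

    if (stop == std::string::npos)
        return url + "/";                                  // no path and no query: append "/"
    if (url[stop] == '?')                                  // query starts right after the host: insert "/"
        return url.substr(0, stop) + "/" + url.substr(stop);
    return url;                                            // a path is already present
}

int main()
{
    std::cout << normalizeEmptyPath("http://domain.com?delete") << '\n'; // http://domain.com/?delete
}
```

Nginx then receives a non-empty path while the query string is preserved.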
### ClickHouse release v20.10.3.30, 2020-10-28
#### Backward Incompatible Change
@@ -331,6 +358,84 @@
## ClickHouse release 20.9
### ClickHouse release v20.9.5.5-stable, 2020-11-13
#### Bug Fix
* Fix rare silent crashes when query profiler is on and ClickHouse is installed on OS with glibc version that has (supposedly) broken asynchronous unwind tables for some functions. This fixes [#15301](https://github.com/ClickHouse/ClickHouse/issues/15301). This fixes [#13098](https://github.com/ClickHouse/ClickHouse/issues/13098). [#16846](https://github.com/ClickHouse/ClickHouse/pull/16846) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Now, when parsing AVRO input, `LowCardinality` is removed from the type. Fixes [#16188](https://github.com/ClickHouse/ClickHouse/issues/16188). [#16521](https://github.com/ClickHouse/ClickHouse/pull/16521) ([Mike](https://github.com/myrrc)).
* Fix rapid growth of metadata when using MySQL Master -> MySQL Slave -> ClickHouse MaterializeMySQL Engine, and `slave_parallel_worker` enabled on MySQL Slave, by properly shrinking GTID sets. This fixes [#15951](https://github.com/ClickHouse/ClickHouse/issues/15951). [#16504](https://github.com/ClickHouse/ClickHouse/pull/16504) ([TCeason](https://github.com/TCeason)).
* Fix DROP TABLE for Distributed (racy with INSERT). [#16409](https://github.com/ClickHouse/ClickHouse/pull/16409) ([Azat Khuzhin](https://github.com/azat)).
* Fix processing of very large entries in replication queue. Very large entries may appear in ALTER queries if table structure is extremely large (near 1 MB). This fixes [#16307](https://github.com/ClickHouse/ClickHouse/issues/16307). [#16332](https://github.com/ClickHouse/ClickHouse/pull/16332) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed inconsistent behaviour where part of the returned data could be dropped because the set for its filtering wasn't created. [#16308](https://github.com/ClickHouse/ClickHouse/pull/16308) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix a bug with the MySQL database engine. When a MySQL server used as a database engine was down, some queries raised an exception because they tried to fetch tables from the disabled server, although this was unnecessary. For example, the query `SELECT ... FROM system.parts` should work only with MergeTree tables and not touch the MySQL database at all. [#16032](https://github.com/ClickHouse/ClickHouse/pull/16032) ([Kruglov Pavel](https://github.com/Avogar)).
### ClickHouse release v20.9.4.76-stable (2020-10-29)
#### Bug Fix
* Fix double free in case of exception in function `dictGet`. It could have happened if dictionary was loaded with error. [#16429](https://github.com/ClickHouse/ClickHouse/pull/16429) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix `GROUP BY` with totals/rollup/cube modifiers and min/max functions over group by keys. Fixes [#16393](https://github.com/ClickHouse/ClickHouse/issues/16393). [#16397](https://github.com/ClickHouse/ClickHouse/pull/16397) ([Anton Popov](https://github.com/CurtizJ)).
* Fix async Distributed INSERT w/ prefer_localhost_replica=0 and internal_replication. [#16358](https://github.com/ClickHouse/ClickHouse/pull/16358) ([Azat Khuzhin](https://github.com/azat)).
* Fix incorrect code in the `TwoLevelStringHashTable` implementation which might lead to a memory leak. I'm surprised this bug could lurk for so long. [#16264](https://github.com/ClickHouse/ClickHouse/pull/16264) ([Amos Bird](https://github.com/amosbird)).
* Fix the case when memory can be overallocated regardless of the limit. This closes [#14560](https://github.com/ClickHouse/ClickHouse/issues/14560). [#16206](https://github.com/ClickHouse/ClickHouse/pull/16206) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix `ALTER MODIFY ... ORDER BY` query hang for `ReplicatedVersionedCollapsingMergeTree`. This fixes [#15980](https://github.com/ClickHouse/ClickHouse/issues/15980). [#16011](https://github.com/ClickHouse/ClickHouse/pull/16011) ([alesapin](https://github.com/alesapin)).
* Fix collate name & charset name parser and support `length = 0` for string type. [#16008](https://github.com/ClickHouse/ClickHouse/pull/16008) ([Winter Zhang](https://github.com/zhang2014)).
* Allow using the direct layout for dictionaries with complex keys. [#16007](https://github.com/ClickHouse/ClickHouse/pull/16007) ([Anton Popov](https://github.com/CurtizJ)).
* Prevent replica hang for 5-10 mins when replication error happens after a period of inactivity. [#15987](https://github.com/ClickHouse/ClickHouse/pull/15987) ([filimonov](https://github.com/filimonov)).
* Fix rare segfaults when inserting into or selecting from MaterializedView and concurrently dropping target table (for Atomic database engine). [#15984](https://github.com/ClickHouse/ClickHouse/pull/15984) ([tavplubix](https://github.com/tavplubix)).
* Fix ambiguity in parsing of settings profiles: `CREATE USER ... SETTINGS profile readonly` is now considered as using a profile named `readonly`, not a setting named `profile` with the readonly constraint. This fixes [#15628](https://github.com/ClickHouse/ClickHouse/issues/15628). [#15982](https://github.com/ClickHouse/ClickHouse/pull/15982) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix a crash when database creation fails. [#15954](https://github.com/ClickHouse/ClickHouse/pull/15954) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed `DROP TABLE IF EXISTS` failure with `Table ... doesn't exist` error when the table is concurrently renamed (for the Atomic database engine). Fixed a rare deadlock when concurrently executing some DDL queries with multiple tables (like `DROP DATABASE` and `RENAME TABLE`). Fixed `DROP/DETACH DATABASE` failure with `Table ... doesn't exist` when concurrently executing `DROP/DETACH TABLE`. [#15934](https://github.com/ClickHouse/ClickHouse/pull/15934) ([tavplubix](https://github.com/tavplubix)).
* Fix incorrect empty result for query from `Distributed` table if query has `WHERE`, `PREWHERE` and `GLOBAL IN`. Fixes [#15792](https://github.com/ClickHouse/ClickHouse/issues/15792). [#15933](https://github.com/ClickHouse/ClickHouse/pull/15933) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix possible deadlocks in RBAC. [#15875](https://github.com/ClickHouse/ClickHouse/pull/15875) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix exception `Block structure mismatch` in `SELECT ... ORDER BY DESC` queries which were executed after `ALTER MODIFY COLUMN` query. Fixes [#15800](https://github.com/ClickHouse/ClickHouse/issues/15800). [#15852](https://github.com/ClickHouse/ClickHouse/pull/15852) ([alesapin](https://github.com/alesapin)).
* Fix `select count()` inaccuracy for MaterializeMySQL. [#15767](https://github.com/ClickHouse/ClickHouse/pull/15767) ([tavplubix](https://github.com/tavplubix)).
* Fix some cases of queries in which only virtual columns are selected. Previously, a `Not found column _nothing in block` exception could be thrown. Fixes [#12298](https://github.com/ClickHouse/ClickHouse/issues/12298). [#15756](https://github.com/ClickHouse/ClickHouse/pull/15756) ([Anton Popov](https://github.com/CurtizJ)).
* Fixed too low default value of `max_replicated_logs_to_keep` setting, which might cause replicas to become lost too often. Improve lost replica recovery process by choosing the most up-to-date replica to clone. Also do not remove old parts from lost replica, detach them instead. [#15701](https://github.com/ClickHouse/ClickHouse/pull/15701) ([tavplubix](https://github.com/tavplubix)).
* Fix error `Cannot add simple transform to empty Pipe` which happened while reading from a `Buffer` table that has a different structure than the destination table. It was possible if the destination table returned an empty result for the query. Fixes [#15529](https://github.com/ClickHouse/ClickHouse/issues/15529). [#15662](https://github.com/ClickHouse/ClickHouse/pull/15662) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed a bug with globs in the S3 table function: the region from the URL was not applied to the S3 client configuration. [#15646](https://github.com/ClickHouse/ClickHouse/pull/15646) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Decrement the `ReadonlyReplica` metric when detaching read-only tables. This fixes [#15598](https://github.com/ClickHouse/ClickHouse/issues/15598). [#15592](https://github.com/ClickHouse/ClickHouse/pull/15592) ([sundyli](https://github.com/sundy-li)).
* Throw an error when a single parameter is passed to ReplicatedMergeTree instead of ignoring it. [#15516](https://github.com/ClickHouse/ClickHouse/pull/15516) ([nvartolomei](https://github.com/nvartolomei)).
#### Improvement
* Now it's allowed to execute `ALTER ... ON CLUSTER` queries regardless of the `<internal_replication>` setting in cluster config. [#16075](https://github.com/ClickHouse/ClickHouse/pull/16075) ([alesapin](https://github.com/alesapin)).
* Unfold `{database}`, `{table}` and `{uuid}` macros in `ReplicatedMergeTree` arguments on table creation. [#16160](https://github.com/ClickHouse/ClickHouse/pull/16160) ([tavplubix](https://github.com/tavplubix)).
### ClickHouse release v20.9.3.45-stable (2020-10-09)
#### Bug Fix
* Fix error `Cannot find column` which may happen at insertion into a `MATERIALIZED VIEW` in case the query for the `MV` contains `ARRAY JOIN`. [#15717](https://github.com/ClickHouse/ClickHouse/pull/15717) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix race condition in AMQP-CPP. [#15667](https://github.com/ClickHouse/ClickHouse/pull/15667) ([alesapin](https://github.com/alesapin)).
* Fix the order of destruction for resources in `ReadFromStorage` step of query plan. It might cause crashes in rare cases. Possibly connected with [#15610](https://github.com/ClickHouse/ClickHouse/issues/15610). [#15645](https://github.com/ClickHouse/ClickHouse/pull/15645) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed `Element ... is not a constant expression` error when using `JSON*` function result in `VALUES`, `LIMIT` or right side of `IN` operator. [#15589](https://github.com/ClickHouse/ClickHouse/pull/15589) ([tavplubix](https://github.com/tavplubix)).
* Prevent the possibility of error message `Could not calculate available disk space (statvfs), errno: 4, strerror: Interrupted system call`. This fixes [#15541](https://github.com/ClickHouse/ClickHouse/issues/15541). [#15557](https://github.com/ClickHouse/ClickHouse/pull/15557) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Significantly reduce memory usage in AggregatingInOrderTransform/optimize_aggregation_in_order. [#15543](https://github.com/ClickHouse/ClickHouse/pull/15543) ([Azat Khuzhin](https://github.com/azat)).
* Mutation might hang waiting for some non-existent part after `MOVE` or `REPLACE PARTITION` or, in rare cases, after `DETACH` or `DROP PARTITION`. It's fixed. [#15537](https://github.com/ClickHouse/ClickHouse/pull/15537) ([tavplubix](https://github.com/tavplubix)).
* Fix a bug where the `ILIKE` operator stopped being case-insensitive if `LIKE` with the same pattern was executed. [#15536](https://github.com/ClickHouse/ClickHouse/pull/15536) ([alesapin](https://github.com/alesapin)).
* Fix `Missing columns` errors when selecting columns which are absent in data but depend on other columns which are also absent in data. Fixes [#15530](https://github.com/ClickHouse/ClickHouse/issues/15530). [#15532](https://github.com/ClickHouse/ClickHouse/pull/15532) ([alesapin](https://github.com/alesapin)).
* Fix bug with event subscription in DDLWorker which rarely may lead to query hangs in `ON CLUSTER`. Introduced in [#13450](https://github.com/ClickHouse/ClickHouse/issues/13450). [#15477](https://github.com/ClickHouse/ClickHouse/pull/15477) ([alesapin](https://github.com/alesapin)).
* Report proper error when the second argument of `boundingRatio` aggregate function has a wrong type. [#15407](https://github.com/ClickHouse/ClickHouse/pull/15407) ([detailyang](https://github.com/detailyang)).
* Fix a bug where queries like `SELECT toStartOfDay(today())` failed, complaining about an empty `time_zone` argument. [#15319](https://github.com/ClickHouse/ClickHouse/pull/15319) ([Bharat Nallan](https://github.com/bharatnc)).
* Fix race condition during MergeTree table rename and background cleanup. [#15304](https://github.com/ClickHouse/ClickHouse/pull/15304) ([alesapin](https://github.com/alesapin)).
* Fix rare race condition on server startup when system.logs are enabled. [#15300](https://github.com/ClickHouse/ClickHouse/pull/15300) ([alesapin](https://github.com/alesapin)).
* Fix MSan report in QueryLog. Uninitialized memory can be used for the field `memory_usage`. [#15258](https://github.com/ClickHouse/ClickHouse/pull/15258) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix instance crash when using `joinGet` with `LowCardinality` types. This fixes [#15214](https://github.com/ClickHouse/ClickHouse/issues/15214). [#15220](https://github.com/ClickHouse/ClickHouse/pull/15220) ([Amos Bird](https://github.com/amosbird)).
* Fix a bug in the `Buffer` table engine which did not allow inserting data with a new structure into `Buffer` after an `ALTER` query. Fixes [#15117](https://github.com/ClickHouse/ClickHouse/issues/15117). [#15192](https://github.com/ClickHouse/ClickHouse/pull/15192) ([alesapin](https://github.com/alesapin)).
* Adjust the `decimals` field size in the MySQL column definition packet. [#15152](https://github.com/ClickHouse/ClickHouse/pull/15152) ([maqroll](https://github.com/maqroll)).
* Fixed `Cannot rename ... errno: 22, strerror: Invalid argument` error on DDL query execution in Atomic database when running clickhouse-server in docker on Mac OS. [#15024](https://github.com/ClickHouse/ClickHouse/pull/15024) ([tavplubix](https://github.com/tavplubix)).
* Make predicate push-down work when a subquery contains the `finalizeAggregation` function. Fixes [#14847](https://github.com/ClickHouse/ClickHouse/issues/14847). [#14937](https://github.com/ClickHouse/ClickHouse/pull/14937) ([filimonov](https://github.com/filimonov)).
* Fix a problem where the server may get stuck on startup while talking to ZooKeeper, if the configuration files have to be fetched from ZK (using the `from_zk` include option). This fixes [#14814](https://github.com/ClickHouse/ClickHouse/issues/14814). [#14843](https://github.com/ClickHouse/ClickHouse/pull/14843) ([Alexander Kuzmenkov](https://github.com/akuzm)).
#### Improvement
* Now it's possible to change the type of version column for `VersionedCollapsingMergeTree` with `ALTER` query. [#15442](https://github.com/ClickHouse/ClickHouse/pull/15442) ([alesapin](https://github.com/alesapin)).
### ClickHouse release v20.9.2.20, 2020-09-22
#### New Feature
@@ -405,6 +510,110 @@
## ClickHouse release 20.8
### ClickHouse release v20.8.6.6-lts, 2020-11-13
#### Bug Fix
* Fix rare silent crashes when query profiler is on and ClickHouse is installed on OS with glibc version that has (supposedly) broken asynchronous unwind tables for some functions. This fixes [#15301](https://github.com/ClickHouse/ClickHouse/issues/15301). This fixes [#13098](https://github.com/ClickHouse/ClickHouse/issues/13098). [#16846](https://github.com/ClickHouse/ClickHouse/pull/16846) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Now, when parsing AVRO input, `LowCardinality` is removed from the type. Fixes [#16188](https://github.com/ClickHouse/ClickHouse/issues/16188). [#16521](https://github.com/ClickHouse/ClickHouse/pull/16521) ([Mike](https://github.com/myrrc)).
* Fix rapid growth of metadata when using MySQL Master -> MySQL Slave -> ClickHouse MaterializeMySQL Engine, and `slave_parallel_worker` enabled on MySQL Slave, by properly shrinking GTID sets. This fixes [#15951](https://github.com/ClickHouse/ClickHouse/issues/15951). [#16504](https://github.com/ClickHouse/ClickHouse/pull/16504) ([TCeason](https://github.com/TCeason)).
* Fix DROP TABLE for Distributed (racy with INSERT). [#16409](https://github.com/ClickHouse/ClickHouse/pull/16409) ([Azat Khuzhin](https://github.com/azat)).
* Fix processing of very large entries in replication queue. Very large entries may appear in ALTER queries if table structure is extremely large (near 1 MB). This fixes [#16307](https://github.com/ClickHouse/ClickHouse/issues/16307). [#16332](https://github.com/ClickHouse/ClickHouse/pull/16332) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed inconsistent behaviour where part of the returned data could be dropped because the set for its filtering wasn't created. [#16308](https://github.com/ClickHouse/ClickHouse/pull/16308) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix a bug with the MySQL database engine. When a MySQL server used as a database engine was down, some queries raised an exception because they tried to fetch tables from the disabled server, although this was unnecessary. For example, the query `SELECT ... FROM system.parts` should work only with MergeTree tables and not touch the MySQL database at all. [#16032](https://github.com/ClickHouse/ClickHouse/pull/16032) ([Kruglov Pavel](https://github.com/Avogar)).
### ClickHouse release v20.8.5.45-lts, 2020-10-29
#### Bug Fix
* Fix double free in case of exception in function `dictGet`. It could have happened if dictionary was loaded with error. [#16429](https://github.com/ClickHouse/ClickHouse/pull/16429) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix `GROUP BY` with totals/rollup/cube modifiers and min/max functions over group by keys. Fixes [#16393](https://github.com/ClickHouse/ClickHouse/issues/16393). [#16397](https://github.com/ClickHouse/ClickHouse/pull/16397) ([Anton Popov](https://github.com/CurtizJ)).
* Fix async Distributed INSERT w/ prefer_localhost_replica=0 and internal_replication. [#16358](https://github.com/ClickHouse/ClickHouse/pull/16358) ([Azat Khuzhin](https://github.com/azat)).
* Fix a possible memory leak during `GROUP BY` with string keys, caused by an error in `TwoLevelStringHashTable` implementation. [#16264](https://github.com/ClickHouse/ClickHouse/pull/16264) ([Amos Bird](https://github.com/amosbird)).
* Fix the case when memory can be overallocated regardless of the limit. This closes [#14560](https://github.com/ClickHouse/ClickHouse/issues/14560). [#16206](https://github.com/ClickHouse/ClickHouse/pull/16206) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix `ALTER MODIFY ... ORDER BY` query hang for `ReplicatedVersionedCollapsingMergeTree`. This fixes [#15980](https://github.com/ClickHouse/ClickHouse/issues/15980). [#16011](https://github.com/ClickHouse/ClickHouse/pull/16011) ([alesapin](https://github.com/alesapin)).
* Fix collate name & charset name parser and support `length = 0` for string type. [#16008](https://github.com/ClickHouse/ClickHouse/pull/16008) ([Winter Zhang](https://github.com/zhang2014)).
* Allow using the direct layout for dictionaries with complex keys. [#16007](https://github.com/ClickHouse/ClickHouse/pull/16007) ([Anton Popov](https://github.com/CurtizJ)).
* Prevent replica hang for 5-10 mins when replication error happens after a period of inactivity. [#15987](https://github.com/ClickHouse/ClickHouse/pull/15987) ([filimonov](https://github.com/filimonov)).
* Fix rare segfaults when inserting into or selecting from MaterializedView and concurrently dropping target table (for Atomic database engine). [#15984](https://github.com/ClickHouse/ClickHouse/pull/15984) ([tavplubix](https://github.com/tavplubix)).
* Fix ambiguity in parsing of settings profiles: `CREATE USER ... SETTINGS profile readonly` is now considered as using a profile named `readonly`, not a setting named `profile` with the readonly constraint. This fixes [#15628](https://github.com/ClickHouse/ClickHouse/issues/15628). [#15982](https://github.com/ClickHouse/ClickHouse/pull/15982) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix a crash when database creation fails. [#15954](https://github.com/ClickHouse/ClickHouse/pull/15954) ([Winter Zhang](https://github.com/zhang2014)).
* Fixed `DROP TABLE IF EXISTS` failure with `Table ... doesn't exist` error when the table is concurrently renamed (for the Atomic database engine). Fixed a rare deadlock when concurrently executing some DDL queries with multiple tables (like `DROP DATABASE` and `RENAME TABLE`). Fixed `DROP/DETACH DATABASE` failure with `Table ... doesn't exist` when concurrently executing `DROP/DETACH TABLE`. [#15934](https://github.com/ClickHouse/ClickHouse/pull/15934) ([tavplubix](https://github.com/tavplubix)).
* Fix incorrect empty result for query from `Distributed` table if query has `WHERE`, `PREWHERE` and `GLOBAL IN`. Fixes [#15792](https://github.com/ClickHouse/ClickHouse/issues/15792). [#15933](https://github.com/ClickHouse/ClickHouse/pull/15933) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix possible deadlocks in RBAC. [#15875](https://github.com/ClickHouse/ClickHouse/pull/15875) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix exception `Block structure mismatch` in `SELECT ... ORDER BY DESC` queries which were executed after `ALTER MODIFY COLUMN` query. Fixes [#15800](https://github.com/ClickHouse/ClickHouse/issues/15800). [#15852](https://github.com/ClickHouse/ClickHouse/pull/15852) ([alesapin](https://github.com/alesapin)).
* Fix some cases of queries in which only virtual columns are selected. Previously, a `Not found column _nothing in block` exception could be thrown. Fixes [#12298](https://github.com/ClickHouse/ClickHouse/issues/12298). [#15756](https://github.com/ClickHouse/ClickHouse/pull/15756) ([Anton Popov](https://github.com/CurtizJ)).
* Fix error `Cannot find column` which may happen at insertion into a `MATERIALIZED VIEW` in case the query for the `MV` contains `ARRAY JOIN`. [#15717](https://github.com/ClickHouse/ClickHouse/pull/15717) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed too low default value of `max_replicated_logs_to_keep` setting, which might cause replicas to become lost too often. Improve lost replica recovery process by choosing the most up-to-date replica to clone. Also do not remove old parts from lost replica, detach them instead. [#15701](https://github.com/ClickHouse/ClickHouse/pull/15701) ([tavplubix](https://github.com/tavplubix)).
* Fix error `Cannot add simple transform to empty Pipe` which happened while reading from a `Buffer` table that has a different structure than the destination table. It was possible if the destination table returned an empty result for the query. Fixes [#15529](https://github.com/ClickHouse/ClickHouse/issues/15529). [#15662](https://github.com/ClickHouse/ClickHouse/pull/15662) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed a bug with globs in the S3 table function: the region from the URL was not applied to the S3 client configuration. [#15646](https://github.com/ClickHouse/ClickHouse/pull/15646) ([Vladimir Chebotarev](https://github.com/excitoon)).
* Decrement the `ReadonlyReplica` metric when detaching read-only tables. This fixes [#15598](https://github.com/ClickHouse/ClickHouse/issues/15598). [#15592](https://github.com/ClickHouse/ClickHouse/pull/15592) ([sundyli](https://github.com/sundy-li)).
* Throw an error when a single parameter is passed to ReplicatedMergeTree instead of ignoring it. [#15516](https://github.com/ClickHouse/ClickHouse/pull/15516) ([nvartolomei](https://github.com/nvartolomei)).
#### Improvement
* Now it's allowed to execute `ALTER ... ON CLUSTER` queries regardless of the `<internal_replication>` setting in cluster config. [#16075](https://github.com/ClickHouse/ClickHouse/pull/16075) ([alesapin](https://github.com/alesapin)).
* Unfold `{database}`, `{table}` and `{uuid}` macros in `ReplicatedMergeTree` arguments on table creation. [#16159](https://github.com/ClickHouse/ClickHouse/pull/16159) ([tavplubix](https://github.com/tavplubix)).
### ClickHouse release v20.8.4.11-lts, 2020-10-09
#### Bug Fix
* Fix the order of destruction for resources in `ReadFromStorage` step of query plan. It might cause crashes in rare cases. Possibly connected with [#15610](https://github.com/ClickHouse/ClickHouse/issues/15610). [#15645](https://github.com/ClickHouse/ClickHouse/pull/15645) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed `Element ... is not a constant expression` error when using `JSON*` function result in `VALUES`, `LIMIT` or right side of `IN` operator. [#15589](https://github.com/ClickHouse/ClickHouse/pull/15589) ([tavplubix](https://github.com/tavplubix)).
* Prevent the possibility of error message `Could not calculate available disk space (statvfs), errno: 4, strerror: Interrupted system call`. This fixes [#15541](https://github.com/ClickHouse/ClickHouse/issues/15541). [#15557](https://github.com/ClickHouse/ClickHouse/pull/15557) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Significantly reduce memory usage in AggregatingInOrderTransform/optimize_aggregation_in_order. [#15543](https://github.com/ClickHouse/ClickHouse/pull/15543) ([Azat Khuzhin](https://github.com/azat)).
* Mutation might hang waiting for some non-existent part after `MOVE` or `REPLACE PARTITION` or, in rare cases, after `DETACH` or `DROP PARTITION`. It's fixed. [#15537](https://github.com/ClickHouse/ClickHouse/pull/15537) ([tavplubix](https://github.com/tavplubix)).
* Fix a bug where the `ILIKE` operator stopped being case-insensitive if `LIKE` with the same pattern was executed. [#15536](https://github.com/ClickHouse/ClickHouse/pull/15536) ([alesapin](https://github.com/alesapin)).
* Fix `Missing columns` errors when selecting columns which are absent in data but depend on other columns which are also absent in data. Fixes [#15530](https://github.com/ClickHouse/ClickHouse/issues/15530). [#15532](https://github.com/ClickHouse/ClickHouse/pull/15532) ([alesapin](https://github.com/alesapin)).
* Fix bug with event subscription in DDLWorker which rarely may lead to query hangs in `ON CLUSTER`. Introduced in [#13450](https://github.com/ClickHouse/ClickHouse/issues/13450). [#15477](https://github.com/ClickHouse/ClickHouse/pull/15477) ([alesapin](https://github.com/alesapin)).
* Report proper error when the second argument of `boundingRatio` aggregate function has a wrong type. [#15407](https://github.com/ClickHouse/ClickHouse/pull/15407) ([detailyang](https://github.com/detailyang)).
* Fix race condition during MergeTree table rename and background cleanup. [#15304](https://github.com/ClickHouse/ClickHouse/pull/15304) ([alesapin](https://github.com/alesapin)).
* Fix rare race condition on server startup when system.logs are enabled. [#15300](https://github.com/ClickHouse/ClickHouse/pull/15300) ([alesapin](https://github.com/alesapin)).
* Fix MSan report in QueryLog. Uninitialized memory can be used for the field `memory_usage`. [#15258](https://github.com/ClickHouse/ClickHouse/pull/15258) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix instance crash when using `joinGet` with `LowCardinality` types. This fixes [#15214](https://github.com/ClickHouse/ClickHouse/issues/15214). [#15220](https://github.com/ClickHouse/ClickHouse/pull/15220) ([Amos Bird](https://github.com/amosbird)).
* Fix a bug in the `Buffer` table engine which did not allow inserting data with a new structure into `Buffer` after an `ALTER` query. Fixes [#15117](https://github.com/ClickHouse/ClickHouse/issues/15117). [#15192](https://github.com/ClickHouse/ClickHouse/pull/15192) ([alesapin](https://github.com/alesapin)).
* Adjust the `decimals` field size in the MySQL column definition packet. [#15152](https://github.com/ClickHouse/ClickHouse/pull/15152) ([maqroll](https://github.com/maqroll)).
* We already use padded comparison between `String` and `FixedString` (see [FunctionsComparison.h#L333](https://github.com/ClickHouse/ClickHouse/blob/master/src/Functions/FunctionsComparison.h#L333)). This PR applies the same logic to field comparison, which corrects the usage of `FixedString` as primary keys. This fixes [#14908](https://github.com/ClickHouse/ClickHouse/issues/14908). [#15033](https://github.com/ClickHouse/ClickHouse/pull/15033) ([Amos Bird](https://github.com/amosbird)).
* If the function `bar` was called with specially crafted arguments, a buffer overflow was possible. This closes [#13926](https://github.com/ClickHouse/ClickHouse/issues/13926). [#15028](https://github.com/ClickHouse/ClickHouse/pull/15028) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fixed `Cannot rename ... errno: 22, strerror: Invalid argument` error on DDL query execution in Atomic database when running clickhouse-server in docker on Mac OS. [#15024](https://github.com/ClickHouse/ClickHouse/pull/15024) ([tavplubix](https://github.com/tavplubix)).
* Now settings `number_of_free_entries_in_pool_to_execute_mutation` and `number_of_free_entries_in_pool_to_lower_max_size_of_merge` can be equal to `background_pool_size`. [#14975](https://github.com/ClickHouse/ClickHouse/pull/14975) ([alesapin](https://github.com/alesapin)).
* Make predicate push-down work when a subquery contains the `finalizeAggregation` function. Fixes [#14847](https://github.com/ClickHouse/ClickHouse/issues/14847). [#14937](https://github.com/ClickHouse/ClickHouse/pull/14937) ([filimonov](https://github.com/filimonov)).
* Publish CPU frequencies per logical core in `system.asynchronous_metrics`. This fixes [#14923](https://github.com/ClickHouse/ClickHouse/issues/14923). [#14924](https://github.com/ClickHouse/ClickHouse/pull/14924) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Fixed `.metadata.tmp File exists` error when using `MaterializeMySQL` database engine. [#14898](https://github.com/ClickHouse/ClickHouse/pull/14898) ([Winter Zhang](https://github.com/zhang2014)).
* Fix a problem where the server may get stuck on startup while talking to ZooKeeper, if the configuration files have to be fetched from ZK (using the `from_zk` include option). This fixes [#14814](https://github.com/ClickHouse/ClickHouse/issues/14814). [#14843](https://github.com/ClickHouse/ClickHouse/pull/14843) ([Alexander Kuzmenkov](https://github.com/akuzm)).
* Fix wrong monotonicity detection for narrowing `Int -> Int` casts of signed types. It might lead to an incorrect query result. This bug was unveiled in [#14513](https://github.com/ClickHouse/ClickHouse/issues/14513). [#14783](https://github.com/ClickHouse/ClickHouse/pull/14783) ([Amos Bird](https://github.com/amosbird)).
* Fixed the incorrect sorting order of `Nullable` column. This fixes [#14344](https://github.com/ClickHouse/ClickHouse/issues/14344). [#14495](https://github.com/ClickHouse/ClickHouse/pull/14495) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
#### Improvement
* Now it's possible to change the type of version column for `VersionedCollapsingMergeTree` with `ALTER` query. [#15442](https://github.com/ClickHouse/ClickHouse/pull/15442) ([alesapin](https://github.com/alesapin)).
### ClickHouse release v20.8.3.18-stable, 2020-09-18
#### Bug Fix
* Fix the issue when some invocations of `extractAllGroups` function may trigger "Memory limit exceeded" error. This fixes [#13383](https://github.com/ClickHouse/ClickHouse/issues/13383). [#14889](https://github.com/ClickHouse/ClickHouse/pull/14889) ([alexey-milovidov](https://github.com/alexey-milovidov)).
* Fix SIGSEGV for an attempt to INSERT into StorageFile(fd). [#14887](https://github.com/ClickHouse/ClickHouse/pull/14887) ([Azat Khuzhin](https://github.com/azat)).
* Fix a rare error in `SELECT` queries when the queried column has a `DEFAULT` expression which depends on another column which also has a `DEFAULT` and is not present in the select query and does not exist on disk. Partially fixes [#14531](https://github.com/ClickHouse/ClickHouse/issues/14531). [#14845](https://github.com/ClickHouse/ClickHouse/pull/14845) ([alesapin](https://github.com/alesapin)).
* Fixed missed default database name in metadata of materialized view when executing `ALTER ... MODIFY QUERY`. [#14664](https://github.com/ClickHouse/ClickHouse/pull/14664) ([tavplubix](https://github.com/tavplubix)).
* Fix bug when `ALTER UPDATE` mutation with Nullable column in assignment expression and constant value (like `UPDATE x = 42`) leads to incorrect value in column or segfault. Fixes [#13634](https://github.com/ClickHouse/ClickHouse/issues/13634), [#14045](https://github.com/ClickHouse/ClickHouse/issues/14045). [#14646](https://github.com/ClickHouse/ClickHouse/pull/14646) ([alesapin](https://github.com/alesapin)).
* Fix a wrong Decimal multiplication result that caused a wrong decimal scale of the result column. [#14603](https://github.com/ClickHouse/ClickHouse/pull/14603) ([Artem Zuikov](https://github.com/4ertus2)).
* Added a check, as neither calling `lc->isNullable()` nor calling `ls->getDictionaryPtr()->isNullable()` would return the correct result. [#14591](https://github.com/ClickHouse/ClickHouse/pull/14591) ([myrrc](https://github.com/myrrc)).
* Clean up the data directory after ZooKeeper exceptions during a `CREATE` query for tables with `ReplicatedMergeTree` engine. [#14563](https://github.com/ClickHouse/ClickHouse/pull/14563) ([Bharat Nallan](https://github.com/bharatnc)).
* Fix rare segfaults in functions with the `-Resample` combinator, which could appear as a result of overflow with very large parameters. [#14562](https://github.com/ClickHouse/ClickHouse/pull/14562) ([Anton Popov](https://github.com/CurtizJ)).
#### Improvement
* Speed up server shutdown process if there are ongoing S3 requests. [#14858](https://github.com/ClickHouse/ClickHouse/pull/14858) ([Pavel Kovalenko](https://github.com/Jokser)).
* Allow using multi-volume storage configuration in storage Distributed. [#14839](https://github.com/ClickHouse/ClickHouse/pull/14839) ([Pavel Kovalenko](https://github.com/Jokser)).
* Speed up server shutdown process if there are ongoing S3 requests. [#14496](https://github.com/ClickHouse/ClickHouse/pull/14496) ([Pavel Kovalenko](https://github.com/Jokser)).
* Support custom codecs in compact parts. [#12183](https://github.com/ClickHouse/ClickHouse/pull/12183) ([Anton Popov](https://github.com/CurtizJ)).
### ClickHouse release v20.8.2.3-stable, 2020-09-08
#### Backward Incompatible Change
@@ -1755,6 +1964,74 @@ No changes compared to v20.4.3.16-stable.
## ClickHouse release v20.3
### ClickHouse release v20.3.21.2-lts, 2020-11-02
#### Bug Fix
* Fix dictGet in sharding_key (and similar places, i.e. when the function context is stored permanently). [#16205](https://github.com/ClickHouse/ClickHouse/pull/16205) ([Azat Khuzhin](https://github.com/azat)).
* Fix incorrect empty result for query from `Distributed` table if query has `WHERE`, `PREWHERE` and `GLOBAL IN`. Fixes [#15792](https://github.com/ClickHouse/ClickHouse/issues/15792). [#15933](https://github.com/ClickHouse/ClickHouse/pull/15933) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix missing or excessive headers in `TSV/CSVWithNames` formats. This fixes [#12504](https://github.com/ClickHouse/ClickHouse/issues/12504). [#13343](https://github.com/ClickHouse/ClickHouse/pull/13343) ([Azat Khuzhin](https://github.com/azat)).
### ClickHouse release v20.3.20.6-lts, 2020-10-09
#### Bug Fix
* Mutation might hang waiting for some non-existent part after `MOVE` or `REPLACE PARTITION` or, in rare cases, after `DETACH` or `DROP PARTITION`. It's fixed. [#15724](https://github.com/ClickHouse/ClickHouse/pull/15724), [#15537](https://github.com/ClickHouse/ClickHouse/pull/15537) ([tavplubix](https://github.com/tavplubix)).
* Fix hangs of queries with a lot of subqueries to the same table of the `MySQL` engine. Previously, if there were more than 16 subqueries to the same `MySQL` table in a query, it hung forever. [#15299](https://github.com/ClickHouse/ClickHouse/pull/15299) ([Anton Popov](https://github.com/CurtizJ)).
* Fix 'Unknown identifier' in GROUP BY when query has JOIN over Merge table. [#15242](https://github.com/ClickHouse/ClickHouse/pull/15242) ([Artem Zuikov](https://github.com/4ertus2)).
* Make predicate push-down work when a subquery contains the `finalizeAggregation` function. Fixes [#14847](https://github.com/ClickHouse/ClickHouse/issues/14847). [#14937](https://github.com/ClickHouse/ClickHouse/pull/14937) ([filimonov](https://github.com/filimonov)).
* Concurrent `ALTER ... REPLACE/MOVE PARTITION ...` queries might cause deadlock. It's fixed. [#13626](https://github.com/ClickHouse/ClickHouse/pull/13626) ([tavplubix](https://github.com/tavplubix)).
### ClickHouse release v20.3.19.4-lts, 2020-09-18
#### Bug Fix
* Fix a rare error in `SELECT` queries when the queried column has a `DEFAULT` expression which depends on another column which also has a `DEFAULT` and is not present in the select query and does not exist on disk. Partially fixes [#14531](https://github.com/ClickHouse/ClickHouse/issues/14531). [#14845](https://github.com/ClickHouse/ClickHouse/pull/14845) ([alesapin](https://github.com/alesapin)).
* Fix bug when `ALTER UPDATE` mutation with Nullable column in assignment expression and constant value (like `UPDATE x = 42`) leads to incorrect value in column or segfault. Fixes [#13634](https://github.com/ClickHouse/ClickHouse/issues/13634), [#14045](https://github.com/ClickHouse/ClickHouse/issues/14045). [#14646](https://github.com/ClickHouse/ClickHouse/pull/14646) ([alesapin](https://github.com/alesapin)).
* Fix a wrong Decimal multiplication result that caused a wrong decimal scale of the result column. [#14603](https://github.com/ClickHouse/ClickHouse/pull/14603) ([Artem Zuikov](https://github.com/4ertus2)).
#### Improvement
* Support custom codecs in compact parts. [#12183](https://github.com/ClickHouse/ClickHouse/pull/12183) ([Anton Popov](https://github.com/CurtizJ)).
### ClickHouse release v20.3.18.10-lts, 2020-09-08
#### Bug Fix
* Stop query execution if exception happened in `PipelineExecutor` itself. This could prevent rare possible query hung. Continuation of [#14334](https://github.com/ClickHouse/ClickHouse/issues/14334). [#14402](https://github.com/ClickHouse/ClickHouse/pull/14402) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fixed the behaviour when a cache dictionary sometimes returned the default value instead of the value present in the source. [#13624](https://github.com/ClickHouse/ClickHouse/pull/13624) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov)).
* Fix parsing row policies from users.xml when names of databases or tables contain dots. This fixes [#5779](https://github.com/ClickHouse/ClickHouse/issues/5779), [#12527](https://github.com/ClickHouse/ClickHouse/issues/12527). [#13199](https://github.com/ClickHouse/ClickHouse/pull/13199) ([Vitaly Baranov](https://github.com/vitlibar)).
* Fix CAST(Nullable(String), Enum()). [#12745](https://github.com/ClickHouse/ClickHouse/pull/12745) ([Azat Khuzhin](https://github.com/azat)).
* Fixed data race in `text_log`. It does not correspond to any real bug. [#9726](https://github.com/ClickHouse/ClickHouse/pull/9726) ([alexey-milovidov](https://github.com/alexey-milovidov)).
#### Improvement
* Fix a wrong error for long queries. It was possible to get a syntax error other than `Max query size exceeded` for a correct query. [#13928](https://github.com/ClickHouse/ClickHouse/pull/13928) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Return NULL/zero when value is not parsed completely in parseDateTimeBestEffortOrNull/Zero functions. This fixes [#7876](https://github.com/ClickHouse/ClickHouse/issues/7876). [#11653](https://github.com/ClickHouse/ClickHouse/pull/11653) ([alexey-milovidov](https://github.com/alexey-milovidov)).
#### Performance Improvement
* Slightly optimize very short queries with LowCardinality. [#14129](https://github.com/ClickHouse/ClickHouse/pull/14129) ([Anton Popov](https://github.com/CurtizJ)).
#### Build/Testing/Packaging Improvement
* Fix UBSan report (adding zero to nullptr) in HashTable that appeared after migration to clang-10. [#10638](https://github.com/ClickHouse/ClickHouse/pull/10638) ([alexey-milovidov](https://github.com/alexey-milovidov)).
### ClickHouse release v20.3.17.173-lts, 2020-08-15
#### Bug Fix
* Fix crash in JOIN with StorageMerge and `set enable_optimize_predicate_expression=1`. [#13679](https://github.com/ClickHouse/ClickHouse/pull/13679) ([Artem Zuikov](https://github.com/4ertus2)).
* Fix invalid return type for comparison of tuples with `NULL` elements. Fixes [#12461](https://github.com/ClickHouse/ClickHouse/issues/12461). [#13420](https://github.com/ClickHouse/ClickHouse/pull/13420) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Fix queries with constant columns and `ORDER BY` prefix of primary key. [#13396](https://github.com/ClickHouse/ClickHouse/pull/13396) ([Anton Popov](https://github.com/CurtizJ)).
* Return passed number for numbers with MSB set in roundUpToPowerOfTwoOrZero(). [#13234](https://github.com/ClickHouse/ClickHouse/pull/13234) ([Azat Khuzhin](https://github.com/azat)).
### ClickHouse release v20.3.16.165-lts 2020-08-10
#### Bug Fix

base/common/sort.h (new file, 37 lines)

@@ -0,0 +1,37 @@
#pragma once
#if !defined(ARCADIA_BUILD)
# include <miniselect/floyd_rivest_select.h> // Y_IGNORE
#else
# include <algorithm>
#endif
template <class RandomIt>
void nth_element(RandomIt first, RandomIt nth, RandomIt last)
{
#if !defined(ARCADIA_BUILD)
::miniselect::floyd_rivest_select(first, nth, last);
#else
::std::nth_element(first, nth, last);
#endif
}
template <class RandomIt>
void partial_sort(RandomIt first, RandomIt middle, RandomIt last)
{
#if !defined(ARCADIA_BUILD)
::miniselect::floyd_rivest_partial_sort(first, middle, last);
#else
::std::partial_sort(first, middle, last);
#endif
}
template <class RandomIt, class Compare>
void partial_sort(RandomIt first, RandomIt middle, RandomIt last, Compare compare)
{
#if !defined(ARCADIA_BUILD)
::miniselect::floyd_rivest_partial_sort(first, middle, last, compare);
#else
::std::partial_sort(first, middle, last, compare);
#endif
}
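For orientation, a minimal usage sketch of the wrappers above follows. It is hypothetical: the `<common/sort.h>` include assumes `base/` is on the include path, and `ARCADIA_BUILD` is defined here only so the sketch compiles against the `std::` fallback branch instead of requiring miniselect.

```cpp
#include <cstdio>
#include <vector>

#define ARCADIA_BUILD           // assumption: select the std:: fallback, avoiding the miniselect dependency
#include <common/sort.h>        // the header above, assuming base/ is on the include path

int main()
{
    std::vector<int> v{9, 1, 8, 2, 7, 3, 6, 4, 5};

    // Selection: afterwards v[4] holds the element that would sit at index 4
    // in a fully sorted vector (the median here).
    ::nth_element(v.begin(), v.begin() + 4, v.end());
    std::printf("median: %d\n", v[4]);                      // median: 5

    // Partial sort: the three smallest elements, in order, at the front.
    ::partial_sort(v.begin(), v.begin() + 3, v.end());
    std::printf("smallest: %d %d %d\n", v[0], v[1], v[2]);  // smallest: 1 2 3
}
```

With miniselect available, the Floyd-Rivest branch keeps exactly the same call signatures, so call sites do not change.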


@@ -5,6 +5,9 @@
/// (See at http://www.boost.org/LICENSE_1_0.txt)
#include "throwError.h"
#include <cfloat>
#include <limits>
#include <cassert>
namespace wide
{
@@ -192,7 +195,7 @@ struct integer<Bits, Signed>::_impl
}
template <typename T>
constexpr static auto to_Integral(T f) noexcept
__attribute__((no_sanitize("undefined"))) constexpr static auto to_Integral(T f) noexcept
{
if constexpr (std::is_same_v<T, __int128>)
return f;
@@ -225,25 +228,54 @@ struct integer<Bits, Signed>::_impl
self.items[i] = 0;
}
constexpr static void wide_integer_from_bultin(integer<Bits, Signed> & self, double rhs) noexcept
{
if ((rhs > 0 && rhs < std::numeric_limits<uint64_t>::max()) || (rhs < 0 && rhs > std::numeric_limits<int64_t>::min()))
/**
* N.B. t is constructed from double, so max(t) = max(double) ~ 2^310
* the recursive call happens when t / 2^64 > 2^64, so there won't be more than 5 of them.
*
* t = a1 * max_int + b1, a1 > max_int, b1 < max_int
* a1 = a2 * max_int + b2, a2 > max_int, b2 < max_int
* a_(n - 1) = a_n * max_int + b2, a_n <= max_int <- base case.
*/
template <class T>
constexpr static void set_multiplier(integer<Bits, Signed> & self, T t) noexcept {
constexpr uint64_t max_int = std::numeric_limits<uint64_t>::max();
const T alpha = t / max_int;
if (alpha <= max_int)
self = static_cast<uint64_t>(alpha);
else // max(double) / 2^64 will surely contain less than 52 precision bits, so speed up computations.
set_multiplier<double>(self, alpha);
self *= max_int;
self += static_cast<uint64_t>(t - alpha * max_int); // += b_i
}
constexpr static void wide_integer_from_bultin(integer<Bits, Signed>& self, double rhs) noexcept {
constexpr int64_t max_int = std::numeric_limits<int64_t>::max();
constexpr int64_t min_int = std::numeric_limits<int64_t>::min();
/// There are values in int64 that have more than 53 significant bits (in terms of double
/// representation). Such values, being promoted to double, are rounded up or down. If they are rounded up,
/// the result may not fit in 64 bits.
/// The example of such a number is 9.22337e+18.
/// As to_Integral does a static_cast to int64_t, it may result in UB.
/// The necessary check here is that long double has enough significant (mantissa) bits to store the
/// int64_t max value precisely.
static_assert(LDBL_MANT_DIG >= 64,
"On your system long double has less than 64 precision bits,"
"which may result in UB when initializing double from int64_t");
if ((rhs > 0 && rhs < max_int) || (rhs < 0 && rhs > min_int))
{
self = to_Integral(rhs);
self = static_cast<int64_t>(rhs);
return;
}
long double r = rhs;
if (r < 0)
r = -r;
const long double rhs_long_double = (static_cast<long double>(rhs) < 0)
? -static_cast<long double>(rhs)
: rhs;
size_t count = r / std::numeric_limits<uint64_t>::max();
self = count;
self *= std::numeric_limits<uint64_t>::max();
long double to_diff = count;
to_diff *= std::numeric_limits<uint64_t>::max();
self += to_Integral(r - to_diff);
set_multiplier(self, rhs_long_double);
if (rhs < 0)
self = -self;
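To make the limb decomposition above concrete, here is a small self-contained sketch of the same `t = a * base + b` idea. It is an illustration, not the library code: it uses the power-of-two base 2^64 (so every `long double` scaling step is exact) and the compiler's `unsigned __int128` extension in place of `wide::integer`, and it also demonstrates the rounding hazard that the `static_assert` guards against.

```cpp
#include <cfloat>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <limits>

static_assert(LDBL_MANT_DIG >= 64, "this sketch assumes an x86-style 80-bit long double");

// Split a non-negative long double into two base-2^64 limbs: t = hi * 2^64 + lo.
// Same shape as the set_multiplier recurrence above, with 2^64 instead of max(uint64_t).
unsigned __int128 limbsFrom(long double t)
{
    const long double hi = std::floor(std::ldexp(t, -64));  // a = floor(t / 2^64)
    const long double lo = t - std::ldexp(hi, 64);           // b = t - a * 2^64, 0 <= b < 2^64
    return (static_cast<unsigned __int128>(static_cast<uint64_t>(hi)) << 64)
        | static_cast<uint64_t>(lo);
}

int main()
{
    const long double t = std::ldexp(1.0L, 100) + std::ldexp(1.0L, 40);  // 2^100 + 2^40, exact in long double
    const unsigned __int128 v = limbsFrom(t);

    // Round-trip check: reassembling the limbs recovers t exactly.
    const long double back =
        std::ldexp(static_cast<long double>(static_cast<uint64_t>(v >> 64)), 64)
        + static_cast<long double>(static_cast<uint64_t>(v));
    std::printf("round-trip exact: %d\n", back == t);  // prints 1

    // The hazard described in the comment above: INT64_MAX needs 63 significant bits,
    // so a 53-bit double rounds it *up* to 2^63, one past the maximum.
    std::printf("%.17g\n", static_cast<double>(std::numeric_limits<int64_t>::max()));  // 9.2233720368547758e+18
}
```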


@@ -1,4 +1,6 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)
LIBRARY()
ADDINCL(


@@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
ADDINCL(


@@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
NO_COMPILER_WARNINGS()


@@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(


@@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
ADDINCL (GLOBAL clickhouse/base/pcg-random)


@@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
CFLAGS(-g0)


@@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
ADDINCL(GLOBAL clickhouse/base/widechar_width)


@@ -1,3 +1,5 @@
OWNER(g:clickhouse)
RECURSE(
common
daemon

contrib/libunwind (vendored submodule)

@@ -1 +1 @@
Subproject commit 198458b35f100da32bd3e74c2a3ce8d236db299b
Subproject commit 7d78d3618910752c256b2b58c3895f4efea47fac


@@ -287,6 +287,8 @@ TESTS_TO_SKIP=(
01322_ttest_scipy
01545_system_errors
# Checks system.errors
01563_distributed_query_finish
)
time clickhouse-test -j 8 --order=random --no-long --testname --shard --zookeeper --skip "${TESTS_TO_SKIP[@]}" 2>&1 | ts '%Y-%m-%d %H:%M:%S' | tee "$FASTTEST_OUTPUT/test_log.txt"


@@ -16,7 +16,7 @@
<max_execution_time>300</max_execution_time>
<!-- One NUMA node w/o hyperthreading -->
<max_threads>20</max_threads>
<max_threads>12</max_threads>
</default>
</profiles>
</yandex>


@@ -29,7 +29,7 @@ def dowload_with_progress(url, path):
logging.info("Downloading from %s to temp path %s", url, path)
for i in range(RETRIES_COUNT):
try:
with open(path, 'w') as f:
with open(path, 'wb') as f:
response = requests.get(url, stream=True)
response.raise_for_status()
total_length = response.headers.get('content-length')


@@ -26,6 +26,9 @@ toc_title: Client Libraries
- [go-clickhouse](https://github.com/roistat/go-clickhouse)
- [mailrugo-clickhouse](https://github.com/mailru/go-clickhouse)
- [golang-clickhouse](https://github.com/leprosus/golang-clickhouse)
- Swift
- [ClickHouseNIO](https://github.com/patrick-zippenfenig/ClickHouseNIO)
- [ClickHouseVapor ORM](https://github.com/patrick-zippenfenig/ClickHouseVapor)
- NodeJs
- [clickhouse (NodeJs)](https://github.com/TimonKK/clickhouse)
- [node-clickhouse](https://github.com/apla/node-clickhouse)


@@ -1081,4 +1081,45 @@
- [Access Control and Account Management](../../operations/access-rights.md#access-control)
## user_directories {#user_directories}
Section of the configuration file that contains settings:
- The path to the configuration file with predefined users.
- The path to the folder where users created by SQL commands are stored.
If this section is specified, the path from [users_config](../../operations/server-configuration-parameters/settings.md#users-config) and [access_control_path](../../operations/server-configuration-parameters/settings.md#access_control_path) won't be used.
The `user_directories` section can contain any number of items; the order of the items denotes their precedence (the higher the item, the higher the precedence).
**Example**
``` xml
<user_directories>
<users_xml>
<path>/etc/clickhouse-server/users.xml</path>
</users_xml>
<local_directory>
<path>/var/lib/clickhouse/access/</path>
</local_directory>
</user_directories>
```
You can also specify the setting `memory`, which means storing information only in memory, without writing to disk, and `ldap`, which means storing information on an LDAP server.
To add an LDAP server as a remote user directory of users that are not defined locally, define a single `ldap` section with the following parameters:
- `server` — one of the LDAP server names defined in the `ldap_servers` config section. This parameter is mandatory and cannot be empty.
- `roles` — a section with a list of locally defined roles that will be assigned to each user retrieved from the LDAP server. If no roles are specified, the user will not be able to perform any actions after authentication. If any of the listed roles is not defined locally at the time of authentication, the authentication attempt will fail as if the provided password was incorrect.
**Example**
``` xml
<ldap>
<server>my_ldap_server</server>
<roles>
<my_local_role1 />
<my_local_role2 />
</roles>
</ldap>
```
[Original article](https://clickhouse.tech/docs/en/operations/server_configuration_parameters/settings/) <!--hide-->


@@ -306,3 +306,67 @@ execute_native_thread_routine
start_thread
clone
```
## tid {#tid}
Returns the id of the thread in which the current [Block](https://clickhouse.tech/docs/en/development/architecture/#block) is processed.
**Syntax**
``` sql
tid()
```
**Returned value**
- Current thread id. [UInt64](../../sql-reference/data-types/int-uint.md#uint-ranges).
**Example**
Query:
``` sql
SELECT tid();
```
Result:
``` text
┌─tid()─┐
│ 3878 │
└───────┘
```
## logTrace {#logtrace}
Emits a trace log message to the server log for each [Block](https://clickhouse.tech/docs/en/development/architecture/#block).
**Syntax**
``` sql
logTrace('message')
```
**Parameters**
- `message` — The message that is emitted to the server log. [String](../../sql-reference/data-types/string.md#string).
**Returned value**
- Always returns 0.
**Example**
Query:
``` sql
SELECT logTrace('logTrace message');
```
Result:
``` text
┌─logTrace('logTrace message')─┐
│ 0 │
└──────────────────────────────┘
```
[Original article](https://clickhouse.tech/docs/en/query_language/functions/introspection/) <!--hide-->


@@ -204,7 +204,7 @@ SYSTEM STOP MOVES [[db.]merge_tree_family_table_name]
## Managing ReplicatedMergeTree Tables {#query-language-system-replicated}
ClickHouse can manage background replication related processes in [ReplicatedMergeTree](../../engines/table-engines/mergetree-family/replacingmergetree.md) tables.
ClickHouse can manage background replication related processes in [ReplicatedMergeTree](../../engines/table-engines/mergetree-family/replication/#table_engines-replication) tables.
### STOP FETCHES {#query_language-system-stop-fetches}


@@ -57,7 +57,7 @@ Identifiers are:
Identifiers can be quoted or non-quoted. The latter is preferred.
Non-quoted identifiers must match the regex `^[a-zA-Z_][0-9a-zA-Z_]*$` and can not be equal to [keywords](#syntax-keywords). Examples: `x, _1, X_y__Z123_.`
Non-quoted identifiers must match the regex `^[0-9a-zA-Z_]*[a-zA-Z_]$` and can not be equal to [keywords](#syntax-keywords). Examples: `x, _1, X_y__Z123_.`
If you want to use identifiers the same as keywords or you want to use other symbols in identifiers, quote it using double quotes or backticks, for example, `"id"`, `` `id` ``.


@@ -1068,4 +1068,45 @@ ClickHouse uses ZooKeeper to store metadata
- [Access Control](../access-rights.md#access-control)
## user_directories {#user_directories}
Section of the configuration file that contains settings:
- The path to the configuration file with predefined users.
- The path to the file that stores users created by SQL commands.
If this section is defined, the path from [users_config](../../operations/server-configuration-parameters/settings.md#users-config) and [access_control_path](../../operations/server-configuration-parameters/settings.md#access_control_path) is not used.
The `user_directories` section can contain any number of items; the order of the items denotes their precedence (the higher the item, the higher the precedence).
**Example**
``` xml
<user_directories>
<users_xml>
<path>/etc/clickhouse-server/users.xml</path>
</users_xml>
<local_directory>
<path>/var/lib/clickhouse/access/</path>
</local_directory>
</user_directories>
```
You can also specify the setting `memory`, which means storing information only in memory, without writing to disk, and `ldap`, which means storing information on an [LDAP server](https://en.wikipedia.org/wiki/Lightweight_Directory_Access_Protocol).
To add an LDAP server as a remote user directory of users that are not defined locally, define a single `ldap` section with the following parameters:
- `server` — the name of one of the LDAP servers defined in the `ldap_servers` section of the config file. This parameter is mandatory and cannot be empty.
- `roles` — a section with a list of locally defined roles that will be assigned to each user retrieved from the LDAP server. If no roles are specified, the user will not be able to perform any actions after authentication. If any of the listed roles is not defined locally at the time of authentication, the authentication attempt will fail as if the provided password was incorrect.
**Example**
``` xml
<ldap>
<server>my_ldap_server</server>
<roles>
<my_local_role1 />
<my_local_role2 />
</roles>
</ldap>
```
[Original article](https://clickhouse.tech/docs/ru/operations/server_configuration_parameters/settings/) <!--hide-->

View File

@ -1157,6 +1157,7 @@ SELECT arrayCumSum([1, 1, 1, 1]) AS res
┌─res──────────┐
│ [1, 2, 3, 4] │
└──────────────┘
```
## arrayAUC {#arrayauc}

View File

@ -306,3 +306,68 @@ execute_native_thread_routine
start_thread
clone
```
## tid {#tid}
Returns the id of the thread in which the current [Block](https://clickhouse.tech/docs/ru/development/architecture/#block) is processed.
**Syntax**
``` sql
tid()
```
**Returned value**
- Id of the current thread. [UInt64](../../sql-reference/data-types/int-uint.md#uint-ranges).
**Example**
Query:
``` sql
SELECT tid();
```
Result:
``` text
┌─tid()─┐
│ 3878 │
└───────┘
```
## logTrace {#logtrace}
Emits a message to the server log for each [Block](https://clickhouse.tech/docs/ru/development/architecture/#block).
**Syntax**
``` sql
logTrace('message')
```
**Parameters**
- `message` — message that is emitted to the server log. [String](../../sql-reference/data-types/string.md#string).
**Returned value**
- Always returns 0.
**Example**
Query:
``` sql
SELECT logTrace('logTrace message');
```
Result:
``` text
┌─logTrace('logTrace message')─┐
│ 0 │
└──────────────────────────────┘
```
[Original article](https://clickhouse.tech/docs/en/query_language/functions/introspection/) <!--hide-->

View File

@ -21,7 +21,7 @@ mkdocs-htmlproofer-plugin==0.0.3
mkdocs-macros-plugin==0.4.20
nltk==3.5
nose==1.3.7
protobuf==3.13.0
protobuf==3.14.0
numpy==1.19.2
Pygments==2.5.2
pymdown-extensions==8.0

View File

@ -329,14 +329,20 @@ int mainEntryClickHouseInstall(int argc, char ** argv)
bool has_password_for_default_user = false;
if (!fs::exists(main_config_file))
if (!fs::exists(config_d))
{
fmt::print("Creating config directory {} that is used for tweaks of main server configuration.\n", config_d.string());
fs::create_directory(config_d);
}
if (!fs::exists(users_d))
{
fmt::print("Creating config directory {} that is used for tweaks of users configuration.\n", users_d.string());
fs::create_directory(users_d);
}
if (!fs::exists(main_config_file))
{
std::string_view main_config_content = getResource("config.xml");
if (main_config_content.empty())
{
@ -349,7 +355,30 @@ int mainEntryClickHouseInstall(int argc, char ** argv)
out.sync();
out.finalize();
}
}
else
{
fmt::print("Config file {} already exists, will keep it and extract path info from it.\n", main_config_file.string());
ConfigProcessor processor(main_config_file.string(), /* throw_on_bad_incl = */ false, /* log_to_console = */ false);
ConfigurationPtr configuration(new Poco::Util::XMLConfiguration(processor.processConfig()));
if (configuration->has("path"))
{
data_path = configuration->getString("path");
fmt::print("{} has {} as data path.\n", main_config_file.string(), data_path);
}
if (configuration->has("logger.log"))
{
log_path = fs::path(configuration->getString("logger.log")).remove_filename();
fmt::print("{} has {} as log path.\n", main_config_file.string(), log_path);
}
}
if (!fs::exists(users_config_file))
{
std::string_view users_config_content = getResource("users.xml");
if (users_config_content.empty())
{
@ -365,38 +394,17 @@ int mainEntryClickHouseInstall(int argc, char ** argv)
}
else
{
{
fmt::print("Config file {} already exists, will keep it and extract path info from it.\n", main_config_file.string());
ConfigProcessor processor(main_config_file.string(), /* throw_on_bad_incl = */ false, /* log_to_console = */ false);
ConfigurationPtr configuration(new Poco::Util::XMLConfiguration(processor.processConfig()));
if (configuration->has("path"))
{
data_path = configuration->getString("path");
fmt::print("{} has {} as data path.\n", main_config_file.string(), data_path);
}
if (configuration->has("logger.log"))
{
log_path = fs::path(configuration->getString("logger.log")).remove_filename();
fmt::print("{} has {} as log path.\n", main_config_file.string(), log_path);
}
}
fmt::print("Users config file {} already exists, will keep it and extract users info from it.\n", users_config_file.string());
/// Check if password for default user already specified.
ConfigProcessor processor(users_config_file.string(), /* throw_on_bad_incl = */ false, /* log_to_console = */ false);
ConfigurationPtr configuration(new Poco::Util::XMLConfiguration(processor.processConfig()));
if (fs::exists(users_config_file))
if (!configuration->getString("users.default.password", "").empty()
|| configuration->getString("users.default.password_sha256_hex", "").empty()
|| configuration->getString("users.default.password_double_sha1_hex", "").empty())
{
ConfigProcessor processor(users_config_file.string(), /* throw_on_bad_incl = */ false, /* log_to_console = */ false);
ConfigurationPtr configuration(new Poco::Util::XMLConfiguration(processor.processConfig()));
if (!configuration->getString("users.default.password", "").empty()
|| configuration->getString("users.default.password_sha256_hex", "").empty()
|| configuration->getString("users.default.password_double_sha1_hex", "").empty())
{
has_password_for_default_user = true;
}
has_password_for_default_user = true;
}
}

View File

@ -1,3 +1,5 @@
OWNER(g:clickhouse)
PROGRAM(clickhouse-server)
PEERDIR(

View File

@ -1,3 +1,5 @@
OWNER(g:clickhouse)
PROGRAM(clickhouse)
CFLAGS(

View File

@ -1,4 +1,6 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(

View File

@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(

View File

@ -8,7 +8,7 @@ namespace DB
{
AggregateFunctionPtr AggregateFunctionCount::getOwnNullAdapter(
const AggregateFunctionPtr &, const DataTypes & types, const Array & params) const
const AggregateFunctionPtr &, const DataTypes & types, const Array & params, const AggregateFunctionProperties & /*properties*/) const
{
return std::make_shared<AggregateFunctionCountNotNullUnary>(types[0], params);
}

View File

@ -69,7 +69,7 @@ public:
}
AggregateFunctionPtr getOwnNullAdapter(
const AggregateFunctionPtr &, const DataTypes & types, const Array & params) const override;
const AggregateFunctionPtr &, const DataTypes & types, const Array & params, const AggregateFunctionProperties & /*properties*/) const override;
};

View File

@ -1,6 +1,7 @@
#include <AggregateFunctions/AggregateFunctionIf.h>
#include <AggregateFunctions/AggregateFunctionCombinatorFactory.h>
#include "registerAggregateFunctions.h"
#include "AggregateFunctionNull.h"
namespace DB
@ -8,6 +9,7 @@ namespace DB
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH;
}
@ -40,6 +42,164 @@ public:
}
};
/** There are two cases: for single argument and variadic.
* Code for single argument is much more efficient.
*/
template <bool result_is_nullable, bool serialize_flag>
class AggregateFunctionIfNullUnary final
: public AggregateFunctionNullBase<result_is_nullable, serialize_flag,
AggregateFunctionIfNullUnary<result_is_nullable, serialize_flag>>
{
private:
size_t num_arguments;
using Base = AggregateFunctionNullBase<result_is_nullable, serialize_flag,
AggregateFunctionIfNullUnary<result_is_nullable, serialize_flag>>;
public:
String getName() const override
{
return Base::getName();
}
AggregateFunctionIfNullUnary(AggregateFunctionPtr nested_function_, const DataTypes & arguments, const Array & params)
: Base(std::move(nested_function_), arguments, params), num_arguments(arguments.size())
{
if (num_arguments == 0)
throw Exception("Aggregate function " + getName() + " require at least one argument",
ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
}
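/// The condition is the last argument; it may be wrapped in Nullable, so unwrap it before reading the UInt8 values.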
static inline bool singleFilter(const IColumn ** columns, size_t row_num, size_t num_arguments)
{
const IColumn * filter_column = columns[num_arguments - 1];
if (const ColumnNullable * nullable_column = typeid_cast<const ColumnNullable *>(filter_column))
filter_column = nullable_column->getNestedColumnPtr().get();
return assert_cast<const ColumnUInt8 &>(*filter_column).getData()[row_num];
}
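/// Add the row to the nested function only if the first argument is not NULL and the condition passes.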
void add(AggregateDataPtr place, const IColumn ** columns, size_t row_num, Arena * arena) const override
{
const ColumnNullable * column = assert_cast<const ColumnNullable *>(columns[0]);
const IColumn * nested_column = &column->getNestedColumn();
if (!column->isNullAt(row_num) && singleFilter(columns, row_num, num_arguments))
{
this->setFlag(place);
this->nested_function->add(this->nestedPlace(place), &nested_column, row_num, arena);
}
}
};
template <bool result_is_nullable, bool serialize_flag, bool null_is_skipped>
class AggregateFunctionIfNullVariadic final
: public AggregateFunctionNullBase<result_is_nullable, serialize_flag,
AggregateFunctionIfNullVariadic<result_is_nullable, serialize_flag, null_is_skipped>>
{
public:
String getName() const override
{
return Base::getName();
}
AggregateFunctionIfNullVariadic(AggregateFunctionPtr nested_function_, const DataTypes & arguments, const Array & params)
: Base(std::move(nested_function_), arguments, params), number_of_arguments(arguments.size())
{
if (number_of_arguments == 1)
throw Exception("Logical error: single argument is passed to AggregateFunctionIfNullVariadic", ErrorCodes::LOGICAL_ERROR);
if (number_of_arguments > MAX_ARGS)
throw Exception("Maximum number of arguments for aggregate function with Nullable types is " + toString(size_t(MAX_ARGS)),
ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
for (size_t i = 0; i < number_of_arguments; ++i)
is_nullable[i] = arguments[i]->isNullable();
}
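/// Unlike the unary case, the columns passed here have already been unwrapped from Nullable by add(), so the condition can be read directly.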
static inline bool singleFilter(const IColumn ** columns, size_t row_num, size_t num_arguments)
{
return assert_cast<const ColumnUInt8 &>(*columns[num_arguments - 1]).getData()[row_num];
}
void add(AggregateDataPtr place, const IColumn ** columns, size_t row_num, Arena * arena) const override
{
/// This container stores the columns we really pass to the nested function.
const IColumn * nested_columns[number_of_arguments];
for (size_t i = 0; i < number_of_arguments; ++i)
{
if (is_nullable[i])
{
const ColumnNullable & nullable_col = assert_cast<const ColumnNullable &>(*columns[i]);
if (null_is_skipped && nullable_col.isNullAt(row_num))
{
/// If at least one column has a null value in the current row,
/// we don't process this row.
return;
}
nested_columns[i] = &nullable_col.getNestedColumn();
}
else
nested_columns[i] = columns[i];
}
if (singleFilter(nested_columns, row_num, number_of_arguments))
{
this->setFlag(place);
this->nested_function->add(this->nestedPlace(place), nested_columns, row_num, arena);
}
}
private:
using Base = AggregateFunctionNullBase<result_is_nullable, serialize_flag,
AggregateFunctionIfNullVariadic<result_is_nullable, serialize_flag, null_is_skipped>>;
enum { MAX_ARGS = 8 };
size_t number_of_arguments = 0;
std::array<char, MAX_ARGS> is_nullable; /// Plain array is better than std::vector due to one indirection less.
};
AggregateFunctionPtr AggregateFunctionIf::getOwnNullAdapter(
const AggregateFunctionPtr & nested_function, const DataTypes & arguments,
const Array & params, const AggregateFunctionProperties & properties) const
{
bool return_type_is_nullable = !properties.returns_default_when_only_null && getReturnType()->canBeInsideNullable();
size_t nullable_size = std::count_if(arguments.begin(), arguments.end(), [](const auto & element) { return element->isNullable(); });
return_type_is_nullable &= nullable_size != 1 || !arguments.back()->isNullable(); /// If the condition is the only nullable argument, the return type should be non-nullable.
bool serialize_flag = return_type_is_nullable || properties.returns_default_when_only_null;
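/// With at most one data argument plus the condition, and a Nullable first argument, use the cheaper unary implementation.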
if (arguments.size() <= 2 && arguments.front()->isNullable())
{
if (return_type_is_nullable)
{
return std::make_shared<AggregateFunctionIfNullUnary<true, true>>(nested_func, arguments, params);
}
else
{
if (serialize_flag)
return std::make_shared<AggregateFunctionIfNullUnary<false, true>>(nested_func, arguments, params);
else
return std::make_shared<AggregateFunctionIfNullUnary<false, false>>(nested_func, arguments, params);
}
}
else
{
if (return_type_is_nullable)
{
return std::make_shared<AggregateFunctionIfNullVariadic<true, true, true>>(nested_function, arguments, params);
}
else
{
if (serialize_flag)
return std::make_shared<AggregateFunctionIfNullVariadic<false, true, true>>(nested_function, arguments, params);
else
return std::make_shared<AggregateFunctionIfNullVariadic<false, false, true>>(nested_function, arguments, params);
}
}
}
void registerAggregateFunctionCombinatorIf(AggregateFunctionCombinatorFactory & factory)
{
factory.registerCombinator(std::make_shared<AggregateFunctionCombinatorIf>());

View File

@ -109,6 +109,10 @@ public:
{
return nested_func->isState();
}
AggregateFunctionPtr getOwnNullAdapter(
const AggregateFunctionPtr & nested_function, const DataTypes & arguments,
const Array & params, const AggregateFunctionProperties & properties) const override;
};
}

View File

@ -72,7 +72,7 @@ public:
assert(nested_function);
if (auto adapter = nested_function->getOwnNullAdapter(nested_function, arguments, params))
if (auto adapter = nested_function->getOwnNullAdapter(nested_function, arguments, params, properties))
return adapter;
/// If applied to aggregate function with -State combinator, we apply -Null combinator to its nested_function instead of itself.

View File

@ -239,7 +239,8 @@ public:
}
AggregateFunctionPtr getOwnNullAdapter(
const AggregateFunctionPtr & nested_function, const DataTypes & arguments, const Array & params) const override
const AggregateFunctionPtr & nested_function, const DataTypes & arguments, const Array & params,
const AggregateFunctionProperties & /*properties*/) const override
{
return std::make_shared<AggregateFunctionNullVariadic<false, false, false>>(nested_function, arguments, params);
}

View File

@ -33,6 +33,7 @@ using ConstAggregateDataPtr = const char *;
class IAggregateFunction;
using AggregateFunctionPtr = std::shared_ptr<IAggregateFunction>;
struct AggregateFunctionProperties;
/** Aggregate functions interface.
* Instances of classes with this interface do not contain the data itself for aggregation,
@ -185,7 +186,8 @@ public:
* arguments and params are for nested_function.
*/
virtual AggregateFunctionPtr getOwnNullAdapter(
const AggregateFunctionPtr & /*nested_function*/, const DataTypes & /*arguments*/, const Array & /*params*/) const
const AggregateFunctionPtr & /*nested_function*/, const DataTypes & /*arguments*/,
const Array & /*params*/, const AggregateFunctionProperties & /*properties*/) const
{
return nullptr;
}

View File

@ -1,19 +1,17 @@
#pragma once
#include <algorithm>
#include <common/types.h>
#include <IO/ReadBuffer.h>
#include <IO/VarInt.h>
#include <IO/WriteBuffer.h>
#include <Common/NaNUtils.h>
#include <Common/PODArray.h>
#include <common/sort.h>
#include <common/types.h>
#if !defined(ARCADIA_BUILD)
#include <miniselect/floyd_rivest_select.h> // Y_IGNORE
#endif
namespace DB
{
namespace ErrorCodes
{
extern const int NOT_IMPLEMENTED;
@ -89,12 +87,7 @@ struct QuantileExact : QuantileExactBase<Value, QuantileExact<Value>>
if (!array.empty())
{
size_t n = level < 1 ? level * array.size() : (array.size() - 1);
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_select(array.begin(), array.begin() + n, array.end()); /// NOTE You can think of the radix-select algorithm.
#else
std::nth_element(array.begin(), array.begin() + n, array.end()); /// NOTE You can think of the radix-select algorithm.
#endif
nth_element(array.begin(), array.begin() + n, array.end()); /// NOTE: You can think of the radix-select algorithm.
return array[n];
}
@ -113,12 +106,7 @@ struct QuantileExact : QuantileExactBase<Value, QuantileExact<Value>>
auto level = levels[indices[i]];
size_t n = level < 1 ? level * array.size() : (array.size() - 1);
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_select(array.begin() + prev_n, array.begin() + n, array.end());
#else
std::nth_element(array.begin() + prev_n, array.begin() + n, array.end());
#endif
nth_element(array.begin() + prev_n, array.begin() + n, array.end());
result[indices[i]] = array[n];
prev_n = n;
}
@ -154,14 +142,10 @@ struct QuantileExactExclusive : public QuantileExact<Value>
else if (n < 1)
return static_cast<Float64>(array[0]);
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_select(array.begin(), array.begin() + n - 1, array.end());
#else
std::nth_element(array.begin(), array.begin() + n - 1, array.end());
#endif
auto nth_element = std::min_element(array.begin() + n, array.end());
nth_element(array.begin(), array.begin() + n - 1, array.end());
auto nth_elem = std::min_element(array.begin() + n, array.end());
return static_cast<Float64>(array[n - 1]) + (h - n) * static_cast<Float64>(*nth_element - array[n - 1]);
return static_cast<Float64>(array[n - 1]) + (h - n) * static_cast<Float64>(*nth_elem - array[n - 1]);
}
return std::numeric_limits<Float64>::quiet_NaN();
@ -187,14 +171,10 @@ struct QuantileExactExclusive : public QuantileExact<Value>
result[indices[i]] = static_cast<Float64>(array[0]);
else
{
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_select(array.begin() + prev_n, array.begin() + n - 1, array.end());
#else
std::nth_element(array.begin() + prev_n, array.begin() + n - 1, array.end());
#endif
auto nth_element = std::min_element(array.begin() + n, array.end());
nth_element(array.begin() + prev_n, array.begin() + n - 1, array.end());
auto nth_elem = std::min_element(array.begin() + n, array.end());
result[indices[i]] = static_cast<Float64>(array[n - 1]) + (h - n) * static_cast<Float64>(*nth_element - array[n - 1]);
result[indices[i]] = static_cast<Float64>(array[n - 1]) + (h - n) * static_cast<Float64>(*nth_elem - array[n - 1]);
prev_n = n - 1;
}
}
@ -226,14 +206,10 @@ struct QuantileExactInclusive : public QuantileExact<Value>
return static_cast<Float64>(array[array.size() - 1]);
else if (n < 1)
return static_cast<Float64>(array[0]);
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_select(array.begin(), array.begin() + n - 1, array.end());
#else
std::nth_element(array.begin(), array.begin() + n - 1, array.end());
#endif
auto nth_element = std::min_element(array.begin() + n, array.end());
nth_element(array.begin(), array.begin() + n - 1, array.end());
auto nth_elem = std::min_element(array.begin() + n, array.end());
return static_cast<Float64>(array[n - 1]) + (h - n) * static_cast<Float64>(*nth_element - array[n - 1]);
return static_cast<Float64>(array[n - 1]) + (h - n) * static_cast<Float64>(*nth_elem - array[n - 1]);
}
return std::numeric_limits<Float64>::quiet_NaN();
@ -257,14 +233,10 @@ struct QuantileExactInclusive : public QuantileExact<Value>
result[indices[i]] = static_cast<Float64>(array[0]);
else
{
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_select(array.begin() + prev_n, array.begin() + n - 1, array.end());
#else
std::nth_element(array.begin() + prev_n, array.begin() + n - 1, array.end());
#endif
auto nth_element = std::min_element(array.begin() + n, array.end());
nth_element(array.begin() + prev_n, array.begin() + n - 1, array.end());
auto nth_elem = std::min_element(array.begin() + n, array.end());
result[indices[i]] = static_cast<Float64>(array[n - 1]) + (h - n) * static_cast<Float64>(*nth_element - array[n - 1]);
result[indices[i]] = static_cast<Float64>(array[n - 1]) + (h - n) * static_cast<Float64>(*nth_elem - array[n - 1]);
prev_n = n - 1;
}
}

View File

@ -1,15 +1,13 @@
#pragma once
#include <IO/ReadBuffer.h>
#include <IO/ReadHelpers.h>
#include <IO/WriteBuffer.h>
#include <IO/WriteHelpers.h>
#include <Common/HashTable/Hash.h>
#include <Common/PODArray.h>
#include <IO/ReadBuffer.h>
#include <IO/WriteBuffer.h>
#include <IO/ReadHelpers.h>
#include <IO/WriteHelpers.h>
#include <common/sort.h>
#if !defined(ARCADIA_BUILD)
#include <miniselect/floyd_rivest_select.h> // Y_IGNORE
#endif
namespace DB
{
@ -140,7 +138,7 @@ namespace detail
using Array = PODArray<UInt16, 128>;
mutable Array elems; /// mutable because array sorting is not considered a state change.
QuantileTimingMedium() {}
QuantileTimingMedium() = default;
QuantileTimingMedium(const UInt16 * begin, const UInt16 * end) : elems(begin, end) {}
void insert(UInt64 x)
@ -182,11 +180,7 @@ namespace detail
/// Sorting an array will not be considered a violation of constancy.
auto & array = elems;
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_select(array.begin(), array.begin() + n, array.end());
#else
std::nth_element(array.begin(), array.begin() + n, array.end());
#endif
nth_element(array.begin(), array.begin() + n, array.end());
quantile = array[n];
}
@ -207,11 +201,7 @@ namespace detail
? level * elems.size()
: (elems.size() - 1);
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_select(array.begin() + prev_n, array.begin() + n, array.end());
#else
std::nth_element(array.begin() + prev_n, array.begin() + n, array.end());
#endif
nth_element(array.begin() + prev_n, array.begin() + n, array.end());
result[level_index] = array[n];
prev_n = n;
@ -282,7 +272,7 @@ namespace detail
}
public:
Iterator(const QuantileTimingLarge & parent)
explicit Iterator(const QuantileTimingLarge & parent)
: begin(parent.count_small), pos(begin), end(&parent.count_big[BIG_SIZE])
{
adjust();
@ -429,8 +419,8 @@ namespace detail
template <typename ResultType>
void getMany(const double * levels, const size_t * indices, size_t size, ResultType * result) const
{
const auto indices_end = indices + size;
auto index = indices;
const auto * indices_end = indices + size;
const auto * index = indices;
UInt64 pos = std::ceil(count * levels[*index]);

View File

@ -1,4 +1,6 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(

View File

@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(

View File

@ -73,6 +73,11 @@ void Connection::connect(const ConnectionTimeouts & timeouts)
{
#if USE_SSL
socket = std::make_unique<Poco::Net::SecureStreamSocket>();
/// We resolve the IP when we open SecureStreamSocket, so to make Server Name Indication (SNI)
/// work we need to pass the host name separately. It will be sent in the TLS Hello packet to let
/// the server know which host we want to talk with (a single IP can serve requests for multiple hosts using SNI).
static_cast<Poco::Net::SecureStreamSocket*>(socket.get())->setPeerHostName(host);
#else
throw Exception{"tcp_secure protocol is disabled because poco library was built without NetSSL support.", ErrorCodes::SUPPORT_IS_DISABLED};
#endif

View File

@ -1,4 +1,6 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(

View File

@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(

View File

@ -9,6 +9,7 @@
#include <Columns/ColumnsCommon.h>
#include <common/unaligned.h>
#include <common/sort.h>
#include <DataStreams/ColumnGathererStream.h>
@ -20,10 +21,6 @@
#include <Common/WeakHash.h>
#include <Common/HashTable/Hash.h>
#if !defined(ARCADIA_BUILD)
#include <miniselect/floyd_rivest_select.h> // Y_IGNORE
#endif
namespace DB
{
@ -786,11 +783,7 @@ void ColumnArray::getPermutationImpl(size_t limit, Permutation & res, Comparator
auto less = [&cmp](size_t lhs, size_t rhs){ return cmp(lhs, rhs) < 0; };
if (limit)
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin(), res.begin() + limit, res.end(), less);
#else
std::partial_sort(res.begin(), res.begin() + limit, res.end(), less);
#endif
partial_sort(res.begin(), res.begin() + limit, res.end(), less);
else
std::sort(res.begin(), res.end(), less);
}
@ -842,11 +835,7 @@ void ColumnArray::updatePermutationImpl(size_t limit, Permutation & res, EqualRa
return;
/// Since then we are working inside the interval.
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less);
#else
std::partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less);
#endif
partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less);
auto new_first = first;
for (auto j = first + 1; j < limit; ++j)
{

View File

@ -7,10 +7,8 @@
#include <Core/BigInt.h>
#include <common/unaligned.h>
#include <common/sort.h>
#include <ext/scope_guard.h>
#if !defined(ARCADIA_BUILD)
#include <miniselect/floyd_rivest_select.h> // Y_IGNORE
#endif
#include <IO/WriteHelpers.h>
@ -57,32 +55,16 @@ void ColumnDecimal<T>::compareColumn(const IColumn & rhs, size_t rhs_row_num,
template <typename T>
StringRef ColumnDecimal<T>::serializeValueIntoArena(size_t n, Arena & arena, char const *& begin) const
{
if constexpr (is_POD)
{
auto * pos = arena.allocContinue(sizeof(T), begin);
memcpy(pos, &data[n], sizeof(T));
return StringRef(pos, sizeof(T));
}
else
{
char * pos = arena.allocContinue(BigInt<T>::size, begin);
return BigInt<Int256>::serialize(data[n], pos);
}
auto * pos = arena.allocContinue(sizeof(T), begin);
memcpy(pos, &data[n], sizeof(T));
return StringRef(pos, sizeof(T));
}
template <typename T>
const char * ColumnDecimal<T>::deserializeAndInsertFromArena(const char * pos)
{
if constexpr (is_POD)
{
data.push_back(unalignedLoad<T>(pos));
return pos + sizeof(T);
}
else
{
data.push_back(BigInt<Int256>::deserialize(pos));
return pos + BigInt<Int256>::size;
}
data.push_back(unalignedLoad<T>(pos));
return pos + sizeof(T);
}
template <typename T>
@ -197,21 +179,11 @@ void ColumnDecimal<T>::updatePermutation(bool reverse, size_t limit, int, IColum
/// Since then we are working inside the interval.
if (reverse)
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last,
partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last,
[this](size_t a, size_t b) { return data[a] > data[b]; });
#else
std::partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last,
[this](size_t a, size_t b) { return data[a] > data[b]; });
#endif
else
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last,
partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last,
[this](size_t a, size_t b) { return data[a] < data[b]; });
#else
std::partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last,
[this](size_t a, size_t b) { return data[a] > data[b]; });
#endif
auto new_first = first;
for (auto j = first + 1; j < limit; ++j)
{
@ -264,24 +236,13 @@ MutableColumnPtr ColumnDecimal<T>::cloneResized(size_t size) const
new_col.data.resize(size);
size_t count = std::min(this->size(), size);
if constexpr (is_POD)
{
memcpy(new_col.data.data(), data.data(), count * sizeof(data[0]));
if (size > count)
{
void * tail = &new_col.data[count];
memset(tail, 0, (size - count) * sizeof(T));
}
}
else
{
for (size_t i = 0; i < count; i++)
new_col.data[i] = data[i];
memcpy(new_col.data.data(), data.data(), count * sizeof(data[0]));
if (size > count)
for (size_t i = count; i < size; i++)
new_col.data[i] = T{};
if (size > count)
{
void * tail = &new_col.data[count];
memset(tail, 0, (size - count) * sizeof(T));
}
}
@ -291,16 +252,9 @@ MutableColumnPtr ColumnDecimal<T>::cloneResized(size_t size) const
template <typename T>
void ColumnDecimal<T>::insertData(const char * src, size_t /*length*/)
{
if constexpr (is_POD)
{
T tmp;
memcpy(&tmp, src, sizeof(T));
data.emplace_back(tmp);
}
else
{
data.push_back(BigInt<Int256>::deserialize(src));
}
T tmp;
memcpy(&tmp, src, sizeof(T));
data.emplace_back(tmp);
}
template <typename T>
@ -315,13 +269,8 @@ void ColumnDecimal<T>::insertRangeFrom(const IColumn & src, size_t start, size_t
size_t old_size = data.size();
data.resize(old_size + length);
if constexpr (is_POD)
memcpy(data.data() + old_size, &src_vec.data[start], length * sizeof(data[0]));
else
{
for (size_t i = 0; i < length; i++)
data[old_size + i] = src_vec.data[start + i];
}
memcpy(data.data() + old_size, &src_vec.data[start], length * sizeof(data[0]));
}
template <typename T>

View File

@ -1,25 +1,18 @@
#pragma once
#include <cmath>
#include <Common/typeid_cast.h>
#include <Columns/ColumnVectorHelper.h>
#include <Columns/IColumn.h>
#include <Columns/IColumnImpl.h>
#include <Columns/ColumnVectorHelper.h>
#include <Core/Field.h>
#if !defined(ARCADIA_BUILD)
#include <miniselect/floyd_rivest_select.h> // Y_IGNORE
#endif
#include <Core/DecimalFunctions.h>
#include <Common/typeid_cast.h>
#include <common/sort.h>
#include <cmath>
namespace DB
{
namespace ErrorCodes
{
extern const int NOT_IMPLEMENTED;
}
/// PaddedPODArray extended by Decimal scale
template <typename T>
class DecimalPaddedPODArray : public PaddedPODArray<T>
@ -57,43 +50,6 @@ private:
UInt32 scale;
};
/// std::vector extended by Decimal scale
template <typename T>
class DecimalVector : public std::vector<T>
{
public:
using Base = std::vector<T>;
using Base::operator[];
DecimalVector(size_t size, UInt32 scale_)
: Base(size),
scale(scale_)
{}
DecimalVector(const DecimalVector & other)
: Base(other.begin(), other.end()),
scale(other.scale)
{}
DecimalVector(DecimalVector && other)
{
this->swap(other);
std::swap(scale, other.scale);
}
DecimalVector & operator=(DecimalVector && other)
{
this->swap(other);
std::swap(scale, other.scale);
return *this;
}
UInt32 getScale() const { return scale; }
private:
UInt32 scale;
};
/// A ColumnVector for Decimals
template <typename T>
class ColumnDecimal final : public COWHelper<ColumnVectorHelper, ColumnDecimal<T>>
@ -107,10 +63,7 @@ private:
public:
using ValueType = T;
using NativeT = typename T::NativeType;
static constexpr bool is_POD = !is_big_int_v<NativeT>;
using Container = std::conditional_t<is_POD,
DecimalPaddedPODArray<T>,
DecimalVector<T>>;
using Container = DecimalPaddedPODArray<T>;
private:
ColumnDecimal(const size_t n, UInt32 scale_)
@ -134,18 +87,8 @@ public:
size_t size() const override { return data.size(); }
size_t byteSize() const override { return data.size() * sizeof(data[0]); }
size_t allocatedBytes() const override
{
if constexpr (is_POD)
return data.allocated_bytes();
else
return data.capacity() * sizeof(data[0]);
}
void protect() override
{
if constexpr (is_POD)
data.protect();
}
size_t allocatedBytes() const override { return data.allocated_bytes(); }
void protect() override { data.protect(); }
void reserve(size_t n) override { data.reserve(n); }
void insertFrom(const IColumn & src, size_t n) override { data.push_back(static_cast<const Self &>(src).getData()[n]); }
@ -153,38 +96,28 @@ public:
void insertDefault() override { data.push_back(T()); }
virtual void insertManyDefaults(size_t length) override
{
if constexpr (is_POD)
data.resize_fill(data.size() + length);
else
data.resize(data.size() + length);
data.resize_fill(data.size() + length);
}
void insert(const Field & x) override { data.push_back(DB::get<NearestFieldType<T>>(x)); }
void insertRangeFrom(const IColumn & src, size_t start, size_t length) override;
void popBack(size_t n) override
{
if constexpr (is_POD)
data.resize_assume_reserved(data.size() - n);
else
data.resize(data.size() - n);
data.resize_assume_reserved(data.size() - n);
}
StringRef getRawData() const override
{
if constexpr (is_POD)
return StringRef(reinterpret_cast<const char*>(data.data()), byteSize());
else
throw Exception("getRawData() is not implemented for big integers", ErrorCodes::NOT_IMPLEMENTED);
return StringRef(reinterpret_cast<const char*>(data.data()), byteSize());
}
StringRef getDataAt(size_t n) const override
{
if constexpr (is_POD)
return StringRef(reinterpret_cast<const char *>(&data[n]), sizeof(data[n]));
else
throw Exception("getDataAt() is not implemented for big integers", ErrorCodes::NOT_IMPLEMENTED);
return StringRef(reinterpret_cast<const char *>(&data[n]), sizeof(data[n]));
}
Float64 getFloat64(size_t n) const final { return DecimalUtils::convertTo<Float64>(data[n], scale); }
StringRef serializeValueIntoArena(size_t n, Arena & arena, char const *& begin) const override;
const char * deserializeAndInsertFromArena(const char * pos) override;
void updateHashWithValue(size_t n, SipHash & hash) const override;
@ -256,17 +189,9 @@ protected:
sort_end = res.begin() + limit;
if (reverse)
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin(), sort_end, res.end(), [this](size_t a, size_t b) { return data[a] > data[b]; });
#else
std::partial_sort(res.begin(), sort_end, res.end(), [this](size_t a, size_t b) { return data[a] > data[b]; });
#endif
partial_sort(res.begin(), sort_end, res.end(), [this](size_t a, size_t b) { return data[a] > data[b]; });
else
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin(), sort_end, res.end(), [this](size_t a, size_t b) { return data[a] < data[b]; });
#else
std::partial_sort(res.begin(), sort_end, res.end(), [this](size_t a, size_t b) { return data[a] < data[b]; });
#endif
partial_sort(res.begin(), sort_end, res.end(), [this](size_t a, size_t b) { return data[a] < data[b]; });
}
};

View File

@ -1,25 +1,20 @@
#include <Columns/ColumnFixedString.h>
#include <Columns/ColumnsCommon.h>
#include <Common/Arena.h>
#include <Common/SipHash.h>
#include <Common/memcpySmall.h>
#include <Common/memcmpSmall.h>
#include <Common/assert_cast.h>
#include <Common/WeakHash.h>
#include <Common/HashTable/Hash.h>
#include <ext/scope_guard.h>
#if !defined(ARCADIA_BUILD)
#include <miniselect/floyd_rivest_select.h> // Y_IGNORE
#endif
#include <DataStreams/ColumnGathererStream.h>
#include <IO/WriteHelpers.h>
#include <Common/Arena.h>
#include <Common/HashTable/Hash.h>
#include <Common/SipHash.h>
#include <Common/WeakHash.h>
#include <Common/assert_cast.h>
#include <Common/memcmpSmall.h>
#include <Common/memcpySmall.h>
#include <common/sort.h>
#include <ext/scope_guard.h>
#ifdef __SSE2__
#include <emmintrin.h>
#if defined(__SSE2__)
# include <emmintrin.h>
#endif
@ -160,17 +155,9 @@ void ColumnFixedString::getPermutation(bool reverse, size_t limit, int /*nan_dir
if (limit)
{
if (reverse)
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin(), res.begin() + limit, res.end(), less<false>(*this));
#else
std::partial_sort(res.begin(), res.begin() + limit, res.end(), less<false>(*this));
#endif
partial_sort(res.begin(), res.begin() + limit, res.end(), less<false>(*this));
else
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin(), res.begin() + limit, res.end(), less<true>(*this));
#else
std::partial_sort(res.begin(), res.begin() + limit, res.end(), less<true>(*this));
#endif
partial_sort(res.begin(), res.begin() + limit, res.end(), less<true>(*this));
}
else
{
@ -228,17 +215,9 @@ void ColumnFixedString::updatePermutation(bool reverse, size_t limit, int, Permu
/// Since then we are working inside the interval.
if (reverse)
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less<false>(*this));
#else
std::partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less<false>(*this));
#endif
partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less<false>(*this));
else
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less<true>(*this));
#else
std::partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less<true>(*this));
#endif
partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less<true>(*this));
auto new_first = first;
for (auto j = first + 1; j < limit; ++j)

View File

@ -1,20 +1,19 @@
#include <Columns/ColumnLowCardinality.h>
#include <Columns/ColumnsNumber.h>
#include <Columns/ColumnString.h>
#include <Columns/ColumnsNumber.h>
#include <DataStreams/ColumnGathererStream.h>
#include <DataTypes/NumberTraits.h>
#include <Common/HashTable/HashMap.h>
#include <Common/assert_cast.h>
#include <Common/WeakHash.h>
#include <Common/assert_cast.h>
#include <common/sort.h>
#include <ext/scope_guard.h>
#if !defined(ARCADIA_BUILD)
#include <miniselect/floyd_rivest_select.h> // Y_IGNORE
#endif
namespace DB
{
namespace ErrorCodes
{
extern const int ILLEGAL_COLUMN;
@ -397,11 +396,7 @@ void ColumnLowCardinality::updatePermutationImpl(size_t limit, Permutation & res
/// Since then we are working inside the interval.
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less);
#else
std::partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less);
#endif
partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less);
auto new_first = first;
for (auto j = first + 1; j < limit; ++j)

View File

@ -1,18 +1,16 @@
#include <Common/Arena.h>
#include <Common/memcmpSmall.h>
#include <Common/assert_cast.h>
#include <Common/WeakHash.h>
#include <Common/HashTable/Hash.h>
#include <Columns/Collator.h>
#include <Columns/ColumnString.h>
#include <Columns/Collator.h>
#include <Columns/ColumnsCommon.h>
#include <DataStreams/ColumnGathererStream.h>
#include <Common/Arena.h>
#include <Common/HashTable/Hash.h>
#include <Common/WeakHash.h>
#include <Common/assert_cast.h>
#include <Common/memcmpSmall.h>
#include <common/sort.h>
#include <common/unaligned.h>
#include <ext/scope_guard.h>
#if !defined(ARCADIA_BUILD)
#include <miniselect/floyd_rivest_select.h> // Y_IGNORE
#endif
namespace DB
@ -317,11 +315,7 @@ void ColumnString::getPermutationImpl(size_t limit, Permutation & res, Comparato
auto less = [&cmp](size_t lhs, size_t rhs){ return cmp(lhs, rhs) < 0; };
if (limit)
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin(), res.begin() + limit, res.end(), less);
#else
std::partial_sort(res.begin(), res.begin() + limit, res.end(), less);
#endif
partial_sort(res.begin(), res.begin() + limit, res.end(), less);
else
std::sort(res.begin(), res.end(), less);
}
@ -372,11 +366,7 @@ void ColumnString::updatePermutationImpl(size_t limit, Permutation & res, EqualR
return;
/// Since then we are working inside the interval.
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less);
#else
std::partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less);
#endif
partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less);
size_t new_first = first;
for (size_t j = first + 1; j < limit; ++j)

View File

@ -1,17 +1,16 @@
#include <Columns/ColumnTuple.h>
#include <Columns/IColumnImpl.h>
#include <Core/Field.h>
#include <DataStreams/ColumnGathererStream.h>
#include <IO/WriteBufferFromString.h>
#include <IO/Operators.h>
#include <IO/WriteBufferFromString.h>
#include <Common/WeakHash.h>
#include <Common/assert_cast.h>
#include <Common/typeid_cast.h>
#include <common/sort.h>
#include <ext/map.h>
#include <ext/range.h>
#include <Common/typeid_cast.h>
#include <Common/assert_cast.h>
#include <Common/WeakHash.h>
#include <Core/Field.h>
#if !defined(ARCADIA_BUILD)
#include <miniselect/floyd_rivest_select.h> // Y_IGNORE
#endif
namespace DB
@ -354,17 +353,9 @@ void ColumnTuple::getPermutationImpl(size_t limit, Permutation & res, LessOperat
limit = 0;
if (limit)
{
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin(), res.begin() + limit, res.end(), less);
#else
std::partial_sort(res.begin(), res.begin() + limit, res.end(), less);
#endif
}
partial_sort(res.begin(), res.begin() + limit, res.end(), less);
else
{
std::sort(res.begin(), res.end(), less);
}
}
void ColumnTuple::updatePermutationImpl(bool reverse, size_t limit, int nan_direction_hint, IColumn::Permutation & res, EqualRanges & equal_ranges, const Collator * collator) const

View File

@ -1,28 +1,27 @@
#include "ColumnVector.h"
#include <cstring>
#include <cmath>
#include <common/unaligned.h>
#include <Common/Exception.h>
#include <Common/Arena.h>
#include <Common/SipHash.h>
#include <Common/NaNUtils.h>
#include <Common/RadixSort.h>
#include <Common/assert_cast.h>
#include <Common/WeakHash.h>
#include <Common/HashTable/Hash.h>
#include <IO/WriteHelpers.h>
#include <pdqsort.h>
#include <Columns/ColumnsCommon.h>
#include <DataStreams/ColumnGathererStream.h>
#include <IO/WriteHelpers.h>
#include <Common/Arena.h>
#include <Common/Exception.h>
#include <Common/HashTable/Hash.h>
#include <Common/NaNUtils.h>
#include <Common/RadixSort.h>
#include <Common/SipHash.h>
#include <Common/WeakHash.h>
#include <Common/assert_cast.h>
#include <common/sort.h>
#include <common/unaligned.h>
#include <ext/bit_cast.h>
#include <ext/scope_guard.h>
#include <pdqsort.h>
#if !defined(ARCADIA_BUILD)
#include <miniselect/floyd_rivest_select.h> // Y_IGNORE
#endif
#ifdef __SSE2__
#include <emmintrin.h>
#include <cmath>
#include <cstring>
#if defined(__SSE2__)
# include <emmintrin.h>
#endif
namespace DB
@ -158,17 +157,9 @@ void ColumnVector<T>::getPermutation(bool reverse, size_t limit, int nan_directi
res[i] = i;
if (reverse)
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin(), res.begin() + limit, res.end(), greater(*this, nan_direction_hint));
#else
std::partial_sort(res.begin(), res.begin() + limit, res.end(), greater(*this, nan_direction_hint));
#endif
partial_sort(res.begin(), res.begin() + limit, res.end(), greater(*this, nan_direction_hint));
else
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin(), res.begin() + limit, res.end(), less(*this, nan_direction_hint));
#else
std::partial_sort(res.begin(), res.begin() + limit, res.end(), less(*this, nan_direction_hint));
#endif
partial_sort(res.begin(), res.begin() + limit, res.end(), less(*this, nan_direction_hint));
}
else
{
@ -264,17 +255,9 @@ void ColumnVector<T>::updatePermutation(bool reverse, size_t limit, int nan_dire
/// Since then, we are working inside the interval.
if (reverse)
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, greater(*this, nan_direction_hint));
#else
std::partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, greater(*this, nan_direction_hint));
#endif
partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, greater(*this, nan_direction_hint));
else
#if !defined(ARCADIA_BUILD)
miniselect::floyd_rivest_partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less(*this, nan_direction_hint));
#else
std::partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less(*this, nan_direction_hint));
#endif
partial_sort(res.begin() + first, res.begin() + limit, res.begin() + last, less(*this, nan_direction_hint));
size_t new_first = first;
for (size_t j = first + 1; j < limit; ++j)

View File

@ -1,6 +1,7 @@
#pragma once
#include <Columns/IColumn.h>
#include <Common/PODArray.h>
namespace DB

View File

@ -1,4 +1,6 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)
LIBRARY()
ADDINCL(

View File

@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
ADDINCL(

View File

@ -519,9 +519,11 @@
M(550, CONDITIONAL_TREE_PARENT_NOT_FOUND) \
M(551, ILLEGAL_PROJECTION_MANIPULATOR) \
M(552, UNRECOGNIZED_ARGUMENTS) \
M(553, ROCKSDB_ERROR) \
M(553, LZMA_STREAM_ENCODER_FAILED) \
M(554, LZMA_STREAM_DECODER_FAILED) \
M(555, ROCKSDB_ERROR) \
M(556, SYNC_MYSQL_USER_ACCESS_ERROR)\
\
M(999, KEEPER_EXCEPTION) \
M(1000, POCO_EXCEPTION) \
M(1001, STD_EXCEPTION) \

View File

@ -11,6 +11,7 @@
#include <cstdint>
#include <cassert>
#include <type_traits>
#include <memory>
#include <ext/bit_cast.h>
#include <common/extended_types.h>

View File

@ -1,4 +1,6 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)
LIBRARY()
ADDINCL (

View File

@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
ADDINCL (

View File

@ -1,4 +1,6 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)
LIBRARY()
ADDINCL(

View File

@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
ADDINCL(

View File

@ -40,7 +40,7 @@ Block::Block(const ColumnsWithTypeAndName & data_) : data{data_}
void Block::initializeIndexByName()
{
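/// emplace does not overwrite: if several columns share a name, the index keeps the position of the first one.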
for (size_t i = 0, size = data.size(); i < size; ++i)
index_by_name[data[i].name] = i;
index_by_name.emplace(data[i].name, i);
}
@ -295,6 +295,20 @@ std::string Block::dumpStructure() const
return out.str();
}
std::string Block::dumpIndex() const
{
WriteBufferFromOwnString out;
bool first = true;
for (const auto & [name, pos] : index_by_name)
{
if (!first)
out << ", ";
first = false;
out << name << ' ' << pos;
}
return out.str();
}
Block Block::cloneEmpty() const
{

View File

@ -119,6 +119,9 @@ public:
/** List of names, types and lengths of columns. Designed for debugging. */
std::string dumpStructure() const;
/** List of column names and positions from index */
std::string dumpIndex() const;
/** Get the same block, but empty. */
Block cloneEmpty() const;
@ -156,7 +159,7 @@ private:
/// This is needed to allow function execution over data.
/// It is safe because functions do not change column names, so the index is unaffected.
/// It is temporary.
friend struct ExpressionAction;
friend class ExpressionActions;
friend class ActionsDAG;
};

View File

@ -57,6 +57,7 @@ public:
using Op = Operation<CompareInt, CompareInt>;
using ColVecA = std::conditional_t<IsDecimalNumber<A>, ColumnDecimal<A>, ColumnVector<A>>;
using ColVecB = std::conditional_t<IsDecimalNumber<B>, ColumnDecimal<B>, ColumnVector<B>>;
using ArrayA = typename ColVecA::Container;
using ArrayB = typename ColVecB::Container;

View File

@ -70,7 +70,7 @@
/// Minimum revision supporting OpenTelemetry
#define DBMS_MIN_REVISION_WITH_OPENTELEMETRY 54442
/// Mininum revision supporting interserver secret.
/// Minimum revision supporting interserver secret.
#define DBMS_MIN_REVISION_WITH_INTERSERVER_SECRET 54441
/// Version of ClickHouse TCP protocol. Increment it manually when you change the protocol.

View File

@ -145,7 +145,7 @@ struct Decimal
operator T () const { return value; }
template <typename U>
U convertTo()
U convertTo() const
{
/// no IsDecimalNumber defined yet
if constexpr (std::is_same_v<U, Decimal<Int32>> ||

View File

@ -106,12 +106,6 @@ std::ostream & operator<<(std::ostream & stream, const Packet & what)
return stream;
}
std::ostream & operator<<(std::ostream & stream, const ExpressionAction & what)
{
stream << "ExpressionAction(" << what.toString() << ")";
return stream;
}
std::ostream & operator<<(std::ostream & stream, const ExpressionActions & what)
{
stream << "ExpressionActions(" << what.dumpActions() << ")";

View File

@ -40,9 +40,6 @@ std::ostream & operator<<(std::ostream & stream, const IColumn & what);
struct Packet;
std::ostream & operator<<(std::ostream & stream, const Packet & what);
struct ExpressionAction;
std::ostream & operator<<(std::ostream & stream, const ExpressionAction & what);
class ExpressionActions;
std::ostream & operator<<(std::ostream & stream, const ExpressionActions & what);

View File

@ -1,4 +1,6 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(

View File

@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(

View File

@ -46,7 +46,7 @@ void CheckConstraintsBlockOutputStream::write(const Block & block)
auto * constraint_ptr = constraints.constraints[i]->as<ASTConstraintDeclaration>();
ColumnWithTypeAndName res_column = block_to_calculate.getByPosition(block_to_calculate.columns() - 1);
ColumnWithTypeAndName res_column = block_to_calculate.getByName(constraint_ptr->expr->getColumnName());
if (!isUInt8(res_column.type))
throw Exception(ErrorCodes::LOGICAL_ERROR, "Constraint {} does not return a value of type UInt8",

View File

@ -10,6 +10,7 @@
#include <Common/CurrentThread.h>
#include <Common/setThreadName.h>
#include <Common/ThreadPool.h>
#include <Common/checkStackSize.h>
#include <Storages/MergeTree/ReplicatedMergeTreeBlockOutputStream.h>
#include <Storages/StorageValues.h>
#include <Storages/LiveView/StorageLiveView.h>
@ -29,6 +30,8 @@ PushingToViewsBlockOutputStream::PushingToViewsBlockOutputStream(
, context(context_)
, query_ptr(query_ptr_)
{
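/// Deeply nested chains of materialized views reach this constructor recursively; check the remaining stack early to fail gracefully instead of overflowing.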
checkStackSize();
/** TODO This is a very important line. At any insertion into the table one of streams should own lock.
* Although now any insertion into the table is done via PushingToViewsBlockOutputStream,
* but it's clear that here is not the best place for this functionality.

View File

@ -103,6 +103,15 @@ bool TTLBlockInputStream::isTTLExpired(time_t ttl) const
return (ttl && (ttl <= current_time));
}
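/// Rearrange the columns of the block to match the order given by the header.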
Block reorderColumns(Block block, const Block & header)
{
Block res;
for (const auto & col : header)
res.insert(block.getByName(col.name));
return res;
}
Block TTLBlockInputStream::readImpl()
{
/// Skip all data if table ttl is expired for part
@ -136,7 +145,7 @@ Block TTLBlockInputStream::readImpl()
updateMovesTTL(block);
updateRecompressionTTL(block);
return block;
return reorderColumns(std::move(block), header);
}
void TTLBlockInputStream::readSuffixImpl()

View File

@ -1,4 +1,6 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(

View File

@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(

View File

@ -466,75 +466,66 @@ struct WhichDataType
{
TypeIndex idx;
WhichDataType(TypeIndex idx_ = TypeIndex::Nothing)
: idx(idx_)
{}
constexpr WhichDataType(TypeIndex idx_ = TypeIndex::Nothing) : idx(idx_) {}
constexpr WhichDataType(const IDataType & data_type) : idx(data_type.getTypeId()) {}
constexpr WhichDataType(const IDataType * data_type) : idx(data_type->getTypeId()) {}
WhichDataType(const IDataType & data_type)
: idx(data_type.getTypeId())
{}
// shared_ptr's operator-> is not constexpr in gcc, so this constructor cannot be constexpr
WhichDataType(const DataTypePtr & data_type) : idx(data_type->getTypeId()) {}
WhichDataType(const IDataType * data_type)
: idx(data_type->getTypeId())
{}
constexpr bool isUInt8() const { return idx == TypeIndex::UInt8; }
constexpr bool isUInt16() const { return idx == TypeIndex::UInt16; }
constexpr bool isUInt32() const { return idx == TypeIndex::UInt32; }
constexpr bool isUInt64() const { return idx == TypeIndex::UInt64; }
constexpr bool isUInt128() const { return idx == TypeIndex::UInt128; }
constexpr bool isUInt256() const { return idx == TypeIndex::UInt256; }
constexpr bool isUInt() const { return isUInt8() || isUInt16() || isUInt32() || isUInt64() || isUInt128() || isUInt256(); }
constexpr bool isNativeUInt() const { return isUInt8() || isUInt16() || isUInt32() || isUInt64(); }
WhichDataType(const DataTypePtr & data_type)
: idx(data_type->getTypeId())
{}
constexpr bool isInt8() const { return idx == TypeIndex::Int8; }
constexpr bool isInt16() const { return idx == TypeIndex::Int16; }
constexpr bool isInt32() const { return idx == TypeIndex::Int32; }
constexpr bool isInt64() const { return idx == TypeIndex::Int64; }
constexpr bool isInt128() const { return idx == TypeIndex::Int128; }
constexpr bool isInt256() const { return idx == TypeIndex::Int256; }
constexpr bool isInt() const { return isInt8() || isInt16() || isInt32() || isInt64() || isInt128() || isInt256(); }
constexpr bool isNativeInt() const { return isInt8() || isInt16() || isInt32() || isInt64(); }
bool isUInt8() const { return idx == TypeIndex::UInt8; }
bool isUInt16() const { return idx == TypeIndex::UInt16; }
bool isUInt32() const { return idx == TypeIndex::UInt32; }
bool isUInt64() const { return idx == TypeIndex::UInt64; }
bool isUInt128() const { return idx == TypeIndex::UInt128; }
bool isUInt256() const { return idx == TypeIndex::UInt256; }
bool isUInt() const { return isUInt8() || isUInt16() || isUInt32() || isUInt64() || isUInt128() || isUInt256(); }
bool isNativeUInt() const { return isUInt8() || isUInt16() || isUInt32() || isUInt64(); }
constexpr bool isDecimal32() const { return idx == TypeIndex::Decimal32; }
constexpr bool isDecimal64() const { return idx == TypeIndex::Decimal64; }
constexpr bool isDecimal128() const { return idx == TypeIndex::Decimal128; }
constexpr bool isDecimal256() const { return idx == TypeIndex::Decimal256; }
constexpr bool isDecimal() const { return isDecimal32() || isDecimal64() || isDecimal128() || isDecimal256(); }
bool isInt8() const { return idx == TypeIndex::Int8; }
bool isInt16() const { return idx == TypeIndex::Int16; }
bool isInt32() const { return idx == TypeIndex::Int32; }
bool isInt64() const { return idx == TypeIndex::Int64; }
bool isInt128() const { return idx == TypeIndex::Int128; }
bool isInt256() const { return idx == TypeIndex::Int256; }
bool isInt() const { return isInt8() || isInt16() || isInt32() || isInt64() || isInt128() || isInt256(); }
bool isNativeInt() const { return isInt8() || isInt16() || isInt32() || isInt64(); }
constexpr bool isFloat32() const { return idx == TypeIndex::Float32; }
constexpr bool isFloat64() const { return idx == TypeIndex::Float64; }
constexpr bool isFloat() const { return isFloat32() || isFloat64(); }
bool isDecimal32() const { return idx == TypeIndex::Decimal32; }
bool isDecimal64() const { return idx == TypeIndex::Decimal64; }
bool isDecimal128() const { return idx == TypeIndex::Decimal128; }
bool isDecimal256() const { return idx == TypeIndex::Decimal256; }
bool isDecimal() const { return isDecimal32() || isDecimal64() || isDecimal128() || isDecimal256(); }
constexpr bool isEnum8() const { return idx == TypeIndex::Enum8; }
constexpr bool isEnum16() const { return idx == TypeIndex::Enum16; }
constexpr bool isEnum() const { return isEnum8() || isEnum16(); }
bool isFloat32() const { return idx == TypeIndex::Float32; }
bool isFloat64() const { return idx == TypeIndex::Float64; }
bool isFloat() const { return isFloat32() || isFloat64(); }
constexpr bool isDate() const { return idx == TypeIndex::Date; }
constexpr bool isDateTime() const { return idx == TypeIndex::DateTime; }
constexpr bool isDateTime64() const { return idx == TypeIndex::DateTime64; }
constexpr bool isDateOrDateTime() const { return isDate() || isDateTime() || isDateTime64(); }
bool isEnum8() const { return idx == TypeIndex::Enum8; }
bool isEnum16() const { return idx == TypeIndex::Enum16; }
bool isEnum() const { return isEnum8() || isEnum16(); }
constexpr bool isString() const { return idx == TypeIndex::String; }
constexpr bool isFixedString() const { return idx == TypeIndex::FixedString; }
constexpr bool isStringOrFixedString() const { return isString() || isFixedString(); }
bool isDate() const { return idx == TypeIndex::Date; }
bool isDateTime() const { return idx == TypeIndex::DateTime; }
bool isDateTime64() const { return idx == TypeIndex::DateTime64; }
bool isDateOrDateTime() const { return isDate() || isDateTime() || isDateTime64(); }
constexpr bool isUUID() const { return idx == TypeIndex::UUID; }
constexpr bool isArray() const { return idx == TypeIndex::Array; }
constexpr bool isTuple() const { return idx == TypeIndex::Tuple; }
constexpr bool isSet() const { return idx == TypeIndex::Set; }
constexpr bool isInterval() const { return idx == TypeIndex::Interval; }
bool isString() const { return idx == TypeIndex::String; }
bool isFixedString() const { return idx == TypeIndex::FixedString; }
bool isStringOrFixedString() const { return isString() || isFixedString(); }
constexpr bool isNothing() const { return idx == TypeIndex::Nothing; }
constexpr bool isNullable() const { return idx == TypeIndex::Nullable; }
constexpr bool isFunction() const { return idx == TypeIndex::Function; }
constexpr bool isAggregateFunction() const { return idx == TypeIndex::AggregateFunction; }
bool isUUID() const { return idx == TypeIndex::UUID; }
bool isArray() const { return idx == TypeIndex::Array; }
bool isTuple() const { return idx == TypeIndex::Tuple; }
bool isSet() const { return idx == TypeIndex::Set; }
bool isInterval() const { return idx == TypeIndex::Interval; }
bool isNothing() const { return idx == TypeIndex::Nothing; }
bool isNullable() const { return idx == TypeIndex::Nullable; }
bool isFunction() const { return idx == TypeIndex::Function; }
bool isAggregateFunction() const { return idx == TypeIndex::AggregateFunction; }
bool IsBigIntOrDeimal() const { return isInt128() || isInt256() || isUInt256() || isDecimal256(); }
constexpr bool IsBigIntOrDeimal() const { return isInt128() || isInt256() || isUInt256() || isDecimal256(); }
};
/// IDataType helpers (alternative for IDataType virtual methods with single point of truth)

View File

@ -1,4 +1,6 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(

View File

@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(


@ -12,6 +12,7 @@
#include <Common/quoteString.h>
#include <IO/ReadHelpers.h>
#include <IO/WriteHelpers.h>
#include <IO/Operators.h>
namespace DB
{
@ -19,6 +20,7 @@ namespace DB
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
extern const int SYNC_MYSQL_USER_ACCESS_ERROR;
}
static std::unordered_map<String, String> fetchTablesCreateQuery(
@ -64,6 +66,7 @@ static std::vector<String> fetchTablesInDB(const mysqlxx::PoolWithFailover::Entr
return tables_in_db;
}
void MaterializeMetadata::fetchMasterStatus(mysqlxx::PoolWithFailover::Entry & connection)
{
Block header{
@ -105,6 +108,49 @@ static Block getShowMasterLogHeader(const String & mysql_version)
};
}
static bool checkSyncUserPrivImpl(mysqlxx::PoolWithFailover::Entry & connection, WriteBuffer & out)
{
Block sync_user_privs_header
{
{std::make_shared<DataTypeString>(), "current_user_grants"}
};
String grants_query, sub_privs;
MySQLBlockInputStream input(connection, "SHOW GRANTS FOR CURRENT_USER();", sync_user_privs_header, DEFAULT_BLOCK_SIZE);
while (Block block = input.read())
{
for (size_t index = 0; index < block.rows(); ++index)
{
grants_query = (*block.getByPosition(0).column)[index].safeGet<String>();
out << grants_query << "; ";
sub_privs = grants_query.substr(0, grants_query.find(" ON "));
if (sub_privs.find("ALL PRIVILEGES") == std::string::npos)
{
if ((sub_privs.find("RELOAD") != std::string::npos and
sub_privs.find("REPLICATION SLAVE") != std::string::npos and
sub_privs.find("REPLICATION CLIENT") != std::string::npos))
return true;
}
else
{
return true;
}
}
}
return false;
}
static void checkSyncUserPriv(mysqlxx::PoolWithFailover::Entry & connection)
{
WriteBufferFromOwnString out;
if (!checkSyncUserPrivImpl(connection, out))
throw Exception("MySQL SYNC USER ACCESS ERR: mysql sync user needs "
"at least GLOBAL PRIVILEGES:'RELOAD, REPLICATION SLAVE, REPLICATION CLIENT' "
"and SELECT PRIVILEGE on MySQL Database."
"But the SYNC USER grant query is: " + out.str(), ErrorCodes::SYNC_MYSQL_USER_ACCESS_ERROR);
}
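/// Editor's note (not part of this commit): hedged illustration of grants that
/// satisfy the check above; the user and database names are placeholders. Each
/// grant line's prefix before " ON " must contain RELOAD, REPLICATION SLAVE and
/// REPLICATION CLIENT together (or ALL PRIVILEGES), e.g.:
///     GRANT RELOAD, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'sync_user'@'%';
///     GRANT SELECT ON db.* TO 'sync_user'@'%';
/// SHOW GRANTS FOR CURRENT_USER() then returns a line whose privilege list
/// contains all three, so checkSyncUserPrivImpl() reports success.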
bool MaterializeMetadata::checkBinlogFileExists(mysqlxx::PoolWithFailover::Entry & connection, const String & mysql_version) const
{
MySQLBlockInputStream input(connection, "SHOW MASTER LOGS", getShowMasterLogHeader(mysql_version), DEFAULT_BLOCK_SIZE);
@ -167,6 +213,8 @@ MaterializeMetadata::MaterializeMetadata(
const String & database, bool & opened_transaction, const String & mysql_version)
: persistent_path(path_)
{
checkSyncUserPriv(connection);
if (Poco::File(persistent_path).exists())
{
ReadBufferFromFile in(persistent_path, DBMS_DEFAULT_BUFFER_SIZE);


@ -5,7 +5,6 @@
#if USE_MYSQL
#include <Databases/MySQL/MaterializeMySQLSyncThread.h>
# include <cstdlib>
# include <random>
# include <Columns/ColumnTuple.h>
@ -34,6 +33,8 @@ namespace ErrorCodes
extern const int LOGICAL_ERROR;
extern const int NOT_IMPLEMENTED;
extern const int ILLEGAL_MYSQL_VARIABLE;
extern const int SYNC_MYSQL_USER_ACCESS_ERROR;
extern const int UNKNOWN_DATABASE;
}
static constexpr auto MYSQL_BACKGROUND_THREAD_NAME = "MySQLDBSync";
@ -214,10 +215,33 @@ void MaterializeMySQLSyncThread::stopSynchronization()
void MaterializeMySQLSyncThread::startSynchronization()
{
try
{
const auto & mysql_server_version = checkVariableAndGetVersion(pool.get());
background_thread_pool = std::make_unique<ThreadFromGlobalPool>(
[this, mysql_server_version = mysql_server_version]() { synchronization(mysql_server_version); });
}
catch (...)
{
try
{
throw;
}
catch (mysqlxx::ConnectionFailed & e)
{
if (e.errnum() == ER_ACCESS_DENIED_ERROR
|| e.errnum() == ER_DBACCESS_DENIED_ERROR)
throw Exception("MySQL SYNC USER ACCESS ERR: mysql sync user needs "
"at least GLOBAL PRIVILEGES:'RELOAD, REPLICATION SLAVE, REPLICATION CLIENT' "
"and SELECT PRIVILEGE on Database " + mysql_database_name
, ErrorCodes::SYNC_MYSQL_USER_ACCESS_ERROR);
else if (e.errnum() == ER_BAD_DB_ERROR)
throw Exception("Unknown database '" + mysql_database_name + "' on MySQL", ErrorCodes::UNKNOWN_DATABASE);
else
throw;
}
}
}
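/// Editor's sketch (not part of this commit): the catch (...) { try { throw; } ... }
/// pattern above re-raises the in-flight exception so it can be dispatched on its
/// concrete type. A self-contained illustration, assuming <stdexcept> and <cstdio>:
static void handleCurrentException()
{
    try
    {
        throw; /// Re-raise the exception currently being handled; only valid inside a catch block.
    }
    catch (const std::invalid_argument & e)
    {
        std::printf("bad argument: %s\n", e.what()); /// Translate one specific type.
    }
    catch (...)
    {
        throw; /// Anything else propagates unchanged.
    }
}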
static inline void cleanOutdatedTables(const String & database_name, const Context & context)


@ -20,6 +20,7 @@
# include <mysqlxx/Pool.h>
# include <mysqlxx/PoolWithFailover.h>
namespace DB
{
@ -63,6 +64,12 @@ private:
MaterializeMySQLSettings * settings;
String query_prefix;
// MySQL server error codes used below. Reference:
// https://dev.mysql.com/doc/mysql-errors/5.7/en/server-error-reference.html
const int ER_ACCESS_DENIED_ERROR = 1045;
const int ER_DBACCESS_DENIED_ERROR = 1044;
const int ER_BAD_DB_ERROR = 1049;
struct Buffers
{
String database;


@ -1,4 +1,6 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(


@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(


@ -1,4 +1,6 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(


@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(


@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(


@ -1,4 +1,6 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(


@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(


@ -1,4 +1,6 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(


@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(


@ -201,7 +201,7 @@ public:
{
/// Check that the expression does not contain unusual actions that will break the columns structure.
for (const auto & action : expression_actions->getActions())
if (action.type == ExpressionAction::Type::ARRAY_JOIN)
if (action.node->type == ActionsDAG::ActionType::ARRAY_JOIN)
throw Exception("Expression with arrayJoin or other unusual action cannot be captured", ErrorCodes::BAD_ARGUMENTS);
std::unordered_map<std::string, DataTypePtr> arguments_map;


@ -3,6 +3,7 @@
#include <Common/typeid_cast.h>
#include <Common/assert_cast.h>
#include <Common/LRUCache.h>
#include <Common/SipHash.h>
#include <Columns/ColumnConst.h>
#include <Columns/ColumnNullable.h>
#include <Columns/ColumnTuple.h>


@ -32,6 +32,8 @@ public:
return false;
}
bool isSuitableForConstantFolding() const override { return false; }
size_t getNumberOfArguments() const override
{
return 0;


@ -26,6 +26,10 @@ public:
return name;
}
bool isDeterministic() const override { return false; }
bool isDeterministicInScopeOfQuery() const override { return false; }
bool isSuitableForConstantFolding() const override { return false; }
size_t getNumberOfArguments() const override
{
return 0;


@ -1,4 +1,6 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)
LIBRARY()
CFLAGS(


@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
CFLAGS(


@ -1,4 +1,6 @@
# This file is generated automatically, do not edit. See 'ya.make.in' and use 'utils/generate-ya-make' to regenerate it.
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(


@ -1,3 +1,5 @@
OWNER(g:clickhouse)
LIBRARY()
PEERDIR(


@ -0,0 +1,690 @@
#include <Interpreters/ActionsDAG.h>
#include <DataTypes/DataTypeArray.h>
#include <Functions/IFunction.h>
#include <Interpreters/Context.h>
#include <Interpreters/ExpressionJIT.h>
#include <IO/WriteBufferFromString.h>
#include <IO/Operators.h>
#include <stack>
namespace DB
{
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
extern const int DUPLICATE_COLUMN;
extern const int UNKNOWN_IDENTIFIER;
extern const int TYPE_MISMATCH;
}
ActionsDAG::ActionsDAG(const NamesAndTypesList & inputs)
{
for (const auto & input : inputs)
addInput(input.name, input.type);
}
ActionsDAG::ActionsDAG(const ColumnsWithTypeAndName & inputs)
{
for (const auto & input : inputs)
{
if (input.column && isColumnConst(*input.column))
addInput(input);
else
addInput(input.name, input.type);
}
}
ActionsDAG::Node & ActionsDAG::addNode(Node node, bool can_replace)
{
auto it = index.find(node.result_name);
if (it != index.end() && !can_replace)
throw Exception("Column '" + node.result_name + "' already exists", ErrorCodes::DUPLICATE_COLUMN);
auto & res = nodes.emplace_back(std::move(node));
index.replace(&res);
return res;
}
ActionsDAG::Node & ActionsDAG::getNode(const std::string & name)
{
auto it = index.find(name);
if (it == index.end())
throw Exception("Unknown identifier: '" + name + "'", ErrorCodes::UNKNOWN_IDENTIFIER);
return **it;
}
const ActionsDAG::Node & ActionsDAG::addInput(std::string name, DataTypePtr type)
{
Node node;
node.type = ActionType::INPUT;
node.result_type = std::move(type);
node.result_name = std::move(name);
return addNode(std::move(node));
}
const ActionsDAG::Node & ActionsDAG::addInput(ColumnWithTypeAndName column)
{
Node node;
node.type = ActionType::INPUT;
node.result_type = std::move(column.type);
node.result_name = std::move(column.name);
node.column = std::move(column.column);
return addNode(std::move(node));
}
const ActionsDAG::Node & ActionsDAG::addColumn(ColumnWithTypeAndName column)
{
if (!column.column)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Cannot add column {} because it is nullptr", column.name);
Node node;
node.type = ActionType::COLUMN;
node.result_type = std::move(column.type);
node.result_name = std::move(column.name);
node.column = std::move(column.column);
return addNode(std::move(node));
}
const ActionsDAG::Node & ActionsDAG::addAlias(const std::string & name, std::string alias, bool can_replace)
{
auto & child = getNode(name);
Node node;
node.type = ActionType::ALIAS;
node.result_type = child.result_type;
node.result_name = std::move(alias);
node.column = child.column;
node.allow_constant_folding = child.allow_constant_folding;
node.children.emplace_back(&child);
return addNode(std::move(node), can_replace);
}
const ActionsDAG::Node & ActionsDAG::addArrayJoin(const std::string & source_name, std::string result_name)
{
auto & child = getNode(source_name);
const DataTypeArray * array_type = typeid_cast<const DataTypeArray *>(child.result_type.get());
if (!array_type)
throw Exception("ARRAY JOIN requires array argument", ErrorCodes::TYPE_MISMATCH);
Node node;
node.type = ActionType::ARRAY_JOIN;
node.result_type = array_type->getNestedType();
node.result_name = std::move(result_name);
node.children.emplace_back(&child);
return addNode(std::move(node));
}
const ActionsDAG::Node & ActionsDAG::addFunction(
const FunctionOverloadResolverPtr & function,
const Names & argument_names,
std::string result_name,
const Context & context [[maybe_unused]])
{
const auto & all_settings = context.getSettingsRef();
settings.max_temporary_columns = all_settings.max_temporary_columns;
settings.max_temporary_non_const_columns = all_settings.max_temporary_non_const_columns;
#if USE_EMBEDDED_COMPILER
settings.compile_expressions = all_settings.compile_expressions;
settings.min_count_to_compile_expression = all_settings.min_count_to_compile_expression;
if (!compilation_cache)
compilation_cache = context.getCompiledExpressionCache();
#endif
size_t num_arguments = argument_names.size();
Node node;
node.type = ActionType::FUNCTION;
node.function_builder = function;
node.children.reserve(num_arguments);
bool all_const = true;
ColumnsWithTypeAndName arguments(num_arguments);
for (size_t i = 0; i < num_arguments; ++i)
{
auto & child = getNode(argument_names[i]);
node.children.emplace_back(&child);
node.allow_constant_folding = node.allow_constant_folding && child.allow_constant_folding;
ColumnWithTypeAndName argument;
argument.column = child.column;
argument.type = child.result_type;
argument.name = child.result_name;
if (!argument.column || !isColumnConst(*argument.column))
all_const = false;
arguments[i] = std::move(argument);
}
node.function_base = function->build(arguments);
node.result_type = node.function_base->getResultType();
node.function = node.function_base->prepare(arguments);
/// If all arguments are constants, and the function is suitable for execution at the 'prepare' stage, execute it now.
/// But if we compile expressions, a compiled version of this function may be placed in the cache,
/// so we don't want to fold non-deterministic functions.
if (all_const && node.function_base->isSuitableForConstantFolding()
&& (!settings.compile_expressions || node.function_base->isDeterministic()))
{
size_t num_rows = arguments.empty() ? 0 : arguments.front().column->size();
auto col = node.function->execute(arguments, node.result_type, num_rows, true);
/// If the result is not a constant, just in case, we will consider the result as unknown.
if (isColumnConst(*col))
{
/// All constant (literal) columns in block are added with size 1.
/// But if there was no columns in block before executing a function, the result has size 0.
/// Change the size to 1.
if (col->empty())
col = col->cloneResized(1);
node.column = std::move(col);
}
}
/// Some functions like ignore() or getTypeName() always return a constant result even if their arguments are not constant.
/// We can't do constant folding, but we can mark in the sample block that the function result is constant to avoid
/// unnecessary materialization.
if (!node.column && node.function_base->isSuitableForConstantFolding())
{
if (auto col = node.function_base->getResultIfAlwaysReturnsConstantAndHasArguments(arguments))
{
node.column = std::move(col);
node.allow_constant_folding = false;
}
}
if (result_name.empty())
{
result_name = function->getName() + "(";
for (size_t i = 0; i < argument_names.size(); ++i)
{
if (i)
result_name += ", ";
result_name += argument_names[i];
}
result_name += ")";
}
node.result_name = std::move(result_name);
return addNode(std::move(node));
}
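/// Editor's note (not part of this commit): a hedged walk-through of the folding
/// branch above. 'plus_resolver' and 'context' are placeholders the caller would
/// supply (e.g. a resolver obtained from the function factory).
///
///     ActionsDAG dag;
///     auto type = std::make_shared<DataTypeUInt8>();
///     dag.addColumn({type->createColumnConst(1, 1u), type, "one"});
///     dag.addColumn({type->createColumnConst(1, 2u), type, "two"});
///     const auto & sum = dag.addFunction(plus_resolver, {"one", "two"}, "sum", context);
///
/// Both arguments are constant, and 'plus' is deterministic and suitable for
/// folding, so sum.column already holds the constant result (resized to one row)
/// and no per-row work remains for this node at execution time.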
NamesAndTypesList ActionsDAG::getRequiredColumns() const
{
NamesAndTypesList result;
for (const auto & node : nodes)
if (node.type == ActionType::INPUT)
result.emplace_back(node.result_name, node.result_type);
return result;
}
ColumnsWithTypeAndName ActionsDAG::getResultColumns() const
{
ColumnsWithTypeAndName result;
result.reserve(index.size());
for (const auto & node : index)
result.emplace_back(node->column, node->result_type, node->result_name);
return result;
}
NamesAndTypesList ActionsDAG::getNamesAndTypesList() const
{
NamesAndTypesList result;
for (const auto & node : index)
result.emplace_back(node->result_name, node->result_type);
return result;
}
Names ActionsDAG::getNames() const
{
Names names;
names.reserve(index.size());
for (const auto & node : index)
names.emplace_back(node->result_name);
return names;
}
std::string ActionsDAG::dumpNames() const
{
WriteBufferFromOwnString out;
for (auto it = nodes.begin(); it != nodes.end(); ++it)
{
if (it != nodes.begin())
out << ", ";
out << it->result_name;
}
return out.str();
}
void ActionsDAG::removeUnusedActions(const Names & required_names)
{
std::unordered_set<Node *> nodes_set;
std::vector<Node *> required_nodes;
required_nodes.reserve(required_names.size());
for (const auto & name : required_names)
{
auto it = index.find(name);
if (it == index.end())
throw Exception(ErrorCodes::UNKNOWN_IDENTIFIER,
"Unknown column: {}, there are only columns {}", name, dumpNames());
if (nodes_set.insert(*it).second)
required_nodes.push_back(*it);
}
removeUnusedActions(required_nodes);
}
void ActionsDAG::removeUnusedActions(const std::vector<Node *> & required_nodes)
{
{
Index new_index;
for (auto * node : required_nodes)
new_index.insert(node);
index.swap(new_index);
}
removeUnusedActions();
}
void ActionsDAG::removeUnusedActions()
{
std::unordered_set<const Node *> visited_nodes;
std::stack<Node *> stack;
for (auto * node : index)
{
visited_nodes.insert(node);
stack.push(node);
}
while (!stack.empty())
{
auto * node = stack.top();
stack.pop();
if (!node->children.empty() && node->column && isColumnConst(*node->column) && node->allow_constant_folding)
{
/// Constant folding.
node->type = ActionsDAG::ActionType::COLUMN;
node->children.clear();
}
for (auto * child : node->children)
{
if (visited_nodes.count(child) == 0)
{
stack.push(child);
visited_nodes.insert(child);
}
}
}
nodes.remove_if([&](const Node & node) { return visited_nodes.count(&node) == 0; });
}
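/// Editor's note (not part of this commit): the pass above is a plain DFS from the
/// index. Hedged illustration with four nodes:
///     INPUT a, INPUT b, FUNCTION f(a) [in index], FUNCTION g(b) [not in index]
/// Only f and a are reachable from the index, so g and b are erased. Additionally,
/// any reachable non-leaf node that already carries a foldable constant column is
/// rewritten in place into a childless COLUMN node (the constant-folding branch).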
void ActionsDAG::addAliases(const NamesWithAliases & aliases, std::vector<Node *> & result_nodes)
{
std::vector<Node *> required_nodes;
for (const auto & item : aliases)
{
auto & child = getNode(item.first);
required_nodes.push_back(&child);
}
result_nodes.reserve(aliases.size());
for (size_t i = 0; i < aliases.size(); ++i)
{
const auto & item = aliases[i];
auto * child = required_nodes[i];
if (!item.second.empty() && item.first != item.second)
{
Node node;
node.type = ActionType::ALIAS;
node.result_type = child->result_type;
node.result_name = std::move(item.second);
node.column = child->column;
node.allow_constant_folding = child->allow_constant_folding;
node.children.emplace_back(child);
auto & alias = addNode(std::move(node), true);
result_nodes.push_back(&alias);
}
else
result_nodes.push_back(child);
}
}
void ActionsDAG::addAliases(const NamesWithAliases & aliases)
{
std::vector<Node *> result_nodes;
addAliases(aliases, result_nodes);
}
void ActionsDAG::project(const NamesWithAliases & projection)
{
std::vector<Node *> result_nodes;
addAliases(projection, result_nodes);
removeUnusedActions(result_nodes);
projectInput();
settings.projected_output = true;
}
void ActionsDAG::removeColumn(const std::string & column_name)
{
auto & node = getNode(column_name);
index.remove(&node);
}
bool ActionsDAG::tryRestoreColumn(const std::string & column_name)
{
if (index.contains(column_name))
return true;
for (auto it = nodes.rbegin(); it != nodes.rend(); ++it)
{
auto & node = *it;
if (node.result_name == column_name)
{
index.replace(&node);
return true;
}
}
return false;
}
ActionsDAGPtr ActionsDAG::clone() const
{
auto actions = cloneEmpty();
std::unordered_map<const Node *, Node *> copy_map;
for (const auto & node : nodes)
{
auto & copy_node = actions->nodes.emplace_back(node);
copy_map[&node] = &copy_node;
}
for (auto & node : actions->nodes)
for (auto & child : node.children)
child = copy_map[child];
for (const auto & node : index)
actions->index.insert(copy_map[node]);
return actions;
}
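/// Editor's sketch (not part of this commit): the pointer-fixup technique used by
/// clone() above, reduced to a self-contained example. 'SketchNode' is a placeholder
/// type, not a ClickHouse class; assumes <list>, <unordered_map> and <vector>.
struct SketchNode { std::vector<SketchNode *> children; };

static std::list<SketchNode> cloneGraph(const std::list<SketchNode> & nodes)
{
    std::list<SketchNode> copy;
    std::unordered_map<const SketchNode *, SketchNode *> remap;
    /// 1. Copy every node, remembering old address -> new address.
    for (const auto & node : nodes)
        remap[&node] = &copy.emplace_back(node);
    /// 2. Rewrite child pointers so they point into the copy, not the original.
    for (auto & node : copy)
        for (auto & child : node.children)
            child = remap[child];
    return copy;
}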
void ActionsDAG::compileExpressions()
{
#if USE_EMBEDDED_COMPILER
if (settings.compile_expressions)
{
compileFunctions();
removeUnusedActions();
}
#endif
}
std::string ActionsDAG::dumpDAG() const
{
std::unordered_map<const Node *, size_t> map;
for (const auto & node : nodes)
{
size_t idx = map.size();
map[&node] = idx;
}
WriteBufferFromOwnString out;
for (const auto & node : nodes)
{
out << map[&node] << " : ";
switch (node.type)
{
case ActionsDAG::ActionType::COLUMN:
out << "COLUMN ";
break;
case ActionsDAG::ActionType::ALIAS:
out << "ALIAS ";
break;
case ActionsDAG::ActionType::FUNCTION:
out << "FUNCTION ";
break;
case ActionsDAG::ActionType::ARRAY_JOIN:
out << "ARRAY JOIN ";
break;
case ActionsDAG::ActionType::INPUT:
out << "INPUT ";
break;
}
out << "(";
for (size_t i = 0; i < node.children.size(); ++i)
{
if (i)
out << ", ";
out << map[node.children[i]];
}
out << ")";
out << " " << (node.column ? node.column->getName() : "(no column)");
out << " " << (node.result_type ? node.result_type->getName() : "(no type)");
out << " " << (!node.result_name.empty() ? node.result_name : "(no name)");
if (node.function_base)
out << " [" << node.function_base->getName() << "]";
out << "\n";
}
return out.str();
}
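/// Editor's note (not part of this commit): hedged sample of the format produced
/// above, for a DAG with INPUT x (UInt8) and FUNCTION plus(x, x):
///     0 : INPUT () (no column) UInt8 x
///     1 : FUNCTION (0, 0) (no column) UInt16 plus(x, x) [plus]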
bool ActionsDAG::hasArrayJoin() const
{
for (const auto & node : nodes)
if (node.type == ActionType::ARRAY_JOIN)
return true;
return false;
}
bool ActionsDAG::empty() const
{
for (const auto & node : nodes)
if (node.type != ActionType::INPUT)
return false;
return true;
}
ActionsDAGPtr ActionsDAG::splitActionsBeforeArrayJoin(const NameSet & array_joined_columns)
{
/// Split DAG into two parts.
/// (this_nodes, this_index) is a part which depends on ARRAY JOIN and stays here.
/// (split_nodes, split_index) is a part which will be moved before ARRAY JOIN.
std::list<Node> this_nodes;
std::list<Node> split_nodes;
Index this_index;
Index split_index;
struct Frame
{
Node * node;
size_t next_child_to_visit = 0;
};
struct Data
{
bool depend_on_array_join = false;
bool visited = false;
bool used_in_result = false;
/// Copies of the node in one of the two DAGs.
/// For COLUMN and INPUT nodes, both copies may exist.
Node * to_this = nullptr;
Node * to_split = nullptr;
};
std::stack<Frame> stack;
std::unordered_map<Node *, Data> data;
for (const auto & node : index)
data[node].used_in_result = true;
/// DFS. Decide if node depends on ARRAY JOIN and move it to one of the DAGs.
for (auto & node : nodes)
{
if (!data[&node].visited)
stack.push({.node = &node});
while (!stack.empty())
{
auto & cur = stack.top();
auto & cur_data = data[cur.node];
/// At first, visit all children. We depend on ARRAY JOIN if any child does.
while (cur.next_child_to_visit < cur.node->children.size())
{
auto * child = cur.node->children[cur.next_child_to_visit];
auto & child_data = data[child];
if (!child_data.visited)
{
stack.push({.node = child});
break;
}
++cur.next_child_to_visit;
if (child_data.depend_on_array_join)
cur_data.depend_on_array_join = true;
}
/// All children are visited; copy the node into the proper part.
if (cur.next_child_to_visit == cur.node->children.size())
{
if (cur.node->type == ActionType::INPUT && array_joined_columns.count(cur.node->result_name))
cur_data.depend_on_array_join = true;
cur_data.visited = true;
stack.pop();
if (cur_data.depend_on_array_join)
{
auto & copy = this_nodes.emplace_back(*cur.node);
cur_data.to_this = &copy;
/// Redirect children to the newly created nodes.
for (auto & child : copy.children)
{
auto & child_data = data[child];
/// If the child's copy was not created here, it may come from the split part.
if (!child_data.to_this)
{
if (child->type == ActionType::COLUMN) /// Just create new node for COLUMN action.
{
child_data.to_this = &this_nodes.emplace_back(*child);
}
else
{
/// Node from split part is added as new input.
Node input_node;
input_node.type = ActionType::INPUT;
input_node.result_type = child->result_type;
input_node.result_name = child->result_name; // getUniqueNameForIndex(index, child->result_name);
child_data.to_this = &this_nodes.emplace_back(std::move(input_node));
/// This node is needed for the current action, so put it into the index as well.
split_index.replace(child_data.to_split);
}
}
child = child_data.to_this;
}
}
else
{
auto & copy = split_nodes.emplace_back(*cur.node);
cur_data.to_split = &copy;
/// Redirect children to the newly created nodes.
for (auto & child : copy.children)
{
child = data[child].to_split;
assert(child != nullptr);
}
if (cur_data.used_in_result)
{
split_index.replace(&copy);
/// If this node is needed in result, add it as input.
Node input_node;
input_node.type = ActionType::INPUT;
input_node.result_type = node.result_type;
input_node.result_name = node.result_name;
cur_data.to_this = &this_nodes.emplace_back(std::move(input_node));
}
}
}
}
}
for (auto * node : index)
this_index.insert(data[node].to_this);
/// Consider the actions empty if all nodes are constants or inputs.
bool split_actions_are_empty = true;
for (const auto & node : split_nodes)
if (!node.children.empty())
split_actions_are_empty = false;
if (split_actions_are_empty)
return {};
index.swap(this_index);
nodes.swap(this_nodes);
auto split_actions = cloneEmpty();
split_actions->nodes.swap(split_nodes);
split_actions->index.swap(split_index);
split_actions->settings.project_input = false;
return split_actions;
}
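/// Editor's note (not part of this commit): hedged example of the split, with
/// illustrative names. For an expression like plus(sqrt(a), joined), where only
/// 'joined' comes from ARRAY JOIN on column 'arr':
///     split part (returned): sqrt(a), computed once per source row before ARRAY JOIN;
///     this part (kept)     : plus(...), where sqrt(a) re-enters as a new INPUT node.
/// If nothing but constants and inputs would move, the function returns {} and the
/// DAG is left unchanged.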
}


@ -0,0 +1,253 @@
#pragma once
#include <Core/ColumnsWithTypeAndName.h>
#include <Core/NamesAndTypes.h>
#include <Core/Names.h>
#if !defined(ARCADIA_BUILD)
# include "config_core.h"
#endif
namespace DB
{
class ActionsDAG;
using ActionsDAGPtr = std::shared_ptr<ActionsDAG>;
class IExecutableFunction;
using ExecutableFunctionPtr = std::shared_ptr<IExecutableFunction>;
class IFunctionBase;
using FunctionBasePtr = std::shared_ptr<IFunctionBase>;
class IFunctionOverloadResolver;
using FunctionOverloadResolverPtr = std::shared_ptr<IFunctionOverloadResolver>;
class IDataType;
using DataTypePtr = std::shared_ptr<const IDataType>;
class Context;
class CompiledExpressionCache;
/// Directed acyclic graph of expressions.
/// This is an intermediate representation of actions, usually built from an expression list AST.
/// Each node of the DAG describes the calculation of a single column with a known type and name, and a constant value if applicable.
///
/// The DAG representation is useful when we need explicit dependencies between actions:
/// to optimize actions, remove unused expressions, compile subexpressions,
/// split or merge parts of the graph, or calculate expressions on partial input.
///
/// The built DAG is used by ExpressionActions, which calculates the expressions on a block.
class ActionsDAG
{
public:
enum class ActionType
{
/// Column which must be in input.
INPUT,
/// Constant column with known value.
COLUMN,
/// Another name for a column.
ALIAS,
/// Function arrayJoin. Specially separated because it changes the number of rows.
ARRAY_JOIN,
FUNCTION,
};
struct Node
{
std::vector<Node *> children;
ActionType type;
std::string result_name;
DataTypePtr result_type;
FunctionOverloadResolverPtr function_builder;
/// Can be used after the action is added to ExpressionActions, e.g. to get the function signature or properties like monotonicity.
FunctionBasePtr function_base;
/// Prepared function which is used in function execution.
ExecutableFunctionPtr function;
/// If function is a compiled statement.
bool is_function_compiled = false;
/// For COLUMN node and propagated constants.
ColumnPtr column;
/// Some functions like `ignore()` always return a constant result but can't be replaced by that constant.
/// We calculate such constants to avoid unnecessary materialization, but prohibit their folding.
bool allow_constant_folding = true;
};
/// Index is used to:
/// * find a Node by its result_name
/// * specify the order of columns in the result
/// It represents the set of available columns.
/// Removing a column from the index is equivalent to removing it from the final result.
///
/// The DAG allows actions with duplicate result names; in this case the index points to the last added Node.
/// It does not cause any problems as long as execution of actions no longer depends on action names.
///
/// Index is a list of nodes + [map: name -> list::iterator].
/// The list is ordered and may contain nodes with the same name, or one node several times.
class Index
{
private:
std::list<Node *> list;
/// The map key is a string_view into Node::result_name of the node stored in the value.
/// The map always points to an existing node, so the key stays valid (nodes live longer than the index).
std::unordered_map<std::string_view, std::list<Node *>::iterator> map;
public:
auto size() const { return list.size(); }
bool contains(std::string_view key) const { return map.count(key) != 0; }
std::list<Node *>::iterator begin() { return list.begin(); }
std::list<Node *>::iterator end() { return list.end(); }
std::list<Node *>::const_iterator begin() const { return list.begin(); }
std::list<Node *>::const_iterator end() const { return list.end(); }
std::list<Node *>::const_iterator find(std::string_view key) const
{
auto it = map.find(key);
if (it == map.end())
return list.end();
return it->second;
}
/// The insert method doesn't check if the map already has a node with the same name.
/// If a node with the same name exists, it is removed from the map, but not from the list.
/// This is expected and used by project(), whose result may have several columns with the same name.
void insert(Node * node) { map[node->result_name] = list.emplace(list.end(), node); }
/// If node with same name exists in index, replace it. Otherwise insert new node to index.
void replace(Node * node)
{
if (auto handle = map.extract(node->result_name))
{
handle.key() = node->result_name; /// Change string_view
*handle.mapped() = node;
map.insert(std::move(handle));
}
else
insert(node);
}
void remove(Node * node)
{
auto it = map.find(node->result_name);
if (it == map.end())
return;
list.erase(it->second);
map.erase(it);
}
void swap(Index & other)
{
list.swap(other.list);
map.swap(other.map);
}
};
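/// Editor's note (not part of this commit): hedged walk-through of insert() vs
/// replace() with three nodes a, b, c all named "x":
///     index.insert(&a);   /// list: [a],    map: x -> a
///     index.insert(&b);   /// list: [a, b], map: x -> b  (a stays in the list, shadowed)
///     index.replace(&c);  /// list: [a, c], map: x -> c  (c reuses b's list slot)
/// insert() may therefore leave duplicates in the list, which project() relies on;
/// replace() swaps the mapped entry in place.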
using Nodes = std::list<Node>;
struct ActionsSettings
{
size_t max_temporary_columns = 0;
size_t max_temporary_non_const_columns = 0;
size_t min_count_to_compile_expression = 0;
bool compile_expressions = false;
bool project_input = false;
bool projected_output = false;
};
private:
Nodes nodes;
Index index;
ActionsSettings settings;
#if USE_EMBEDDED_COMPILER
std::shared_ptr<CompiledExpressionCache> compilation_cache;
#endif
public:
ActionsDAG() = default;
ActionsDAG(const ActionsDAG &) = delete;
ActionsDAG & operator=(const ActionsDAG &) = delete;
explicit ActionsDAG(const NamesAndTypesList & inputs);
explicit ActionsDAG(const ColumnsWithTypeAndName & inputs);
const Nodes & getNodes() const { return nodes; }
const Index & getIndex() const { return index; }
NamesAndTypesList getRequiredColumns() const;
ColumnsWithTypeAndName getResultColumns() const;
NamesAndTypesList getNamesAndTypesList() const;
Names getNames() const;
std::string dumpNames() const;
std::string dumpDAG() const;
const Node & addInput(std::string name, DataTypePtr type);
const Node & addInput(ColumnWithTypeAndName column);
const Node & addColumn(ColumnWithTypeAndName column);
const Node & addAlias(const std::string & name, std::string alias, bool can_replace = false);
const Node & addArrayJoin(const std::string & source_name, std::string result_name);
const Node & addFunction(
const FunctionOverloadResolverPtr & function,
const Names & argument_names,
std::string result_name,
const Context & context);
/// Call addAlias several times.
void addAliases(const NamesWithAliases & aliases);
/// Add alias actions and remove unused columns from index. Also specify result columns order in index.
void project(const NamesWithAliases & projection);
/// Removes column from index.
void removeColumn(const std::string & column_name);
/// If column is not in index, try to find it in nodes and insert back into index.
bool tryRestoreColumn(const std::string & column_name);
void projectInput() { settings.project_input = true; }
void removeUnusedActions(const Names & required_names);
/// Splits actions into two parts. The returned half may be moved before ARRAY JOIN.
/// Returns nullptr if no actions may be moved before ARRAY JOIN.
ActionsDAGPtr splitActionsBeforeArrayJoin(const NameSet & array_joined_columns);
bool hasArrayJoin() const;
bool empty() const; /// If actions only contain inputs.
const ActionsSettings & getSettings() const { return settings; }
void compileExpressions();
ActionsDAGPtr clone() const;
private:
Node & addNode(Node node, bool can_replace = false);
Node & getNode(const std::string & name);
ActionsDAGPtr cloneEmpty() const
{
auto actions = std::make_shared<ActionsDAG>();
actions->settings = settings;
#if USE_EMBEDDED_COMPILER
actions->compilation_cache = compilation_cache;
#endif
return actions;
}
void removeUnusedActions(const std::vector<Node *> & required_nodes);
void removeUnusedActions();
void addAliases(const NamesWithAliases & aliases, std::vector<Node *> & result_nodes);
void compileFunctions();
};
}
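An end-to-end usage sketch of this interface (editor's illustration, not part of this commit): 'upper_resolver', 'context', and the input column "name" are assumptions supplied by the caller rather than anything defined in this header.

ActionsDAGPtr buildUpperDag(const NamesAndTypesList & input_columns,
                            const FunctionOverloadResolverPtr & upper_resolver,
                            const Context & context)
{
    auto dag = std::make_shared<ActionsDAG>(input_columns);       /// One INPUT node per column.
    dag->addFunction(upper_resolver, {"name"}, "upper_name", context);
    dag->project({{"upper_name", "display_name"}});               /// Alias it and drop everything unused.
    return dag;                                                   /// Ready to build ExpressionActions from.
}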


@ -350,7 +350,7 @@ SetPtr makeExplicitSet(
auto it = index.find(left_arg->getColumnName());
if (it == index.end())
throw Exception("Unknown identifier: '" + left_arg->getColumnName() + "'", ErrorCodes::UNKNOWN_IDENTIFIER);
const DataTypePtr & left_arg_type = it->second->result_type;
const DataTypePtr & left_arg_type = (*it)->result_type;
DataTypes set_element_types = {left_arg_type};
const auto * left_tuple_type = typeid_cast<const DataTypeTuple *>(left_arg_type.get());
@ -383,7 +383,7 @@ SetPtr makeExplicitSet(
ActionsMatcher::Data::Data(
const Context & context_, SizeLimits set_size_limit_, size_t subquery_depth_,
const NamesAndTypesList & source_columns_, ActionsDAGPtr actions,
const NamesAndTypesList & source_columns_, ActionsDAGPtr actions_dag,
PreparedSets & prepared_sets_, SubqueriesForSets & subqueries_for_sets_,
bool no_subqueries_, bool no_makeset_, bool only_consts_, bool create_source_for_in_)
: context(context_)
@ -397,45 +397,45 @@ ActionsMatcher::Data::Data(
, only_consts(only_consts_)
, create_source_for_in(create_source_for_in_)
, visit_depth(0)
, actions_stack(std::move(actions), context)
, actions_stack(std::move(actions_dag), context)
, next_unique_suffix(actions_stack.getLastActions().getIndex().size() + 1)
{
}
bool ActionsMatcher::Data::hasColumn(const String & column_name) const
{
return actions_stack.getLastActions().getIndex().count(column_name) != 0;
return actions_stack.getLastActions().getIndex().contains(column_name);
}
ScopeStack::ScopeStack(ActionsDAGPtr actions, const Context & context_)
ScopeStack::ScopeStack(ActionsDAGPtr actions_dag, const Context & context_)
: context(context_)
{
auto & level = stack.emplace_back();
level.actions = std::move(actions);
level.actions_dag = std::move(actions_dag);
for (const auto & [name, node] : level.actions->getIndex())
if (node->type == ActionsDAG::Type::INPUT)
level.inputs.emplace(name);
for (const auto & node : level.actions_dag->getIndex())
if (node->type == ActionsDAG::ActionType::INPUT)
level.inputs.emplace(node->result_name);
}
void ScopeStack::pushLevel(const NamesAndTypesList & input_columns)
{
auto & level = stack.emplace_back();
level.actions = std::make_shared<ActionsDAG>();
level.actions_dag = std::make_shared<ActionsDAG>();
const auto & prev = stack[stack.size() - 2];
for (const auto & input_column : input_columns)
{
level.actions->addInput(input_column.name, input_column.type);
level.actions_dag->addInput(input_column.name, input_column.type);
level.inputs.emplace(input_column.name);
}
const auto & index = level.actions->getIndex();
const auto & index = level.actions_dag->getIndex();
for (const auto & [name, node] : prev.actions->getIndex())
for (const auto & node : prev.actions_dag->getIndex())
{
if (index.count(name) == 0)
level.actions->addInput({node->column, node->result_type, node->result_name});
if (!index.contains(node->result_name))
level.actions_dag->addInput({node->column, node->result_type, node->result_name});
}
}
@ -448,10 +448,10 @@ size_t ScopeStack::getColumnLevel(const std::string & name)
if (stack[i].inputs.count(name))
return i;
const auto & index = stack[i].actions->getIndex();
const auto & index = stack[i].actions_dag->getIndex();
auto it = index.find(name);
if (it != index.end() && it->second->type != ActionsDAG::Type::INPUT)
if (it != index.end() && (*it)->type != ActionsDAG::ActionType::INPUT)
return i;
}
@ -460,66 +460,65 @@ size_t ScopeStack::getColumnLevel(const std::string & name)
void ScopeStack::addColumn(ColumnWithTypeAndName column)
{
const auto & node = stack[0].actions->addColumn(std::move(column));
const auto & node = stack[0].actions_dag->addColumn(std::move(column));
for (size_t j = 1; j < stack.size(); ++j)
stack[j].actions->addInput({node.column, node.result_type, node.result_name});
stack[j].actions_dag->addInput({node.column, node.result_type, node.result_name});
}
void ScopeStack::addAlias(const std::string & name, std::string alias)
{
auto level = getColumnLevel(name);
const auto & node = stack[level].actions->addAlias(name, std::move(alias));
const auto & node = stack[level].actions_dag->addAlias(name, std::move(alias));
for (size_t j = level + 1; j < stack.size(); ++j)
stack[j].actions->addInput({node.column, node.result_type, node.result_name});
stack[j].actions_dag->addInput({node.column, node.result_type, node.result_name});
}
void ScopeStack::addArrayJoin(const std::string & source_name, std::string result_name, std::string unique_column_name)
void ScopeStack::addArrayJoin(const std::string & source_name, std::string result_name)
{
getColumnLevel(source_name);
if (stack.front().actions->getIndex().count(source_name) == 0)
if (!stack.front().actions_dag->getIndex().contains(source_name))
throw Exception("Expression with arrayJoin cannot depend on lambda argument: " + source_name,
ErrorCodes::BAD_ARGUMENTS);
const auto & node = stack.front().actions->addArrayJoin(source_name, std::move(result_name), std::move(unique_column_name));
const auto & node = stack.front().actions_dag->addArrayJoin(source_name, std::move(result_name));
for (size_t j = 1; j < stack.size(); ++j)
stack[j].actions->addInput({node.column, node.result_type, node.result_name});
stack[j].actions_dag->addInput({node.column, node.result_type, node.result_name});
}
void ScopeStack::addFunction(
const FunctionOverloadResolverPtr & function,
const Names & argument_names,
std::string result_name,
bool compile_expressions)
std::string result_name)
{
size_t level = 0;
for (const auto & argument : argument_names)
level = std::max(level, getColumnLevel(argument));
const auto & node = stack[level].actions->addFunction(function, argument_names, std::move(result_name), compile_expressions);
const auto & node = stack[level].actions_dag->addFunction(function, argument_names, std::move(result_name), context);
for (size_t j = level + 1; j < stack.size(); ++j)
stack[j].actions->addInput({node.column, node.result_type, node.result_name});
stack[j].actions_dag->addInput({node.column, node.result_type, node.result_name});
}
ActionsDAGPtr ScopeStack::popLevel()
{
auto res = std::move(stack.back());
stack.pop_back();
return res.actions;
return res.actions_dag;
}
std::string ScopeStack::dumpNames() const
{
return stack.back().actions->dumpNames();
return stack.back().actions_dag->dumpNames();
}
const ActionsDAG & ScopeStack::getLastActions() const
{
return *stack.back().actions;
return *stack.back().actions_dag;
}
bool ActionsMatcher::needChildVisit(const ASTPtr & node, const ASTPtr & child)
@ -572,7 +571,7 @@ std::optional<NameAndTypePair> ActionsMatcher::getNameAndTypeFromAST(const ASTPt
const auto & index = data.actions_stack.getLastActions().getIndex();
auto it = index.find(child_column_name);
if (it != index.end())
return NameAndTypePair(child_column_name, it->second->result_type);
return NameAndTypePair(child_column_name, (*it)->result_type);
if (!data.only_consts)
throw Exception("Unknown identifier: " + child_column_name + " there are columns: " + data.actions_stack.dumpNames(),
@ -892,10 +891,12 @@ void ActionsMatcher::visit(const ASTFunction & node, const ASTPtr & ast, Data &
data.actions_stack.pushLevel(lambda_arguments);
visit(lambda->arguments->children.at(1), data);
auto lambda_dag = data.actions_stack.popLevel();
auto lambda_actions = lambda_dag->buildExpressions(data.context);
String result_name = lambda->arguments->children.at(1)->getColumnName();
lambda_actions->finalize(Names(1, result_name));
lambda_dag->removeUnusedActions(Names(1, result_name));
auto lambda_actions = std::make_shared<ExpressionActions>(lambda_dag);
DataTypePtr result_type = lambda_actions->getSampleBlock().getByName(result_name).type;
Names captured;
@ -954,7 +955,7 @@ void ActionsMatcher::visit(const ASTLiteral & literal, const ASTPtr & /* ast */,
auto it = index.find(default_name);
if (it != index.end())
existing_column = it->second;
existing_column = *it;
/*
* To approximate CSE, bind all identical literals to a single temporary
@ -1051,7 +1052,7 @@ SetPtr ActionsMatcher::makeSet(const ASTFunction & node, Data & data, bool no_su
* - this function shows the expression IN_data1.
*
* In case that we have HAVING with IN subquery, we have to force creating set for it.
* Also it doesn't make sense if it is GLOBAL IN or ordinary IN.
*/
if (!subquery_for_set.source && data.create_source_for_in)
{
@ -1068,7 +1069,7 @@ SetPtr ActionsMatcher::makeSet(const ASTFunction & node, Data & data, bool no_su
{
const auto & last_actions = data.actions_stack.getLastActions();
const auto & index = last_actions.getIndex();
if (index.count(left_in_operand->getColumnName()) != 0)
if (index.contains(left_in_operand->getColumnName()))
/// An explicit enumeration of values in parentheses.
return makeExplicitSet(&node, last_actions, false, data.context, data.set_size_limit, data.prepared_sets);
else
