Merge branch 'master' into Entrypoint_Issue

Heena Bansal 2022-08-11 00:10:45 -04:00 committed by GitHub
commit fd03038722
166 changed files with 2203 additions and 789 deletions

View File

@@ -104,6 +104,7 @@ if [ -n "$MAKE_DEB" ]; then
fi
mv ./programs/clickhouse* /output
[ -x ./programs/self-extracting/clickhouse ] && mv ./programs/self-extracting/clickhouse /output
mv ./src/unit_tests_dbms /output ||: # may not exist for some binary builds
find . -name '*.so' -print -exec mv '{}' /output \;
find . -name '*.so.*' -print -exec mv '{}' /output \;

View File

@@ -69,6 +69,8 @@ function download
wget_with_retry "$BINARY_URL_TO_DOWNLOAD"
chmod +x clickhouse
# clickhouse may be compressed - run once to decompress
./clickhouse ||:
ln -s ./clickhouse ./clickhouse-server
ln -s ./clickhouse ./clickhouse-client

View File

@@ -5,7 +5,7 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.3.1.1262-prestable FIXME as compared to v22.2.1.2139-prestable
### ClickHouse release v22.3.1.1262-prestable (92ab33f560e) FIXME as compared to v22.2.1.2139-prestable (75366fc95e5)
#### Backward Incompatible Change
* Improved how the `toDateTime` function handles overflow: when the date string is very large, it will be converted to 1970. [#32898](https://github.com/ClickHouse/ClickHouse/pull/32898) ([HaiBo Li](https://github.com/marising)).
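For illustration, a hedged sketch of the behavior this entry describes (the exact clamping rules live in the PR, not shown here):

``` sql
-- A date string far beyond the DateTime range; per the entry above it is
-- converted to the 1970 epoch instead of overflowing into a garbage value.
SELECT toDateTime('2969-01-01 00:00:00');
```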

View File

@@ -0,0 +1,30 @@
---
sidebar_position: 1
sidebar_label: 2022
---
# 2022 Changelog
### ClickHouse release v22.3.10.22-lts (25886f517d4) FIXME as compared to v22.3.9.19-lts (7976930b82e)
#### Bug Fix
* Backported in [#39761](https://github.com/ClickHouse/ClickHouse/issues/39761): Fix seeking while reading from encrypted disk. This PR fixes [#38381](https://github.com/ClickHouse/ClickHouse/issues/38381). [#39687](https://github.com/ClickHouse/ClickHouse/pull/39687) ([Vitaly Baranov](https://github.com/vitlibar)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#39206](https://github.com/ClickHouse/ClickHouse/issues/39206): Fix reading of sparse columns from `MergeTree` tables that store their data in S3. [#37978](https://github.com/ClickHouse/ClickHouse/pull/37978) ([Anton Popov](https://github.com/CurtizJ)).
* Backported in [#39381](https://github.com/ClickHouse/ClickHouse/issues/39381): Fixed error `Not found column Type in block` in selects with `PREWHERE` and read-in-order optimizations. [#39157](https://github.com/ClickHouse/ClickHouse/pull/39157) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Backported in [#39588](https://github.com/ClickHouse/ClickHouse/issues/39588): Fix data race and possible heap-buffer-overflow in Avro format. Closes [#39094](https://github.com/ClickHouse/ClickHouse/issues/39094) Closes [#33652](https://github.com/ClickHouse/ClickHouse/issues/33652). [#39498](https://github.com/ClickHouse/ClickHouse/pull/39498) ([Kruglov Pavel](https://github.com/Avogar)).
* Backported in [#39610](https://github.com/ClickHouse/ClickHouse/issues/39610): Fix bug with the `maxsplit` argument of `splitByChar`, which was not working correctly. [#39552](https://github.com/ClickHouse/ClickHouse/pull/39552) ([filimonov](https://github.com/filimonov)).
* Backported in [#39834](https://github.com/ClickHouse/ClickHouse/issues/39834): Fix `CANNOT_READ_ALL_DATA` exception with `local_filesystem_read_method=pread_threadpool`. This bug affected only Linux kernel versions 5.9 and 5.10, according to [man](https://manpages.debian.org/testing/manpages-dev/preadv2.2.en.html#BUGS). [#39800](https://github.com/ClickHouse/ClickHouse/pull/39800) ([Anton Popov](https://github.com/CurtizJ)).
#### Bug Fix (user-visible misbehaviour in official stable or prestable release)
* Backported in [#39238](https://github.com/ClickHouse/ClickHouse/issues/39238): Fix performance regression of scalar query optimization. [#35986](https://github.com/ClickHouse/ClickHouse/pull/35986) ([Amos Bird](https://github.com/amosbird)).
* Backported in [#39531](https://github.com/ClickHouse/ClickHouse/issues/39531): Fix some issues with async reads from remote filesystem which happened when reading low cardinality columns. [#36763](https://github.com/ClickHouse/ClickHouse/pull/36763) ([Kseniia Sumarokova](https://github.com/kssenii)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Replace MemoryTrackerBlockerInThread with LockMemoryExceptionInThread [#39619](https://github.com/ClickHouse/ClickHouse/pull/39619) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Change mysql-odbc url [#39702](https://github.com/ClickHouse/ClickHouse/pull/39702) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

View File

@@ -0,0 +1,20 @@
---
sidebar_position: 1
sidebar_label: 2022
---
# 2022 Changelog
### ClickHouse release v22.3.11.12-lts (137c5f72657) FIXME as compared to v22.3.10.22-lts (25886f517d4)
#### Build/Testing/Packaging Improvement
* Backported in [#39881](https://github.com/ClickHouse/ClickHouse/issues/39881): Former packages used to install systemd.service file to `/etc`. The files there are marked as `conf` and are not cleaned out, and not updated automatically. This PR cleans them out. [#39323](https://github.com/ClickHouse/ClickHouse/pull/39323) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#39336](https://github.com/ClickHouse/ClickHouse/issues/39336): Fix `parallel_view_processing=1` with `optimize_trivial_insert_select=1`. Fix `max_insert_threads` while pushing to views. [#38731](https://github.com/ClickHouse/ClickHouse/pull/38731) ([Azat Khuzhin](https://github.com/azat)).
#### NO CL ENTRY
* NO CL ENTRY: 'Revert "Backport [#39687](https://github.com/ClickHouse/ClickHouse/issues/39687) to 22.3: Fix seeking while reading from encrypted disk"'. [#40052](https://github.com/ClickHouse/ClickHouse/pull/40052) ([Alexey Milovidov](https://github.com/alexey-milovidov)).

View File

@@ -5,5 +5,5 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.3.2.2-lts FIXME as compared to v22.3.1.1262-prestable
### ClickHouse release v22.3.2.2-lts (89a621679c6) FIXME as compared to v22.3.1.1262-prestable (92ab33f560e)

View File

@@ -5,7 +5,7 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.3.3.44-lts FIXME as compared to v22.3.2.2-lts
### ClickHouse release v22.3.3.44-lts (abb756d3ca2) FIXME as compared to v22.3.2.2-lts (89a621679c6)
#### Bug Fix
* Backported in [#35928](https://github.com/ClickHouse/ClickHouse/issues/35928): Added settings `input_format_ipv4_default_on_conversion_error`, `input_format_ipv6_default_on_conversion_error` to allow insert of invalid ip address values as default into tables. Closes [#35726](https://github.com/ClickHouse/ClickHouse/issues/35726). [#35733](https://github.com/ClickHouse/ClickHouse/pull/35733) ([Maksim Kita](https://github.com/kitaisreal)).
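A minimal sketch of what these settings enable; the table name and engine here are hypothetical:

``` sql
CREATE TABLE ips (addr IPv4) ENGINE = MergeTree ORDER BY addr;

SET input_format_ipv4_default_on_conversion_error = 1;
-- Per the entry above, an unparsable address is now inserted as the type's
-- default value (0.0.0.0) instead of failing the whole INSERT.
INSERT INTO ips VALUES ('not-an-address');
```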

View File

@@ -5,7 +5,7 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.3.4.20-lts FIXME as compared to v22.3.3.44-lts
### ClickHouse release v22.3.4.20-lts (ecbaf001f49) FIXME as compared to v22.3.3.44-lts (abb756d3ca2)
#### Build/Testing/Packaging Improvement
* Add `_le_` method for ClickHouseVersion; fix auto_version for existing tags; docker_server now supports getting the version from tags; add python unit tests to the backport workflow. [#36028](https://github.com/ClickHouse/ClickHouse/pull/36028) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

View File

@@ -5,7 +5,7 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.3.5.5-lts FIXME as compared to v22.3.4.20-lts
### ClickHouse release v22.3.5.5-lts (438b4a81f77) FIXME as compared to v22.3.4.20-lts (ecbaf001f49)
#### Bug Fix (user-visible misbehaviour in official stable or prestable release)

View File

@@ -5,7 +5,7 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.3.6.5-lts FIXME as compared to v22.3.5.5-lts
### ClickHouse release v22.3.6.5-lts (3e44e824cff) FIXME as compared to v22.3.5.5-lts (438b4a81f77)
#### Bug Fix (user-visible misbehaviour in official stable or prestable release)

View File

@@ -5,7 +5,7 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.3.7.28-lts FIXME as compared to v22.3.6.5-lts
### ClickHouse release v22.3.7.28-lts (420bdfa2751) FIXME as compared to v22.3.6.5-lts (3e44e824cff)
#### Bug Fix (user-visible misbehavior in official stable or prestable release)

View File

@@ -0,0 +1,32 @@
---
sidebar_position: 1
sidebar_label: 2022
---
# 2022 Changelog
### ClickHouse release v22.3.8.39-lts (6bcf982f58b) FIXME as compared to v22.3.7.28-lts (420bdfa2751)
#### Build/Testing/Packaging Improvement
* Backported in [#38826](https://github.com/ClickHouse/ClickHouse/issues/38826): Change `all|noarch` packages to architecture-dependent; fix some documentation for it; push aarch64|arm64 packages to artifactory and release assets. Fixes [#36443](https://github.com/ClickHouse/ClickHouse/issues/36443). [#38580](https://github.com/ClickHouse/ClickHouse/pull/38580) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#38453](https://github.com/ClickHouse/ClickHouse/issues/38453): Fix bug with nested short-circuit functions that led to execution of arguments even if condition is false. Closes [#38040](https://github.com/ClickHouse/ClickHouse/issues/38040). [#38173](https://github.com/ClickHouse/ClickHouse/pull/38173) ([Kruglov Pavel](https://github.com/Avogar)).
* Backported in [#38710](https://github.com/ClickHouse/ClickHouse/issues/38710): Fix incorrect result of distributed queries with `DISTINCT` and `LIMIT`. Fixes [#38282](https://github.com/ClickHouse/ClickHouse/issues/38282). [#38371](https://github.com/ClickHouse/ClickHouse/pull/38371) ([Anton Popov](https://github.com/CurtizJ)).
* Backported in [#38689](https://github.com/ClickHouse/ClickHouse/issues/38689): Now it's possible to start a clickhouse-server and attach/detach tables even for tables with the incorrect values of IPv4/IPv6 representation. Proper fix for issue [#35156](https://github.com/ClickHouse/ClickHouse/issues/35156). [#38590](https://github.com/ClickHouse/ClickHouse/pull/38590) ([alesapin](https://github.com/alesapin)).
* Backported in [#38776](https://github.com/ClickHouse/ClickHouse/issues/38776): `rankCorr` function will work correctly if some arguments are NaNs. This closes [#38396](https://github.com/ClickHouse/ClickHouse/issues/38396). [#38722](https://github.com/ClickHouse/ClickHouse/pull/38722) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#38780](https://github.com/ClickHouse/ClickHouse/issues/38780): Fix use-after-free for Map combinator that leads to incorrect result. [#38748](https://github.com/ClickHouse/ClickHouse/pull/38748) ([Azat Khuzhin](https://github.com/azat)).
#### Bug Fix (user-visible misbehaviour in official stable or prestable release)
* Backported in [#36818](https://github.com/ClickHouse/ClickHouse/issues/36818): Fix projection analysis which might lead to wrong query result when IN subquery is used. This fixes [#35336](https://github.com/ClickHouse/ClickHouse/issues/35336). [#35631](https://github.com/ClickHouse/ClickHouse/pull/35631) ([Amos Bird](https://github.com/amosbird)).
* Backported in [#38467](https://github.com/ClickHouse/ClickHouse/issues/38467): Fix potential error with literals in `WHERE` for join queries. Closes [#36279](https://github.com/ClickHouse/ClickHouse/issues/36279). [#36542](https://github.com/ClickHouse/ClickHouse/pull/36542) ([Vladimir C](https://github.com/vdimir)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Try to fix some trash [#37303](https://github.com/ClickHouse/ClickHouse/pull/37303) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Update docker-compose to try to get rid of v1 errors [#38394](https://github.com/ClickHouse/ClickHouse/pull/38394) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Trying to backport useful features for CI [#38510](https://github.com/ClickHouse/ClickHouse/pull/38510) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix backports diff [#38703](https://github.com/ClickHouse/ClickHouse/pull/38703) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

View File

@@ -0,0 +1,24 @@
---
sidebar_position: 1
sidebar_label: 2022
---
# 2022 Changelog
### ClickHouse release v22.3.9.19-lts (7976930b82e) FIXME as compared to v22.3.8.39-lts (6bcf982f58b)
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#39097](https://github.com/ClickHouse/ClickHouse/issues/39097): Any allocations inside OvercommitTracker may lead to deadlock. Logging was not very informative so it's easier just to remove logging. Fixes [#37794](https://github.com/ClickHouse/ClickHouse/issues/37794). [#39030](https://github.com/ClickHouse/ClickHouse/pull/39030) ([Dmitry Novik](https://github.com/novikd)).
* Backported in [#39080](https://github.com/ClickHouse/ClickHouse/issues/39080): Fix bug in filesystem cache that could happen in some corner case which coincided with cache capacity hitting the limit. Closes [#39066](https://github.com/ClickHouse/ClickHouse/issues/39066). [#39070](https://github.com/ClickHouse/ClickHouse/pull/39070) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Backported in [#39149](https://github.com/ClickHouse/ClickHouse/issues/39149): Fix error `Block structure mismatch` which could happen for INSERT into table with attached MATERIALIZED VIEW and enabled setting `extremes = 1`. Closes [#29759](https://github.com/ClickHouse/ClickHouse/issues/29759) and [#38729](https://github.com/ClickHouse/ClickHouse/issues/38729). [#39125](https://github.com/ClickHouse/ClickHouse/pull/39125) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#39372](https://github.com/ClickHouse/ClickHouse/issues/39372): Declare RabbitMQ queue without default arguments `x-max-length` and `x-overflow`. [#39259](https://github.com/ClickHouse/ClickHouse/pull/39259) ([rnbondarenko](https://github.com/rnbondarenko)).
* Backported in [#39379](https://github.com/ClickHouse/ClickHouse/issues/39379): Fix segmentation fault in MaterializedPostgreSQL database engine, which could happen if some exception occurred at replication initialisation. Closes [#36939](https://github.com/ClickHouse/ClickHouse/issues/36939). [#39272](https://github.com/ClickHouse/ClickHouse/pull/39272) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Backported in [#39351](https://github.com/ClickHouse/ClickHouse/issues/39351): Fix incorrect "fetch PostgreSQL tables" query for the PostgreSQL database engine. Closes [#33502](https://github.com/ClickHouse/ClickHouse/issues/33502). [#39283](https://github.com/ClickHouse/ClickHouse/pull/39283) ([Kseniia Sumarokova](https://github.com/kssenii)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Reproduce and apply a slightly better fix for the LC dict right offset. [#36856](https://github.com/ClickHouse/ClickHouse/pull/36856) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Retry docker buildx commands with progressive sleep in between [#38898](https://github.com/ClickHouse/ClickHouse/pull/38898) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add docker_server.py running to backport and release CIs [#39011](https://github.com/ClickHouse/ClickHouse/pull/39011) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

View File

@@ -5,7 +5,7 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.4.1.2305-prestable FIXME as compared to v22.3.1.1262-prestable
### ClickHouse release v22.4.1.2305-prestable (77a82cc090d) FIXME as compared to v22.3.1.1262-prestable (92ab33f560e)
#### Backward Incompatible Change
* Function `yandexConsistentHash` (consistent hashing algorithm by Konstantin "kostik" Oblakov) is renamed to `kostikConsistentHash`. The old name is left as an alias for compatibility. Although this change is backward compatible, we may remove the alias in subsequent releases, that's why it's recommended to update the usages of this function in your apps. [#35553](https://github.com/ClickHouse/ClickHouse/pull/35553) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
@@ -68,7 +68,7 @@ sidebar_label: 2022
* For lts releases packages will be pushed to both lts and stable repos. [#35382](https://github.com/ClickHouse/ClickHouse/pull/35382) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Support uuid for postgres engines. Closes [#35384](https://github.com/ClickHouse/ClickHouse/issues/35384). [#35403](https://github.com/ClickHouse/ClickHouse/pull/35403) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Add arguments `--user`, `--password`, `--host`, `--port` for clickhouse-diagnostics. [#35422](https://github.com/ClickHouse/ClickHouse/pull/35422) ([李扬](https://github.com/taiyang-li)).
* Fix `INSERT INTO table FROM INFILE` not displaying a progress bar. [#35429](https://github.com/ClickHouse/ClickHouse/pull/35429) ([xiedeyantu](https://github.com/xiedeyantu)).
* Fix `INSERT INTO table FROM INFILE` not displaying a progress bar. [#35429](https://github.com/ClickHouse/ClickHouse/pull/35429) ([chen](https://github.com/xiedeyantu)).
* Allow server to bind to low-numbered ports (e.g. 443). ClickHouse installation script will set `cap_net_bind_service` to the binary file. [#35451](https://github.com/ClickHouse/ClickHouse/pull/35451) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Add settings `input_format_orc_case_insensitive_column_matching`, `input_format_arrow_case_insensitive_column_matching`, and `input_format_parquet_case_insensitive_column_matching` which allows ClickHouse to use case insensitive matching of columns while reading data from ORC, Arrow or Parquet files. [#35459](https://github.com/ClickHouse/ClickHouse/pull/35459) ([Antonio Andelic](https://github.com/antonio2368)).
* Add explicit table info to the scan node of the query plan and pipeline. [#35460](https://github.com/ClickHouse/ClickHouse/pull/35460) ([何李夫](https://github.com/helifu)).
@@ -106,7 +106,7 @@ sidebar_label: 2022
* `ASTPartition::formatImpl` should output `ALL` while executing `ALTER TABLE t DETACH PARTITION ALL`. [#35987](https://github.com/ClickHouse/ClickHouse/pull/35987) ([awakeljw](https://github.com/awakeljw)).
* `clickhouse-keeper` starts answering 4-letter commands before getting the quorum. [#35992](https://github.com/ClickHouse/ClickHouse/pull/35992) ([Antonio Andelic](https://github.com/antonio2368)).
* Fix wrong assertion in replxx which happens when navigating back the history when the first line of input is a newline. Mark as improvement because it only affects debug build. This fixes [#34511](https://github.com/ClickHouse/ClickHouse/issues/34511). [#36007](https://github.com/ClickHouse/ClickHouse/pull/36007) ([Amos Bird](https://github.com/amosbird)).
* If someone writes `DEFAULT NULL` in a table definition, make the data type Nullable. [#35887](https://github.com/ClickHouse/ClickHouse/issues/35887). [#36058](https://github.com/ClickHouse/ClickHouse/pull/36058) ([xiedeyantu](https://github.com/xiedeyantu)).
* If someone writes `DEFAULT NULL` in a table definition, make the data type Nullable. [#35887](https://github.com/ClickHouse/ClickHouse/issues/35887). [#36058](https://github.com/ClickHouse/ClickHouse/pull/36058) ([chen](https://github.com/xiedeyantu)).
* Added `thread_id` and `query_id` columns to `system.zookeeper_log` table. [#36074](https://github.com/ClickHouse/ClickHouse/pull/36074) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Auto assign numbers for Enum elements. [#36101](https://github.com/ClickHouse/ClickHouse/pull/36101) ([awakeljw](https://github.com/awakeljw)).
* Reset thread name in `ThreadPool` to `ThreadPoolIdle` after job is done. This is to avoid displaying the old thread name for idle threads. This closes [#36114](https://github.com/ClickHouse/ClickHouse/issues/36114). [#36115](https://github.com/ClickHouse/ClickHouse/pull/36115) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
@@ -331,7 +331,7 @@ sidebar_label: 2022
* ci: replace directory system log tables artifacts with tsv [#35773](https://github.com/ClickHouse/ClickHouse/pull/35773) ([Azat Khuzhin](https://github.com/azat)).
* One more try to resurrect build hash [#35774](https://github.com/ClickHouse/ClickHouse/pull/35774) ([alesapin](https://github.com/alesapin)).
* Refactoring QueryPipeline [#35789](https://github.com/ClickHouse/ClickHouse/pull/35789) ([Amos Bird](https://github.com/amosbird)).
* Delete duplicate code [#35798](https://github.com/ClickHouse/ClickHouse/pull/35798) ([xiedeyantu](https://github.com/xiedeyantu)).
* Delete duplicate code [#35798](https://github.com/ClickHouse/ClickHouse/pull/35798) ([chen](https://github.com/xiedeyantu)).
* remove unused variable [#35800](https://github.com/ClickHouse/ClickHouse/pull/35800) ([flynn](https://github.com/ucasfl)).
* Make `SortDescription::column_name` always non-empty [#35805](https://github.com/ClickHouse/ClickHouse/pull/35805) ([Nikita Taranov](https://github.com/nickitat)).
* Fix latest_error referenced before assignment [#35807](https://github.com/ClickHouse/ClickHouse/pull/35807) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
@@ -417,7 +417,7 @@ sidebar_label: 2022
* Revert reverting "Fix crash in ParallelReadBuffer" [#36212](https://github.com/ClickHouse/ClickHouse/pull/36212) ([Kruglov Pavel](https://github.com/Avogar)).
* Make stateless tests with s3 always green [#36214](https://github.com/ClickHouse/ClickHouse/pull/36214) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Add Tyler Hannan to contributors [#36216](https://github.com/ClickHouse/ClickHouse/pull/36216) ([Tyler Hannan](https://github.com/tylerhannan)).
* Fix the repeated call of the function to get the table when dropping a table [#36248](https://github.com/ClickHouse/ClickHouse/pull/36248) ([xiedeyantu](https://github.com/xiedeyantu)).
* Fix the repeated call of the function to get the table when dropping a table [#36248](https://github.com/ClickHouse/ClickHouse/pull/36248) ([chen](https://github.com/xiedeyantu)).
* Split test 01675_data_type_coroutine into 2 tests to prevent possible timeouts [#36250](https://github.com/ClickHouse/ClickHouse/pull/36250) ([Kruglov Pavel](https://github.com/Avogar)).
* Merge TRUSTED_CONTRIBUTORS in lambda and import in check [#36252](https://github.com/ClickHouse/ClickHouse/pull/36252) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix exception "File segment can be completed only by downloader" in tests [#36253](https://github.com/ClickHouse/ClickHouse/pull/36253) ([Kseniia Sumarokova](https://github.com/kssenii)).

View File

@@ -5,5 +5,5 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.4.2.1-stable FIXME as compared to v22.4.1.2305-prestable
### ClickHouse release v22.4.2.1-stable (b34ebdc36ae) FIXME as compared to v22.4.1.2305-prestable (77a82cc090d)

View File

@@ -5,7 +5,7 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.4.3.3-stable FIXME as compared to v22.4.2.1-stable
### ClickHouse release v22.4.3.3-stable (def956d6299) FIXME as compared to v22.4.2.1-stable (b34ebdc36ae)
#### Bug Fix (user-visible misbehaviour in official stable or prestable release)

View File

@@ -5,7 +5,7 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.4.4.7-stable FIXME as compared to v22.4.3.3-stable
### ClickHouse release v22.4.4.7-stable (ba44414f9b3) FIXME as compared to v22.4.3.3-stable (def956d6299)
#### Bug Fix (user-visible misbehaviour in official stable or prestable release)

View File

@@ -5,7 +5,7 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.4.5.9-stable FIXME as compared to v22.4.4.7-stable
### ClickHouse release v22.4.5.9-stable (059ef6cadcd) FIXME as compared to v22.4.4.7-stable (ba44414f9b3)
#### Bug Fix (user-visible misbehaviour in official stable or prestable release)

View File

@@ -0,0 +1,44 @@
---
sidebar_position: 1
sidebar_label: 2022
---
# 2022 Changelog
### ClickHouse release v22.4.6.53-stable (0625731c940) FIXME as compared to v22.4.5.9-stable (059ef6cadcd)
#### New Feature
* Backported in [#38714](https://github.com/ClickHouse/ClickHouse/issues/38714): SALT is allowed for CREATE USER <user> IDENTIFIED WITH sha256_hash. [#37377](https://github.com/ClickHouse/ClickHouse/pull/37377) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
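A hedged sketch of the syntax this enables; the hash and salt below are placeholders, and the assumption is that the stored hash must be SHA256 of the password concatenated with the salt:

``` sql
-- Placeholder hex values; compute the real hash as SHA256(password + salt).
CREATE USER alice IDENTIFIED WITH sha256_hash
    BY '7A37B85C8918EAC19A9089C0FA5A2AB4DCE3F90528DCDEEC108B23DDF3607B99'
    SALT 'F36D0E3D';
```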
#### Build/Testing/Packaging Improvement
* Backported in [#38828](https://github.com/ClickHouse/ClickHouse/issues/38828): Change `all|noarch` packages to architecture-dependent; fix some documentation for it; push aarch64|arm64 packages to artifactory and release assets. Fixes [#36443](https://github.com/ClickHouse/ClickHouse/issues/36443). [#38580](https://github.com/ClickHouse/ClickHouse/pull/38580) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#37717](https://github.com/ClickHouse/ClickHouse/issues/37717): Fix unexpected errors with a clash of constant strings in aggregate function, prewhere and join. Close [#36891](https://github.com/ClickHouse/ClickHouse/issues/36891). [#37336](https://github.com/ClickHouse/ClickHouse/pull/37336) ([Vladimir C](https://github.com/vdimir)).
* Backported in [#37512](https://github.com/ClickHouse/ClickHouse/issues/37512): Fix logical error in normalizeUTF8 functions. Closes [#37298](https://github.com/ClickHouse/ClickHouse/issues/37298). [#37443](https://github.com/ClickHouse/ClickHouse/pull/37443) ([Maksim Kita](https://github.com/kitaisreal)).
* Backported in [#37941](https://github.com/ClickHouse/ClickHouse/issues/37941): Fix setting cast_ipv4_ipv6_default_on_conversion_error for internal cast function. Closes [#35156](https://github.com/ClickHouse/ClickHouse/issues/35156). [#37761](https://github.com/ClickHouse/ClickHouse/pull/37761) ([Maksim Kita](https://github.com/kitaisreal)).
* Backported in [#38452](https://github.com/ClickHouse/ClickHouse/issues/38452): Fix bug with nested short-circuit functions that led to execution of arguments even if condition is false. Closes [#38040](https://github.com/ClickHouse/ClickHouse/issues/38040). [#38173](https://github.com/ClickHouse/ClickHouse/pull/38173) ([Kruglov Pavel](https://github.com/Avogar)).
* Backported in [#38711](https://github.com/ClickHouse/ClickHouse/issues/38711): Fix incorrect result of distributed queries with `DISTINCT` and `LIMIT`. Fixes [#38282](https://github.com/ClickHouse/ClickHouse/issues/38282). [#38371](https://github.com/ClickHouse/ClickHouse/pull/38371) ([Anton Popov](https://github.com/CurtizJ)).
* Backported in [#38593](https://github.com/ClickHouse/ClickHouse/issues/38593): Fix parts removal after incorrect server shutdown (such parts would otherwise be left behind forever). [#38486](https://github.com/ClickHouse/ClickHouse/pull/38486) ([Azat Khuzhin](https://github.com/azat)).
* Backported in [#38596](https://github.com/ClickHouse/ClickHouse/issues/38596): Fix table creation to avoid replication issues with pre-22.4 replicas. [#38541](https://github.com/ClickHouse/ClickHouse/pull/38541) ([Raúl Marín](https://github.com/Algunenano)).
* Backported in [#38686](https://github.com/ClickHouse/ClickHouse/issues/38686): Now it's possible to start a clickhouse-server and attach/detach tables even for tables with the incorrect values of IPv4/IPv6 representation. Proper fix for issue [#35156](https://github.com/ClickHouse/ClickHouse/issues/35156). [#38590](https://github.com/ClickHouse/ClickHouse/pull/38590) ([alesapin](https://github.com/alesapin)).
* Backported in [#38663](https://github.com/ClickHouse/ClickHouse/issues/38663): Adapt some more nodes to avoid issues with pre-22.4 replicas. [#38627](https://github.com/ClickHouse/ClickHouse/pull/38627) ([Raúl Marín](https://github.com/Algunenano)).
* Backported in [#38777](https://github.com/ClickHouse/ClickHouse/issues/38777): `rankCorr` function will work correctly if some arguments are NaNs. This closes [#38396](https://github.com/ClickHouse/ClickHouse/issues/38396). [#38722](https://github.com/ClickHouse/ClickHouse/pull/38722) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#38781](https://github.com/ClickHouse/ClickHouse/issues/38781): Fix use-after-free for Map combinator that leads to incorrect result. [#38748](https://github.com/ClickHouse/ClickHouse/pull/38748) ([Azat Khuzhin](https://github.com/azat)).
#### Bug Fix (user-visible misbehaviour in official stable or prestable release)
* Backported in [#37456](https://github.com/ClickHouse/ClickHouse/issues/37456): Server might fail to start if it cannot resolve hostname of external ClickHouse dictionary. It's fixed. Fixes [#36451](https://github.com/ClickHouse/ClickHouse/issues/36451). [#36463](https://github.com/ClickHouse/ClickHouse/pull/36463) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Backported in [#38468](https://github.com/ClickHouse/ClickHouse/issues/38468): Fix potential error with literals in `WHERE` for join queries. Closes [#36279](https://github.com/ClickHouse/ClickHouse/issues/36279). [#36542](https://github.com/ClickHouse/ClickHouse/pull/36542) ([Vladimir C](https://github.com/vdimir)).
* Backported in [#37363](https://github.com/ClickHouse/ClickHouse/issues/37363): Fixed problem with infs in `quantileTDigest`. Fixes [#32107](https://github.com/ClickHouse/ClickHouse/issues/32107). [#37021](https://github.com/ClickHouse/ClickHouse/pull/37021) ([Vladimir Chebotarev](https://github.com/excitoon)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Integration tests [#36866](https://github.com/ClickHouse/ClickHouse/pull/36866) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Try to fix some trash [#37303](https://github.com/ClickHouse/ClickHouse/pull/37303) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Update protobuf files for kafka and rabbitmq [fix integration tests] [#37884](https://github.com/ClickHouse/ClickHouse/pull/37884) ([Nikita Taranov](https://github.com/nickitat)).
* Try to fix `test_grpc_protocol/test.py::test_progress` [#37908](https://github.com/ClickHouse/ClickHouse/pull/37908) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Update docker-compose to try to get rid of v1 errors [#38394](https://github.com/ClickHouse/ClickHouse/pull/38394) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix backports diff [#38703](https://github.com/ClickHouse/ClickHouse/pull/38703) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

View File

@@ -5,10 +5,10 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.5.1.2079-stable FIXME as compared to v22.4.1.2305-prestable
### ClickHouse release v22.5.1.2079-stable (df0cb062098) FIXME as compared to v22.4.1.2305-prestable (77a82cc090d)
#### Backward Incompatible Change
* Updated the BoringSSL module to the official FIPS compliant version. This makes ClickHouse FIPS compliant. [#35914](https://github.com/ClickHouse/ClickHouse/pull/35914) ([Meena-Renganathan](https://github.com/Meena-Renganathan)).
* Updated the BoringSSL module to the official FIPS compliant version. This makes ClickHouse FIPS compliant. [#35914](https://github.com/ClickHouse/ClickHouse/pull/35914) ([Deleted user](https://github.com/ghost)).
* Now, background merges, mutations and `OPTIMIZE` will not increment `SelectedRows` and `SelectedBytes` metrics. They (still) will increment `MergedRows` and `MergedUncompressedBytes` as it was before. [#37040](https://github.com/ClickHouse/ClickHouse/pull/37040) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
#### New Feature
@@ -20,7 +20,7 @@ sidebar_label: 2022
* Parse collations in CREATE TABLE and either throw an exception or ignore them. Closes [#35892](https://github.com/ClickHouse/ClickHouse/issues/35892). [#36271](https://github.com/ClickHouse/ClickHouse/pull/36271) ([yuuch](https://github.com/yuuch)).
* Add aliases JSONLines and NDJSON for JSONEachRow. Closes [#36303](https://github.com/ClickHouse/ClickHouse/issues/36303). [#36327](https://github.com/ClickHouse/ClickHouse/pull/36327) ([flynn](https://github.com/ucasfl)).
* Set parts_to_delay_insert and parts_to_throw_insert as query-level settings. If they are defined, they can override table-level settings. [#36371](https://github.com/ClickHouse/ClickHouse/pull/36371) ([Memo](https://github.com/Joeywzr)).
* Temporary tables can now show total rows and total bytes. [#36401](https://github.com/ClickHouse/ClickHouse/issues/36401). [#36439](https://github.com/ClickHouse/ClickHouse/pull/36439) ([xiedeyantu](https://github.com/xiedeyantu)).
* Temporary tables can now show total rows and total bytes. [#36401](https://github.com/ClickHouse/ClickHouse/issues/36401). [#36439](https://github.com/ClickHouse/ClickHouse/pull/36439) ([chen](https://github.com/xiedeyantu)).
* Added new hash function - wyHash64. [#36467](https://github.com/ClickHouse/ClickHouse/pull/36467) ([olevino](https://github.com/olevino)).
* Window function nth_value was added. [#36601](https://github.com/ClickHouse/ClickHouse/pull/36601) ([Nikolay](https://github.com/ndchikin)).
* Add MySQLDump input format. It reads all data from INSERT queries belonging to one table in the dump. If there is more than one table, by default it reads data from the first one. [#36667](https://github.com/ClickHouse/ClickHouse/pull/36667) ([Kruglov Pavel](https://github.com/Avogar)).
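A short sketch of reading such a dump; the file name is hypothetical, and the table-selection setting is an assumption:

``` sql
-- Assumption: input_format_mysql_dump_table_name selects which table's
-- INSERT statements to read when the dump contains several tables.
SET input_format_mysql_dump_table_name = 'users';
SELECT * FROM file('dump.sql', 'MySQLDump');
```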
@@ -212,7 +212,7 @@ sidebar_label: 2022
* Update version after release [#36502](https://github.com/ClickHouse/ClickHouse/pull/36502) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Followup on [#36172](https://github.com/ClickHouse/ClickHouse/issues/36172) password hash salt feature [#36510](https://github.com/ClickHouse/ClickHouse/pull/36510) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Update version_date.tsv after v22.4.2.1-stable [#36533](https://github.com/ClickHouse/ClickHouse/pull/36533) ([github-actions[bot]](https://github.com/apps/github-actions)).
* Fix log to print the 'from' path [#36535](https://github.com/ClickHouse/ClickHouse/pull/36535) ([xiedeyantu](https://github.com/xiedeyantu)).
* Fix log to print the 'from' path [#36535](https://github.com/ClickHouse/ClickHouse/pull/36535) ([chen](https://github.com/xiedeyantu)).
* Add function bin tests for Int/UInt128/UInt256 [#36537](https://github.com/ClickHouse/ClickHouse/pull/36537) ([Memo](https://github.com/Joeywzr)).
* Fix 01161_all_system_tables [#36539](https://github.com/ClickHouse/ClickHouse/pull/36539) ([Antonio Andelic](https://github.com/antonio2368)).
* Update PULL_REQUEST_TEMPLATE.md [#36543](https://github.com/ClickHouse/ClickHouse/pull/36543) ([Ivan Blinkov](https://github.com/blinkov)).

View File

@@ -0,0 +1,40 @@
---
sidebar_position: 1
sidebar_label: 2022
---
# 2022 Changelog
### ClickHouse release v22.5.2.53-stable (5fd600fda9e) FIXME as compared to v22.5.1.2079-stable (df0cb062098)
#### New Feature
* Backported in [#38713](https://github.com/ClickHouse/ClickHouse/issues/38713): SALT is allowed for CREATE USER <user> IDENTIFIED WITH sha256_hash. [#37377](https://github.com/ClickHouse/ClickHouse/pull/37377) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
#### Build/Testing/Packaging Improvement
* Backported in [#38827](https://github.com/ClickHouse/ClickHouse/issues/38827): Change `all|noarch` packages to architecture-dependent; fix some documentation for it; push aarch64|arm64 packages to artifactory and release assets. Fixes [#36443](https://github.com/ClickHouse/ClickHouse/issues/36443). [#38580](https://github.com/ClickHouse/ClickHouse/pull/38580) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#37716](https://github.com/ClickHouse/ClickHouse/issues/37716): Fix unexpected errors with a clash of constant strings in aggregate function, prewhere and join. Close [#36891](https://github.com/ClickHouse/ClickHouse/issues/36891). [#37336](https://github.com/ClickHouse/ClickHouse/pull/37336) ([Vladimir C](https://github.com/vdimir)).
* Backported in [#37408](https://github.com/ClickHouse/ClickHouse/issues/37408): Throw an exception when GROUPING SETS used with ROLLUP or CUBE. [#37367](https://github.com/ClickHouse/ClickHouse/pull/37367) ([Dmitry Novik](https://github.com/novikd)).
* Backported in [#37513](https://github.com/ClickHouse/ClickHouse/issues/37513): Fix logical error in normalizeUTF8 functions. Closes [#37298](https://github.com/ClickHouse/ClickHouse/issues/37298). [#37443](https://github.com/ClickHouse/ClickHouse/pull/37443) ([Maksim Kita](https://github.com/kitaisreal)).
* Backported in [#37942](https://github.com/ClickHouse/ClickHouse/issues/37942): Fix setting cast_ipv4_ipv6_default_on_conversion_error for internal cast function. Closes [#35156](https://github.com/ClickHouse/ClickHouse/issues/35156). [#37761](https://github.com/ClickHouse/ClickHouse/pull/37761) ([Maksim Kita](https://github.com/kitaisreal)).
* Backported in [#38451](https://github.com/ClickHouse/ClickHouse/issues/38451): Fix bug with nested short-circuit functions that led to execution of arguments even if condition is false. Closes [#38040](https://github.com/ClickHouse/ClickHouse/issues/38040). [#38173](https://github.com/ClickHouse/ClickHouse/pull/38173) ([Kruglov Pavel](https://github.com/Avogar)).
* Backported in [#38544](https://github.com/ClickHouse/ClickHouse/issues/38544): Do not allow recursive usage of OvercommitTracker during logging. Fixes [#37794](https://github.com/ClickHouse/ClickHouse/issues/37794) cc @tavplubix @davenger. [#38246](https://github.com/ClickHouse/ClickHouse/pull/38246) ([Dmitry Novik](https://github.com/novikd)).
* Backported in [#38708](https://github.com/ClickHouse/ClickHouse/issues/38708): Fix incorrect result of distributed queries with `DISTINCT` and `LIMIT`. Fixes [#38282](https://github.com/ClickHouse/ClickHouse/issues/38282). [#38371](https://github.com/ClickHouse/ClickHouse/pull/38371) ([Anton Popov](https://github.com/CurtizJ)).
* Backported in [#38595](https://github.com/ClickHouse/ClickHouse/issues/38595): Fix parts removal after incorrect server shutdown (such parts would otherwise be left behind forever). [#38486](https://github.com/ClickHouse/ClickHouse/pull/38486) ([Azat Khuzhin](https://github.com/azat)).
* Backported in [#38598](https://github.com/ClickHouse/ClickHouse/issues/38598): Fix table creation to avoid replication issues with pre-22.4 replicas. [#38541](https://github.com/ClickHouse/ClickHouse/pull/38541) ([Raúl Marín](https://github.com/Algunenano)).
* Backported in [#38688](https://github.com/ClickHouse/ClickHouse/issues/38688): Now it's possible to start a clickhouse-server and attach/detach tables even for tables with the incorrect values of IPv4/IPv6 representation. Proper fix for issue [#35156](https://github.com/ClickHouse/ClickHouse/issues/35156). [#38590](https://github.com/ClickHouse/ClickHouse/pull/38590) ([alesapin](https://github.com/alesapin)).
* Backported in [#38664](https://github.com/ClickHouse/ClickHouse/issues/38664): Adapt some more nodes to avoid issues with pre-22.4 replicas. [#38627](https://github.com/ClickHouse/ClickHouse/pull/38627) ([Raúl Marín](https://github.com/Algunenano)).
* Backported in [#38779](https://github.com/ClickHouse/ClickHouse/issues/38779): `rankCorr` function will work correctly if some arguments are NaNs. This closes [#38396](https://github.com/ClickHouse/ClickHouse/issues/38396). [#38722](https://github.com/ClickHouse/ClickHouse/pull/38722) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Backported in [#38783](https://github.com/ClickHouse/ClickHouse/issues/38783): Fix use-after-free for Map combinator that leads to incorrect result. [#38748](https://github.com/ClickHouse/ClickHouse/pull/38748) ([Azat Khuzhin](https://github.com/azat)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Try to fix some trash [#37303](https://github.com/ClickHouse/ClickHouse/pull/37303) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Update protobuf files for kafka and rabbitmq [fix integration tests] [#37884](https://github.com/ClickHouse/ClickHouse/pull/37884) ([Nikita Taranov](https://github.com/nickitat)).
* Try to fix `test_grpc_protocol/test.py::test_progress` [#37908](https://github.com/ClickHouse/ClickHouse/pull/37908) ([Alexander Tokmakov](https://github.com/tavplubix)).
* Try to fix BC check [#38178](https://github.com/ClickHouse/ClickHouse/pull/38178) ([Kruglov Pavel](https://github.com/Avogar)).
* Update docker-compose to try to get rid of v1 errors [#38394](https://github.com/ClickHouse/ClickHouse/pull/38394) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Fix backports diff [#38703](https://github.com/ClickHouse/ClickHouse/pull/38703) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

View File

@@ -0,0 +1,24 @@
---
sidebar_position: 1
sidebar_label: 2022
---
# 2022 Changelog
### ClickHouse release v22.5.3.21-stable (e03724efec5) FIXME as compared to v22.5.2.53-stable (5fd600fda9e)
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#38241](https://github.com/ClickHouse/ClickHouse/issues/38241): Fix possible crash in `Distributed` async insert in case of removing a replica from config. [#38029](https://github.com/ClickHouse/ClickHouse/pull/38029) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#39098](https://github.com/ClickHouse/ClickHouse/issues/39098): Any allocations inside OvercommitTracker may lead to deadlock. Logging was not very informative so it's easier just to remove logging. Fixes [#37794](https://github.com/ClickHouse/ClickHouse/issues/37794). [#39030](https://github.com/ClickHouse/ClickHouse/pull/39030) ([Dmitry Novik](https://github.com/novikd)).
* Backported in [#39078](https://github.com/ClickHouse/ClickHouse/issues/39078): Fix bug in filesystem cache that could happen in some corner case which coincided with cache capacity hitting the limit. Closes [#39066](https://github.com/ClickHouse/ClickHouse/issues/39066). [#39070](https://github.com/ClickHouse/ClickHouse/pull/39070) ([Kseniia Sumarokova](https://github.com/kssenii)).
* Backported in [#39152](https://github.com/ClickHouse/ClickHouse/issues/39152): Fix error `Block structure mismatch` which could happen for INSERT into table with attached MATERIALIZED VIEW and enabled setting `extremes = 1`. Closes [#29759](https://github.com/ClickHouse/ClickHouse/issues/29759) and [#38729](https://github.com/ClickHouse/ClickHouse/issues/38729). [#39125](https://github.com/ClickHouse/ClickHouse/pull/39125) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Backported in [#39274](https://github.com/ClickHouse/ClickHouse/issues/39274): Fixed error `Not found column Type in block` in selects with `PREWHERE` and read-in-order optimizations. [#39157](https://github.com/ClickHouse/ClickHouse/pull/39157) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Backported in [#39369](https://github.com/ClickHouse/ClickHouse/issues/39369): Declare RabbitMQ queue without default arguments `x-max-length` and `x-overflow`. [#39259](https://github.com/ClickHouse/ClickHouse/pull/39259) ([rnbondarenko](https://github.com/rnbondarenko)).
* Backported in [#39350](https://github.com/ClickHouse/ClickHouse/issues/39350): Fix incorrect "fetch PostgreSQL tables" query for the PostgreSQL database engine. Closes [#33502](https://github.com/ClickHouse/ClickHouse/issues/33502). [#39283](https://github.com/ClickHouse/ClickHouse/pull/39283) ([Kseniia Sumarokova](https://github.com/kssenii)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Retry docker buildx commands with progressive sleep in between [#38898](https://github.com/ClickHouse/ClickHouse/pull/38898) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
* Add docker_server.py running to backport and release CIs [#39011](https://github.com/ClickHouse/ClickHouse/pull/39011) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

View File

@@ -0,0 +1,29 @@
---
sidebar_position: 1
sidebar_label: 2022
---
# 2022 Changelog
### ClickHouse release v22.5.4.19-stable (c893bba830e) FIXME as compared to v22.5.3.21-stable (e03724efec5)
#### Bug Fix
* Backported in [#39748](https://github.com/ClickHouse/ClickHouse/issues/39748): Fix seeking while reading from encrypted disk. This PR fixes [#38381](https://github.com/ClickHouse/ClickHouse/issues/38381). [#39687](https://github.com/ClickHouse/ClickHouse/pull/39687) ([Vitaly Baranov](https://github.com/vitlibar)).
#### Build/Testing/Packaging Improvement
* Backported in [#39882](https://github.com/ClickHouse/ClickHouse/issues/39882): Former packages used to install systemd.service file to `/etc`. The files there are marked as `conf` and are not cleaned out, and not updated automatically. This PR cleans them out. [#39323](https://github.com/ClickHouse/ClickHouse/pull/39323) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#39209](https://github.com/ClickHouse/ClickHouse/issues/39209): Fix reading of sparse columns from `MergeTree` tables that store their data in S3. [#37978](https://github.com/ClickHouse/ClickHouse/pull/37978) ([Anton Popov](https://github.com/CurtizJ)).
* Backported in [#39589](https://github.com/ClickHouse/ClickHouse/issues/39589): Fix data race and possible heap-buffer-overflow in Avro format. Closes [#39094](https://github.com/ClickHouse/ClickHouse/issues/39094) Closes [#33652](https://github.com/ClickHouse/ClickHouse/issues/33652). [#39498](https://github.com/ClickHouse/ClickHouse/pull/39498) ([Kruglov Pavel](https://github.com/Avogar)).
* Backported in [#39611](https://github.com/ClickHouse/ClickHouse/issues/39611): Fix bug with the `maxsplit` argument of `splitByChar`, which was not working correctly. [#39552](https://github.com/ClickHouse/ClickHouse/pull/39552) ([filimonov](https://github.com/filimonov)).
* Backported in [#39790](https://github.com/ClickHouse/ClickHouse/issues/39790): Fix wrong index analysis with tuples and operator `IN`, which could lead to wrong query result. [#39752](https://github.com/ClickHouse/ClickHouse/pull/39752) ([Anton Popov](https://github.com/CurtizJ)).
* Backported in [#39835](https://github.com/ClickHouse/ClickHouse/issues/39835): Fix `CANNOT_READ_ALL_DATA` exception with `local_filesystem_read_method=pread_threadpool`. This bug affected only Linux kernel versions 5.9 and 5.10, according to [man](https://manpages.debian.org/testing/manpages-dev/preadv2.2.en.html#BUGS). [#39800](https://github.com/ClickHouse/ClickHouse/pull/39800) ([Anton Popov](https://github.com/CurtizJ)).
#### NOT FOR CHANGELOG / INSIGNIFICANT
* Fix reading from s3 in some corner cases [#38239](https://github.com/ClickHouse/ClickHouse/pull/38239) ([Anton Popov](https://github.com/CurtizJ)).
* Replace MemoryTrackerBlockerInThread with LockMemoryExceptionInThread [#39619](https://github.com/ClickHouse/ClickHouse/pull/39619) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).
* Change mysql-odbc url [#39702](https://github.com/ClickHouse/ClickHouse/pull/39702) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

View File

@@ -5,7 +5,7 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.6.1.1985-stable FIXME as compared to v22.5.1.2079-stable
### ClickHouse release v22.6.1.1985-stable (7000c4e0033) FIXME as compared to v22.5.1.2079-stable (df0cb062098)
#### Backward Incompatible Change
* Changes how settings using `seconds` as type are parsed to support floating point values (for example: `max_execution_time=0.5`). Infinity or NaN values will throw an exception. [#37187](https://github.com/ClickHouse/ClickHouse/pull/37187) ([Raúl Marín](https://github.com/Algunenano)).
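A minimal example of the new parsing:

``` sql
-- Seconds-typed settings now accept floating point values.
SET max_execution_time = 0.5; -- half a second
-- Infinity or NaN values throw an exception instead.
```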
@@ -78,7 +78,7 @@ sidebar_label: 2022
* Allow to use String type instead of Binary in Arrow/Parquet/ORC formats. This PR introduces 3 new settings for it: `output_format_arrow_string_as_string`, `output_format_parquet_string_as_string`, `output_format_orc_string_as_string`. Default value for all settings is `false`. [#37327](https://github.com/ClickHouse/ClickHouse/pull/37327) ([Kruglov Pavel](https://github.com/Avogar)).
* Apply setting `input_format_max_rows_to_read_for_schema_inference` for all read rows in total from all files in globs. Previously setting `input_format_max_rows_to_read_for_schema_inference` was applied for each file in glob separately and in case of huge number of nulls we could read first `input_format_max_rows_to_read_for_schema_inference` rows from each file and get nothing. Also increase default value for this setting to 25000. [#37332](https://github.com/ClickHouse/ClickHouse/pull/37332) ([Kruglov Pavel](https://github.com/Avogar)).
* Allow providing `NULL`/`NOT NULL` right after the type in a column declaration. [#37337](https://github.com/ClickHouse/ClickHouse/pull/37337) ([Igor Nikonov](https://github.com/devcrafter)).
* Optimize getting a read buffer for file segments in `PARTIALLY_DOWNLOADED` state. [#37338](https://github.com/ClickHouse/ClickHouse/pull/37338) ([xiedeyantu](https://github.com/xiedeyantu)).
* Optimize getting a read buffer for file segments in `PARTIALLY_DOWNLOADED` state. [#37338](https://github.com/ClickHouse/ClickHouse/pull/37338) ([chen](https://github.com/xiedeyantu)).
* Allow to prune the list of files via virtual columns such as `_file` and `_path` when reading from S3. This is for [#37174](https://github.com/ClickHouse/ClickHouse/issues/37174) , [#23494](https://github.com/ClickHouse/ClickHouse/issues/23494). [#37356](https://github.com/ClickHouse/ClickHouse/pull/37356) ([Amos Bird](https://github.com/amosbird)).
* Try to improve short circuit functions processing to fix problems with stress tests. [#37384](https://github.com/ClickHouse/ClickHouse/pull/37384) ([Kruglov Pavel](https://github.com/Avogar)).
* Closes [#37395](https://github.com/ClickHouse/ClickHouse/issues/37395). [#37415](https://github.com/ClickHouse/ClickHouse/pull/37415) ([Memo](https://github.com/Joeywzr)).
@@ -117,7 +117,7 @@ sidebar_label: 2022
* Remove recursive submodules, because we don't need them and they can be confusing. Add style check to prevent recursive submodules. This closes [#32821](https://github.com/ClickHouse/ClickHouse/issues/32821). [#37616](https://github.com/ClickHouse/ClickHouse/pull/37616) ([Alexey Milovidov](https://github.com/alexey-milovidov)).
* Add docs spellcheck to CI. [#37790](https://github.com/ClickHouse/ClickHouse/pull/37790) ([Vladimir C](https://github.com/vdimir)).
* Fix overly aggressive stripping which removed the embedded hash required for checking the consistency of the executable. [#37993](https://github.com/ClickHouse/ClickHouse/pull/37993) ([Robert Schulze](https://github.com/rschu1ze)).
* Fix macOS build failure in the compressor. [#38007](https://github.com/ClickHouse/ClickHouse/pull/38007) ([xiedeyantu](https://github.com/xiedeyantu)).
* Fix macOS build failure in the compressor. [#38007](https://github.com/ClickHouse/ClickHouse/pull/38007) ([chen](https://github.com/xiedeyantu)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
@@ -166,7 +166,7 @@ sidebar_label: 2022
* Fix possible incorrect result of `SELECT ... WITH FILL` in the case when `ORDER BY` should be applied after `WITH FILL` result (e.g. for outer query). Incorrect result was caused by optimization for `ORDER BY` expressions ([#35623](https://github.com/ClickHouse/ClickHouse/issues/35623)). Closes [#37904](https://github.com/ClickHouse/ClickHouse/issues/37904). [#37959](https://github.com/ClickHouse/ClickHouse/pull/37959) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
* Add missing default columns when pushing to the target table in WindowView, fix [#37815](https://github.com/ClickHouse/ClickHouse/issues/37815). [#37965](https://github.com/ClickHouse/ClickHouse/pull/37965) ([vxider](https://github.com/Vxider)).
* Fixed a stack overflow issue that would cause compilation to fail. [#37996](https://github.com/ClickHouse/ClickHouse/pull/37996) ([Han Shukai](https://github.com/KinderRiven)).
* When `enable_filesystem_query_cache_limit` is enabled, throw a "Reserved cache size exceeds the remaining cache size" error. [#38004](https://github.com/ClickHouse/ClickHouse/pull/38004) ([xiedeyantu](https://github.com/xiedeyantu)).
* When `enable_filesystem_query_cache_limit` is enabled, throw a "Reserved cache size exceeds the remaining cache size" error. [#38004](https://github.com/ClickHouse/ClickHouse/pull/38004) ([chen](https://github.com/xiedeyantu)).
* A query containing `ORDER BY ... WITH FILL` could generate extra rows when multiple WITH FILL columns are present. [#38074](https://github.com/ClickHouse/ClickHouse/pull/38074) ([Yakov Olkhovskiy](https://github.com/yakov-olkhovskiy)).
#### Bug Fix (user-visible misbehaviour in official stable or prestable release)

View File

@@ -5,7 +5,7 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.6.2.12-stable FIXME as compared to v22.6.1.1985-stable
### ClickHouse release v22.6.2.12-stable (1fc97f10cbf) FIXME as compared to v22.6.1.1985-stable (7000c4e0033)
#### Improvement
* Backported in [#38484](https://github.com/ClickHouse/ClickHouse/issues/38484): Improve the stability of the hive storage integration test. Move the data preparation step into test.py. [#38260](https://github.com/ClickHouse/ClickHouse/pull/38260) ([lgbo](https://github.com/lgbo-ustc)).

View File

@@ -5,7 +5,7 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.6.3.35-stable FIXME as compared to v22.6.2.12-stable
### ClickHouse release v22.6.3.35-stable (d5566f2f2dd) FIXME as compared to v22.6.2.12-stable (1fc97f10cbf)
#### Bug Fix
* Backported in [#38812](https://github.com/ClickHouse/ClickHouse/issues/38812): Fix crash when executing GRANT ALL ON *.* with ON CLUSTER. It was broken in https://github.com/ClickHouse/ClickHouse/pull/35767. This closes [#38618](https://github.com/ClickHouse/ClickHouse/issues/38618). [#38674](https://github.com/ClickHouse/ClickHouse/pull/38674) ([Vitaly Baranov](https://github.com/vitlibar)).
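A sketch of the statement that used to crash; the cluster and user names are hypothetical:

``` sql
GRANT ON CLUSTER test_cluster ALL ON *.* TO admin_user;
```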

View File

@@ -5,7 +5,7 @@ sidebar_label: 2022
# 2022 Changelog
### ClickHouse release v22.6.4.35-stable FIXME as compared to v22.6.3.35-stable
### ClickHouse release v22.6.4.35-stable (b9202cae6f4) FIXME as compared to v22.6.3.35-stable (d5566f2f2dd)
#### Build/Testing/Packaging Improvement
* Backported in [#38822](https://github.com/ClickHouse/ClickHouse/issues/38822): Change `all|noarch` packages to architecture-dependent; fix some documentation for it; push aarch64|arm64 packages to artifactory and release assets. Fixes [#36443](https://github.com/ClickHouse/ClickHouse/issues/36443). [#38580](https://github.com/ClickHouse/ClickHouse/pull/38580) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).

View File

@@ -0,0 +1,18 @@
---
sidebar_position: 1
sidebar_label: 2022
---
# 2022 Changelog
### ClickHouse release v22.7.3.5-stable (e140b8b5f3a) FIXME as compared to v22.7.2.15-stable (f843089624e)
#### Build/Testing/Packaging Improvement
* Backported in [#39884](https://github.com/ClickHouse/ClickHouse/issues/39884): Former packages used to install systemd.service file to `/etc`. The files there are marked as `conf` and are not cleaned out, and not updated automatically. This PR cleans them out. [#39323](https://github.com/ClickHouse/ClickHouse/pull/39323) ([Mikhail f. Shiryaev](https://github.com/Felixoid)).
#### Bug Fix (user-visible misbehavior in official stable or prestable release)
* Backported in [#40045](https://github.com/ClickHouse/ClickHouse/issues/40045): Fix big memory usage during fetches. Fixes [#39915](https://github.com/ClickHouse/ClickHouse/issues/39915). [#39990](https://github.com/ClickHouse/ClickHouse/pull/39990) ([Nikolai Kochetov](https://github.com/KochetovNicolai)).

View File

@@ -5,7 +5,7 @@ sidebar_label: Date32
# Date32
A date. Supports the same date range as [Datetime64](../../sql-reference/data-types/datetime64.md). Stored in four bytes as the number of days since 1925-01-01. Allows storing values till 2283-11-11.
A date. Supports the same date range as [Datetime64](../../sql-reference/data-types/datetime64.md). Stored in four bytes as the number of days since 1900-01-01. Allows storing values till 2299-12-31.
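A quick sketch of the updated bounds (assuming the new 1900-01-01 to 2299-12-31 range):

``` sql
SELECT toDate32('1900-01-01') AS lower_bound, toDate32('2299-12-31') AS upper_bound;
```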
**Examples**
@ -36,5 +36,5 @@ SELECT * FROM new;
- [toDate32](../../sql-reference/functions/type-conversion-functions.md#todate32)
- [toDate32OrZero](../../sql-reference/functions/type-conversion-functions.md#todate32-or-zero)
- [toDate32OrNull](../../sql-reference/functions/type-conversion-functions.md#todate32-or-null)
- [toDate32OrNull](../../sql-reference/functions/type-conversion-functions.md#todate32-or-null)

View File

@ -18,7 +18,7 @@ DateTime64(precision, [timezone])
Internally, stores data as a number of ticks since epoch start (1970-01-01 00:00:00 UTC) as Int64. The tick resolution is determined by the precision parameter. Additionally, the `DateTime64` type can store a time zone that is the same for the entire column; it affects how `DateTime64` values are displayed in text format and how values specified as strings are parsed (2020-01-01 05:00:01.000). The time zone is not stored in the rows of the table (or in the result set) but in the column metadata. See details in [DateTime](../../sql-reference/data-types/datetime.md).
Supported range of values: \[1925-01-01 00:00:00, 2283-11-11 23:59:59.99999999\] (Note: The precision of the maximum value is 8).
Supported range of values: \[1900-01-01 00:00:00, 2299-12-31 23:59:59.99999999\] (Note: The precision of the maximum value is 8).
## Examples

View File

@ -266,8 +266,8 @@ Result:
└────────────────┘
```
:::note
The return type of the `toStartOf*` functions described below is `Date` or `DateTime`. Though these functions can take `DateTime64` as an argument, passing them a `DateTime64` that is out of the normal range (years 1925 - 2283) will give an incorrect result.
:::note
The return type of the `toStartOf*` functions described below is `Date` or `DateTime`. Though these functions can take `DateTime64` as an argument, passing them a `DateTime64` that is out of the normal range (years 1900 - 2299) will give an incorrect result.
:::
## toStartOfYear
@ -291,7 +291,7 @@ Returns the date.
Rounds down a date or date with time to the first day of the month.
Returns the date.
:::note
:::note
The behavior of parsing incorrect dates is implementation specific. ClickHouse may return a zero date, throw an exception, or perform a “natural” overflow.
:::

View File

@ -218,23 +218,23 @@ SELECT toDate32('1955-01-01') AS value, toTypeName(value);
2. The value is outside the range:
``` sql
SELECT toDate32('1924-01-01') AS value, toTypeName(value);
SELECT toDate32('1899-01-01') AS value, toTypeName(value);
```
``` text
┌──────value─┬─toTypeName(toDate32('1925-01-01'))─┐
│ 1925-01-01 │ Date32 │
┌──────value─┬─toTypeName(toDate32('1899-01-01'))─┐
│ 1900-01-01 │ Date32 │
└────────────┴────────────────────────────────────┘
```
3. With a `Date`-type argument:
``` sql
SELECT toDate32(toDate('1924-01-01')) AS value, toTypeName(value);
SELECT toDate32(toDate('1899-01-01')) AS value, toTypeName(value);
```
``` text
┌──────value─┬─toTypeName(toDate32(toDate('1924-01-01')))─┐
┌──────value─┬─toTypeName(toDate32(toDate('1899-01-01')))─┐
│ 1970-01-01 │ Date32 │
└────────────┴────────────────────────────────────────────┘
```
@ -248,14 +248,14 @@ The same as [toDate32](#todate32) but returns the min value of [Date32](../../sq
Query:
``` sql
SELECT toDate32OrZero('1924-01-01'), toDate32OrZero('');
SELECT toDate32OrZero('1899-01-01'), toDate32OrZero('');
```
Result:
``` text
┌─toDate32OrZero('1924-01-01')─┬─toDate32OrZero('')─┐
│ 1925-01-01 │ 1925-01-01 │
┌─toDate32OrZero('1899-01-01')─┬─toDate32OrZero('')─┐
│ 1900-01-01 │ 1900-01-01 │
└──────────────────────────────┴────────────────────┘
```

View File

@ -9,13 +9,13 @@ Table functions are methods for constructing tables.
You can use table functions in:
- [FROM](../../sql-reference/statements/select/from.md) clause of the `SELECT` query.
- [FROM](../../sql-reference/statements/select/from.md) clause of the `SELECT` query.
The method for creating a temporary table that is available only in the current query. The table is deleted when the query finishes.
The method for creating a temporary table that is available only in the current query. The table is deleted when the query finishes.
- [CREATE TABLE AS table_function()](../../sql-reference/statements/create/table.md) query.
It's one of the methods of creating a table.
It's one of the methods of creating a table.
- [INSERT INTO TABLE FUNCTION](../../sql-reference/statements/insert-into.md#inserting-into-table-function) query.
@ -38,4 +38,3 @@ You can't use table functions if the [allow_ddl](../../operations/settings/per
| [s3](../../sql-reference/table-functions/s3.md) | Creates a [S3](../../engines/table-engines/integrations/s3.md)-engine table. |
| [sqlite](../../sql-reference/table-functions/sqlite.md) | Creates a [sqlite](../../engines/table-engines/integrations/sqlite.md)-engine table. |
[Original article](https://clickhouse.com/docs/en/sql-reference/table-functions/) <!--hide-->

View File

@ -8,6 +8,7 @@ sidebar_label: Building on Mac OS X
:::info "Вам не нужно собирать ClickHouse самостоятельно"
Вы можете установить предварительно собранный ClickHouse, как описано в [Быстром старте](https://clickhouse.com/#quick-start).
Следуйте инструкциям по установке для `macOS (Intel)` или `macOS (Apple Silicon)`.
:::
The build should run on x86_64 (Intel), on macOS 10.15 (Catalina) and higher, with the latest version of Xcode's native AppleClang, Homebrew's vanilla Clang, or GCC compilers.
@ -90,6 +91,7 @@ $ /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/
:::info "Note"
You will need the `sudo` command.
:::
1. Create the file `/Library/LaunchDaemons/limit.maxfiles.plist` and put the following into it:

View File

@ -49,6 +49,7 @@ PostgreSQL arrays are converted into ClickHouse arrays
:::info "Внимание"
Будьте внимательны, в PostgreSQL массивы, созданные как `type_name[]`, являются многомерными и могут содержать в себе разное количество измерений в разных строках одной таблицы. Внутри ClickHouse допустимы только многомерные массивы с одинаковым кол-вом измерений во всех строках таблицы.
:::
Multiple replicas are supported; they must be listed separated by `|`. For example:

View File

@ -40,6 +40,7 @@ ORDER BY (CounterID, StartDate, intHash32(UserID));
:::info "Info"
Overly granular partitioning is not recommended, i.e. partitioning by a column with a very wide spread of values (on the order of more than a thousand partitions). It leads to a large number of files and file descriptors accumulating in the system, which can significantly degrade the performance of `SELECT` queries.
:::
To get the set of parts and partitions of a table, you can use the [system.parts](../../../engines/table-engines/mergetree-family/custom-partitioning-key.md#system_tables-parts) system table. As an example, consider the `visits` table partitioned by month. Let's run a `SELECT` against the `system.parts` table:
@ -80,6 +81,7 @@ WHERE table = 'visits'
:::info "Info"
Part names for old-style tables are formed as follows: `20190117_20190123_2_2_0` (minimum date _ maximum date _ minimum block number _ maximum block number _ level).
:::
As you can see in the example above, the table contains several separate parts for the same partition (for example, parts `201901_1_3_1` and `201901_1_9_2` belong to partition `201901`). This means these parts have not yet been merged: in the file system they are stored separately. After the automatic data merge is performed (roughly 10 minutes after data is inserted), the source parts are merged into one larger part and marked as inactive.

View File

@ -14,3 +14,4 @@ sidebar_position: 10
:::info "Забавный факт"
Спустя годы после того, как ClickHouse получил свое название, принцип комбинирования двух слов, каждое из которых имеет подходящий смысл, был признан лучшим способом назвать базу данных в [исследовании Andy Pavlo](https://www.cs.cmu.edu/~pavlo/blog/2020/03/on-naming-a-database-management-system.html), Associate Professor of Databases в Carnegie Mellon University. ClickHouse разделил награду "за лучшее название СУБД" с Postgres.
:::

View File

@ -20,5 +20,6 @@ sidebar_label: General questions
:::info "If you did not find what you were looking for:"
Check out the other F.A.Q. categories or search the rest of the documentation using the table of contents on the left.
:::
[Original article](https://clickhouse.com/docs/ru/faq/general/) <!--hide-->

View File

@ -60,3 +60,4 @@ sidebar_position: 8
- Rely on metrics collected while working with real data.
- Test performance as part of CI.
- Measure and analyze everything you possibly can.
:::

View File

@ -15,5 +15,6 @@ sidebar_label: Integration
:::info "If you did not find what you were looking for"
Check out the other F.A.Q. subsections or search the rest of the documentation using the table of contents on the left.
:::
[Original article](https://clickhouse.com/docs/ru/faq/integration/)

View File

@ -14,5 +14,6 @@ sidebar_label: Operations
:::info "If you did not find what you were looking for"
Check out the other F.A.Q. subsections or search the rest of the documentation using the table of contents on the left.
:::
[Original article](https://clickhouse.com/docs/en/faq/operations/)

View File

@ -293,6 +293,7 @@ $ clickhouse-client --query "SELECT COUNT(*) FROM datasets.trips_mergetree"
:::info "Info"
If you are going to run the queries described below, you have to add the database
name to the table name: `datasets.trips_mergetree`.
:::
## Results on a Single Server {#rezultaty-na-odnom-servere}

View File

@ -157,6 +157,7 @@ $ clickhouse-client --query "SELECT COUNT(*) FROM datasets.ontime"
:::info "Info"
If you are going to run the queries described below, you have to add the database
name to the table name: `datasets.ontime`.
:::
## Queries: {#zaprosy}

View File

@ -99,6 +99,7 @@ ClickHouse provides the ability to authenticate
:::info ""
Note once again that, in addition to `users.xml`, you also need to enable Kerberos in `config.xml`.
:::
### Configuring Kerberos via SQL {#enabling-kerberos-using-sql}

View File

@ -174,6 +174,7 @@ ClickHouse checks the conditions for `min_part_size` and `min_part
:::info "Note"
The hard limit is configured using system tools.
:::
**Example**
@ -706,6 +707,7 @@ ClickHouse supports dynamically changing
:::info "Note"
These settings can be modified while queries are running and take effect immediately. Queries that are already running are executed without changes.
:::
Possible values:
@ -726,6 +728,7 @@ ClickHouse supports dynamically changing
:::info "Note"
These settings can be modified while queries are running and take effect immediately. Queries that are already running are executed without changes.
:::
Possible values:
@ -746,6 +749,7 @@ ClickHouse supports dynamically changing
:::info "Note"
These settings can be modified while queries are running and take effect immediately. Queries that are already running are executed without changes.
:::
Possible values:

View File

@ -30,6 +30,7 @@
:::info "Замечание"
Даже если `parts_to_do = 0`, для реплицированной таблицы возможна ситуация, когда мутация ещё не завершена из-за долго выполняющейся операции `INSERT`, которая добавляет данные, которые нужно будет мутировать.
:::
If problems occurred while mutating some part, the following columns are populated:

View File

@ -8,6 +8,7 @@ sidebar_position: 141
:::info "Примечание"
Чтобы эта функция работала должным образом, исходные данные должны быть отсортированы. В [материализованном представлении](../../../sql-reference/statements/create/view.md#materialized) вместо нее рекомендуется использовать [deltaSumTimestamp](../../../sql-reference/aggregate-functions/reference/deltasumtimestamp.md#agg_functions-deltasumtimestamp).
:::
**Syntax**

View File

@ -20,6 +20,7 @@ intervalLengthSum(start, end)
:::info "Примечание"
Аргументы должны быть одного типа. В противном случае ClickHouse сгенерирует исключение.
:::
**Returned value**

View File

@ -5,7 +5,7 @@ sidebar_label: Date32
# Date32 {#data_type-datetime32}
A date. Supports the same date range as [Datetime64](../../sql-reference/data-types/datetime64.md). The value is stored in four bytes and corresponds to the number of days from 1925-01-01 to 2283-11-11.
A date. Supports the same date range as [Datetime64](../../sql-reference/data-types/datetime64.md). The value is stored in four bytes and corresponds to the number of days from 1900-01-01 to 2299-12-31.
**Example**
@ -36,5 +36,5 @@ SELECT * FROM new;
- [toDate32](../../sql-reference/functions/type-conversion-functions.md#todate32)
- [toDate32OrZero](../../sql-reference/functions/type-conversion-functions.md#todate32-or-zero)
- [toDate32OrNull](../../sql-reference/functions/type-conversion-functions.md#todate32-or-null)
- [toDate32OrNull](../../sql-reference/functions/type-conversion-functions.md#todate32-or-null)

View File

@ -18,7 +18,7 @@ DateTime64(precision, [timezone])
Internally, the data is stored as the number of 'ticks' since the epoch start (1970-01-01 00:00:00 UTC) as Int64. The tick resolution is determined by the precision parameter. Additionally, the `DateTime64` type can store a time zone, the same for the entire column, which affects how `DateTime64` values are displayed in text form and how values given as strings are parsed (2020-01-01 05:00:01.000). The time zone is not stored in the table rows (or in the result set) but in the column metadata. See [DateTime](datetime.md) for details.
Range of values: \[1925-01-01 00:00:00, 2283-11-11 23:59:59.99999999\] (Note: the precision of the maximum value is 8).
Range of values: \[1900-01-01 00:00:00, 2299-12-31 23:59:59.99999999\] (Note: the precision of the maximum value is 8).
## Examples {#examples}

View File

@ -26,6 +26,7 @@ sidebar_label: Nullable
:::info "Info"
Using `Nullable` almost always reduces performance; keep this in mind when designing your databases.
:::
## Finding NULL {#finding-null}

View File

@ -464,6 +464,7 @@ SOURCE(ODBC(
:::info "Примечание"
Поля `table` и `query` не могут быть использованы вместе. Также обязательно должен быть один из источников данных: `table` или `query`.
:::
ClickHouse receives quoting information from the ODBC driver and quotes settings in queries to the driver, so the table name must be specified with the same letter case as the table name in the database.
@ -543,6 +544,7 @@ SOURCE(MYSQL(
:::info "Примечание"
Поля `table` или `where` не могут быть использованы вместе с полем `query`. Также обязательно должен быть один из источников данных: `table` или `query`.
Явный параметр `secure` отсутствует. Автоматически поддержана работа в обоих случаях: когда установка SSL-соединения необходима и когда нет.
:::
MySQL can be connected to on the local host via sockets; to do this, set `host` and `socket`.
@ -633,6 +635,7 @@ SOURCE(CLICKHOUSE(
:::info "Примечание"
Поля `table` или `where` не могут быть использованы вместе с полем `query`. Также обязательно должен быть один из источников данных: `table` или `query`.
:::
### MongoDB {#dicts-external_dicts_dict_sources-mongodb}
@ -748,6 +751,7 @@ SOURCE(REDIS(
:::info "Примечание"
Поля `column_family` или `where` не могут быть использованы вместе с полем `query`. Также обязательно должен быть один из источников данных: `column_family` или `query`.
:::
### PostgreSQL {#dicts-external_dicts_dict_sources-postgresql}
@ -804,3 +808,4 @@ SOURCE(POSTGRESQL(
:::info "Примечание"
Поля `table` или `where` не могут быть использованы вместе с полем `query`. Также обязательно должен быть один из источников данных: `table` или `query`.
:::

View File

@ -57,7 +57,7 @@ toTimezone(value, timezone)
**Arguments**
- `value` — a time or a date with time. [DateTime64](../../sql-reference/data-types/datetime64.md).
- `timezone` — the time zone for the returned value. [String](../../sql-reference/data-types/string.md). This argument is a constant, because `toTimezone` changes the time zone of the column (a time zone is an attribute of the `DateTime*` types).
- `timezone` — the time zone for the returned value. [String](../../sql-reference/data-types/string.md). This argument is a constant, because `toTimezone` changes the time zone of the column (a time zone is an attribute of the `DateTime*` types).
**Returned value**
@ -267,7 +267,7 @@ SELECT toUnixTimestamp('2017-11-05 08:07:47', 'Asia/Tokyo') AS unix_timestamp;
```
:::note "Attention"
`Date` or `DateTime` is the return type of the `toStartOf*` functions described below. Although these functions can accept `DateTime64` as an argument, passing a `DateTime64` value outside the normal range (years 1925 to 2283) gives an incorrect result.
`Date` or `DateTime` is the return type of the `toStartOf*` functions described below. Although these functions can accept `DateTime64` as an argument, passing a `DateTime64` value outside the normal range (years 1900 to 2299) gives an incorrect result.
:::
## toStartOfYear {#tostartofyear}

View File

@ -86,6 +86,7 @@ geohashesInBox(longitude_min, latitude_min, longitude_max, latitude_max, precisi
:::info "Замечание"
Все передаваемые координаты должны быть одного и того же типа: либо `Float32`, либо `Float64`.
:::
**Returned values**
@ -96,6 +97,7 @@ geohashesInBox(longitude_min, latitude_min, longitude_max, latitude_max, precisi
:::info "Замечание"
Если возвращаемый массив содержит свыше 10 000 000 элементов, функция сгенерирует исключение.
:::
**Example**

View File

@ -209,7 +209,7 @@ SELECT toDate32('1955-01-01') AS value, toTypeName(value);
```
``` text
┌──────value─┬─toTypeName(toDate32('1925-01-01'))─┐
┌──────value─┬─toTypeName(toDate32('1955-01-01'))─┐
│ 1955-01-01 │ Date32 │
└────────────┴────────────────────────────────────┘
```
@ -217,23 +217,23 @@ SELECT toDate32('1955-01-01') AS value, toTypeName(value);
2. The value is out of range:
``` sql
SELECT toDate32('1924-01-01') AS value, toTypeName(value);
SELECT toDate32('1899-01-01') AS value, toTypeName(value);
```
``` text
┌──────value─┬─toTypeName(toDate32('1925-01-01'))─┐
│ 1925-01-01 │ Date32 │
┌──────value─┬─toTypeName(toDate32('1899-01-01'))─┐
│ 1900-01-01 │ Date32 │
└────────────┴────────────────────────────────────┘
```
3. With a `Date`-type argument:
``` sql
SELECT toDate32(toDate('1924-01-01')) AS value, toTypeName(value);
SELECT toDate32(toDate('1899-01-01')) AS value, toTypeName(value);
```
``` text
┌──────value─┬─toTypeName(toDate32(toDate('1924-01-01')))─┐
┌──────value─┬─toTypeName(toDate32(toDate('1899-01-01')))─┐
│ 1970-01-01 │ Date32 │
└────────────┴────────────────────────────────────────────┘
```
@ -247,14 +247,14 @@ SELECT toDate32(toDate('1924-01-01')) AS value, toTypeName(value);
Query:
``` sql
SELECT toDate32OrZero('1924-01-01'), toDate32OrZero('');
SELECT toDate32OrZero('1899-01-01'), toDate32OrZero('');
```
Result:
``` text
┌─toDate32OrZero('1924-01-01')─┬─toDate32OrZero('')─┐
│ 1925-01-01 │ 1925-01-01 │
┌─toDate32OrZero('1899-01-01')─┬─toDate32OrZero('')─┐
│ 1900-01-01 │ 1900-01-01 │
└──────────────────────────────┴────────────────────┘
```
@ -1209,6 +1209,7 @@ SELECT toLowCardinality('1');
:::info "Примечание"
Возвращаемое значение — это временная метка в UTC, а не в часовом поясе `DateTime64`.
:::
**Syntax**

View File

@ -25,7 +25,7 @@ ALTER TABLE [db].name [ON CLUSTER cluster] ADD|DROP|CLEAR|COMMENT|MODIFY COLUMN
- [CONSTRAINT](../../../sql-reference/statements/alter/constraint.md)
- [TTL](../../../sql-reference/statements/alter/ttl.md)
:::note
:::note
The `ALTER TABLE` query is supported only for `*MergeTree` tables, as well as `Merge` and `Distributed`. The query has several variants.
:::
The following `ALTER` queries manage views:
@ -77,5 +77,6 @@ ALTER TABLE [db.]table MATERIALIZE INDEX name IN PARTITION partition_name
:::info "Примечание"
Для всех запросов `ALTER` при `replication_alter_partitions_sync = 2` и неактивности некоторых реплик больше времени, заданного настройкой `replication_wait_for_inactive_replica_timeout`, генерируется исключение `UNFINISHED`.
:::
For `ALTER TABLE ... UPDATE|DELETE` queries, execution synchronicity is determined by the [mutations_sync](../../../operations/settings/settings.md#mutations_sync) setting.

View File

@ -56,6 +56,7 @@ CREATE USER [IF NOT EXISTS | OR REPLACE] name1 [ON CLUSTER cluster_name1]
:::info "Внимание"
ClickHouse трактует конструкцию `user_name@'address'` как имя пользователя целиком. То есть технически вы можете создать несколько пользователей с одинаковыми `user_name`, но разными частями конструкции после `@`, но лучше так не делать.
:::
## GRANTEES Clause {#grantees}

View File

@ -88,6 +88,7 @@ LIVE views work on the same principle
- `LIVE VIEW` is not updated if the source query uses multiple tables.
In cases when `LIVE VIEW` is not updated automatically, use [WITH REFRESH](#live-view-with-refresh) to force updates at a given interval.
:::
### Monitoring Changes in LIVE Views {#live-view-monitoring}

View File

@ -30,6 +30,7 @@ ClickHouse does not notify the client. To enable
:::info "Note"
If `replication_alter_partitions_sync` is set to `2` and some replicas have been inactive for longer than the time set by the `replication_wait_for_inactive_replica_timeout` setting, an `UNFINISHED` exception is thrown.
:::
## BY Expression {#by-expression}

View File

@ -368,6 +368,7 @@ SHOW ACCESS
:::info "Note"
The `SHOW CLUSTER name` query returns the contents of the system.clusters table for that cluster.
:::
### Syntax {#show-cluster-syntax}

View File

@ -19,3 +19,4 @@ TRUNCATE TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]
:::info "Примечание"
Если значение настройки `replication_alter_partitions_sync` равно `2` и некоторые реплики не активны больше времени, заданного настройкой `replication_wait_for_inactive_replica_timeout`, то генерируется исключение `UNFINISHED`.
:::

View File

@ -104,3 +104,4 @@ WATCH lv EVENTS LIMIT 1;
:::info "Примечание"
При отслеживании [LIVE VIEW](./create/view.md#live-view) через интерфейс HTTP следует использовать формат [JSONEachRowWithProgress](../../interfaces/formats.md#jsoneachrowwithprogress). Постоянные сообщения об изменениях будут добавлены в поток вывода для поддержания активности долговременного HTTP-соединения до тех пор, пока результат запроса изменяется. Проомежуток времени между сообщениями об изменениях управляется настройкой[live_view_heartbeat_interval](./create/view.md#live-view-settings).
:::

View File

@ -28,6 +28,7 @@ postgresql('host:port', 'database', 'table', 'user', 'password'[, `schema`])
:::info "Примечание"
В запросах `INSERT` для того чтобы отличить табличную функцию `postgresql(...)` от таблицы со списком имен столбцов вы должны указывать ключевые слова `FUNCTION` или `TABLE FUNCTION`. См. примеры ниже.
:::
## Implementation Details {#implementation-details}
@ -43,6 +44,7 @@ PostgreSQL arrays are converted into ClickHouse arrays
:::info "Note"
Be careful: in PostgreSQL, arrays created as `type_name[]` are multidimensional and can contain a different number of dimensions in different rows of the same table. In ClickHouse, only multidimensional arrays with the same number of dimensions in all rows are allowed.
:::
Multiple replicas are supported; they must be listed separated by `|`. For example:

View File

@ -19,7 +19,7 @@ DateTime64(precision, [timezone])
Internally, this type stores data as the number of ticks since the Linux epoch (1970-01-01 00:00:00 UTC) as Int64. The tick resolution is determined by the precision parameter. Additionally, the `DateTime64` type can store time zone information in the same way as other data columns; the time zone affects how `DateTime64` values are displayed in text format and how values specified as strings are parsed (2020-01-01 05:00:01.000). The time zone is not stored in the table rows (nor in the result set) but in the column metadata. See the [DateTime](datetime.md) data type for details.
Range of values: \[1925-01-01 00:00:00, 2283-11-11 23:59:59.99999999\] (Note: the precision of the maximum value is 8).
Range of values: \[1900-01-01 00:00:00, 2299-12-31 23:59:59.99999999\] (Note: the precision of the maximum value is 8).
## Examples {#examples}

View File

@ -263,8 +263,8 @@ SELECT toUnixTimestamp('2017-11-05 08:07:47', 'Asia/Tokyo') AS unix_timestamp
└────────────────┘
```
:::note
The return type of the `toStartOf*` functions described below is `Date` or `DateTime`. Although these functions can take `DateTime64` as an argument, passing them a `DateTime64` outside the normal range (years 1925 to 2283) gives an incorrect result.
:::note
The return type of the `toStartOf*` functions described below is `Date` or `DateTime`. Although these functions can take `DateTime64` as an argument, passing them a `DateTime64` outside the normal range (years 1900 to 2299) gives an incorrect result.
:::
## toStartOfYear {#tostartofyear}
@ -1221,4 +1221,4 @@ SELECT fromModifiedJulianDayOrNull(58849);
└────────────────────────────────────┘
```
[Original article](https://clickhouse.com/docs/en/query_language/functions/date_time_functions/) <!--hide-->
[Original article](https://clickhouse.com/docs/en/query_language/functions/date_time_functions/) <!--hide-->

View File

@ -60,6 +60,14 @@
</logger>
</levels>
-->
<!-- Structured log formatting:
You can specify the log format (for now, JSON only). In that case, the console log will be printed
in the specified format, like JSON.
For example, as below:
{"date_time":"1650918987.180175","thread_name":"#1","thread_id":"254545","level":"Trace","query_id":"","logger_name":"BaseDaemon","message":"Received signal 2","source_file":"../base/daemon/BaseDaemon.cpp; virtual void SignalListener::run()","source_line":"192"}
To enable JSON logging support, just uncomment the <formatting> tag below.
-->
<!-- <formatting>json</formatting> -->
</logger>
<!-- Add headers to response in options request. OPTIONS method is used in CORS preflight requests. -->

View File

@ -10,20 +10,23 @@
#include <type_traits>
#define DATE_LUT_MIN_YEAR 1925 /// 1925 since the vast majority of timezones changed to 15-minute aligned offsets somewhere in 1924 or earlier.
#define DATE_LUT_MAX_YEAR 2283 /// Last supported year (complete)
#define DATE_LUT_MIN_YEAR 1900 /// 1900 since the majority of financial organizations consider 1900 the initial year.
#define DATE_LUT_MAX_YEAR 2299 /// Last supported year (complete)
#define DATE_LUT_YEARS (1 + DATE_LUT_MAX_YEAR - DATE_LUT_MIN_YEAR) /// Number of years in lookup table
#define DATE_LUT_SIZE 0x20000
#define DATE_LUT_SIZE 0x23AB1
#define DATE_LUT_MAX (0xFFFFFFFFU - 86400)
#define DATE_LUT_MAX_DAY_NUM 0xFFFF
#define DAYNUM_OFFSET_EPOCH 25567
/// Max int value of Date32, DATE LUT cache size minus daynum_offset_epoch
#define DATE_LUT_MAX_EXTEND_DAY_NUM (DATE_LUT_SIZE - 16436)
#define DATE_LUT_MAX_EXTEND_DAY_NUM (DATE_LUT_SIZE - DAYNUM_OFFSET_EPOCH)
/// A constant to add to time_t so every supported time point becomes non-negative and still has the same remainder of division by 3600.
/// If we treat "remainder of division" operation in the sense of modular arithmetic (not like in C++).
#define DATE_LUT_ADD ((1970 - DATE_LUT_MIN_YEAR) * 366 * 86400)
#define DATE_LUT_ADD ((1970 - DATE_LUT_MIN_YEAR) * 366L * 86400)
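As a cross-check of the constants above, here is a small standalone sketch (not part of the patch) verifying, under the 1900..2299 range stated in the comments, that the new values are arithmetically consistent:

``` cpp
// Standalone sanity check of the new LUT constants, assuming the 1900..2299 range.
int main()
{
    // Days from 1900-01-01 to 1970-01-01: 70 years, 17 of them leap
    // (1904, 1908, ..., 1968; 1900 is not a leap year in the Gregorian calendar).
    static_assert(70 * 365 + 17 == 25567, "matches DAYNUM_OFFSET_EPOCH");

    // 1900..2299 spans exactly one 400-year Gregorian cycle of 146097 days,
    // which is the new DATE_LUT_SIZE (0x23AB1).
    static_assert(0x23AB1 == 146097, "matches DATE_LUT_SIZE");
}
```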
#if defined(__PPC__)
@ -64,62 +67,78 @@ private:
// Same as above but select different function overloads for zero saturation.
STRONG_TYPEDEF(UInt32, LUTIndexWithSaturation)
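/// Clamp an out-of-range index into [0, DATE_LUT_SIZE) so that dates outside the
/// supported range saturate to the nearest bound instead of wrapping around,
/// as the old mask-based arithmetic below used to do.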
static inline LUTIndex normalizeLUTIndex(UInt32 index)
{
if (index >= DATE_LUT_SIZE)
return LUTIndex(DATE_LUT_SIZE - 1);
return LUTIndex{index};
}
static inline LUTIndex normalizeLUTIndex(Int64 index)
{
if (unlikely(index < 0))
return LUTIndex(0);
if (index >= DATE_LUT_SIZE)
return LUTIndex(DATE_LUT_SIZE - 1);
return LUTIndex{index};
}
template <typename T>
friend inline LUTIndex operator+(const LUTIndex & index, const T v)
{
return LUTIndex{(index.toUnderType() + UInt32(v)) & date_lut_mask};
return normalizeLUTIndex(index.toUnderType() + UInt32(v));
}
template <typename T>
friend inline LUTIndex operator+(const T v, const LUTIndex & index)
{
return LUTIndex{(v + index.toUnderType()) & date_lut_mask};
return normalizeLUTIndex(static_cast<Int64>(v + index.toUnderType()));
}
friend inline LUTIndex operator+(const LUTIndex & index, const LUTIndex & v)
{
return LUTIndex{(index.toUnderType() + v.toUnderType()) & date_lut_mask};
return normalizeLUTIndex(static_cast<UInt32>(index.toUnderType() + v.toUnderType()));
}
template <typename T>
friend inline LUTIndex operator-(const LUTIndex & index, const T v)
{
return LUTIndex{(index.toUnderType() - UInt32(v)) & date_lut_mask};
return normalizeLUTIndex(static_cast<Int64>(index.toUnderType() - UInt32(v)));
}
template <typename T>
friend inline LUTIndex operator-(const T v, const LUTIndex & index)
{
return LUTIndex{(v - index.toUnderType()) & date_lut_mask};
return normalizeLUTIndex(static_cast<Int64>(v - index.toUnderType()));
}
friend inline LUTIndex operator-(const LUTIndex & index, const LUTIndex & v)
{
return LUTIndex{(index.toUnderType() - v.toUnderType()) & date_lut_mask};
return normalizeLUTIndex(static_cast<Int64>(index.toUnderType() - v.toUnderType()));
}
template <typename T>
friend inline LUTIndex operator*(const LUTIndex & index, const T v)
{
return LUTIndex{(index.toUnderType() * UInt32(v)) & date_lut_mask};
return normalizeLUTIndex(index.toUnderType() * UInt32(v));
}
template <typename T>
friend inline LUTIndex operator*(const T v, const LUTIndex & index)
{
return LUTIndex{(v * index.toUnderType()) & date_lut_mask};
return normalizeLUTIndex(v * index.toUnderType());
}
template <typename T>
friend inline LUTIndex operator/(const LUTIndex & index, const T v)
{
return LUTIndex{(index.toUnderType() / UInt32(v)) & date_lut_mask};
return normalizeLUTIndex(index.toUnderType() / UInt32(v));
}
template <typename T>
friend inline LUTIndex operator/(const T v, const LUTIndex & index)
{
return LUTIndex{(UInt32(v) / index.toUnderType()) & date_lut_mask};
return normalizeLUTIndex(UInt32(v) / index.toUnderType());
}
public:
@ -168,14 +187,9 @@ public:
static_assert(sizeof(Values) == 16);
private:
/// Mask is all-ones to allow efficient protection against overflow.
static constexpr UInt32 date_lut_mask = 0x1ffff;
static_assert(date_lut_mask == DATE_LUT_SIZE - 1);
/// Offset to epoch in days (ExtendedDayNum) of the first day in LUT.
/// "epoch" is the Unix Epoch (starts at unix timestamp zero)
static constexpr UInt32 daynum_offset_epoch = 16436;
static constexpr UInt32 daynum_offset_epoch = 25567;
static_assert(daynum_offset_epoch == (1970 - DATE_LUT_MIN_YEAR) * 365 + (1970 - DATE_LUT_MIN_YEAR / 4 * 4) / 4);
/// Lookup table is indexed by LUTIndex.
@ -232,12 +246,12 @@ private:
static inline LUTIndex toLUTIndex(DayNum d)
{
return LUTIndex{(d + daynum_offset_epoch) & date_lut_mask};
return normalizeLUTIndex(d + daynum_offset_epoch);
}
static inline LUTIndex toLUTIndex(ExtendedDayNum d)
{
return LUTIndex{static_cast<UInt32>(d + daynum_offset_epoch) & date_lut_mask};
return normalizeLUTIndex(static_cast<UInt32>(d + daynum_offset_epoch));
}
inline LUTIndex toLUTIndex(Time t) const
@ -1062,7 +1076,7 @@ public:
auto year_lut_index = (year - DATE_LUT_MIN_YEAR) * 12 + month - 1;
UInt32 index = years_months_lut[year_lut_index].toUnderType() + day_of_month - 1;
/// When date is out of range, default value is DATE_LUT_SIZE - 1 (2283-11-11)
/// When date is out of range, default value is DATE_LUT_SIZE - 1 (2299-12-31)
return LUTIndex{std::min(index, static_cast<UInt32>(DATE_LUT_SIZE - 1))};
}

View File

@ -79,12 +79,13 @@ FailuresCount countFailures(const ::testing::TestResult & test_result)
TEST(DateLUTTest, makeDayNumTest)
{
const DateLUTImpl & lut = DateLUT::instance("UTC");
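// The expectations below encode the clamping behavior of the widened range:
// dates before 1900-01-01 clamp to 0 (or to an explicitly passed default value),
// 1900-01-01 sits 25567 days before the Unix epoch (DAYNUM_OFFSET_EPOCH),
// and dates after 2299-12-31 saturate to day 120529.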
EXPECT_EQ(0, lut.makeDayNum(1924, 12, 31));
EXPECT_EQ(-1, lut.makeDayNum(1924, 12, 31, -1));
EXPECT_EQ(0, lut.makeDayNum(1899, 12, 31));
EXPECT_EQ(-1, lut.makeDayNum(1899, 12, 31, -1));
EXPECT_EQ(-25567, lut.makeDayNum(1900, 1, 1));
EXPECT_EQ(-16436, lut.makeDayNum(1925, 1, 1));
EXPECT_EQ(0, lut.makeDayNum(1970, 1, 1));
EXPECT_EQ(114635, lut.makeDayNum(2283, 11, 11));
EXPECT_EQ(114635, lut.makeDayNum(2500, 12, 25));
EXPECT_EQ(120529, lut.makeDayNum(2300, 12, 31));
EXPECT_EQ(120529, lut.makeDayNum(2500, 12, 25));
}

View File

@ -292,7 +292,7 @@ UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest)
const char * dest_start = dest;
const UInt32 items_count = source_size / sizeof(ValueType);
unalignedStore<UInt32>(dest, items_count);
unalignedStoreLE<UInt32>(dest, items_count);
dest += sizeof(items_count);
ValueType prev_value{};
@ -300,8 +300,8 @@ UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest)
if (source < source_end)
{
prev_value = unalignedLoad<ValueType>(source);
unalignedStore<ValueType>(dest, prev_value);
prev_value = unalignedLoadLE<ValueType>(source);
unalignedStoreLE<ValueType>(dest, prev_value);
source += sizeof(prev_value);
dest += sizeof(prev_value);
@ -309,10 +309,10 @@ UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest)
if (source < source_end)
{
const ValueType curr_value = unalignedLoad<ValueType>(source);
const ValueType curr_value = unalignedLoadLE<ValueType>(source);
prev_delta = curr_value - prev_value;
unalignedStore<UnsignedDeltaType>(dest, prev_delta);
unalignedStoreLE<UnsignedDeltaType>(dest, prev_delta);
source += sizeof(curr_value);
dest += sizeof(prev_delta);
@ -324,7 +324,7 @@ UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest)
int item = 2;
for (; source < source_end; source += sizeof(ValueType), ++item)
{
const ValueType curr_value = unalignedLoad<ValueType>(source);
const ValueType curr_value = unalignedLoadLE<ValueType>(source);
const UnsignedDeltaType delta = curr_value - prev_value;
const UnsignedDeltaType double_delta = delta - prev_delta;
@ -368,7 +368,7 @@ void decompressDataForType(const char * source, UInt32 source_size, char * dest,
if (source + sizeof(UInt32) > source_end)
return;
const UInt32 items_count = unalignedLoad<UInt32>(source);
const UInt32 items_count = unalignedLoadLE<UInt32>(source);
source += sizeof(items_count);
ValueType prev_value{};
@ -378,10 +378,10 @@ void decompressDataForType(const char * source, UInt32 source_size, char * dest,
if (source + sizeof(ValueType) > source_end || items_count < 1)
return;
prev_value = unalignedLoad<ValueType>(source);
prev_value = unalignedLoadLE<ValueType>(source);
if (dest + sizeof(prev_value) > output_end)
throw Exception(ErrorCodes::CANNOT_DECOMPRESS, "Cannot decompress the data");
unalignedStore<ValueType>(dest, prev_value);
unalignedStoreLE<ValueType>(dest, prev_value);
source += sizeof(prev_value);
dest += sizeof(prev_value);
@ -390,11 +390,11 @@ void decompressDataForType(const char * source, UInt32 source_size, char * dest,
if (source + sizeof(UnsignedDeltaType) > source_end || items_count < 2)
return;
prev_delta = unalignedLoad<UnsignedDeltaType>(source);
prev_delta = unalignedLoadLE<UnsignedDeltaType>(source);
prev_value = prev_value + static_cast<ValueType>(prev_delta);
if (dest + sizeof(prev_value) > output_end)
throw Exception(ErrorCodes::CANNOT_DECOMPRESS, "Cannot decompress the data");
unalignedStore<ValueType>(dest, prev_value);
unalignedStoreLE<ValueType>(dest, prev_value);
source += sizeof(prev_delta);
dest += sizeof(prev_value);
@ -427,7 +427,7 @@ void decompressDataForType(const char * source, UInt32 source_size, char * dest,
const ValueType curr_value = prev_value + delta;
if (dest + sizeof(curr_value) > output_end)
throw Exception(ErrorCodes::CANNOT_DECOMPRESS, "Cannot decompress the data");
unalignedStore<ValueType>(dest, curr_value);
unalignedStoreLE<ValueType>(dest, curr_value);
dest += sizeof(curr_value);
prev_delta = curr_value - prev_value;
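The point of switching to the `LE` helpers in this codec (and in the ones below) is byte-order independence of the compressed payload. A minimal sketch of their assumed semantics, not ClickHouse's actual implementation: byte-swap to and from little-endian around an unaligned `memcpy`, so blocks compressed on a big-endian host decompress correctly on a little-endian one and vice versa.

``` cpp
#include <bit>
#include <cstring>

// Hypothetical little-endian variants of unalignedStore/unalignedLoad.
template <typename T>
void unalignedStoreLE(char * dest, T value)
{
    if constexpr (std::endian::native == std::endian::big)
        value = std::byteswap(value); // C++23; older code would swap bytes by hand
    std::memcpy(dest, &value, sizeof(value));
}

template <typename T>
T unalignedLoadLE(const char * src)
{
    T value;
    std::memcpy(&value, src, sizeof(value));
    if constexpr (std::endian::native == std::endian::big)
        value = std::byteswap(value);
    return value;
}
```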

View File

@ -208,7 +208,7 @@ UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest,
const UInt32 items_count = source_size / sizeof(T);
unalignedStore<UInt32>(dest, items_count);
unalignedStoreLE<UInt32>(dest, items_count);
dest += sizeof(items_count);
T prev_value{};
@ -217,8 +217,8 @@ UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest,
if (source < source_end)
{
prev_value = unalignedLoad<T>(source);
unalignedStore<T>(dest, prev_value);
prev_value = unalignedLoadLE<T>(source);
unalignedStoreLE<T>(dest, prev_value);
source += sizeof(prev_value);
dest += sizeof(prev_value);
@ -228,7 +228,7 @@ UInt32 compressDataForType(const char * source, UInt32 source_size, char * dest,
while (source < source_end)
{
const T curr_value = unalignedLoad<T>(source);
const T curr_value = unalignedLoadLE<T>(source);
source += sizeof(curr_value);
const auto xored_data = curr_value ^ prev_value;
@ -274,7 +274,7 @@ void decompressDataForType(const char * source, UInt32 source_size, char * dest)
if (source + sizeof(UInt32) > source_end)
return;
const UInt32 items_count = unalignedLoad<UInt32>(source);
const UInt32 items_count = unalignedLoadLE<UInt32>(source);
source += sizeof(items_count);
T prev_value{};
@ -283,8 +283,8 @@ void decompressDataForType(const char * source, UInt32 source_size, char * dest)
if (source + sizeof(T) > source_end || items_count < 1)
return;
prev_value = unalignedLoad<T>(source);
unalignedStore<T>(dest, prev_value);
prev_value = unalignedLoadLE<T>(source);
unalignedStoreLE<T>(dest, prev_value);
source += sizeof(prev_value);
dest += sizeof(prev_value);
@ -326,7 +326,7 @@ void decompressDataForType(const char * source, UInt32 source_size, char * dest)
}
// else: 0b0 prefix - use prev_value
unalignedStore<T>(dest, curr_value);
unalignedStoreLE<T>(dest, curr_value);
dest += sizeof(curr_value);
prev_xored_info = curr_xored_info;

View File

@ -86,8 +86,8 @@ UInt32 ICompressionCodec::compress(const char * source, UInt32 source_size, char
UInt8 header_size = getHeaderSize();
/// Write data from header_size
UInt32 compressed_bytes_written = doCompressData(source, source_size, &dest[header_size]);
unalignedStore<UInt32>(&dest[1], compressed_bytes_written + header_size);
unalignedStore<UInt32>(&dest[5], source_size);
unalignedStoreLE<UInt32>(&dest[1], compressed_bytes_written + header_size);
unalignedStoreLE<UInt32>(&dest[5], source_size);
return header_size + compressed_bytes_written;
}
@ -112,7 +112,7 @@ UInt32 ICompressionCodec::decompress(const char * source, UInt32 source_size, ch
UInt32 ICompressionCodec::readCompressedBlockSize(const char * source)
{
UInt32 compressed_block_size = unalignedLoad<UInt32>(&source[1]);
UInt32 compressed_block_size = unalignedLoadLE<UInt32>(&source[1]);
if (compressed_block_size == 0)
throw Exception(ErrorCodes::CORRUPTED_DATA, "Can't decompress data: header is corrupt with compressed block size 0");
return compressed_block_size;
@ -121,7 +121,7 @@ UInt32 ICompressionCodec::readCompressedBlockSize(const char * source)
UInt32 ICompressionCodec::readDecompressedBlockSize(const char * source)
{
UInt32 decompressed_block_size = unalignedLoad<UInt32>(&source[5]);
UInt32 decompressed_block_size = unalignedLoadLE<UInt32>(&source[5]);
if (decompressed_block_size == 0)
throw Exception(ErrorCodes::CORRUPTED_DATA, "Can't decompress data: header is corrupt with decompressed block size 0");
return decompressed_block_size;
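Read together with `compress` above, these loads imply the compressed-block header layout sketched below. The method byte at offset 0 is written elsewhere in the codec, so treat this as an inference from the code shown, not a specification:

``` cpp
// Inferred header layout (sizes stored little-endian after this change):
//   byte  0     codec method byte
//   bytes 1..4  compressed size, including the header itself (UInt32, LE)
//   bytes 5..8  decompressed size (UInt32, LE)
```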

View File

@ -175,7 +175,7 @@ private:
throw std::runtime_error("No more data to read");
}
current_value = unalignedLoad<T>(data);
current_value = unalignedLoadLE<T>(data);
data = reinterpret_cast<const char *>(data) + sizeof(T);
}
};
@ -371,7 +371,7 @@ CodecTestSequence makeSeq(Args && ... args)
char * write_pos = data.data();
for (const auto & v : vals)
{
unalignedStore<T>(write_pos, v);
unalignedStoreLE<T>(write_pos, v);
write_pos += sizeof(v);
}
@ -393,7 +393,7 @@ CodecTestSequence generateSeq(Generator gen, const char* gen_name, B Begin = 0,
{
const T v = gen(static_cast<T>(i));
unalignedStore<T>(write_pos, v);
unalignedStoreLE<T>(write_pos, v);
write_pos += sizeof(v);
}

View File

@ -705,6 +705,10 @@ static constexpr UInt64 operator""_GiB(unsigned long long value)
M(Bool, input_format_arrow_skip_columns_with_unsupported_types_in_schema_inference, false, "Skip columns with unsupported types while schema inference for format Arrow", 0) \
M(String, column_names_for_schema_inference, "", "The list of column names to use in schema inference for formats without column names. The format: 'column1,column2,column3,...'", 0) \
M(Bool, input_format_json_read_bools_as_numbers, true, "Allow to parse bools as numbers in JSON input formats", 0) \
M(Bool, input_format_json_try_infer_numbers_from_strings, true, "Try to infer numbers from string fields while schema inference", 0) \
M(Bool, input_format_try_infer_integers, true, "Try to infer numbers from string fields while schema inference in text formats", 0) \
M(Bool, input_format_try_infer_dates, true, "Try to infer dates from string fields while schema inference in text formats", 0) \
M(Bool, input_format_try_infer_datetimes, true, "Try to infer datetimes from string fields while schema inference in text formats", 0) \
M(Bool, input_format_protobuf_flatten_google_wrappers, false, "Enable Google wrappers for regular non-nested columns, e.g. google.protobuf.StringValue 'str' for String column 'str'. For Nullable columns empty wrappers are recognized as defaults, and missing as nulls", 0) \
M(Bool, output_format_protobuf_nullables_with_google_wrappers, false, "When serializing Nullable columns with Google wrappers, serialize default values as empty wrappers. If turned off, default and null values are not serialized", 0) \
M(UInt64, input_format_csv_skip_first_lines, 0, "Skip specified number of lines at the beginning of data in CSV format", 0) \

View File

@ -1013,7 +1013,11 @@ void BaseDaemon::setupWatchdog()
/// If streaming compression of logs is used then we write watchdog logs to cerr
if (config().getRawString("logger.stream_compress", "false") == "true")
{
Poco::AutoPtr<OwnPatternFormatter> pf = new OwnPatternFormatter;
Poco::AutoPtr<OwnPatternFormatter> pf;
if (config().getString("logger.formatting", "") == "json")
pf = new OwnJSONPatternFormatter;
else
pf = new OwnPatternFormatter;
Poco::AutoPtr<DB::OwnFormattingChannel> log = new DB::OwnFormattingChannel(pf, new Poco::ConsoleChannel(std::cerr));
logger().setChannel(log);
}

View File

@ -0,0 +1,178 @@
#include <DataTypes/transformTypesRecursively.h>
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypeMap.h>
#include <DataTypes/DataTypeTuple.h>
#include <DataTypes/DataTypeNullable.h>
namespace DB
{
void transformTypesRecursively(DataTypes & types, std::function<void(DataTypes &)> transform_simple_types, std::function<void(DataTypes &)> transform_complex_types)
{
{
/// Arrays
bool have_array = false;
bool all_arrays = true;
DataTypes nested_types;
for (const auto & type : types)
{
if (const DataTypeArray * type_array = typeid_cast<const DataTypeArray *>(type.get()))
{
have_array = true;
nested_types.push_back(type_array->getNestedType());
}
else
all_arrays = false;
}
if (have_array)
{
if (all_arrays)
{
transformTypesRecursively(nested_types, transform_simple_types, transform_complex_types);
for (size_t i = 0; i != types.size(); ++i)
types[i] = std::make_shared<DataTypeArray>(nested_types[i]);
}
if (transform_complex_types)
transform_complex_types(types);
return;
}
}
{
/// Tuples
bool have_tuple = false;
bool all_tuples = true;
size_t tuple_size = 0;
std::vector<DataTypes> nested_types;
for (const auto & type : types)
{
if (const DataTypeTuple * type_tuple = typeid_cast<const DataTypeTuple *>(type.get()))
{
if (!have_tuple)
{
tuple_size = type_tuple->getElements().size();
nested_types.resize(tuple_size);
for (size_t elem_idx = 0; elem_idx < tuple_size; ++elem_idx)
nested_types[elem_idx].reserve(types.size());
}
else if (tuple_size != type_tuple->getElements().size())
return;
have_tuple = true;
for (size_t elem_idx = 0; elem_idx < tuple_size; ++elem_idx)
nested_types[elem_idx].emplace_back(type_tuple->getElements()[elem_idx]);
}
else
all_tuples = false;
}
if (have_tuple)
{
if (all_tuples)
{
std::vector<DataTypes> transposed_nested_types(types.size());
for (size_t elem_idx = 0; elem_idx < tuple_size; ++elem_idx)
{
transformTypesRecursively(nested_types[elem_idx], transform_simple_types, transform_complex_types);
for (size_t i = 0; i != types.size(); ++i)
transposed_nested_types[i].push_back(nested_types[elem_idx][i]);
}
for (size_t i = 0; i != types.size(); ++i)
types[i] = std::make_shared<DataTypeTuple>(transposed_nested_types[i]);
}
if (transform_complex_types)
transform_complex_types(types);
return;
}
}
{
/// Maps
bool have_maps = false;
bool all_maps = true;
DataTypes key_types;
DataTypes value_types;
key_types.reserve(types.size());
value_types.reserve(types.size());
for (const auto & type : types)
{
if (const DataTypeMap * type_map = typeid_cast<const DataTypeMap *>(type.get()))
{
have_maps = true;
key_types.emplace_back(type_map->getKeyType());
value_types.emplace_back(type_map->getValueType());
}
else
all_maps = false;
}
if (have_maps)
{
if (all_maps)
{
transformTypesRecursively(key_types, transform_simple_types, transform_complex_types);
transformTypesRecursively(value_types, transform_simple_types, transform_complex_types);
for (size_t i = 0; i != types.size(); ++i)
types[i] = std::make_shared<DataTypeMap>(key_types[i], value_types[i]);
}
if (transform_complex_types)
transform_complex_types(types);
return;
}
}
{
/// Nullable
bool have_nullable = false;
std::vector<UInt8> is_nullable;
is_nullable.reserve(types.size());
DataTypes nested_types;
nested_types.reserve(types.size());
for (const auto & type : types)
{
if (const DataTypeNullable * type_nullable = typeid_cast<const DataTypeNullable *>(type.get()))
{
have_nullable = true;
is_nullable.push_back(1);
nested_types.push_back(type_nullable->getNestedType());
}
else
{
is_nullable.push_back(0);
nested_types.push_back(type);
}
}
if (have_nullable)
{
transformTypesRecursively(nested_types, transform_simple_types, transform_complex_types);
for (size_t i = 0; i != types.size(); ++i)
{
if (is_nullable[i])
types[i] = makeNullable(nested_types[i]);
else
types[i] = nested_types[i];
}
return;
}
}
transform_simple_types(types);
}
}
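To see how the recursion composes, here is a hypothetical caller (`unifyNumbersToFloat` is an invented name, not from the patch) that widens integer leaves to Float64 wherever a type set mixes integers and floats, at any nesting depth inside Array/Map/Tuple/Nullable:

``` cpp
#include <DataTypes/transformTypesRecursively.h>
#include <DataTypes/DataTypesNumber.h>

using namespace DB;

// Hypothetical helper built on the function above.
void unifyNumbersToFloat(DataTypes & types)
{
    transformTypesRecursively(
        types,
        [](DataTypes & simple_types)
        {
            bool have_floats = false;
            for (const auto & type : simple_types)
                have_floats |= isFloat(type);
            if (!have_floats)
                return;
            for (auto & type : simple_types)
                if (isInteger(type))
                    type = std::make_shared<DataTypeFloat64>();
        },
        /* transform_complex_types = */ {});
}
```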

View File

@ -0,0 +1,17 @@
#pragma once
#include <DataTypes/IDataType.h>
#include <functional>
namespace DB
{
/// Function that applies custom transformation functions to the provided types recursively.
/// The implementation is similar to the getLeastSupertype function:
/// if all types are Array/Map/Tuple/Nullable, this function is called on the nested types;
/// if not all types are the same complex type (Array/Map/Tuple), it is not called on the nested types.
/// The transform_simple_types function is applied to the resulting simple types after all recursive calls.
/// The transform_complex_types function is applied to complex types (Array/Map/Tuple) after the recursive call on their nested types.
void transformTypesRecursively(DataTypes & types, std::function<void(DataTypes &)> transform_simple_types, std::function<void(DataTypes &)> transform_complex_types);
}

View File

@ -9,8 +9,12 @@
#include <DataTypes/DataTypeArray.h>
#include <DataTypes/DataTypeNothing.h>
#include <DataTypes/DataTypeTuple.h>
#include <DataTypes/getLeastSupertype.h>
#include <DataTypes/DataTypeDate.h>
#include <DataTypes/DataTypeDateTime64.h>
#include <DataTypes/DataTypeMap.h>
#include <DataTypes/DataTypeObject.h>
#include <DataTypes/getLeastSupertype.h>
#include <DataTypes/transformTypesRecursively.h>
#include <IO/ReadHelpers.h>
#include <IO/WriteHelpers.h>
#include <IO/ReadBufferFromString.h>
@ -255,7 +259,220 @@ String readStringByEscapingRule(ReadBuffer & buf, FormatSettings::EscapingRule e
return readByEscapingRule<true>(buf, escaping_rule, format_settings);
}
static DataTypePtr determineDataTypeForSingleFieldImpl(ReadBuffer & buf)
void transformInferredTypesIfNeededImpl(DataTypes & types, const FormatSettings & settings, bool is_json, const std::unordered_set<const IDataType *> * numbers_parsed_from_json_strings = nullptr)
{
/// Do nothing if we didn't try to infer something special.
if (!settings.try_infer_integers && !settings.try_infer_dates && !settings.try_infer_datetimes && !is_json)
return;
auto transform_simple_types = [&](DataTypes & data_types)
{
/// If we have floats and integers convert them all to float.
if (settings.try_infer_integers)
{
bool have_floats = false;
bool have_integers = false;
for (const auto & type : data_types)
{
have_floats |= isFloat(type);
have_integers |= isInteger(type) && !isBool(type);
}
if (have_floats && have_integers)
{
for (auto & type : data_types)
{
if (isInteger(type))
type = std::make_shared<DataTypeFloat64>();
}
}
}
/// If we have only dates and datetimes, convert the dates to datetime.
/// If we have dates/datetimes and something else, convert them to String.
/// There is a special case when we infer both Date/DateTime and Int64 from Strings,
/// for example: "arr: ["2020-01-01", "2000"]" -> Tuple(Date, Int64),
/// so if we have Date/DateTime and something else (not only String) we should
/// convert the Date/DateTime back to String, so that we can then
/// convert the Int64 back to String as well.
if (settings.try_infer_dates || settings.try_infer_datetimes)
{
bool have_dates = false;
bool have_datetimes = false;
bool all_dates_or_datetimes = true;
for (const auto & type : data_types)
{
have_dates |= isDate(type);
have_datetimes |= isDateTime64(type);
all_dates_or_datetimes &= isDate(type) || isDateTime64(type);
}
if (!all_dates_or_datetimes && (have_dates || have_datetimes))
{
for (auto & type : data_types)
{
if (isDate(type) || isDateTime64(type))
type = std::make_shared<DataTypeString>();
}
}
else if (have_dates && have_datetimes)
{
for (auto & type : data_types)
{
if (isDate(type))
type = std::make_shared<DataTypeDateTime64>(9);
}
}
}
if (!is_json)
return;
/// Check settings specific for JSON formats.
/// If we have numbers and strings, convert numbers to strings.
if (settings.json.try_infer_numbers_from_strings)
{
bool have_strings = false;
bool have_numbers = false;
for (const auto & type : data_types)
{
have_strings |= isString(type);
have_numbers |= isNumber(type);
}
if (have_strings && have_numbers)
{
for (auto & type : data_types)
{
if (isNumber(type) && (!numbers_parsed_from_json_strings || numbers_parsed_from_json_strings->contains(type.get())))
type = std::make_shared<DataTypeString>();
}
}
}
if (settings.json.read_bools_as_numbers)
{
/// Note that have_floats and have_integers cannot both be
/// true, because one of the previous checks converts
/// integers to floats when both are present.
bool have_floats = false;
bool have_integers = false;
bool have_bools = false;
for (const auto & type : data_types)
{
have_floats |= isFloat(type);
have_integers |= isInteger(type) && !isBool(type);
have_bools |= isBool(type);
}
if (have_bools && (have_integers || have_floats))
{
for (auto & type : data_types)
{
if (isBool(type))
{
if (have_integers)
type = std::make_shared<DataTypeInt64>();
else
type = std::make_shared<DataTypeFloat64>();
}
}
}
}
};
auto transform_complex_types = [&](DataTypes & data_types)
{
if (!is_json)
return;
bool have_maps = false;
bool have_objects = false;
bool are_maps_equal = true;
DataTypePtr first_map_type;
for (const auto & type : data_types)
{
if (isMap(type))
{
if (!have_maps)
{
first_map_type = type;
have_maps = true;
}
else
{
are_maps_equal &= type->equals(*first_map_type);
}
}
else if (isObject(type))
{
have_objects = true;
}
}
if (have_maps && (have_objects || !are_maps_equal))
{
for (auto & type : data_types)
{
if (isMap(type))
type = std::make_shared<DataTypeObject>("json", true);
}
}
};
transformTypesRecursively(types, transform_simple_types, transform_complex_types);
}
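Concretely, assuming the corresponding settings are enabled, the transformations above act on pairs of inferred types roughly like this (hypothetical examples, not from the patch):

``` cpp
// {Int64, Float64}      -> {Float64, Float64}             // integers widen to Float64
// {Date, DateTime64(9)} -> {DateTime64(9), DateTime64(9)} // dates widen to datetimes
// {Date, Int64}         -> {String, Int64}                // dates mixed with non-dates fall back to String
// JSON {String, Int64}  -> {String, String}               // numbers mixed with strings become String
// JSON {Bool, Int64}    -> {Int64, Int64}                 // read_bools_as_numbers
```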
void transformInferredTypesIfNeeded(DataTypes & types, const FormatSettings & settings, FormatSettings::EscapingRule escaping_rule)
{
transformInferredTypesIfNeededImpl(types, settings, escaping_rule == FormatSettings::EscapingRule::JSON);
}
void transformInferredTypesIfNeeded(DataTypePtr & first, DataTypePtr & second, const FormatSettings & settings, FormatSettings::EscapingRule escaping_rule)
{
DataTypes types = {first, second};
transformInferredTypesIfNeeded(types, settings, escaping_rule);
first = std::move(types[0]);
second = std::move(types[1]);
}
void transformInferredJSONTypesIfNeeded(DataTypes & types, const FormatSettings & settings, const std::unordered_set<const IDataType *> * numbers_parsed_from_json_strings)
{
transformInferredTypesIfNeededImpl(types, settings, true, numbers_parsed_from_json_strings);
}
void transformInferredJSONTypesIfNeeded(DataTypePtr & first, DataTypePtr & second, const FormatSettings & settings)
{
DataTypes types = {first, second};
transformInferredJSONTypesIfNeeded(types, settings);
first = std::move(types[0]);
second = std::move(types[1]);
}
DataTypePtr tryInferDateOrDateTime(const std::string_view & field, const FormatSettings & settings)
{
if (settings.try_infer_dates)
{
ReadBufferFromString buf(field);
DayNum tmp;
if (tryReadDateText(tmp, buf) && buf.eof())
return makeNullable(std::make_shared<DataTypeDate>());
}
if (settings.try_infer_datetimes)
{
ReadBufferFromString buf(field);
DateTime64 tmp;
if (tryReadDateTime64Text(tmp, 9, buf) && buf.eof())
return makeNullable(std::make_shared<DataTypeDateTime64>(9));
}
return nullptr;
}
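For illustration, assuming `try_infer_dates` and `try_infer_datetimes` are both enabled, the helper above behaves as follows (a hypothetical sketch; `tryInferExamples` is an invented name):

``` cpp
void tryInferExamples(const FormatSettings & settings)
{
    auto t1 = tryInferDateOrDateTime("2020-01-01", settings);              // Nullable(Date)
    auto t2 = tryInferDateOrDateTime("2020-01-01 03:00:00.123", settings); // Nullable(DateTime64(9))
    auto t3 = tryInferDateOrDateTime("not a date", settings);              // nullptr
}
```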
static DataTypePtr determineDataTypeForSingleFieldImpl(ReadBufferFromString & buf, const FormatSettings & settings)
{
if (buf.eof())
return nullptr;
@ -279,7 +496,7 @@ static DataTypePtr determineDataTypeForSingleFieldImpl(ReadBuffer & buf)
else
first = false;
auto nested_type = determineDataTypeForSingleFieldImpl(buf);
auto nested_type = determineDataTypeForSingleFieldImpl(buf, settings);
if (!nested_type)
return nullptr;
@ -294,6 +511,8 @@ static DataTypePtr determineDataTypeForSingleFieldImpl(ReadBuffer & buf)
if (nested_types.empty())
return std::make_shared<DataTypeArray>(std::make_shared<DataTypeNothing>());
transformInferredTypesIfNeeded(nested_types, settings);
auto least_supertype = tryGetLeastSupertype(nested_types);
if (!least_supertype)
return nullptr;
@ -320,7 +539,7 @@ static DataTypePtr determineDataTypeForSingleFieldImpl(ReadBuffer & buf)
else
first = false;
auto nested_type = determineDataTypeForSingleFieldImpl(buf);
auto nested_type = determineDataTypeForSingleFieldImpl(buf, settings);
if (!nested_type)
return nullptr;
@ -355,7 +574,7 @@ static DataTypePtr determineDataTypeForSingleFieldImpl(ReadBuffer & buf)
else
first = false;
auto key_type = determineDataTypeForSingleFieldImpl(buf);
auto key_type = determineDataTypeForSingleFieldImpl(buf, settings);
if (!key_type)
return nullptr;
@ -366,7 +585,7 @@ static DataTypePtr determineDataTypeForSingleFieldImpl(ReadBuffer & buf)
return nullptr;
skipWhitespaceIfAny(buf);
auto value_type = determineDataTypeForSingleFieldImpl(buf);
auto value_type = determineDataTypeForSingleFieldImpl(buf, settings);
if (!value_type)
return nullptr;
@ -382,6 +601,9 @@ static DataTypePtr determineDataTypeForSingleFieldImpl(ReadBuffer & buf)
if (key_types.empty())
return std::make_shared<DataTypeMap>(std::make_shared<DataTypeNothing>(), std::make_shared<DataTypeNothing>());
transformInferredTypesIfNeeded(key_types, settings);
transformInferredTypesIfNeeded(value_types, settings);
auto key_least_supertype = tryGetLeastSupertype(key_types);
auto value_least_supertype = tryGetLeastSupertype(value_types);
@ -398,9 +620,11 @@ static DataTypePtr determineDataTypeForSingleFieldImpl(ReadBuffer & buf)
if (*buf.position() == '\'')
{
++buf.position();
String field;
while (!buf.eof())
{
char * next_pos = find_first_symbols<'\\', '\''>(buf.position(), buf.buffer().end());
field.append(buf.position(), next_pos);
buf.position() = next_pos;
if (!buf.hasPendingData())
@ -409,6 +633,7 @@ static DataTypePtr determineDataTypeForSingleFieldImpl(ReadBuffer & buf)
if (*buf.position() == '\'')
break;
field.push_back(*buf.position());
if (*buf.position() == '\\')
++buf.position();
}
@ -417,6 +642,9 @@ static DataTypePtr determineDataTypeForSingleFieldImpl(ReadBuffer & buf)
return nullptr;
++buf.position();
if (auto type = tryInferDateOrDateTime(field, settings))
return type;
return std::make_shared<DataTypeString>();
}
@ -430,15 +658,29 @@ static DataTypePtr determineDataTypeForSingleFieldImpl(ReadBuffer & buf)
/// Number
Float64 tmp;
auto * pos_before_float = buf.position();
if (tryReadFloatText(tmp, buf))
{
if (settings.try_infer_integers)
{
auto * float_end_pos = buf.position();
buf.position() = pos_before_float;
Int64 tmp_int;
if (tryReadIntText(tmp_int, buf) && buf.position() == float_end_pos)
return std::make_shared<DataTypeInt64>();
buf.position() = float_end_pos;
}
return std::make_shared<DataTypeFloat64>();
}
return nullptr;
}
static DataTypePtr determineDataTypeForSingleField(ReadBuffer & buf)
static DataTypePtr determineDataTypeForSingleField(ReadBufferFromString & buf, const FormatSettings & settings)
{
return makeNullableRecursivelyAndCheckForNothing(determineDataTypeForSingleFieldImpl(buf));
return makeNullableRecursivelyAndCheckForNothing(determineDataTypeForSingleFieldImpl(buf, settings));
}
DataTypePtr determineDataTypeByEscapingRule(const String & field, const FormatSettings & format_settings, FormatSettings::EscapingRule escaping_rule)
@ -448,11 +690,11 @@ DataTypePtr determineDataTypeByEscapingRule(const String & field, const FormatSe
case FormatSettings::EscapingRule::Quoted:
{
ReadBufferFromString buf(field);
auto type = determineDataTypeForSingleField(buf);
auto type = determineDataTypeForSingleField(buf, format_settings);
return buf.eof() ? type : nullptr;
}
case FormatSettings::EscapingRule::JSON:
return JSONUtils::getDataTypeFromField(field);
return JSONUtils::getDataTypeFromField(field, format_settings);
case FormatSettings::EscapingRule::CSV:
{
if (!format_settings.csv.input_format_use_best_effort_in_schema_inference)
@ -466,9 +708,13 @@ DataTypePtr determineDataTypeByEscapingRule(const String & field, const FormatSe
if (field.size() > 1 && ((field.front() == '\'' && field.back() == '\'') || (field.front() == '"' && field.back() == '"')))
{
ReadBufferFromString buf(std::string_view(field.data() + 1, field.size() - 2));
auto data = std::string_view(field.data() + 1, field.size() - 2);
if (auto date_type = tryInferDateOrDateTime(data, format_settings))
return date_type;
ReadBufferFromString buf(data);
/// Try to determine the type of value inside quotes
auto type = determineDataTypeForSingleField(buf);
auto type = determineDataTypeForSingleField(buf, format_settings);
if (!type)
return nullptr;
@ -481,6 +727,14 @@ DataTypePtr determineDataTypeByEscapingRule(const String & field, const FormatSe
}
/// Case when CSV value is not in quotes. Check if it's a number, and if not, infer it as a string.
if (format_settings.try_infer_integers)
{
ReadBufferFromString buf(field);
Int64 tmp_int;
if (tryReadIntText(tmp_int, buf) && buf.eof())
return makeNullable(std::make_shared<DataTypeInt64>());
}
ReadBufferFromString buf(field);
Float64 tmp;
if (tryReadFloatText(tmp, buf) && buf.eof())
@ -500,8 +754,11 @@ DataTypePtr determineDataTypeByEscapingRule(const String & field, const FormatSe
if (field == format_settings.bool_false_representation || field == format_settings.bool_true_representation)
return DataTypeFactory::instance().get("Nullable(Bool)");
if (auto date_type = tryInferDateOrDateTime(field, format_settings))
return date_type;
ReadBufferFromString buf(field);
auto type = determineDataTypeForSingleField(buf);
auto type = determineDataTypeForSingleField(buf, format_settings);
if (!buf.eof())
return makeNullable(std::make_shared<DataTypeString>());

View File

@ -60,4 +60,21 @@ DataTypes determineDataTypesByEscapingRule(const std::vector<String> & fields, c
DataTypePtr getDefaultDataTypeForEscapingRule(FormatSettings::EscapingRule escaping_rule);
DataTypes getDefaultDataTypeForEscapingRules(const std::vector<FormatSettings::EscapingRule> & escaping_rules);
/// Try to infer Date or DateTime from string if corresponding settings are enabled.
DataTypePtr tryInferDateOrDateTime(const std::string_view & field, const FormatSettings & settings);
/// Check if we need to transform types inferred from data and transform it if necessary.
/// It's used when we try to infer some non-ordinary types from other types.
/// For example, when inferring dates from strings, we should check that dates were
/// inferred from all the strings in the same way, and if not, transform the inferred
/// dates back to strings. For instance, if we have an array of strings and tried to
/// infer dates from them, the result type can be Array(Date) only if all the strings
/// were successfully parsed as dates; otherwise all dates are converted back to
/// strings and the result type becomes Array(String).
void transformInferredTypesIfNeeded(DataTypes & types, const FormatSettings & settings, FormatSettings::EscapingRule escaping_rule = FormatSettings::EscapingRule::Escaped);
void transformInferredTypesIfNeeded(DataTypePtr & first, DataTypePtr & second, const FormatSettings & settings, FormatSettings::EscapingRule escaping_rule = FormatSettings::EscapingRule::Escaped);
/// Same as transformInferredTypesIfNeeded but takes into account settings that are special for JSON formats.
void transformInferredJSONTypesIfNeeded(DataTypes & types, const FormatSettings & settings, const std::unordered_set<const IDataType *> * numbers_parsed_from_json_strings = nullptr);
void transformInferredJSONTypesIfNeeded(DataTypePtr & first, DataTypePtr & second, const FormatSettings & settings);
}
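
A minimal standalone illustration of the fallback the comment above describes, with inferred types represented as plain strings rather than ClickHouse DataTypePtr (assumed semantics, not the actual implementation):

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

std::vector<std::string> unifyInferredTypes(std::vector<std::string> types)
{
    bool all_dates = std::all_of(types.begin(), types.end(),
        [](const std::string & t) { return t == "Date"; });
    if (!all_dates)
        for (auto & t : types)
            if (t == "Date")
                t = "String";  // Demote dates back to strings on any mismatch.
    return types;
}

int main()
{
    for (const auto & t : unifyInferredTypes({"Date", "String"}))
        std::cout << t << ' ';  // Prints: String String
}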

View File

@ -94,6 +94,7 @@ FormatSettings getFormatSettings(ContextPtr context, const Settings & settings)
format_settings.json.quote_64bit_integers = settings.output_format_json_quote_64bit_integers;
format_settings.json.quote_denormals = settings.output_format_json_quote_denormals;
format_settings.json.read_bools_as_numbers = settings.input_format_json_read_bools_as_numbers;
format_settings.json.try_infer_numbers_from_strings = settings.input_format_json_try_infer_numbers_from_strings;
format_settings.null_as_default = settings.input_format_null_as_default;
format_settings.decimal_trailing_zeros = settings.output_format_decimal_trailing_zeros;
format_settings.parquet.row_group_size = settings.output_format_parquet_row_group_size;
@ -165,6 +166,9 @@ FormatSettings getFormatSettings(ContextPtr context, const Settings & settings)
format_settings.sql_insert.table_name = settings.output_format_sql_insert_table_name;
format_settings.sql_insert.use_replace = settings.output_format_sql_insert_use_replace;
format_settings.sql_insert.quote_names = settings.output_format_sql_insert_quote_names;
format_settings.try_infer_integers = settings.input_format_try_infer_integers;
format_settings.try_infer_dates = settings.input_format_try_infer_dates;
format_settings.try_infer_datetimes = settings.input_format_try_infer_datetimes;
/// Validate avro_schema_registry_url with RemoteHostFilter when non-empty and in Server context
if (format_settings.schema.is_server)

View File

@ -38,6 +38,9 @@ struct FormatSettings
UInt64 max_rows_to_read_for_schema_inference = 100;
String column_names_for_schema_inference;
bool try_infer_integers = false;
bool try_infer_dates = false;
bool try_infer_datetimes = false;
enum class DateTimeInputFormat
{
@ -142,6 +145,7 @@ struct FormatSettings
bool named_tuples_as_objects = false;
bool serialize_as_strings = false;
bool read_bools_as_numbers = true;
bool try_infer_numbers_from_strings = false;
} json;
struct

View File

@ -1,6 +1,7 @@
#include <IO/ReadHelpers.h>
#include <Formats/JSONUtils.h>
#include <Formats/ReadSchemaUtils.h>
#include <Formats/EscapingRuleUtils.h>
#include <IO/ReadBufferFromString.h>
#include <IO/WriteBufferValidUTF8.h>
#include <DataTypes/Serializations/SerializationNullable.h>
@ -121,7 +122,7 @@ namespace JSONUtils
}
template <class Element>
DataTypePtr getDataTypeFromFieldImpl(const Element & field)
DataTypePtr getDataTypeFromFieldImpl(const Element & field, const FormatSettings & settings, std::unordered_set<const IDataType *> & numbers_parsed_from_json_strings)
{
if (field.isNull())
return nullptr;
@ -129,11 +130,48 @@ namespace JSONUtils
if (field.isBool())
return DataTypeFactory::instance().get("Nullable(Bool)");
if (field.isInt64() || field.isUInt64() || field.isDouble())
if (field.isInt64() || field.isUInt64())
{
if (settings.try_infer_integers)
return makeNullable(std::make_shared<DataTypeInt64>());
return makeNullable(std::make_shared<DataTypeFloat64>());
}
if (field.isDouble())
return makeNullable(std::make_shared<DataTypeFloat64>());
if (field.isString())
{
if (auto date_type = tryInferDateOrDateTime(field.getString(), settings))
return date_type;
if (!settings.json.try_infer_numbers_from_strings)
return makeNullable(std::make_shared<DataTypeString>());
ReadBufferFromString buf(field.getString());
if (settings.try_infer_integers)
{
Int64 tmp_int;
if (tryReadIntText(tmp_int, buf) && buf.eof())
{
auto type = std::make_shared<DataTypeInt64>();
numbers_parsed_from_json_strings.insert(type.get());
return makeNullable(type);
}
}
Float64 tmp;
if (tryReadFloatText(tmp, buf) && buf.eof())
{
auto type = std::make_shared<DataTypeFloat64>();
numbers_parsed_from_json_strings.insert(type.get());
return makeNullable(type);
}
return makeNullable(std::make_shared<DataTypeString>());
}
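/// For example, with input_format_json_try_infer_numbers_from_strings enabled,
/// the JSON string "42" is inferred as Nullable(Int64) (given try_infer_integers)
/// and "4.2" as Nullable(Float64); the raw type pointer is remembered in
/// numbers_parsed_from_json_strings so the column can be demoted back to String
/// later if other rows contain non-numeric strings.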
if (field.isArray())
{
@ -145,20 +183,32 @@ namespace JSONUtils
DataTypes nested_data_types;
/// If this array contains fields with different types we will treat it as Tuple.
bool is_tuple = false;
bool are_types_the_same = true;
for (const auto element : array)
{
auto type = getDataTypeFromFieldImpl(element);
auto type = getDataTypeFromFieldImpl(element, settings, numbers_parsed_from_json_strings);
if (!type)
return nullptr;
if (!nested_data_types.empty() && type->getName() != nested_data_types.back()->getName())
is_tuple = true;
if (!nested_data_types.empty() && !type->equals(*nested_data_types.back()))
are_types_the_same = false;
nested_data_types.push_back(std::move(type));
}
if (is_tuple)
if (!are_types_the_same)
{
auto nested_types_copy = nested_data_types;
transformInferredJSONTypesIfNeeded(nested_types_copy, settings, &numbers_parsed_from_json_strings);
are_types_the_same = true;
for (size_t i = 1; i < nested_types_copy.size(); ++i)
are_types_the_same &= nested_types_copy[i]->equals(*nested_types_copy[i - 1]);
if (are_types_the_same)
nested_data_types = std::move(nested_types_copy);
}
if (!are_types_the_same)
return std::make_shared<DataTypeTuple>(nested_data_types);
return std::make_shared<DataTypeArray>(nested_data_types.back());
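/// For example (assuming try_infer_integers and the date settings are enabled):
///   [1, 2, 3]             -> Array(Nullable(Int64))
///   [1, "abc"]            -> Tuple(Nullable(Int64), Nullable(String)),
///                            since the transform cannot reconcile Int64 with String
///   ["2020-01-01", "abc"] -> Array(Nullable(String)),
///                            since the inferred Date is demoted back to String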
@ -167,38 +217,35 @@ namespace JSONUtils
if (field.isObject())
{
auto object = field.getObject();
DataTypePtr value_type;
bool is_object = false;
DataTypes value_types;
bool have_object_value = false;
for (const auto key_value_pair : object)
{
auto type = getDataTypeFromFieldImpl(key_value_pair.second);
auto type = getDataTypeFromFieldImpl(key_value_pair.second, settings, numbers_parsed_from_json_strings);
if (!type)
continue;
if (isObject(type))
{
is_object = true;
have_object_value = true;
break;
}
if (!value_type)
{
value_type = type;
}
else if (!value_type->equals(*type))
{
is_object = true;
break;
}
value_types.push_back(type);
}
if (is_object)
if (value_types.empty())
return nullptr;
transformInferredJSONTypesIfNeeded(value_types, settings, &numbers_parsed_from_json_strings);
bool are_types_equal = true;
for (size_t i = 1; i < value_types.size(); ++i)
are_types_equal &= value_types[i]->equals(*value_types[0]);
if (have_object_value || !are_types_equal)
return std::make_shared<DataTypeObject>("json", true);
if (value_type)
return std::make_shared<DataTypeMap>(std::make_shared<DataTypeString>(), value_type);
return nullptr;
return std::make_shared<DataTypeMap>(std::make_shared<DataTypeString>(), value_types[0]);
}
throw Exception{ErrorCodes::INCORRECT_DATA, "Unexpected JSON type"};
@ -215,18 +262,19 @@ namespace JSONUtils
#endif
}
DataTypePtr getDataTypeFromField(const String & field)
DataTypePtr getDataTypeFromField(const String & field, const FormatSettings & settings)
{
auto [parser, element] = getJSONParserAndElement();
bool parsed = parser.parse(field, element);
if (!parsed)
throw Exception(ErrorCodes::INCORRECT_DATA, "Cannot parse JSON object here: {}", field);
return getDataTypeFromFieldImpl(element);
std::unordered_set<const IDataType *> numbers_parsed_from_json_strings;
return getDataTypeFromFieldImpl(element, settings, numbers_parsed_from_json_strings);
}
template <class Extractor, const char opening_bracket, const char closing_bracket>
static DataTypes determineColumnDataTypesFromJSONEachRowDataImpl(ReadBuffer & in, bool /*json_strings*/, Extractor & extractor)
static DataTypes determineColumnDataTypesFromJSONEachRowDataImpl(ReadBuffer & in, const FormatSettings & settings, bool /*json_strings*/, Extractor & extractor)
{
String line = readJSONEachRowLineIntoStringImpl<opening_bracket, closing_bracket>(in);
auto [parser, element] = getJSONParserAndElement();
@ -238,8 +286,9 @@ namespace JSONUtils
DataTypes data_types;
data_types.reserve(fields.size());
std::unordered_set<const IDataType *> numbers_parsed_from_json_strings;
for (const auto & field : fields)
data_types.push_back(getDataTypeFromFieldImpl(field));
data_types.push_back(getDataTypeFromFieldImpl(field, settings, numbers_parsed_from_json_strings));
/// TODO: For JSONStringsEachRow/JSONCompactStringsEach all types will be strings.
/// Should we try to parse data inside strings somehow in this case?
@ -284,11 +333,11 @@ namespace JSONUtils
std::vector<String> column_names;
};
NamesAndTypesList readRowAndGetNamesAndDataTypesForJSONEachRow(ReadBuffer & in, bool json_strings)
NamesAndTypesList readRowAndGetNamesAndDataTypesForJSONEachRow(ReadBuffer & in, const FormatSettings & settings, bool json_strings)
{
JSONEachRowFieldsExtractor extractor;
auto data_types
= determineColumnDataTypesFromJSONEachRowDataImpl<JSONEachRowFieldsExtractor, '{', '}'>(in, json_strings, extractor);
= determineColumnDataTypesFromJSONEachRowDataImpl<JSONEachRowFieldsExtractor, '{', '}'>(in, settings, json_strings, extractor);
NamesAndTypesList result;
for (size_t i = 0; i != extractor.column_names.size(); ++i)
result.emplace_back(extractor.column_names[i], data_types[i]);
@ -313,10 +362,10 @@ namespace JSONUtils
}
};
DataTypes readRowAndGetDataTypesForJSONCompactEachRow(ReadBuffer & in, bool json_strings)
DataTypes readRowAndGetDataTypesForJSONCompactEachRow(ReadBuffer & in, const FormatSettings & settings, bool json_strings)
{
JSONCompactEachRowFieldsExtractor extractor;
return determineColumnDataTypesFromJSONEachRowDataImpl<JSONCompactEachRowFieldsExtractor, '[', ']'>(in, json_strings, extractor);
return determineColumnDataTypesFromJSONEachRowDataImpl<JSONCompactEachRowFieldsExtractor, '[', ']'>(in, settings, json_strings, extractor);
}

View File

@ -22,16 +22,16 @@ namespace JSONUtils
/// Parse JSON from string and convert its type to a ClickHouse type. Make the result type always Nullable.
/// JSON array with different nested types is treated as Tuple.
/// If the type cannot be determined (for example, when the field contains null), return nullptr.
DataTypePtr getDataTypeFromField(const String & field);
DataTypePtr getDataTypeFromField(const String & field, const FormatSettings & settings);
/// Read row in JSONEachRow format and try to determine type for each field.
/// Return list of names and types.
/// If the type of some field cannot be determined, return nullptr for it.
NamesAndTypesList readRowAndGetNamesAndDataTypesForJSONEachRow(ReadBuffer & in, bool json_strings);
NamesAndTypesList readRowAndGetNamesAndDataTypesForJSONEachRow(ReadBuffer & in, const FormatSettings & settings, bool json_strings);
/// Read row in JSONCompactEachRow format and try to determine type for each field.
/// If cannot determine the type of some field, return nullptr for it.
DataTypes readRowAndGetDataTypesForJSONCompactEachRow(ReadBuffer & in, bool json_strings);
DataTypes readRowAndGetDataTypesForJSONCompactEachRow(ReadBuffer & in, const FormatSettings & settings, bool json_strings);
bool nonTrivialPrefixAndSuffixCheckerJSONEachRowImpl(ReadBuffer & buf);

View File

@ -88,8 +88,13 @@ ColumnsDescription readSchemaFromFormat(
catch (...)
{
auto exception_message = getCurrentExceptionMessage(false);
throw Exception(ErrorCodes::CANNOT_EXTRACT_TABLE_STRUCTURE, "Cannot extract table structure from {} format file: {}. You can specify the structure manually", format_name, exception_message);
throw Exception(
ErrorCodes::CANNOT_EXTRACT_TABLE_STRUCTURE,
"Cannot extract table structure from {} format file:\n{}\nYou can specify the structure manually",
format_name,
exception_message);
}
++iterations;
if (is_eof)

View File

@ -52,6 +52,7 @@ public:
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
bool useDefaultImplementationForConstants() const override { return true; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {1, 2, 3}; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{
@ -69,7 +70,7 @@ public:
if (arguments.size() > 1)
{
const auto & hash_col = arguments[1];
if (!isString(hash_col.type) || !isColumnConst(*hash_col.column.get()))
if (!isString(hash_col.type))
throw Exception(
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"Second argument of function {} must be String, got {}",
@ -80,7 +81,7 @@ public:
if (arguments.size() > 2)
{
const auto & min_length_col = arguments[2];
if (!isUInt8(min_length_col.type) || !isColumnConst(*min_length_col.column.get()))
if (!isUInt8(min_length_col.type))
throw Exception(
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"Third argument of function {} must be UInt8, got {}",
@ -91,7 +92,7 @@ public:
if (arguments.size() > 3)
{
const auto & alphabet_col = arguments[3];
if (!isString(alphabet_col.type) || !isColumnConst(*alphabet_col.column.get()))
if (!isString(alphabet_col.type))
throw Exception(
ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"Fourth argument of function {} must be String, got {}",

View File

@ -599,8 +599,9 @@ template <typename Name> struct ConvertImpl<DataTypeFloat32, DataTypeDateTime, N
template <typename Name> struct ConvertImpl<DataTypeFloat64, DataTypeDateTime, Name>
: DateTimeTransformImpl<DataTypeFloat64, DataTypeDateTime, ToDateTimeTransform64Signed<Float64, UInt32>> {};
const time_t LUT_MIN_TIME = -1420070400l; // 1925-01-01 UTC
const time_t LUT_MAX_TIME = 9877248000l; // 2282-12-31 UTC
const time_t LUT_MIN_TIME = -2208988800l; // 1900-01-01 UTC
const time_t LUT_MAX_TIME = 10413791999l; // 2299-12-31 UTC
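/// Sanity check for the widened range:
///   1900-01-01 is 70 years before the epoch: 70 * 365 + 17 leap days
///   = 25567 days = 2208988800 seconds, hence LUT_MIN_TIME = -2208988800.
///   2299-12-31 23:59:59 is 330 years after the epoch: 330 * 365 + 80 leap days
///   = 120530 days; 120530 * 86400 - 1 = 10413791999, hence LUT_MAX_TIME.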
/** Conversion of numeric to DateTime64
*/

View File

@ -164,9 +164,9 @@ struct MakeDate32Traits
using ReturnDataType = DataTypeDate32;
using ReturnColumnType = ColumnInt32;
static constexpr auto MIN_YEAR = 1925;
static constexpr auto MAX_YEAR = 2283;
static constexpr auto MAX_DATE = YearMonthDayToSingleInt(MAX_YEAR, 11, 11);
static constexpr auto MIN_YEAR = 1900;
static constexpr auto MAX_YEAR = 2299;
static constexpr auto MAX_DATE = YearMonthDayToSingleInt(MAX_YEAR, 12, 31);
};
/// Common implementation for makeDateTime, makeDateTime64

View File

@ -97,7 +97,12 @@ void Loggers::buildLoggers(Poco::Util::AbstractConfiguration & config, Poco::Log
log_file->setProperty(Poco::FileChannel::PROP_ROTATEONOPEN, config.getRawString("logger.rotateOnOpen", "false"));
log_file->open();
Poco::AutoPtr<OwnPatternFormatter> pf = new OwnPatternFormatter;
Poco::AutoPtr<OwnPatternFormatter> pf;
if (config.getString("logger.formatting", "") == "json")
pf = new OwnJSONPatternFormatter;
else
pf = new OwnPatternFormatter;
Poco::AutoPtr<DB::OwnFormattingChannel> log = new DB::OwnFormattingChannel(pf, log_file);
log->setLevel(log_level);
@ -133,7 +138,12 @@ void Loggers::buildLoggers(Poco::Util::AbstractConfiguration & config, Poco::Log
error_log_file->setProperty(Poco::FileChannel::PROP_FLUSH, config.getRawString("logger.flush", "true"));
error_log_file->setProperty(Poco::FileChannel::PROP_ROTATEONOPEN, config.getRawString("logger.rotateOnOpen", "false"));
Poco::AutoPtr<OwnPatternFormatter> pf = new OwnPatternFormatter;
Poco::AutoPtr<OwnPatternFormatter> pf;
if (config.getString("logger.formatting", "") == "json")
pf = new OwnJSONPatternFormatter;
else
pf = new OwnPatternFormatter;
Poco::AutoPtr<DB::OwnFormattingChannel> errorlog = new DB::OwnFormattingChannel(pf, error_log_file);
errorlog->setLevel(errorlog_level);
@ -172,7 +182,12 @@ void Loggers::buildLoggers(Poco::Util::AbstractConfiguration & config, Poco::Log
}
syslog_channel->open();
Poco::AutoPtr<OwnPatternFormatter> pf = new OwnPatternFormatter;
Poco::AutoPtr<OwnPatternFormatter> pf;
if (config.getString("logger.formatting", "") == "json")
pf = new OwnJSONPatternFormatter;
else
pf = new OwnPatternFormatter;
Poco::AutoPtr<DB::OwnFormattingChannel> log = new DB::OwnFormattingChannel(pf, syslog_channel);
log->setLevel(syslog_level);
@ -195,7 +210,11 @@ void Loggers::buildLoggers(Poco::Util::AbstractConfiguration & config, Poco::Log
max_log_level = console_log_level;
}
Poco::AutoPtr<OwnPatternFormatter> pf = new OwnPatternFormatter(color_enabled);
Poco::AutoPtr<OwnPatternFormatter> pf;
if (config.getString("logger.formatting", "") == "json")
pf = new OwnJSONPatternFormatter;
else
pf = new OwnPatternFormatter(color_enabled);
Poco::AutoPtr<DB::OwnFormattingChannel> log = new DB::OwnFormattingChannel(pf, new Poco::ConsoleChannel);
log->setLevel(console_log_level);
split->addChannel(log, "console");
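
Every channel above switches to the JSON formatter when the logger.formatting key equals "json". Given ClickHouse's XML server configuration, enabling it would presumably look like:

<logger>
    <formatting>json</formatting>
</logger>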

View File

@ -4,6 +4,7 @@
#include <Poco/Channel.h>
#include <Poco/FormattingChannel.h>
#include "ExtendedLogChannel.h"
#include "OwnJSONPatternFormatter.h"
#include "OwnPatternFormatter.h"

View File

@ -0,0 +1,104 @@
#include "OwnJSONPatternFormatter.h"
#include <functional>
#include <IO/WriteBufferFromString.h>
#include <IO/WriteHelpers.h>
#include <Interpreters/InternalTextLogsQueue.h>
#include <base/terminalColors.h>
#include <Common/CurrentThread.h>
#include <Common/HashTable/Hash.h>
OwnJSONPatternFormatter::OwnJSONPatternFormatter() : OwnPatternFormatter(false)
{
}
void OwnJSONPatternFormatter::formatExtended(const DB::ExtendedLogMessage & msg_ext, std::string & text) const
{
DB::WriteBufferFromString wb(text);
DB::FormatSettings settings;
const Poco::Message & msg = msg_ext.base;
DB::writeChar('{', wb);
writeJSONString("date_time", wb, settings);
DB::writeChar(':', wb);
DB::writeChar('\"', wb);
/// Write the time as a Unix timestamp with microsecond precision.
writeDateTimeUnixTimestamp(msg_ext.time_seconds, 0, wb);
DB::writeChar('.', wb);
DB::writeChar('0' + ((msg_ext.time_microseconds / 100000) % 10), wb);
DB::writeChar('0' + ((msg_ext.time_microseconds / 10000) % 10), wb);
DB::writeChar('0' + ((msg_ext.time_microseconds / 1000) % 10), wb);
DB::writeChar('0' + ((msg_ext.time_microseconds / 100) % 10), wb);
DB::writeChar('0' + ((msg_ext.time_microseconds / 10) % 10), wb);
DB::writeChar('0' + ((msg_ext.time_microseconds / 1) % 10), wb);
DB::writeChar('\"', wb);
DB::writeChar(',', wb);
writeJSONString("thread_name", wb, settings);
DB::writeChar(':', wb);
writeJSONString(msg.getThread(), wb, settings);
DB::writeChar(',', wb);
writeJSONString("thread_id", wb, settings);
DB::writeChar(':', wb);
DB::writeChar('\"', wb);
DB::writeIntText(msg_ext.thread_id, wb);
DB::writeChar('\"', wb);
DB::writeChar(',', wb);
writeJSONString("level", wb, settings);
DB::writeChar(':', wb);
int priority = static_cast<int>(msg.getPriority());
writeJSONString(std::to_string(priority), wb, settings);
DB::writeChar(',', wb);
/// We write query_id even when it is empty (no query context),
/// for the convenience of various log parsers.
writeJSONString("query_id", wb, settings);
DB::writeChar(':', wb);
writeJSONString(msg_ext.query_id, wb, settings);
DB::writeChar(',', wb);
writeJSONString("logger_name", wb, settings);
DB::writeChar(':', wb);
writeJSONString(msg.getSource(), wb, settings);
DB::writeChar(',', wb);
writeJSONString("message", wb, settings);
DB::writeChar(':', wb);
writeJSONString(msg.getText(), wb, settings);
DB::writeChar(',', wb);
writeJSONString("source_file", wb, settings);
DB::writeChar(':', wb);
const char * source_file = msg.getSourceFile();
if (source_file != nullptr)
writeJSONString(source_file, wb, settings);
else
writeJSONString("", wb, settings);
DB::writeChar(',', wb);
writeJSONString("source_line", wb, settings);
DB::writeChar(':', wb);
DB::writeChar('\"', wb);
DB::writeIntText(msg.getSourceLine(), wb);
DB::writeChar('\"', wb);
DB::writeChar('}', wb);
}
void OwnJSONPatternFormatter::format(const Poco::Message & msg, std::string & text)
{
formatExtended(DB::ExtendedLogMessage::getFrom(msg), text);
}
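
Given the fields written above, one formatted log line would look roughly like this (all values are illustrative):

{"date_time":"1660190400.123456","thread_name":"TCPHandler","thread_id":"42","level":"6","query_id":"","logger_name":"executeQuery","message":"Read 100 rows","source_file":"src/Interpreters/executeQuery.cpp","source_line":"812"}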

View File

@ -0,0 +1,32 @@
#pragma once
#include <Poco/PatternFormatter.h>
#include "ExtendedLogChannel.h"
#include "OwnPatternFormatter.h"
/** Format log messages our own way, in JSON.
* We can't obtain some details using Poco::PatternFormatter.
*
 * Firstly, the thread number here is picked not only from Poco::Thread
 * threads, but from all threads that have a number assigned (see ThreadNumber.h)
*
* Secondly, the local date and time are correctly displayed.
* Poco::PatternFormatter does not work well with local time,
* when timestamps are close to DST timeshift moments.
* - see Poco sources and http://thread.gmane.org/gmane.comp.time.tz/8883
*
* Also it's made a bit more efficient (unimportant).
*/
class Loggers;
class OwnJSONPatternFormatter : public OwnPatternFormatter
{
public:
OwnJSONPatternFormatter();
void format(const Poco::Message & msg, std::string & text) override;
void formatExtended(const DB::ExtendedLogMessage & msg_ext, std::string & text) const override;
};

View File

@ -27,7 +27,7 @@ public:
OwnPatternFormatter(bool color_ = false);
void format(const Poco::Message & msg, std::string & text) override;
void formatExtended(const DB::ExtendedLogMessage & msg_ext, std::string & text) const;
virtual void formatExtended(const DB::ExtendedLogMessage & msg_ext, std::string & text) const;
private:
bool color;

View File

@ -1,5 +1,6 @@
#include <Processors/Formats/ISchemaReader.h>
#include <Formats/ReadSchemaUtils.h>
#include <Formats/EscapingRuleUtils.h>
#include <DataTypes/DataTypeString.h>
#include <boost/algorithm/string.hpp>
@ -17,35 +18,38 @@ namespace ErrorCodes
void chooseResultColumnType(
DataTypePtr & type,
const DataTypePtr & new_type,
CommonDataTypeChecker common_type_checker,
DataTypePtr & new_type,
std::function<void(DataTypePtr &, DataTypePtr &)> transform_types_if_needed,
const DataTypePtr & default_type,
const String & column_name,
size_t row)
{
if (!type)
{
type = new_type;
return;
}
if (!new_type || type->equals(*new_type))
return;
transform_types_if_needed(type, new_type);
if (type->equals(*new_type))
return;
/// If the new type and the previous type for this column are different,
/// we will use the default type if we have one, or throw an exception.
if (new_type && !type->equals(*new_type))
if (default_type)
type = default_type;
else
{
DataTypePtr common_type;
if (common_type_checker)
common_type = common_type_checker(type, new_type);
if (common_type)
type = common_type;
else if (default_type)
type = default_type;
else
throw Exception(
ErrorCodes::TYPE_MISMATCH,
"Automatically defined type {} for column {} in row {} differs from type defined by previous rows: {}",
type->getName(),
column_name,
row,
new_type->getName());
throw Exception(
ErrorCodes::TYPE_MISMATCH,
"Automatically defined type {} for column {} in row {} differs from type defined by previous rows: {}",
type->getName(),
column_name,
row,
new_type->getName());
}
}
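
The resolution order above is: let the transform callback try to reconcile the two inferred types, keep the type if they now match, otherwise fall back to the configured default type, and throw only when no default exists. A simplified standalone sketch with types as plain strings:

#include <functional>
#include <optional>
#include <stdexcept>
#include <string>

std::string chooseType(
    std::string type, std::string new_type,
    const std::function<void(std::string &, std::string &)> & transform,
    const std::optional<std::string> & default_type)
{
    transform(type, new_type);  // Try to reconcile the two types first.
    if (type == new_type)
        return type;
    if (default_type)
        return *default_type;   // Fall back if a default is configured.
    throw std::runtime_error("types differ between rows");
}

int main()
{
    auto no_op = [](std::string &, std::string &) {};
    // Conflicting types fall back to the configured default.
    return chooseType("Int64", "String", no_op, std::string("String")) == "String" ? 0 : 1;
}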
@ -63,8 +67,8 @@ void checkResultColumnTypeAndAppend(NamesAndTypesList & result, DataTypePtr & ty
result.emplace_back(name, type);
}
IRowSchemaReader::IRowSchemaReader(ReadBuffer & in_, const FormatSettings & format_settings)
: ISchemaReader(in_)
IRowSchemaReader::IRowSchemaReader(ReadBuffer & in_, const FormatSettings & format_settings_)
: ISchemaReader(in_), format_settings(format_settings_)
{
if (!format_settings.column_names_for_schema_inference.empty())
{
@ -79,14 +83,14 @@ IRowSchemaReader::IRowSchemaReader(ReadBuffer & in_, const FormatSettings & form
}
}
IRowSchemaReader::IRowSchemaReader(ReadBuffer & in_, const FormatSettings & format_settings, DataTypePtr default_type_)
: IRowSchemaReader(in_, format_settings)
IRowSchemaReader::IRowSchemaReader(ReadBuffer & in_, const FormatSettings & format_settings_, DataTypePtr default_type_)
: IRowSchemaReader(in_, format_settings_)
{
default_type = default_type_;
}
IRowSchemaReader::IRowSchemaReader(ReadBuffer & in_, const FormatSettings & format_settings, const DataTypes & default_types_)
: IRowSchemaReader(in_, format_settings)
IRowSchemaReader::IRowSchemaReader(ReadBuffer & in_, const FormatSettings & format_settings_, const DataTypes & default_types_)
: IRowSchemaReader(in_, format_settings_)
{
default_types = default_types_;
}
@ -116,7 +120,8 @@ NamesAndTypesList IRowSchemaReader::readSchema()
if (!new_data_types[i])
continue;
chooseResultColumnType(data_types[i], new_data_types[i], common_type_checker, getDefaultType(i), std::to_string(i + 1), rows_read);
auto transform_types_if_needed = [&](DataTypePtr & type, DataTypePtr & new_type){ transformTypesIfNeeded(type, new_type, i); };
chooseResultColumnType(data_types[i], new_data_types[i], transform_types_if_needed, getDefaultType(i), std::to_string(i + 1), rows_read);
}
}
@ -156,8 +161,13 @@ DataTypePtr IRowSchemaReader::getDefaultType(size_t column) const
return nullptr;
}
IRowWithNamesSchemaReader::IRowWithNamesSchemaReader(ReadBuffer & in_, DataTypePtr default_type_)
: ISchemaReader(in_), default_type(default_type_)
void IRowSchemaReader::transformTypesIfNeeded(DataTypePtr & type, DataTypePtr & new_type, size_t)
{
transformInferredTypesIfNeeded(type, new_type, format_settings);
}
IRowWithNamesSchemaReader::IRowWithNamesSchemaReader(ReadBuffer & in_, const FormatSettings & format_settings_, DataTypePtr default_type_)
: ISchemaReader(in_), format_settings(format_settings_), default_type(default_type_)
{
}
@ -181,6 +191,7 @@ NamesAndTypesList IRowWithNamesSchemaReader::readSchema()
names_order.push_back(name);
}
auto transform_types_if_needed = [&](DataTypePtr & type, DataTypePtr & new_type){ transformTypesIfNeeded(type, new_type); };
for (rows_read = 1; rows_read < max_rows_to_read; ++rows_read)
{
auto new_names_and_types = readRowAndGetNamesAndDataTypes(eof);
@ -188,7 +199,7 @@ NamesAndTypesList IRowWithNamesSchemaReader::readSchema()
/// We reached eof.
break;
for (const auto & [name, new_type] : new_names_and_types)
for (auto & [name, new_type] : new_names_and_types)
{
auto it = names_to_types.find(name);
/// If we didn't see this column before, just add it.
@ -200,7 +211,7 @@ NamesAndTypesList IRowWithNamesSchemaReader::readSchema()
}
auto & type = it->second;
chooseResultColumnType(type, new_type, common_type_checker, default_type, name, rows_read);
chooseResultColumnType(type, new_type, transform_types_if_needed, default_type, name, rows_read);
}
}
@ -219,4 +230,9 @@ NamesAndTypesList IRowWithNamesSchemaReader::readSchema()
return result;
}
void IRowWithNamesSchemaReader::transformTypesIfNeeded(DataTypePtr & type, DataTypePtr & new_type)
{
transformInferredTypesIfNeeded(type, new_type, format_settings);
}
}

View File

@ -53,8 +53,6 @@ public:
NamesAndTypesList readSchema() override;
void setCommonTypeChecker(CommonDataTypeChecker checker) { common_type_checker = checker; }
protected:
/// Read one row and determine types of columns in it.
/// Return types in the same order in which the values were in the row.
@ -67,6 +65,10 @@ protected:
void setMaxRowsToRead(size_t max_rows) override { max_rows_to_read = max_rows; }
size_t getNumRowsRead() const override { return rows_read; }
FormatSettings format_settings;
virtual void transformTypesIfNeeded(DataTypePtr & type, DataTypePtr & new_type, size_t column_idx);
private:
DataTypePtr getDefaultType(size_t column) const;
@ -74,7 +76,6 @@ private:
size_t rows_read = 0;
DataTypePtr default_type;
DataTypes default_types;
CommonDataTypeChecker common_type_checker;
std::vector<String> column_names;
};
@ -86,12 +87,10 @@ private:
class IRowWithNamesSchemaReader : public ISchemaReader
{
public:
IRowWithNamesSchemaReader(ReadBuffer & in_, DataTypePtr default_type_ = nullptr);
IRowWithNamesSchemaReader(ReadBuffer & in_, const FormatSettings & format_settings_, DataTypePtr default_type_ = nullptr);
NamesAndTypesList readSchema() override;
bool hasStrictOrderOfColumns() const override { return false; }
void setCommonTypeChecker(CommonDataTypeChecker checker) { common_type_checker = checker; }
protected:
/// Read one row and determine types of columns in it.
/// Return list with names and types.
@ -102,11 +101,14 @@ protected:
void setMaxRowsToRead(size_t max_rows) override { max_rows_to_read = max_rows; }
size_t getNumRowsRead() const override { return rows_read; }
virtual void transformTypesIfNeeded(DataTypePtr & type, DataTypePtr & new_type);
FormatSettings format_settings;
private:
size_t max_rows_to_read;
size_t rows_read = 0;
DataTypePtr default_type;
CommonDataTypeChecker common_type_checker;
};
/// Base class for schema inference for formats that don't need any data to
@ -122,8 +124,8 @@ public:
void chooseResultColumnType(
DataTypePtr & type,
const DataTypePtr & new_type,
CommonDataTypeChecker common_type_checker,
DataTypePtr & new_type,
std::function<void(DataTypePtr &, DataTypePtr &)> transform_types_if_needed,
const DataTypePtr & default_type,
const String & column_name,
size_t row);

View File

@ -318,6 +318,11 @@ DataTypes CustomSeparatedSchemaReader::readRowAndGetDataTypes()
return determineDataTypesByEscapingRule(fields, reader.getFormatSettings(), reader.getEscapingRule());
}
void CustomSeparatedSchemaReader::transformTypesIfNeeded(DataTypePtr & type, DataTypePtr & new_type, size_t)
{
transformInferredTypesIfNeeded(type, new_type, format_settings, reader.getEscapingRule());
}
void registerInputFormatCustomSeparated(FormatFactory & factory)
{
for (bool ignore_spaces : {false, true})

View File

@ -97,6 +97,8 @@ public:
private:
DataTypes readRowAndGetDataTypes() override;
void transformTypesIfNeeded(DataTypePtr & type, DataTypePtr & new_type, size_t) override;
PeekableReadBuffer buf;
CustomSeparatedFormatReader reader;
bool first_row = true;

View File

@ -1,5 +1,6 @@
#include <Processors/Formats/Impl/JSONColumnsBlockInputFormatBase.h>
#include <Formats/JSONUtils.h>
#include <Formats/EscapingRuleUtils.h>
#include <IO/ReadHelpers.h>
#include <base/find_symbols.h>
@ -181,13 +182,14 @@ JSONColumnsSchemaReaderBase::JSONColumnsSchemaReaderBase(
{
}
void JSONColumnsSchemaReaderBase::chooseResulType(DataTypePtr & type, const DataTypePtr & new_type, const String & column_name, size_t row) const
void JSONColumnsSchemaReaderBase::chooseResulType(DataTypePtr & type, DataTypePtr & new_type, const String & column_name, size_t row) const
{
auto common_type_checker = [&](const DataTypePtr & first, const DataTypePtr & second)
auto convert_types_if_needed = [&](DataTypePtr & first, DataTypePtr & second)
{
return JSONUtils::getCommonTypeForJSONFormats(first, second, format_settings.json.read_bools_as_numbers);
DataTypes types = {first, second};
transformInferredJSONTypesIfNeeded(types, format_settings);
};
chooseResultColumnType(type, new_type, common_type_checker, nullptr, column_name, row);
chooseResultColumnType(type, new_type, convert_types_if_needed, nullptr, column_name, row);
}
NamesAndTypesList JSONColumnsSchemaReaderBase::readSchema()
@ -260,7 +262,7 @@ DataTypePtr JSONColumnsSchemaReaderBase::readColumnAndGetDataType(const String &
}
readJSONField(field, in);
DataTypePtr field_type = JSONUtils::getDataTypeFromField(field);
DataTypePtr field_type = JSONUtils::getDataTypeFromField(field, format_settings);
chooseResulType(column_type, field_type, column_name, rows_read);
++rows_read;
}

View File

@ -83,7 +83,7 @@ private:
DataTypePtr readColumnAndGetDataType(const String & column_name, size_t & rows_read, size_t max_rows_to_read);
/// Choose result type for column from two inferred types from different rows.
void chooseResulType(DataTypePtr & type, const DataTypePtr & new_type, const String & column_name, size_t row) const;
void chooseResulType(DataTypePtr & type, DataTypePtr & new_type, const String & column_name, size_t row) const;
const FormatSettings format_settings;
std::unique_ptr<JSONColumnsReaderBase> reader;

View File

@ -6,6 +6,7 @@
#include <Formats/FormatFactory.h>
#include <Formats/verbosePrintString.h>
#include <Formats/JSONUtils.h>
#include <Formats/EscapingRuleUtils.h>
#include <Formats/registerWithNamesAndTypes.h>
#include <DataTypes/NestedUtils.h>
#include <DataTypes/Serializations/SerializationNullable.h>
@ -187,11 +188,6 @@ JSONCompactEachRowRowSchemaReader::JSONCompactEachRowRowSchemaReader(
: FormatWithNamesAndTypesSchemaReader(in_, format_settings_, with_names_, with_types_, &reader)
, reader(in_, yield_strings_, format_settings_)
{
bool allow_bools_as_numbers = format_settings_.json.read_bools_as_numbers;
setCommonTypeChecker([allow_bools_as_numbers](const DataTypePtr & first, const DataTypePtr & second)
{
return JSONUtils::getCommonTypeForJSONFormats(first, second, allow_bools_as_numbers);
});
}
DataTypes JSONCompactEachRowRowSchemaReader::readRowAndGetDataTypes()
@ -210,7 +206,12 @@ DataTypes JSONCompactEachRowRowSchemaReader::readRowAndGetDataTypes()
if (in.eof())
return {};
return JSONUtils::readRowAndGetDataTypesForJSONCompactEachRow(in, reader.yieldStrings());
return JSONUtils::readRowAndGetDataTypesForJSONCompactEachRow(in, format_settings, reader.yieldStrings());
}
void JSONCompactEachRowRowSchemaReader::transformTypesIfNeeded(DataTypePtr & type, DataTypePtr & new_type, size_t)
{
transformInferredJSONTypesIfNeeded(type, new_type, format_settings);
}
void registerInputFormatJSONCompactEachRow(FormatFactory & factory)

View File

@ -80,6 +80,8 @@ public:
private:
DataTypes readRowAndGetDataTypes() override;
void transformTypesIfNeeded(DataTypePtr & type, DataTypePtr & new_type, size_t) override;
JSONCompactEachRowFormatReader reader;
bool first_row = true;
};

View File

@ -3,6 +3,7 @@
#include <Processors/Formats/Impl/JSONEachRowRowInputFormat.h>
#include <Formats/JSONUtils.h>
#include <Formats/EscapingRuleUtils.h>
#include <Formats/FormatFactory.h>
#include <DataTypes/NestedUtils.h>
#include <DataTypes/Serializations/SerializationNullable.h>
@ -306,18 +307,12 @@ void JSONEachRowRowInputFormat::readSuffix()
assertEOF(*in);
}
JSONEachRowSchemaReader::JSONEachRowSchemaReader(ReadBuffer & in_, bool json_strings_, const FormatSettings & format_settings)
: IRowWithNamesSchemaReader(in_)
JSONEachRowSchemaReader::JSONEachRowSchemaReader(ReadBuffer & in_, bool json_strings_, const FormatSettings & format_settings_)
: IRowWithNamesSchemaReader(in_, format_settings_)
, json_strings(json_strings_)
{
bool allow_bools_as_numbers = format_settings.json.read_bools_as_numbers;
setCommonTypeChecker([allow_bools_as_numbers](const DataTypePtr & first, const DataTypePtr & second)
{
return JSONUtils::getCommonTypeForJSONFormats(first, second, allow_bools_as_numbers);
});
}
NamesAndTypesList JSONEachRowSchemaReader::readRowAndGetNamesAndDataTypes(bool & eof)
{
if (first_row)
@ -350,7 +345,12 @@ NamesAndTypesList JSONEachRowSchemaReader::readRowAndGetNamesAndDataTypes(bool &
return {};
}
return JSONUtils::readRowAndGetNamesAndDataTypesForJSONEachRow(in, json_strings);
return JSONUtils::readRowAndGetNamesAndDataTypesForJSONEachRow(in, format_settings, json_strings);
}
void JSONEachRowSchemaReader::transformTypesIfNeeded(DataTypePtr & type, DataTypePtr & new_type)
{
transformInferredJSONTypesIfNeeded(type, new_type, format_settings);
}
void registerInputFormatJSONEachRow(FormatFactory & factory)

Some files were not shown because too many files have changed in this diff.