Commit Graph

42795 Commits

Author SHA1 Message Date
Kruglov Pavel
dcc3efe4b7
Merge pull request #50364 from Avogar/allow-skip-empty-files
Allow to skip empty files in file/s3/url/hdfs table functions
2023-06-16 14:59:49 +02:00
Alexander Tokmakov
9260a1bf2e
Merge pull request #51006 from ClickHouse/followup_50448
Follow-up to #50448
2023-06-16 15:32:12 +03:00
Kruglov Pavel
f8ddfb1fd8
Merge branch 'master' into allow-skip-empty-files 2023-06-16 13:23:41 +02:00
Kruglov Pavel
11f176dd19
Merge pull request #50712 from KevinyhZou/bug_fix_csv_parse_by_tab_delimiter
Support CSVInputFormat to read csv file by whitespace & tab delimiter
2023-06-16 13:16:22 +02:00
Han Fei
6bbd0d144d
Merge pull request #50951 from ZhiguoZh/20230607-toyear-fix
Optimization of predicates with toYear/toYYYYMM based on a general solution
2023-06-16 00:32:55 +02:00
Robert Schulze
74cb79769b
Merge pull request #50925 from arenadata/ADQM-812
Implement support of syslog format in the parseDateTimeBestEffort() function
2023-06-15 21:04:50 +02:00
Alexander Tokmakov
353b57f13f an optimization for alters and replicated db 2023-06-15 20:20:11 +02:00
Alexander Gololobov
f08fd758fa Make multiple parallel list requests 2023-06-15 18:45:10 +02:00
Nikolay Degterinsky
b963369920
Merge pull request #50865 from jmaicher/fix/50864/verification-cooldown-cache-entry
Fix type of LDAP server params hash in cache entry
2023-06-15 16:21:44 +02:00
Zhiguo Zhou
ff6629d1d1 Enhance safety of function generateOptimizedDateFilterAST
This commit checks the corner case where the comparator is none
of equals, notEquals, less, lessOrEquals, greater, greaterOrEquals,
and throws LOGICAL_ERROR exception if so.
2023-06-15 22:04:13 +08:00
Alexander Tokmakov
b248ba730a
Merge pull request #50997 from vitlibar/use-hash_of_all_files-for-parts
Use hash_of_all_files to check identity of parts during on-cluster backups
2023-06-15 16:54:55 +03:00
Zhiguo Zhou
d780d0bab1 fix style
Move error throws in Transform to FunctionDateOrDateTimeToSomething.
2023-06-15 21:48:02 +08:00
Han Fei
9e81b2fd5d fix style 2023-06-15 21:48:02 +08:00
Zhiguo Zhou
d14299eb09 The general optimization of predicates with date/datetime converters
As is suggested in issue #15257, the function preimage is a general
solution to the optimization problem with predicates containing the
date and datetime converters. This commit implements the idea by
adding the new methods, hasInformationAboutPreimage and getPreimage,
to IFunction/IFunctionBase, and having the specific convert functions
define their own preimage. Moreover, we added a new pass in the
TreeOptimizer and a new AST visitor for in-place rewriting the AST
with the converters' preimage.

Specifically, the optimization is applied to toYear and toYYYYMM.
2023-06-15 21:48:02 +08:00
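The preimage idea described in this commit message can be illustrated with a minimal sketch. This is not ClickHouse's C++ implementation (which the message says lives in hasInformationAboutPreimage/getPreimage and a TreeOptimizer pass); the function names below are hypothetical, and the sketch only handles the equality comparator. It shows why a predicate such as toYear(d) = 2023 can be replaced by a range check on d itself.

```python
from datetime import date


def to_year_preimage(year: int) -> tuple[date, date]:
    """Half-open range [start, end) of dates whose year equals `year`."""
    return date(year, 1, 1), date(year + 1, 1, 1)


def to_yyyymm_preimage(yyyymm: int) -> tuple[date, date]:
    """Half-open range [start, end) of dates whose YYYYMM value equals `yyyymm`."""
    year, month = divmod(yyyymm, 100)
    start = date(year, month, 1)
    # December rolls over into January of the next year.
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start, end


def rewrite_equals(column: str, preimage: tuple[date, date]) -> str:
    """Rewrite `f(column) = value` as a range predicate on the raw column."""
    start, end = preimage
    return f"{column} >= '{start}' AND {column} < '{end}'"


if __name__ == "__main__":
    print(rewrite_equals("d", to_year_preimage(2023)))
    print(rewrite_equals("d", to_yyyymm_preimage(202312)))
```

Because the rewritten predicate compares the column directly, it no longer requires evaluating the converter on every row, which is the motivation the commit message gives for the optimization.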
Alexander Tokmakov
dc0a224f52 fix 2023-06-15 15:05:17 +02:00
avogar
870f3d1270 Fix comments 2023-06-15 12:59:46 +00:00
Kseniia Sumarokova
c15e7b93cb
Merge pull request #50976 from kssenii/fix-data-lakes-too-many-head-requests
Fix data lakes slowness because of synchronous head requests
2023-06-15 13:33:09 +02:00
Kseniia Sumarokova
31e08635bf
Merge branch 'master' into unify-priorities-pools 2023-06-15 12:51:45 +02:00
Igor Nikonov
1113a7c524
Merge pull request #50214 from azat/parallelize_output_from_storages-fix
Disable parallelize_output_from_storages for processing MATERIALIZED VIEWs and storages with one block only
2023-06-15 12:48:54 +02:00
Kseniia Sumarokova
b7fbc4dd8e
Merge pull request #50977 from valbok/empty-table-overrides
MaterializedMySQL: Keep parentheses for empty table overrides
2023-06-15 12:48:04 +02:00
Alexander Tokmakov
a018d9ca11 try to fix false-positive 'part is lost forever' 2023-06-15 12:25:16 +02:00
Kruglov Pavel
7aea4a1f10
Merge branch 'master' into allow-skip-empty-files 2023-06-15 12:07:24 +02:00
Antonio Andelic
c1faf42481
Merge pull request #50967 from baibaichen/feature/fix_build_clang15
fix build issue on clang 15
2023-06-15 10:32:38 +02:00
Kseniia Sumarokova
c786fbf8bd
Add comment 2023-06-15 10:22:02 +02:00
Kseniia Sumarokova
c8619ee6e4
Merge pull request #50974 from kssenii/iceberg-metadata-fix
Fix iceberg V2 optional metadata parsing
2023-06-15 09:31:36 +02:00
KevinyhZou
953f40aa3b
Merge branch 'master' into bug_fix_csv_parse_by_tab_delimiter 2023-06-15 10:25:19 +08:00
Michael Kolupaev
badde0fde2 Print git hash when crashing 2023-06-14 15:22:44 -07:00
Jiebin Sun
fcadb851c8
Maintain per-thread timer_id rather than create/delete frequently (#48778)
* Maintain per-thread timer_id rather than create/delete frequently

The QueryProfiler will frequently create/delete timer_id globally, which
will result in heavy kernel lock contention.
The idea is to maintain a thread-local timer_id. Before creating the
timer_id, it should check whether a timer_id already exists. And we
could stop the timer with timer_settime() rather than delete the timer_id
with timer_delete().

Apply the patch and run clickbench on latest 65d671b7c7 ClickHouse with
SPR 112 x 2 vCPUs. Query 4, 0, 5, 3, 15, 32 have 17.5%, 14.4%, 8.3%, 7.9%,
7.1%, 5.8% performance gain. The overall geomean has got 2.5%
performance gain.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* Pack the timer and delete the timer_id when thread terminates

Pack the timer and related methods into the class. Delete the timer_id
when the thread terminates.

According to the issue (ClickHouse#49965),
all of the SSB queries benefit from this optimization; some even
improved by ~30%, and the overall QPS could be significantly improved by ~18%.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* Update src/Common/QueryProfiler.cpp

Co-authored-by: Azat Khuzhin <a3at.mail@gmail.com>

* Update src/Common/QueryProfiler.cpp

Co-authored-by: Azat Khuzhin <a3at.mail@gmail.com>

* Fix the review issue of QueryProfiler Timer from PR
https://github.com/ClickHouse/ClickHouse/pull/48778.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

* Update src/Common/QueryProfiler.cpp

Co-authored-by: Azat Khuzhin <a3at.mail@gmail.com>

* Add two separate CurrentMetrics for created and active timers
in QueryProfiler.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>

---------

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
Co-authored-by: Azat Khuzhin <a3at.mail@gmail.com>
Co-authored-by: Nikita Taranov <nikita.taranov@clickhouse.com>
2023-06-14 21:22:09 +02:00
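The reuse pattern described in this commit (keep one timer per thread, disarm it instead of deleting it) can be sketched as follows. This is an illustration only: FakeTimer is a hypothetical stand-in for a kernel timer handle, not a POSIX timer, and none of these names come from QueryProfiler.cpp.

```python
import threading


class FakeTimer:
    """Hypothetical stand-in for a kernel timer handle (not a real POSIX timer)."""

    created = 0  # counts how many timers were ever created
    _lock = threading.Lock()

    def __init__(self) -> None:
        with FakeTimer._lock:
            FakeTimer.created += 1
        self.armed = False

    def arm(self) -> None:
        self.armed = True

    def disarm(self) -> None:
        # Analogous to stopping the timer rather than deleting it.
        self.armed = False


_local = threading.local()


def get_thread_timer() -> FakeTimer:
    """Return this thread's timer, creating it only on first use."""
    timer = getattr(_local, "timer", None)
    if timer is None:
        timer = FakeTimer()
        _local.timer = timer
    return timer


def profile_query(work):
    """Arm the per-thread timer around `work`, then disarm it for reuse."""
    timer = get_thread_timer()
    timer.arm()
    try:
        return work()
    finally:
        timer.disarm()
```

Calling profile_query repeatedly on the same thread creates the timer once; a second thread gets its own timer, so no global create/delete traffic occurs per query.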
Vitaly Baranov
6366940a37 Use hash_of_all_files from system.parts to check identity of parts during on-cluster backups. 2023-06-14 20:39:50 +02:00
Azat Khuzhin
e9c9db9335 Disable parallelize_output_from_storages for all system tables
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-06-14 19:11:23 +03:00
Azat Khuzhin
3c4f638bd7 Disable parallelize_output_from_storages for storages with only one block
$ clickhouse-benchmark -i 10000 -d 0 --parallelize_output_from_storages=0 -q "select * from values('foo')"
    Loaded 1 queries.

    Queries executed: 10000.

    localhost:9000, queries 10000, QPS: 2800.490, RPS: 2800.490, MiB/s: 0.032, result RPS: 2800.490, result MiB/s: 0.032.

    0.000%          0.000 sec.
    10.000%         0.000 sec.
    20.000%         0.000 sec.
    30.000%         0.000 sec.
    40.000%         0.000 sec.
    50.000%         0.000 sec.
    60.000%         0.000 sec.
    70.000%         0.000 sec.
    80.000%         0.000 sec.
    90.000%         0.000 sec.
    95.000%         0.000 sec.
    99.000%         0.001 sec.
    99.900%         0.001 sec.
    99.990%         0.001 sec.

    $ clickhouse-benchmark -i 10000 -d 0 --parallelize_output_from_storages=1 -q "select * from values('foo')"
    Loaded 1 queries.

    Queries executed: 10000.

    localhost:9000, queries 10000, QPS: 1259.805, RPS: 1259.805, MiB/s: 0.014, result RPS: 1259.805, result MiB/s: 0.014.

    0.000%          0.001 sec.
    10.000%         0.001 sec.
    20.000%         0.001 sec.
    30.000%         0.001 sec.
    40.000%         0.001 sec.
    50.000%         0.001 sec.
    60.000%         0.001 sec.
    70.000%         0.001 sec.
    80.000%         0.001 sec.
    90.000%         0.001 sec.
    95.000%         0.001 sec.
    99.000%         0.001 sec.
    99.900%         0.001 sec.
    99.990%         0.003 sec.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-06-14 19:11:23 +03:00
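The two clickhouse-benchmark runs quoted in this commit message can be compared with a short calculation. The QPS figures are copied from the output above; the variable and function names are illustrative.

```python
# QPS values taken from the two benchmark runs in the commit message.
qps_setting_off = 2800.490  # --parallelize_output_from_storages=0
qps_setting_on = 1259.805   # --parallelize_output_from_storages=1


def throughput_ratio(faster: float, slower: float) -> float:
    """How many times higher the first throughput is than the second."""
    return faster / slower


ratio = throughput_ratio(qps_setting_off, qps_setting_on)

if __name__ == "__main__":
    print(f"Setting off vs. on: {ratio:.2f}x the QPS for this single-block query")
```

For this one-block query the run with the setting disabled reaches a little more than twice the QPS of the run with it enabled, which matches the motivation stated in the commit title.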
Azat Khuzhin
3e419730c3 Disable parallelize_output_from_storages for processing MATERIALIZED VIEWs
Adding more processors for parallelize_output_from_storages is not a
costless operation (I've experienced some issues in production because
of this), and it is not easy to fix in a normal way, so let's disable it
for now.

Before this patch:
- INSERT INTO input SELECT * FROM numbers(10e6) SETTINGS parallelize_output_from_storages=1, min_insert_block_size_rows=1000
  0 rows in set. Elapsed: 3.648 sec. Processed 20.00 million rows, 120.00 MB (5.48 million rows/s., 32.90 MB/s.)

- INSERT INTO input SELECT * FROM numbers(10e6) SETTINGS parallelize_output_from_storages=0, min_insert_block_size_rows=1000
  0 rows in set. Elapsed: 1.851 sec. Processed 20.00 million rows, 120.00 MB (10.80 million rows/s., 64.82 MB/s.)

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-06-14 19:11:23 +03:00
Dmitry Novik
1d88b16830
Merge pull request #50584 from ClickHouse/analyzer-optimizations-on-shards
Analyzer: Do not apply Query Tree optimizations on shards
2023-06-14 14:31:46 +02:00
Val Doroshchuk
e7c5991b39 MaterializedMySQL: Keep parentheses for empty table overrides
Empty table overrides are formatted without any parentheses,
but they are required by the parser,
and it is not possible to parse empty table overrides without them.

:) CREATE DATABASE db ... TABLE OVERRIDE t1()

CREATE DATABASE db
...
TABLE OVERRIDE `t1`

This query will be saved to metadata and ClickHouse will not be able to start up, since
table overrides require ().
2023-06-14 13:37:49 +02:00
Nikolai Kochetov
5163927b7d
Merge pull request #50923 from ClickHouse/read-in-order-proj-bug
Do not apply projection if read-in-order was enabled.
2023-06-14 13:13:25 +02:00
Nikita Taranov
a00e42f75b
Fix erroneous sort_description propagation in CreatingSets (#50955) 2023-06-14 13:00:47 +02:00
kssenii
827ac17dc2 Fix 2023-06-14 12:59:06 +02:00
kssenii
47a6db596e Better 2023-06-14 11:37:57 +02:00
kssenii
2a3ef3941e Fix 2023-06-14 11:31:28 +02:00
Kseniia Sumarokova
e2d8299b23
Merge pull request #50952 from nickitat/fix_remote_read_perf_degr
Fix logic in `AsynchronousBoundedReadBuffer::seek`
2023-06-14 11:25:26 +02:00
Robert Schulze
2643fd2c25
Merge pull request #50689 from arenadata/ADQM-871
Added connection string to clickhouse-client
2023-06-14 10:39:32 +02:00
Robert Schulze
edd29492c3
Merge pull request #50893 from rschu1ze/snowflake-crash
Fix LOGICAL_ERROR in snowflakeToDateTime*()
2023-06-14 09:42:21 +02:00
Chang Chen
86694847c6 using Reader instead of typename CapnpType::Reader 2023-06-14 15:22:32 +08:00
Chang Chen
e281026e00 fix build issue on clang 15 2023-06-14 12:29:55 +08:00
Alexey Gerasimchuck
868c3bd45d minor change 2023-06-14 04:29:08 +00:00
kevinyhzou
f3b99156ac review fix 2023-06-14 10:48:21 +08:00
pufit
91d794cf0a
Merge pull request #50150 from JackyWoo/support_redis
Add Redis table Engine/Function
2023-06-13 21:34:14 -04:00
Alexey Gerasimchuck
f1b5d47ce2 corrections after second review iteration 2023-06-14 01:26:39 +00:00
Nikita Taranov
1bc5598aa7 impl 2023-06-13 20:02:50 +02:00
Victor Krasnov
f01b96f9f9
Merge branch 'master' into ADQM-812 2023-06-13 19:52:10 +03:00
Robert Schulze
76f69f2b44
Revert overengineering 2023-06-13 15:52:06 +00:00
Kruglov Pavel
39ba925f8b
Merge branch 'master' into allow-skip-empty-files 2023-06-13 17:17:26 +02:00
Kruglov Pavel
607f337d67
Merge pull request #50592 from Avogar/max-bytes-to-read-in-schema-inference
Add setting to limit the number of bytes to read in schema inference
2023-06-13 16:47:57 +02:00
Kruglov Pavel
2bfb15cf81
Merge pull request #50620 from Avogar/increase-bitmap-max-array-sixe
Increase max array size in group bitmap
2023-06-13 16:47:32 +02:00
avogar
2e1f56ae33 Address comments 2023-06-13 14:43:50 +00:00
Nikita Mikhaylov
52a460df67
Tests with parallel replicas are no more "always green" (#50896) 2023-06-13 16:43:35 +02:00
Kruglov Pavel
8fdcd91c38
Merge pull request #49752 from Avogar/better-capnproto-3
Refactor CapnProto format to improve input/output performance
2023-06-13 16:20:38 +02:00
Anton Popov
79f3300709
Merge pull request #50726 from CurtizJ/enable-mutations-throttling
Enable settings for mutations throttling by default
2023-06-13 14:57:36 +02:00
Nikolai Kochetov
eddd932636 Do not apply projection if read-in-order was enabled. 2023-06-13 12:34:26 +00:00
Robert Schulze
3e3b8ff5f6
More robustness 2023-06-13 12:24:31 +00:00
Kruglov Pavel
cbed327077
Merge pull request #50635 from Avogar/skip-trailing-empty-lines
Allow to skip trailing empty lines in CSV/TSV/CustomSeparated formats
2023-06-13 12:43:43 +02:00
Julian Maicher
6201947b45
Merge branch 'master' into fix/50864/verification-cooldown-cache-entry 2023-06-13 11:51:53 +02:00
Kseniia Sumarokova
0ab3dc9261
A bit safer UserDefinedSQLFunctionVisitor (#50913)
* Update UserDefinedSQLFunctionVisitor.cpp

* Update UserDefinedSQLFunctionVisitor.cpp

---------

Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>
2023-06-13 11:25:13 +02:00
Michael Kolupaev
72f2832129
Slightly more information in error message about cached disk (#50897) 2023-06-13 11:07:05 +02:00
Robert Schulze
79bc884733
Stabilize tests 2023-06-13 08:56:22 +00:00
Kseniia Sumarokova
9a3cea379f
Merge pull request #50723 from kssenii/fix-data-race-in-read-buffer
Fix data race in log message of cached buffer
2023-06-13 10:25:18 +02:00
Robert Schulze
d8d352b2c5
Merge pull request #50912 from rschu1ze/annoy-docs
Update Annoy docs
2023-06-13 10:03:50 +02:00
Robert Schulze
8358d29ac7
Merge pull request #50405 from ClibMouse/feature/reservoir-sampler-big-endian-support
Implement big-endian support for the deterministic reservoir sampler
2023-06-13 09:55:23 +02:00
Alexey Gerasimchuck
e3a13111ae
Merge branch 'master' into ADQM-871 2023-06-13 14:05:13 +10:00
Victor Krasnov
b3ef2a4860 Merge branch 'master' of github.com:ClickHouse/ClickHouse into ADQM-812 2023-06-13 03:55:38 +00:00
Alexey Gerasimchuck
2395b25f9e Changes after review 2023-06-13 01:55:34 +00:00
JackyWoo
9d548315e8
Merge branch 'master' into support_redis 2023-06-13 09:34:32 +08:00
JackyWoo
959fde4491 add notifications in docs 2023-06-13 09:33:38 +08:00
Vitaly Baranov
23cff1fc32
Merge pull request #50889 from vitlibar/fix-checking-lock-file-too-often-while-writing-backup
Fix checking the lock file too often while writing a backup
2023-06-13 02:42:51 +02:00
ltrk2
002c15823c Perform in-place endianness transform because of padding 2023-06-12 16:44:46 -07:00
Alexey Milovidov
4e7cd2da01
Merge pull request #50811 from ClickHouse/tavplubix-patch-6
Don't mark a part as broken on `Poco::TimeoutException`
2023-06-12 23:17:37 +03:00
Alexey Milovidov
5cdf893f3a
Merge pull request #50835 from ClickHouse/rename-async-metrics
Replace CPU CGroups metrics to one
2023-06-12 23:15:05 +03:00
Robert Schulze
4f39ee51ae
Update Annoy docs 2023-06-12 20:06:57 +00:00
Robert Schulze
7745f7da73
Merge branch 'master' into annoy-misc 2023-06-12 21:46:27 +02:00
ltrk2
edb4a644b1 Update FunctionsCodingIP.cpp 2023-06-12 14:23:22 -04:00
ltrk2
a4285d56b2 Fix compilation error on big-endian platforms 2023-06-12 14:23:22 -04:00
Kruglov Pavel
e4838725e3
Merge branch 'master' into allow-skip-empty-files 2023-06-12 20:03:23 +02:00
Robert Schulze
65d83e45cb
Fix crash in snowflakeToDateTime(), follow-up to #50834 2023-06-12 16:21:49 +00:00
Vitaly Baranov
5aa0566767 Fix checking the lock file too often while writing a backup. 2023-06-12 18:13:26 +02:00
Robert Schulze
128e8c20d5
Merge pull request #50709 from arenadata/ADQM-867
Added numeric arguments support to some Date/DateTime conversion functions
2023-06-12 17:08:14 +02:00
Dmitry Novik
26c9bda144 Add a comment 2023-06-12 13:54:45 +00:00
Anton Popov
d45f07743c fix getting number of mutations 2023-06-12 13:54:07 +00:00
Alexander Tokmakov
01c7d2fe71
Postpone check of outdated parts (#50676)
* postpone check of outdated parts

* Update ReplicatedMergeTreePartCheckThread.cpp
2023-06-12 16:53:26 +03:00
Alexander Tokmakov
5db3b393d8
Update MergeTreeData.cpp 2023-06-12 16:22:33 +03:00
Han Fei
d47cdd4eb6
Merge pull request #50605 from ClickHouse/revert-50467-revert-50430-hanfei/fix-crossjoin-filter-pushdown
Revert "Revert "make filter push down through cross join"" and suppress a test
2023-06-12 15:12:58 +02:00
Anton Popov
32caf87163
Merge pull request #50104 from amosbird/fix_43107_47549
Proper mutation of skip indices and projections
2023-06-12 15:09:18 +02:00
Kruglov Pavel
873cee9451
Merge pull request #49626 from alekseygolub/renamefile
Added option to rename files, loaded via TableFunctionFile, after successful processing
2023-06-12 15:01:22 +02:00
Kruglov Pavel
edd47a2281
Merge branch 'master' into skip-trailing-empty-lines 2023-06-12 13:57:15 +02:00
Kruglov Pavel
e03cd725b0
Merge pull request #50602 from Avogar/null-as-default-schema-inference
Respect setting input_format_null_as_default in schema inference
2023-06-12 13:45:52 +02:00
Kruglov Pavel
da68980b8d
Merge branch 'master' into max-bytes-to-read-in-schema-inference 2023-06-12 13:45:31 +02:00
Kruglov Pavel
24d70a2afd
Fix 2023-06-12 13:37:59 +02:00
avogar
5cec4c3161 Fallback to parsing big integer from String instead of exception in Parquet format 2023-06-12 11:34:40 +00:00
Robert Schulze
6da002e250
Merge pull request #50834 from rschu1ze/non-const-tz
Add compat setting for non-const timezones
2023-06-12 13:33:55 +02:00
Alexander Tokmakov
676ba2fbde
Update MergeTreeData.cpp 2023-06-12 12:30:38 +03:00
Julian Maicher
c378c3fcbb Fix type of LDAP server params hash in cache entry
In 1ed7ad57d9, we switched from (`size_t`, usually 64bit) to SipHash (128bit) and forgot to change the type of the cache entry. This broke the caching of successful LDAP authentication requests (verification cooldown).

Fixes #50864
2023-06-12 10:58:35 +02:00
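One way a width mismatch like the one described in this commit can defeat a cache is sketched below. The commit message does not spell out the failure mechanism, so this assumes the 128-bit hash is truncated when stored in a 64-bit field. The class and function names are hypothetical, and SHA-256 truncated to 128 bits stands in for SipHash.

```python
import hashlib

MASK_64 = (1 << 64) - 1


def params_hash_128(params: str) -> int:
    """Stand-in 128-bit hash of the server parameters (not SipHash)."""
    return int.from_bytes(hashlib.sha256(params.encode()).digest()[:16], "big")


class CacheEntry64:
    """Entry whose hash field is still 64 bits wide (the assumed bug)."""

    def __init__(self, params_hash: int) -> None:
        self.last_params_hash = params_hash & MASK_64  # upper bits are lost


class CacheEntry128:
    """Entry whose hash field matches the width of the hash."""

    def __init__(self, params_hash: int) -> None:
        self.last_params_hash = params_hash


def is_cache_hit(entry, current_hash: int) -> bool:
    """A cached result is reused only if the stored hash equals the current one."""
    return entry.last_params_hash == current_hash


if __name__ == "__main__":
    h = params_hash_128("host=ldap.example;port=636")
    print("64-bit entry hit: ", is_cache_hit(CacheEntry64(h), h))
    print("128-bit entry hit:", is_cache_hit(CacheEntry128(h), h))
```

Under that assumption the stored value never equals the freshly computed 128-bit hash, so every lookup is treated as a miss even though the parameters did not change.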