Commit Graph

109082 Commits

Author SHA1 Message Date
Yakov Olkhovskiy
99095446af review suggestions 2023-02-22 17:22:13 +00:00
Jiebin Sun
1f62135ba7 Make the optimized SIMD StringSearcher clean
This patch revises value names and adds comments to make the SIMD
StringSearcher cleaner and easier to understand, following pull
request 46289.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
2023-02-22 12:18:21 -05:00
Jiebin Sun
d220e7f4fc Optimize the SIMD StringSearcher if needle_size is large
This patch offers an additional optimization when the needle_size is
large. If the needle_size is larger than the haystack_size, there is
no need to search any more.

The optimized SIMD StringSearcher outperforms the Volnitsky algorithm
by up to 41.7% when the needle_size is less than 21, and falls behind
by only about 1% when the needle_size is bigger than 50, which is not
considered a common case.

Test platform: ICX server
Test query: SELECT COUNT(*) FROM hits WHERE URL LIKE '%{Needle}%';

Needle_size	opt/baseline
5		141.7%
6		129.4%
8		118.5%
9		112.3%
10		107.4%
14		103.4%
20		100.2%
21		100.7%
51		99.0%

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
2023-02-22 11:58:17 -05:00
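The early exit described in commit d220e7f4fc can be illustrated with a minimal Python sketch. This is not the patch's code, which is not shown in this log. The function name `search_with_size_guard` is hypothetical, and `bytes.find` merely stands in for the real search routine.

```python
def search_with_size_guard(haystack: bytes, needle: bytes) -> int:
    """Return the index of the first match of needle in haystack, or -1.

    Hypothetical illustration only: the real StringSearcher is SIMD code.
    """
    # A needle longer than the haystack can never match,
    # so skip the search entirely.
    if len(needle) > len(haystack):
        return -1
    # Stand-in for the actual search.
    return haystack.find(needle)
```

The guard costs a single length comparison and avoids setting up a search that cannot succeed.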
Jiebin Sun
f5a6a86dec Optimize the SIMD StringSearcher by searching first two chars
This patch optimizes the SIMD StringSearcher by searching for the first
and second chars together rather than only the first char, which results
in a big performance gain. The patch also provides a quick path when the
needle size is 1.

With this patch, I tested the 43 queries in ClickBench on an ICX server.
Query 20 gained 35% in performance. Other StringSearcher-related queries
improved by around 10%, and the overall geomean of all the queries
improved by 4.1%.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
2023-02-22 11:55:30 -05:00
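The idea in commit f5a6a86dec, matching the first two chars together before doing a full comparison and taking a quick path for a needle of size 1, can be illustrated with a minimal scalar sketch in Python. This is not the patch's code: the actual change is SIMD code in ClickHouse and is not shown in this log. The function name `find_with_two_byte_prefilter` and its structure are assumptions made for illustration.

```python
def find_with_two_byte_prefilter(haystack: bytes, needle: bytes) -> int:
    """Return the index of the first match of needle in haystack, or -1.

    Hypothetical scalar illustration; a SIMD version would test many
    positions per instruction instead of one position per loop step.
    """
    n = len(needle)
    if n == 0:
        return 0
    if n == 1:
        # Quick path: a single-byte needle has no second byte to check.
        return haystack.find(needle)

    first, second = needle[0], needle[1]
    tail = needle[2:]
    # Last start position at which the whole needle still fits.
    last_start = len(haystack) - n
    for i in range(last_start + 1):
        # Prefilter: a position is a candidate only if BOTH of the
        # first two bytes match, which rejects more positions than
        # checking the first byte alone.
        if haystack[i] == first and haystack[i + 1] == second:
            # Full comparison of the remaining bytes.
            if haystack[i + 2 : i + n] == tail:
                return i
    return -1
```

For example, with haystack `b"aaab"` and needle `b"ab"`, positions 0 and 1 match the first byte but fail the second-byte check, so only position 2 reaches the full comparison.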
Kruglov Pavel
0e7143070e
Update docs about format table function 2023-02-22 17:51:29 +01:00
Alexander Gololobov
67dcd9694c Remove unused MergeTreeReadTask::remove_prewhere_column 2023-02-22 17:49:22 +01:00
avogar
50caa3d66c Update docs 2023-02-22 16:41:49 +00:00
Julio Jimenez
d0bd8877ce
Merge branch 'master' into fix-sonarcloud-job 2023-02-22 11:41:24 -05:00
avogar
e0931dbdbe Enable input_format_json_ignore_unknown_keys_in_named_tuple by default 2023-02-22 16:40:53 +00:00
avogar
5af6ac534e Use smaller test file 2023-02-22 15:51:47 +00:00
Nikolay Degterinsky
af992ca2db
Better 2023-02-22 16:51:36 +01:00
pufit
8e7533fa57
Merge pull request #46564 from AVMusorin/update-time-distribution-queue
Added `last_exception_time` column into distribution_queue table
2023-02-22 10:43:35 -05:00
Julio Jimenez
4f31e59dcd
Fix SonarCloud Job
Signed-off-by: Julio Jimenez <julio@clickhouse.com>
2023-02-22 10:34:27 -05:00
avogar
638b28cd85 Better test file 2023-02-22 15:21:06 +00:00
avogar
986dd72870 Fix possible clickhouse-local abort on JSONEachRow schema inference 2023-02-22 15:18:13 +00:00
Nikolay Degterinsky
cdbff57e6c
Ask for password interactively 2023-02-22 15:58:06 +01:00
Maksim Kita
40d2798cb4 Analyzer AutoFinalOnQueryPass fix 2023-02-22 15:51:13 +01:00
Alexander Gololobov
b0427c2e3c
Merge pull request #46660 from ClickHouse/fix_backup_test
Fix integration test: terminate old version without wait
2023-02-22 15:20:26 +01:00
Azat Khuzhin
9ab4944b9e Handle input_format_null_as_default for nested types
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-02-22 15:15:52 +01:00
Antonio Andelic
7f5fb77ed5
Increase table retries in cluster copier tests (#46590) 2023-02-22 15:09:48 +01:00
Kseniia Sumarokova
bec094cd79
Merge pull request #46712 from kssenii/add-iceberg-doc
Add iceberg engine doc
2023-02-22 14:49:03 +01:00
Robert Schulze
4fd4e77737
Poco: POCO_HAVE_INT64 is always defined 2023-02-22 13:48:29 +00:00
Kruglov Pavel
e433ecc18f
Better exception message during Tuple JSON deserialization 2023-02-22 14:37:55 +01:00
Yakov Olkhovskiy
620071bb42 fix 2023-02-22 13:33:40 +00:00
Yakov Olkhovskiy
ea244e5390 revert getViewContext 2023-02-22 13:28:45 +00:00
kssenii
c4761d6cc6 Fix checks 2023-02-22 14:27:43 +01:00
kssenii
bac464f89b Fix 2023-02-22 14:25:08 +01:00
Raúl Marín
3ea20ebbf0 Merge remote-tracking branch 'blessed/master' into fix_recurring_alias 2023-02-22 14:21:12 +01:00
Raúl Marín
e8094c9707 Add test for #46724 2023-02-22 14:20:48 +01:00
Alexander Tokmakov
fba2ec30a2 fix style check 2023-02-22 13:53:43 +01:00
Robert Schulze
9d116e6f5c
Merge pull request #46710 from ClickHouse/rs/bump-clang
Bump minimum required Clang from 12 to 15
2023-02-22 13:38:21 +01:00
Kruglov Pavel
3ba3fdbfa3
Merge pull request #46607 from kssenii/delay-loading-of-named-collections
Do not load named collections on server startup (on first access instead)
2023-02-22 13:22:34 +01:00
lzydmxy
ec8b6c5590 add __init__.py for integration test test_move_partition_to_disk_on_cluster 2023-02-22 19:57:56 +08:00
kssenii
ceff5f41d1 Fix tests 2023-02-22 12:27:07 +01:00
Dmitry Novik
67469ad46b
Merge pull request #46622 from ClickHouse/async-insert-memory-fix
Fix MemoryTracker counters for async inserts
2023-02-22 12:27:05 +01:00
Nikolai Kochetov
ab94d6dc18
Update docs/en/operations/system-tables/processors_profile_log.md
Co-authored-by: Nikita Taranov <nikita.taranov@clickhouse.com>
2023-02-22 12:16:19 +01:00
Nikolai Kochetov
98c10ff6e5
Update docs/en/operations/system-tables/processors_profile_log.md
Co-authored-by: Nikita Taranov <nikita.taranov@clickhouse.com>
2023-02-22 12:16:09 +01:00
Kseniia Sumarokova
c242fe3e5e
Update docs/en/engines/table-engines/integrations/hudi.md
Co-authored-by: flynn <fenglv15@mails.ucas.ac.cn>
2023-02-22 12:11:42 +01:00
Kseniia Sumarokova
ef15d64895
Update docs/en/engines/table-engines/integrations/deltalake.md
Co-authored-by: flynn <fenglv15@mails.ucas.ac.cn>
2023-02-22 12:11:23 +01:00
kssenii
21fcc3b69c Add iceberg doc 2023-02-22 12:04:24 +01:00
flynn
678e4250cd
Fix incorrect predicate push down with grouping sets (#46151) 2023-02-22 11:54:19 +01:00
robot-clickhouse-ci-2
2df52af445
Merge pull request #46711 from ClickHouse/vdimir/tmp-data-in-fs-cache-doc
Add doc for temporary_data_in_cache
2023-02-22 11:41:03 +01:00
Kseniia Sumarokova
3f0d93d6e6
Merge pull request #46656 from ClickHouse/kssenii-patch-6
Update postgres_utility.py
2023-02-22 11:35:03 +01:00
vdimir
a4919ce3a2
Add doc for temporary_data_in_cache 2023-02-22 10:19:28 +00:00
Robert Schulze
16d61832fb
Bump minimum required Clang from 12 to 15
Needed due to https://github.com/ClickHouse/ClickHouse/pull/46247#discussion_r1109855435
2023-02-22 10:03:08 +00:00
vdimir
d4bb84e68b
make clang-tidy happy about CrossToInnerJoinPass 2023-02-22 09:56:10 +00:00
Azat Khuzhin
2ca47a6eb6 BackgroundSchedulePool should not have any query context
BackgroundSchedulePool is used for some periodic jobs that do not run
from a query context, e.g. flushing a Buffer table.

Such jobs cannot have any query context, and more importantly a query
context would not work correctly there, since that query_context will
eventually expire.

This is the cause of the failures in [1].

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/46668/015991bc5e20c787851050c2eaa13f0fef3aac00/stateless_tests_flaky_check__asan_.html

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-02-22 10:50:51 +01:00
Alexey Milovidov
5788deeadd
Merge pull request #46308 from ClickHouse/keeper-retries-by-default
Enable retries for INSERT by default in case of ZooKeeper session loss
2023-02-22 07:57:40 +03:00
HarryLeeIBM
ef33d11e3f Refactor code according to code review 2023-02-21 18:40:11 -08:00
Alexey Milovidov
2ae0b43570
Merge pull request #46626 from ClickHouse/fix-tests
Inhibit `index_granularity_bytes` randomization in some tests
2023-02-22 04:55:23 +03:00