Commit Graph

108988 Commits

Author SHA1 Message Date
robot-clickhouse-ci-2
e3e5d83f96
Merge pull request #46745 from ClickHouse/Avogar-patch-3
Update docs about format table function
2023-02-22 23:40:10 +01:00
Igor Nikonov
271b72abf4
Merge pull request #46642 from ClickHouse/remove_redundant_sorting_fix
Fix: remove redundant sorting optimization
2023-02-22 23:33:15 +01:00
Anton Popov
d5864fa88e allow to fallback from async insert in case of large amount of data 2023-02-22 21:59:24 +00:00
Alexey Milovidov
5154b04cfb
Merge pull request #46732 from ClickHouse/fix-sonarcloud-job
Fix SonarCloud Job
2023-02-23 00:31:54 +03:00
Jiebin Sun
d6710d9b34 Align all the SSE4.1 requirement and use needle_size
Align all the SSE4.1 requirements in StringSearcher. Use needle_size
in the while loop to make the code cleaner.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
2023-02-22 16:15:26 -05:00
HarryLeeIBM
e979a86604 Minor change for adding constexpr 2023-02-22 12:50:46 -08:00
HarryLeeIBM
18b93fc212 More refactoring for better style 2023-02-22 12:41:00 -08:00
Robert Schulze
81bf43157f
Allow configuration of Kafka topics with periods
The Kafka table engine allows global configuration and per-Kafka-topic
configuration. The latter uses syntax <kafka_TOPIC>, e.g. for topic
"football":

  <kafka_football>
      <retry_backoff_ms>250</retry_backoff_ms>
      <fetch_min_bytes>100000</fetch_min_bytes>
  </kafka_football>

Some users had to find out the hard way that such configuration doesn't
take effect if the topic name contains a period, e.g. "sports.football".
The reason is that the ClickHouse configuration framework already uses
periods as level separators to descend the configuration hierarchy.
(Besides that, per-topic configuration at the same level as global
configuration could be considered ugly.)

Note that Kafka topics may contain characters "a-zA-Z0-9._-" (*) and
a tree-like topic organization using periods is quite common in
practice.

This PR deprecates the existing per-topic configuration syntax (but
continues to support it for backward compat) and introduces a new
per-topic configuration syntax below the global Kafka configuration of
the form:

<kafka>
   <topic name="football">
       <retry_backoff_ms>250</retry_backoff_ms>
       <fetch_min_bytes>100000</fetch_min_bytes>
   </topic>
</kafka>

The period restriction doesn't apply to XML attributes, so <topic
name="sports.football"> will work. Also, everything Kafka-related is
below <kafka>.
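
A minimal sketch of the problem described above, in Python rather than
the C++/Poco code the commit touches. The nested dicts and the helper
names `lookup` and `topic_setting` are hypothetical stand-ins for the
configuration tree, not ClickHouse or Poco APIs. It shows why a
period-separated path cannot reach a key that itself contains a period,
while a topic name stored as an attribute value is compared as a plain
string and never split.

```python
from typing import Any, Optional


def lookup(tree: dict, path: str) -> Optional[Any]:
    """Descend a nested dict, treating '.' as the level separator."""
    node: Any = tree
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def topic_setting(tree: dict, topic: str, setting: str) -> Optional[Any]:
    """Find a per-topic setting by matching the 'name' attribute value."""
    for entry in tree.get("kafka", {}).get("topic", []):
        if entry.get("name") == topic:
            return entry.get(setting)
    return None


# Old syntax: the topic name is part of the element name.
legacy = {
    "kafka_football": {"retry_backoff_ms": "250"},
    "kafka_sports.football": {"retry_backoff_ms": "250"},
}

# New syntax: the topic name is an attribute value below <kafka>.
new_style = {
    "kafka": {
        "topic": [{"name": "sports.football", "retry_backoff_ms": "250"}],
    }
}

print(lookup(legacy, "kafka_football.retry_backoff_ms"))
# The path is split into 'kafka_sports' / 'football' / ..., so this misses.
print(lookup(legacy, "kafka_sports.football.retry_backoff_ms"))
print(topic_setting(new_style, "sports.football", "retry_backoff_ms"))
```

In this toy model the second lookup fails silently, which mirrors the
"doesn't take effect" symptom described above.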

Considered but rejected alternatives:
- Extending Poco ConfigurationView with custom separators (e.g. "/"
  instead of "."). Won't work easily because ConfigurationView only
  builds a path but defers descending the configuration tree to the
  normal configuration classes.
- Reloading the configuration file in StorageKafka (instead of reading
  the loaded file) but with a custom separator. This mode is supported
  by XML configuration. Too ugly and error-prone since the true
  configuration is composed from multiple configuration files.

(*) https://stackoverflow.com/a/37067544
2023-02-22 20:35:09 +00:00
Yakov Olkhovskiy
fadbeb8ebd T64 codec support for IPv4 2023-02-22 19:25:48 +00:00
Jiebin Sun
1f62135ba7 Make the optimized SIMD StringSearcher clean
This patch revises value names and adds comments to make the SIMD
StringSearcher cleaner and easier to understand, following pull
request 46289.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
2023-02-22 12:18:21 -05:00
Jiebin Sun
d220e7f4fc Optimize the SIMD StringSearcher if needle_size is large
This patch offers an additional optimization when the needle_size is
large. If the needle_size is larger than the haystack_size, there is
no need to search any more.

The optimized SIMD StringSearcher outperforms the Volnitsky algorithm
by up to 41.7% when the needle_size is less than 21, and falls behind
by only about 1% even when the needle_size is larger than 50, which is
not considered a common case.

Test platform: ICX server
Test query: SELECT COUNT(*) FROM hits WHERE URL LIKE '%{Needle}%';

Needle_size  opt/baseline
5            141.7%
6            129.4%
8            118.5%
9            112.3%
10           107.4%
14           103.4%
20           100.2%
21           100.7%
51            99.0%

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
2023-02-22 11:58:17 -05:00
Jiebin Sun
f5a6a86dec Optimize the SIMD StringSearcher by searching first two chars
This patch optimizes the SIMD StringSearcher by searching for the first
and second chars together rather than only the first char, which results
in a big performance gain. The patch also provides a quick path when the
needle size is 1.
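
The idea above can be sketched in scalar Python. This is an
illustration only: the actual patch is SIMD C++ (the log mentions
SSE4.1) and tests many haystack positions per instruction, whereas this
sketch checks one position at a time. The function name `find_two_char`
is hypothetical.

```python
def find_two_char(haystack: bytes, needle: bytes) -> int:
    """Return the index of the first occurrence of needle, or -1."""
    n, m = len(haystack), len(needle)
    if m == 0:
        return 0
    if m == 1:
        # Quick path: a single-byte needle needs no candidate filtering.
        return haystack.find(needle)

    first, second = needle[0], needle[1]
    # range() is empty when the needle is longer than the haystack.
    for i in range(n - m + 1):
        # Filter on the first two bytes together; requiring both to match
        # rejects more false candidates than checking the first byte alone.
        if haystack[i] == first and haystack[i + 1] == second:
            # Only candidates that pass the filter get the full comparison.
            if haystack[i:i + m] == needle:
                return i
    return -1


print(find_two_char(b"https://example.com/path", b"example"))
```

The design point is that the full comparison is the expensive step, so
a stricter filter in front of it means fewer wasted comparisons.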

With this patch, I tested the 43 queries in ClickBench on an ICX server.
Query 20 gained 35% in performance. Other StringSearcher-related queries
improved by around 10%, and the overall geomean across all queries
improved by 4.1%.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
2023-02-22 11:55:30 -05:00
Kruglov Pavel
0e7143070e
Update docs about format table function 2023-02-22 17:51:29 +01:00
Alexander Gololobov
67dcd9694c Remove unused MergeTreeReadTask::remove_prewhere_column 2023-02-22 17:49:22 +01:00
Julio Jimenez
d0bd8877ce
Merge branch 'master' into fix-sonarcloud-job 2023-02-22 11:41:24 -05:00
avogar
5af6ac534e Use smaller test file 2023-02-22 15:51:47 +00:00
Nikolay Degterinsky
af992ca2db
Better 2023-02-22 16:51:36 +01:00
pufit
8e7533fa57
Merge pull request #46564 from AVMusorin/update-time-distribution-queue
Added `last_exception_time` column into distribution_queue table
2023-02-22 10:43:35 -05:00
Julio Jimenez
4f31e59dcd
Fix SonarCloud Job
Signed-off-by: Julio Jimenez <julio@clickhouse.com>
2023-02-22 10:34:27 -05:00
zvonand
393830ecdc add docs + tiny cleanup 2023-02-22 16:30:46 +01:00
avogar
638b28cd85 Better test file 2023-02-22 15:21:06 +00:00
avogar
986dd72870 Fix possible clickhouse-local abort on JSONEachRow schema inference 2023-02-22 15:18:13 +00:00
Nikolay Degterinsky
cdbff57e6c
Ask for password interactively 2023-02-22 15:58:06 +01:00
Maksim Kita
40d2798cb4 Analyzer AutoFinalOnQueryPass fix 2023-02-22 15:51:13 +01:00
Alexander Gololobov
b0427c2e3c
Merge pull request #46660 from ClickHouse/fix_backup_test
Fix integration test: terminate old version without wait
2023-02-22 15:20:26 +01:00
Azat Khuzhin
9ab4944b9e Handle input_format_null_as_default for nested types
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-02-22 15:15:52 +01:00
Antonio Andelic
7f5fb77ed5
Increase table retries in cluster copier tests (#46590) 2023-02-22 15:09:48 +01:00
Kseniia Sumarokova
bec094cd79
Merge pull request #46712 from kssenii/add-iceberg-doc
Add iceberg engine doc
2023-02-22 14:49:03 +01:00
Robert Schulze
4fd4e77737
Poco: POCO_HAVE_INT64 is always defined 2023-02-22 13:48:29 +00:00
Kruglov Pavel
e433ecc18f
Better exception message during Tuple JSON deserialization 2023-02-22 14:37:55 +01:00
kssenii
c4761d6cc6 Fix checks 2023-02-22 14:27:43 +01:00
kssenii
bac464f89b Fix 2023-02-22 14:25:08 +01:00
Raúl Marín
3ea20ebbf0 Merge remote-tracking branch 'blessed/master' into fix_recurring_alias 2023-02-22 14:21:12 +01:00
Raúl Marín
e8094c9707 Add test for #46724 2023-02-22 14:20:48 +01:00
Alexander Tokmakov
fba2ec30a2 fix style check 2023-02-22 13:53:43 +01:00
Robert Schulze
9d116e6f5c
Merge pull request #46710 from ClickHouse/rs/bump-clang
Bump minimum required Clang from 12 to 15
2023-02-22 13:38:21 +01:00
Kruglov Pavel
3ba3fdbfa3
Merge pull request #46607 from kssenii/delay-loading-of-named-collections
Do not load named collections on server startup (on first access instead)
2023-02-22 13:22:34 +01:00
lzydmxy
ec8b6c5590 add __init__.py for integration test test_move_partition_to_disk_on_cluster 2023-02-22 19:57:56 +08:00
kssenii
ceff5f41d1 Fix tests 2023-02-22 12:27:07 +01:00
Dmitry Novik
67469ad46b
Merge pull request #46622 from ClickHouse/async-insert-memory-fix
Fix MemoryTracker counters for async inserts
2023-02-22 12:27:05 +01:00
Nikolai Kochetov
ab94d6dc18
Update docs/en/operations/system-tables/processors_profile_log.md
Co-authored-by: Nikita Taranov <nikita.taranov@clickhouse.com>
2023-02-22 12:16:19 +01:00
Nikolai Kochetov
98c10ff6e5
Update docs/en/operations/system-tables/processors_profile_log.md
Co-authored-by: Nikita Taranov <nikita.taranov@clickhouse.com>
2023-02-22 12:16:09 +01:00
Kseniia Sumarokova
c242fe3e5e
Update docs/en/engines/table-engines/integrations/hudi.md
Co-authored-by: flynn <fenglv15@mails.ucas.ac.cn>
2023-02-22 12:11:42 +01:00
Kseniia Sumarokova
ef15d64895
Update docs/en/engines/table-engines/integrations/deltalake.md
Co-authored-by: flynn <fenglv15@mails.ucas.ac.cn>
2023-02-22 12:11:23 +01:00
kssenii
21fcc3b69c Add iceberg doc 2023-02-22 12:04:24 +01:00
flynn
678e4250cd
Fix incorrect predicate push down with grouping sets (#46151) 2023-02-22 11:54:19 +01:00
robot-clickhouse-ci-2
2df52af445
Merge pull request #46711 from ClickHouse/vdimir/tmp-data-in-fs-cache-doc
Add doc for temporary_data_in_cache
2023-02-22 11:41:03 +01:00
Kseniia Sumarokova
3f0d93d6e6
Merge pull request #46656 from ClickHouse/kssenii-patch-6
Update postgres_utility.py
2023-02-22 11:35:03 +01:00
vdimir
a4919ce3a2
Add doc for temporary_data_in_cache 2023-02-22 10:19:28 +00:00
Robert Schulze
16d61832fb
Bump minimum required Clang from 12 to 15
Needed due to https://github.com/ClickHouse/ClickHouse/pull/46247#discussion_r1109855435
2023-02-22 10:03:08 +00:00