Commit Graph

5089 Commits

Author SHA1 Message Date
Jiebin Sun
d220e7f4fc Optimize the SIMD StringSearcher if needle_size is large
This patch offers an additional optimization when the needle_size is
large. If the needle_size is larger than the haystack_size, there is
no need to search any more.

The optimized SIMD StringSearcher outperforms the Volnitsky algorithm by
up to 41.7% when the needle_size is less than 21, and falls behind by only
about 1% when the needle_size is bigger than 50, which is not
considered a common case.

Test platform: ICX server
Test query: SELECT COUNT(*) FROM hits WHERE URL LIKE '%{Needle}%';

Needle_size	opt/baseline
5		141.7%
6		129.4%
8		118.5%
9		112.3%
10		107.4%
14		103.4%
20		100.2%
21		100.7%
51		99.0%

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
2023-02-22 11:58:17 -05:00
Jiebin Sun
f5a6a86dec Optimize the SIMD StringSearcher by searching first two chars
This patch optimizes the SIMD StringSearcher by searching for the first
and second chars together rather than only the first char, which results
in a big performance gain. The patch also provides a fast path when the needle
size is 1.

With this patch, I tested the 43 queries in ClickBench on an ICX server.
Query 20 gained 35% in performance. Other StringSearcher-related queries
improved by around 10%, and the overall geomean across all
queries improved by 4.1%.

Signed-off-by: Jiebin Sun <jiebin.sun@intel.com>
2023-02-22 11:55:30 -05:00
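
The two StringSearcher commits above describe three ideas: stop early when the needle is longer than the haystack, take a fast path for one-char needles, and filter candidate positions by comparing the first and second chars together. The following is only a minimal sketch of those ideas, not the actual ClickHouse code. The function name findFirstTwoChars is hypothetical, and the SSE2 path assumes a GCC- or Clang-compatible compiler (it uses __builtin_ctz); other builds fall through to the scalar loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper for illustration; not ClickHouse's StringSearcher.
// Returns the offset of the first occurrence of needle, or npos.
inline std::size_t findFirstTwoChars(std::string_view haystack, std::string_view needle)
{
    constexpr std::size_t npos = std::string_view::npos;
    if (needle.empty())
        return 0;
    // Guard: a needle longer than the haystack can never match.
    if (needle.size() > haystack.size())
        return npos;
    // Fast path for a one-char needle.
    if (needle.size() == 1)
    {
        const void * p = std::memchr(haystack.data(), needle[0], haystack.size());
        return p ? static_cast<std::size_t>(static_cast<const char *>(p) - haystack.data()) : npos;
    }

    const char * data = haystack.data();
    const std::size_t last_start = haystack.size() - needle.size(); // last valid match offset
    std::size_t pos = 0;

#if defined(__SSE2__)
    const __m128i first = _mm_set1_epi8(needle[0]);
    const __m128i second = _mm_set1_epi8(needle[1]);
    // Each step loads 16 bytes at pos and 16 bytes at pos + 1, so 17 bytes must be available.
    while (pos + 17 <= haystack.size())
    {
        const __m128i block0 = _mm_loadu_si128(reinterpret_cast<const __m128i *>(data + pos));
        const __m128i block1 = _mm_loadu_si128(reinterpret_cast<const __m128i *>(data + pos + 1));
        // Bit i is set only if byte i equals needle[0] AND byte i + 1 equals needle[1].
        unsigned mask = static_cast<unsigned>(_mm_movemask_epi8(
            _mm_and_si128(_mm_cmpeq_epi8(block0, first), _mm_cmpeq_epi8(block1, second))));
        while (mask)
        {
            const std::size_t candidate = pos + static_cast<std::size_t>(__builtin_ctz(mask));
            if (candidate > last_start)
                return npos; // candidates only grow, so nothing further can fit
            // The first two chars already match; verify the rest of the needle.
            if (std::memcmp(data + candidate + 2, needle.data() + 2, needle.size() - 2) == 0)
                return candidate;
            mask &= mask - 1; // clear the lowest set bit
        }
        pos += 16;
    }
#endif

    // Scalar tail (and full fallback when SSE2 is unavailable).
    for (; pos <= last_start; ++pos)
        if (data[pos] == needle[0] && data[pos + 1] == needle[1]
            && std::memcmp(data + pos, needle.data(), needle.size()) == 0)
            return pos;
    return npos;
}
```

Checking two chars per position should reject more false candidates than a single-char filter before the full comparison runs, which is presumably where the reported gain for short needles comes from.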
Alexander Tokmakov
3f11948bb0
Merge branch 'master' into stack_trace_in_part_log 2023-02-03 20:05:00 +03:00
Robert Schulze
85cbb9288c
Merge pull request #45456 from FrankChen021/uncaught_exception
Fix uncaught exception in HTTPHandler
2023-02-03 15:26:02 +01:00
Anton Popov
cdbe145bc1
Merge pull request #45796 from CurtizJ/fix-leak-in-azure-sdk
Fix test `test_azure_blob_storage_zero_copy_replication` (memory leak in azure sdk)
2023-02-03 14:16:19 +01:00
Frank Chen
d38adfab30 Merge two overridden functions as one 2023-02-03 15:27:45 +08:00
Frank Chen
d3a05a11da Merge remote-tracking branch 'remotes/github/master' into stack_trace_in_part_log 2023-02-03 11:39:43 +08:00
Frank Chen
7ad4b3176a Clean code 2023-02-03 11:13:45 +08:00
Frank Chen
93ead69b7f Resolve review comments 2023-02-03 11:05:54 +08:00
Nikolai Kochetov
6cd0b51127
Merge pull request #45871 from ClickHouse/fix-ipv6-parser
Fix ipv6 parser
2023-02-01 14:28:54 +01:00
Robert Schulze
b512316586
Merge pull request #45682 from ClickHouse/rename-qrc-to-qc
Rename "Query Result Cache" to "Query Cache"
2023-02-01 11:23:29 +01:00
Alexey Milovidov
bf9f62dcbd
Merge pull request #38456 from sauliusvl/iouring
Re-introduce io_uring read method
2023-02-01 06:41:32 +03:00
Yakov Olkhovskiy
3d64c84571 fix ipv6 parser 2023-02-01 01:02:58 +00:00
Robert Schulze
325c6bdf3d
Renaming: "Query Result Cache" --> "Query Cache"
Reasons:

- The cache will at some point store intermediate results as opposed to
  only query results. We should change the terminology now without
  having to worry about backward compat.

- Equivalent caches in MySQL (1) and Starrocks (2) are called "query
  cache".

- The new name is ca. 13.8% more catchy.

(1) https://dev.mysql.com/doc/refman/5.6/en/query-cache.html
(2) https://docs.starrocks.io/en-us/2.5/using_starrocks/query_cache
2023-01-31 09:54:34 +00:00
Anton Popov
31e8b692f4 fix typo 2023-01-31 01:13:23 +00:00
Anton Popov
839cd614fb fix memory leak in azure sdk 2023-01-31 01:01:10 +00:00
Vitaly Baranov
2e3a3cc4dc
Merge pull request #45701 from vitlibar/add-setting-allow-head-object-request
Add new S3 setting allow_head_object_request
2023-01-30 17:41:09 +01:00
Vladimir C
e7e8ae979f
Merge pull request #45271 from jh0x/feature-array-shuffle 2023-01-30 11:54:07 +01:00
Robert Schulze
66392b873c
Merge pull request #45513 from aiven-sal/aiven-sal/siphash_pr
Keyed SipHash
2023-01-30 11:10:00 +01:00
Alexey Milovidov
a7299746c7
Merge pull request #45743 from ClickHouse/oom-message
Improve MEMORY_LIMIT_EXCEEDED exception message
2023-01-30 00:46:04 +03:00
Robert Schulze
15ae2d1de5
Merge branch 'master' into aiven-sal/siphash_pr 2023-01-29 21:06:52 +01:00
alesapin
631c8fb155
Merge branch 'master' into add-setting-allow-head-object-request 2023-01-29 14:58:27 +01:00
Dmitry Novik
ec1f6bfd37 Improve MEMORY_LIMIT_EXCEEDED exception message 2023-01-29 01:52:37 +00:00
Saulius Valatka
8505b8c57a add profile events for failed/succeeded CQE completions 2023-01-28 21:54:44 +02:00
Saulius Valatka
8fa9a99ba1 implement pending request queue to prevent CQ overflows 2023-01-28 21:54:44 +02:00
Saulius Valatka
ac2c921bdf add initial io_uring support 2023-01-28 21:54:44 +02:00
Azat Khuzhin
6b42b66257 Simplify filesystem helpers to check is-readable/writable/executable
Use one system call - faccessat(), over multiple system calls.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-27 21:11:10 +01:00
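
The commit above says each permission check can be answered with one faccessat() call. A minimal sketch of what such helpers might look like follows. The names canAccess, isReadable, isWritable and isExecutable are hypothetical, and the use of AT_EACCESS (check against the effective rather than the real user and group IDs) is an assumption, not something the commit message states.

```cpp
#include <cassert>
#include <fcntl.h>  // AT_FDCWD, AT_EACCESS
#include <string>
#include <unistd.h> // faccessat, R_OK, W_OK, X_OK

// Hypothetical helper: one faccessat() call per permission check.
// AT_FDCWD resolves relative paths against the current working directory.
// AT_EACCESS (an assumption here) checks the effective user/group IDs.
inline bool canAccess(const std::string & path, int mode)
{
    return ::faccessat(AT_FDCWD, path.c_str(), mode, AT_EACCESS) == 0;
}

inline bool isReadable(const std::string & path) { return canAccess(path, R_OK); }
inline bool isWritable(const std::string & path) { return canAccess(path, W_OK); }
inline bool isExecutable(const std::string & path) { return canAccess(path, X_OK); }
```

A call such as isReadable("/etc/hostname") then costs one system call, and a missing path simply yields false because faccessat() returns -1.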
Joanna Hulboj
9a559b5475 FIXUP: More comments about shuffle 2023-01-27 20:07:56 +00:00
Joanna Hulboj
31eb936457 Added Fisher-Yates shuffle and partial-shuffle 2023-01-27 20:07:55 +00:00
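
For readers unfamiliar with the algorithm named in the commit above, here is a minimal sketch of a Fisher-Yates shuffle and a partial variant that stops after a given number of positions. The function names and the std::vector interface are hypothetical and are not taken from the ClickHouse implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Partial Fisher-Yates: after the call, the first `limit` positions hold a
// uniformly random selection (in random order) of the elements of v.
// Hypothetical name and signature, for illustration only.
template <typename T, typename Rng>
void partialShuffle(std::vector<T> & v, std::size_t limit, Rng & rng)
{
    const std::size_t n = v.size();
    if (limit > n)
        limit = n;
    // Position n - 1 has no choice left, so the loop can stop before it.
    for (std::size_t i = 0; i < limit && i + 1 < n; ++i)
    {
        // Pick a uniformly random index from the not-yet-fixed range [i, n - 1].
        std::uniform_int_distribution<std::size_t> dist(i, n - 1);
        std::swap(v[i], v[dist(rng)]);
    }
}

// A full shuffle is the partial shuffle applied to every position.
template <typename T, typename Rng>
void fisherYatesShuffle(std::vector<T> & v, Rng & rng)
{
    partialShuffle(v, v.size(), rng);
}
```

With std::mt19937 rng(42), calling partialShuffle(v, 3, rng) performs at most three swaps, so the cost depends on the limit rather than on the array length.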
Vitaly Baranov
d02b255b55 Add new setting "allow_head_object_request" to S3::RequestSettings. 2023-01-27 15:10:34 +01:00
Alexander Tokmakov
c366806c3e
Merge pull request #45527 from ClickHouse/exception_message_patterns4
Better formatting for exception messages 2
2023-01-27 15:31:52 +03:00
Salvatore Mesoraca
2e72e27206
common: siphash: add support for custom keys 2023-01-27 13:00:53 +01:00
Alexey Milovidov
5b257ab806
Merge pull request #45233 from ClickHouse/improve_week_day
Revert "Revert "Improve week day""
2023-01-27 02:44:17 +03:00
Alexander Tokmakov
a584ad0eb1 forbid runtime strings 2023-01-26 10:52:47 +01:00
Alexander Tokmakov
9b670946db Merge branch 'master' into exception_message_patterns5 2023-01-26 00:41:32 +01:00
Alexander Tokmakov
3744fa2c63 format more messages 2023-01-25 21:16:42 +01:00
sichenzhao
243ac52259
Added two metrics about memory usage in cgroup to asynchronous metrics (#45301) 2023-01-25 20:32:17 +01:00
Alexander Tokmakov
ae795d87b2 fix 2023-01-25 16:06:40 +01:00
Alexander Tokmakov
6eb557b2ba Merge branch 'master' into exception_message_patterns4 2023-01-25 13:49:17 +01:00
Robert Schulze
59528cfca0
Merge pull request #45460 from ClickHouse/inv-index-cleanup
Cleanup of inverted index
2023-01-25 13:23:38 +01:00
Sergei Trifonov
0d1ea05ff6
Merge pull request #45007 from ClickHouse/cancellable-mutex-integration
Fast shared mutex integration
2023-01-25 11:15:46 +01:00
Alexander Tokmakov
d1baa7300c reformat ParsingException 2023-01-24 23:21:29 +01:00
serxa
51da43d6cf fix try_shared_lock() in SharedMutex and CancelableSharedMutex 2023-01-24 14:36:07 +00:00
Robert Schulze
9ff2bfcbf5
Merge remote-tracking branch 'origin/master' into inv-index-cleanup
	src/Interpreters/GinFilter.cpp
	src/Interpreters/InterpreterCreateQuery.cpp
	src/Storages/MergeTree/MergeTreeData.cpp
	src/Storages/MergeTree/MergeTreeDataPartWriterOnDisk.cpp
	src/Storages/MergeTree/MergeTreeIndexInverted.cpp
2023-01-24 10:09:42 +00:00
Alexander Tokmakov
c6910f39b9 fix 2023-01-24 01:11:58 +01:00
Alexander Tokmakov
414693feb2 fixes 2023-01-24 00:46:03 +01:00
Alexander Tokmakov
bb4c8e169f check number of parameters in format string 2023-01-23 23:16:16 +01:00
Alexander Tokmakov
3f6594f4c6 forbid old ctor of Exception 2023-01-23 22:18:05 +01:00
Alexander Tokmakov
70d1adfe4b
Better formatting for exception messages (#45449)
* save format string for NetException

* format exceptions

* format exceptions 2

* format exceptions 3

* format exceptions 4

* format exceptions 5

* format exceptions 6

* fix

* format exceptions 7

* format exceptions 8

* Update MergeTreeIndexGin.cpp

* Update AggregateFunctionMap.cpp

* Update AggregateFunctionMap.cpp

* fix
2023-01-24 00:13:58 +03:00
Alexey Milovidov
62a8de34cc
Merge pull request #44811 from azat/build/glibc2.36-fix
Fix ASan builds for glibc 2.36+
2023-01-23 23:57:20 +03:00