1585 Commits

Author SHA1 Message Date
jsc0218
2e90f0c7c7 fix the crash from type mismatch in regexpr dict 2024-01-18 20:20:36 +00:00
jsc0218
393e12b0d3 support regexp dict short circuit 2024-01-18 15:51:58 +00:00
jsc0218
091216766f support polygon dict short circuit 2024-01-16 16:16:18 +00:00
jsc0218
7b6f96dede support iptrie dict short circuit 2024-01-15 19:20:43 +00:00
jsc0218
2ed37a016c support direct dict short circuit 2024-01-15 03:13:30 +00:00
jsc0218
3de0fbae87 support ssd cache dict short circuit 2024-01-14 03:37:03 +00:00
Nikolai Kochetov
e1639610f2 Update PolygonDictionaryUtils.h 2024-01-13 21:48:41 +01:00
Nikolai Kochetov
d954dab62f Update PolygonDictionaryUtils.h 2024-01-13 21:04:08 +01:00
yariks5s
d42a7e5f8f init 2024-01-12 23:46:50 +00:00
jsc0218
0312ea0379 Merge remote-tracking branch 'origin/master' into DictShortCircuit 2024-01-12 01:17:57 +00:00
jsc0218
925f78174d support cache dict short circuit 2024-01-11 15:46:49 +00:00
Raúl Marín
ff90f64bc1 Merge remote-tracking branch 'blessed/master' into speedup_numbers 2024-01-03 13:33:22 +00:00
Raúl Marín
bda6104f84 Replace std::iota with DB::iota where possible 2023-12-29 14:38:22 +01:00
Amos Bird
6b6e40831c Move symbols from src/* into namespace DB 2023-12-29 14:37:08 +08:00
jsc0218
0fc569a3e3 support range hashed short circuit 2023-12-28 22:21:51 +00:00
jsc0218
17f391abb8 fix nullable issue, range hash setting and strengthen test 2023-12-28 16:16:06 +00:00
jsc0218
fd9a2a6417 support hashed array short circuit 2023-12-27 00:15:57 +00:00
Azat Khuzhin
3be3b0a280 Fix incorrect Exceptions
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-12-24 21:26:32 +01:00
jsc0218
e25eda91b0 refactor 2023-12-22 22:12:32 +00:00
jsc0218
24e9ba681e Merge remote-tracking branch 'origin/master' into DictShortCircuit 2023-12-20 20:03:14 +00:00
jsc0218
23fb9b802f simplify 2023-12-20 18:42:30 +00:00
vdimir
ae4270465d Merge pull request #57544 from ClickHouse/vdimir/dict_hashed_array_shard
Support SHARDS for HashedArrayDictionary
2023-12-20 13:02:26 +01:00
jsc0218
3bb196f612 support hash dict short circuit 2023-12-20 02:53:07 +00:00
jsc0218
11f63d59a5 support multi column flat dict short circuit 2023-12-19 02:43:48 +00:00
Nikita Mikhaylov
6360b76792 Merge branch 'master' of github.com:ClickHouse/ClickHouse into remove-the-limit-for-connections-per-endpoint 2023-12-18 21:49:31 +00:00
jsc0218
26a817914e support single column flat dict short circuit 2023-12-18 02:24:10 +00:00
Raúl Marín
b269f87f4c Better text_log with ErrnoException 2023-12-15 19:27:56 +01:00
Nikita Mikhaylov
a0af0392cd Random changes in random files (#57642) 2023-12-14 12:47:11 +01:00
vdimir
398499d253 Support SHARDS for HashedArrayDictionary 2023-12-13 13:00:28 +00:00
jsc0218
8fa4f29b6f refine interface again 2023-12-13 03:42:15 +00:00
jsc0218
fdcd94bf5f refine the interface 2023-12-12 16:40:18 +00:00
Nikita Mikhaylov
04d167c6d9 Better 2023-12-05 13:34:37 +01:00
Alexey Milovidov
10d5ba57e8 Merge pull request #57124 from azat/build/split-HashedDictionary-CU
Split HashedDictionary CU
2023-11-23 23:14:47 +01:00
Vitaly Baranov
e0c9661115 Check dictionary source type on creation even if "dictionaries_lazy_load" is enabled. 2023-11-23 01:45:08 +01:00
Azat Khuzhin
cf3cd099a5 Split HashedDictionary CU
Previously HashedDictionary.cpp exceeded 50MiB; now:

    -rw-r--r-- 1 azat azat  37M Nov 22 17:56 SparseHashedDictionary.cpp.o
    -rw-r--r-- 1 azat azat  34M Nov 22 17:56 HashedDictionary.cpp.o
    -rw-r--r-- 1 azat azat 716K Nov 22 17:56 registerHashedDictionary.cpp.o

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-11-22 19:00:40 +01:00
Alexey Milovidov
d56cbda185 Add metrics for the number of queued jobs, which is useful for the IO thread pool 2023-11-18 19:07:59 +01:00
Alexey Milovidov
be88873c3e Merge pull request #56776 from mkmkme/mkmkme/fix-typo
fix typo in ClickHouseDictionarySource
2023-11-15 12:30:25 +01:00
Mikhail Koviazin
aea43bdfad fix typo in ClickHouseDictionarySource
hostnmae -> hostname
2023-11-15 10:06:00 +02:00
Alexey Milovidov
3bec4dce8e Merge branch 'master' into remove-cpp-templates-2 2023-11-11 00:50:32 +01:00
Alexey Milovidov
9317357873 Merge pull request #56546 from ClickHouse/remove-1
Remove useless using
2023-11-10 16:56:56 +01:00
Kseniia Sumarokova
ecd98006ce Merge pull request #56306 from ClickHouse/fix-backup-restore-flatten-nested
Fix restore from backup with `flatten_nested` and `data_type_default_nullable`
2023-11-10 11:43:17 +01:00
Alexey Milovidov
8c253b9e3e Remove C++ templates 2023-11-10 05:25:02 +01:00
Alexey Milovidov
3f5b94b8ca Remove useless using 2023-11-09 23:37:39 +01:00
kssenii
41a880e57c Review fix 2023-11-09 16:03:51 +01:00
kssenii
9178fd4ad1 Fix case with replicated database 2023-11-07 16:02:51 +01:00
Nikolay Degterinsky
29da9f9645 Fix ClickHouse-sourced dictionaries with explicit query 2023-11-02 04:31:01 +00:00
vdimir
da6f3346fe Merge pull request #55839 from ClickHouse/vdimir/async_executor_for_dictionary_load
Use AsyncPipelineExecutor all dictionaries
2023-10-24 11:31:36 +02:00
Nikita Taranov
900844605f Optimise memory consumption during loading of hierarchical dictionaries (#55838)
* impl

* add comment

* fix build
2023-10-23 13:52:27 +02:00
Anton Popov
5819bcd07a Support asynchronous inserts for native protocol (#54730)
* support async insert for native protocol

* use separate queue for async inserts via native protocol

* fix test

* better logging for async inserts and more tests

* disable mixed internal and external data in async inserts

* fix tests

* fix quota in async inserts

* disable async insert for secondary query of distributed
2023-10-20 18:39:48 +02:00
vdimir
8bc384e708 More dictionaries support dictionary_use_async_executor 2023-10-20 13:14:15 +00:00
vdimir
1aa4a542bb Handle empty block in DictionaryPipelineExecutor 2023-10-20 12:30:21 +00:00
vdimir
b4851cf2ef Use AsyncPipelineExecutor for reading clickhouse dictionary source 2023-10-19 15:16:33 +00:00
vdimir
0ece6e0263 Use AsyncPipelineExecutor for HashedArrayDictionary 2023-10-19 15:16:33 +00:00
Alexey Milovidov
b2b6720737 Merge pull request #49043 from RoryCrispin/dict-lifetime-validation
Validate direct dictionary lifetime is unset during creation
2023-09-30 07:54:50 +03:00
Robert Schulze
9fff447716 Re-enable clang-tidy checks 2023-09-26 09:34:12 +00:00
Robert Schulze
f5e8028bb1 Merge pull request #54642 from rschu1ze/broken-re2st
Remove broken lockless variant of re2
2023-09-17 15:30:57 +02:00
Robert Schulze
7b378dbad3 Remove broken lockless variant of re2 2023-09-14 16:40:42 +00:00
Maksim Kita
310dc22266 FunctionHelpers remove areTypesEqual function 2023-09-14 13:51:06 +03:00
johanngan
bcb058f999 Add case insensitive and dot-all modes to RegExpTree dictionary
The new per-dictionary settings control regex match semantics around
case sensitivity and the '.' wildcard with newlines. They must be set at
the dictionary level since they're applied to regex engines at
pattern-compile-time.

- regexp_dict_flag_case_insensitive: case insensitive matching
- regexp_dict_flag_dotall: '.' matches all characters including newlines

They correspond to HS_FLAG_CASELESS and HS_FLAG_DOTALL in Vectorscan
and case_sensitive and dot_nl in RE2. These are the most useful options
compatible with the internal behavior of RegExpTreeDictionary around
splitting up simple and complex patterns between Vectorscan and RE2.

The alternative is to use (?i) and/or (?s) for all patterns. However,
(?s) isn't handled properly by OptimizedRegularExpression::analyze().
And while (?i) is, it still causes the dictionary to treat the pattern
as "complex" for sequential scanning with RE2 rather than multi-matching
with Vectorscan, even though Vectorscan supports case insensitive
literal matching. Setting dictionary-wide flags is both more convenient,
and circumvents these problems.
2023-09-06 11:28:53 -05:00
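The matching semantics described in the commit message above can be illustrated with a small sketch. It is only an approximation: Python's `re` module stands in for RE2 and Vectorscan (an assumption made for illustration, since the dictionary itself does not use Python), with `re.IGNORECASE` and `re.DOTALL` playing the role of the dictionary-wide flags and inline `(?i)` / `(?s)` playing the role of the per-pattern alternative.

```python
import re

# Sample input containing mixed case and an embedded newline.
text = "Hello\nWORLD"

# Flags applied at pattern-compile time, analogous in spirit to
# regexp_dict_flag_case_insensitive and regexp_dict_flag_dotall.
# Assumption: Python's re is used here only as a stand-in engine.
case_insensitive = re.compile(r"hello", re.IGNORECASE)
dotall = re.compile(r"Hello.WORLD", re.DOTALL)

# Default behaviour without any flags, for comparison.
plain_case = re.compile(r"hello")
plain_dot = re.compile(r"Hello.WORLD")

# The per-pattern alternative: inline (?i) and (?s) prefixes.
inline_i = re.compile(r"(?i)hello")
inline_s = re.compile(r"(?s)Hello.WORLD")


def matches(pattern, s):
    """Return True if the compiled pattern matches anywhere in s."""
    return pattern.search(s) is not None


if __name__ == "__main__":
    for name, pat in [
        ("case_insensitive", case_insensitive),
        ("dotall", dotall),
        ("plain_case", plain_case),
        ("plain_dot", plain_dot),
        ("inline_i", inline_i),
        ("inline_s", inline_s),
    ]:
        print(name, matches(pat, text))
```

In this stand-in engine the flag and the inline prefix give the same match results; the commit message's point is that inside RegExpTreeDictionary the inline forms interact badly with pattern analysis, which is why dictionary-wide flags were added.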
Vitaly Baranov
3b58c5baa6 Always check that block has rows to fix wrong allocation in HashedArrayDictionary::updateData and others. 2023-09-05 09:57:13 +02:00
Sergei Trifonov
802579f3f1 Merge pull request #49618 from ClickHouse/concurrency-control-controllable
Make concurrency control controllable
2023-08-29 19:44:51 +02:00
Raúl Marín
93dac0c880 Support clang-18 (Wmissing-field-initializers) 2023-08-23 15:53:45 +02:00
Amos Bird
076a67bdaa Consistent file management in CMake 2023-08-21 11:45:08 +08:00
Amos Bird
c43bf153f5 Refactor 2023-08-18 15:38:46 +08:00
Amos Bird
dd0c71b32a Add error_exit_reaction 2023-08-18 15:38:46 +08:00
Amos Bird
476f3cedc1 Various reactions when executable stderr has data 2023-08-18 15:38:45 +08:00
Sergei Trifonov
771710b377 Merge branch 'master' into concurrency-control-controllable 2023-08-11 16:50:13 +02:00
Alexey Milovidov
aa757490bd Ditch tons of garbage 2023-08-09 02:19:02 +02:00
Han Fei
65dcd79eb0 fix mem leak in RegExpTreeDictionary 2023-08-08 14:58:18 +02:00
Sergei Trifonov
01196ac41f Merge branch 'master' into concurrency-control-controllable 2023-08-01 15:40:50 +02:00
xiebin
33e2cfcecb Merge branch 'master' into master 2023-07-30 12:20:54 +08:00
Yakov Olkhovskiy
9a1c59a2f1 Merge branch 'master' into fix-ip-dict 2023-07-26 12:08:49 -04:00
Alexey Milovidov
21382afa2b Check for punctuation 2023-07-25 06:10:04 +02:00
Nikita Mikhaylov
ee0bbc0e54 Merge branch 'master' into headers-blacklist 2023-07-17 19:08:52 +02:00
Yakov Olkhovskiy
e95d413d9a Merge branch 'master' into fix-ip-dict 2023-07-14 09:11:42 -04:00
Dmitry Kardymon
385a210fee Merge remote-tracking branch 'origin/master' into ADQM-870 2023-07-10 13:19:21 +00:00
robot-clickhouse
1343e5cc45 Merge pull request #51853 from kitaisreal/cache-dictionary-request-only-unique-keys-from-source
CacheDictionary request only unique keys from source
2023-07-08 20:58:16 +02:00
Maksim Kita
8266067e1a Fixed style check 2023-07-07 19:09:55 +03:00
Maksim Kita
23bd23802f CacheDictionary request only unique keys from source 2023-07-07 12:26:15 +03:00
Nikolay Degterinsky
e98d136243 Merge branch 'master' into headers-blacklist 2023-07-07 04:44:06 +02:00
Kseniia Sumarokova
e97e107bcc Merge branch 'master' into add-separate-access-for-use-named-collections 2023-07-06 12:16:53 +02:00
Alexey Milovidov
2c96580a77 Merge branch 'master' into concurrency-control-controllable 2023-07-04 23:16:04 +03:00
Dmitry Kardymon
ab4142eb8f Merge remote-tracking branch 'clickhouse/master' into ADQM-870 2023-07-04 08:23:31 +03:00
Yakov Olkhovskiy
0529772dd8 support IPv4 and IPv6 as dictionary attributes 2023-07-04 02:19:45 +00:00
Nikolay Degterinsky
82e0237e67 Merge branch 'master' into headers-blacklist 2023-07-03 16:54:50 +02:00
kssenii
ac77f5fe6f Merge remote-tracking branch 'upstream/master' into add-separate-access-for-use-named-collections 2023-07-03 13:55:45 +02:00
Robert Schulze
fe49e98455 Follow-up to re2 update 2023-06-02 (#50949) 2023-07-03 08:28:25 +00:00
Nikolay Degterinsky
8dfa773f44 Merge branch 'master' into headers-blacklist 2023-06-30 23:40:17 +02:00
Sema Checherinda
d0d12bbf3b Merge branch 'master' into no-finalize-WriteBufferFromOStream 2023-06-30 12:15:17 +02:00
Robert Schulze
6872084051 Merge pull request #50949 from georgthegreat/update-re2
Update contrib/re2 to 2023-06-02
2023-06-30 10:40:17 +02:00
Sema Checherinda
2a1f34e3f9 Merge branch 'master' into no-finalize-WriteBufferFromOStream 2023-06-30 08:01:05 +02:00
Igor Nikonov
56354b7251 Fix yet another place 2023-06-28 16:55:22 +00:00
Igor Nikonov
0b19c1832a Fix: detach from thread group 2023-06-28 14:15:03 +00:00
Sema Checherinda
fe97021929 add missing finalize calls in buffers 2023-06-27 16:54:14 +02:00
Yuriy Chernyshov
3e6654a1fe Merge branch 'master' into update-re2 2023-06-24 22:34:44 +02:00
Nikita Taranov
fb7d23f245 fix build 2023-06-22 23:54:25 +02:00
Anton Kozlov
0c440b9d6f Report loading status for executable dictionaries correctly 2023-06-22 10:28:13 +00:00
Nikolay Degterinsky
575a1a4907 Add header checks to HTTP dictionary source 2023-06-20 13:29:25 +00:00
Dmitry Kardymon
806176d88e Add input_format_csv_missing_as_default setting and tests 2023-06-15 11:23:08 +00:00
kssenii
25ae93bbf8 Merge remote-tracking branch 'upstream/master' into add-separate-access-for-use-named-collections 2023-06-14 13:33:56 +02:00