avogar
7508448275
Better
2022-07-15 16:23:56 +00:00
Kruglov Pavel
0867e5fc4b
Merge branch 'master' into multiStringAllPositions-non-const-needle
2022-07-15 14:26:24 +02:00
Robert Schulze
deda29b46b
Pass const StringRef by value, not by reference
...
See #39224
2022-07-15 11:34:56 +00:00
Kruglov Pavel
a944a92d4c
Merge pull request #39224 from Avogar/string-view-by-value
...
Pass const std::string_view by value, not by reference
2022-07-15 13:05:56 +02:00
avogar
9291d33080
Pass const std::string_view & by value, not by reference
2022-07-14 16:11:57 +00:00
Maksim Kita
f5bacedaf9
Merge pull request #38553 from hexiaoting/mapupdate_dev
...
Fix bug for mapUpdate
2022-07-14 17:59:37 +02:00
Robert Schulze
ac5a06d944
Update doxygen
2022-07-14 11:02:01 +00:00
Kruglov Pavel
6d85dcd8a8
Update src/Functions/isNotNull.cpp
...
Co-authored-by: Igor Nikonov <954088+devcrafter@users.noreply.github.com>
2022-07-14 12:51:50 +02:00
Robert Schulze
198abad284
Disallow const haystack with non-const needle argument
2022-07-14 08:16:01 +00:00
Robert Schulze
6ec4f3cf3d
Implement non-const needle arguments in multiSearchAllPositions
2022-07-14 06:24:28 +00:00
avogar
390b1ac2f7
Improve isNullable/isConstant/isNull/isNotNull performance for LowCardinality argument
2022-07-13 17:56:34 +00:00
mergify[bot]
79e6f85412
Merge branch 'master' into mapupdate_dev
2022-07-13 17:06:51 +00:00
Nikolay Degterinsky
e3a79520c1
Merge remote-tracking branch 'upstream/master' into translate
2022-07-13 11:42:40 +00:00
Nikolay Degterinsky
c344e2de84
Fix style
2022-07-13 11:41:08 +00:00
Robert Schulze
04c1cb207b
Use fmt-based exceptions in FunctionsMultiStringPosition
...
(for consistency with Functions/FunctionsMultiStringSearch.h)
2022-07-13 09:46:40 +00:00
Robert Schulze
59a64ac902
Rename some variables
...
(for consistency with Functions/FunctionsMultiStringSearch.h)
2022-07-13 09:46:39 +00:00
Robert Schulze
f8c38b0d3a
Remove obsolete doxygen
...
(the removed functions are implemented elsewhere)
2022-07-13 09:46:38 +00:00
Nikolay Degterinsky
daae55b4ac
Better
2022-07-13 01:41:27 +00:00
Anton Popov
81da0bb9f5
keep LowCardinality type in array() and map() functions
2022-07-12 13:31:00 +00:00
Andrey Zvonov
7bfe155f09
Merge branch 'master' into dt64_timeslots
2022-07-12 10:26:38 +03:00
Anton Popov
72fe4ce680
fix function toColumnTypeName with LowCardinality
2022-07-12 03:12:42 +00:00
Anton Popov
b67405915e
keep LowCardinality type in tuple() function
2022-07-12 02:01:41 +00:00
Kruglov Pavel
57a719bafd
Merge pull request #39037 from amosbird/index-fix-1-again
...
Fix toHour() monotonicity which can lead to incorrect query result (incorrect index analysis) (second try)
2022-07-11 13:36:01 +02:00
Robert Schulze
5b8e448c7e
Merge pull request #39012 from ClickHouse/fix-crashing-stringsearch-with-empty-needle
...
Don't throw Logical error in functions multiMatch[Fuzzy](AllIndices/Any/AnyIndex)() with empty needle
2022-07-11 08:13:52 +02:00
Amos Bird
28aefc33a0
Revert "Merge pull request #39001 from ClickHouse/revert-38675-index-fix-1"
...
This reverts commit 1cf01bb959, reversing changes made to af1136c990.
2022-07-08 23:22:10 +08:00
Robert Schulze
8e1a3cd194
Don't crash in functions multiMatch[Fuzzy](AllIndices/Any/AnyIndex)() with empty needle
...
Queries like
"select multiMatchAnyIndex('abc', []::Array(String))"
were not properly handled and crashed.
2022-07-08 11:18:53 +00:00
Alexander Tokmakov
d4784203b7
Revert "Fix toHour() monotonicity which can lead to incorrect query result (incorrect index analysis)"
2022-07-08 12:51:30 +03:00
Robert Schulze
524f39551c
Merge pull request #38485 from ClickHouse/multi-match-with-non_const-patterns
...
Multi match with non const patterns
2022-07-08 09:29:10 +02:00
Alexey Milovidov
89fdcbf08c
Merge pull request #38675 from amosbird/index-fix-1
...
Fix toHour() monotonicity which can lead to incorrect query result (incorrect index analysis)
2022-07-08 00:32:27 +03:00
Robert Schulze
49348b833a
Simplify
2022-07-07 20:25:26 +00:00
Robert Schulze
1de5e9a7da
Avoid copying array elements
2022-07-07 12:33:34 +00:00
Nikolay Degterinsky
1869b7f408
Add functions translate & translateUTF8
2022-07-07 08:25:00 +00:00
Robert Schulze
dec184f61d
Add comment about const needle + const haystack
2022-07-06 17:41:15 +00:00
Robert Schulze
8a18705729
More variable renamings for more uniformity
2022-07-06 14:55:02 +00:00
Robert Schulze
0b0d64c5ca
Don't resize output vector in each loop iteration
2022-07-06 14:45:22 +00:00
Robert Schulze
144c1edb03
Inherit default implementation of getArgumentsThatAreAlwaysConstant()
2022-07-06 14:42:19 +00:00
mergify[bot]
8cc2c3914b
Merge branch 'master' into isnullable
2022-07-06 14:40:01 +00:00
Robert Schulze
6a907b23fb
Replace typeid_cast() with checkAndGetColumnConst()
...
... syntactic sugar
2022-07-06 14:38:28 +00:00
Robert Schulze
0c4da85e75
More uniform naming
2022-07-06 14:29:42 +00:00
mergify[bot]
e5535f5ab4
Merge branch 'master' into mapupdate_dev
2022-07-06 13:54:36 +00:00
Nikolai Kochetov
020c99a269
Merge pull request #38617 from azat/contrib-debug-symbols
...
Add separate option to omit symbols from heavy contrib
2022-07-06 14:40:24 +02:00
Robert Schulze
d0b2f13f9d
Fix style check
2022-07-05 13:41:52 +02:00
lokax
e6bd0105b1
feat(Function): isNullable
...
Signed-off-by: lokax <m632656684@gmail.com>
2022-07-05 15:51:53 +08:00
lokax
849c46e6fa
feat(Function): isNullable
2022-07-05 15:23:07 +08:00
mergify[bot]
f14b62b2d6
Merge branch 'master' into index-fix-1
2022-07-04 22:50:36 +00:00
Robert Schulze
1eed72b525
Make more multi-search methods work with non-const needles
...
After making function multi[Fuzzy]Match(Any|AnyIndex|AllIndices)() work
with non-const needles, 12 more functions started to fail in test
"00233_position_function_family":
multiSearchAny()
multiSearchAnyCaseInsensitive()
multiSearchAnyUTF8()
multiSearchAnyCaseInsensitiveUTF8()
multiSearchFirstPosition()
multiSearchFirstPositionCaseInsensitive()
multiSearchFirstPositionUTF8()
multiSearchFirstPositionCaseInsensitiveUTF8()
multiSearchFirstIndex()
multiSearchFirstIndexCaseInsensitive()
multiSearchFirstIndexUTF8()
multiSearchFirstIndexCaseInsensitiveUTF8()
Failing queries take the form
select 0 = multiSearchAny('\0', CAST([], 'Array(String)'));
2022-07-04 14:00:21 +00:00
Robert Schulze
ece61f6da3
Fix davenger's review comments
...
https://github.com/ClickHouse/ClickHouse/pull/38434#discussion_r907397214
https://github.com/ClickHouse/ClickHouse/pull/38434#discussion_r907385290
https://github.com/ClickHouse/ClickHouse/pull/38434#discussion_r907406097
(the latter is no longer relevant as the affected places were removed in
the meantime)
2022-07-04 10:43:21 +00:00
Robert Schulze
d547aa7849
Allow non-const pattern array argument in multi[Fuzzy]Match*()
...
Resolves #38046
2022-07-04 10:43:16 +00:00
Alexander Gololobov
8ce8158f7f
Do computations in Float32 (not Float64) for arrays of Float32
2022-07-03 10:33:11 +02:00
Alexander Gololobov
c6691cc5f2
Improved vectorized execution of main loop for array norm/distance
2022-07-02 22:45:22 +02:00
mergify[bot]
b016be264c
Merge branch 'master' into squared_l2
2022-07-02 09:17:28 +00:00
Azat Khuzhin
e8f5cd3c68
Add separate option to omit symbols from heavy contrib
...
Sometimes it is useful to build contrib with debug symbols for further
debugging.
With everything turned ON (i.e. debug build) I got 3.3GB vs 3.0GB w/o
this patch, 9% bloat. Thoughts on whether this is OK for you? If not,
STRIP_DEBUG_SYMBOLS_HEAVY_CONTRIB can be OFF by default (regardless
of build type).
P.S. aws debug symbols adds just 1.7%.
v2: rename STRIP_HEAVY_DEBUG_SYMBOLS
v3: OMIT_HEAVY_DEBUG_SYMBOLS
v4: documentation had been removed
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-07-02 06:32:03 +03:00
Robert Schulze
2a1ede0f5a
Merge pull request #38589 from ClickHouse/fix-zero-bytes-in-haystack
...
Fix countSubstrings() & position() on patterns with 0-bytes
2022-07-01 16:15:43 +02:00
Amos Bird
84a407f381
Fix toHour monotonicity
2022-07-01 18:24:24 +08:00
Alexander Gololobov
b2b31103c5
Reuse common code for L2Squared and L2
2022-06-30 14:12:25 +02:00
Azat Khuzhin
a47355877e
Add revision() function (#38555)
...
It can be useful to match versions, since in some tables
(system.trace_log) there is only revision column.
P.S. came to this while digging into stress reports from CI.
P.P.S. case insensitive by analogy with version().
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-06-30 12:58:26 +02:00
Robert Schulze
81bb2242fd
Fix countSubstrings() & position() on patterns with 0-bytes
...
SQL functions countSubstrings(), countSubstringsCaseInsensitive(),
countSubstringsUTF8(), position(), positionCaseInsensitive(),
positionUTF8() with non-const pattern argument use fallback searchers
LibCASCIICaseSensitiveStringSearcher and LibCASCIICaseInsensitiveStringSearcher
which call ::strstr(), resp. ::strcasestr(). These functions assume that
the haystack is 0-terminated and they even document that. However, the
callers did not check if the haystack contains a 0-byte (perhaps because
it's sort of expensive). As a consequence, if the haystack contained a
zero byte in its payload, matches behind this zero byte were ignored.
create table t (id UInt32, pattern String) engine = MergeTree() order by id;
insert into t values (1, 'x');
select countSubstrings('aaaxxxaa\0xxx', pattern) from t;
We returned 3 before this commit, now we return 6
2022-06-29 21:41:18 +00:00
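The 0-byte problem described in the countSubstrings() & position() commit above can be reproduced outside ClickHouse. The following is a minimal sketch, not the actual fix from that commit: `countWithStrstr` and `countLengthAware` are hypothetical helper names, and the length-aware variant uses `std::string_view::find` purely for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical helper: counts non-overlapping matches using ::strstr().
// strstr() treats the first 0-byte as the end of the haystack, so anything
// stored behind an embedded 0-byte is never inspected.
std::size_t countWithStrstr(const char * haystack, const char * needle)
{
    const std::size_t needle_size = std::strlen(needle);
    if (needle_size == 0)
        return 0;

    std::size_t count = 0;
    for (const char * pos = std::strstr(haystack, needle); pos != nullptr;
         pos = std::strstr(pos + needle_size, needle))
        ++count;
    return count;
}

// Hypothetical helper: counts non-overlapping matches using the explicit
// length carried by std::string_view, so embedded 0-bytes are ordinary bytes.
std::size_t countLengthAware(std::string_view haystack, std::string_view needle)
{
    if (needle.empty())
        return 0;

    std::size_t count = 0;
    for (std::size_t pos = haystack.find(needle); pos != std::string_view::npos;
         pos = haystack.find(needle, pos + needle.size()))
        ++count;
    return count;
}
```

With the haystack from the commit message (`"aaaxxxaa\0xxx"`, 12 bytes of payload) and the needle `"x"`, the strstr-based count stops at the embedded 0-byte while the length-aware count covers the whole payload.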
hexiaoting
e32a0838d1
fix bug for mapUpdate
2022-06-29 15:52:08 +08:00
Julian Gilyadov
d0d72e0b5e
Implement L2Squared Distance and Norm
2022-06-28 15:28:58 -04:00
Miel Donkers
4e9f396a48
Small improvement of the error message to hint at possible issue (#38458)
2022-06-28 13:36:30 +02:00
Robert Schulze
959cbaab02
Move loop over patterns into implementations
...
- This is preparation for non-const regexp arguments, where this loop
will run for each row.
2022-06-26 16:26:13 +00:00
Robert Schulze
cb5d1a4a85
Fix style check
2022-06-26 16:25:49 +00:00
Robert Schulze
b8f67185bf
Cosmetics: Whitespaces
2022-06-26 16:25:49 +00:00
Robert Schulze
e2b11899a1
Move check if cfg allows hyperscan into implementations
...
- This is not needed for non-const regexp array arguments but cleans up
the code and runs the check only in functions which actually use
hyperscan.
2022-06-26 16:25:49 +00:00
Robert Schulze
c2cea38b97
Move local variable into if statement
2022-06-26 16:25:49 +00:00
Robert Schulze
c9ce0efa66
Instantiate MultiMatchAnyImpl template using enums
...
- With this, invalid combinations of the FindAny/FindAnyIndex bools are
no longer possible and we can remove the corresponding check
- Also makes the instantiations more readable.
2022-06-26 16:25:49 +00:00
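The idea behind the enum-based instantiation in the commit above can be sketched as follows. This is an illustration under assumptions, not the real MultiMatchAnyImpl: the names `MultiMatchMode` and `MultiMatchSketch` are hypothetical, and plain substring search stands in for regular-expression matching. The point is that a single enum template parameter has no invalid value, whereas two independent bool parameters would allow a combination such as "both true" that needs a runtime or compile-time check.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical enum: each enumerator is a valid mode, so there is no
// "FindAny and FindAnyIndex both set" state left to reject.
enum class MultiMatchMode
{
    Any,      // result is 1 if any pattern matches, else 0
    AnyIndex  // result is the 1-based index of a matching pattern, else 0
};

template <MultiMatchMode mode>
struct MultiMatchSketch
{
    // Substring search is used here only as a stand-in for regexp matching.
    static std::size_t apply(std::string_view haystack, const std::vector<std::string_view> & patterns)
    {
        for (std::size_t i = 0; i < patterns.size(); ++i)
        {
            if (haystack.find(patterns[i]) != std::string_view::npos)
            {
                if constexpr (mode == MultiMatchMode::Any)
                    return 1;
                else
                    return i + 1;
            }
        }
        return 0;
    }
};
```

An instantiation then reads as `MultiMatchSketch<MultiMatchMode::AnyIndex>` rather than a pair of positional `true`/`false` arguments, which is the readability gain the commit message mentions.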
Robert Schulze
2f15d45f27
Move check for regexp array size into implementations
...
- This is not needed for non-const regexp array arguments (the
cardinality of arrays is fixed per column) but it cleans up the code
and runs the check only in functions which have restrictions on the
number of patterns.
- For functions using hyperscan, it was checked that the number of
regexes is < 2^32. Removed the check because I don't think anyone will
ever specify 4 billion patterns.
2022-06-26 16:25:43 +00:00
Robert Schulze
3478db9fb6
Move check for regexp array size into implementations
...
- This is not needed for non-const regexp array arguments (the
cardinality of arrays is fixed per column) but it cleans up the code
and runs the check only in functions which have restrictions on the
number of patterns.
- For functions using hyperscan, it was checked that the number of
regexes is < 2^32. Removed the check because I don't think anyone will
ever specify 4 billion patterns.
2022-06-26 15:38:12 +00:00
Robert Schulze
7913edc172
Move check for hyperscan regexp constraints into implementations
...
- This is preparation for non-const regexp arguments, where this check
will run for each row.
2022-06-26 15:38:05 +00:00
Robert Schulze
89bfdd50bf
Remove unnecessary check
...
- getReturnTypeImpl() ensures that the haystack column has type "String"
and we can simply assert that.
2022-06-26 15:34:24 +00:00
Robert Schulze
580d89477f
Minimally faster performance
2022-06-26 15:34:22 +00:00
Robert Schulze
4bc59c18e3
Cosmetics: Move some code around + docs + whitespaces + minor stuff
2022-06-26 15:34:15 +00:00
Robert Schulze
1273756911
Cosmetics: fmt-based exceptions
2022-06-26 15:33:18 +00:00
Robert Schulze
e5c74a14f7
Cosmetics: More consistent naming
...
- rename utility function and file to "checkHyperscanRegexp"
2022-06-26 15:33:18 +00:00
Robert Schulze
072e0855a8
Cosmetics: Make member variables const
2022-06-26 15:32:26 +00:00
Robert Schulze
2ebfd01c2e
Cosmetics: Pull out settings variable
2022-06-26 15:32:23 +00:00
Robert Schulze
bb7c627964
Cosmetics: Pass patterns around as std::string_view instead of StringRef
...
- The patterns are not used in hashing, there should not be a performance
impact when we use stuff from the standard library instead.
- added forgotten .reserve() in FunctionsMultiStringPosition.h
2022-06-26 15:32:19 +00:00
Alexey Milovidov
b3098822e0
Merge pull request #38171 from ClickHouse/hyper-to-vectorscan
...
Replace hyperscan by vectorscan
2022-06-26 10:01:45 +03:00
Alexey Milovidov
0654684bd4
Fix wrong implementation of filesystem* functions
2022-06-25 06:10:50 +02:00
Andrey Zvonov
ea73d9c492
Merge branch 'master' into zvonand-base58
2022-06-24 21:37:20 +03:00
Kruglov Pavel
0201d62090
Merge pull request #38173 from Avogar/fix-short-circuit
...
Fix bug with nested short-circuit functions
2022-06-24 16:04:17 +02:00
Robert Schulze
2c828338f4
Replace hyperscan by vectorscan
...
This commit migrates ClickHouse to Vectorscan. The first 10 min of
[0] explain the reasons for it.
(*) Addresses (but does not resolve) #38046
(*) Config parameter names (e.g. "max_hyperscan_regexp_length") are
preserved for compatibility. Likewise, error codes (e.g.
"ErrorCodes::HYPERSCAN_CANNOT_SCAN_TEXT") and function/class names (e.g.
"HyperscanDeleter") are preserved as vectorscan aims to be a drop-in
replacement.
[0] https://www.youtube.com/watch?v=KlZWmmflW6M
2022-06-24 10:47:52 +02:00
Andrey Zvonov
c18d09a617
Merge branch 'master' into zvonand-base58
2022-06-24 07:05:49 +03:00
zvonand
dd8203038f
updated exception handling
2022-06-24 00:36:57 +05:00
zvonand
a94c40e33a
Merge branch 'master' of github.com:ClickHouse/ClickHouse into dt64_timeslots
2022-06-23 17:08:28 +05:00
mergify[bot]
234f0c6399
Merge branch 'master' into revert-35914-FIPS_compliance
2022-06-23 12:06:17 +00:00
zvonand
946117ec89
Merge branch 'master' of github.com:ClickHouse/ClickHouse into zvonand-base58
2022-06-23 17:04:40 +05:00
Alexey Milovidov
5855668514
Remove trash
2022-06-22 06:23:35 +02:00
mergify[bot]
bb79eb73e6
Merge branch 'master' into fix-short-circuit
2022-06-21 10:40:07 +00:00
Kruglov Pavel
b9b58b4305
Merge pull request #37759 from Avogar/fix-nothing-error
...
Fix possible logical error with type Nothing in some functions
2022-06-21 12:35:05 +02:00
Larry Luo
bbd73ba727
use utility methods to access x509 struct fields.
2022-06-20 21:27:33 -04:00
zvonand
22af00b757
rename variable + fix handling of ENABLE_LIBRARIES
2022-06-20 23:53:47 +05:00
mergify[bot]
b440ee84ae
Merge branch 'master' into fix-short-circuit
2022-06-20 15:14:19 +00:00
zvonand
d4e5686b99
minor: fix message for base64
2022-06-20 20:13:09 +05:00
zvonand
78d55d6f46
small fixes
2022-06-20 19:30:54 +05:00
zvonand
832fd6e0a9
Added tests + minor updates
2022-06-19 23:10:28 +05:00
Alexey Milovidov
0cf88e0950
Revert "ClickHouse's boringssl module updated to the official version of the FIPS compliant."
2022-06-18 23:16:18 +03:00
zvonand
f4b3af091d
fix zero byte
2022-06-17 23:48:14 +05:00
avogar
23f48a9fb9
Fix bug with nested short-circuit functions
2022-06-17 11:44:49 +00:00
Andrey Zvonov
f987f461e5
fix style -- rm unused ErrorCode
2022-06-17 15:00:32 +05:00