Commit Graph

3908 Commits

Author SHA1 Message Date
Anton Popov
f94d4d4877
Merge branch 'master' into hash-functions-map 2022-08-02 13:26:54 +02:00
Robert Schulze
bf574b9154
Merge pull request #39760 from ClickHouse/bit-fiddling
Use std::popcount, ::countl_zero, ::countr_zero functions
2022-08-01 17:04:51 +02:00
Kruglov Pavel
dfdfabec94
Merge pull request #39218 from evillique/file_default_value
Add default argument to the function `file`
2022-08-01 13:04:19 +02:00
Nikolai Kochetov
22fbfe19a4 Merge branch 'master' into use-dag-in-key-condition 2022-07-31 21:54:12 +02:00
Robert Schulze
a7734672b9
Use std::popcount, ::countl_zero, ::countr_zero functions
- Introduced with the C++20 <bit> header

- The problem with __builtin_c(l|t)z() is that 0 as input has an
  undefined result (*) and the code did not always check. The std::
  versions do not have this issue.

- In some cases, we continue to use __builtin_c(l|t)z() (e.g. in
  src/Common/BitHelpers.h) because the std:: versions only accept
  unsigned inputs (and they also check that) and the casting would be
  ugly.

(*) https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
2022-07-31 15:16:51 +00:00
Robert Schulze
4d7627e45e
Fix include 2022-07-31 07:36:40 +00:00
Robert Schulze
8ca236de08
Enable SQL function getOSKernelVersion() on all platforms
Follow up to PR #38615
2022-07-30 22:36:47 +00:00
Robert Schulze
85773e0926
Merge pull request #38615 from liyinsg/simplified_function_registration_interface
Simplified function registration interface
2022-07-31 00:18:37 +02:00
Robert Schulze
b52843d5fd
Merge pull request #37951 from zvonand/dt64_timeslots
Fix timeSlots for DateTime64
2022-07-30 20:49:05 +02:00
Alexey Milovidov
8f348edbbd
Merge pull request #39000 from ClickHouse/avx-enablement
Avx enablement
2022-07-30 04:51:07 +03:00
Arthur Passos
f0f19874da fix style 2022-07-29 13:56:01 -03:00
Arthur Passos
59fbd21024 Unwrap LC column in IExecutablefunction::executeWithoutSparseColumns 2022-07-29 12:03:09 -03:00
Nikolai Kochetov
59a11b32ad
Merge branch 'master' into use-dag-in-key-condition 2022-07-29 17:01:33 +02:00
zvonand
b390bcfe7c fix wrong data type cast 2022-07-29 13:25:40 +03:00
Li Yin
4088c0a7f3 Automated function registration
Automatically register all functions that follow the naming convention
below by iterating through the symbols:
void DB::registerXXX(DB::FunctionFactory &)
2022-07-29 15:39:50 +08:00
Anton Popov
45da56d802 support hash functions with Map type 2022-07-28 19:15:19 +00:00
Alexander Gololobov
9525bd19bf
Merge pull request #39592 from ClickHouse/fix-wrong-regexp-replace
Fix wrong regexp replace
2022-07-27 09:54:28 +02:00
Antonio Andelic
904a05ac21
Merge pull request #39496 from azat/custom-tld-exclamation-asterisk
Add support of !/* (exclamation/asterisk) in custom TLDs
2022-07-27 08:55:49 +02:00
Vladimir C
d9e8e9b948
Merge pull request #39552 from filimonov/maxsplit-bug
Fix bug with maxsplit in the splitByChar
2022-07-26 11:14:27 +02:00
Alexey Milovidov
833b24b486 Fix the wrong REGEXP_REPLACE alias 2022-07-26 08:01:49 +02:00
Azat Khuzhin
1d4a7c7290 Add support of !/* (exclamation/asterisk) in custom TLDs
Public suffix list may contain special characters (you may find format
here - [1]):
- asterisk (*)
- exclamation mark (!)

  [1]: https://github.com/publicsuffix/list/wiki/Format

It is easier to describe how it should be interpreted with examples.

Consider the following part of the list:

    *.sch.uk
    *.kawasaki.jp
    !city.kawasaki.jp

And here are the results for `cutToFirstSignificantSubdomainCustom()`:

If you have only asterisk (*):

    foo.something.sheffield.sch.uk -> something.sheffield.sch.uk
    sheffield.sch.uk               -> sheffield.sch.uk

If you have exclamation mark (!) too:

    foo.kawasaki.jp                -> foo.kawasaki.jp
    foo.foo.kawasaki.jp            -> foo.foo.kawasaki.jp
    city.kawasaki.jp               -> city.kawasaki.jp
    some.city.kawasaki.jp          -> city.kawasaki.jp

TLDs have been verified with the following script [2] to match the
python publicsuffix2 module.

  [2]: https://gist.github.com/azat/c1a7a9f1e3519793134ef4b1df5461a6

v2: fix StringHashTable padding requirements
Fixes: #39468
Follow-up for: #17748
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-07-26 08:34:30 +03:00
Roman Vasin
b462366415 Merge branch 'master' of github.com:ClickHouse/ClickHouse into ADQM-485 2022-07-25 17:55:47 +00:00
Roman Vasin
b412ea5f6d Improve generateRandom() for Date32; fix tests 01087_table_function_generate, 01277_fromUnixTimestamp64, 01691_DateTime64_clamp and 01702_toDateTime_from_string_clamping 2022-07-25 17:06:11 +00:00
Nikolai Kochetov
b70be40804
Merge branch 'master' into use-dag-in-key-condition 2022-07-25 14:30:22 +02:00
Alexander Gololobov
dbcb7e5f1e Fix for empty function name on FreeBSD build 2022-07-25 13:11:36 +02:00
Mikhail Filimonov
33ee858d18
Fix bug with maxsplit in the splitByChar 2022-07-25 13:11:02 +02:00
Alexey Milovidov
6fdcb009ff
Merge pull request #39533 from ClickHouse/now-in-block
Add function `nowInBlock`
2022-07-25 04:22:11 +03:00
Alexey Milovidov
cff712970e Add function nowInBlock 2022-07-24 19:58:48 +02:00
Robert Schulze
c788e05c77
Merge pull request #39292 from zvonand/zvonand-b58-datatype
Simplify Base58 encoding/decoding
2022-07-24 18:09:40 +02:00
Kruglov Pavel
f79924f270
Merge branch 'master' into file_default_value 2022-07-22 21:17:37 +02:00
Robert Schulze
6a631426b7
Disable vectorization for uint64 --> float32 cast 2022-07-22 12:49:16 +00:00
Robert Schulze
cad0e7a62c
Move hot loop inside nested if/else statements
- required to control the vectorization of a specific cast which
  annotates the loop using pragmas
2022-07-22 11:53:23 +00:00
Alexander Tokmakov
bed2206ae9
Merge pull request #39460 from ClickHouse/remove_some_dead_and_commented_code
Remove some dead and commented code
2022-07-22 13:24:34 +03:00
Robert Schulze
ea0a3bf600
Merge branch 'master' into stringref-to-string_view 2022-07-21 18:33:06 +02:00
Kruglov Pavel
f4f05ec786
Merge pull request #39447 from Avogar/better-parse-time-delta
Extend time units in parseTimeDelta function
2022-07-21 17:39:40 +02:00
Andrey Zvonov
e473606cd1
Merge branch 'master' into zvonand-b58-datatype 2022-07-21 15:55:05 +02:00
Alexander Tokmakov
a8da5d96fc remove some dead and commented code 2022-07-21 15:05:48 +02:00
avogar
d63786e709 Better 2022-07-21 09:07:23 +00:00
avogar
49d980d16c Extend time units in parseTimeDelta function 2022-07-21 08:55:51 +00:00
Roman Vasin
e3192cf753 Correct docs to reflect new range 1900..2299 for Date32 and DateTime64; Cleanup code 2022-07-20 15:19:02 +00:00
lokax
22792d1c6a fix style
Signed-off-by: lokax <m632656684@gmail.com>
2022-07-20 21:20:00 +08:00
lokax
140f5e6685 recursive check array offsets
Signed-off-by: lokax <m632656684@gmail.com>
2022-07-20 21:20:00 +08:00
lokax
d40f04b860 fix function name
Signed-off-by: lokax <m632656684@gmail.com>
2022-07-20 21:20:00 +08:00
lokax
ff433c1c01 fix build
Signed-off-by: lokax <m632656684@gmail.com>
2022-07-20 21:20:00 +08:00
lokax
647eafa00e support for array type 2022-07-20 21:20:00 +08:00
lokax
5c6b18a9bd fix: 3rd parameter must be constant 2022-07-20 21:20:00 +08:00
lokax
6e23d2cb85 feat(function): tupleElement with default value 2022-07-20 21:20:00 +08:00
Nikolay Degterinsky
c9ae525da7
Merge branch 'master' into file_default_value 2022-07-20 14:22:58 +02:00
zvonand
11b8d788ca small improvements 2022-07-20 13:36:47 +02:00
zvonand
592499e290 fix server crash on negative size 2022-07-19 23:11:03 +02:00
Nikolai Kochetov
f570cde815 Fixing build. 2022-07-19 20:19:57 +00:00
Roman Vasin
5f9c293963 Fix addDays() and addWeeks() in upper and lower limits of Date and Date32 2022-07-19 17:29:08 +00:00
Nikolai Kochetov
eaeb30a71a Merge branch 'master' into use-dag-in-key-condition 2022-07-19 18:39:52 +02:00
Robert Schulze
9d0e3d107b
Update src/Functions/formatReadableTimeDelta.cpp
Co-authored-by: Nikolay Degterinsky <43110995+evillique@users.noreply.github.com>
2022-07-19 13:17:33 +02:00
alesapin
bdc09c8319
Merge pull request #39303 from ClickHouse/whitespaces
Whitespaces
2022-07-19 13:17:29 +02:00
Kruglov Pavel
12221cffc9
Merge pull request #39071 from jiahui-97/parse_timedelta
implementation of parseTimeDelta function
2022-07-19 12:39:56 +02:00
zvonand
d245d6349a fixed zero check 2022-07-19 12:08:08 +02:00
Nikolay Degterinsky
4ae356f218 Fix NULL, add test 2022-07-19 09:15:42 +00:00
Robert Schulze
81ef1099cc
Even less usage of StringRef
--> see #39300
2022-07-19 07:01:06 +00:00
jiahui-97
e7af88b688 implementation of parseTimeDelta function
Co-authored-by: Kruglov Pavel <48961922+Avogar@users.noreply.github.com>
2022-07-19 09:33:02 +08:00
Robert Schulze
6df3c9d799
Merge pull request #39300 from ClickHouse/stringref-to-stringview
First try at reducing the use of StringRef
2022-07-18 19:20:16 +02:00
Robert Schulze
74e55e42f6
Merge pull request #39167 from ClickHouse/multiStringAllPositions-non-const-needle
multiStringAllPositions() with non-const needle
2022-07-18 15:30:14 +02:00
Kruglov Pavel
85f8b5990f
Merge pull request #39192 from Avogar/improve-is-nullable
Improve isNullable/isConstant/isNull/isNotNull performance for LowCardinality argument
2022-07-18 12:33:34 +02:00
Robert Schulze
3b0ca82d56
Fix build, pt. II 2022-07-18 09:41:01 +00:00
Alexey Milovidov
03aec06da1 Whitespaces 2022-07-17 23:20:05 +02:00
zvonand
18827ab312 Merge branch 'master' of github.com:ClickHouse/ClickHouse into dt64_timeslots 2022-07-17 22:59:18 +02:00
Robert Schulze
13482af4ee
First try at reducing the use of StringRef
- to be replaced by std::string_view
- suggested in #39262
2022-07-17 17:26:02 +00:00
Andrey Zvonov
e0d1954fac
Merge branch 'master' into dt64_timeslots 2022-07-16 22:33:46 +02:00
zvonand
728219e640 updated due to review 2022-07-16 22:16:19 +02:00
zvonand
d07a652883 merge 2022-07-16 19:05:07 +02:00
zvonand
4ab52b6873 added new DataType + fixes 2022-07-16 18:58:47 +02:00
avogar
7508448275 Better 2022-07-15 16:23:56 +00:00
Kruglov Pavel
0867e5fc4b
Merge branch 'master' into multiStringAllPositions-non-const-needle 2022-07-15 14:26:24 +02:00
Robert Schulze
deda29b46b
Pass const StringRef by value, not by reference
See #39224
2022-07-15 11:34:56 +00:00
Kruglov Pavel
a944a92d4c
Merge pull request #39224 from Avogar/string-view-by-value
Pass const std::string_view by value, not by reference
2022-07-15 13:05:56 +02:00
Roman Vasin
1d0818d9cf Set max year to 2299; Code cleanup; Make working 02245_make_datetime64 test 2022-07-15 10:33:52 +00:00
Nikolay Degterinsky
6f5275d9bb Fix build 2022-07-15 10:08:14 +00:00
Roman Vasin
266039ea64 Correct gTests for DateLUT 2022-07-14 19:00:17 +00:00
avogar
9291d33080 Pass const std::string_view & by value, not by reference 2022-07-14 16:11:57 +00:00
Maksim Kita
f5bacedaf9
Merge pull request #38553 from hexiaoting/mapupdate_dev
Fix bug for mapUpdate
2022-07-14 17:59:37 +02:00
Robert Schulze
ac5a06d944
Update doxygen 2022-07-14 11:02:01 +00:00
Kruglov Pavel
6d85dcd8a8
Update src/Functions/isNotNull.cpp
Co-authored-by: Igor Nikonov <954088+devcrafter@users.noreply.github.com>
2022-07-14 12:51:50 +02:00
Nikolay Degterinsky
9ca2935d72 Add default argument to function file 2022-07-14 09:54:44 +00:00
Robert Schulze
198abad284
Disallow const haystack with non-const needle argument 2022-07-14 08:16:01 +00:00
Robert Schulze
6ec4f3cf3d
Implement non-const needle arguments in multiSearchAllPositions 2022-07-14 06:24:28 +00:00
avogar
390b1ac2f7 Improve isNullable/isConstant/isNull/isNotNull performance for LowCardinality argument 2022-07-13 17:56:34 +00:00
mergify[bot]
79e6f85412
Merge branch 'master' into mapupdate_dev 2022-07-13 17:06:51 +00:00
Nikolay Degterinsky
e3a79520c1 Merge remote-tracking branch 'upstream/master' into translate 2022-07-13 11:42:40 +00:00
Nikolay Degterinsky
c344e2de84 Fix style 2022-07-13 11:41:08 +00:00
Robert Schulze
04c1cb207b
Use fmt-based exceptions in FunctionsMultiStringPosition
(for consistency with Functions/FunctionsMultiStringSearch.h)
2022-07-13 09:46:40 +00:00
Robert Schulze
59a64ac902
Rename some variables
(for consistency with Functions/FunctionsMultiStringSearch.h)
2022-07-13 09:46:39 +00:00
Robert Schulze
f8c38b0d3a
Remove obsolete doxygen
(the removed functions are implemented elsewhere)
2022-07-13 09:46:38 +00:00
Nikolay Degterinsky
daae55b4ac Better 2022-07-13 01:41:27 +00:00
Anton Popov
81da0bb9f5 keep LowCardinality type in array() and map() functions 2022-07-12 13:31:00 +00:00
Andrey Zvonov
7bfe155f09
Merge branch 'master' into dt64_timeslots 2022-07-12 10:26:38 +03:00
Anton Popov
72fe4ce680 fix function toColumnTypeName with LowCardinality 2022-07-12 03:12:42 +00:00
Anton Popov
b67405915e keep LowCardinality type in tuple() function 2022-07-12 02:01:41 +00:00
Kruglov Pavel
57a719bafd
Merge pull request #39037 from amosbird/index-fix-1-again
Fix toHour() monotonicity which can lead to incorrect query result (incorrect index analysis) (second try)
2022-07-11 13:36:01 +02:00
Robert Schulze
5b8e448c7e
Merge pull request #39012 from ClickHouse/fix-crashing-stringsearch-with-empty-needle
Don't throw Logical error in functions multiMatch[Fuzzy](AllIndices/Any/AnyIndex)() with empty needle
2022-07-11 08:13:52 +02:00
Amos Bird
28aefc33a0
Revert "Merge pull request #39001 from ClickHouse/revert-38675-index-fix-1"
This reverts commit 1cf01bb959, reversing
changes made to af1136c990.
2022-07-08 23:22:10 +08:00
Robert Schulze
8e1a3cd194
Don't crash in functions multiMatch[Fuzzy](AllIndices/Any/AnyIndex)() with empty needle
Queries like
  "select multiMatchAnyIndex('abc', []::Array(String))"
were not properly handled and crashed.
2022-07-08 11:18:53 +00:00
Alexander Tokmakov
d4784203b7
Revert "Fix toHour() monotonicity which can lead to incorrect query result (incorrect index analysis)" 2022-07-08 12:51:30 +03:00
Robert Schulze
524f39551c
Merge pull request #38485 from ClickHouse/multi-match-with-non_const-patterns
Multi match with non const patterns
2022-07-08 09:29:10 +02:00
Roman Vasin
12f4a48957 Extend LUT range to 1900..2300 2022-07-08 06:48:05 +00:00
Alexey Milovidov
89fdcbf08c
Merge pull request #38675 from amosbird/index-fix-1
Fix toHour() monotonicity which can lead to incorrect query result (incorrect index analysis)
2022-07-08 00:32:27 +03:00
Robert Schulze
49348b833a
Simplify 2022-07-07 20:25:26 +00:00
Robert Schulze
1de5e9a7da
Avoid copying array elements 2022-07-07 12:33:34 +00:00
Nikolay Degterinsky
1869b7f408 Add functions translate & translateUTF8 2022-07-07 08:25:00 +00:00
Robert Schulze
dec184f61d
Add comment about const needle + const haystack 2022-07-06 17:41:15 +00:00
Robert Schulze
8a18705729
More variable renamings for more uniformity 2022-07-06 14:55:02 +00:00
Robert Schulze
0b0d64c5ca
Don't resize output vector in each loop iteration 2022-07-06 14:45:22 +00:00
Robert Schulze
144c1edb03
Inherit default implementation of getArgumentsThatAreAlwaysConstant() 2022-07-06 14:42:19 +00:00
mergify[bot]
8cc2c3914b
Merge branch 'master' into isnullable 2022-07-06 14:40:01 +00:00
Robert Schulze
6a907b23fb
Replace typeid_cast() with checkAndGetColumnConst()
... syntactic sugar
2022-07-06 14:38:28 +00:00
Robert Schulze
0c4da85e75
More uniform naming 2022-07-06 14:29:42 +00:00
mergify[bot]
e5535f5ab4
Merge branch 'master' into mapupdate_dev 2022-07-06 13:54:36 +00:00
Nikolai Kochetov
020c99a269
Merge pull request #38617 from azat/contrib-debug-symbols
Add separate option to omit symbols from heavy contrib
2022-07-06 14:40:24 +02:00
Robert Schulze
d0b2f13f9d
Fix style check 2022-07-05 13:41:52 +02:00
lokax
e6bd0105b1 feat(Function): isNullable
Signed-off-by: lokax <m632656684@gmail.com>
2022-07-05 15:51:53 +08:00
lokax
849c46e6fa feat(Function): isNullable 2022-07-05 15:23:07 +08:00
mergify[bot]
f14b62b2d6
Merge branch 'master' into index-fix-1 2022-07-04 22:50:36 +00:00
Robert Schulze
1eed72b525
Make more multi-search methods work with non-const needles
After making function multi[Fuzzy]Match(Any|AnyIndex|AllIndices)() work
with non-const needles, 12 more functions started to fail in test
"00233_position_function_family":

multiSearchAny()
multiSearchAnyCaseInsensitive()
multiSearchAnyUTF8
multiSearchAnyCaseInsensitiveUTF8()

multiSearchFirstPosition()
multiSearchFirstPositionCaseInsensitive()
multiSearchFirstPositionUTF8()
multiSearchFirstPositionCaseInsensitiveUTF8()

multiSearchFirstIndex()
multiSearchFirstIndexCaseInsensitive()
multiSearchFirstIndexUTF8()
multiSearchFirstIndexCaseInsensitiveUTF8()

Failing queries take the form
  select 0 = multiSearchAny('\0', CAST([], 'Array(String)'));
2022-07-04 14:00:21 +00:00
Robert Schulze
ece61f6da3
Fix davenger's review comments
https://github.com/ClickHouse/ClickHouse/pull/38434#discussion_r907397214
https://github.com/ClickHouse/ClickHouse/pull/38434#discussion_r907385290
https://github.com/ClickHouse/ClickHouse/pull/38434#discussion_r907406097

(the latter is no longer relevant as the affected places were removed in
the meantime)
2022-07-04 10:43:21 +00:00
Robert Schulze
d547aa7849
Allow non-const pattern array argument in multi[Fuzzy]Match*()
Resolves #38046
2022-07-04 10:43:16 +00:00
Alexander Gololobov
8ce8158f7f Do computations in Float32 (not Float64) for arrays of Float32 2022-07-03 10:33:11 +02:00
Alexander Gololobov
c6691cc5f2 Improved vectorized execution of main loop for array norm/distance 2022-07-02 22:45:22 +02:00
mergify[bot]
b016be264c
Merge branch 'master' into squared_l2 2022-07-02 09:17:28 +00:00
Azat Khuzhin
e8f5cd3c68 Add separate option to omit symbols from heavy contrib
Sometimes it is useful to build contrib with debug symbols for further
debugging.

With everything turned ON (i.e. debug build) I got 3.3GB vs 3.0GB w/o
this patch, a 9% bloat. Is this OK for you or not? If not,
STRIP_DEBUG_SYMBOLS_HEAVY_CONTRIB can be OFF by default (regardless
of build type).

P.S. aws debug symbols add just 1.7%.
v2: rename STRIP_HEAVY_DEBUG_SYMBOLS
v3: OMIT_HEAVY_DEBUG_SYMBOLS
v4: documentation had been removed
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-07-02 06:32:03 +03:00
Robert Schulze
2a1ede0f5a
Merge pull request #38589 from ClickHouse/fix-zero-bytes-in-haystack
Fix countSubstrings() & position() on patterns with 0-bytes
2022-07-01 16:15:43 +02:00
Amos Bird
84a407f381
Fix toHour monotonicity 2022-07-01 18:24:24 +08:00
Alexander Gololobov
b2b31103c5 Reuse common code for L2Squared and L2 2022-06-30 14:12:25 +02:00
Azat Khuzhin
a47355877e
Add revision() function (#38555)
It can be useful to match versions, since in some tables
(system.trace_log) there is only revision column.

P.S. came to this while digging into stress reports from CI.
P.P.S. case insensitive by analogy with version().

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-06-30 12:58:26 +02:00
Robert Schulze
81bb2242fd
Fix countSubstrings() & position() on patterns with 0-bytes
SQL functions countSubstrings(), countSubstringsCaseInsensitive(),
countSubstringsUTF8(), position(), positionCaseInsensitive(),
positionUTF8() with non-const pattern argument use fallback searchers
LibCASCIICaseSensitiveStringSearcher and LibCASCIICaseInsensitiveStringSearcher
which call ::strstr(), resp. ::strcasestr(). These functions assume that
the haystack is 0-terminated and they even document that. However, the
callers did not check if the haystack contains a 0-byte (perhaps because
it's sort of expensive). As a consequence, if the haystack contained a
zero byte in its payload, matches behind this zero byte were ignored.

    create table t (id UInt32, pattern String) engine = MergeTree() order by id;
    insert into t values (1, 'x');
    select countSubstrings('aaaxxxaa\0xxx', pattern) from t;

We returned 3 before this commit, now we return 6
2022-06-29 21:41:18 +00:00
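The effect described in this commit message can be reproduced outside the database with a small sketch. Both function names are hypothetical, and the length-aware variant only illustrates the idea of searching by explicit length; it is not necessarily how the commit fixes the searchers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Counts non-overlapping matches with strstr(), which treats the first zero
// byte as the end of the haystack and never looks past it.
std::size_t countWithStrstr(const std::string & haystack, const std::string & needle)
{
    if (needle.empty())
        return 0;
    std::size_t count = 0;
    const char * pos = haystack.c_str();
    while ((pos = std::strstr(pos, needle.c_str())) != nullptr)
    {
        ++count;
        pos += needle.size();
    }
    return count;
}

// Counts non-overlapping matches using the explicit length of the haystack,
// so embedded zero bytes are ordinary payload.
std::size_t countWithLength(std::string_view haystack, std::string_view needle)
{
    if (needle.empty())
        return 0;
    std::size_t count = 0;
    std::size_t pos = 0;
    while ((pos = haystack.find(needle, pos)) != std::string_view::npos)
    {
        ++count;
        pos += needle.size();
    }
    return count;
}
```

For the haystack from the message, built as `std::string("aaaxxxaa\0xxx", 12)`, and the pattern `x`, the first function counts 3 matches and the second counts 6.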
hexiaoting
e32a0838d1 fix bug for mapUpdate 2022-06-29 15:52:08 +08:00
Julian Gilyadov
d0d72e0b5e
Implement L2Squared Distance and Norm 2022-06-28 15:28:58 -04:00
Miel Donkers
4e9f396a48
Small improvement of the error message to hint at possible issue (#38458) 2022-06-28 13:36:30 +02:00
Robert Schulze
959cbaab02
Move loop over patterns into implementations
- This is preparation for non-const regexp arguments, where this loop
  will run for each row.
2022-06-26 16:26:13 +00:00
Robert Schulze
cb5d1a4a85
Fix style check 2022-06-26 16:25:49 +00:00
Robert Schulze
b8f67185bf
Cosmetics: Whitespaces 2022-06-26 16:25:49 +00:00
Robert Schulze
e2b11899a1
Move check if cfg allows hyperscan into implementations
- This is not needed for non-const regexp array arguments but cleans up
  the code and runs the check only in functions which actually use
  hyperscan.
2022-06-26 16:25:49 +00:00
Robert Schulze
c2cea38b97
Move local variable into if statement 2022-06-26 16:25:49 +00:00
Robert Schulze
c9ce0efa66
Instantiate MultiMatchAnyImpl template using enums
- With this, invalid combinations of the FindAny/FindAnyIndex bools are
  no longer possible and we can remove the corresponding check

- Also makes the instantiations more readable.
2022-06-26 16:25:49 +00:00
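The idea of replacing a pair of bool template parameters with a single enum can be sketched as follows. `MultiMatchMode` and `MultiMatchSketch` are hypothetical names, and a plain substring search stands in for the regular expression matching of the real functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// One enum value per valid mode. With two bools (FindAny, FindAnyIndex) the
// combinations "both true" and "both false" would need a runtime check.
enum class MultiMatchMode
{
    Any,      // return 1 if any needle matches, else 0
    AnyIndex  // return the 1-based index of the first matching needle, else 0
};

template <MultiMatchMode mode>
struct MultiMatchSketch
{
    static std::size_t execute(std::string_view haystack, const std::vector<std::string_view> & needles)
    {
        for (std::size_t i = 0; i < needles.size(); ++i)
        {
            if (haystack.find(needles[i]) != std::string_view::npos)
            {
                if constexpr (mode == MultiMatchMode::Any)
                    return 1;
                else
                    return i + 1;
            }
        }
        return 0;
    }
};
```

An instantiation then reads `MultiMatchSketch<MultiMatchMode::AnyIndex>` instead of something like `<false, true>`, which is the readability gain the message mentions.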
Robert Schulze
2f15d45f27
Move check for regexp array size into implementations
- This is not needed for non-const regexp array arguments (the
  cardinality of arrays is fixed per column) but it cleans up the code
  and runs the check only in functions which have restrictions on the
  number of patterns.

- For functions using hyperscans, it was checked that the number of
  regexes is < 2^32. Removed the check because I don't think anyone will
  ever specify 4 billion patterns.
2022-06-26 16:25:43 +00:00
Robert Schulze
3478db9fb6
Move check for regexp array size into implementations
- This is not needed for non-const regexp array arguments (the
  cardinality of arrays is fixed per column) but it cleans up the code
  and runs the check only in functions which have restrictions on the
  number of patterns.

- For functions using hyperscans, it was checked that the number of
  regexes is < 2^32. Removed the check because I don't think anyone will
  ever specify 4 billion patterns.
2022-06-26 15:38:12 +00:00
Robert Schulze
7913edc172
Move check for hyperscan regexp constraints into implementations
- This is preparation for non-const regexp arguments, where this check
  will run for each row.
2022-06-26 15:38:05 +00:00
Robert Schulze
89bfdd50bf
Remove unnecessary check
- getReturnTypeImpl() ensures that the haystack column has type "String"
  and we can simply assert that.
2022-06-26 15:34:24 +00:00
Robert Schulze
580d89477f
Minimally faster performance 2022-06-26 15:34:22 +00:00
Robert Schulze
4bc59c18e3
Cosmetics: Move some code around + docs + whitespaces + minor stuff 2022-06-26 15:34:15 +00:00
Robert Schulze
1273756911
Cosmetics: fmt-based exceptions 2022-06-26 15:33:18 +00:00
Robert Schulze
e5c74a14f7
Cosmetics: More consistent naming
- rename utility function and file to "checkHyperscanRegexp"
2022-06-26 15:33:18 +00:00
Robert Schulze
072e0855a8
Cosmetics: Make member variables const 2022-06-26 15:32:26 +00:00
Robert Schulze
2ebfd01c2e
Cosmetics: Pull out settings variable 2022-06-26 15:32:23 +00:00
Robert Schulze
bb7c627964
Cosmetics: Pass patterns around as std::string_view instead of StringRef
- The patterns are not used in hashing, there should not be a performance
  impact when we use stuff from the standard library instead.

- added forgotten .reserve() in FunctionsMultiStringPosition.h
2022-06-26 15:32:19 +00:00
Alexey Milovidov
b3098822e0
Merge pull request #38171 from ClickHouse/hyper-to-vectorscan
Replace hyperscan by vectorscan
2022-06-26 10:01:45 +03:00
Alexey Milovidov
0654684bd4 Fix wrong implementation of filesystem* functions 2022-06-25 06:10:50 +02:00
Andrey Zvonov
ea73d9c492
Merge branch 'master' into zvonand-base58 2022-06-24 21:37:20 +03:00
Kruglov Pavel
0201d62090
Merge pull request #38173 from Avogar/fix-short-circuit
Fix bug with nested short-circuit functions
2022-06-24 16:04:17 +02:00
Robert Schulze
2c828338f4
Replace hyperscan by vectorscan
This commit migrates ClickHouse to Vectorscan. The first 10 min of
[0] explain the reasons for it.

(*) Addresses (but does not resolve) #38046

(*) Config parameter names (e.g. "max_hyperscan_regexp_length") are
    preserved for compatibility. Likewise, error codes (e.g.
    "ErrorCodes::HYPERSCAN_CANNOT_SCAN_TEXT") and function/class names (e.g.
    "HyperscanDeleter") are preserved as vectorscan aims to be a drop-in
    replacement.

[0] https://www.youtube.com/watch?v=KlZWmmflW6M
2022-06-24 10:47:52 +02:00
Andrey Zvonov
c18d09a617
Merge branch 'master' into zvonand-base58 2022-06-24 07:05:49 +03:00
zvonand
dd8203038f updated exception handling 2022-06-24 00:36:57 +05:00
zvonand
a94c40e33a Merge branch 'master' of github.com:ClickHouse/ClickHouse into dt64_timeslots 2022-06-23 17:08:28 +05:00
mergify[bot]
234f0c6399
Merge branch 'master' into revert-35914-FIPS_compliance 2022-06-23 12:06:17 +00:00
zvonand
946117ec89 Merge branch 'master' of github.com:ClickHouse/ClickHouse into zvonand-base58 2022-06-23 17:04:40 +05:00
Alexey Milovidov
5855668514 Remove trash 2022-06-22 06:23:35 +02:00
mergify[bot]
bb79eb73e6
Merge branch 'master' into fix-short-circuit 2022-06-21 10:40:07 +00:00
Kruglov Pavel
b9b58b4305
Merge pull request #37759 from Avogar/fix-nothing-error
Fix possible logical error with type Nothing in some functions
2022-06-21 12:35:05 +02:00
Larry Luo
bbd73ba727 use utility methods to access x509 struct fields. 2022-06-20 21:27:33 -04:00
zvonand
22af00b757 rename variable + fix handling of ENABLE_LIBRARIES 2022-06-20 23:53:47 +05:00
mergify[bot]
b440ee84ae
Merge branch 'master' into fix-short-circuit 2022-06-20 15:14:19 +00:00
zvonand
d4e5686b99 minor: fix message for base64 2022-06-20 20:13:09 +05:00
zvonand
78d55d6f46 small fixes 2022-06-20 19:30:54 +05:00
zvonand
832fd6e0a9 Added tests + minor updates 2022-06-19 23:10:28 +05:00
Alexey Milovidov
0cf88e0950
Revert "ClickHouse's boringssl module updated to the official version of the FIPS compliant." 2022-06-18 23:16:18 +03:00
zvonand
f4b3af091d fix zero byte 2022-06-17 23:48:14 +05:00
avogar
23f48a9fb9 Fix bug with nested short-circuit functions 2022-06-17 11:44:49 +00:00
Andrey Zvonov
f987f461e5 fix style -- rm unused ErrorCode 2022-06-17 15:00:32 +05:00
Igor Nikonov
baebbc084f
Merge pull request #38027 from ClickHouse/decimal_rounding_fix
Fix: rounding for Decimal128/Decimal256 with more than 19-digits long scale
2022-06-17 09:48:18 +02:00
zvonand
c1b2b669ab remove wrong code 2022-06-17 01:52:45 +05:00
mergify[bot]
f46f7257dd
Merge branch 'master' into fix-nothing-error 2022-06-16 10:58:03 +00:00
mergify[bot]
2557e8ad51
Merge branch 'master' into decimal_rounding_fix 2022-06-16 10:53:49 +00:00
avogar
a3a7cc7a5d Fix logical error in array mapped functions with const nullable column 2022-06-16 10:41:53 +00:00
zvonand
a800158438 wip upload 2022-06-16 15:11:41 +05:00
Danila Kutenin
048f56bf4d Fix some tests and comments 2022-06-15 14:40:21 +00:00
Danila Kutenin
08e3f77a9c Optimize most important parts with NEON SIMD
First part: updated most of UTF8, hashing, memory and codecs, except
utf8lower and upper (maybe a little later).

That includes a huge amount of research on dealing with movemask. Exact
details and blog post TBD.
2022-06-15 13:19:29 +00:00
mergify[bot]
d704264fae
Merge branch 'master' into decimal_rounding_fix 2022-06-15 10:47:09 +00:00
zvonand
c149c916ec initial setup 2022-06-15 11:49:55 +05:00
Igor Nikonov
bf7dd39282 Fix: decimal rounding
Fixes #37531
2022-06-14 18:03:05 +00:00
Maksim Kita
dc2e117cce UnaryLogicalFunctions improve performance using dynamic dispatch 2022-06-14 17:30:11 +02:00
zvonand
a5a980b69d Added no_sanitize 2022-06-13 19:45:54 +05:00
zvonand
54b8709cb1 minor fix 2022-06-13 19:21:07 +05:00
Robert Schulze
5f5732a2c4
Merge pull request #37969 from ClickHouse/consistent-macro-usage
More consistent use of platform macros
2022-06-10 14:10:01 +02:00
zvonand
fb67b080b9 added docs 2022-06-10 14:30:17 +03:00
zvonand
551d1ea875 fix wrong interval 2022-06-10 13:21:31 +03:00
Robert Schulze
1a0b5f33b3
More consistent use of platform macros
cmake/target.cmake defines macros for the supported platforms, this
commit changes predefined system macros to our own macros.

__linux__ --> OS_LINUX
__APPLE__ --> OS_DARWIN
__FreeBSD__ --> OS_FREEBSD
2022-06-10 10:22:31 +02:00
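Code written against the project's own macros might look like the sketch below. In the project the OS_* macros are defined by the build system (cmake/target.cmake, per the message); the fallback block at the top is an assumption added only so the sketch compiles standalone, and `platformName` is a hypothetical helper.

```cpp
#include <cassert>
#include <string_view>

// Fallback for building this sketch outside the project (assumption: inside
// the project these macros are supplied by the build system instead).
#if !defined(OS_LINUX) && !defined(OS_DARWIN) && !defined(OS_FREEBSD)
#    if defined(__linux__)
#        define OS_LINUX
#    elif defined(__APPLE__)
#        define OS_DARWIN
#    elif defined(__FreeBSD__)
#        define OS_FREEBSD
#    endif
#endif

// Application code tests only the project macros, never the predefined ones.
constexpr std::string_view platformName()
{
#if defined(OS_LINUX)
    return "linux";
#elif defined(OS_DARWIN)
    return "darwin";
#elif defined(OS_FREEBSD)
    return "freebsd";
#else
    return "unknown";
#endif
}
```

Keeping the predefined compiler macros out of application code means the set of supported platforms is decided in one place.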
zvonand
e19653618c fix wrongfully added submodule 2022-06-10 11:19:38 +03:00
zvonand
16087ea400 enable dt64 for timeslots 2022-06-09 15:28:18 +03:00
Maksim Kita
0c1211eb61
Merge pull request #37930 from kitaisreal/function-dict-get-check-arguments-size
Function dictGet check arguments size
2022-06-08 23:25:14 +02:00
Maksim Kita
b7152fa2bf Function dictGet check arguments size 2022-06-08 17:19:30 +02:00
Maksim Kita
7d1a43cfeb Fix setting cast_ipv4_ipv6_default_on_conversion_error for internal cast 2022-06-08 12:43:39 +02:00
Maksim Kita
4e160105b9
Merge pull request #37805 from kitaisreal/dictionaries-hierarchy-nullable-key-support
Hierarchical dictionaries support nullable parent key
2022-06-08 12:36:09 +02:00