Commit Graph

3645 Commits

Author SHA1 Message Date
Robert Schulze
05f08357a9
Merge pull request #37764 from ClickHouse/like_with_trailing_backslash
Disallow LIKE patterns with trailing escape
2022-06-03 13:19:51 +02:00
Alexey Milovidov
1529d47207
Merge pull request #34754 from ClickHouse/llvm-14
Switch to clang/llvm 14
2022-06-03 14:07:34 +03:00
Alexey Milovidov
de16784832
Merge pull request #37633 from ClickHouse/dump-column-structure-more-precise
More precise result of the `dumpColumnStructure` and `byteSize` miscellaneous functions
2022-06-03 14:05:20 +03:00
Alexey Milovidov
ea89f81a78 Merge branch 'master' of github.com:ClickHouse/ClickHouse into llvm-14 2022-06-03 03:07:14 +02:00
Robert Schulze
657662d89f
Minor follow-up to cache table: std::{vector-->array} 2022-06-02 20:18:10 +02:00
Maksim Kita
20b55a45b2 Hierarchical dictionaries support nullable parent key 2022-06-02 19:24:23 +02:00
HeenaBansal2009
e3080f2a97 Merge remote-tracking branch 'origin' into Fix-all-CheckTriviallyCopyableMove-Errors 2022-06-02 07:30:08 -07:00
Alexander Gololobov
b34782dc6a
Merge pull request #37775 from liuneng1994/fix_date32_to_string
fix toString error on DatatypeDate32
2022-06-02 16:40:47 +03:00
Vladimir C
670c721ded
Merge pull request #37742 from ucasfl/hashid 2022-06-02 12:47:11 +02:00
Robert Schulze
4e18659bfd
Fix tests + more precise exception msg 2022-06-02 11:11:56 +02:00
liuneng1994
7b15055e72 fix toString error on DatatypeDate32
Signed-off-by: liuneng1994 <1398775315@qq.com>
2022-06-02 16:56:43 +08:00
Alexey Milovidov
b5f48a7d3f Merge branch 'master' of github.com:ClickHouse/ClickHouse into llvm-14 2022-06-01 22:09:58 +02:00
Robert Schulze
366f368d06
Disallow LIKE patterns with trailing escape
Trailing escape ('ab\') is disallowed in SQL, in standardese:

  "If an escape character is specified, then [...] If there is not a
  partitioning of the string PVC into substrings such that each substring
  has length 1 (one) or 2, no substring of length 1 (one) is the escape
  character ECV, and each substring of length 2 is the escape character
  ECV followed by either the escape character ECV, an <underscore>
  character, or the <percent> character, then an exception condition is
  raised: data exception - invalid escape sequence."

I first thought this is checked already higher up in the stack, at least
for const needles, as single trailing backslashes ('ab\') are rejected,
but then I realized that ClickHouse quotes by default. I.e., double
trailing backslashes ('ab\\') are not rejected but when interpreted as
LIKE needle ('ab\') they should.
2022-06-01 21:38:46 +02:00
Robert Schulze
b3b0716b32
Merge pull request #37544 from ClickHouse/cached_patterns
Cache compiled regexps when evaluating non-const needles
2022-06-01 19:55:25 +02:00
avogar
966b864986 Fix possible logical error with type Nothing and JSON functions 2022-06-01 16:34:31 +00:00
flynn
b62e4cec65 Fix crash of FunctionHashID 2022-06-01 12:39:16 +00:00
Alexander Tokmakov
75f49a48e1 Merge branch 'master' into fix_trash 2022-06-01 14:20:46 +02:00
Robert Schulze
600512cc08
Replace exceptions thrown for programming errors by asserts 2022-06-01 11:53:37 +02:00
Anton Popov
20e319d67a
Merge pull request #37666 from CurtizJ/optimize-coalesce
Optimize function `COALESCE` with two arguments
2022-05-31 23:48:13 +02:00
Yakov Olkhovskiy
873ac9f8ff
Merge pull request #37540 from ClickHouse/feature-server-certificate
showCertificate function implementation
2022-05-31 02:50:03 -04:00
Anton Popov
30f8eb800a optimize function coalesce with two arguments 2022-05-30 22:29:35 +00:00
Nikolai Kochetov
77b07dd0a8
Merge pull request #37163 from ClickHouse/grouping-function
Add GROUPING function
2022-05-30 20:45:04 +02:00
HeenaBansal2009
b7eb6bbd38 Fixed clang-tidy-CheckTriviallyCopyableMove-errors 2022-05-30 11:09:03 -07:00
Robert Schulze
ad12adc31c
Measure and rework internal re2 caching
This commit is based on local benchmarks of ClickHouse's re2 caching.

Question 1: -----------------------------------------------------------
Is pattern caching useful for queries with const LIKE/REGEX
patterns? E.g. SELECT LIKE(col_haystack, '%HelloWorld') FROM T;

The short answer is: no. Runtime is (unsurprisingly) dominated by
pattern evaluation + other stuff going on in queries, but definitely not
pattern compilation. For space reasons, I omit details of the local
experiments.

(Side note: the current caching scheme is unbounded in size which poses
a DoS risk (think of multi-tenancy). This risk is more pronounced when
unbounded caching is used with non-const patterns ..., see next
question)

Question 2: -----------------------------------------------------------
Is pattern caching useful for queries with non-const LIKE/REGEX
patterns? E.g. SELECT LIKE(col_haystack, col_needle) FROM T;

I benchmarked five caching strategies:
1. no caching as a baseline (= recompile for each row)
2. unbounded cache (= threadsafe global hash-map)
3. LRU cache (= threadsafe global hash-map + LRU queue)
4. lightweight local cache 1 (= not threadsafe local hashmap with
   collision list which grows to a certain size (here: 10 elements) and
   afterwards never changes)
5. lightweight local cache 2 (not threadsafe local hashmap without
   collision list in which a collision replaces the stored element, idea
   by Alexey)

... using a haystack of 2 mio strings and
A). 2 mio distinct simple patterns
B). 10 simple patterns
C)  2 mio distinct complex patterns
D)  10 complex patterns

Fo A) and C), caching does not help but these queries still allow to
judge the static overhead of caching on query runtimes.

B) and D) are extreme but common cases in practice. They include
queries like "SELECT ... WHERE LIKE (col_haystack, flag ? '%pattern1%' :
'%pattern2%'). Caching should help significantly.

Because LIKE patterns are internally translated to re2 expressions, I
show only measurements for MATCH queries.

Results in sec, averaged over on multiple measurements;

1.A): 2.12
  B): 1.68
  C): 9.75
  D): 9.45

2.A): 2.17
  B): 1.73
  C): 9.78
  D): 9.47

3.A): 9.8
  B): 0.63
  C): 31.8
  D): 0.98

4.A): 2.14
  B): 0.29
  C): 9.82
  D): 0.41

5.A) 2.12 / 2.15 / 2.26
  B) 1.51 / 0.43 / 0.30
  C) 9.97 / 9.88 / 10.13
  D) 5.70 / 0.42 / 0.43
(10/100/1000 buckets, resp. 10/1/0.1% collision rate)

Evaluation:

1. This is the baseline. It was surprised that complex patterns (C, D)
   slow down the queries so badly compared to simple patterns (A, B).
   The runtime includes evaluation costs, but as caching only helps with
   compilation, and looking at 4.D and 5.D, compilation makes up over 90%
   of the runtime!

2. No speedup compared to 1, probably due to locking overhead. The cache
   is unbounded, and in experiments with data sets > 2 mio rows, 2. is
   the only scheme to throw OOM exceptions which is not acceptable.

3. Unique patterns (A and C) lead to thrashing of the LRU cache and very
   bad runtimes due to LRU queue maintenance and locking. Works pretty
   well however with few distinct patterns (B and D).

4. This scheme is tailored to queries B and D where it performs pretty
   good. More importantly, the caching is lightweight enough to not
   deteriorate performance on datasets A and C.

5. After some tuning of the hash map size, 100 buckets seem optimal to
   be in the same ballpark with 10 distinct patterns as 4. Performance
   also does not deteriorate on A and C compared to the baseline.
   Unlike 4., this scheme behaves LRU-like and can adjust to changing
   pattern distributions.

As a conclusion, this commit implementes two things:

1. Based on Q1, pattern search with const needle no longer uses
   caching. This applies to LIKE and MATCH + a few (exotic) other SQL
   functions. The code for the unbounded caching was removed.

2. Based on Q2, pattern search with non-const needles now use method 5.
2022-05-30 20:00:35 +02:00
Alexey Milovidov
f1fb57c6ce Fix clang-tidy-14 2022-05-30 05:36:26 +02:00
Alexey Milovidov
c0e6ff4216 More precise result of "dumpColumnStructure" and "byteSize" miscellaneous functions 2022-05-30 04:56:54 +02:00
Alexey Milovidov
c1169019d2 Merge branch 'master' into llvm-14 2022-05-29 02:29:02 +02:00
Alexey Milovidov
73e2e63414
Merge pull request #37612 from ClickHouse/clang-tidy-14
Fix clang-tidy-14, part 1
2022-05-29 03:16:32 +03:00
Alexander Tokmakov
4e52f45695 Merge branch 'master' into fix_trash 2022-05-28 19:43:19 +02:00
Alexey Milovidov
c50791dd3b Fix clang-tidy-14, part 1 2022-05-27 22:52:14 +02:00
Alexey Milovidov
d2c6fd90cb Fix clang-tidy-14, part 1 2022-05-27 22:51:37 +02:00
Alexander Gololobov
9b1b30855c Fixed check for HUGE_VAL 2022-05-27 18:25:11 +02:00
Alexander Gololobov
6361c5f38c Fix for failed style check 2022-05-27 18:22:16 +02:00
Alexander Gololobov
540353566c Added LpNorm and LpDistance functions for arrays 2022-05-27 17:17:08 +02:00
Robert Schulze
80061aa3e2
Merge remote-tracking branch 'origin/master' into cached_patterns 2022-05-27 09:21:01 +02:00
Alexey Milovidov
86afa3a245
Merge pull request #37502 from ClickHouse/array_norm_dist_fixes
Renamed arrayXXNorm/arrayXXDistance functions to XXNorm/XXDistance and fixed some overflow cases
2022-05-27 00:56:29 +03:00
mergify[bot]
a7629f900f
Merge branch 'master' into normalize-utf8-performance-tests-fix 2022-05-26 10:29:55 +00:00
Maksim Kita
3a92e61827
Merge pull request #37148 from kitaisreal/dictionary-get-descendants-performance-improvement
Dictionary getDescendants performance improvement
2022-05-26 12:29:17 +02:00
Yakov Olkhovskiy
2dc160a4c3 style fix 2022-05-25 20:56:36 -04:00
Dmitry Novik
7cd7782e4f Process columns more efficiently in GROUPING() 2022-05-25 21:55:41 +00:00
Dmitry Novik
3c1b6609ae Add comments and make tests more verbose 2022-05-25 21:23:35 +00:00
Maksim Kita
58cd1bd3ec
Merge pull request #36843 from bharatnc/ncb/h3-unidirectionaledges-funcs
add h3 unidirectional edge functions
2022-05-25 22:46:40 +02:00
Maksim Kita
bee3c30f66
Merge pull request #37524 from kitaisreal/geo-distance-functions-improve-performance
Geo distance functions improve performance
2022-05-25 22:40:40 +02:00
Alexander Gololobov
168b47d0ad Use same norm and distance function names for tuples and arrays 2022-05-25 22:39:59 +02:00
Alexander Gololobov
b065839f44 always return Float64 2022-05-25 22:27:00 +02:00
Alexander Gololobov
5df14cd956 Cast arguments to result type to avoid int overflow 2022-05-25 22:27:00 +02:00
Robert Schulze
49934a3dc8
Cache compiled regexps when evaluating non-const needles
Needles in a (non-const) needle column may repeat and this commit allows
to skip compilation for known needles. Out of the different design
alternatives (see below, if someone is interested), we now maintain
- one global pattern cache,
- with a fixed size of 42k elements currently,
- and use LRU as eviction strategy.

------------------------------------------------------------------------

(sorry for the wall of text, dumping it here not for reading but just
for reference)

Write-up about considered design alternatives:

1. Keep the current global cache of const needles. For non-const
   needles, probe the cache but don't store values in it.
   Pros: need to maintain just a single cache, no problem with cache
         pollution assuming there are few distinct constant needles
   Cons: only useful if a non-const needle occurred as already as a
         const needle
   --> overall too simplistic

2. Keep the current global cache for const needles. For non-const
   needles, create a local (e.g. per-query) cache
   Pros: unlike (1.), non-const needles can be skipped even if they
         did not occur yet, no pollution of the const pattern cache when
         there are very many non-const needles (e.g. large / highly
         distinct needle columns).
   Cons: caches may explode "horizontally", i.e. we'll end up with the
         const cache + caches for Q1, Q2, ... QN, this makes it harder
         to control the overall space consumption, also patterns
         residing in different caches cannot be reused between queries,
         another difficulty is that the concept of "query" does not
         really exist at matching level - there are only column chunks
         and we'd potentially end up with 1 cache / chunk

3. Queries with const and non-const needles insert into the same global
   cache.
   Pros: the advantages of (2.) + allows to reuse compiled patterns
         accross parallel queries
   Cons: needs an eviction strategy to control cache size and pollution
         (and btw. (2.) also needs eviction strategies for the
         individual caches)

4. Queries with const needle use global cache, queries with non-const
   needle use a different global cache
   --> Overall similar to (3) but ignores the (likely) edge case that
       const and non-const needles overlap.

In sum, (3.) seems the simplest and most beneficial approach.

Eviction strategies:

0. Don't ever evict --> cache may grow infinitely and eventually make
   the system unusable (may even pose a DoS risk)

1. Flush the cache after a certain threshold is exceeded --> very
   simple but may lead to peridic performance drops

2. Use LRU --> more graceful performance degradation at threshold but
   comes with a (constant) performance overhead to maintain the LRU
   queue

In sum, given that the pattern compilation in RE2 should be quite costly
(pattern-to-DFA/NFA), LRU may be acceptable.
2022-05-25 22:04:06 +02:00
Robert Schulze
ea60a614d2
Decrease namespace indent 2022-05-25 21:56:35 +02:00
Alexey Milovidov
abf2558fba
Merge pull request #37491 from ClickHouse/match_refactoring
Refactorings of LIKE/MATCH code
2022-05-25 22:05:38 +03:00
Alexey Milovidov
4482da9eb6
Update greatCircleDistance.cpp 2022-05-25 21:59:31 +03:00
Alexander Tokmakov
779e6ea0b9 make it better, fix on cluster queries 2022-05-25 20:17:49 +02:00
Nikolai Kochetov
ff98c24d44
Merge pull request #37048 from Avogar/fix-array-map-nothing
Add default implementation for Nothing in functions
2022-05-25 19:10:40 +02:00
Yakov Olkhovskiy
6692b9c2ed showCertificate function implementation 2022-05-25 12:11:44 -04:00
Alexey Milovidov
cb92482ca5
Merge pull request #37484 from kitaisreal/function-has-all-avx2-dynamic-dispatch
Function hasAll added dynamic dispatch
2022-05-25 19:05:32 +03:00
Maksim Kita
28355114c0 Fixed tests 2022-05-25 16:19:29 +02:00
Maksim Kita
e67b3537f7 Functions normalizeUTF8 unstable performance tests fix 2022-05-25 15:54:52 +02:00
Maksim Kita
45da28ecae Improve performance of geo distance functions 2022-05-25 14:22:22 +02:00
Maksim Kita
c372c3d6aa Fix performance tests 2022-05-25 11:49:59 +02:00
Kseniia Sumarokova
b50d4549c9
Merge pull request #37356 from amosbird/partition-prune-for-s3
"Partition pruning" for s3
2022-05-25 11:03:07 +02:00
Robert Schulze
05e4fa7df1
Fix special case of trivial regexp
Previously, we would alsays set 1 in case of a trivial regex (which is
correct). If someone in future builds a negated operator, then this
will produce wrong results. Right now, negation of regexp (SQL: NOT
MATCH) is implemented at a higher level, so we are safe and this is more
a preventive fix.
2022-05-25 10:05:55 +02:00
Robert Schulze
01ab7b9bad
Pass strings in some places as string_view
The original goal was to get change

  const auto & needle = String(
        reinterpret_cast<const char *>(cur_needle_data),
        cur_needle_length);

in Functions/MatchImpl.h into a std::string_view to save an allocation +
copy. The needle is eventually passed as search pattern into the re2
library. Re2 has an alternative constructor taking a const char * i.e. a
NULL-terminated string. Here, the needle is NULL-terminated but
1. this is only because it is passed inside a ColumnString yet this is
   not always the case (e.g. fixed string columns has a dense layout w/o
   NULL terminator).
2. assuming NULL termination for users != MatchImpl of the regex code is
   too dangerous.

So, for now we'll stay with copying to be on the safe side. One fine day
when re2 has a ptr/size ctor, we can use std::string_view.

Just changing a few other places from std::string to std::string_view
but this will not help with performance.
2022-05-25 10:05:51 +02:00
Robert Schulze
040fbf3686
Tighter sanity checks in matching code 2022-05-25 10:05:06 +02:00
Robert Schulze
35bef17302
Introduce variables to hold the match result
--> nicer when debugging
2022-05-25 10:04:47 +02:00
Robert Schulze
b044d44fef
Refactoring: Make template instantiation easier to read
- introduced class MatchTraits with enums that replace bool template
  parameters

- (minor: made negation the last template parameters because negation
  executes last during evaluation)
2022-05-25 10:03:58 +02:00
Bharat Nallan Chakravarthy
57cfc0bd04 check for validity of h3 index 2022-05-25 06:17:15 +05:30
Alexander Gololobov
2ff747785e
Merge pull request #37394 from ClickHouse/array_norm_dist_fixes
Do computations on the raw input data without copying to Eigen::Matrix
2022-05-24 20:59:04 +02:00
Robert Schulze
7348a0eb28
Merge pull request #37251 from ClickHouse/non_const_like
Support non-constant SQL functions (NOT) (I)LIKE and MATCH
2022-05-24 20:28:31 +02:00
Robert Schulze
028f15c4fa
Review comment: Throw LOGICAL_ERROR for different sizes of haystack / needles 2022-05-24 20:19:13 +02:00
Maksim Kita
3c0c322d7c
Merge pull request #37480 from kitaisreal/dynamic-dispatch-infrastructure-improvements
Dynamic dispatch infrastructure style fixes
2022-05-24 18:13:53 +02:00
Maksim Kita
6fb51e8bd3 Function hasAll added dynamic dispatch 2022-05-24 17:06:06 +02:00
Maksim Kita
86180614e7 Fixed tests 2022-05-24 15:33:03 +02:00
Anton Popov
e96af9fd75 better binary serialization of ColumnObject 2022-05-24 13:16:11 +00:00
Maksim Kita
e6e4b2826d Dynamic dispatch infrastructure style fixes 2022-05-24 14:25:29 +02:00
Amos Bird
c25ef92139
Fix tests 2022-05-24 18:57:55 +08:00
Amos Bird
093d315756
partition pruning for s3 2022-05-24 18:57:55 +08:00
Maksim Kita
712b000f2a
Merge pull request #37443 from kitaisreal/functions-normalize-utf8-fix
Functions normalize utf8 fix
2022-05-24 11:11:15 +02:00
Alexander Gololobov
7d0ed7e51a Remove eigen library 2022-05-24 10:24:50 +02:00
Alexander Gololobov
caad1435d5 Optimized the case when one the argumnets is Const 2022-05-24 10:24:50 +02:00
Alexander Gololobov
65fbda436a Do computations on the raw input data without copying to Eigen::Matrix 2022-05-24 10:24:50 +02:00
Bharat Nallan Chakravarthy
6e49b76cfd try suppress h3 asan errors 2022-05-24 10:22:46 +05:30
Maksim Kita
996241493f
Merge pull request #37447 from kitaisreal/binary-function-vectorized-remove-macro
BinaryFunctionVectorized remove macro
2022-05-23 16:50:12 +02:00
Maksim Kita
fe21b4ca9e Fixed style check 2022-05-23 14:41:07 +02:00
Maksim Kita
008de5c779
Merge pull request #37438 from kitaisreal/function-binary-representation-style-fixes
FunctionBinaryRepresentation style fixes
2022-05-23 13:54:15 +02:00
Maksim Kita
e550843d56 BinaryFunctionVectorized remove macro 2022-05-23 12:45:16 +02:00
Maksim Kita
585b86446e Added hierarchical_index_bytes_allocated column in system.dictionaries 2022-05-23 12:42:00 +02:00
Maksim Kita
be9c3d9bd4 Fixed build 2022-05-23 12:42:00 +02:00
Maksim Kita
100afa8bcf Dictionary getDescendants performance improvement 2022-05-23 12:42:00 +02:00
Maksim Kita
78782de887 Functions normalizeUTF8 logical error fix 2022-05-23 12:19:14 +02:00
Maksim Kita
98bb34f2f2 FunctionBinaryRepresentation style fixes 2022-05-23 10:59:33 +02:00
Robert Schulze
e25ca139cd
Implement SQL functions (NOT) (I)LIKE() + MATCH() with non-const needles
With this commit, SQL functions LIKE and MATCH and their variants can
work with non-const needle arguments. E.g.

  create table tab
    (id UInt32, haystack String, needle String)
    engine = MergeTree()
    order by id;

  insert into tab values
  (1, 'Hello', '%ell%')
  (2, 'World', '%orl%')

  select id, haystack, needle, like(haystack, needle)
  from tab;

For that, methods vectorVector() and vectorFixedVector() were added to
MatchImpl. The existing code for const needles has an optimization where
the compiled regexp is cached. The new code expects a different needle
per row and consequently does not cache the regexp.
2022-05-23 09:41:28 +02:00
Alexey Milovidov
698e5e5352
Merge pull request #37415 from Joeywzr/gen_uuid
Generate multiple columns with UUID
2022-05-23 00:29:42 +03:00
Robert Schulze
4829ae8380
Replace overly clever const argument logic by something simpler
The previous logic was smart but too inflexible to support the next
commits. Replace by a simple pushdown logic where string search
implementations return their const arguments instead of having the
common class figure these out based on properties/traits.
2022-05-22 17:50:38 +02:00
Robert Schulze
0299cc87e4
Improve naming consistency of string search code
Just renamings, nothing major ...
2022-05-22 17:50:38 +02:00
Robert Schulze
19d53c14fa
Merge pull request #37382 from ClickHouse/wc++98-compat-extra-semi
Enable -Wc++98-compat-extra-semi
2022-05-22 09:40:45 +02:00
Memo
15a76d012f add NUMBER_OF_ARGUMENTS_DOESNT_MATCH defination 2022-05-22 13:38:47 +08:00
Yakov Olkhovskiy
d878f193d8
Merge pull request #37013 from mnutt/hashid
Add hashid support
2022-05-21 17:14:54 -04:00
Memo
942af133e5 init 2022-05-21 23:54:12 +08:00
Maksim Kita
0d69f35b6a Fixed style check 2022-05-21 14:54:45 +02:00
Maksim Kita
42439aeb3c Improve performance of number comparison functions 2022-05-20 22:42:48 +02:00
Robert Schulze
0f6715bd91
Follow-up to PR #37300: semicolon warnings
In PR #37300, Alexej asked why we the compiler does not warn about
unnecessary semicolons, e.g.

  f()
  {
  }; // <-- here

The answer is surprising: In C++98, above syntax was disallowed but by
most compilers accepted it regardless. C++>11 introduced "empty
declarations" which made the syntax legal.

The previous behavior can be restored using flag
-Wc++98-compat-extra-semi. This finds many useless semicolons which were
removed in this change. Unfortunately, there are also false positives
which would require #pragma-s and HAS_* logic (--> check_flags.cmake) to
suppress. In the end, -Wc++98-compat-extra-semi comes with extra effort
for little benefit. Therefore, this change only fixes some semicolons
but does not enable the flag.
2022-05-20 15:06:34 +02:00
Michael Nutt
23dbf1b257 Merge branch 'master' into hashid 2022-05-20 08:42:01 -04:00
Robert Schulze
b475fbc9a7
Merge pull request #37300 from ClickHouse/cmake-cleanup-pt3
Various cmake cleanups
2022-05-20 10:02:36 +02:00
Dmitry Novik
b3ccf96c81 Merge remote-tracking branch 'origin/master' into grouping-function 2022-05-19 17:58:33 +00:00
Dmitry Novik
d4c66f4a48 Code cleanup & fix GROUPING() with TOTALS 2022-05-19 16:36:51 +00:00
avogar
f69c3175af Fix comments 2022-05-19 10:13:44 +00:00
avogar
cb8646fbb4 Merge branch 'master' of github.com:ClickHouse/ClickHouse into fix-array-map-nothing 2022-05-19 07:18:48 +00:00
Michael Nutt
e0c14dfc01 fix includes 2022-05-18 20:16:43 -04:00
Bharat Nallan Chakravarthy
00d3bbc2e0 review fixes 2022-05-18 17:04:15 -07:00
Michael Nutt
c87638d2ba put hashid behind allow_experimental_hash_functions setting 2022-05-18 19:06:33 -04:00
Michael Nutt
11a17997b3 better const column checking 2022-05-18 18:09:45 -04:00
Michael Nutt
da99b1b250 simplify hashing 2022-05-18 16:57:30 -04:00
Michael Nutt
d6d1c22008 better argument type checking 2022-05-18 16:57:21 -04:00
Michael Nutt
e453132db8 remove hashid define guard 2022-05-18 15:26:54 -04:00
Maksim Kita
df0cb06209
Merge pull request #37289 from kitaisreal/unary-arithmetic-functions-improve-performance-dynamic-dispatch
Improve performance of unary arithmetic functions
2022-05-18 19:16:30 +02:00
Dmitry Novik
6356112a76 Refactor GROUPING function 2022-05-18 15:23:31 +00:00
Anton Popov
715d5b0173
Merge pull request #37270 from Avogar/fix-bool-eof
Fix Nullable(String) to Nullable(Bool/IPv4/IPv6) conversion
2022-05-18 14:08:52 +02:00
Nikolai Kochetov
64ecb3941c
Merge pull request #37259 from ClickHouse/clangtidies2
Activate more clangtidies
2022-05-18 13:01:40 +02:00
Bharat Nallan Chakravarthy
c476b8dd92 Merge remote-tracking branch 'upstream/master' into ncb/h3-unidirectionaledges-funcs 2022-05-17 20:10:03 -07:00
mergify[bot]
05305811f8
Merge branch 'master' into fix-bool-eof 2022-05-17 19:28:11 +00:00
Robert Schulze
0c55ac76d2
A few clangtidy updates
Enable:

- bugprone-lambda-function-name: "Checks for attempts to get the name of
  a function from within a lambda expression. The name of a lambda is
  always something like operator(), which is almost never what was
  intended."

- bugprone-unhandled-self-assignment: "Finds user-defined copy
  assignment operators which do not protect the code against
  self-assignment either by checking self-assignment explicitly or using
  the copy-and-swap or the copy-and-move method.""

- hicpp-invalid-access-moved: "Warns if an object is used after it has
  been moved."

- hicpp-use-noexcept: "This check replaces deprecated dynamic exception
  specifications with the appropriate noexcept specification (introduced
  in C++11)"

- hicpp-use-override: "Adds override (introduced in C++11) to overridden
  virtual functions and removes virtual from those functions as it is
  not required."

- performance-type-promotion-in-math-fn: "Finds calls to C math library
  functions (from math.h or, in C++, cmath) with implicit float to
  double promotions."

Split up:

- cppcoreguidelines-*. Some of them may be useful (haven't checked in
  detail), therefore allow to toggle them individually.

Disable:

- linuxkernel-*. Obvious.
2022-05-17 20:56:57 +02:00
mergify[bot]
36b4ed19c5
Merge branch 'master' into unary-arithmetic-functions-improve-performance-dynamic-dispatch 2022-05-17 18:08:24 +00:00
Alexander Gololobov
38f291c70d
Merge pull request #37030 from bharatnc/ncb/h3-missing-traversal-funcs
add remaining h3 traversal funcs
2022-05-17 18:19:56 +02:00
avogar
46f4f8a457 Fix use of unitialized memory 2022-05-17 12:59:46 +00:00
Maksim Kita
beb34e7062 Improve performance of unary arithmetic functions 2022-05-17 13:53:20 +02:00
Alexander Gololobov
670a8bac29 Fixed required array size calculation and reduced number of reallocations 2022-05-17 09:45:49 +02:00
Kseniia Sumarokova
94683786dc
Merge branch 'master' into MeiliSearch 2022-05-16 22:42:09 +02:00
Alexander Gololobov
e2e3536a80 Fixed handling of gridPathCellsSize() errors 2022-05-16 21:23:45 +02:00
avogar
415aabd4d0 Fix Nullable(String) to Nullable(Bool/IPv4/IPv6) conversion 2022-05-16 19:15:18 +00:00
Robert Schulze
43945cea1b
Fixing some warnings 2022-05-16 20:59:27 +02:00
Dmitry Novik
e5b395e054 Support ROLLUP and CUBE in GROUPING function 2022-05-16 17:33:38 +00:00
Robert Schulze
e3cfec5b09
Merge remote-tracking branch 'origin/master' into clangtidies 2022-05-16 10:12:50 +02:00
Michael Nutt
8bff9b8ce9
Merge branch 'master' into hashid 2022-05-14 09:52:05 +09:00
Dmitry Novik
6fc7dfea80 Support ordinary GROUP BY 2022-05-13 23:04:12 +00:00
Maksim Kita
3f18d7da33
Merge pull request #37189 from kitaisreal/function-h3-k-ring-add-cast
Function h3kRing added cast
2022-05-13 22:53:20 +02:00
Dmitry Novik
efb30bdf64 Correctly use __grouping_set_map column 2022-05-13 18:20:12 +00:00
Dmitry Novik
ae81268d4d Try to compute helper column lazy 2022-05-13 14:55:50 +00:00
Maksim Kita
ef7e21ea46 Function h3kRing added cast 2022-05-13 15:20:04 +02:00
Michael Nutt
9599c1f05c use single-character find for bad alphabet 2022-05-13 19:01:20 +09:00
qieqieplus
8b3fb22c6d check array sizes for short cut 2022-05-13 17:05:18 +08:00
mergify[bot]
2fdd305ef1
Merge branch 'master' into array-distance-functions 2022-05-13 07:56:57 +00:00
Michael Nutt
62a1e1c0cd use existing error code 2022-05-13 09:58:14 +09:00
Michael Nutt
03a7f7c4bd disallow null characters in custom alphabet 2022-05-13 08:43:42 +09:00
Dmitry Novik
92575fc3e5 Add missing file 2022-05-12 16:54:02 +00:00
Dmitry Novik
c5b40a9c91 WIP on GROUPING function 2022-05-12 16:40:26 +00:00
avogar
4c945d7fe5 Fix 2022-05-12 16:07:58 +00:00
avogar
0311dbb422 Add default implementation for Nothing, support arrays of nullable for arrayFilter and similar functions 2022-05-12 15:15:31 +00:00
Alexander Gololobov
548625a003 Reserve result vectors 2022-05-12 14:33:20 +02:00
Alexander Gololobov
7c226f6067 Fixed special case condition 2022-05-12 14:32:47 +02:00
Alexander Gololobov
355c5443a0 Trying to fix sanitizer failure 2022-05-12 13:50:53 +02:00
Robert Schulze
f8c24c5fe8
Merge pull request #37117 from ClickHouse/bug-37114
Fix Bug 37114 - ilike on FixedString(N)s produces wrong results
2022-05-12 09:39:36 +02:00