Commit Graph

5344 Commits

Author SHA1 Message Date
Robert Schulze
fb6c4f2802
Fix msan issue, pt. II 2023-05-30 08:51:44 +00:00
Victor Krasnov
6c94632d47 Deprive toStartOfWeek and toLastDayOfWeek functions of in-source documentation 2023-05-29 22:10:34 +00:00
Robert Schulze
516fa1c375
Merge branch 'master' into rs/entropy-learned-hashing 2023-05-29 17:40:14 +02:00
Victor Krasnov
0ad5b9f598 Merge branch 'master' of github.com:ClickHouse/ClickHouse into ADQM-810-dev 2023-05-29 08:26:26 +00:00
Robert Schulze
41d60f0be3
Fix style 2023-05-29 08:08:47 +00:00
Maksim Kita
7ef20bbdcd Function equals NaN fix 2023-05-28 17:02:46 +03:00
Robert Schulze
f49160ef4e
Build partial key positions from entire training data 2023-05-26 15:27:56 +00:00
Robert Schulze
2298eeb2b2
Merge branch 'master' into rs/entropy-learned-hashing 2023-05-26 12:04:49 +02:00
Robert Schulze
ad4a21034f
Fix msan issue in keyed siphash
Issue:
https://s3.amazonaws.com/clickhouse-test-reports/0/ffdd91669471f4934704f98f0191524496b4e85b/fuzzer_astfuzzermsan/report.html

Repro:
SELECT hex(sipHash128ReferenceKeyed((toUInt64(2), toUInt64(-9223372036854775807)))) GROUP BY (toUInt64(506097522914230528), toUInt64(now64(2, NULL + NULL), 1084818905618843912)), toUInt64(2), NULL + NULL, char(-2147483649, 1)

Minimal repro:
SELECT sipHash64Keyed((2::UInt64, toUInt64(2)), 4) GROUP BY toUInt64(2)
2023-05-25 17:52:03 +00:00
Robert Schulze
eca08438f4
Fix macos build 2023-05-25 17:05:18 +00:00
Robert Schulze
8804dfd4b0
Fix resizing 2023-05-25 11:57:09 +00:00
Victor Krasnov
03ca3f96d2 Add built-in documentation to toStartOfWeek and toLastDayOfWeek functions 2023-05-24 17:40:21 +00:00
Robert Schulze
889489b02e
Merge branch 'master' into space 2023-05-23 23:18:19 +02:00
Robert Schulze
285e8f4ae1
Protect against DOS 2023-05-23 12:16:49 +00:00
Robert Schulze
f4c73e94d2
Merge pull request #49989 from arenadata/ADQM-811
Add support of Date|Date32 arguments to the toUnixTimestamp() function
2023-05-23 08:55:56 +02:00
Robert Schulze
d9a7227cf4
Fix style check 2023-05-23 06:49:19 +00:00
avogar
646eeb63a4 Fix build 2023-05-22 19:46:05 +00:00
avogar
4f85d6a1bb Merge branch 'master' of github.com:ClickHouse/ClickHouse into random-structure 2023-05-22 19:43:24 +00:00
Robert Schulze
d76498dca0
reserve() --> resize() 2023-05-22 19:19:08 +00:00
Robert Schulze
d5cfcdfae1
String terminator: \n --> \0 2023-05-22 19:10:03 +00:00
Robert Schulze
df436b2cd4
Spark compatibility: Add new function space() 2023-05-22 14:52:51 +00:00
Maksim Kita
804e5e12ba JIT compilation not equals NaN fix 2023-05-22 13:14:27 +03:00
Victor Krasnov
98aace14ae Add DATE_SECONDS_PER_DAY macro definition to replace the numeric literal 86400 2023-05-22 09:23:23 +00:00
vdimir
8b77e2096c
Merge pull request #49760 from arthurpassos/extract_kv_ignore_kv_delimiter_when_reading_value 2023-05-20 13:27:59 +02:00
Victor Krasnov
3a3e413552 Implement toLastDayWeek function 2023-05-18 21:47:52 +00:00
Victor Krasnov
83d066e5cf Re-enable Date and Date32 as parameters of toUnixTimestamp function 2023-05-18 09:07:27 +00:00
FFFFFFFHHHHHHH
fd1e6557e1
Merge branch 'master' into dot_product 2023-05-17 14:40:06 +08:00
fhbai
c104354894 fix 2023-05-17 14:39:30 +08:00
Kruglov Pavel
d50e6fe868
Fix build after bad conflicts resolution 2023-05-16 15:35:16 +02:00
Kruglov Pavel
b6d2a84e83
Try to fix build 2023-05-16 12:01:55 +02:00
Kruglov Pavel
362fa4849f
Try to fix build 2023-05-15 17:56:53 +02:00
vdimir
07de815d96
Merge pull request #49836 from arthurpassos/add_extract_kv_max_number_of_pairs_safeguard 2023-05-15 16:31:01 +02:00
Arthur Passos
e8f971aa2b use LIMIT_EXCEEDED instead of TOO_LARGE_MAP_SIZE 2023-05-15 09:25:10 -03:00
Arthur Passos
b06e34a77f Accept key value delimiter as part of value 2023-05-15 13:52:47 +02:00
Kruglov Pavel
c901d2a7be
Fix style 2023-05-15 13:46:18 +02:00
avogar
aa7ab1f23b Fix comments 2023-05-15 11:20:03 +00:00
Kruglov Pavel
558eda4146
Merge pull request #49412 from azat/block-use-dense-hash-map
Switch Block::NameMap to google::dense_hash_map over HashMap
2023-05-15 12:22:55 +02:00
Alexey Milovidov
1db35384d9 Support bitCount for big integers 2023-05-15 03:30:03 +02:00
robot-clickhouse
33ca77b4ca
Merge pull request #49843 from azat/joinGet-non-deterministic
[RFC] Mark joinGet() as non deterministic (so as dictGet)
2023-05-14 11:12:12 +02:00
Alexey Milovidov
4f7bcf01f6
Merge pull request #49858 from ucasfl/bit-hamming
bitHammingDistance support String and FixedString data type
2023-05-14 08:28:01 +03:00
Robert Schulze
c4f7c3daa1
Merge branch 'master' into rs/entropy-learned-hashing 2023-05-13 17:33:12 +02:00
flynn
2f88605c3d remove space
format
2023-05-13 14:03:21 +00:00
flynn
2ffd00df8a bitHammingDistance support String and FixedString data type 2023-05-13 13:56:36 +00:00
Azat Khuzhin
a96067987e Mark joinGet() as non deterministic (so as dictGet)
joinGet() should not be considered as deterministic function, since
shards could have different data in tables.

Also since now there is allow_nondeterministic_mutations, it could be
used as a workaround for this backward incompatible change.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-13 08:12:53 +02:00
Alexey Milovidov
5a44dc26e7 Fixes for clang-17 2023-05-13 02:57:31 +02:00
avogar
602b9a740e Make better, allow generateRandom without structure argument 2023-05-12 19:39:33 +00:00
Arthur Passos
b1549a19a5 Use 0 as unlimited 2023-05-12 11:19:35 -03:00
Arthur Passos
1e3b7af97a Add setting to limit the max number of pairs produced by extractKeyValuePairs 2023-05-12 10:26:05 -03:00
Robert Schulze
b9c185af44
Merge pull request #49678 from azat/build/llvm-16
Switch to LLVM/clang 16 (16.0.3)
2023-05-12 13:47:36 +02:00
Anton Popov
3351ef7398
Merge pull request #49789 from CurtizJ/fix-array-map-tuple
Fix `arrayMap` with array of tuples with single argument
2023-05-12 13:27:40 +02:00
Robert Schulze
922420420c
Merge pull request #49300 from ClickHouse/rs/functdocs
Introduce more fields for in-source function documentation
2023-05-12 11:36:04 +02:00
Robert Schulze
d15f19912f
Merge pull request #49198 from ClibMouse/s390x_reinterpretas_fix
Fix reinterpretAs*() on big endian machines
2023-05-12 10:33:50 +02:00
Azat Khuzhin
2c40dd6a4c Switch Block::NameMap to google::dense_hash_map over HashMap
Since HashMap creates 2^8 elements by default, while dense_hash_map
should be good here.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-12 05:52:57 +02:00
Alexey Milovidov
76f7f5750d
Merge pull request #49799 from rschu1ze/demange
Typo: demange.cpp --> demangle.cpp
2023-05-12 02:01:49 +03:00
Robert Schulze
8ca804d40e
Typo: demange.cpp --> demangle.cpp 2023-05-11 21:32:12 +00:00
Robert Schulze
bbfb74ab70
Update comment 2023-05-11 19:06:04 +00:00
Robert Schulze
4a168444fa
Store keys as std::string_view 2023-05-11 19:03:17 +00:00
Robert Schulze
37c1b1aa58
Some fixups 2023-05-11 18:49:05 +00:00
Suzy Wang
70db49cdeb
Merge branch 'master' into s390x_reinterpretas_fix 2023-05-11 14:41:57 -04:00
Anton Popov
84aa97b738 fix arrayMap with array of tuples with single argument 2023-05-11 14:52:01 +00:00
Azat Khuzhin
00fdfa115f Suppress MSan warning in NgramDistanceImpl::unrollLowering()
NgramDistanceImpl::unrollLowering() relies on the fact that PODArray has
padding and it is OK to access more items.

Here is an MSan report:

    ==656==WARNING: MemorySanitizer: use-of-uninitialized-value
        0 0x557fd825485f in DB::NgramDistanceImpl<4ul, char8_t, false, true, false>::vectorConstant(DB::PODArray<char8_t, 4096ul, Allocator<false, false>, 63ul, 64ul> const&, DB::PODArray<unsigned long, 4096ul, Allocator<false, false>, 63ul, 64ul> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, DB::PODArray<float, 4096ul, Allocator<false, false>, 63ul, 64ul>&) (/usr/bin/clickhouse+0x124d885f) (BuildId: 76773125d8739591c75d4f4d263a2ffe7ca96855)
        1 0x557fd824eb83 in DB::FunctionsStringSimilarity<DB::NgramDistanceImpl<4ul, char8_t, false, true, false>, DB::NameNgramSearchCaseInsensitive>::executeImpl(std::__1::vector<DB::ColumnWithTypeAndName, std::__1::allocator<DB::ColumnWithTypeAndName>> const&, std::__1::shared_ptr<DB::IDataType const> const&, unsigned long) const (/usr/bin/clickhouse+0x124d2b83) (BuildId: 76773125d8739591c75d4f4d263a2ffe7ca96855)
        2 0x557fd50023b7 in DB::FunctionToExecutableFunctionAdaptor::executeImpl() const (/usr/bin/clickhouse+0xf2863b7) (BuildId: 76773125d8739591c75d4f4d263a2ffe7ca96855)

      Uninitialized value was stored to memory at
        0 0x557fd4f8da5a in __msan_memcpy (/usr/bin/clickhouse+0xf211a5a) (BuildId: 76773125d8739591c75d4f4d263a2ffe7ca96855)
        1 0x557fd8253803 in DB::NgramDistanceImpl<4ul, char8_t, false, true, false>::vectorConstant(DB::PODArray<char8_t, 4096ul, Allocator<false, false>, 63ul, 64ul> const&, DB::PODArray<unsigned long, 4096ul, Allocator<false, false>, 63ul, 64ul> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, DB::PODArray<float, 4096ul, Allocator<false, false>, 63ul, 64ul>&) (/usr/bin/clickhouse+0x124d7803) (BuildId: 76773125d8739591c75d4f4d263a2ffe7ca96855)
        2 0x557fd824eb83 in DB::FunctionsStringSimilarity<DB::NgramDistanceImpl<4ul, char8_t, false, true, false>, DB::NameNgramSearchCaseInsensitive>::executeImpl(std::__1::vector<DB::ColumnWithTypeAndName, std::__1::allocator<DB::ColumnWithTypeAndName>> const&, std::__1::shared_ptr<DB::IDataType const> const&, unsigned long) const (/usr/bin/clickhouse+0x124d2b83) (BuildId: 76773125d8739591c75d4f4d263a2ffe7ca96855)
        3 0x557fd50023b7 in DB::FunctionToExecutableFunctionAdaptor::executeImpl() const (/usr/bin/clickhouse+0xf2863b7) (BuildId: 76773125d8739591c75d4f4d263a2ffe7ca96855)

      Uninitialized value was stored to memory at
        0 0x557fd4f8da5a in __msan_memcpy (/usr/bin/clickhouse+0xf211a5a) (BuildId: 76773125d8739591c75d4f4d263a2ffe7ca96855)
        1 0x5580061699f5 in detail::memcpySmallAllowReadWriteOverflow15Impl(char*, char const*, long) build_docker/./src/Common/memcpySmall.h:42:13
        2 0x5580061699f5 in memcpySmallAllowReadWriteOverflow15(void*, void const*, unsigned long) build_docker/./src/Common/memcpySmall.h:57:5
        3 0x5580061699f5 in DB::ColumnString::replicate(DB::PODArray<unsigned long, 4096ul, Allocator<false, false>, 63ul, 64ul> const&) const build_docker/./src/Columns/ColumnString.cpp:462:13
        4 0x558005d3fae4 in DB::ColumnConst::convertToFullColumn() const build_docker/./src/Columns/ColumnConst.cpp:48:18

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-11 16:30:17 +02:00
avogar
5ed1b12e19 Fix build 2023-05-11 12:12:43 +00:00
avogar
604bd24995 Refactor, remove no more needed arguments 2023-05-11 11:58:08 +00:00
Robert Schulze
c2a4d89b6f
Fix style 2023-05-11 09:29:05 +00:00
Suzy Wang
d267c914c3
Merge branch 'master' into s390x_reinterpretas_fix 2023-05-10 16:08:50 -04:00
Suzy Wang
24b6ff47ac fix format and some more fix for fixedstring 2023-05-10 13:06:30 -07:00
avogar
9096f62efc Merge branch 'master' of github.com:ClickHouse/ClickHouse into random-structure 2023-05-10 18:46:19 +00:00
Robert Schulze
9795d5403f
Merge branch 'master' into rs/msan-randomStringUTF8 2023-05-10 20:16:49 +02:00
Robert Schulze
374dbd9c39
Fix msan issue in randomStringUTF8(<uneven number>) 2023-05-10 17:49:23 +00:00
Suzy Wang
1b21f13605 ip encoding fix 2023-05-09 13:57:22 -07:00
Suzy Wang
ce471a2e8b Updated code as suggested 2023-05-09 13:31:54 -07:00
FFFFFFFHHHHHHH
4a10f4b3d0
Merge branch 'master' into dot_product 2023-05-09 12:06:28 +08:00
FFFFFFFHHHHHHH
79398f612f fix style 2023-05-09 11:50:38 +08:00
fhbai
a7e04b7576 fix return type 2023-05-09 11:36:15 +08:00
Robert Schulze
e9d9eda3a2
More typedef usage 2023-05-08 12:46:14 +00:00
Robert Schulze
6a454ed6c3
Add Entropies typedef 2023-05-08 12:41:30 +00:00
Robert Schulze
d2dc5e9fc8
Improve naming 2023-05-08 12:36:28 +00:00
Robert Schulze
8b77b706c4
Optimize allocations 2023-05-08 12:31:25 +00:00
Robert Schulze
d2216a4339
Remove leftover 2023-05-08 12:20:40 +00:00
Robert Schulze
1b7c207d7a
Replace ACM link by DOI link 2023-05-08 12:16:12 +00:00
Robert Schulze
fdabce9a68
Move chooseBytes() up 2023-05-08 12:15:19 +00:00
Robert Schulze
03e9522de4
Less namespace clutter 2023-05-08 12:13:52 +00:00
Robert Schulze
267e0c4ef5
More typedef usage 2023-05-08 12:12:24 +00:00
Robert Schulze
b9e8c52057
Fix function registration 2023-05-08 12:08:22 +00:00
Robert Schulze
bb5a25e81c
Fix typo 2023-05-08 12:05:44 +00:00
Robert Schulze
500f3d3951
Add SQL functions for Entropy Learned Hashing
Courtesy to @Dmitry909, I just wrapped up his work.
2023-05-08 10:18:55 +00:00
Robert Schulze
d8d2b0af76
Merge pull request #49466 from ucasfl/str_to_map
add alias str_to_map and mapFromString for extractKeyValuePairs
2023-05-08 10:11:06 +02:00
robot-clickhouse-ci-2
6c02b6b327
Merge pull request #49627 from ClickHouse/rs/obsolete-ccache-knob
CMake: Remove legacy switch for ccache
2023-05-08 00:16:09 +02:00
Robert Schulze
f4eabd967d
Merge pull request #49603 from ClickHouse/rs/makedate-mysql
Implement a MySQL-compatible variant of makeDate()
2023-05-07 21:51:03 +02:00
Robert Schulze
e275da1d31
Remove deprecated logic for ccache 2023-05-07 15:41:56 +00:00
Robert Schulze
b995795971
Fix style 2023-05-07 13:27:57 +00:00
Robert Schulze
aa09b6154b
Various cleanups 2023-05-07 13:06:35 +00:00
Alexey Milovidov
72e1f751bb Fix error in #48300 2023-05-07 04:16:18 +02:00
Robert Schulze
c893302a08
Implement a MySQL-compatible variant of makeDate()
Fixes #49143
2023-05-06 20:11:36 +00:00
Robert Schulze
2986c28761
Small fixes 2023-05-06 18:12:10 +00:00
Robert Schulze
3dfc0bd265
Merge pull request #49413 from azat/build/headers
Slightly reduce inter-header dependencies
2023-05-05 23:37:58 +02:00
Robert Schulze
45c28e1221
Introduce more fields for in-source function documentation 2023-05-05 21:30:21 +00:00
FFFFFFFHHHHHHH
d3e027390d
Merge branch 'master' into dot_product 2023-05-05 10:48:02 +08:00
flynn
236a0d9da0 add alias str_to_map and mapFromString for extractKeyValuePairs 2023-05-03 15:46:17 +00:00
Alexander Tokmakov
e399903030
Merge pull request #48548 from ClickHouse/clusters_is_active_column
Add some columns to system.clusters
2023-05-03 17:42:40 +03:00