Commit Graph

4550 Commits

Author SHA1 Message Date
Vitaly Baranov
65c61877c7
Merge pull request #33435 from BoloniniD/BLAKE3
Integrating Rust code into ClickHouse
2022-10-03 15:25:06 +02:00
Anton Popov
6e61cf92f5 Merge remote-tracking branch 'upstream/master' into HEAD 2022-10-03 13:16:57 +00:00
Roman Vasin
dab5e80c81 Add support of Date32 arguments 2022-10-03 13:15:32 +00:00
Duc Canh Le
528591245f Fix register name 2022-10-03 18:46:20 +08:00
Duc Canh Le
acfc0cf0ab add doc for tryDecrypt 2022-10-03 18:06:54 +08:00
Duc Canh Le
fbb9b3f5ab Merge branch 'master' into ch_canh_fix_decrypt_with_null 2022-10-03 17:56:06 +08:00
AlfVII
8fc073d3f9
Merge branch 'master' into fix-slow-json-extract-with-low-cardinality 2022-10-03 10:00:35 +02:00
Alfonso Martinez
926210b1dd Merge branch 'fix-slow-json-extract-with-low-cardinality' of github.com:AlfVII/ClickHouse into fix-slow-json-extract-with-low-cardinality 2022-10-03 10:00:08 +02:00
Alfonso Martinez
b1fced367e Fixed warning 2022-10-03 09:59:52 +02:00
Robert Schulze
db5ef7b3cb
Merge branch 'master' into generated-file-cleanup 2022-10-02 23:13:18 +02:00
Robert Schulze
d726ca3212
Refactoring: Un-inline some error handling methods 2022-10-02 19:52:40 +00:00
Robert Schulze
8e727d4fbc
Merge pull request #41910 from arenadata/ADQM-583
Improve enable_extended_results_for_datetime_functions option to return results of type DateTime64
2022-10-02 20:46:51 +02:00
BoloniniD
3a3ca51980 Fixes after review 2022-10-02 20:52:05 +03:00
Alexey Milovidov
0d1d177013
Merge pull request #41131 from JackyWoo/add_function_java_int_hash
Support Java integers hashing in `javaHash`
2022-10-02 18:10:51 +03:00
Duc Canh Le
99bf356220
Merge branch 'master' into ch_canh_fix_decrypt_with_null 2022-10-02 14:35:42 +08:00
AlfVII
461141bcc4
Merge branch 'master' into fix-slow-json-extract-with-low-cardinality 2022-10-01 11:35:49 +02:00
BoloniniD
fe7b2fdf6e
Merge branch 'master' into BLAKE3 2022-09-30 18:51:31 +03:00
Anton Popov
9916e35b3f
Merge pull request #41701 from CurtizJ/fix-monotonic-order-by
Fix `ORDER BY` monotonic functions
2022-09-30 17:32:37 +02:00
Alfonso Martinez
875145a813 FIxed corner case in lowCardinalityFixedString and FixedString 2022-09-30 12:41:41 +02:00
Robert Schulze
cc92a2d174
Merge branch 'master' into generated-file-cleanup 2022-09-30 09:56:31 +02:00
Roman Vasin
f716fc2fe5 Put list of functions in correct order in docs; Use new formatting in exception messages. 2022-09-30 07:54:35 +00:00
BoloniniD
55c79230b3 Merge branch 'master' of github.com:ClickHouse/ClickHouse into BLAKE3 2022-09-29 23:53:25 +03:00
Roman Vasin
45414b251d Merge branch 'master' of github.com:ClickHouse/ClickHouse into ADQM-583 2022-09-29 17:02:49 +00:00
Maksim Kita
f5a8acaa4a
Merge pull request #41870 from jiebinn/compare_with_empty_string
FunctionsComparison: optimize the string_vector_constant when empty string constant
2022-09-29 19:30:02 +03:00
Roman Vasin
c6fda60f32 Cleanup code 2022-09-29 15:45:49 +00:00
Roman Vasin
02e08fa301 Use another approach for timeSlot() 2022-09-29 15:44:56 +00:00
Roman Vasin
e0099d52de Use C++20 format() syntax for exceptions 2022-09-29 14:07:02 +00:00
JackyWoo
c70d687cfb fix build error 2022-09-29 19:56:50 +08:00
Roman Vasin
53b34c7e0a Put enable_extended_results_for_datetime_functions into private; correct if/else braces 2022-09-29 11:04:13 +00:00
Roman Vasin
22ccce6946 Remove WithContext; Rename execute_extended_result to executeExtendedResult 2022-09-29 09:29:07 +00:00
JackyWoo
5683676b72 throw exception for unsigned types 2022-09-29 15:50:03 +08:00
Alexey Milovidov
3170e5c4b9
Update FunctionsHashing.h 2022-09-28 21:11:32 +03:00
Roman Vasin
e66e70870a Fix timezone issues 2022-09-28 16:05:59 +00:00
Jiebin Sun
2288970d3e FunctionsComparison: optimize the comparison with empty string
Add the fast path in string_vector_constant if the string constant
is empty string. If the string size a_size and the string constant
size b_size are both 0, they are equal and both empty string. And
there is no need to call memequalSmallAllowOverflow15() for string
comparison.

We have tested the patch on ICX 8380 x 2 server with ClickBench.

Query 5, 10, 12, 13, 14, 15, 18, 20, 21, 22, 24, 25, 26, 27, 29, 34
of Clickbench have gained 2%-6% improvement. The overall geomean has
gained 1% improvement.
2022-09-28 23:48:34 +08:00
Robert Schulze
f24fab7747
Fix some #include atrocities 2022-09-28 13:49:28 +00:00
Robert Schulze
fd86829824
Consolidate config_core.h into config.h
Less duplication, less confusion ...
2022-09-28 13:31:57 +00:00
Robert Schulze
0753fd1c77
Consolidate config_functions.h into config.h
Less duplication, less confusion ...
2022-09-28 12:48:26 +00:00
Robert Schulze
6d70b4a1f6
Generate config_version.h into ${CONFIG_INCLUDE_PATH}
This makes the target location consistent with other auto-generated
files like config_formats.h, config_core.h, and config_functions.h and
simplifies the build of clickhouse_common.
2022-09-28 12:48:26 +00:00
Robert Schulze
78fc36ca49
Generate config.h into ${CONFIG_INCLUDE_PATH}
This makes the target location consistent with other auto-generated
files like config_formats.h, config_core.h, and config_functions.h and
simplifies the build of clickhouse_common.
2022-09-28 12:48:26 +00:00
Sergei Trifonov
dc75c31855
Update src/Functions/FunctionsJSON.cpp 2022-09-28 12:16:20 +02:00
Robert Schulze
06507c40de
${ConfigIncludePath} --> ${CONFIG_INCLUDE_PATH} 2022-09-28 08:28:47 +00:00
Robert Schulze
0c095b30b2
Remove unused file 2022-09-28 08:12:15 +00:00
Robert Schulze
1885bb0524
Make comment consistent accross generated files 2022-09-28 08:11:09 +00:00
Robert Schulze
0a4862f177
Fix style 2022-09-28 07:48:36 +00:00
Alexey Milovidov
683b7e5e95
Merge branch 'master' into add_function_java_int_hash 2022-09-28 01:30:26 +03:00
Robert Schulze
06493a0062
Fix style 2022-09-27 11:54:55 +00:00
Robert Schulze
aa7e62ad5f
Add functio ntryBase58Decode()
- makes it consistent with tryBase64Decode(), follow-up to #39292

- additionally the following minor changes:

  - rename Common/base58.h|cpp to Common/Base58.h|cpp for constency with
    Common/Base64.h|cpp

  - check that (encode|decode|tryDecode)Base64() gets just one argument
2022-09-27 10:18:36 +00:00
Alfonso Martinez
8307b034a5 Fixed typo 2022-09-27 10:56:52 +02:00
Alfonso Martinez
b4f481e724 Added some comments and fixed some format 2022-09-27 10:20:54 +02:00
Roman Vasin
36274baba9 Code cleanup 2022-09-26 15:36:32 +03:00
Roman Vasin
2a92c2aae9 Fix timeSlot() for DateTime64 argument 2022-09-26 15:26:07 +03:00
Kruglov Pavel
3dc54272ed
Merge branch 'master' into improve-combinators 2022-09-26 13:03:32 +02:00
Duc Canh Le
3283cdb3d1 fix style 2022-09-25 13:54:20 +08:00
Duc Canh Le
e882b4f694 remove settings, add tryDecrypt function 2022-09-25 13:50:22 +08:00
Duc Canh Le
a2c6d2267f Merge branch 'master' into ch_canh_fix_decrypt_with_null 2022-09-25 13:06:32 +08:00
Alfonso Martinez
dd47d7fd66 Fixed style 2022-09-23 17:13:59 +02:00
Anton Popov
41fc531ac6
Merge branch 'master' into fix-monotonic-order-by 2022-09-23 17:12:36 +02:00
Alfonso Martinez
9cb74c7807 Low cardinality cases moved to the function for its corresponding type 2022-09-23 14:12:37 +02:00
Anton Popov
4c7a820685 fix order by monotonic functions 2022-09-22 16:21:28 +00:00
Roman Vasin
348f28a3e7 Fix toStartOfFiveMinutes, toStartOfTenMinutes, toStartOfFifteenMinutes 2022-09-22 13:30:40 +00:00
Roman Vasin
cdd20241bf Finish toStartOfDay; fix toStartOfMinute 2022-09-22 13:16:46 +00:00
Kruglov Pavel
55d7addcfe
Merge branch 'master' into fix-format-row 2022-09-22 12:32:58 +02:00
Roman Vasin
4ddd6f3c60 Fix timeSlot(); partial fix StartOfDay 2022-09-22 09:19:22 +00:00
Roman Vasin
49b0da0273 Add date32IsNotSupported() 2022-09-22 07:15:03 +00:00
Roman Vasin
f6ebd94ce1 Add execute_extended_result for Date32 argument 2022-09-22 07:07:04 +00:00
Alexey Milovidov
45afacdae4
Merge pull request #41186 from ClickHouse/fix-three-fourth-of-trash
Fix more than half of the trash
2022-09-22 07:28:26 +03:00
Alexey Milovidov
b47cdd2783
Merge pull request #41602 from ClickHouse/bitmap-wrong-exception-code
Fix wrong exception code in the `bitmapTransform` function
2022-09-22 07:26:22 +03:00
Kruglov Pavel
a4feb81383
Merge pull request #41541 from canhld94/ch_fix_tostring
Fix conversion from nullable fixed string to string
2022-09-21 19:44:14 +02:00
avogar
67fe53e274 Fix style, add tests 2022-09-21 13:01:56 +00:00
Roman Vasin
30c4719f7b Add FunctionDateOrDateTimeToDateTimeOrDateTime64; function toStartOfHour works 2022-09-21 12:58:57 +00:00
Alexey Milovidov
3b6354fc62 Fix clang-tidy 2022-09-21 14:35:24 +02:00
Alexey Milovidov
acc1613e3b Fix wrong exception code in the bitmapTransform function 2022-09-21 05:28:47 +02:00
Alexey Milovidov
f939820a9b Fix clang-tidy 2022-09-21 02:16:47 +02:00
Alexey Milovidov
45bd3cfc30 Merge branch 'master' into fix-three-fourth-of-trash 2022-09-20 21:27:41 +02:00
Alexey Milovidov
1c3696bd38 Merge branch 'fix-three-fourth-of-trash' of github.com:ClickHouse/ClickHouse into fix-three-fourth-of-trash 2022-09-20 05:22:51 +02:00
Alexey Milovidov
cd68c6dc2d Update tests (the new results looks correct) 2022-09-20 05:22:31 +02:00
Alexey Milovidov
a35c62933f
Update src/Functions/sleep.h
Co-authored-by: Robert Schulze <robert@clickhouse.com>
2022-09-20 05:16:21 +03:00
Duc Canh Le
e2bd478b98 fix conversion from nullable fixed string to string 2022-09-20 10:05:18 +08:00
BoloniniD
55fcb98f29 Merge branch 'master' of github.com:ClickHouse/ClickHouse into BLAKE3 2022-09-19 21:53:14 +03:00
Kruglov Pavel
57f0dc1f89
Merge branch 'master' into fix-format-row 2022-09-19 14:37:58 +02:00
Kruglov Pavel
47f6f09ce0
Merge branch 'master' into improve-combinators 2022-09-19 14:31:12 +02:00
Yakov Olkhovskiy
2f2080f5a4
Merge branch 'master' into composable-protocol 2022-09-19 08:15:07 -04:00
Kruglov Pavel
4c3194eefe
Merge pull request #41286 from azat/utf8-fix
Do not allow invalid sequences influence other rows in lowerUTF8/upperUTF8
2022-09-19 14:07:10 +02:00
Alexey Milovidov
000b6ac81c Fix error 2022-09-19 09:30:48 +02:00
Alexey Milovidov
5224fc4d7f Fix strange code 2022-09-19 08:53:20 +02:00
Alexey Milovidov
d1275091e8 Fix strange code 2022-09-19 08:53:20 +02:00
Alexey Milovidov
bb6f1bfce2 Fix 9/10 of trash 2022-09-19 08:53:20 +02:00
Alexey Milovidov
730655d4fd Fix 8/9 of trash 2022-09-19 08:53:20 +02:00
Alexey Milovidov
42b0d444da Fix 7/8 of trash 2022-09-19 08:53:20 +02:00
Alexey Milovidov
91baedf03a Fix 6/7 of trash 2022-09-19 08:53:20 +02:00
Alexey Milovidov
ab4db2d0c4 Fix 5/6 of trash 2022-09-19 08:50:53 +02:00
Alexey Milovidov
84f42e0874 Fix 3/4 of trash 2022-09-19 08:50:53 +02:00
Alexey Milovidov
d4b9fe41be
Merge pull request #41457 from ClickHouse/remove-trash-5
Remove trash from Field
2022-09-19 06:36:48 +03:00
Duc Canh Le
35fbac03f2 fix FunctionDecrypt constructor 2022-09-19 11:08:03 +08:00
Duc Canh Le
91615c34f5 Merge branch 'master' into ch_canh_fix_decrypt_with_null 2022-09-19 10:39:34 +08:00
BoloniniD
5b4fb4bf66
Merge branch 'master' into BLAKE3 2022-09-18 23:14:37 +03:00
Alexey Milovidov
a6e3ab5515 Fix error 2022-09-18 09:21:57 +02:00
Alexey Milovidov
5c75a7d661 Fix error 2022-09-18 09:20:48 +02:00
Yakov Olkhovskiy
c66f412300 pass session certificate for showCertificate() 2022-09-18 07:11:52 +00:00
Alexey Milovidov
c41e37fa8e Code simplification 2022-09-18 08:58:32 +02:00
Alexey Milovidov
a958751c98
Merge pull request #40219 from den-crane/test/zvonand-monotonic
test for datetime64 monotonic #39864
2022-09-18 07:48:49 +03:00
Alexey Milovidov
791de6592b Remove trash from Field 2022-09-18 05:16:08 +02:00
Alexey Milovidov
15e5fc96a1 Add a trap 2022-09-18 02:24:56 +02:00
Alexey Milovidov
f4a48cd4d6 Remove cruft 2022-09-17 22:16:31 +02:00
Robert Schulze
38aecc15a2
Merge pull request #41214 from arenadata/ADQM-528-B
Add enable_extended_results_for_datetime_functions option to return results of type Date32
2022-09-17 19:25:11 +02:00
BoloniniD
1c0047dbff Review fixes 2022-09-17 16:34:19 +03:00
Azat Khuzhin
01b5b9ceba Do not allow invalid sequences influence other rows in lowerUTF8/upperUTF8
Right now lowerUTF8() and upperUTF8() does not respect row boundaries,
and so one row may break another.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-09-17 11:16:45 +02:00
Alexey Milovidov
ada7a44ae4 Remove -WithTerminatingZero methods 2022-09-17 05:34:18 +02:00
Denny Crane
6540d55d42
Merge branch 'master' into test/zvonand-monotonic 2022-09-16 17:02:09 -03:00
avogar
0101cc2e56 Support complex combinators in window transform, arrayReduce*, initializeAggregation and Aggregate functons versionning 2022-09-16 19:07:36 +00:00
BoloniniD
452ef4435b Merge branch 'master' of github.com:ClickHouse/ClickHouse into BLAKE3 2022-09-16 20:05:56 +03:00
Robert Schulze
b32b02d844
Merge pull request #40897 from ClickHouse/catboost-bridge-resurrected
Move CatBoost evaluation into clickhouse-library-bridge
2022-09-16 13:12:09 +02:00
Roman Vasin
bc97da686a Add const for bool enable_extended_results_for_datetime_functions 2022-09-16 09:54:24 +00:00
Roman Vasin
d82cd7b007 Add many cosmetic changes to C++ code and MD docs 2022-09-16 09:37:30 +00:00
Roman Vasin
7e733887b9 Add narrowing for toStartOfISOYear; improve 02403_enable_extended_results_for_datetime_functions 2022-09-16 08:40:15 +00:00
Roman Vasin
5c242de43a Merge branch 'master' of github.com:ClickHouse/ClickHouse into ADQM-528-B 2022-09-15 13:47:51 +00:00
Roman Vasin
563449779c Replace enable_date32_results by enable_extended_results_for_datetime_functions 2022-09-15 12:48:09 +00:00
Roman Vasin
3346d8e27b Add many cosmetic changes to C++ code 2022-09-15 10:56:05 +00:00
Robert Schulze
fd97058e45
fix: incorporate review comments 2022-09-14 15:21:24 +00:00
JackyWoo
2c8e5f0156 Merge branch 'master' into add_function_java_int_hash 2022-09-14 10:11:15 +08:00
Dmitry Novik
7e3fb0a681
Merge pull request #40762 from ClickHouse/grouping-comp
Fix GROUPING function SQL compatibility
2022-09-13 17:03:06 +02:00
JackyWoo
db6dcd6ce9 delete a line 2022-09-13 21:28:24 +08:00
JackyWoo
e3caa6f096 some fixes 2022-09-13 21:21:35 +08:00
Kruglov Pavel
3f4e998802
Merge branch 'master' into fix-format-row 2022-09-13 14:37:10 +02:00
Duc Canh Le
4c8c804e25 fix style 2022-09-13 15:47:52 +08:00
JackyWoo
ccff846e92 Merge branch 'master' into add_function_java_int_hash 2022-09-13 15:25:11 +08:00
Duc Canh Le
7657febf05
Merge branch 'master' into ch_canh_fix_decrypt_with_null 2022-09-13 14:41:09 +08:00
BoloniniD
147dfac11e Try using Corrosion 2022-09-12 23:05:41 +03:00
BoloniniD
fe239e7ee9 Fix style + free err_msg before throwing exception 2022-09-12 18:46:54 +03:00
Alexey Milovidov
2aedd41023
Remove strange code (#40195)
* Remove strange code

* Even more code removal

* Fix style

* Remove even more code

* Simplify code by making it slower

* Attempt to do something

* Attempt to do something

* Well do something with this horrible trash

* Add a test
2022-09-12 16:29:23 +02:00
Duc Canh Le
9d94179217 try make code less ugly 2022-09-12 21:41:18 +08:00
Dmitry Novik
c236706703
Merge branch 'master' into grouping-comp 2022-09-12 15:19:51 +02:00
Duc Canh Le
89ad7d696d
Merge branch 'master' into ch_canh_fix_decrypt_with_null 2022-09-12 18:47:40 +08:00
Duc Canh Le
d901c4dca5 add option to return NULL when decrypt fail 2022-09-12 18:45:43 +08:00
Alexey Milovidov
8b5328f7e5 Merge branch 'master' into fix-trash-base58 2022-09-11 20:43:54 +02:00
BoloniniD
c0024dff3a Add embedded docs for BLAKE3 2022-09-11 19:42:36 +03:00
Alexey Milovidov
e0a9ae0496 Fix base58 trash 2022-09-11 08:09:14 +02:00
Alexey Milovidov
7f1e7b5967 Merge branch 'master' into fix-half-of-trash 2022-09-11 06:20:47 +02:00
Alexey Milovidov
fd235919aa Remove some methods 2022-09-10 05:04:40 +02:00
Alexey Milovidov
fa62c7e982 Fix half of trash 2022-09-10 04:08:16 +02:00
avogar
6d5f9e5554 Proper implementation for rowFormat function, delete rowFormatNoNewLine function 2022-09-09 17:42:33 +00:00
Dmitry Novik
64c4023737
Merge branch 'master' into grouping-comp 2022-09-09 18:05:01 +02:00
Dmitry Novik
877776e569
Update src/Functions/grouping.h 2022-09-09 18:04:29 +02:00
Roman Vasin
3aa33f39ac Fix ErrorCodes for style check 2022-09-09 14:17:46 +00:00
Roman Vasin
da3e938ad6 Enable Date32 results for toStartOfWeek 2022-09-09 13:45:51 +00:00
Roman Vasin
b03a0a724f Enable Date32 results for toStartOfYear, toStartOfISOYear, toStartOfQuarter, toStartOfMonth, toMonday, toLastDayOfMonth 2022-09-09 13:25:52 +00:00
JackyWoo
2399d8b5ae add function javaIntHash 2022-09-09 18:27:29 +08:00
BoloniniD
e8bcbcd016
Merge branch 'master' into BLAKE3 2022-09-09 11:48:31 +03:00
BoloniniD
ff82784b91 Fix MSAN test 2022-09-08 23:41:18 +03:00
Anton Popov
86b29b7f1a fix serilization of Object inside other types 2022-09-08 15:16:39 +00:00
Robert Schulze
c16707ff00
chore: delete obsolete modelEvaluate() function + SYSTEM.MODELS view
- The deleted function modelEvaluate() was superseded by
  catboostEvaluate().

- Also delete the external model repository, as modelEvaluate() was it's
  last user. Additionally remove the system view SYSTEM.MODELS for
  inspecting the repository.

- SYSTEM RELOAD MODELS is also obsolete. HOWEVER, it was retained and
  made a no-op instead of deleted.

  Why?
  The reason is that RBAC in distributed setups works by storing
  privileges (granted and revoked) as plain SQL statements in Keeper.
  Nodes read these statements at startup and parse them. If a privilege
  for SYSTEM RELOAD MODELS exists but parser doesn't recognize it
  nodes would fail to come up.

  Considered but rejected alternatives:
  - Ignore SYSTEM RELOAD MODELS during parsing RBAC privileges and
    return an error for regular SYSTEM RELOAD MODELS SQL. Special-case
    of no-op behavior, too brittle.
  - Remove SYSTEM RELOAD MODELS manually from Keeper via command-line
    manipulation of Keeper nodes or via SQL by dropping the privileges.
    Needs user intervention during upgrade.
2022-09-08 09:10:11 +00:00
Robert Schulze
60f9f6855d
feat: implement catboost in library-bridge
This commit moves the catboost model evaluation out of the server
process into the library-bridge binary. This serves two goals: On the
one hand, crashes / memory corruptions of the catboost library no longer
affect the server. On the other hand, we can forbid loading dynamic
libraries in the server (catboost was the last consumer of this
functionality), thus improving security.

SQL syntax:

  SELECT
    catboostEvaluate('/path/to/model.bin', FEAT_1, ..., FEAT_N) > 0 AS prediction,
    ACTION AS target
  FROM amazon_train
  LIMIT 10

Required configuration:

  <catboost_lib_path>/path/to/libcatboostmodel.so</catboost_lib_path>

*** Implementation Details ***

The internal protocol between the server and the library-bridge is
simple:

- HTTP GET on path "/extdict_ping":
  A ping, used during the handshake to check if the library-bridge runs.

- HTTP POST on path "extdict_request"
  (1) Send a "catboost_GetTreeCount" request from the server to the
      bridge, containing a library path (e.g /home/user/libcatboost.so) and
      a model path (e.g. /home/user/model.bin). Rirst, this unloads the
      catboost library handler associated to the model path (if it was
      loaded), then loads the catboost library handler associated to the
      model path, then executes GetTreeCount() on the library handler and
      finally sends the result back to the server. Step (1) is called once
      by the server from FunctionCatBoostEvaluate::getReturnTypeImpl(). The
      library path handler is unloaded in the beginning because it contains
      state which may no longer be valid if the user runs
      catboost("/path/to/model.bin", ...) more than once and if "model.bin"
      was updated in between.
  (2) Send "catboost_Evaluate" from the server to the bridge, containing
      the model path and the features to run the interference on. Step (2)
      is called multiple times (once per chunk) by the server from function
      FunctionCatBoostEvaluate::executeImpl(). The library handler for the
      given model path is expected to be already loaded by Step (1).

Fixes #27870
2022-09-08 09:01:32 +00:00
Anton Popov
f0a404e2c8 Merge remote-tracking branch 'upstream/master' into HEAD 2022-09-06 15:51:16 +00:00
Alexey Milovidov
940a53e519
Merge pull request #40984 from Lucky-Chang/typo_fix
Fix some typos and clang-tidy warnings
2022-09-06 02:37:50 +03:00
Kruglov Pavel
582216a3ca
Merge pull request #39919 from pzhdfy/UniqSketch
UniqThetaSketch support set operation such as union/intersect/not
2022-09-05 13:42:14 +02:00
Luck-Chang
1ac8e739c9 fix some typos and clang-tidy warnings 2022-09-05 09:50:24 +08:00
Roman Vasin
e1833d3cd6 Merge branch 'master' into ADQM-528-B 2022-09-02 10:40:44 +00:00
Antonio Andelic
e64436fef3 Fix typos with new codespell 2022-09-02 08:54:48 +00:00
Fangyuan Deng
bc7d661668
Merge branch 'master' into UniqSketch 2022-09-01 19:31:53 +08:00
pzhdfy
acec516271 add docs 2022-09-01 19:31:01 +08:00
Fangyuan Deng
5c2f35c302
Merge branch 'master' into UniqSketch 2022-08-31 10:23:46 +08:00
Dmitry Novik
003483b616 Add compatibility setting 2022-08-30 20:49:40 +02:00
Cory Levy
cd371110ad Fix timezone offset formatting 2022-08-29 21:01:27 -04:00
Dmitry Novik
d32492ce8a Fix GROUPING function SQL compatibility 2022-08-29 18:26:56 +02:00
Cory Levy
d0ace9a8fb Add the %z format descriptor to formatDateTime 2022-08-28 22:28:11 -04:00
Alexey Milovidov
00074a5e14
Merge pull request #40711 from ClickHouse/all_new_functions_must_be_documented
All new functions must be documented
2022-08-28 17:05:05 +03:00
Robert Schulze
df934d8762
Merge pull request #40217 from zvonand/zvonand-minmax
Fix conversion Date32 / DateTime64 / Date to narrow types
2022-08-28 09:42:39 +02:00
Alexey Milovidov
1ff535a128 One step back 2022-08-28 02:00:09 +02:00
Alexey Milovidov
8a4bacc4b7 Update example 2022-08-28 00:30:14 +02:00
Alexey Milovidov
6ac9eabbd4 Register more aliases correctly 2022-08-28 00:17:13 +02:00
Alexey Milovidov
a6d99e795e Register more aliases correctly 2022-08-28 00:14:38 +02:00
Alexey Milovidov
e266688576 Remove something stupid 2022-08-27 23:58:34 +02:00
Alexey Milovidov
421d28542b Register all function aliases correctly 2022-08-27 23:53:51 +02:00
Alexey Milovidov
6139cc8f7a Add Documentation to FunctionFactory 2022-08-27 22:06:57 +02:00
Alexey Milovidov
53343e0fc2 Remove accurate_Cast and accurate_CastOrNull, which are unused 2022-08-26 21:27:30 +02:00
Kruglov Pavel
e2e1ca26e9
Merge pull request #40628 from Avogar/fix-to-fixed-string
Fix short circuit execution of toFixedString function
2022-08-26 12:26:16 +02:00
Robert Schulze
1c8c83ccf3
Merge pull request #40620 from zvonand/zvonand-b58
Base58 fix handling leading 0 / '1'
2022-08-26 10:18:26 +02:00
Alexey Milovidov
fcba079baa
Merge pull request #40572 from Avogar/fix-map-bug
Fix possible logical error in arrayElement function with Map
2022-08-25 20:17:22 +03:00
avogar
1844873dbc Fix short circuit execution of toFixedString function 2022-08-25 16:15:22 +02:00
Roman Vasin
5f33ed711b Add checkArguments() to IFunctionDateOrDateTime; Move name and getName() to IFunctionDateOrDateTime and IFunctionCustomWeek.h 2022-08-25 12:38:28 +00:00
zvonand
b3360c169d Fix handling zeros/'1's at the beginning of input 2022-08-25 15:35:39 +03:00
zvonand
23357350d3 Merge branch 'master' of github.com:ClickHouse/ClickHouse into zvonand-minmax 2022-08-25 15:32:04 +03:00
pzhdfy
e85e885d40 add blank line 2022-08-25 18:15:01 +08:00
zvonand
24dfa0c64a tryfix darwin 2022-08-25 12:28:15 +03:00
Kruglov Pavel
25d7d05148
Merge pull request #40567 from ClickHouse/Avogar-patch-1
Better exception messages in FunctionArrayMapped
2022-08-25 11:14:06 +02:00
Roman Vasin
bff61d1081 Add tests for toStartOfWeek(), fix executeImpl for Date32 2022-08-25 08:54:26 +00:00
Roman Vasin
4394929085 Add checkArguments() to IFunctionCustomWeek 2022-08-25 08:30:23 +00:00
Roman Vasin
1285abc195 Add IFunctionCustomWeek and FunctionCustomWeekToDateOrDate32 2022-08-25 08:11:21 +00:00
zvonand
3797e7cfbe Merge branch 'master' of github.com:ClickHouse/ClickHouse into zvonand-minmax 2022-08-25 01:57:49 +03:00
zvonand
a61fd73c88 fix test fails 2022-08-25 00:26:44 +03:00
zvonand
5c00428940 updated as was before 2022-08-24 19:46:17 +03:00
Roman Vasin
01dc483b64 Add IFunctionDateOrDateTime 2022-08-24 15:44:53 +00:00
Alexander Gololobov
c9177d2cb3
Merge pull request #40475 from ClickHouse/allow-conversion-from-string-datetime64-to-date-date32
Allow conversion from String with DateTime64 to Date and Date32
2022-08-24 17:33:20 +02:00
zvonand
b4564099f9 fix inherited parts 2022-08-24 17:07:30 +03:00
avogar
fa320f65f6 Fix possible logical error in arrayElement function with Map 2022-08-24 11:01:39 +00:00
Kruglov Pavel
860d0baa48
Better exception messages in FunctionArrayMapped 2022-08-24 12:28:44 +02:00
zvonand
e257f9d0cd update docs, tests + small fixes 2022-08-24 01:09:14 +03:00
Roman Vasin
423a4f22cc Fix toStartOfYear, toStartOfISOYear, toStartOfQuarter and toStartOfMonth to return Date32 2022-08-23 16:55:53 +00:00
Roman Vasin
505a8ad0e0 Fix toMonday() function to return Date32 2022-08-23 15:39:21 +00:00
zvonand
42f86442ab updated tests + toStartOfWeek 2022-08-23 18:33:09 +03:00
Andrey Zvonov
52159b77d0
Merge branch 'master' into zvonand-minmax 2022-08-23 17:18:57 +03:00
zvonand
6abe09a725 updated tests + teStartOfWeek 2022-08-23 17:13:39 +03:00
Roman Vasin
e40b8f471d Add compatibility parameter enable_date32_results 2022-08-23 12:43:08 +00:00
pzhdfy
d81462c095 check all ArgumentsDataTypes and use equals 2022-08-23 15:08:35 +08:00
Robert Schulze
ec9ef741fc
Merge branch 'master' into less-string-ref 2022-08-22 20:48:16 +02:00
Roman Vasin
d73a80a54c Delete not needed code from getReturnTypeImpl of FunctionDateOrDateTimeToDateOrDate32 2022-08-22 16:24:45 +00:00
zvonand
d789dabc9d Fixed , updated docs 2022-08-22 17:36:56 +03:00
zvonand
e5b4fb4191 fix some issues 2022-08-22 17:36:56 +03:00
zvonand
a7a1269e60 updated tests + improve logic 2022-08-22 17:36:55 +03:00
zvonand
537fb8c4ee fix 1 2022-08-22 17:36:55 +03:00
Alexey Milovidov
a3e4753d83
Merge pull request #40466 from ClickHouse/remove-some-trash
Remove some trash
2022-08-22 15:59:50 +03:00
Alexey Milovidov
c4a60b8123 Allow conversion from String with DateTime64 to Date and Date32 2022-08-22 00:23:10 +02:00
Alexey Milovidov
996aa2d126 Remove some trash 2022-08-21 20:46:33 +02:00
Alexey Milovidov
8dd5c215e0
Merge pull request #40462 from ClickHouse/revert-39000-avx-enablement
Revert "Avx enablement"
2022-08-21 21:32:45 +03:00
Robert Schulze
4caf2c4c33
Reduce some usage of StringRef
See #39535 and #39300
2022-08-21 18:10:32 +00:00
Robert Schulze
3aa5acdb35
Merge pull request #40391 from ClickHouse/reduce-StringRef
Reduce some usage of StringRef
2022-08-21 20:04:57 +02:00
Alexey Milovidov
6fbff6d1f6
Revert "Avx enablement" 2022-08-21 15:47:02 +03:00
Robert Schulze
77e64935e1
Reduce some usage of StringRef 2022-08-19 09:56:59 +00:00
Robert Schulze
24615059fb
Merge pull request #40319 from ClickHouse/custom-error-code-in-throwif
Allow custom error code in throwIf
2022-08-19 09:49:35 +02:00
Robert Schulze
5037ce547f
Only allow 8/16/32-bit signed integers as custom error code data type
The internally used data type for error code is "int", and by
disallowing unsigned integers or integers >= 64 bit we avoid truncation
issues.
2022-08-18 15:16:08 +00:00
Robert Schulze
d6e5a6b7e7
fix: prevent possible race between getReturnTypeImpl() and executeImpl()
I don't know if the context (in detail: the config settings) can change
between calls to getReturnTypeImpl() and executeImpl() but we better
don't find out. Instead, read setting "allow_custom_error_code_in_throwif"
just just once in the ctor of FunctionThrowIf.
2022-08-18 12:24:42 +00:00
Robert Schulze
baae7726a7
style: "Fix" style check in CI 2022-08-18 10:13:58 +00:00
Robert Schulze
f121941916
feat: allow custom error code for SQL function throwIf() 2022-08-17 21:14:40 +00:00
Robert Schulze
164fc3d296
style: improve code aesthetics 2022-08-17 20:39:37 +00:00
Roman Vasin
19cb6d87c8 Add FunctionDateOrDateTimeToDateOrDate32 and make toLastDayOfMonth working 2022-08-17 16:38:06 +00:00
Roman Vasin
be3151937c Set return type to Date32 for toLastDayOfMonth function 2022-08-17 08:51:14 +00:00
pzhdfy
48fcf6580d Merge branch 'UniqSketch' of github.com:pzhdfy/ClickHouse into UniqSketch 2022-08-17 11:26:44 +08:00
pzhdfy
172b5025e2 use camelCase aNotB 2022-08-17 11:24:19 +08:00
Robert Schulze
df889351ad
feat: replace unbounded vectorscan cache by bounded cache
VectorScan patterns can grow large (up to multiple MBs) and queries
involving different VectorScan patterns lead to an ever increasing
pattern cache size.

With this commit, the unbounded cache for vectorscan patterns is
replaced by a bounded cache with "CacheTable" eviction strategy, similar
to what we have for re2 patterns. The cache size is currently hard-coded
to 500 entries.

Fixes #19869
2022-08-16 21:09:47 +00:00
Robert Schulze
89bf69c35f
style: improve code aesthetics and make it slightly safer 2022-08-16 21:09:32 +00:00
Igor Nikonov
e7baf920a9
Merge pull request #39600 from zvonand/zvonand-decimal-overflow
Check Decimal division overflow based on operands scale
2022-08-15 15:03:06 +02:00
Alexey Milovidov
ad936ae32a
Merge pull request #40211 from canhld94/ch_canh_fix_datetime
Fix unexpected result arrayDifference of Array(UInt32)
2022-08-15 04:52:58 +03:00
Igor Nikonov
6f00643bc1
Merge branch 'master' into zvonand-decimal-overflow 2022-08-15 02:38:45 +02:00
Igor Nikonov
2cb78c7220 Detailed comment about overflow check 2022-08-15 00:37:18 +00:00
Duc Canh Le
272447c5dc remove junk log 2022-08-14 16:19:09 +08:00
Duc Canh Le
71dd2a19fc fix arrayDiff 2022-08-14 16:14:13 +08:00
Alexey Milovidov
10022ee974 Fix insufficient argument check for encryption functions 2022-08-14 04:28:30 +02:00
Alexey Milovidov
31dbb68185 Fix array signed const positive subscript operator 2022-08-13 08:04:24 +02:00
Kseniia Sumarokova
a6cfc7bc3b
Merge pull request #34651 from alexX512/master
New caching strategies
2022-08-12 17:23:37 +02:00
Kruglov Pavel
456291f335
Fix style 2022-08-11 13:08:56 +02:00
Igor Nikonov
394a71dcfd More correct fix + tests 2022-08-11 06:24:40 +00:00
Alexey Milovidov
8374f31306
Merge pull request #39425 from arenadata/ADQM-485
Add support of dates from year 1900 to 2299 for Date32 and DateTime64
2022-08-11 05:01:53 +03:00
Nikolay Degterinsky
547662f88b
Merge pull request #40002 from Algunenano/fix_hashid
Fix hashId crash and salt parameter not being used
2022-08-11 02:25:10 +02:00
Alexey Milovidov
53097b3d65
Merge pull request #40015 from tbragin/master
Add parseDateTime64BestEffortUS* functions
2022-08-10 20:00:50 +03:00
Roman Vasin
537ba613dc Remove clamping from AddDaysImpl and AddWeeksImpl; fix 01921_datatype_date32 test 2022-08-10 08:14:16 +00:00
Kruglov Pavel
4bbe5186c2
Merge pull request #39999 from HeenaBansal2009/Support_H_syntax_in_Hour_Interval_Kind
Added H literal for Hour IntervalKind
2022-08-09 11:24:41 +02:00
Roman Vasin
96598e3574 Replace normalizeDayNum() by std::clamp() 2022-08-09 08:52:02 +00:00
Tanya Bragin
eb61db3b67 initial changes to close #37492 2022-08-08 19:45:00 -07:00
HeenaBansal2009
d2bdf6fc3e Added tests 2022-08-08 11:51:05 -07:00
Raúl Marín
c8f8ceea8d Fix hashId crash and salt not applying 2022-08-08 18:25:34 +02:00
alexX512
6bf29cb610 Change class LRUCache to class CachBase. Check running CacheBase with default pcahce policy SLRU 2022-08-07 19:59:30 +00:00
BoloniniD
24f3b00193
Merge branch 'master' into BLAKE3 2022-08-06 19:48:10 +03:00
BoloniniD
ef4ca4279a Fix build cfg for BLAKE3 2022-08-06 18:24:25 +03:00
pzhdfy
2d5446a86b add FunctionsUniqTheta 2022-08-05 14:11:52 +08:00
BoloniniD
48c91619b0
Merge branch 'master' into BLAKE3 2022-08-04 20:30:47 +03:00
Nikolai Kochetov
658a269d56
Merge branch 'master' into use-dag-in-key-condition 2022-08-04 16:18:40 +02:00
Kruglov Pavel
b84e65bb3b
Merge pull request #39716 from arthurpassos/fix_scalar_cte_with_lc_result
Unwrap LC column in IExecutablefunction::executeWithoutSparseColumns
2022-08-03 18:53:37 +02:00
zvonand
99a458c12d small fix 2022-08-03 16:36:36 +03:00
BoloniniD
29084d92d8 Fix CMake for ENABLE_RUST 2022-08-02 20:44:12 +03:00
BoloniniD
b161773f71 Merge branch 'master' of github.com:ClickHouse/ClickHouse into BLAKE3 2022-08-02 20:25:25 +03:00
Arthur Passos
e9b124b4bc Don't use default implementation for LC columns in Scalar functions 2022-08-02 09:17:53 -03:00
Anton Popov
f94d4d4877
Merge branch 'master' into hash-functions-map 2022-08-02 13:26:54 +02:00
Robert Schulze
bf574b9154
Merge pull request #39760 from ClickHouse/bit-fiddling
Use std::popcount, ::countl_zero, ::countr_zero functions
2022-08-01 17:04:51 +02:00
Kruglov Pavel
dfdfabec94
Merge pull request #39218 from evillique/file_default_value
Add default argument to the function `file`
2022-08-01 13:04:19 +02:00
Nikolai Kochetov
22fbfe19a4 Merge branch 'master' into use-dag-in-key-condition 2022-07-31 21:54:12 +02:00
Robert Schulze
a7734672b9
Use std::popcount, ::countl_zero, ::countr_zero functions
- Introduced with the C++20 <bit> header

- The problem with __builtin_c(l|t)z() is that 0 as input has an
  undefined result (*) and the code did not always check. The std::
  versions do not have this issue.

- In some cases, we continue to use buildin_c(l|t)z(), (e.g. in
  src/Common/BitHelpers.h) because the std:: versions only accept
  unsigned inputs (and they also check that) and the casting would be
  ugly.

(*) https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
2022-07-31 15:16:51 +00:00
Robert Schulze
4d7627e45e
Fix include 2022-07-31 07:36:40 +00:00
Robert Schulze
8ca236de08
Enable SQL function getOSKernelVersion() on all platforms
Follow up to PR #38615
2022-07-30 22:36:47 +00:00
Robert Schulze
85773e0926
Merge pull request #38615 from liyinsg/simplified_function_registration_interface
Simplified function registration interface
2022-07-31 00:18:37 +02:00
Alexey Milovidov
36e6500e54
Merge branch 'master' into BLAKE3 2022-07-30 23:14:05 +03:00
Robert Schulze
b52843d5fd
Merge pull request #37951 from zvonand/dt64_timeslots
Fix timeSlots for DateTime64
2022-07-30 20:49:05 +02:00
Alexey Milovidov
8f348edbbd
Merge pull request #39000 from ClickHouse/avx-enablement
Avx enablement
2022-07-30 04:51:07 +03:00
Arthur Passos
f0f19874da fix style 2022-07-29 13:56:01 -03:00
Arthur Passos
59fbd21024 Unwrap LC column in IExecutablefunction::executeWithoutSparseColumns 2022-07-29 12:03:09 -03:00
Nikolai Kochetov
59a11b32ad
Merge branch 'master' into use-dag-in-key-condition 2022-07-29 17:01:33 +02:00
zvonand
b390bcfe7c fix wrong data type cast 2022-07-29 13:25:40 +03:00
Li Yin
4088c0a7f3 Automated function registration
Automated register all functions with below naming convention by
iterating through the symbols:
void DB::registerXXX(DB::FunctionFactory &)
2022-07-29 15:39:50 +08:00
Anton Popov
45da56d802 support hash functions with Map type 2022-07-28 19:15:19 +00:00
Alexander Gololobov
9525bd19bf
Merge pull request #39592 from ClickHouse/fix-wrong-regexp-replace
Fix wrong regexp replace
2022-07-27 09:54:28 +02:00
Antonio Andelic
904a05ac21
Merge pull request #39496 from azat/custom-tld-exclamation-asterisk
Add support of !/* (exclamation/asterisk) in custom TLDs
2022-07-27 08:55:49 +02:00
Vladimir C
d9e8e9b948
Merge pull request #39552 from filimonov/maxsplit-bug
Fix bug with maxsplit in the splitByChar
2022-07-26 11:14:27 +02:00
Alexey Milovidov
833b24b486 Fix the wrong REGEXP_REPLACE alias 2022-07-26 08:01:49 +02:00
Azat Khuzhin
1d4a7c7290 Add support of !/* (exclamation/asterisk) in custom TLDs
Public suffix list may contain special characters (you may find format
here - [1]):
- asterisk (*)
- exclamation mark (!)

  [1]: https://github.com/publicsuffix/list/wiki/Format

It is easier to describe how it should be interpreted with an examples.

Consider the following part of the list:

    *.sch.uk
    *.kawasaki.jp
    !city.kawasaki.jp

And here are the results for `cutToFirstSignificantSubdomainCustom()`:

If you have only asterisk (*):

    foo.something.sheffield.sch.uk -> something.sheffield.sch.uk
    sheffield.sch.uk               -> sheffield.sch.uk

If you have exclamation mark (!) too:

    foo.kawasaki.jp                -> foo.kawasaki.jp
    foo.foo.kawasaki.jp            -> foo.foo.kawasaki.jp
    city.kawasaki.jp               -> city.kawasaki.jp
    some.city.kawasaki.jp          -> city.kawasaki.jp

TLDs had been verified wit the following script [2], to match with
python publicsuffix2 module.

  [2]: https://gist.github.com/azat/c1a7a9f1e3519793134ef4b1df5461a6

v2: fix StringHashTable padding requirements
Fixes: #39468
Follow-up for: #17748
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-07-26 08:34:30 +03:00
Roman Vasin
b462366415 Merge branch 'master' of github.com:ClickHouse/ClickHouse into ADQM-485 2022-07-25 17:55:47 +00:00
Roman Vasin
b412ea5f6d Improve generateRandom() for Date32; fix tests 01087_table_function_generate, 01277_fromUnixTimestamp64, 01691_DateTime64_clamp and 01702_toDateTime_from_string_clamping 2022-07-25 17:06:11 +00:00
Nikolai Kochetov
b70be40804
Merge branch 'master' into use-dag-in-key-condition 2022-07-25 14:30:22 +02:00
Alexander Gololobov
dbcb7e5f1e Fix for empty function name on FreeBSD build 2022-07-25 13:11:36 +02:00
Mikhail Filimonov
33ee858d18
Fix bug with maxsplit in the splitByChar 2022-07-25 13:11:02 +02:00
Alexey Milovidov
6fdcb009ff
Merge pull request #39533 from ClickHouse/now-in-block
Add function `nowInBlock`
2022-07-25 04:22:11 +03:00
Alexey Milovidov
cff712970e Add function nowInBlock 2022-07-24 19:58:48 +02:00
Robert Schulze
c788e05c77
Merge pull request #39292 from zvonand/zvonand-b58-datatype
Simplify Base58 encoding/decoding
2022-07-24 18:09:40 +02:00
Kruglov Pavel
f79924f270
Merge branch 'master' into file_default_value 2022-07-22 21:17:37 +02:00
Robert Schulze
6a631426b7
Disable vectorization for uint64 --> float32 cast 2022-07-22 12:49:16 +00:00
Robert Schulze
cad0e7a62c
Move hot loop inside nested if/else statements
- required to control the vectorization of a specific cast which
  annotates the loop using pragmas
2022-07-22 11:53:23 +00:00
Alexander Tokmakov
bed2206ae9
Merge pull request #39460 from ClickHouse/remove_some_dead_and_commented_code
Remove some dead and commented code
2022-07-22 13:24:34 +03:00
Robert Schulze
ea0a3bf600
Merge branch 'master' into stringref-to-string_view 2022-07-21 18:33:06 +02:00
Kruglov Pavel
f4f05ec786
Merge pull request #39447 from Avogar/better-parse-time-delta
Extend time units in parseTimeDelta function
2022-07-21 17:39:40 +02:00
Andrey Zvonov
e473606cd1
Merge branch 'master' into zvonand-b58-datatype 2022-07-21 15:55:05 +02:00
Alexander Tokmakov
a8da5d96fc remove some dead and commented code 2022-07-21 15:05:48 +02:00
avogar
d63786e709 Better 2022-07-21 09:07:23 +00:00
avogar
49d980d16c Extend time units in parseTimeDelta function 2022-07-21 08:55:51 +00:00
Roman Vasin
e3192cf753 Correct docs to reflect new range 1900..2299 for Date32 and DateTime64; Cleanup code 2022-07-20 15:19:02 +00:00
lokax
22792d1c6a fix style
Signed-off-by: lokax <m632656684@gmail.com>
2022-07-20 21:20:00 +08:00
lokax
140f5e6685 recursive check array offsets
Signed-off-by: lokax <m632656684@gmail.com>
2022-07-20 21:20:00 +08:00
lokax
d40f04b860 fix function name
Signed-off-by: lokax <m632656684@gmail.com>
2022-07-20 21:20:00 +08:00
lokax
ff433c1c01 fix build
Signed-off-by: lokax <m632656684@gmail.com>
2022-07-20 21:20:00 +08:00
lokax
647eafa00e support for array type 2022-07-20 21:20:00 +08:00
lokax
5c6b18a9bd fix: 3rd parameter must be constant 2022-07-20 21:20:00 +08:00
lokax
6e23d2cb85 feat(function): tupleElement with default value 2022-07-20 21:20:00 +08:00
Nikolay Degterinsky
c9ae525da7
Merge branch 'master' into file_default_value 2022-07-20 14:22:58 +02:00
zvonand
11b8d788ca small improvements 2022-07-20 13:36:47 +02:00
zvonand
592499e290 fix server crash on negative size 2022-07-19 23:11:03 +02:00
Nikolai Kochetov
f570cde815 Fixing build. 2022-07-19 20:19:57 +00:00
Roman Vasin
5f9c293963 Fix addDays() and addWeeks() in upper and lower limits of Date and Date32 2022-07-19 17:29:08 +00:00
Nikolai Kochetov
eaeb30a71a Merge branch 'master' into use-dag-in-key-condition 2022-07-19 18:39:52 +02:00
Robert Schulze
9d0e3d107b
Update src/Functions/formatReadableTimeDelta.cpp
Co-authored-by: Nikolay Degterinsky <43110995+evillique@users.noreply.github.com>
2022-07-19 13:17:33 +02:00
alesapin
bdc09c8319
Merge pull request #39303 from ClickHouse/whitespaces
Whitespaces
2022-07-19 13:17:29 +02:00
Kruglov Pavel
12221cffc9
Merge pull request #39071 from jiahui-97/parse_timedelta
implementation of parseTimeDelta function
2022-07-19 12:39:56 +02:00
zvonand
d245d6349a fixed zero check 2022-07-19 12:08:08 +02:00
Nikolay Degterinsky
4ae356f218 Fix NULL, add test 2022-07-19 09:15:42 +00:00
Robert Schulze
81ef1099cc
Even less usage of StringRef
--> see #39300
2022-07-19 07:01:06 +00:00
jiahui-97
e7af88b688 implementation of parseTimeDelta function
Co-authored-by: Kruglov Pavel <48961922+Avogar@users.noreply.github.com>
2022-07-19 09:33:02 +08:00
Robert Schulze
6df3c9d799
Merge pull request #39300 from ClickHouse/stringref-to-stringview
First try at reducing the use of StringRef
2022-07-18 19:20:16 +02:00
Robert Schulze
74e55e42f6
Merge pull request #39167 from ClickHouse/multiStringAllPositions-non-const-needle
multiStringAllPositions() with non-const needle
2022-07-18 15:30:14 +02:00
Kruglov Pavel
85f8b5990f
Merge pull request #39192 from Avogar/improve-is-nullable
Improve isNullable/isConstant/isNull/isNotNull performance for LowCardinality argument
2022-07-18 12:33:34 +02:00
Robert Schulze
3b0ca82d56
Fix build, pt. II 2022-07-18 09:41:01 +00:00
Alexey Milovidov
03aec06da1 Whitespaces 2022-07-17 23:20:05 +02:00
zvonand
18827ab312 Merge branch 'master' of github.com:ClickHouse/ClickHouse into dt64_timeslots 2022-07-17 22:59:18 +02:00
Robert Schulze
13482af4ee
First try at reducing the use of StringRef
- to be replaced by std::string_view
- suggested in #39262
2022-07-17 17:26:02 +00:00
Andrey Zvonov
e0d1954fac
Merge branch 'master' into dt64_timeslots 2022-07-16 22:33:46 +02:00
zvonand
728219e640 updated due to review 2022-07-16 22:16:19 +02:00
zvonand
d07a652883 merge 2022-07-16 19:05:07 +02:00
zvonand
4ab52b6873 added new DataType + fixes 2022-07-16 18:58:47 +02:00
avogar
7508448275 Better 2022-07-15 16:23:56 +00:00
Kruglov Pavel
0867e5fc4b
Merge branch 'master' into multiStringAllPositions-non-const-needle 2022-07-15 14:26:24 +02:00
Robert Schulze
deda29b46b
Pass const StringRef by value, not by reference
See #39224
2022-07-15 11:34:56 +00:00
Kruglov Pavel
a944a92d4c
Merge pull request #39224 from Avogar/string-view-by-value
Pass const std::string_view by value, not by reference
2022-07-15 13:05:56 +02:00
Roman Vasin
1d0818d9cf Set max year to 2299; Code cleanup; Make working 02245_make_datetime64 test 2022-07-15 10:33:52 +00:00
Nikolay Degterinsky
6f5275d9bb Fix build 2022-07-15 10:08:14 +00:00
Roman Vasin
266039ea64 Correct gTests for DateLUT 2022-07-14 19:00:17 +00:00
avogar
9291d33080 Pass const std::string_view & by value, not by reference 2022-07-14 16:11:57 +00:00
Maksim Kita
f5bacedaf9
Merge pull request #38553 from hexiaoting/mapupdate_dev
Fix bug for mapUpdate
2022-07-14 17:59:37 +02:00
Robert Schulze
ac5a06d944
Update doxygen 2022-07-14 11:02:01 +00:00
Kruglov Pavel
6d85dcd8a8
Update src/Functions/isNotNull.cpp
Co-authored-by: Igor Nikonov <954088+devcrafter@users.noreply.github.com>
2022-07-14 12:51:50 +02:00
Nikolay Degterinsky
9ca2935d72 Add default argument to function file 2022-07-14 09:54:44 +00:00
Robert Schulze
198abad284
Disallow const haystack with non-const needle argument 2022-07-14 08:16:01 +00:00
Robert Schulze
6ec4f3cf3d
Implement non-const needle arguments in multiSearchAllPositions 2022-07-14 06:24:28 +00:00
avogar
390b1ac2f7 Improve isNullable/isConstant/isNull/isNotNull performance for LowCardinality argument 2022-07-13 17:56:34 +00:00
mergify[bot]
79e6f85412
Merge branch 'master' into mapupdate_dev 2022-07-13 17:06:51 +00:00
Nikolay Degterinsky
e3a79520c1 Merge remote-tracking branch 'upstream/master' into translate 2022-07-13 11:42:40 +00:00
Nikolay Degterinsky
c344e2de84 Fix style 2022-07-13 11:41:08 +00:00
Robert Schulze
04c1cb207b
Use fmt-based exceptions in FunctionsMultiStringPosition
(for consistency with Functions/FunctionsMultiStringSearch.h)
2022-07-13 09:46:40 +00:00
Robert Schulze
59a64ac902
Rename some variables
(for consistency with Functions/FunctionsMultiStringSearch.h)
2022-07-13 09:46:39 +00:00
Robert Schulze
f8c38b0d3a
Remove obsolete doxygen
(the removed functions are implemented elsewhere)
2022-07-13 09:46:38 +00:00
Nikolay Degterinsky
daae55b4ac Better 2022-07-13 01:41:27 +00:00
Anton Popov
81da0bb9f5 keep LowCardinality type in array() and map() functions 2022-07-12 13:31:00 +00:00
Andrey Zvonov
7bfe155f09
Merge branch 'master' into dt64_timeslots 2022-07-12 10:26:38 +03:00
Anton Popov
72fe4ce680 fix function toColumnTypeName with LowCardinality 2022-07-12 03:12:42 +00:00
Anton Popov
b67405915e keep LowCardinality type in tuple() function 2022-07-12 02:01:41 +00:00
Kruglov Pavel
57a719bafd
Merge pull request #39037 from amosbird/index-fix-1-again
Fix toHour() monotonicity which can lead to incorrect query result (incorrect index analysis) (second try)
2022-07-11 13:36:01 +02:00
Robert Schulze
5b8e448c7e
Merge pull request #39012 from ClickHouse/fix-crashing-stringsearch-with-empty-needle
Don't throw Logical error in functions multiMatch[Fuzzy](AllIndices/Any/AnyIndex)() with empty needle
2022-07-11 08:13:52 +02:00
Amos Bird
28aefc33a0
Revert "Merge pull request #39001 from ClickHouse/revert-38675-index-fix-1"
This reverts commit 1cf01bb959, reversing
changes made to af1136c990.
2022-07-08 23:22:10 +08:00
Robert Schulze
8e1a3cd194
Don't crash in functions multiMatch[Fuzzy](AllIndices/Any/AnyIndex)() with empty needle
Queries like
  "select multiMatchAnyIndex('abc', []::Array(String))"
were not properly handled and crashed.
2022-07-08 11:18:53 +00:00
Alexander Tokmakov
d4784203b7
Revert "Fix toHour() monotonicity which can lead to incorrect query result (incorrect index analysis)" 2022-07-08 12:51:30 +03:00
Robert Schulze
524f39551c
Merge pull request #38485 from ClickHouse/multi-match-with-non_const-patterns
Multi match with non const patterns
2022-07-08 09:29:10 +02:00
Roman Vasin
12f4a48957 Extend LUT range to 1900..2300 2022-07-08 06:48:05 +00:00
Alexey Milovidov
89fdcbf08c
Merge pull request #38675 from amosbird/index-fix-1
Fix toHour() monotonicity which can lead to incorrect query result (incorrect index analysis)
2022-07-08 00:32:27 +03:00
Robert Schulze
49348b833a
Simplify 2022-07-07 20:25:26 +00:00
Robert Schulze
1de5e9a7da
Avoid copy-ing array elements 2022-07-07 12:33:34 +00:00
Nikolay Degterinsky
1869b7f408 Add functions translate & translateUTF8 2022-07-07 08:25:00 +00:00
Robert Schulze
dec184f61d
Add comment about const needle + const haystack 2022-07-06 17:41:15 +00:00
Robert Schulze
8a18705729
More variable renamings for more uniformity 2022-07-06 14:55:02 +00:00
Robert Schulze
0b0d64c5ca
Don't resize output vector in each loop iteration 2022-07-06 14:45:22 +00:00
Robert Schulze
144c1edb03
Inherit default implementation of getArgumentsThatAreAlwaysConstant() 2022-07-06 14:42:19 +00:00
mergify[bot]
8cc2c3914b
Merge branch 'master' into isnullable 2022-07-06 14:40:01 +00:00
Robert Schulze
6a907b23fb
Replace typeid_cast() with checkAndGetColumnConst()
... syntactic sugar
2022-07-06 14:38:28 +00:00
Robert Schulze
0c4da85e75
More uniform naming 2022-07-06 14:29:42 +00:00
mergify[bot]
e5535f5ab4
Merge branch 'master' into mapupdate_dev 2022-07-06 13:54:36 +00:00
Nikolai Kochetov
020c99a269
Merge pull request #38617 from azat/contrib-debug-symbols
Add separate option to omit symbols from heavy contrib
2022-07-06 14:40:24 +02:00
Robert Schulze
d0b2f13f9d
Fix style check 2022-07-05 13:41:52 +02:00
lokax
e6bd0105b1 feat(Function): isNullable
Signed-off-by: lokax <m632656684@gmail.com>
2022-07-05 15:51:53 +08:00
lokax
849c46e6fa feat(Function): isNullable 2022-07-05 15:23:07 +08:00
mergify[bot]
f14b62b2d6
Merge branch 'master' into index-fix-1 2022-07-04 22:50:36 +00:00
Robert Schulze
1eed72b525
Make more multi-search methods work with non-const needles
After making function multi[Fuzzy]Match(Any|AnyIndex|AllIndices)() work
with non-const needles, 12 more functions started to fail in test
"00233_position_function_family":

multiSearchAny()
multiSearchAnyCaseInsensitive()
multiSearchAnyUTF8
multiSearchAnyCaseInsensitiveUTF8()

multiSearchFirstPosition()
multiSearchFirstPositionCaseInsensitive()
multiSearchFirstPositionUTF8()
multiSearchFirstPositionCaseInsensitiveUTF8()

multiSearchFirstIndex()
multiSearchFirstIndexCaseInsensitive()
multiSearchFirstIndexUTF8()
multiSearchFirstIndexCaseInsensitiveUTF8()

Failing queries take the form
  select 0 = multiSearchAny('\0', CAST([], 'Array(String)'));
2022-07-04 14:00:21 +00:00
Robert Schulze
ece61f6da3
Fix davenger's review comments
https://github.com/ClickHouse/ClickHouse/pull/38434#discussion_r907397214
https://github.com/ClickHouse/ClickHouse/pull/38434#discussion_r907385290
https://github.com/ClickHouse/ClickHouse/pull/38434#discussion_r907406097

(the latter is no longer relevant as the affected places were removed in
the meantime)
2022-07-04 10:43:21 +00:00
Robert Schulze
d547aa7849
Allow non-const pattern array argument in multi[Fuzzy]Match*()
Resolves #38046
2022-07-04 10:43:16 +00:00
Alexander Gololobov
8ce8158f7f Do computations in Float32 (not Float64) for arrays of Float32 2022-07-03 10:33:11 +02:00
Alexander Gololobov
c6691cc5f2 Improved vectorized execution of main loop for array norm/distance 2022-07-02 22:45:22 +02:00
mergify[bot]
b016be264c
Merge branch 'master' into squared_l2 2022-07-02 09:17:28 +00:00
Azat Khuzhin
e8f5cd3c68 Add separate option to omit symbols from heavy contrib
Sometimes it is useful to build contrib with debug symbols for further
debugging.

With everything turned ON (i.e. debug build) I got 3.3GB vs 3.0GB w/o
this patch, 9% bloat, thoughts about this is this OK or not for you, if
not STRIP_DEBUG_SYMBOLS_HEAVY_CONTRIB can be OFF by default (regardless
of build type).

P.S. aws debug symbols adds just 1.7%.
v2: rename STRIP_HEAVY_DEBUG_SYMBOLS
v3: OMIT_HEAVY_DEBUG_SYMBOLS
v4: documentation had been removed
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-07-02 06:32:03 +03:00
Robert Schulze
2a1ede0f5a
Merge pull request #38589 from ClickHouse/fix-zero-bytes-in-haystack
Fix countSubstrings() & position() on patterns with 0-bytes
2022-07-01 16:15:43 +02:00
Amos Bird
84a407f381
Fix toHour monotonicity 2022-07-01 18:24:24 +08:00
Alexander Gololobov
b2b31103c5 Reuse common code for L2Squared and L2 2022-06-30 14:12:25 +02:00
Azat Khuzhin
a47355877e
Add revision() function (#38555)
It can be useful to match versions, since in some tables
(system.trace_log) there is only revision column.

P.S. came to this when was digging into stress reports from CI.
P.P.S. case insensitive by analogy with version().

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-06-30 12:58:26 +02:00
Robert Schulze
81bb2242fd
Fix countSubstrings() & position() on patterns with 0-bytes
SQL functions countSubstrings(), countSubstringsCaseInsensitive(),
countSubstringsUTF8(), position(), positionCaseInsensitive(),
positionUTF8() with non-const pattern argument use fallback sorters
LibCASCIICaseSensitiveStringSearcher and LibCASCIICaseInsensitiveStringSearcher
which call ::strstr(), resp. ::strcasestr(). These functions assume that
the haystack is 0-terminated and they even document that. However, the
callers did not check if the haystack contains 0-byte (perhaps because
its sort of expensive). As a consequence, if the haystack contained a
zero byte in it's payload, matches behind this zero byte were ignored.

    create table t (id UInt32, pattern String) engine = MergeTree() order by id;
    insert into t values (1, 'x');
    select countSubstrings('aaaxxxaa\0xxx', pattern) from t;

We returned 3 before this commit, now we return 6
2022-06-29 21:41:18 +00:00
hexiaoting
e32a0838d1 fix bug for mapUpdate 2022-06-29 15:52:08 +08:00
Julian Gilyadov
d0d72e0b5e
Implement L2Squared Distance and Norm 2022-06-28 15:28:58 -04:00
BoloniniD
6ddcec0906
Merge branch 'master' into BLAKE3 2022-06-28 16:53:06 +03:00
Miel Donkers
4e9f396a48
Small improvement of the error message to hint at possible issue (#38458) 2022-06-28 13:36:30 +02:00
Robert Schulze
959cbaab02
Move loop over patterns into implementations
- This is preparation for non-const regexp arguments, where this loop
  will run for each row.
2022-06-26 16:26:13 +00:00
Robert Schulze
cb5d1a4a85
Fix style check 2022-06-26 16:25:49 +00:00
Robert Schulze
b8f67185bf
Cosmetics: Whitespaces 2022-06-26 16:25:49 +00:00
Robert Schulze
e2b11899a1
Move check if cfg allows hyperscan into implementations
- This is not needed for non-const regexp array arguments but cleans up
  the code and runs the check only in functions which actually use
  hyperscan.
2022-06-26 16:25:49 +00:00
Robert Schulze
c2cea38b97
Move local variable into if statement 2022-06-26 16:25:49 +00:00
Robert Schulze
c9ce0efa66
Instantiate MultiMatchAnyImpl template using enums
- With this, invalid combinations of the FindAny/FindAnyIndex bools are
  no longer possible and we can remove the corresponding check

- Also makes the instantiations more readable.
2022-06-26 16:25:49 +00:00
Robert Schulze
2f15d45f27
Move check for regexp array size into implementations
- This is not needed for non-const regexp array arguments (the
  cardinality of arrays is fixed per column) but it cleans up the code
  and runs the check only in functions which have restrictions on the
  number of patterns.

- For functions using hyperscans, it was checked that the number of
  regexes is < 2^32. Removed the check because I don't think anyone will
  every specify 4 billion patterns.
2022-06-26 16:25:43 +00:00
Robert Schulze
3478db9fb6
Move check for regexp array size into implementations
- This is not needed for non-const regexp array arguments (the
  cardinality of arrays is fixed per column) but it cleans up the code
  and runs the check only in functions which have restrictions on the
  number of patterns.

- For functions using hyperscans, it was checked that the number of
  regexes is < 2^32. Removed the check because I don't think anyone will
  every specify 4 billion patterns.
2022-06-26 15:38:12 +00:00
Robert Schulze
7913edc172
Move check for hyperscan regexp constraints into implementations
- This is preparation for non-const regexp arguments, where this check
  will run for each row.
2022-06-26 15:38:05 +00:00
Robert Schulze
89bfdd50bf
Remove unnecessary check
- getReturnTypeImpl() ensures that the haystack column has type "String"
  and we can simply assert that.
2022-06-26 15:34:24 +00:00
Robert Schulze
580d89477f
Minimally faster performance 2022-06-26 15:34:22 +00:00
Robert Schulze
4bc59c18e3
Cosmetics: Move some code around + docs + whitespaces + minor stuff 2022-06-26 15:34:15 +00:00
Robert Schulze
1273756911
Cosmetics: fmt-based exceptions 2022-06-26 15:33:18 +00:00
Robert Schulze
e5c74a14f7
Cosmetics: More consistent naming
- rename utility function and file to "checkHyperscanRegexp"
2022-06-26 15:33:18 +00:00
Robert Schulze
072e0855a8
Cosmetics: Make member variables const 2022-06-26 15:32:26 +00:00
Robert Schulze
2ebfd01c2e
Cosmetics: Pull out settings variable 2022-06-26 15:32:23 +00:00
Robert Schulze
bb7c627964
Cosmetics: Pass patterns around as std::string_view instead of StringRef
- The patterns are not used in hashing, there should not be a performance
  impact when we use stuff from the standard library instead.

- added forgotten .reserve() in FunctionsMultiStringPosition.h
2022-06-26 15:32:19 +00:00
Alexey Milovidov
b3098822e0
Merge pull request #38171 from ClickHouse/hyper-to-vectorscan
Replace hyperscan by vectorscan
2022-06-26 10:01:45 +03:00
Alexey Milovidov
0654684bd4 Fix wrong implementation of filesystem* functions 2022-06-25 06:10:50 +02:00
Andrey Zvonov
ea73d9c492
Merge branch 'master' into zvonand-base58 2022-06-24 21:37:20 +03:00
Kruglov Pavel
0201d62090
Merge pull request #38173 from Avogar/fix-short-circuit
Fix bug with nested short-circuit functions
2022-06-24 16:04:17 +02:00
Robert Schulze
2c828338f4
Replace hyperscan by vectorscan
This commit migrates ClickHouse to Vectorscan. The first 10 min of
[0] explain the reasons for it.

(*) Addresses (but does not resolve) #38046

(*) Config parameter names (e.g. "max_hyperscan_regexp_length") are
    preserved for compatibility. Likewise, error codes (e.g.
    "ErrorCodes::HYPERSCAN_CANNOT_SCAN_TEXT") and function/class names (e.g.
    "HyperscanDeleter") are preserved as vectorscan aims to be a drop-in
    replacement.

[0] https://www.youtube.com/watch?v=KlZWmmflW6M
2022-06-24 10:47:52 +02:00
Andrey Zvonov
c18d09a617
Merge branch 'master' into zvonand-base58 2022-06-24 07:05:49 +03:00
zvonand
dd8203038f updated exception handling 2022-06-24 00:36:57 +05:00
zvonand
a94c40e33a Merge branch 'master' of github.com:ClickHouse/ClickHouse into dt64_timeslots 2022-06-23 17:08:28 +05:00
mergify[bot]
234f0c6399
Merge branch 'master' into revert-35914-FIPS_compliance 2022-06-23 12:06:17 +00:00
zvonand
946117ec89 Merge branch 'master' of github.com:ClickHouse/ClickHouse into zvonand-base58 2022-06-23 17:04:40 +05:00
Alexey Milovidov
5855668514 Remove trash 2022-06-22 06:23:35 +02:00
mergify[bot]
bb79eb73e6
Merge branch 'master' into fix-short-circuit 2022-06-21 10:40:07 +00:00
Kruglov Pavel
b9b58b4305
Merge pull request #37759 from Avogar/fix-nothing-error
Fix possible logical error with type Nothing in some functions
2022-06-21 12:35:05 +02:00
Larry Luo
bbd73ba727 use utility methods to access x509 struct fields. 2022-06-20 21:27:33 -04:00
zvonand
22af00b757 rename variable + fix handling of ENABLE_LIBRARIES 2022-06-20 23:53:47 +05:00
mergify[bot]
b440ee84ae
Merge branch 'master' into fix-short-circuit 2022-06-20 15:14:19 +00:00
zvonand
d4e5686b99 minor: fix message for base64 2022-06-20 20:13:09 +05:00
zvonand
78d55d6f46 small fixes 2022-06-20 19:30:54 +05:00
zvonand
832fd6e0a9 Added tests + minor updates 2022-06-19 23:10:28 +05:00
Alexey Milovidov
0cf88e0950
Revert "ClickHouse's boringssl module updated to the official version of the FIPS compliant." 2022-06-18 23:16:18 +03:00
zvonand
f4b3af091d fix zero byte 2022-06-17 23:48:14 +05:00
avogar
23f48a9fb9 Fix bug with nested short-circuit functions 2022-06-17 11:44:49 +00:00
Andrey Zvonov
f987f461e5 fix style -- rm unused ErrorCode 2022-06-17 15:00:32 +05:00
Igor Nikonov
baebbc084f
Merge pull request #38027 from ClickHouse/decimal_rounding_fix
Fix: rounding for Decimal128/Decimal256 with more than 19-digits long scale
2022-06-17 09:48:18 +02:00
zvonand
c1b2b669ab remove wrong code 2022-06-17 01:52:45 +05:00
mergify[bot]
f46f7257dd
Merge branch 'master' into fix-nothing-error 2022-06-16 10:58:03 +00:00
mergify[bot]
2557e8ad51
Merge branch 'master' into decimal_rounding_fix 2022-06-16 10:53:49 +00:00
avogar
a3a7cc7a5d Fix logical error in array mapped functions with const nullable column 2022-06-16 10:41:53 +00:00
zvonand
a800158438 wip upload 2022-06-16 15:11:41 +05:00
Danila Kutenin
048f56bf4d Fix some tests and comments 2022-06-15 14:40:21 +00:00
Danila Kutenin
08e3f77a9c Optimize most important parts with NEON SIMD
First part, updated most UTF8, hashing, memory and codecs. Except
utf8lower and upper, maybe a little later.

That includes huge amount of research with movemask dealing. Exact
details and blog post TBD.
2022-06-15 13:19:29 +00:00
mergify[bot]
d704264fae
Merge branch 'master' into decimal_rounding_fix 2022-06-15 10:47:09 +00:00
zvonand
c149c916ec initial setup 2022-06-15 11:49:55 +05:00
Igor Nikonov
bf7dd39282 Fix: decimal rounding
Fixes #37531
2022-06-14 18:03:05 +00:00
Maksim Kita
dc2e117cce UnaryLogicalFunctions improve performance using dynamic dispatch 2022-06-14 17:30:11 +02:00
zvonand
a5a980b69d Added no_sanitize 2022-06-13 19:45:54 +05:00
zvonand
54b8709cb1 minor fix 2022-06-13 19:21:07 +05:00
BoloniniD
617ef0809f Minor fixes 2022-06-12 01:22:33 +03:00
Robert Schulze
5f5732a2c4
Merge pull request #37969 from ClickHouse/consistent-macro-usage
More consistent use of platform macros
2022-06-10 14:10:01 +02:00
zvonand
fb67b080b9 added docs 2022-06-10 14:30:17 +03:00
zvonand
551d1ea875 fix wrong interval 2022-06-10 13:21:31 +03:00
Robert Schulze
1a0b5f33b3
More consistent use of platform macros
cmake/target.cmake defines macros for the supported platforms, this
commit changes predefined system macros to our own macros.

__linux__ --> OS_LINUX
__APPLE__ --> OS_DARWIN
__FreeBSD__ --> OS_FREEBSD
2022-06-10 10:22:31 +02:00
zvonand
e19653618c fix wrongfully added submodule 2022-06-10 11:19:38 +03:00
zvonand
16087ea400 enable dt64 for timeslots 2022-06-09 15:28:18 +03:00
Maksim Kita
0c1211eb61
Merge pull request #37930 from kitaisreal/function-dict-get-check-arguments-size
Function dictGet check arguments size
2022-06-08 23:25:14 +02:00
Maksim Kita
b7152fa2bf Function dictGet check arguments size 2022-06-08 17:19:30 +02:00
Maksim Kita
7d1a43cfeb Fix setting cast_ipv4_ipv6_default_on_conversion_error for internal cast 2022-06-08 12:43:39 +02:00
Maksim Kita
4e160105b9
Merge pull request #37805 from kitaisreal/dictionaries-hierarchy-nullable-key-support
Hierarchical dictinaries support nullable parent key
2022-06-08 12:36:09 +02:00
Anton Popov
df6882d2b9
Revert "Fix errors of CheckTriviallyCopyableMove type" 2022-06-07 13:53:10 +02:00
mergify[bot]
014d9e2144
Merge branch 'master' into fix-nothing-error 2022-06-07 11:24:28 +00:00
avogar
cbd50aecd4 Fix 2022-06-07 11:23:59 +00:00
Vitaly Baranov
d199478169
Merge pull request #37303 from ClickHouse/fix_trash
Try to fix some trash
2022-06-07 10:17:39 +02:00
BoloniniD
b05ee41d25 Merge branch 'master' of github.com:ClickHouse/ClickHouse into BLAKE3 2022-06-06 16:03:10 +03:00
Robert Schulze
2d87af2a15
Merge pull request #37647 from DevTeamBK/Fix-all-CheckTriviallyCopyableMove-Errors
Fix errors of CheckTriviallyCopyableMove type
2022-06-05 19:58:47 +02:00
zvonand
5475f62363 32 to 64 2022-06-05 13:06:48 +03:00
Maksim Kita
6db5c08fde Functions dictGetChildren, dictGetDescendants added support for nullable parent key 2022-06-03 17:36:16 +02:00
Maksim Kita
a0cbbd9edc Hierarchical Cache, Direct dictionaries added support for nullable parent key 2022-06-03 17:21:55 +02:00
Anton Popov
f592a802a1
Merge pull request #37482 from CurtizJ/json-new-serialization
Better binary serialization of `ColumnObject`
2022-06-03 13:29:19 +02:00
Robert Schulze
05f08357a9
Merge pull request #37764 from ClickHouse/like_with_trailing_backslash
Disallow LIKE patterns with trailing escape
2022-06-03 13:19:51 +02:00
Alexey Milovidov
1529d47207
Merge pull request #34754 from ClickHouse/llvm-14
Switch to clang/llvm 14
2022-06-03 14:07:34 +03:00
Alexey Milovidov
de16784832
Merge pull request #37633 from ClickHouse/dump-column-structure-more-precise
More precise result of the `dumpColumnStructure` and `byteSize` miscellaneous functions
2022-06-03 14:05:20 +03:00
Alexey Milovidov
ea89f81a78 Merge branch 'master' of github.com:ClickHouse/ClickHouse into llvm-14 2022-06-03 03:07:14 +02:00
Robert Schulze
657662d89f
Minor follow-up to cache table: std::{vector-->array} 2022-06-02 20:18:10 +02:00
Maksim Kita
20b55a45b2 Hierarchical dictionaries support nullable parent key 2022-06-02 19:24:23 +02:00
HeenaBansal2009
e3080f2a97 Merge remote-tracking branch 'origin' into Fix-all-CheckTriviallyCopyableMove-Errors 2022-06-02 07:30:08 -07:00
Alexander Gololobov
b34782dc6a
Merge pull request #37775 from liuneng1994/fix_date32_to_string
fix toString error on DatatypeDate32
2022-06-02 16:40:47 +03:00
Vladimir C
670c721ded
Merge pull request #37742 from ucasfl/hashid 2022-06-02 12:47:11 +02:00
Robert Schulze
4e18659bfd
Fix tests + more precise exception msg 2022-06-02 11:11:56 +02:00
liuneng1994
7b15055e72 fix toString error on DatatypeDate32
Signed-off-by: liuneng1994 <1398775315@qq.com>
2022-06-02 16:56:43 +08:00
Alexey Milovidov
b5f48a7d3f Merge branch 'master' of github.com:ClickHouse/ClickHouse into llvm-14 2022-06-01 22:09:58 +02:00
Robert Schulze
366f368d06
Disallow LIKE patterns with trailing escape
Trailing escape ('ab\') is disallowed in SQL, in standardese:

  "If an escape character is specified, then [...] If there is not a
  partitioning of the string PVC into substrings such that each substring
  has length 1 (one) or 2, no substring of length 1 (one) is the escape
  character ECV, and each substring of length 2 is the escape character
  ECV followed by either the escape character ECV, an <underscore>
  character, or the <percent> character, then an exception condition is
  raised: data exception - invalid escape sequence."

I first thought this is checked already higher up in the stack, at least
for const needles, as single trailing backslashes ('ab\') are rejected,
but then I realized that ClickHouse quotes by default. I.e., double
trailing backslashes ('ab\\') are not rejected but when interpreted as
LIKE needle ('ab\') they should.
2022-06-01 21:38:46 +02:00
Robert Schulze
b3b0716b32
Merge pull request #37544 from ClickHouse/cached_patterns
Cache compiled regexps when evaluating non-const needles
2022-06-01 19:55:25 +02:00
avogar
966b864986 Fix possible logical error with type Nothing and JSON functions 2022-06-01 16:34:31 +00:00
flynn
b62e4cec65 Fix crash of FunctionHashID 2022-06-01 12:39:16 +00:00
BoloniniD
44ce53f4b3 Fix new line 2022-06-01 15:22:22 +03:00
Alexander Tokmakov
75f49a48e1 Merge branch 'master' into fix_trash 2022-06-01 14:20:46 +02:00
Robert Schulze
600512cc08
Replace exceptions thrown for programming errors by asserts 2022-06-01 11:53:37 +02:00
BoloniniD
dd8aefdf1e Merge branch 'master' of github.com:ClickHouse/ClickHouse into BLAKE3 2022-06-01 11:46:55 +03:00
Anton Popov
20e319d67a
Merge pull request #37666 from CurtizJ/optimize-coalesce
Optimize function `COALESCE` with two arguments
2022-05-31 23:48:13 +02:00
Yakov Olkhovskiy
873ac9f8ff
Merge pull request #37540 from ClickHouse/feature-server-certificate
showCertificate function implementation
2022-05-31 02:50:03 -04:00
Anton Popov
30f8eb800a optimize function coalesce with two arguments 2022-05-30 22:29:35 +00:00
Nikolai Kochetov
77b07dd0a8
Merge pull request #37163 from ClickHouse/grouping-function
Add GROUPING function
2022-05-30 20:45:04 +02:00
HeenaBansal2009
b7eb6bbd38 Fixed clang-tidy-CheckTriviallyCopyableMove-errors 2022-05-30 11:09:03 -07:00
Robert Schulze
ad12adc31c
Measure and rework internal re2 caching
This commit is based on local benchmarks of ClickHouse's re2 caching.

Question 1: -----------------------------------------------------------
Is pattern caching useful for queries with const LIKE/REGEX
patterns? E.g. SELECT LIKE(col_haystack, '%HelloWorld') FROM T;

The short answer is: no. Runtime is (unsurprisingly) dominated by
pattern evaluation + other stuff going on in queries, but definitely not
pattern compilation. For space reasons, I omit details of the local
experiments.

(Side note: the current caching scheme is unbounded in size which poses
a DoS risk (think of multi-tenancy). This risk is more pronounced when
unbounded caching is used with non-const patterns ..., see next
question)

Question 2: -----------------------------------------------------------
Is pattern caching useful for queries with non-const LIKE/REGEX
patterns? E.g. SELECT LIKE(col_haystack, col_needle) FROM T;

I benchmarked five caching strategies:
1. no caching as a baseline (= recompile for each row)
2. unbounded cache (= threadsafe global hash-map)
3. LRU cache (= threadsafe global hash-map + LRU queue)
4. lightweight local cache 1 (= not threadsafe local hashmap with
   collision list which grows to a certain size (here: 10 elements) and
   afterwards never changes)
5. lightweight local cache 2 (not threadsafe local hashmap without
   collision list in which a collision replaces the stored element, idea
   by Alexey)

... using a haystack of 2 mio strings and
A). 2 mio distinct simple patterns
B). 10 simple patterns
C)  2 mio distinct complex patterns
D)  10 complex patterns

Fo A) and C), caching does not help but these queries still allow to
judge the static overhead of caching on query runtimes.

B) and D) are extreme but common cases in practice. They include
queries like "SELECT ... WHERE LIKE (col_haystack, flag ? '%pattern1%' :
'%pattern2%'). Caching should help significantly.

Because LIKE patterns are internally translated to re2 expressions, I
show only measurements for MATCH queries.

Results in sec, averaged over on multiple measurements;

1.A): 2.12
  B): 1.68
  C): 9.75
  D): 9.45

2.A): 2.17
  B): 1.73
  C): 9.78
  D): 9.47

3.A): 9.8
  B): 0.63
  C): 31.8
  D): 0.98

4.A): 2.14
  B): 0.29
  C): 9.82
  D): 0.41

5.A) 2.12 / 2.15 / 2.26
  B) 1.51 / 0.43 / 0.30
  C) 9.97 / 9.88 / 10.13
  D) 5.70 / 0.42 / 0.43
(10/100/1000 buckets, resp. 10/1/0.1% collision rate)

Evaluation:

1. This is the baseline. It was surprised that complex patterns (C, D)
   slow down the queries so badly compared to simple patterns (A, B).
   The runtime includes evaluation costs, but as caching only helps with
   compilation, and looking at 4.D and 5.D, compilation makes up over 90%
   of the runtime!

2. No speedup compared to 1, probably due to locking overhead. The cache
   is unbounded, and in experiments with data sets > 2 mio rows, 2. is
   the only scheme to throw OOM exceptions which is not acceptable.

3. Unique patterns (A and C) lead to thrashing of the LRU cache and very
   bad runtimes due to LRU queue maintenance and locking. Works pretty
   well however with few distinct patterns (B and D).

4. This scheme is tailored to queries B and D where it performs pretty
   good. More importantly, the caching is lightweight enough to not
   deteriorate performance on datasets A and C.

5. After some tuning of the hash map size, 100 buckets seem optimal to
   be in the same ballpark with 10 distinct patterns as 4. Performance
   also does not deteriorate on A and C compared to the baseline.
   Unlike 4., this scheme behaves LRU-like and can adjust to changing
   pattern distributions.

As a conclusion, this commit implementes two things:

1. Based on Q1, pattern search with const needle no longer uses
   caching. This applies to LIKE and MATCH + a few (exotic) other SQL
   functions. The code for the unbounded caching was removed.

2. Based on Q2, pattern search with non-const needles now use method 5.
2022-05-30 20:00:35 +02:00
Alexey Milovidov
f1fb57c6ce Fix clang-tidy-14 2022-05-30 05:36:26 +02:00
Alexey Milovidov
c0e6ff4216 More precise result of "dumpColumnStructure" and "byteSize" miscellaneous functions 2022-05-30 04:56:54 +02:00
Alexey Milovidov
c1169019d2 Merge branch 'master' into llvm-14 2022-05-29 02:29:02 +02:00
Alexey Milovidov
73e2e63414
Merge pull request #37612 from ClickHouse/clang-tidy-14
Fix clang-tidy-14, part 1
2022-05-29 03:16:32 +03:00
Alexander Tokmakov
4e52f45695 Merge branch 'master' into fix_trash 2022-05-28 19:43:19 +02:00
Alexey Milovidov
c50791dd3b Fix clang-tidy-14, part 1 2022-05-27 22:52:14 +02:00
Alexey Milovidov
d2c6fd90cb Fix clang-tidy-14, part 1 2022-05-27 22:51:37 +02:00
Alexander Gololobov
9b1b30855c Fixed check for HUGE_VAL 2022-05-27 18:25:11 +02:00
Alexander Gololobov
6361c5f38c Fix for failed style check 2022-05-27 18:22:16 +02:00
Alexander Gololobov
540353566c Added LpNorm and LpDistance functions for arrays 2022-05-27 17:17:08 +02:00
Robert Schulze
80061aa3e2
Merge remote-tracking branch 'origin/master' into cached_patterns 2022-05-27 09:21:01 +02:00
Alexey Milovidov
86afa3a245
Merge pull request #37502 from ClickHouse/array_norm_dist_fixes
Renamed arrayXXNorm/arrayXXDistance functions to XXNorm/XXDistance and fixed some overflow cases
2022-05-27 00:56:29 +03:00
mergify[bot]
a7629f900f
Merge branch 'master' into normalize-utf8-performance-tests-fix 2022-05-26 10:29:55 +00:00
Maksim Kita
3a92e61827
Merge pull request #37148 from kitaisreal/dictionary-get-descendants-performance-improvement
Dictionary getDescendants performance improvement
2022-05-26 12:29:17 +02:00
Yakov Olkhovskiy
2dc160a4c3 style fix 2022-05-25 20:56:36 -04:00
Dmitry Novik
7cd7782e4f Process columns more efficiently in GROUPING() 2022-05-25 21:55:41 +00:00
Dmitry Novik
3c1b6609ae Add comments and make tests more verbose 2022-05-25 21:23:35 +00:00
Maksim Kita
58cd1bd3ec
Merge pull request #36843 from bharatnc/ncb/h3-unidirectionaledges-funcs
add h3 unidirectional edge functions
2022-05-25 22:46:40 +02:00
Maksim Kita
bee3c30f66
Merge pull request #37524 from kitaisreal/geo-distance-functions-improve-performance
Geo distance functions improve performance
2022-05-25 22:40:40 +02:00
Alexander Gololobov
168b47d0ad Use same norm and distance function names for tuples and arrays 2022-05-25 22:39:59 +02:00
Alexander Gololobov
b065839f44 always return Float64 2022-05-25 22:27:00 +02:00
Alexander Gololobov
5df14cd956 Cast arguments to result type to avoid int overflow 2022-05-25 22:27:00 +02:00
Robert Schulze
49934a3dc8
Cache compiled regexps when evaluating non-const needles
Needles in a (non-const) needle column may repeat and this commit allows
to skip compilation for known needles. Out of the different design
alternatives (see below, if someone is interested), we now maintain
- one global pattern cache,
- with a fixed size of 42k elements currently,
- and use LRU as eviction strategy.

------------------------------------------------------------------------

(sorry for the wall of text, dumping it here not for reading but just
for reference)

Write-up about considered design alternatives:

1. Keep the current global cache of const needles. For non-const
   needles, probe the cache but don't store values in it.
   Pros: need to maintain just a single cache, no problem with cache
         pollution assuming there are few distinct constant needles
   Cons: only useful if a non-const needle occurred as already as a
         const needle
   --> overall too simplistic

2. Keep the current global cache for const needles. For non-const
   needles, create a local (e.g. per-query) cache
   Pros: unlike (1.), non-const needles can be skipped even if they
         did not occur yet, no pollution of the const pattern cache when
         there are very many non-const needles (e.g. large / highly
         distinct needle columns).
   Cons: caches may explode "horizontally", i.e. we'll end up with the
         const cache + caches for Q1, Q2, ... QN, this makes it harder
         to control the overall space consumption, also patterns
         residing in different caches cannot be reused between queries,
         another difficulty is that the concept of "query" does not
         really exist at matching level - there are only column chunks
         and we'd potentially end up with 1 cache / chunk

3. Queries with const and non-const needles insert into the same global
   cache.
   Pros: the advantages of (2.) + allows to reuse compiled patterns
         accross parallel queries
   Cons: needs an eviction strategy to control cache size and pollution
         (and btw. (2.) also needs eviction strategies for the
         individual caches)

4. Queries with const needle use global cache, queries with non-const
   needle use a different global cache
   --> Overall similar to (3) but ignores the (likely) edge case that
       const and non-const needles overlap.

In sum, (3.) seems the simplest and most beneficial approach.

Eviction strategies:

0. Don't ever evict --> cache may grow infinitely and eventually make
   the system unusable (may even pose a DoS risk)

1. Flush the cache after a certain threshold is exceeded --> very
   simple but may lead to peridic performance drops

2. Use LRU --> more graceful performance degradation at threshold but
   comes with a (constant) performance overhead to maintain the LRU
   queue

In sum, given that the pattern compilation in RE2 should be quite costly
(pattern-to-DFA/NFA), LRU may be acceptable.
2022-05-25 22:04:06 +02:00
Robert Schulze
ea60a614d2
Decrease namespace indent 2022-05-25 21:56:35 +02:00
Alexey Milovidov
abf2558fba
Merge pull request #37491 from ClickHouse/match_refactoring
Refactorings of LIKE/MATCH code
2022-05-25 22:05:38 +03:00
Alexey Milovidov
4482da9eb6
Update greatCircleDistance.cpp 2022-05-25 21:59:31 +03:00
Alexander Tokmakov
779e6ea0b9 make it better, fix on cluster queries 2022-05-25 20:17:49 +02:00
Nikolai Kochetov
ff98c24d44
Merge pull request #37048 from Avogar/fix-array-map-nothing
Add default implementation for Nothing in functions
2022-05-25 19:10:40 +02:00
Yakov Olkhovskiy
6692b9c2ed showCertificate function implementation 2022-05-25 12:11:44 -04:00
Alexey Milovidov
cb92482ca5
Merge pull request #37484 from kitaisreal/function-has-all-avx2-dynamic-dispatch
Function hasAll added dynamic dispatch
2022-05-25 19:05:32 +03:00
Maksim Kita
28355114c0 Fixed tests 2022-05-25 16:19:29 +02:00
Maksim Kita
e67b3537f7 Functions normalizeUTF8 unstable performance tests fix 2022-05-25 15:54:52 +02:00
Maksim Kita
45da28ecae Improve performance of geo distance functions 2022-05-25 14:22:22 +02:00
Maksim Kita
c372c3d6aa Fix performance tests 2022-05-25 11:49:59 +02:00
Kseniia Sumarokova
b50d4549c9
Merge pull request #37356 from amosbird/partition-prune-for-s3
"Partition pruning" for s3
2022-05-25 11:03:07 +02:00
Robert Schulze
05e4fa7df1
Fix special case of trivial regexp
Previously, we would alsays set 1 in case of a trivial regex (which is
correct). If someone in future builds a negated operator, then this
will produce wrong results. Right now, negation of regexp (SQL: NOT
MATCH) is implemented at a higher level, so we are safe and this is more
a preventive fix.
2022-05-25 10:05:55 +02:00
Robert Schulze
01ab7b9bad
Pass strings in some places as string_view
The original goal was to get change

  const auto & needle = String(
        reinterpret_cast<const char *>(cur_needle_data),
        cur_needle_length);

in Functions/MatchImpl.h into a std::string_view to save an allocation +
copy. The needle is eventually passed as search pattern into the re2
library. Re2 has an alternative constructor taking a const char * i.e. a
NULL-terminated string. Here, the needle is NULL-terminated but
1. this is only because it is passed inside a ColumnString yet this is
   not always the case (e.g. fixed string columns has a dense layout w/o
   NULL terminator).
2. assuming NULL termination for users != MatchImpl of the regex code is
   too dangerous.

So, for now we'll stay with copying to be on the safe side. One fine day
when re2 has a ptr/size ctor, we can use std::string_view.

Just changing a few other places from std::string to std::string_view
but this will not help with performance.
2022-05-25 10:05:51 +02:00
Robert Schulze
040fbf3686
Tighter sanity checks in matching code 2022-05-25 10:05:06 +02:00
Robert Schulze
35bef17302
Introduce variables to hold the match result
--> nicer when debugging
2022-05-25 10:04:47 +02:00
Robert Schulze
b044d44fef
Refactoring: Make template instantiation easier to read
- introduced class MatchTraits with enums that replace bool template
  parameters

- (minor: made negation the last template parameters because negation
  executes last during evaluation)
2022-05-25 10:03:58 +02:00
Bharat Nallan Chakravarthy
57cfc0bd04 check for validity of h3 index 2022-05-25 06:17:15 +05:30
Alexander Gololobov
2ff747785e
Merge pull request #37394 from ClickHouse/array_norm_dist_fixes
Do computations on the raw input data without copying to Eigen::Matrix
2022-05-24 20:59:04 +02:00
Robert Schulze
7348a0eb28
Merge pull request #37251 from ClickHouse/non_const_like
Support non-constant SQL functions (NOT) (I)LIKE and MATCH
2022-05-24 20:28:31 +02:00
Robert Schulze
028f15c4fa
Review comment: Throw LOGICAL_ERROR for different sizes of haystack / needles 2022-05-24 20:19:13 +02:00
Maksim Kita
3c0c322d7c
Merge pull request #37480 from kitaisreal/dynamic-dispatch-infrastructure-improvements
Dynamic dispatch infrastructure style fixes
2022-05-24 18:13:53 +02:00
Maksim Kita
6fb51e8bd3 Function hasAll added dynamic dispatch 2022-05-24 17:06:06 +02:00
Maksim Kita
86180614e7 Fixed tests 2022-05-24 15:33:03 +02:00
Anton Popov
e96af9fd75 better binary serialization of ColumnObject 2022-05-24 13:16:11 +00:00
Maksim Kita
e6e4b2826d Dynamic dispatch infrastructure style fixes 2022-05-24 14:25:29 +02:00
Amos Bird
c25ef92139
Fix tests 2022-05-24 18:57:55 +08:00