Commit Graph

440 Commits

Author SHA1 Message Date
Robert Schulze
aaf7653108
Merge remote-tracking branch 'origin/master' into block-non-float-gorilla-v2 2023-01-24 10:14:10 +00:00
Alexander Tokmakov
70d1adfe4b
Better formatting for exception messages (#45449)
* save format string for NetException

* format exceptions

* format exceptions 2

* format exceptions 3

* format exceptions 4

* format exceptions 5

* format exceptions 6

* fix

* format exceptions 7

* format exceptions 8

* Update MergeTreeIndexGin.cpp

* Update AggregateFunctionMap.cpp

* Update AggregateFunctionMap.cpp

* fix
2023-01-24 00:13:58 +03:00
Robert Schulze
4ece499f19
Fix build 2023-01-22 12:26:03 +00:00
Robert Schulze
e6167d6b36
Deprecate Gorilla compression of non-float columns
Reasons:

1. The original Gorilla paper proposed a compression schema for pairs of
   time stamps and double-precision FP values. ClickHouse's Gorilla
   codec only implements compression of the latter and it does not
   impose any data type restrictions.
   - Data types != Float* or (U)Int* (e.g. Decimal, Point etc.) are
     definitely not supposed to be used with Gorilla.
   - (U)Int* types are debatable. The paper only considers
     integers-stored-as-FP-values, a practical use case for which
     Gorilla works well. Standalone integers are not considered which
     makes them at least suspicious.

2. Achieve consistency with FPC, another specialized floating-point
   timeseries codec, which rejects non-float data.

3. On practical datasets, ZSTD is often "good enough" (**) so it should
   be okay to disincentive non-ZSTD codecs a little bit. If needed,
   Delta and DoubleDelta codecs are viable alternative for slowly
   changing (time-series-like) integer sequences.

Since on-prem and hosted users may still have Gorilla-compressed
non-float data, this combination is only deprecated for now. No warning
or error will be emitted. Users are encouraged to migrate
Gorilla-compressed non-float data to an alternative codec. It is planned
to treat Gorilla-compressed non-float columns as "suspicious" six months
after this commit (i.e. in v23.6). Even then, it will still be possible
to set "allow_suspicious_codecs = true" and read and write
Gorilla-compressed non-float data.

(*) Sec. 4.1.2, "Gorilla restricts the value element in its tuple to a
    double floating point type.", https://doi.org/10.14778/2824032.2824078

(**) https://clickhouse.com/blog/optimize-clickhouse-codecs-compression-schema
2023-01-20 17:31:16 +00:00
Alexander Tokmakov
df75c24f01
Revert "Disallow Gorilla codec on non-float columns" 2023-01-16 19:14:28 +03:00
Robert Schulze
b6f12f9edd
Fix unit_tests_dbms 2023-01-15 14:49:04 +00:00
Robert Schulze
a4a6126c9d
Prohibit manual delta compression before floating-point time series compression 2023-01-14 20:09:50 +00:00
Robert Schulze
fbdaca4e2a
Code cleanup 2023-01-14 19:21:30 +00:00
Robert Schulze
5d3f0ec4a0
Disallow Gorilla codec on non-float columns
Cf. #45195
2023-01-13 16:53:28 +00:00
alesapin
6c8dbcc040 Fix flaky test 01459_manual_write_to_replicas_quorum_detach_attach and several typos 2023-01-03 16:27:51 +01:00
Alexander Gololobov
f75b203071
Merge pull request #44034 from ClickHouse/evillique-patch-1
Fix exception message
2022-12-08 23:30:09 +01:00
Robert Schulze
375f488156
Merge pull request #44024 from jinjunzh/iaadeflate_upgrade_qpl030
Upgrade Intel®-IAA/QPL-based DEFLATE_QPL Codec
2022-12-08 12:56:03 +01:00
Nikolay Degterinsky
16b33c204c
Fix exception message 2022-12-08 10:49:25 +01:00
Duc Canh Le
88716b1ec3 Minor fix 2022-12-08 09:39:25 +08:00
jinjunzh
47c3508337 fixed build issues for QPL 0.3.0 2022-12-07 16:41:12 -05:00
Sema Checherinda
f49936b9a0 remove tmp dir from inmemory part when freeze, more logs about old parts 2022-11-23 15:16:09 +00:00
Nikita Taranov
0b4e643c27
Add check to CompressionCodecDelta (#43255)
* Add check to CompressionCodecDelta

* Apply suggestions from code review

* Update src/Compression/CompressionCodecDelta.cpp
2022-11-19 14:16:14 +01:00
Sema Checherinda
7e73b187cc
Merge pull request #43159 from Algunenano/over-read
Fix several buffer over-reads
2022-11-15 15:04:24 +01:00
Raúl Marín
54db7c6520 Enforce checking read output 2022-11-11 10:56:18 +01:00
Tiaonmmn
331c1cbeb7
Update CompressionCodecDeflateQpl.cpp
Modify the data type to resolve warning of clang-tidy "implicit conversion".
2022-11-11 10:00:59 +08:00
Nikita Taranov
2c7708a03e
Add bound check to lz4 decompression (#42868)
* impl

* add test

* fix bullshit tidy error
2022-11-04 22:12:36 +01:00
Anton Popov
7715afa595
Merge pull request #42618 from CurtizJ/refactor-data-part
Try to save `IDataPartStorage` interface
2022-10-27 22:20:44 +02:00
Anton Popov
c8199bc125
Merge branch 'master' into refactor-data-part 2022-10-25 14:31:05 +02:00
Robert Schulze
c119cd2f00
Merge branch 'master' into update-libcxx-to-15 2022-10-24 08:29:37 +02:00
Anton Popov
b40d9200d2 better semantic of constsness of DataPartStorage 2022-10-23 15:24:20 +00:00
Azat Khuzhin
3d400068fd tests/gtest_compressionCodec: fix UBSAN report to avoid test failure
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:43 +02:00
Azat Khuzhin
2474303aef tests/gtest_compressionCodec: Fix UBSAN report about signed integer overflow
UBSAN report:

    $ UBSAN_SYMBOLIZER_PATH=/usr/bin/llvm-symbolizer-14 UBSAN_OPTIONS=print_stacktrace=1 ./unit_tests_dbms --gtest_filter=*Gorilla*
    ../src/Compression/tests/gtest_compressionCodec.cpp:1216:47: runtime error: signed integer overflow: 23 * 100000000 cannot be represented in type 'int'
        #0 0x14f67fd1 in auto (anonymous namespace)::$_6::operator()(int) const::'lambda'(auto)::operator()<int>(auto) const build_docker/../src/Compression/tests/gtest_compressionCodec.cpp:1216:47
        #1 0x14f67fd1 in (anonymous namespace)::CodecTestSequence (anonymous namespace)::generateSeq<long, (anonymous namespace)::$_6::operator()(int) const::'lambda'(auto), int, int>((anonymous namespace)::$_6::operator()(int) const::'lambda'(auto), char const*, int, int) build_docker/../src/Compression/tests/gtest_compressionCodec.cpp:394:36
        #2 0x14f67fd1 in auto (anonymous namespace)::GCompatibilityTestSequence<long>() build_docker/../src/Compression/tests/gtest_compressionCodec.cpp:1224:12
        #3 0x14f3c7f3 in (anonymous namespace)::gtest_GorillaCodecTestCompatibility_EvalGenerator_() build_docker/../src/Compression/tests/gtest_compressionCodec.cpp:1227:1
        #4 0x14f6bdb5 in testing::internal::ParameterizedTestSuiteInfo<(anonymous namespace)::CodecTestCompatibility>::RegisterTests() build_docker/../contrib/googletest/googletest/include/gtest/internal/gtest-param-util.h:553:45
        #5 0x27d87988 in testing::internal::ParameterizedTestSuiteRegistry::RegisterTests() build_docker/../contrib/googletest/googletest/include/gtest/internal/gtest-param-util.h:726:24
        #6 0x27d87988 in testing::internal::UnitTestImpl::RegisterParameterizedTests() build_docker/../contrib/googletest/googletest/src/gtest.cc:2805:34
        #7 0x27d87988 in testing::internal::UnitTestImpl::PostFlagParsingInit() build_docker/../contrib/googletest/googletest/src/gtest.cc:5492:5
        #8 0x27d9d002 in void testing::internal::InitGoogleTestImpl<char>(int*, char**) build_docker/../contrib/googletest/googletest/src/gtest.cc:6499:22
        #9 0x14fd5495 in main build_docker/../src/Coordination/tests/gtest_coordination.cpp:2189:5
        #10 0x7f8c29005209  (/lib/x86_64-linux-gnu/libc.so.6+0x29209) (BuildId: 71a7c7b97bc0b3e349a3d8640252655552082bf5)
        #11 0x7f8c290052bb in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x292bb) (BuildId: 71a7c7b97bc0b3e349a3d8640252655552082bf5)
        #12 0x14ce356d in _start (/work1/azat/tmp/42190/unit_tests_dbms+0x14ce356d) (BuildId: 482550e3f8d45f06e8c7f8147f427ee798c1f645)

    SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../src/Compression/tests/gtest_compressionCodec.cpp:1216:47 in

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:43 +02:00
Azat Khuzhin
b8ab686561 tests/gtest_compressionCodec: fix for darwin
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:43 +02:00
Azat Khuzhin
4e76629aaf Fixes for -Wshorten-64-to-32
- lots of static_cast
- add safe_cast
- types adjustments
  - config
  - IStorage::read/watch
  - ...
- some TODO's (to convert types in future)

P.S. That was quite a journey...

v2: fixes after rebase
v3: fix conflicts after #42308 merged
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-10-21 13:25:19 +02:00
Robert Schulze
820e6b4276
Build with libcxx(abi) 15 2022-10-20 10:52:43 +00:00
Alexey Milovidov
191158f93b
Merge pull request #42314 from HarryLeeIBM/hlee-s390x-t64-codec
Fix Codec T64 on s390x
2022-10-17 01:13:57 +02:00
Alexey Milovidov
bfd8811a4d
Update CompressionCodecT64.cpp 2022-10-15 00:32:37 +03:00
Alexey Milovidov
4608b70dda
Update CompressionCodecT64.cpp 2022-10-15 00:31:33 +03:00
HarryLeeIBM
6cdb7b5a94 Fix Codec T64 on s390x 2022-10-14 07:16:18 -07:00
Alexander Tokmakov
4175f8cde6 abort instead of __builtin_unreachable in debug builds 2022-10-07 21:49:08 +02:00
Robert Schulze
da5a2e2db0
Merge remote-tracking branch 'origin/master' into generated-file-cleanup
Physical merge conflicts:
- src/Common/ZooKeeper/ZooKeeperImpl.cpp
- src/Core/config_core.h.in
- src/Functions/FunctionsAES.h
- src/Functions/config_functions.h.in
- src/configure_config.cmake

Logical merge conflicts:
- Functions/tryDecrypt.cpp
2022-10-06 08:43:25 +00:00
root
4318ec125e Merge branch 'master' of https://github.com/MeenaRenganathan22/ClickHouse into OpenSSL_InHouse 2022-09-30 15:50:22 -07:00
Robert Schulze
f24fab7747
Fix some #include atrocities 2022-09-28 13:49:28 +00:00
Robert Schulze
fd86829824
Consolidate config_core.h into config.h
Less duplication, less confusion ...
2022-09-28 13:31:57 +00:00
Robert Schulze
78fc36ca49
Generate config.h into ${CONFIG_INCLUDE_PATH}
This makes the target location consistent with other auto-generated
files like config_formats.h, config_core.h, and config_functions.h and
simplifies the build of clickhouse_common.
2022-09-28 12:48:26 +00:00
Alexey Milovidov
730655d4fd Fix 8/9 of trash 2022-09-19 08:53:20 +02:00
Meena Renganathan
51d6611b96 Committing the ClickHouse core changes and other libraries to support OpenSSL. BoringSSL is still set as default 2022-09-12 09:05:38 -07:00
Anton Popov
7c12b448b8 Merge remote-tracking branch 'upstream/master' into HEAD 2022-09-08 01:52:52 +00:00
Nikita Taranov
7c4f42d014
Skip empty literals in lz4 decompression (#40142) 2022-09-06 13:58:26 +02:00
Anton Popov
9dda9658a8 Merge remote-tracking branch 'upstream/master' into HEAD 2022-09-02 12:48:27 +00:00
Antonio Andelic
e64436fef3 Fix typos with new codespell 2022-09-02 08:54:48 +00:00
HarryLeeIBM
d609da182b Fix Endian issue in Codec for s390x 2022-08-08 11:28:35 -07:00
Robert Schulze
a7734672b9
Use std::popcount, ::countl_zero, ::countr_zero functions
- Introduced with the C++20 <bit> header

- The problem with __builtin_c(l|t)z() is that 0 as input has an
  undefined result (*) and the code did not always check. The std::
  versions do not have this issue.

- In some cases, we continue to use buildin_c(l|t)z(), (e.g. in
  src/Common/BitHelpers.h) because the std:: versions only accept
  unsigned inputs (and they also check that) and the casting would be
  ugly.

(*) https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
2022-07-31 15:16:51 +00:00
Robert Schulze
7d653b3bd8
Merge pull request #39495 from ClickHouse/mark-qpl-deflate-experimental
Mark new codec DEFLATE_QPL as experimental + cosmetics
2022-07-26 19:00:21 +02:00
Robert Schulze
2785783def
Mark DEFLAPT_QPL as 'experimental' codec + cosmetics 2022-07-24 20:10:11 +00:00