Commit Graph

9 Commits

Author SHA1 Message Date
Robert Schulze
856eba0a4b
Mark delta/doubledelta codec followed by a time series codec as suspicious 2023-01-29 08:51:13 +00:00
Robert Schulze
574cab5d7e
Remove transitory parameter 2023-01-24 11:05:29 +00:00
Robert Schulze
e6167d6b36
Deprecate Gorilla compression of non-float columns
Reasons:

1. The original Gorilla paper proposed a compression scheme for pairs of
   timestamps and double-precision FP values (*). ClickHouse's Gorilla
   codec only implements compression of the latter and it does not
   impose any data type restrictions.
   - Data types other than Float* and (U)Int* (e.g. Decimal, Point etc.)
     are definitely not supposed to be used with Gorilla.
   - (U)Int* types are debatable. The paper only considers
     integers-stored-as-FP-values, a practical use case for which
     Gorilla works well. Standalone integers are not considered, which
     makes them at least suspicious.

2. Achieve consistency with FPC, another specialized floating-point
   timeseries codec, which rejects non-float data.

3. On practical datasets, ZSTD is often "good enough" (**) so it should
   be okay to disincentivize non-ZSTD codecs a little bit. If needed,
   Delta and DoubleDelta codecs are viable alternatives for slowly
   changing (time-series-like) integer sequences.
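
To illustrate why delta-style encodings suit slowly changing integer
sequences, here is a minimal Python sketch of the general idea. It is not
ClickHouse's bit-packed Delta/DoubleDelta implementation; the function
names and the sample timestamps are hypothetical and chosen only for
illustration.

```python
def delta_encode(values):
    """Keep the first value, then store successive differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Invert delta_encode with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def double_delta_encode(values):
    """Keep the first value and first delta, then store deltas of deltas."""
    deltas = delta_encode(values)
    if not deltas:
        return []
    return [deltas[0]] + delta_encode(deltas[1:])


def double_delta_decode(encoded):
    """Invert double_delta_encode."""
    if not encoded:
        return []
    return delta_decode([encoded[0]] + delta_decode(encoded[1:]))


# Hypothetical, nearly regular integer timestamps.
timestamps = [1000, 1010, 1020, 1030, 1041]
deltas = delta_encode(timestamps)                # [1000, 10, 10, 10, 11]
double_deltas = double_delta_encode(timestamps)  # [1000, 10, 0, 0, 1]
```

After the leading entries, the encoded values are small and repetitive,
which a general-purpose compressor such as ZSTD can then compress well.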

Since on-prem and hosted users may still have Gorilla-compressed
non-float data, this combination is only deprecated for now. No warning
or error will be emitted. Users are encouraged to migrate
Gorilla-compressed non-float data to an alternative codec. It is planned
to treat Gorilla-compressed non-float columns as "suspicious" six months
after this commit (i.e. in v23.6). Even then, it will still be possible
to set "allow_suspicious_codecs = true" and read and write
Gorilla-compressed non-float data.

(*) Sec. 4.1.2, "Gorilla restricts the value element in its tuple to a
    double floating point type.", https://doi.org/10.14778/2824032.2824078

(**) https://clickhouse.com/blog/optimize-clickhouse-codecs-compression-schema
2023-01-20 17:31:16 +00:00
Alexander Tokmakov
df75c24f01
Revert "Disallow Gorilla codec on non-float columns" 2023-01-16 19:14:28 +03:00
Robert Schulze
a4a6126c9d
Prohibit manual delta compression before floating-point time series compression 2023-01-14 20:09:50 +00:00
Robert Schulze
fbdaca4e2a
Code cleanup 2023-01-14 19:21:30 +00:00
Robert Schulze
a5e14a85ab
Fix 01272_suspicious_codecs 2023-01-14 17:29:10 +00:00
Vitaly Baranov
39d73c01b2
Add tags to tests. 2021-09-12 17:15:28 +03:00
Alexey Milovidov
8a8014fa39
Added a test 2020-05-04 17:46:11 +03:00