Resolves: #51486
Until now, it was illegal to encode 64-bit (unsigned) integers with
MSB=1, i.e. values > (1ULL<<63) - 1, as var-int. In more detail, the
var-int code used by ClickHouse server and client spent at most 9 bytes
per value such that 9 * 7 = 63 bits could be encoded. Some 3rd party
clients (e.g. Rust clickhouse-rs) had the same limitation, whereas other
clients understand the full range (Python clickhouse-driver).
PRs #47608 and #48628 added sanity checks as asserts or exceptions
during var-int encoding on the server side. This was considered okay as
such huge integers so far occurred only during testing (usually fuzzing)
but not in practice.
Issue #51486 is a new fuzzing issue where the exception thrown from the
sanity check led to a half-baked progress packet and as a result, a
logical error / server crash.
The only fix which is not another bandaid is to allow the full range in
var-int coding. Clients will have to allow the full range too, a note
will be added to the changelog. (the alternative was to create another
protocol version but as var-int is used all over the place this was
considered infeasible)
Review note: this is the relevant commit.
This is follow-up for the discussion in #48628. The sanity check against
VAR_UINT_MAX is now unconditional (independent of the build type). Let's
see how that affects performance ... I'll probably add "Unchecked"
overloads later on (to avoid clutter they are avoided for now)
PR #48154 introduced a sanity check in the form of a debug assertion
that the input values for VarInt encoding are not too big. Such values
should be exceptionally rare in practice but the AST fuzzer managed to
trigger the assertion regardless. The strategy to deal with such values
until now was to bypass the check by limiting the value to the maximum
allowed value (see #48412). Because a new AST Fuzzer failure appeared
(#48497) and there may be more failures in future, this PR changes the
sanity check from an assert to an exception.
Fixes: #48497
Otherwise query like this, can trigger sanity check:
WITH x AS (SELECT [], number AS a FROM numbers(9223372036854775807)), y AS (SELECT arrayLastOrNull(x -> (x >= -inf), []), arrayLastOrNull(x -> (x >= NULL), [1]), number AS a FROM numbers(1.)) SELECT [1023], * FROM x WHERE a IN (SELECT a FROM y) ORDER BY arrayLastOrNull(x -> (x >= 1025), [1048577, 1048576]) DESC NULLS LAST, '0.0000000002' ASC NULLS LAST, a DESC NULLS FIRST
CI: https://s3.amazonaws.com/clickhouse-test-reports/0/a9bcd022d5f4a5be530595dbfae3ed177b5c1972/fuzzer_astfuzzermsan/report.html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* save format string for NetException
* format exceptions
* format exceptions 2
* format exceptions 3
* format exceptions 4
* format exceptions 5
* format exceptions 6
* fix
* format exceptions 7
* format exceptions 8
* Update MergeTreeIndexGin.cpp
* Update AggregateFunctionMap.cpp
* Update AggregateFunctionMap.cpp
* fix
- lots of static_cast
- add safe_cast
- types adjustments
- config
- IStorage::read/watch
- ...
- some TODO's (to convert types in future)
P.S. That was quite a journey...
v2: fixes after rebase
v3: fix conflicts after #42308 merged
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>