The (experimental) inverted index writes and reads files that differ
from the standard files written by the other skip indexes. The original
problem was that with database engine "Ordinary", DROP TABLE of a table
with an inverted index finds unknown files in persistent storage and
complains. The same happens with engine "Atomic", only deferred. As a
hotfix, the error was silenced by explicitly adding the four files
created in a specific test to the deletion code.
This PR tries a cleaner solution where all needed files are provided via
the normal checksum structure. One drawback remains: the affected files
were written earlier and we don't have their checksums available.
Therefore, the inverted index is currently excluded from CHECK TABLE.
Minimal repro:

    SET allow_experimental_inverted_index = 1;
    DROP TABLE IF EXISTS tab;
    CREATE TABLE tab(s String, INDEX af(s) TYPE inverted(2)) ENGINE = MergeTree() ORDER BY s;
    INSERT INTO tab VALUES ('Alick a01');
    CHECK TABLE tab;
    DROP TABLE IF EXISTS tab;

Run ./clickhouse-test with --db-engine Ordinary.
Fixes #45204
The problem is that ASTSelectQuery::group_by_with_grouping_sets == true
must imply that ASTSelectQuery::groupBy() is set, but sometimes this
wasn't the case. I added a sanity check a few months ago but had no idea
how the AST became corrupt.
All crashes/exceptions were during AST fuzzing. Looking at
Client/QueryFuzzer.cpp, there is a very small chance to run into the
issue. In detail:
1. In QueryFuzzer::fuzz(), we find that the AST is an ASTSelectQuery and
   groupBy() returns a non-null expression.
2. With small probability, we do
       select->group_by_with_grouping_sets = !select->group_by_with_grouping_sets;
   where the (default false) group_by_with_grouping_sets flips to true.
3. With small probability, we change the expression type in the
   following WHERE or PREWHERE if-branches.
This situation is illegal. One possibility is changing the fuzzing code
to not generate it. The fuzzing code is, however, generic and doesn't
really care about such details. Therefore, instead add a (theoretically
unnecessary) extra check to ASTSelectQuery::formatImpl() for robustness.
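The invariant and the defensive check can be illustrated with a minimal
sketch. SelectQuerySketch and formatGroupBy() are hypothetical,
simplified stand-ins and not the real ASTSelectQuery or formatImpl();
the output format is likewise simplified.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for ASTSelectQuery (not the real class).
struct SelectQuerySketch
{
    // Null when the query has no GROUP BY clause.
    std::shared_ptr<std::vector<std::string>> group_by;
    bool group_by_with_grouping_sets = false;

    // Invariant: group_by_with_grouping_sets == true implies group_by != nullptr.
    bool isConsistent() const
    {
        return !group_by_with_grouping_sets || group_by != nullptr;
    }
};

// Formats only the GROUP BY part. The null check comes first, so a flag
// left set by a fuzzer without a GROUP BY expression is simply ignored.
std::string formatGroupBy(const SelectQuerySketch & q)
{
    if (!q.group_by)
        return "";  // Nothing to format, regardless of the flag.

    std::string out = " GROUP BY ";
    if (q.group_by_with_grouping_sets)
        out += "GROUPING SETS (";
    for (std::size_t i = 0; i < q.group_by->size(); ++i)
    {
        if (i > 0)
            out += ", ";
        // In this sketch every entry is treated as one grouping set.
        if (q.group_by_with_grouping_sets)
            out += "(" + (*q.group_by)[i] + ")";
        else
            out += (*q.group_by)[i];
    }
    if (q.group_by_with_grouping_sets)
        out += ")";
    return out;
}
```

With the flag set but no GROUP BY expression, isConsistent() reports the
corruption while formatGroupBy() still returns an empty string instead of
dereferencing a null pointer.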
The following metrics can be useful to calculate various rates (e.g.
disk/network IO rates):
- AsynchronousHeavyMetricsUpdateInterval
- AsynchronousMetricsUpdateInterval
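As a rough illustration of how such a rate could be derived, the sketch
below divides the growth of a cumulative counter by the update interval.
ratePerSecond() is a hypothetical helper, and it assumes the interval is
expressed in seconds and the counter is sampled once per update.

```cpp
#include <cstdint>

// Hypothetical helper: per-second rate from two samples of a monotonically
// increasing counter taken one update interval apart.
// Assumption: update_interval_seconds is the interval in seconds.
double ratePerSecond(std::uint64_t previous, std::uint64_t current, double update_interval_seconds)
{
    // Guard against a zero or negative interval and against counter resets.
    if (update_interval_seconds <= 0.0 || current < previous)
        return 0.0;
    return static_cast<double>(current - previous) / update_interval_seconds;
}
```

For example, a counter that grows from 1000 to 4000 bytes over an update
interval of 1.5 seconds corresponds to 2000 bytes per second.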
The following has been added by analogy with
AsynchronousMetricsCalculationTimeSpent:
- AsynchronousHeavyMetricsCalculationTimeSpent
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>