* Replicate poco into base/poco/
* De-register poco submodule
* Build poco from ClickHouse
* Exclude poco from stylecheck
* Exclude poco from whitespace check
* Exclude poco from typo check
* Remove x bit from sources/headers (the style check complained)
* Exclude poco from duplicate include check
* Fix fasttest
* Remove contrib/poco-cmake/*
* Simplify poco build descriptions
* Remove poco stuff not used by ClickHouse
* Glob poco sources
* Exclude poco from clang-tidy
Consider the following example:
CREATE TABLE data (root.array_str Array(UInt8)) ENGINE = MergeTree() ORDER BY tuple();
INSERT INTO data VALUES ([]);
ALTER TABLE data ADD COLUMN root.nested_array Array(Array(UInt8));
In this case the first part has no data for root.nested_array, and
thanks to #37152 the offsets column is simply read from
root.array_str. However, since root.nested_array is a nested array, the
reader also tries to read its elements from the same offsets stream,
and if you are lucky enough you will get one of the following errors:
- Cannot read all data. Bytes read: 1. Bytes expected: 8.: (while reading column root.nested_array): While executing MergeTreeInOrder. (CANNOT_READ_ALL_DATA)
- DB::Exception: Array size is too large: 8233460228287709730: (while reading column serp.serp_features): While executing MergeTreeInOrder.
To address this, findColumnForOffsets() has been changed to also return
the level of the column, so that offsets are read only up to this level.
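
The idea of returning a level can be illustrated with a toy model. This is
not the real findColumnForOffsets(): the ColumnInfo struct, the
findOffsetsSource() helper and the stream names such as "root.size0" are
assumptions made up for this sketch. Each column lists one offsets stream per
array level, and offsets can be borrowed from a sibling only for the leading
levels whose streams match.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of a column: one offsets stream name per array
// level, outermost level first.
struct ColumnInfo
{
    std::string name;
    std::vector<std::string> offsets_streams;
};

// Number of leading array levels whose offsets streams are identical.
size_t sharedLevels(const ColumnInfo & lhs, const ColumnInfo & rhs)
{
    const size_t limit = std::min(lhs.offsets_streams.size(), rhs.offsets_streams.size());
    size_t level = 0;
    while (level < limit && lhs.offsets_streams[level] == rhs.offsets_streams[level])
        ++level;
    return level;
}

// Toy stand-in for findColumnForOffsets(): picks the column present in the
// part that shares the most offsets levels with the requested column and
// returns its name together with that level.
std::optional<std::pair<std::string, size_t>> findOffsetsSource(
    const std::vector<ColumnInfo> & part_columns, const ColumnInfo & requested)
{
    std::optional<std::pair<std::string, size_t>> best;
    for (const auto & column : part_columns)
    {
        const size_t level = sharedLevels(column, requested);
        if (level > 0 && (!best || level > best->second))
            best.emplace(column.name, level);
    }
    return best;
}
```

In this model root.array_str and root.nested_array share only the outermost
level, so a reader would borrow one level of offsets and has to fill the
deeper levels of root.nested_array with defaults instead of reading them from
the borrowed stream.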
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Otherwise the following leads to SIGSEGV in debug/sanitizer builds:
echo '0000000000Custom NULL representation0000000000' | clickhouse-local -q "desc file('-', 'TSV')"
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Reference files have been checked manually and with this one-liner:
$ diff <(jq -r .bool ../tests/queries/0_stateless/02152_bool_type_parsing.stdout) ../tests/queries/0_stateless/02152_bool_type_parsing.reference
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Previously the following query did not work correctly:
SELECT number FROM numbers(5) SETTINGS output_format_json_array_of_rows = 1 FORMAT JSONEachRow
While this one works OK:
SELECT number FROM numbers(5) FORMAT JSONEachRow SETTINGS output_format_json_array_of_rows = 1
The problem is which AST those settings are stored in; use the same logic
as executeQuery() to apply them:
c83f701696/src/Interpreters/executeQuery.cpp (L467-L497)
Note that the only problem should be with the settings for FORMAT, since
the client applies those settings (and formats) locally, without the
server, while in the case of, e.g., HTTP they are applied on the server
and everything works fine.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
The `test_cluster_copier` test is very frequently flaky; here is
an example of a copier failure on CI [1]:
AssertionError: Instance: s0_1_0 (172.16.29.9). Info: {'ID': '5d68dcb46fdb4b0c54b7c7ba1ddde83b8f34d483bbb32abcb0c52b966444ce82', 'Running': False, 'ExitCode': 85, 'ProcessConfig': {'tty': False, 'entrypoint': '/usr/bin/clickhouse', 'arguments': ['copier', '--config', '/etc/clickhouse-server/config-copier.xml', '--task-path', '/clickhouse-copier/task_simple_4DFWYTDD49', '--task-file', '/task0_description.xml', '--task-upload-force', 'true', '--base-dir', '/var/log/clickhouse-server/copier', '--copy-fault-probability', '0.2', '--experimental-use-sample-offset', '1'], 'privileged': False, 'user': '0'}, 'OpenStdin': False, 'OpenStderr': True, 'OpenStdout': True, 'CanRemove': False, 'ContainerID': 'f356df6694b3cc09ee9830c623681626f8e8d999677c188b9fe911aa702784ca', 'DetachKeys': '', 'Pid': 84332}
assert 85 == 0
But let's look at what the error actually is; apparently it is UNFINISHED:
SELECT
name,
code
FROM system.errors
WHERE ((code % 256) = 85) AND (NOT remote)
SETTINGS system_events_show_zero_values = 1
┌─name─────────────────────────────┬─code─┐
│ FORMAT_IS_NOT_SUITABLE_FOR_INPUT │ 85 │
│ UNFINISHED │ 341 │
│ NO_SUCH_ERROR_CODE │ 597 │
└──────────────────────────────────┴──────┘
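
The `code % 256` filter works because a POSIX process exit status keeps only
the low 8 bits of the value passed to exit. A minimal sketch of that
arithmetic follows; the helper names are made up for illustration, and the
error table contains only the three codes listed above.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// The exit status a parent observes on POSIX: only the low 8 bits survive.
constexpr int observedExitStatus(int error_code)
{
    return error_code & 0xFF;
}

static_assert(observedExitStatus(341) == 85, "UNFINISHED (341) is seen as 85");
static_assert(observedExitStatus(597) == 85, "597 is also seen as 85");

// Hypothetical lookup: which of the known error names could have produced
// the observed status. Only the codes from the query result are included.
std::vector<std::string> candidatesForStatus(int status)
{
    static const std::vector<std::pair<std::string, int>> known_errors = {
        {"FORMAT_IS_NOT_SUITABLE_FOR_INPUT", 85},
        {"UNFINISHED", 341},
        {"NO_SUCH_ERROR_CODE", 597},
    };

    assert(status >= 0 && status <= 255);

    std::vector<std::string> names;
    for (const auto & [name, code] : known_errors)
        if (observedExitStatus(code) == status)
            names.push_back(name);
    return names;
}
```

The exit status alone is therefore ambiguous: all three names map to 85, and
the logs are needed to tell which one was actually raised.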
Let's verify:
$ grep -r UNFINISHED ./test_cluster_copier/_instances_0/s0_1_0/logs/copier/clickhouse-copier_*
./test_cluster_copier/_instances_0/s0_1_0/logs/copier/clickhouse-copier_20230206220846_368/log.log:2023.02.06 22:09:19.015251 [ 368 ] {} <Error> : virtual int DB::ClusterCopierApp::main(const std::vector<std::string> &): Code: 341. DB::Exception: Too many tries to process table cluster1.default.hits. Abort remaining execution. (UNFINISHED), Stack trace (when copying this message, always include the lines below):
And apparently it is caused by a query syntax error that occurs with fault injection:
2023.02.06 22:09:15.654724 [ 368 ] {} <Error> Application: An error occurred while processing partition 0: Code: 62. DB::Exception: Syntax error (Query): failed at position 168 ('Native'): Native. Expected one of: token, Dot, OR, AND, BETWEEN, NOT BETWEEN, LIKE, ILIKE, NOT LIKE, NOT ILIKE, IN, NOT IN, GLOBAL IN, GLOBAL NOT IN, MOD, DIV, IS NULL, IS NOT NULL, alias, AS, Comma, OFFSET, WITH TIES, BY, LIMIT, SETTINGS, UNION, EXCEPT, INTERSECT, INTO OUTFILE, FORMAT, end of query. (SYNTAX_ERROR), Stack trace (when copying this message, always include the lines below):
Example:
select x from x limit 1FORMAT Native
Syntax error: failed at position 32 ('Native'):
So fixing this should fix test_cluster_copier flakiness.
[1]: https://s3.amazonaws.com/clickhouse-test-reports/46045/bd4170e03c6af583a51d12d2c39fa775dcb9997b/integration_tests__release__[4/4].html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
This commit achieves data parallelism in filter generation for
nullable columns by replacing the logical AND operator with the
bitwise one, which the compiler can auto-vectorize.
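
A minimal sketch of the difference, not the actual ClickHouse code: the
function names and the uint8_t layout of the filter and the null map are
assumptions for illustration. The short-circuit semantics of `&&` can make
the compiler emit a branch per row, while `&` on operands normalized to 0/1
gives a branch-free loop body that is a better candidate for
auto-vectorization. Whether a given compiler vectorizes either loop depends
on its version and flags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Logical AND: the right-hand side is evaluated only if the left one is true.
void applyNullMapLogical(std::vector<uint8_t> & filter, const std::vector<uint8_t> & null_map)
{
    assert(filter.size() == null_map.size());
    for (size_t i = 0; i < filter.size(); ++i)
        filter[i] = filter[i] && !null_map[i];
}

// Bitwise AND: both operands are always evaluated and normalized to 0/1
// first, so the result is identical to the logical version.
void applyNullMapBitwise(std::vector<uint8_t> & filter, const std::vector<uint8_t> & null_map)
{
    assert(filter.size() == null_map.size());
    for (size_t i = 0; i < filter.size(); ++i)
        filter[i] = static_cast<uint8_t>((filter[i] != 0) & (null_map[i] == 0));
}
```

Both functions keep a row only when the filter is non-zero and the value is
not NULL; the normalization to 0/1 matters because a plain `&` on raw bytes
such as 2 and 1 would give 0.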