Right now RemoteInserter does not read ProfileEvents for INSERT; it
handles them only after sending the query or on finish.
But #37391 sends them for each INSERT block, and sometimes there may be
no ProfileEvents packet at all, since the server sends them only for
non-empty blocks.
Handling this adds too much complexity, and ProfileEvents are useless
for the server anyway, so let's send them only if the query is initial
(i.e. sent by a user).
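Below is a minimal standalone sketch of the chosen rule; the enum and
function names here are illustrative, not the actual ClickHouse
identifiers:

```cpp
#include <cstdio>

// Illustrative model only: the server emits ProfileEvents packets just
// for queries that come directly from a user (initial queries), never
// for server-to-server (secondary) queries such as the INSERTs issued
// by RemoteInserter.
enum class QueryKind { INITIAL_QUERY, SECONDARY_QUERY };

static bool shouldSendProfileEvents(QueryKind kind)
{
    return kind == QueryKind::INITIAL_QUERY;
}

int main()
{
    printf("initial:   %d\n", shouldSendProfileEvents(QueryKind::INITIAL_QUERY));   // 1
    printf("secondary: %d\n", shouldSendProfileEvents(QueryKind::SECONDARY_QUERY)); // 0
}
```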
Note that it is okay to change the logic of sending ProfileEvents
without bumping DBMS_TCP_PROTOCOL_VERSION, because there has been no
public release that includes the original patch yet.
Fixes: #37391
Refs: #35075
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
This test executes SYSTEM RESTART REPLICAS, which may leave tables
behind for other tests. The problematic test is
01414_mutations_and_errors_zookeeper, which has invalid values in the
table and produces the following error:
2022.06.19 19:02:07.165320 [ 1242562 ] {} <Error> MutateFromLogEntryTask: virtual bool DB::ReplicatedMergeMutateTaskBase::executeStep(): Code: 6. DB::Exception: Cannot parse string 'Hello' as UInt64: syntax error at begin of string. Note: there are toUInt64OrZero and toUInt64OrNull functions, which returns zero/NULL instead of throwing exception.: while executing 'FUNCTION _CAST(value :: 2, 'UInt64' :: 3) -> _CAST(value, 'UInt64') UInt64 : 4': (while reading from part /var/lib/clickhouse/store/700/70043200-eae1-44da-8554-0d43b7e936d7/20191002_1_1_0/): While executing MergeTreeInOrder. (CANNOT_PARSE_TEXT), Stack trace (when copying this message, always include the lines below):
Here is the part of the server log that is relevant to the test:
...
2022.06.19 18:33:22.495867 [ 391343 ] {e0332447-5473-4653-ba8b-b976acb304a1} <Trace> InterpreterSystemQuery: Restarting replica on test_9.replicated_mutation_table
2022.06.19 18:33:22.503462 [ 390869 ] {} <Information> test_9.replicated_mutation_table (70043200-eae1-44da-8554-0d43b7e936d7): Stopped being leader
...
2022.06.19 18:33:23.396760 [ 395825 ] {09ee374d-a8d9-47db-bdca-611d605b40c6} <Error> executeQuery: Code: 341. DB::Exception: Mutation is not finished because table shutdown was called. It will be done after table restart. (UNFINISHED) (version 22.6.1.1985 (official build)) (from [::1]:40558) (comment: '01414_mutations_and_errors_zookeeper.sh') (in query: ALTER TABLE replicated_mutation_table MODIFY COLUMN value UInt64 SETTINGS replication_alter_partitions_sync = 2), Stack trace (when copying this message, always include the lines below):
...
2022.06.19 18:33:23.467115 [ 390869 ] {} <Debug> test_9.replicated_mutation_table (70043200-eae1-44da-8554-0d43b7e936d7): Loading data parts
2022.06.19 18:33:23.471062 [ 390869 ] {} <Debug> test_9.replicated_mutation_table (70043200-eae1-44da-8554-0d43b7e936d7): Loaded data parts (3 items)
...
2022.06.19 18:33:23.515997 [ 390869 ] {} <Trace> test_9.replicated_mutation_table (ReplicatedMergeTreeRestartingThread): Restarting thread finished
...
2022.06.19 18:33:23.522475 [ 390869 ] {} <Trace> test_9.replicated_mutation_table (PartMovesBetweenShardsOrchestrator): PartMovesBetweenShardsOrchestrator thread finished
...
2022.06.19 18:33:24.960630 [ 391343 ] {e0332447-5473-4653-ba8b-b976acb304a1} <Error> executeQuery: Code: 57. DB::Exception: Cannot attach table with UUID 9b62c1d4-cf4a-4e41-bd11-bafb1446495c, because it was detached but still used by some query. Retry later. (TABLE_ALREADY_EXISTS) (version 22.6.1.1985 (official build)) (from [::1]:47448) (comment: 01646_system_restart_replicas_smoke.sql) (in query: SYSTEM RESTART REPLICAS;), Stack trace (when copying this message, always include the lines below):
...
2022.06.19 18:33:24.490940 [ 400623 ] {00c29852-e786-4e53-a44a-5f1c5f23c698} <Debug> executeQuery: (from [::1]:48804) (comment: '01414_mutations_and_errors_zookeeper.sh') SELECT distinct(value) FROM replicated_mutation_table ORDER BY value (stage: Complete)
2022.06.19 18:33:24.502168 [ 400623 ] {00c29852-e786-4e53-a44a-5f1c5f23c698} <Error> executeQuery: Code: 60. DB::Exception: Table test_9.replicated_mutation_table doesn't exist. (UNKNOWN_TABLE) (version 22.6.1.1985 (official build)) (from [::1]:48804) (comment: '01414_mutations_and_errors_zookeeper.sh') (in query: SELECT distinct(value) FROM replicated_mutation_table ORDER BY value), Stack trace (when copying this message, always include the lines below):
...
2022.06.19 18:33:25.048152 [ 395940 ] {bb31a17f-aca1-411a-ab30-c6b7598c59e5} <Debug> executeQuery: (from [::1]:49236) (comment: '01414_mutations_and_errors_zookeeper.sh') DROP TABLE IF EXISTS replicated_mutation_table (stage: Complete)
And if this table is left behind, then the check for error messages in
/var/log/clickhouse-server/clickhouse-server.backward.clean.log will
fail, as in [1].
[1]: https://s3.amazonaws.com/clickhouse-test-reports/38205/90b57e7445d5167ea2170bfe03af29faffc195cf/stress_test__undefined__actions_.html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
The header of the source table should not be cached in the WINDOW
VIEW, since there is no way to get a notification when it has been
changed (ALTER or CREATE/DROP).
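A minimal standalone model of why such a cache goes stale; this is an
illustration, not the actual WINDOW VIEW code:

```cpp
#include <cstdio>
#include <optional>
#include <string>

struct Table { std::string header; };

struct WindowView
{
    Table * source;
    std::optional<std::string> cached_header;  // the problematic cache

    const std::string & getHeader()
    {
        if (!cached_header)
            cached_header = source->header;  // cached once, never invalidated
        return *cached_header;
    }
};

int main()
{
    Table mt{"a UInt64"};
    WindowView wv{&mt, {}};
    printf("%s\n", wv.getHeader().c_str());  // "a UInt64"

    mt.header = "a UInt64, b String";        // ALTER TABLE ... ADD COLUMN
    printf("%s\n", wv.getHeader().c_str());  // still "a UInt64": stale header
}
```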
And this fires on [CI] when the following tests are executed in this
order in stress tests:
- 01050_window_view_parser_tumble (leaves wm for mt)
- 01748_partition_id_pruning (cache input_header)
- 01188_attach_table_from_path (insert into mt with wm attached and
incorrect structure)
[CI]: https://s3.amazonaws.com/clickhouse-test-reports/38056/109980eb275c064d08bc031bfdc14d95b9a7272b/stress_test__undefined__actions_.html
Follow-up for: #37965 (@Vxider)
Fixes: #37815
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
After AggregatingStep is used, there is no StrictResize processor,
since there is only one stream.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
v2: use real column name instead of aliases from GROUP BY
Fixes the following error in 01710_projection_aggregation_in_order:
Not found column a in block. There are only columns: toStartOfHour(ts), sum(value). (NOT_FOUND_COLUMN_IN_BLOCK)
v2.1: Get back support for projected and non-projected parts
v2.2: merge tests and rename
v3: Reduce copy-paste for optimize_aggregation_in_order for projections
v4: rebase on top of QueryPlanResourceHolder
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
In case of INSERT into a Distributed table with send_logs_level != none
it is possible to receive tons of Log packets, and without consuming
them properly the socket buffer fills up and eventually the query
hangs.
This happens because the receiver will not read data until it has sent
its Log packets, but the sender does not read those Log packets, so the
receiver hangs; and hence the sender hangs too, because the receiver no
longer consumes Data packets.
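The situation can be modeled with a plain POSIX socket pair (a
standalone illustration, not ClickHouse code): both peers only write
and never read, so both kernel send buffers fill up and, with blocking
sockets, each side would sleep in send() forever:

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

int main()
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return 1;

    // Non-blocking, so instead of hanging we can observe EWOULDBLOCK.
    fcntl(fds[0], F_SETFL, O_NONBLOCK);
    fcntl(fds[1], F_SETFL, O_NONBLOCK);

    char chunk[4096];
    memset(chunk, 'x', sizeof(chunk));

    // The "receiver" floods Log packets, the "sender" floods Data
    // packets; neither side ever calls recv(), mirroring the two stuck
    // stacks shown below.
    bool receiver_blocked = false;
    bool sender_blocked = false;
    while (!receiver_blocked || !sender_blocked)
    {
        if (!receiver_blocked && send(fds[0], chunk, sizeof(chunk), 0) < 0 && errno == EWOULDBLOCK)
            receiver_blocked = true;
        if (!sender_blocked && send(fds[1], chunk, sizeof(chunk), 0) < 0 && errno == EWOULDBLOCK)
            sender_blocked = true;
    }

    // With blocking sockets both sides would now be stuck in send():
    // this is exactly how the distributed INSERT hangs.
    printf("both peers would block in send(): deadlock\n");
    close(fds[0]);
    close(fds[1]);
    return 0;
}
```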
In the initial version of this patch I tried to properly consume Log
packets, but it is not possible to ensure that all Log packets have
been consumed before writing Data blocks. In other words, with the
current protocol implementation it is not possible to fix Log packet
consumption properly to avoid the deadlock, so send_logs_level has
simply been disabled.
But note that, to the user, this does not differ from what ClickHouse
did before: previously it simply did not consume those packets, so the
client never saw those messages anyway.
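A hypothetical sketch of that v5 fix follows; the Settings struct and
the function below are illustrative stand-ins, not the actual diff. The
idea: the settings forwarded with the server-to-server INSERT simply
drop send_logs_level, so the remote server never produces Log packets
on this connection:

```cpp
#include <cstdio>
#include <string>

// Illustrative stand-in for the real (much larger) Settings class.
struct Settings { std::string send_logs_level = "trace"; };

// No Log packets -> no unconsumed packets -> no deadlock.
Settings settingsForDistributedInsert(Settings s)
{
    s.send_logs_level = "none";
    return s;
}

int main()
{
    Settings user;  // the user session may have send_logs_level=trace
    Settings remote = settingsForDistributedInsert(user);
    printf("remote send_logs_level=%s\n", remote.send_logs_level.c_str());  // none
}
```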
<details>
The receiver:
Poco::Net::SocketImpl::poll(Poco::Timespan const&, int)
Poco::Net::SocketImpl::sendBytes(void const*, int, int)
Poco::Net::StreamSocketImpl::sendBytes(void const*, int, int)
DB::WriteBufferFromPocoSocket::nextImpl()
DB::TCPHandler::sendLogData(DB::Block const&)
DB::TCPHandler::sendLogs()
DB::TCPHandler::readDataNext()
DB::TCPHandler::processInsertQuery()
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
ESTAB 4331792 211637 127.0.0.1:9000 127.0.0.1:24446 users:(("clickhouse-serv",pid=46874,fd=3850))
The sender:
Poco::Net::SocketImpl::poll(Poco::Timespan const&, int)
Poco::Net::SocketImpl::sendBytes(void const*, int, int)
Poco::Net::StreamSocketImpl::sendBytes(void const*, int, int)
DB::WriteBufferFromPocoSocket::nextImpl()
DB::WriteBuffer::write(char const*, unsigned long)
DB::CompressedWriteBuffer::nextImpl()
DB::WriteBuffer::write(char const*, unsigned long)
DB::SerializationString::serializeBinaryBulk(DB::IColumn const&, DB::WriteBuffer&, unsigned long, unsigned long) const
DB::NativeWriter::write(DB::Block const&)
DB::Connection::sendData(DB::Block const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool)
DB::RemoteInserter::write(DB::Block)
DB::RemoteSink::consume(DB::Chunk)
DB::SinkToStorage::onConsume(DB::Chunk)
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
ESTAB 67883 3008240 127.0.0.1:24446 127.0.0.1:9000 users:(("clickhouse-serv",pid=41610,fd=25))
</details>
v2: rebase to use clickhouse_client_timeout and add clickhouse_test_wait_queries
v3: use KILL QUERY
v4: adjust the test
v5: disable send_logs_level for INSERT into Distributed
v6: add no-backward-compatibility-check tag
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Before this patch SELECT queries held parts even if they were not
required by the SELECT (i.e. had been eliminated by partition pruning).
This defers removing parts when you have long-running queries.
This behavior had been introduced in #23932, with the introduction of
StorageSnapshotPtr.
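A minimal shared_ptr model of the problem; this is an illustration,
not the actual StorageSnapshot code:

```cpp
#include <cstdio>
#include <memory>
#include <vector>

struct Part { ~Part() { puts("part removed"); } };

int main()
{
    auto p1 = std::make_shared<Part>();
    auto p2 = std::make_shared<Part>();

    // Model of a StorageSnapshot: it grabs *all* parts at query start.
    std::vector<std::shared_ptr<Part>> snapshot{p1, p2};

    // Partition pruning: the query actually reads only p1, yet the
    // snapshot still references p2.
    std::vector<std::shared_ptr<Part>> to_read{p1};

    p1.reset();
    p2.reset();  // the table dropped this part, but it is not freed:
                 // the snapshot keeps it alive until the query ends

    printf("query reads %zu part(s), snapshot pins %zu\n",
           to_read.size(), snapshot.size());
    // "part removed" is printed only here, at (query) scope exit.
}
```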
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>