Because integration tests run in parallel and ClickHouse may not be restarted between tests, relying on
`system.events` in a test is fragile: the counters are cumulative and can be
affected by previous tests. This commit removes all assumptions about
`system.events` from the test to keep it robust.
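For context, when event counters do have to be checked on a shared server, a common robustness pattern is to assert on deltas rather than absolute values. A minimal Python sketch of that pattern (the `node.query` helper mirrors the usual integration-test fixtures and is an assumption here, not this test's actual code):

```python
def get_event(node, name):
    # system.events holds cumulative counters since server start,
    # so absolute values include whatever previous tests did.
    result = node.query(
        f"SELECT value FROM system.events WHERE event = '{name}'"
    ).strip()
    return int(result) if result else 0


def test_select_counter(node):
    before = get_event(node, "SelectQuery")
    node.query("SELECT 1")
    after = get_event(node, "SelectQuery")
    assert after - before >= 1  # the delta is unaffected by earlier tests
```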
### Changelog category (leave one):
- Not for changelog (changelog entry is not required)
Use the LDAP hostname with a regular DNS lookup to check whether LDAP is online.
Previously we used an IP address extracted via the Docker client (not via a DNS
lookup), so LDAP could be reachable via the IP, thus passing the online check,
but not via a DNS lookup, which led to test failures (e.g., LDAP not reachable
from the ClickHouse instance).
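A minimal sketch of such a probe (hostname, port, and function name are illustrative): connecting by hostname forces the same DNS resolution ClickHouse itself will perform.

```python
import socket


def ldap_is_online(hostname, port=389, timeout=5):
    # create_connection resolves the hostname via DNS, unlike a raw IP
    # obtained from the docker client, so the probe fails exactly when
    # ClickHouse's own lookup would fail.
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        return False
```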
* initial commit. integ tests passing, need to re-run unit & my own personal tests
* partial refactoring to remove Protocol::ANY
* improve naming
* remove all usages of ProxyConfiguration::Protocol::ANY
* fix ut
* blabla
* support url functions as well
* support for HTTPS requests over HTTP proxy with tunneling off
* remove gtestabc
* fix silly mistake
* ...
* remove usages of httpclientsession::proxyconfig in src/
* got you
* remove stale comment
* it seems like I need reasonable defaults
* fix ut
* add some comments
* remove no longer needed header
* matrix out
* add https over http proxy with no tunneling (see the sketch after this list)
* some docs
* partial refactoring
* rename to use_tunneling_for_https_requests_over_http_proxy
* improve docs
* use shorter version
* remove useless test
* rename the setting
* update
* fix typo
* fix setting docs typo
* move ); up
* move ) up
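The `use_tunneling_for_https_requests_over_http_proxy` setting referenced above selects between two ways of sending an HTTPS request through an HTTP proxy. A rough Python sketch of the wire-level difference (proxy and target hosts are hypothetical; ClickHouse's actual implementation is C++):

```python
import http.client

# Tunneling on (CONNECT): the client asks the proxy to open a tunnel,
# then speaks TLS end-to-end with the target through it.
tunneled = http.client.HTTPSConnection("proxy.example.com", 3128)
tunneled.set_tunnel("clickhouse.com", 443)
tunneled.request("GET", "/")

# Tunneling off: the https:// URL is sent as a plain-HTTP request to the
# proxy, which establishes the TLS connection to the target itself.
plain = http.client.HTTPConnection("proxy.example.com", 3128)
plain.request("GET", "https://clickhouse.com/")
```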
* Limit log frequency for "Skipping send data over distributed table" message
After SYSTEM STOP DISTRIBUTED SENDS it will constantly print this
message.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
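A minimal sketch of the frequency-limiting idea (illustrative only; the real change lives in ClickHouse's C++ logging):

```python
import time


class FrequencyLimitedLogger:
    """Suppress repeats of the same message within a minimum interval."""

    def __init__(self, logger, min_interval_s=60.0):
        self.logger = logger
        self.min_interval_s = min_interval_s
        self.last_emitted = {}

    def info(self, message):
        now = time.monotonic()
        last = self.last_emitted.get(message)
        if last is None or now - last >= self.min_interval_s:
            self.last_emitted[message] = now
            self.logger.info(message)
        # otherwise the message (e.g. "Skipping send data over distributed
        # table") is dropped until the interval elapses
```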
* Rename directory monitor concept into async INSERT
Rename the following query settings (preserving backward
compatibility by keeping the old name as an alias):
- distributed_directory_monitor_sleep_time_ms -> distributed_async_insert_sleep_time_ms
- distributed_directory_monitor_max_sleep_time_ms -> distributed_async_insert_max_sleep_time_ms
- distributed_directory_monitor_batch_inserts -> distributed_async_insert_batch
- distributed_directory_monitor_split_batch_on_failure -> distributed_async_insert_split_batch_on_failure
Rename the following table settings (preserving backward
compatibility by keeping the old name as an alias):
- monitor_batch_inserts -> async_insert_batch
- monitor_split_batch_on_failure -> async_insert_split_batch_on_failure
- directory_monitor_sleep_time_ms -> async_insert_sleep_time_ms
- directory_monitor_max_sleep_time_ms -> async_insert_max_sleep_time_ms
And also update all the references:
$ gg -e directory_monitor_ -e monitor_ tests docs | cut -d: -f1 | sort -u | xargs sed -e 's/distributed_directory_monitor_sleep_time_ms/distributed_async_insert_sleep_time_ms/g' -e 's/distributed_directory_monitor_max_sleep_time_ms/distributed_async_insert_max_sleep_time_ms/g' -e 's/distributed_directory_monitor_batch_inserts/distributed_async_insert_batch/g' -e 's/distributed_directory_monitor_split_batch_on_failure/distributed_async_insert_split_batch_on_failure/g' -e 's/monitor_batch_inserts/async_insert_batch/g' -e 's/monitor_split_batch_on_failure/async_insert_split_batch_on_failure/g' -e 's/monitor_sleep_time_ms/async_insert_sleep_time_ms/g' -e 's/monitor_max_sleep_time_ms/async_insert_max_sleep_time_ms/g' -i
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Rename async_insert for Distributed into background_insert
This avoids ambiguity between general async INSERTs and INSERTs
into Distributed tables, which are indeed background operations, so the new
term expresses it even better.
Mostly done with:
$ git di HEAD^ --name-only | xargs sed -i -e 's/distributed_async_insert/distributed_background_insert/g' -e 's/async_insert_batch/background_insert_batch/g' -e 's/async_insert_split_batch_on_failure/background_insert_split_batch_on_failure/g' -e 's/async_insert_sleep_time_ms/background_insert_sleep_time_ms/g' -e 's/async_insert_max_sleep_time_ms/background_insert_max_sleep_time_ms/g'
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
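Combining the two rename steps, the legacy query-level names now resolve to the `background_insert` forms, with the old names kept as aliases. An illustrative mapping (derived from the commit messages above):

```python
# Old query setting -> final name after both renames; the old names remain
# usable as backward-compatible aliases.
QUERY_SETTING_ALIASES = {
    "distributed_directory_monitor_sleep_time_ms": "distributed_background_insert_sleep_time_ms",
    "distributed_directory_monitor_max_sleep_time_ms": "distributed_background_insert_max_sleep_time_ms",
    "distributed_directory_monitor_batch_inserts": "distributed_background_insert_batch",
    "distributed_directory_monitor_split_batch_on_failure": "distributed_background_insert_split_batch_on_failure",
}
```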
* Mark 02417_opentelemetry_insert_on_distributed_table as long
CI: https://s3.amazonaws.com/clickhouse-test-reports/55978/7a6abb03a0b507e29e999cb7e04f246a119c6f28/stateless_tests_flaky_check__asan_.html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
---------
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Sometimes the requests library detects the encoding incorrectly, and
because this test compares binary data, it fails.
Here is an example of a successful attempt:
2023-10-30 07:32:37 [ 654 ] DEBUG : http://172.16.1.2:8123 "GET /?query=SELECT+%2A+FROM+test.simple+FORMAT+Protobuf+SETTINGS+format_schema%3D%27simple%3AKeyValuePair%27 HTTP/1.1" 200 None (connectionpool.py:546, _make_request)
2023-10-30 07:32:37 [ 654 ] DEBUG : Encoding detection: utf_8 will be used as a fallback match (api.py:480, from_bytes)
2023-10-30 07:32:37 [ 654 ] DEBUG : Encoding detection: Found utf_8 as plausible (best-candidate) for content. With 0 alternatives. (api.py:487, from_bytes)
And here is a failed one [1]:
2023-10-29 18:12:56 [ 525 ] DEBUG : http://172.16.9.2:8123 "GET /?query=SELECT+%2A+FROM+test.simple+FORMAT+Protobuf+SETTINGS+format_schema%3D%27message_tmp%3AMessageTmp%27 HTTP/1.1" 200 None (connectionpool.py:547, _make_request)
2023-10-29 18:12:56 [ 525 ] DEBUG : Encoding detection: Found utf_16_be as plausible (best-candidate) for content. With 1 alternatives. (api.py:487, from_bytes)
E AssertionError: assert '܈Ē͡扣܈Ȓͤ敦' == '\x07\x08\x01\x12\x03abc\x07\x08\x02\x12\x03def'
E - abcdef
E + ܈Ē͡扣܈Ȓͤ敦
[1]: https://s3.amazonaws.com/clickhouse-test-reports/56030/c7f392500e93863638c9ca9bd56c93b3193091f3/integration_tests__release__[3_4].html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
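A sketch of how a test can sidestep the detection entirely: `requests` only guesses a charset for `response.text`, while `response.content` returns the raw bytes. (The URL is illustrative; the expected bytes are taken from the assertion above.)

```python
import requests

# response.text runs charset detection and may mangle binary output
# (utf_16_be was guessed in the failed run above); response.content
# returns the body verbatim, so binary comparisons stay stable.
response = requests.get(
    "http://localhost:8123/",
    params={
        "query": "SELECT * FROM test.simple FORMAT Protobuf"
                 " SETTINGS format_schema='simple:KeyValuePair'"
    },
)
expected = b"\x07\x08\x01\x12\x03abc\x07\x08\x02\x12\x03def"
assert response.content == expected
```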
Before this change, least_used failed to detect when a disk started to have more
free space; it only noticed when a disk started to have less.
The reason is that it uses a priority_queue: once a disk sinks to the bottom of
the queue, its free space is not updated until it is selected again.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
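A toy reproduction of that staleness with a heap (disk names and sizes are made up): a disk's priority is fixed at push time, so a later increase in free space is invisible until the disk surfaces again.

```python
import heapq

# Max-heap by free space, emulated with negated values; priorities are
# captured once, at push time.
heap = [(-100, "disk1"), (-500, "disk2")]
heapq.heapify(heap)

free_space = {"disk1": 100, "disk2": 500}

# disk1 frees a lot of space while sitting at the bottom of the queue...
free_space["disk1"] = 10_000

# ...but the stale priority recorded at push time still wins the comparison,
# so disk2 keeps being selected even though disk1 now has more room.
neg_free, name = heap[0]
assert name == "disk2" and -neg_free == 500
```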