* Limit log frequency for the "Skipping send data over distributed table" message
After SYSTEM STOP DISTRIBUTED SENDS it would otherwise print this
message constantly.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Rename directory monitor concept into async INSERT
Rename the following query settings (preserving backward
compatibility by keeping the old name as an alias):
- distributed_directory_monitor_sleep_time_ms -> distributed_async_insert_sleep_time_ms
- distributed_directory_monitor_max_sleep_time_ms -> distributed_async_insert_max_sleep_time_ms
- distributed_directory_monitor_batch_inserts -> distributed_async_insert_batch
- distributed_directory_monitor_split_batch_on_failure -> distributed_async_insert_split_batch_on_failure
Rename the following table settings (preserving backward
compatibility by keeping the old name as an alias):
- monitor_batch_inserts -> async_insert_batch
- monitor_split_batch_on_failure -> async_insert_split_batch_on_failure
- monitor_sleep_time_ms -> async_insert_sleep_time_ms
- monitor_max_sleep_time_ms -> async_insert_max_sleep_time_ms
And also update all the references:
$ gg -e directory_monitor_ -e monitor_ tests docs | cut -d: -f1 | sort -u | xargs sed -e 's/distributed_directory_monitor_sleep_time_ms/distributed_async_insert_sleep_time_ms/g' -e 's/distributed_directory_monitor_max_sleep_time_ms/distributed_async_insert_max_sleep_time_ms/g' -e 's/distributed_directory_monitor_batch_inserts/distributed_async_insert_batch/g' -e 's/distributed_directory_monitor_split_batch_on_failure/distributed_async_insert_split_batch_on_failure/g' -e 's/monitor_batch_inserts/async_insert_batch/g' -e 's/monitor_split_batch_on_failure/async_insert_split_batch_on_failure/g' -e 's/monitor_sleep_time_ms/async_insert_sleep_time_ms/g' -e 's/monitor_max_sleep_time_ms/async_insert_max_sleep_time_ms/g' -i
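Since the old names are kept as aliases, either spelling keeps working; a
quick illustration at the table level (the `dist`/`local` tables and the
`cluster` name here are hypothetical):
```sh
# Table-level setting under its new name:
$ clickhouse-client -q "
  CREATE TABLE dist AS local
  ENGINE = Distributed(cluster, default, local)
  SETTINGS async_insert_batch = 1"
# or, equivalently, via the backward-compatible alias:
#   SETTINGS monitor_batch_inserts = 1
```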
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Rename async_insert for Distributed into background_insert
This avoids ambiguity between general async INSERTs and INSERTs into
Distributed tables, which are indeed background operations, so the new
term expresses it even better.
Mostly done with:
$ git di HEAD^ --name-only | xargs sed -i -e 's/distributed_async_insert/distributed_background_insert/g' -e 's/async_insert_batch/background_insert_batch/g' -e 's/async_insert_split_batch_on_failure/background_insert_split_batch_on_failure/g' -e 's/async_insert_sleep_time_ms/background_insert_sleep_time_ms/g' -e 's/async_insert_max_sleep_time_ms/background_insert_max_sleep_time_ms/g'
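Both the original pre-rename name and the final one resolve to the same
setting; an illustrative check at the query level:
```sh
# Final name after this commit:
$ clickhouse-client -q "SET distributed_background_insert_batch = 1"
# Original name, kept as a backward-compatible alias:
$ clickhouse-client -q "SET distributed_directory_monitor_batch_inserts = 1"
```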
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Mark 02417_opentelemetry_insert_on_distributed_table as long
CI: https://s3.amazonaws.com/clickhouse-test-reports/55978/7a6abb03a0b507e29e999cb7e04f246a119c6f28/stateless_tests_flaky_check__asan_.html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
---------
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Before this change the following did not work; it always used user `dev`,
even with `clickhouse-client --connection prod`:
```yaml
user: dev
connections_credentials:
prod:
name: prod
user: prod
```
The problem was that previously it was not possible to distinguish
options set via the command line from options set via the configuration
file. I've split these two actions and embedded a call to
parseConnectionsCredentials() in between.
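With the fix, the connection-specific user takes effect (using the
hypothetical `prod` connection from the example above):
```sh
# currentUser() reports the effective user; expected output: prod
$ clickhouse-client --connection prod -q "SELECT currentUser()"
prod
```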
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
This is required for the client to handle comments in multiquery mode.
v0: separate context for input format
v2: cannot use separate context since params and stuff are changed in global context
v3: do not send this setting to the server (it breaks queries for readonly profiles)
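A minimal illustration of the scenario (table and data are made up):
```sh
# Comments in multiquery mode, including one after INSERT ... VALUES
# data, should be handled by the client rather than sent as query text:
$ clickhouse-client --multiquery -q "
  CREATE TABLE t (x UInt8) ENGINE = Memory;
  INSERT INTO t VALUES (1); -- a comment after the inserted data
  SELECT * FROM t;
"
```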
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Added a new type of authentication based on SSH keys. It works only for the native TCP protocol.
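A rough sketch of the intended flow; the exact DDL syntax, user name,
key value, and client flag below are assumptions, not taken from this
commit:
```sh
# Server side: create a user identified by an SSH public key
# (key material is a placeholder):
$ clickhouse-client -q "
  CREATE USER alice IDENTIFIED WITH ssh_key
  BY KEY 'AAAAC3NzaC1lZDI1NTE5...' TYPE 'ssh-ed25519'"
# Client side: authenticate over the native TCP protocol with the
# matching private key (flag name is an assumption):
$ clickhouse-client --user alice --ssh-key-file ~/.ssh/id_ed25519
```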
Co-authored-by: Nikita Mikhaylov <nikitamikhaylov@clickhouse.com>
Co-authored-by: Robert Schulze <robert@clickhouse.com>
In some cases native copy is not possible, and such requests should be
throttled.
v0: copyS3FileNativeWithFallback
v2: revert v0 and pass write_settings
v3: pass read_settings to copyFile()
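For context, a sketch of the kind of server-wide throttling such
fallback copies are expected to respect; the setting names below are an
assumption based on existing remote-I/O throttling knobs, not something
introduced by this commit:
```sh
# Hypothetical config.d snippet (YAML form):
$ cat > /etc/clickhouse-server/config.d/remote_throttling.yaml <<'EOF'
max_remote_read_network_bandwidth_for_server: 104857600   # 100 MiB/s
max_remote_write_network_bandwidth_for_server: 104857600  # 100 MiB/s
EOF
```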
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* initial impl
* fix env ut
* move ut directory
* make sure no null proxy resolver is returned by ProxyConfigurationResolverProvider
* minor adjustment
* add a few tests, still incomplete
* add proxy support for url table function
* use proxy for select from url as well
* remove optional from return type, just returns empty config
* fix style
* style
* black
* oh boy
* rm in progress file
* god pls don't let me kill anyone
* ...
* add use_aws guards
* remove hard coded s3 proxy resolver
* add concurrency-mt-unsafe
* aa
* black
* add logging back
* revert change
* improve code a bit
* helper functions and separate tests
* for some reason, this env test is not working..
* formatting
* :)
* clangtidy
* lint
* revert some stupid things
* small test adjustments
* simplify tests
* rename test
* remove extra line
* freaking style change
* simplify a bit
* fix segfault & remove an extra call
* tightly couple proxy provider with context..
* remove useless include
* rename config prefix parameter
* simplify provider a bit
* organize provider a bit
* add a few comments
* comment out proxy env tests
* fix nullptr in unit tests
* make sure old storage proxy config is properly covered without global context instance
* move a few functions from class to anonymous namespace
* fix no fallback for specific storage conf
* change API to accept http method instead of bool
* implement http/https distinction in listresolver, any still not implemented
* implement http/https distinction in remote resolver
* progress on code, improve tests and add url function working test
* use protocol instead of method for http and https
* small fix
* few more adjustments
* fix style
* black
* move enum to proxyconfiguration
* wip
* fix build
* fix ut
* delete atomicroundrobin class
* remove stale include
* add some tests.. need to spend some more time on the design..
* change design a bit
* progress
* use existing context for tests
* rename aux function and fix ut
* ..
* rename test
* try to simplify tests a bit
* simplify tests a bit more
* attempt to fix tests, accept more than one remote resolver
* use proper log id
* try waiting for resolver
* proper wait logic
* black
* empty
* address a few comments
* refactor tests
* remove old tests
* black
* use RAII to set/unset env
* black
* clang tidy
* fix env proxy not respecting any
* use log trace
* fix wrong logic in getRemoteResolver
* fix wrong logic in getRemoteResolver
* fix test
* remove unwanted code
* remove ClientConfigurationPerRequest and auxiliary classes
* remove unwanted code
* remove adapter test
* few adjustments and add test for s3 storage conf with new proxy settings
* black
* use chassert for context
* Add getenv comment
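Among the resolvers touched above is the environment resolver; a minimal
way to exercise it (the proxy host is made up):
```sh
# Standard proxy environment variables, picked up via getenv():
$ http_proxy=http://proxy.example.com:3128 \
  https_proxy=http://proxy.example.com:3128 \
  clickhouse-server --config-file /etc/clickhouse-server/config.xml
```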
The new setting is expert-level; the default is 0.5, and it applies only
to SLRU.
Also, I noticed that we expose cache policy settings for the mark and
the uncompressed cache but not for the index mark and the index
uncompressed cache. Changed that as well; it simplifies the code a bit.
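A sketch of what exposing these settings could look like; the exact key
names below are assumptions:
```sh
# Hypothetical config.d snippet (YAML form):
$ cat > /etc/clickhouse-server/config.d/caches.yaml <<'EOF'
index_mark_cache_policy: SLRU
index_mark_cache_size_ratio: 0.5          # expert-level, SLRU only
index_uncompressed_cache_policy: SLRU
index_uncompressed_cache_size_ratio: 0.5  # expert-level, SLRU only
EOF
```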
Cgroups allow changing the amount of memory available to a process
while it runs. The previous logic calculated the amount of available
memory only once at server startup. As a result, memory thresholds set
via cgroups were not picked up when the settings changed. We now always
incorporate the current limits during re-configuration.
Note 1: getMemoryAmount() opens and reads a file, which is potentially
expensive. Should be fine though, since that happens only when
the server configuration changes.
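For reference, the limits in question are read from files like these
(paths differ between cgroup v1 and v2):
```sh
$ cat /sys/fs/cgroup/memory.max                     # cgroup v2
$ cat /sys/fs/cgroup/memory/memory.limit_in_bytes   # cgroup v1
```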
Note 2: A better approach would be to treat cgroup limit changes as
another trigger for ClickHouse server re-configuration (which
currently only happens when the config files change). Shied away
from that for now because of the case when the cgroup limit
is lowered: there is no guarantee that ClickHouse can shrink its
memory usage accordingly in time (afaik, it does so only lazily
by denying new allocations). As a result, the OOM killer would
kill the server. The same can happen with this PR, but at a
lower implementation complexity.