* Limit log frequence for "Skipping send data over distributed table" message
After SYSTEM STOP DISTRIBUTED SENDS it will constantly print this
message.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Rename directory monitor concept into async INSERT
Rename the following query settings (with preserving backward
compatiblity, by keeping old name as an alias):
- distributed_directory_monitor_sleep_time_ms -> distributed_async_insert_sleep_time_ms
- distributed_directory_monitor_max_sleep_time_ms -> distributed_async_insert_max_sleep_time_ms
- distributed_directory_monitor_batch -> distributed_async_insert_batch_inserts
- distributed_directory_monitor_split_batch_on_failure -> distributed_async_insert_split_batch_on_failure
Rename the following table settings (with preserving backward
compatiblity, by keeping old name as an alias):
- monitor_batch_inserts -> async_insert_batch
- monitor_split_batch_on_failure -> async_insert_split_batch_on_failure
- directory_monitor_sleep_time_ms -> async_insert_sleep_time_ms
- directory_monitor_max_sleep_time_ms -> async_insert_max_sleep_time_ms
And also update all the references:
$ gg -e directory_monitor_ -e monitor_ tests docs | cut -d: -f1 | sort -u | xargs sed -e 's/distributed_directory_monitor_sleep_time_ms/distributed_async_insert_sleep_time_ms/g' -e 's/distributed_directory_monitor_max_sleep_time_ms/distributed_async_insert_max_sleep_time_ms/g' -e 's/distributed_directory_monitor_batch_inserts/distributed_async_insert_batch/g' -e 's/distributed_directory_monitor_split_batch_on_failure/distributed_async_insert_split_batch_on_failure/g' -e 's/monitor_batch_inserts/async_insert_batch/g' -e 's/monitor_split_batch_on_failure/async_insert_split_batch_on_failure/g' -e 's/monitor_sleep_time_ms/async_insert_sleep_time_ms/g' -e 's/monitor_max_sleep_time_ms/async_insert_max_sleep_time_ms/g' -i
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Rename async_insert for Distributed into background_insert
This will avoid amigibuity between general async INSERT's and INSERT
into Distributed, which are indeed background, so new term express it
even better.
Mostly done with:
$ git di HEAD^ --name-only | xargs sed -i -e 's/distributed_async_insert/distributed_background_insert/g' -e 's/async_insert_batch/background_insert_batch/g' -e 's/async_insert_split_batch_on_failure/background_insert_split_batch_on_failure/g' -e 's/async_insert_sleep_time_ms/background_insert_sleep_time_ms/g' -e 's/async_insert_max_sleep_time_ms/background_insert_max_sleep_time_ms/g'
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Mark 02417_opentelemetry_insert_on_distributed_table as long
CI: https://s3.amazonaws.com/clickhouse-test-reports/55978/7a6abb03a0b507e29e999cb7e04f246a119c6f28/stateless_tests_flaky_check__asan_.html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
---------
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Add two new settings for the Distributed engine:
- bytes_to_delay_insert
- max_delay_to_insert
If at the beginning of INSERT there will be too much pending data, more
then bytes_to_delay_insert, then the INSERT will wait until it will be
shrinked, and not more then max_delay_to_insert seconds.
If after this there will be still too much pending, it will throw an
exception.
Also new profile events were added (by analogy to the MergeTree):
- DistributedDelayedInserts (although you can use system.errors instead
of this, but still)
- DistributedRejectedInserts
- DistributedDelayedInsertsMilliseconds
Right now with distributed_directory_monitor_batch_inserts=1 and
insert_distributed_sync=0 INSERT into Distributed table will store
blocks that should be sent to remote (and in case of
prefer_localhost_replica=0 to the localhost too) on the local
filesystem, and sent it in background.
However there is no limit for this storage, and if the remote is
unavailable (or some other error), these pending blocks may take
significant space, and this is not always desired behaviour.
Add new Distributed setting - bytes_to_throw_insert, that will set the
limit for how much pending bytes is allowed, if the limit will be
reached an exception will be throw.
By default was set to 0, to avoid surprises.
Two new settings (by analogy with MergeTree family) has been added:
- `fsync_after_insert` - Do fsync for every inserted. Will decreases
performance of inserts.
- `fsync_tmp_directory` - Do fsync for temporary directory (that is used
for async INSERT only) after all part operations (writes, renames,
etc.).
Refs: #17380 (p1)
Add inter-server cluster secret, it is used for Distributed queries
inside cluster, you can configure in the configuration file:
<remote_servers>
<logs>
<shard>
<secret>foobar</secret> <!-- empty -- works as before -->
...
</shard>
</logs>
</remote_servers>
And this will allow clickhouse to make sure that the query was not
faked, and was issued from the node that knows the secret. And since
trust appeared it can use initial_user for query execution, this will
apply correct *_for_user (since with inter-server secret enabled, the
query will be executed from the same user on the shards as on initator,
unlike "default" user w/o it).
v2: Change user to the initial_user for Distributed queries if secret match
v3: Add Protocol::Cluster package
v4: Drop Protocol::Cluster and use plain Protocol::Hello + user marker
v5: Do not use user from Hello for cluster-secure (superfluous)
* DOCSUP-1036: Describe settings for different pools (#118)
* describe settings for different pools
* minor changes
* add refs to and from distributed, server settings, kafka, replication, mergetree
* changes in links description
* add description and links to RU version
* add descrtiption and links to ru version
* minor changes
* CLICKHOUSEDOCS-620: Returned text lost by the last merge.
Co-authored-by: Evgenia Sudarikova <56156889+otrazhenia@users.noreply.github.com>
Co-authored-by: Sergei Shtykov <bayonet@yandex-team.ru>
* Refactoring in preparation for AMP support
* infrastructure for AMP validation
* Add Metrika counter on AMP page + adjust layout
* more content fixes
* improve amp layout
* improve navigation
* Move converting admonitions to generation time
* strict amp test check
* Batch AMP validation
* Add date published/modified to docs articles and their ld+json meta
* few more content fixes
* improve ld+json meta
* adjust margins
* skip published/modified for stable release docs
* adapt single page mode
* update po
* skip published/modified for single page docs
* update po
* adjust layout
* adjust layout