Commit Graph

53 Commits

Author SHA1 Message Date
johnnymatthews
e9d9048903 Changes 'cannot run on cloud' message. 2023-12-05 17:14:10 -04:00
johnnymatthews
c6ca43b341 Moves self-hosted-only box under page title. 2023-12-04 18:05:34 -04:00
johnnymatthews
40062405fb Adds 'not available on cloud' to Distributed Table Engine. 2023-12-04 17:59:11 -04:00
johnnymatthews
e2eb47b2ec Reverts last commit. 2023-12-04 17:58:19 -04:00
johnnymatthews
7ce33b0737 Adds 'not available on cloud' to Distributed Table Engine. 2023-12-04 17:57:33 -04:00
justindeguzman
f3b0550dd3 [Docs] Add details about sharding_key for distributed table engine 2023-11-12 19:43:43 -08:00
Azat Khuzhin
c25d6cd624
Rename directory monitor concept into background INSERT (#55978)
* Limit log frequency for "Skipping send data over distributed table" message

After SYSTEM STOP DISTRIBUTED SENDS it will constantly print this
message.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Rename directory monitor concept into async INSERT

Rename the following query settings (preserving backward
compatibility by keeping the old names as aliases):
- distributed_directory_monitor_sleep_time_ms -> distributed_async_insert_sleep_time_ms
- distributed_directory_monitor_max_sleep_time_ms -> distributed_async_insert_max_sleep_time_ms
- distributed_directory_monitor_batch_inserts -> distributed_async_insert_batch
- distributed_directory_monitor_split_batch_on_failure -> distributed_async_insert_split_batch_on_failure

Rename the following table settings (preserving backward
compatibility by keeping the old names as aliases):
- monitor_batch_inserts -> async_insert_batch
- monitor_split_batch_on_failure -> async_insert_split_batch_on_failure
- directory_monitor_sleep_time_ms -> async_insert_sleep_time_ms
- directory_monitor_max_sleep_time_ms -> async_insert_max_sleep_time_ms

And also update all the references:

    $ gg -e directory_monitor_ -e monitor_ tests docs | cut -d: -f1 | sort -u | xargs sed -e 's/distributed_directory_monitor_sleep_time_ms/distributed_async_insert_sleep_time_ms/g' -e 's/distributed_directory_monitor_max_sleep_time_ms/distributed_async_insert_max_sleep_time_ms/g' -e 's/distributed_directory_monitor_batch_inserts/distributed_async_insert_batch/g' -e 's/distributed_directory_monitor_split_batch_on_failure/distributed_async_insert_split_batch_on_failure/g' -e 's/monitor_batch_inserts/async_insert_batch/g' -e 's/monitor_split_batch_on_failure/async_insert_split_batch_on_failure/g' -e 's/monitor_sleep_time_ms/async_insert_sleep_time_ms/g' -e 's/monitor_max_sleep_time_ms/async_insert_max_sleep_time_ms/g' -i

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Rename async_insert for Distributed into background_insert

This avoids ambiguity between general async INSERTs and INSERTs
into Distributed, which really do run in the background, so the new term
expresses it better.

Mostly done with:

    $ git di HEAD^ --name-only | xargs sed -i -e 's/distributed_async_insert/distributed_background_insert/g' -e 's/async_insert_batch/background_insert_batch/g' -e 's/async_insert_split_batch_on_failure/background_insert_split_batch_on_failure/g' -e 's/async_insert_sleep_time_ms/background_insert_sleep_time_ms/g' -e 's/async_insert_max_sleep_time_ms/background_insert_max_sleep_time_ms/g'

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

* Mark 02417_opentelemetry_insert_on_distributed_table as long

CI: https://s3.amazonaws.com/clickhouse-test-reports/55978/7a6abb03a0b507e29e999cb7e04f246a119c6f28/stateless_tests_flaky_check__asan_.html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

---------

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-11-01 15:09:39 +01:00
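The alias approach described in commit c25d6cd624 can be sketched in Python. This is only an illustration, not ClickHouse's implementation: `LEGACY_ALIASES`, `resolve_setting` and `normalize_settings` are hypothetical names, and the target names are derived by combining the two renames listed in the commit message (directory monitor to async insert, then async insert to background insert).

```python
# Hypothetical alias table: legacy name -> name after both renames in the commit.
LEGACY_ALIASES = {
    # Query-level settings
    "distributed_directory_monitor_sleep_time_ms":
        "distributed_background_insert_sleep_time_ms",
    "distributed_directory_monitor_max_sleep_time_ms":
        "distributed_background_insert_max_sleep_time_ms",
    "distributed_directory_monitor_batch_inserts":
        "distributed_background_insert_batch",
    "distributed_directory_monitor_split_batch_on_failure":
        "distributed_background_insert_split_batch_on_failure",
    # Table-level settings
    "monitor_batch_inserts": "background_insert_batch",
    "monitor_split_batch_on_failure": "background_insert_split_batch_on_failure",
    "directory_monitor_sleep_time_ms": "background_insert_sleep_time_ms",
    "directory_monitor_max_sleep_time_ms": "background_insert_max_sleep_time_ms",
}


def resolve_setting(name: str) -> str:
    """Return the current name, accepting a legacy name as an alias."""
    return LEGACY_ALIASES.get(name, name)


def normalize_settings(settings: dict) -> dict:
    """Rewrite every key of a settings mapping to its current name."""
    return {resolve_setting(key): value for key, value in settings.items()}
```

With a table like this, a configuration written against the old names keeps working, for example `normalize_settings({"monitor_batch_inserts": 1})` yields a mapping keyed by `background_insert_batch`.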
Mohammad Arab Anvari
0d0e53ecc0
Update distributed.md
Fix broken link in `**See Also**` section.
2023-05-26 13:07:37 +03:30
Ivan Takarlikov
8873856ce5 Fix some grammar mistakes in documentation, code and tests 2023-05-04 13:35:18 -03:00
Robert Schulze
c406663442
Docs: Replace annoying three spaces in enumerations by a single space 2023-04-19 15:56:55 +00:00
Aleksei Filatov
0ac9dcd723 Add allow_distributed_ddl_queries option to the cluster config 2023-03-29 18:15:46 +03:00
rfraposa
ac5ed141d8 New nav - reverting the revert 2023-03-17 21:45:43 -05:00
Alexander Tokmakov
ec44c8293a
Revert "New navigation" 2023-03-17 21:21:11 +03:00
rfraposa
854cdae311 Link fixes 2023-03-07 17:58:36 -07:00
rfraposa
008845216d Fix broken links 2023-03-07 14:06:14 -07:00
Ivan Blinkov
61c2f23713 Remove leftover empty lines at the end of markdown files 2023-01-09 15:15:18 +01:00
Ivan Blinkov
b7e082d033 Remove "Original article links" 2023-01-09 15:13:36 +01:00
DanRoscigno
5b5fcc56aa add slugs 2022-08-28 10:53:34 -04:00
DanRoscigno
70de1afad7 move settings to H3 level 2022-06-24 12:16:20 -04:00
rfraposa
869967de41 Remove H1 anchor tags from docs 2022-06-02 04:55:18 -06:00
rfraposa
8f01fe9c49 Revised /en folder 2022-04-09 07:34:21 -06:00
rfraposa
5250d9ad11 Removed /ja folder, cleaned up /ru markdown 2022-04-09 07:29:05 -06:00
Alexey Milovidov
9854b55835
Revert "Format changes for new docs" 2022-04-04 02:05:35 +03:00
rfraposa
00ddb72eea Update /engines docs 2022-03-29 17:43:34 -06:00
Ivan Blinkov
d77da1f98f
[docs] update distributed.md (#35220)
* Update distributed.md

* Update distributed.md

* Update distributed.md

* Update distributed.md
2022-03-11 23:49:24 +03:00
Maksim Kita
bce821ae52
Update distributed.md 2022-02-13 15:17:20 +01:00
Gaurav Kumar
fc800bf191
reference to distributed queries processing
The reading section is missing an important link to how distributed `in` queries are processed
2022-02-04 10:02:05 -08:00
Christoph Wurm
0b1b4fe9ad Fix list formatting in Distributed docs. 2021-12-16 12:31:28 +00:00
Christoph Wurm
3e5a6c8730 Add sections to Distributed documentation. 2021-12-10 18:29:15 +00:00
João Figueiredo
360ec76c29
Grammar suggestions to distributed.md
* fixed some typos.
* improved wording of some statements.
2021-10-20 22:35:17 +02:00
Alexey
2b272f5781 Virtual column in Distributed updated, link fixed, links added
Translated that part
2021-10-09 19:17:02 +00:00
Alexey
7f5852a711 New buildId variant
Links from Distributed
2021-10-09 18:37:28 +00:00
Ivan Blinkov
f429db1ee9 find . -type f -name '*.md'| xargs -I{} perl -pi -e 's|https://clickhouse.tech|https://clickhouse.com|g' {} 2021-09-19 23:05:54 +03:00
Azat Khuzhin
f3d3ec44a6 Add ability to set Distributed directory monitor settings via CREATE TABLE 2021-07-16 04:10:47 +03:00
Romain Neutron
dbcd573018
Fix some typos 2021-05-27 21:48:20 +02:00
Romain Neutron
7b515c7235
Avoid short syntax 2021-05-27 21:44:11 +02:00
alesapin
e27715e55e
Merge pull request #21331 from godliness/master
Fix error configuration for cluster secret
2021-03-10 10:09:29 +03:00
Azat Khuzhin
6965ac26c3 Distributed: Add ability to delay/throttle INSERT until pending data will be reduced
Add two new settings for the Distributed engine:
- bytes_to_delay_insert
- max_delay_to_insert

If at the beginning of an INSERT there is too much pending data (more
than bytes_to_delay_insert), the INSERT waits until the pending data
shrinks, but no longer than max_delay_to_insert seconds.

If there is still too much pending data after that, it throws an
exception.

Also new profile events were added (by analogy to the MergeTree):
- DistributedDelayedInserts (although you can use system.errors instead
  of this, but still)
- DistributedRejectedInserts
- DistributedDelayedInsertsMilliseconds
2021-03-03 23:30:23 +03:00
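The delay behaviour described in commit 6965ac26c3 can be sketched roughly as follows. This is a minimal Python illustration, not the engine's code: `delay_insert`, `TooMuchPendingData`, the `poll_seconds` interval and the injectable `sleep`/`now` callables are all hypothetical, and treating `bytes_to_delay_insert <= 0` as "disabled" is an assumption.

```python
import time


class TooMuchPendingData(Exception):
    """Hypothetical exception name used only in this sketch."""


def delay_insert(pending_bytes, bytes_to_delay_insert, max_delay_to_insert,
                 sleep=time.sleep, now=time.monotonic, poll_seconds=0.1):
    """Wait while too much data is pending; return the seconds waited.

    pending_bytes is a callable returning the current pending size in bytes.
    """
    # Assumption: a non-positive threshold disables the delay entirely.
    if bytes_to_delay_insert <= 0 or pending_bytes() <= bytes_to_delay_insert:
        return 0.0
    start = now()
    while pending_bytes() > bytes_to_delay_insert:
        waited = now() - start
        if waited >= max_delay_to_insert:
            # Still too much pending data after the maximum delay: give up.
            raise TooMuchPendingData(
                f"pending data still above {bytes_to_delay_insert} bytes "
                f"after {max_delay_to_insert} s"
            )
        # Never sleep past the deadline.
        sleep(min(poll_seconds, max_delay_to_insert - waited))
    return now() - start
```

Passing `sleep` and `now` as parameters is a design choice for this sketch only; it lets the waiting logic be exercised with a fake clock instead of real time.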
Azat Khuzhin
b5a5778589 Distributed: Add ability to limit amount of pending bytes for async INSERT
Right now, with distributed_directory_monitor_batch_inserts=1 and
insert_distributed_sync=0, an INSERT into a Distributed table stores
the blocks that should be sent to the remote (and, with
prefer_localhost_replica=0, to the localhost too) on the local
filesystem and sends them in the background.

However, there is no limit on this storage, and if the remote is
unavailable (or some other error occurs), these pending blocks may take
significant space, which is not always desired behaviour.

Add a new Distributed setting, bytes_to_throw_insert, that sets the
limit on how many pending bytes are allowed; if the limit is
reached, an exception is thrown.

The default is 0, to avoid surprises.
2021-03-03 23:30:00 +03:00
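The limit check from commit b5a5778589 amounts to a single comparison before the INSERT proceeds. A minimal Python sketch, under two labeled assumptions: a value of 0 means "no limit" (the commit only says the default is 0 "to avoid surprises"), and the limit counts as reached only when the pending size is strictly greater than it. `check_pending_limit` and `PendingLimitExceeded` are hypothetical names.

```python
class PendingLimitExceeded(Exception):
    """Hypothetical exception name used only in this sketch."""


def check_pending_limit(pending_bytes: int, bytes_to_throw_insert: int) -> None:
    """Raise if the pending data exceeds the configured limit."""
    # Assumption: 0 (the default) disables the check.
    if bytes_to_throw_insert <= 0:
        return
    # Assumption: strictly greater than the limit counts as "reached".
    if pending_bytes > bytes_to_throw_insert:
        raise PendingLimitExceeded(
            f"{pending_bytes} pending bytes exceed the limit of "
            f"{bytes_to_throw_insert}"
        )
```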
Chao Ma
c2b8612525 Fix error configuration for cluster secret 2021-03-01 16:30:42 +08:00
Azat Khuzhin
471deab63a Rename fsync_tmp_directory to fsync_directories for Distributed engine 2021-01-09 17:51:30 +03:00
Mikhail Filimonov
4cabfa356e Update documentation for Distributed fsync settings. 2021-01-09 11:31:32 +03:00
Azat Khuzhin
b5ace27014 Add fsync support for Distributed engine.
Two new settings (by analogy with the MergeTree family) have been added:

- `fsync_after_insert` - Do fsync for every insert. Decreases the
  performance of inserts.

- `fsync_tmp_directory` - Do fsync for temporary directory (that is used
  for async INSERT only) after all part operations (writes, renames,
  etc.).

Refs: #17380 (p1)
2021-01-09 11:31:32 +03:00
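What the two fsync settings from commit b5ace27014 mean at the file-system level can be illustrated with a short Python sketch. It is not the engine's code: `store_pending_block` is a hypothetical helper, the write-to-temporary-file-then-rename layout is an assumption made for the illustration, and fsync on a directory descriptor as written here works on POSIX systems only.

```python
import os


def store_pending_block(directory, name, data,
                        fsync_after_insert=False, fsync_tmp_directory=False):
    """Write a block to disk, optionally syncing the file and its directory."""
    tmp_path = os.path.join(directory, name + ".tmp")
    final_path = os.path.join(directory, name)

    with open(tmp_path, "wb") as f:
        f.write(data)
        if fsync_after_insert:
            # Push Python's buffer to the OS, then force the OS to write to disk.
            f.flush()
            os.fsync(f.fileno())

    # Atomically move the finished file into place.
    os.replace(tmp_path, final_path)

    if fsync_tmp_directory:
        # Sync the directory so the rename itself is durable (POSIX only).
        dir_fd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)

    return final_path
```

Both flags trade insert throughput for durability, which matches the commit's note that `fsync_after_insert` decreases insert performance.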
Alexey Milovidov
db4db42b65 Fix broken links in docs 2020-10-13 20:23:29 +03:00
Azat Khuzhin
0159c74f21 Secure inter-cluster query execution (with initial_user as current query user) [v3]
Add an inter-server cluster secret, used for Distributed queries
inside the cluster. You can configure it in the configuration file:

  <remote_servers>
      <logs>
          <shard>
              <secret>foobar</secret> <!-- empty -- works as before -->
              ...
          </shard>
      </logs>
  </remote_servers>

This allows clickhouse to make sure that the query was not
faked and was issued from a node that knows the secret. Since this
establishes trust, it can use initial_user for query execution, which
applies the correct *_for_user settings (with the inter-server secret
enabled, the query is executed on the shards as the same user as on the
initiator, unlike the "default" user without it).

v2: Change user to the initial_user for Distributed queries if secret match
v3: Add Protocol::Cluster package
v4: Drop Protocol::Cluster and use plain Protocol::Hello + user marker
v5: Do not use user from Hello for cluster-secure (superfluous)
2020-09-15 01:36:28 +03:00
BayoNet
52f95e53fb
DOCS-229: insert_distributed_sync (#12579)
* Revolg docsup 727 insert distributed sync setting (#130)

* Add insert_distributed_sync setting

* insert_distributed_sync en doc upd

* Translated to russian.

* Update docs/ru/operations/settings/settings.md

Co-authored-by: BayoNet <da-daos@yandex.ru>

* Added the link to the setting in the russian version.

Co-authored-by: Olga Revyakina <revolg@yandex-team.ru>
Co-authored-by: BayoNet <da-daos@yandex.ru>

* Update settings.md

* Update docs/ru/operations/settings/settings.md

Co-authored-by: olgarev <56617294+olgarev@users.noreply.github.com>
Co-authored-by: Olga Revyakina <revolg@yandex-team.ru>
Co-authored-by: Sergei Shtykov <bayonet@yandex-team.ru>
Co-authored-by: alexey-milovidov <milovidov@yandex-team.ru>
2020-07-21 15:40:03 +03:00
Azat Khuzhin
3395276748 Add replica priority into documentation 2020-06-29 23:03:28 +03:00
Ivan Blinkov
7170f3c534
[docs] split aggregate function and system table references (#11742)
* prefer relative links from root

* wip

* split aggregate function reference

* split system tables
2020-06-18 11:24:31 +03:00
Ivan Blinkov
f5b7665271
Update distributed.md 2020-06-10 23:18:36 +03:00
Mikhail f. Shiryaev
7f09bb8264
Replace back/forward quotes and apostrophes by straight 2020-06-10 12:52:41 +02:00