Update s3queue.md

2024-11-25 09:02:00 +00:00 · 2024-09-19 16:05:33 +02:00 · 2024-09-19 16:05:33 +02:00 · 043d830313
commit 043d830313
parent b4504f20bf
1 changed files with 9 additions and 0 deletions
--- a/docs/en/engines/table-engines/integrations/s3queue.md
+++ b/docs/en/engines/table-engines/integrations/s3queue.md
@ -191,6 +191,15 @@ For 'Ordered' mode. Available since `24.6`. If there are several replicas of S3Q
 Engine supports all s3 related settings. For more information about S3 settings see [here](../../../engines/table-engines/integrations/s3.md).


+## S3Queue Ordered mode {#ordered-mode}
+
+`S3Queue` processing mode allows to store less metadata in ZooKeeper, but has a limitation that files, which added later by time, are required to have aplha-numerically bigger names.
+
+`S3Queue` `ordered` mode, as well as `unordered`, supports `(s3queue_)processing_threads_num` setting (`s3queue_` prefix is optional), which allows to control number of threads, which would do processing of `S3` files locally on the server.
+In addition, `ordered` mode also introduces another setting called `(s3queue_)buckets` which means "logical threads". It means that in distributed scenario, when there are several servers with `S3Queue` table replicas, where this setting defines the number of processing units. E.g. each processing thread on each `S3Queue` replica will try to lock a certain `bucket` for processing, each `bucket` is attributed to certain files by hash of the file name. Therefore, in distributed scenario it is highly recommended to have `(s3queue_)buckets` setting to be at least equal to the number of replicas or bigger. This is fine to have the number of buckets bigger than the number of replicas. The most optimal scenario would be for `(s3queue_)buckets` setting to equal a multiplication of `number_of_replicas` and `(s3queue_)processing_threads_num`.
+The setting `(s3queue_)processing_threads_num` is not recommended for usage before version `24.6`.
+The setting `(s3queue_)buckets` is available starting with version `24.6`.
+
 ## Description {#description}

 `SELECT` is not particularly useful for streaming import (except for debugging), because each file can be imported only once. It is more practical to create real-time threads using [materialized views](../../../sql-reference/statements/create/view.md). To do this: