The Kafka table engine allows global configuration and per-Kafka-topic
configuration. The latter uses syntax <kafka_TOPIC>, e.g. for topic
"football":
<kafka_football>
<retry_backoff_ms>250</retry_backoff_ms>
<fetch_min_bytes>100000</fetch_min_bytes>
</kafka_football>
Some users had to find out the hard way that such configuration doesn't
take effect if the topic name contains a period, e.g. "sports.football".
The reason is that ClickHouse configuration framework already uses
periods as level separators to descend the configuration hierarchy.
(Besides that, per-topic configuration at the same level as global
configuration could be considered ugly.)
Note that Kafka topics may contain characters "a-zA-Z0-9._-" (*) and
a tree-like topic organization using periods is quite common in
practice.
This PR deprecates the existing per-topic configuration syntax (but
continues to support it for backward compat) and introduces a new
per-topic configuration syntax below the global Kafka configuration of
the form:
<kafka>
<topic name="football">
<retry_backoff_ms>250</retry_backoff_ms>
<fetch_min_bytes>100000</fetch_min_bytes>
</topic>
</kafka>
The period restriction doesn't apply to XML attributes, so <topic
name="sports.football"> will work. Also, everything Kafka-related is
below <kafka>.
Considered but rejected alternatives:
- Extending Poco ConfigurationView with custom separators (e.g."/"
instead of "."). Won't work easily because ConfigurationView only
builds a path but defers descending the configuration tree to the
normal configuration classes.
- Reloading the configuration file in StorageKafka (instead of reading
the loaded file) but with a custom separator. This mode is supported
by XML configuration. Too ugly and error-prone since the true
configuration is composed from multiple configuration files.
(*) https://stackoverflow.com/a/37067544
In case of underlying table has an ALIAS for this column, while in Merge
table it is not marked as an alias, there will NOT_FOUND_COLUMN_IN_BLOCK
error.
Further more, when underlying tables has different default type for the
column, i.e. one has ALIAS and another has real column, then you will
also get NOT_FOUND_COLUMN_IN_BLOCK, because Merge engine should take
care of this.
Also this patch reworks how PREWHERE is handled for Merge table, and now
if you use PREWHERE on the column that has the same type and default
type (ALIAS, ...) then it will be possible, and only if the type
differs, it will be prohibited and throw ILLEGAL_PREWHERE error.
And last, but not least, also respect this restrictions for
optimize_move_to_prewhere.
v2: introduce IStorage::supportedPrewhereColumns()
v3: Remove excessive condition for PREWHERE in StorageMerge::read()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Add new engine to ReplacingMergeTree corresponding to the ReplacingCollapsingMergeTree
* Add new test for the new ReplacingMergeTree engine
* Limit sign value to -1/1
* Add new engine to ReplacingMergeTree corresponding to the ReplacingCollapsingMergeTree
* Add new test for the new ReplacingMergeTree engine
* Limit sign value to -1/1
* Replace sign column(Int8) by is_deleted(UInt8)
* Add new engine to ReplacingMergeTree corresponding to the ReplacingCollapsingMergeTree
* Add new test for the new ReplacingMergeTree engine
* Limit sign value to -1/1
* Replace sign column(Int8) by is_deleted(UInt8)
* Add new engine to ReplacingMergeTree corresponding to the ReplacingCollapsingMergeTree
* Add new test for the new ReplacingMergeTree engine
* Limit sign value to -1/1
* Replace sign column(Int8) by is_deleted(UInt8)
* Add keyword 'CLEANUP' when OPTIMIZE
* Cleanup uniquely when it's a replacingMergeTree
* Propagate CLEANUP information and change from 'with_cleanup' to 'cleanup'
* Cleanup data flagged as 'is_deleted'
* Fix merge when optimize and add a test
* Fix OPTIMIZE and INSERT + add tests
* New fix for cleanup at the merge
* Cleanup debug logs
* Add the SETTINGS option 'clean_deleted_rows' that can be 'never' or 'always'
* Fix regression bug; Now REplicatedMergeTree can be called as before without 'is_deleted'
* Add Replicated tests
* Disable tag 'long' for our test and cleanup some white spaces
* Update tests
* Fix tests and remove additional useless whitespace
* Fix replica test
* Style clean && add condition check for is_deleted values
* clean_deleted_rows settings is nom an enum
* Add valid default value to the clean_deleted_rows settings
* Update cleanup checkers to use the enum and fix typos in the test
* Fix submodule contrib/AMQP-CPP pointer
* Add missing messages in test reference and remove a print with non derterministic order
* fix replica test reference
* Fix edge case
* Fix a typo for the spell checker
* Fix reference
* Fix a condition to raise an error if is_deleted differ from 0/1 and cleanup
* Change tests file name and update number
* This should fix the ReplacingMergeTree parameter set
* Fix replicated parameters
* Disable allow_deprecated_syntax_for_merge_tree for our new column
* Fix a test
* Remove non deterministic order print in the test
* Test on replicas
* Remove a condition, when checking optional parameters, that should not be sueful since we disabled the deprected_syntaxe
* Revert "Remove a condition, when checking optional parameters, that should not be useful since we disabled the deprected_syntaxe"
This reverts commit b65d64c05e.
* Fix replica management and limit the number of argument to two maximum, due to the possiblity of deprecated table create/attach failing otherwise
* Test a fix for replicated log information error
* Try to add sync to have consistent results
* Change path of replicas that should cause one issue and add few prints in case it's not that
* Get cleanup info on replicas only if information found
* Fix style issues
* Try to avoid replication error 'cannot select parts...' and and replica read/write field order
* Cleanup according to PR reviews
and add tests on error raised.
* Update src/Storages/MergeTree/registerStorageMergeTree.cpp
Co-authored-by: Alexander Tokmakov <tavplubix@gmail.com>
* Select ... FINAL don't show rows with is_deleted = true
* Update and fix SELECT ... FINAL merge parameter
* Remove is_deleted rows only on the version inserted when merge
* Fix (master) updates issues
* Revert changes that should not be commited
* Add changes according to review
* Revert changes that should not be commited - part 2
---------
Co-authored-by: Alexander Tokmakov <tavplubix@gmail.com>