Previously, the setting `enable_order_by_all` determined whether ORDER BY ALL
sorts by a column named 'all' (if given in the SELECT clause) or by all
columns. The actual behavior was not always intuitive.
Now we unconditionally throw an exception, which also simplifies the
handling a bit. Only an edge case is affected, and users who really want
to run ORDER BY ALL on a column named 'all' can alias it.
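The rule described above can be modelled with a small toy function. This is only a sketch, not ClickHouse code: the function name `expand_order_by_all`, the use of `ValueError`, and the case-insensitive comparison are assumptions made for illustration.

```python
def expand_order_by_all(select_columns):
    """Toy model: return the sort keys that ORDER BY ALL expands to.

    Raises if any output column is named 'all', because it would be
    ambiguous whether that column or every column is meant.
    """
    for name in select_columns:
        # Assumption: the name check ignores case.
        if name.lower() == "all":
            raise ValueError(
                "ORDER BY ALL is ambiguous: a column named 'all' is selected; "
                "alias the column to use ORDER BY ALL"
            )
    # No conflict: sort by every selected column, in SELECT order.
    return list(select_columns)


if __name__ == "__main__":
    print(expand_order_by_all(["a", "b"]))
    # Aliasing the conflicting column (hypothetical alias `all_`) avoids the error.
    print(expand_order_by_all(["a", "all_"]))
```

Aliasing the column removes the ambiguity, which is why the function accepts any name other than 'all'.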
* temp commit
* draft impl for feedback
* fix weird style changes
* aa
* Add integ tests and remove partition key restriction
* fix small inconsistency in partition id
* style fix
* use existing DataPartStorageBuilder instead of new one
* Refactor part clone to make it more readable and maintainable
* Add MergeTreeDataPartCloner docs
* define ErrorCodes::BAD_ARGUMENTS
* Rebase
* camel case methods
* address some comments
* yet another rebase?
* Move from integ tests to stateless tests
* address more comments
* add finalize on min_max_idx files
* Add sync option to DistinctPartitionExpCloner
* just a temp test
* revert temp change
* Use echoOn to distinguish test queries
* remove comment
* fix build issue during rebase
* attempt to fix build after rebase
* finally fix build
* clear minmaxidx hyperrectangle before loading it
* Fix error on min_max files deletion where it was being assumed that partition expression contained all columns
* get it to the state it was previously
* add missing include
* getting functional?
* refactoring and renaming
* some more refactoring
* extern bad arguments
* try to fix style
* improvements and docs
* remove duplicate includes
* fix crash
* make tests more stable by ordering
* rebase once again..
* fix
* make ci happy?
* fix rebase issues
* docs
* rebase, but prolly needs to be improved
* refactor out from nasty inheritance to static methods
* fix style
* work around optional
* refactor & integrate some changes
* update column_type
* add tests by dencrane
* set utc
* fix ref file
* fix tests
* use MergeTree instead of SummingMergeTree
* mark MergeTreeDataPart::getBlock as const
* address a few comments
* compute module function name size at compile time
* simplify branching in getPartitionAstFieldsCount
* remove column_indexes argument
* merge getBlock with buildBlock
* add some const specifiers
* small adjustments
* remove no longer needed isNull check
* use std::min and max to update global min max idx
* add some assertions
* forward declare some symbols
* fix grammar
* forward decl
* try to fix build..
* remove IFunction forward decl
* Revert "use std::min and max to update global min max idx"
This reverts commit b2fe79dda7.
* Revert "remove no longer needed isNull check"
This reverts commit 129db2610f.
* Revert "Revert "remove no longer needed isNull check""
This reverts commit 9416087dd8.
* Revert "Revert "use std::min and max to update global min max idx""
This reverts commit 20246d4416.
* remove some comments
* partial use of MonotonicityCheckMatcher
* ranges
* remove KeyDescriptionMonotonicityChecker
* remove duplication of applyfunction
* move functions to anonymous namespace
* move functions to cpp
* Relax partition compatibility requirements by accepting subset, add tests from partitioned to unpartitioned
* update reference file
* Support for partition by a, b, c to partition by a, b
* refactoring part 1
* refactoring part 2, use hyperrectangle, still not complete
* refactoring part 3, build hyperrectangle with intersection of source & destination min max columns
* Support attaching to table with partition expression of multiple expressions
* add tests
* rename method
* remove some code duplication
* draft impl for replicatedmergetree, need to dive deeper
* ship ref file
* fix impl for replicatedmergetree..
* forbid attach empty partition replicatedmergetree
* Add replicated merge tree integration tests
* add test missing files
* fix black
* do not check for monotonicity of empty partition
* add empty tests & fix replicated
* remove no longer needed buildBlockWithMinMaxINdexes
* remove column logic in buildHyperrectangle
* simplify implementation by using existing methods
* further simplify implementation
* move all MergeTreeDataPartClone private methods to .cpp file
* decrease decomposition
* use different namespaces
* reduce code duplication
* fix style
* address a few comments
* add chassert to assert arguments size on MonotonicityCheckVisitor
* remove deleteMinMaxFiles method
* remove useless checks from sanitycheck
* add tests for attach partition (not id)
* Remove sanityCheckASTPartition and bring back conditional getPartitionIDFromQuery
* remove empty block comment
* small fixes
* fix formatting
* add missing include
* remove duplicate includes
* trigger ci
* reduce some code duplication
* use updated partition id on replicatedmergetree
* fix build
* small refactor
* do not use insert increment on fetch part
* remove duplicate includes
* add one more integ test
* black
* rely on partition exp instead of partition id on replicated part fetch to decide if it is a different partition exp
* add one more integ test
* add order by clause
* fix black
---------
Co-authored-by: Alexey Milovidov <milovidov@clickhouse.com>
I've recently stumbled several times trying to figure out where to put the `SETTINGS` when inserting `VALUES` and `FROM INFILE`, so I'm clarifying it here in the docs.
Add syntax in SQL and XML to mark whether specific fields may be
overridden.
Also add a new setting that controls the default behaviour when
override support is not specified.
* Limit log frequency for "Skipping send data over distributed table" message
After SYSTEM STOP DISTRIBUTED SENDS, this message is printed
constantly.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Rename directory monitor concept into async INSERT
Rename the following query settings (preserving backward
compatibility by keeping the old names as aliases):
- distributed_directory_monitor_sleep_time_ms -> distributed_async_insert_sleep_time_ms
- distributed_directory_monitor_max_sleep_time_ms -> distributed_async_insert_max_sleep_time_ms
- distributed_directory_monitor_batch_inserts -> distributed_async_insert_batch
- distributed_directory_monitor_split_batch_on_failure -> distributed_async_insert_split_batch_on_failure
Rename the following table settings (preserving backward
compatibility by keeping the old names as aliases):
- monitor_batch_inserts -> async_insert_batch
- monitor_split_batch_on_failure -> async_insert_split_batch_on_failure
- monitor_sleep_time_ms -> async_insert_sleep_time_ms
- monitor_max_sleep_time_ms -> async_insert_max_sleep_time_ms
And also update all the references:
$ gg -e directory_monitor_ -e monitor_ tests docs | cut -d: -f1 | sort -u | xargs sed -e 's/distributed_directory_monitor_sleep_time_ms/distributed_async_insert_sleep_time_ms/g' -e 's/distributed_directory_monitor_max_sleep_time_ms/distributed_async_insert_max_sleep_time_ms/g' -e 's/distributed_directory_monitor_batch_inserts/distributed_async_insert_batch/g' -e 's/distributed_directory_monitor_split_batch_on_failure/distributed_async_insert_split_batch_on_failure/g' -e 's/monitor_batch_inserts/async_insert_batch/g' -e 's/monitor_split_batch_on_failure/async_insert_split_batch_on_failure/g' -e 's/monitor_sleep_time_ms/async_insert_sleep_time_ms/g' -e 's/monitor_max_sleep_time_ms/async_insert_max_sleep_time_ms/g' -i
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Rename async_insert for Distributed into background_insert
This avoids ambiguity between general async INSERTs and INSERTs into
Distributed tables, which are indeed performed in the background, so the
new term expresses it even better.
Mostly done with:
$ git di HEAD^ --name-only | xargs sed -i -e 's/distributed_async_insert/distributed_background_insert/g' -e 's/async_insert_batch/background_insert_batch/g' -e 's/async_insert_split_batch_on_failure/background_insert_split_batch_on_failure/g' -e 's/async_insert_sleep_time_ms/background_insert_sleep_time_ms/g' -e 's/async_insert_max_sleep_time_ms/background_insert_max_sleep_time_ms/g'
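To check what a given old setting name ends up as after both renames, the two sed invocations in this log can be replayed in order. The following is a minimal sketch: the substitution pairs are copied from those sed commands, while the helper names `apply_rules` and `final_name` are made up for illustration.

```python
# Substitutions from the first rename (directory monitor -> async INSERT).
STEP_1 = [
    ("distributed_directory_monitor_sleep_time_ms", "distributed_async_insert_sleep_time_ms"),
    ("distributed_directory_monitor_max_sleep_time_ms", "distributed_async_insert_max_sleep_time_ms"),
    ("distributed_directory_monitor_batch_inserts", "distributed_async_insert_batch"),
    ("distributed_directory_monitor_split_batch_on_failure", "distributed_async_insert_split_batch_on_failure"),
    ("monitor_batch_inserts", "async_insert_batch"),
    ("monitor_split_batch_on_failure", "async_insert_split_batch_on_failure"),
    ("monitor_sleep_time_ms", "async_insert_sleep_time_ms"),
    ("monitor_max_sleep_time_ms", "async_insert_max_sleep_time_ms"),
]

# Substitutions from the second rename (async INSERT -> background INSERT).
STEP_2 = [
    ("distributed_async_insert", "distributed_background_insert"),
    ("async_insert_batch", "background_insert_batch"),
    ("async_insert_split_batch_on_failure", "background_insert_split_batch_on_failure"),
    ("async_insert_sleep_time_ms", "background_insert_sleep_time_ms"),
    ("async_insert_max_sleep_time_ms", "background_insert_max_sleep_time_ms"),
]


def apply_rules(name, rules):
    """Apply each (old, new) substitution in order, like chained `sed -e s/old/new/g`."""
    for old, new in rules:
        name = name.replace(old, new)
    return name


def final_name(name):
    """Replay both renames to get the current name of an old setting."""
    return apply_rules(apply_rules(name, STEP_1), STEP_2)


if __name__ == "__main__":
    for old in ("distributed_directory_monitor_batch_inserts", "monitor_sleep_time_ms"):
        print(old, "->", final_name(old))
```

The order of the rules matters in the same way it does for sed: the longer `distributed_...` patterns run before the shorter `monitor_...` ones, so a prefixed name is never rewritten by a shorter rule first.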
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* Mark 02417_opentelemetry_insert_on_distributed_table as long
CI: https://s3.amazonaws.com/clickhouse-test-reports/55978/7a6abb03a0b507e29e999cb7e04f246a119c6f28/stateless_tests_flaky_check__asan_.html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
---------
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>