Previously setting `enable_order_by_all` distinguished for ORDER BY ALL
whether we should sort by column 'all' (if given in the SELECT clause)
or by all columns. The actual behavior was not always intuitive.
Now, we throw unconditionally an exception which also simplifies the
handling a bit. Only an edge case is affected and if users really want
to run ORDER BY ALL on a column names 'all', they can alias it.
In addition to the time since the most recent insert,
consider the elapsed time between the two recent queue
flushes when decreasing the timeout or processing an
entry synchronously.
Add AsynchronousInsertQueueSize and AsynchronousInsertQueueBytes
metrics to improve observability of asynchronous inserts.
The metrics do not account for tasks dispatched for immediate processing
(as opposed to, e.g., PendingAsyncInsert).
```
SELECT value
FROM system.metrics
WHERE metric IN ('AsynchronousInsertQueueSize', 'PendingAsyncInsert')
Query id: a711dd83-b48d-4ad5-8031-fa59b21a7c38
┌─value─┐
│ 18 │
│ 23 │
└───────┘
```
```
SELECT value
FROM system.metrics
WHERE metric IN ('AsynchronousInsertQueueSize', 'AsynchronousInsertQueueBytes')
Query id: b35a7ceb-2bb5-46ad-b301-e6cf03508699
┌─value─┐
│ 28 │
│ 1372 │
└───────┘
```
Implement the algorithm described in #56783 for adaptive asynchronous
insert timeouts.
- The adaptive async insert timeout can take values within
[async_insert_busy_timeout_min_ms, async_insert_busy_timeout_max_ms].
- The initial value is set to async_insert_busy_timeout_min_ms.
- If the elapsed time since the most recent queue insert was
greater than the maximum timeout, process the queue content immediately,
and reduce the timeout.
- If the elapsed time was long enough (longer than a would-be decreased
timeout), decrease the timeout.
- The adaptive timeout is changes exponentially based on the
async_insert_busy_timeout_{increase|decrease}_rate.
Fixes: https://github.com/ClickHouse/ClickHouse/issues/56783
The current timeout for checking updates in the asynchronous queue is
equal to the timeout used for queue entry
(async_insert_busy_timeout_ms).
That means that, in the worst case, an entry spends twice the time of the
asynchronous timeout in the queue.
* temp commit
* temp commit
* draft impl for feedback
* fix weird style changes
* fix weird style changes
* fix weird style changes
* fix weird style changes
* fix weird style changes
* aa
* aa
* Add integ tests and remove partition key restriction
* fix small incosistency in partition id
* style fix
* style fix
* style fix
* use existing DataPartStorageBuilder instead of new one
* Refactor part clone to make it more readable and maintainable
* Add MergeTreeDataPartCloner docs
* define ErrorCodes::BAD_ARGUMENTS
* Rebase
* camel case methods
* address some comments
* yet another rebase?
* Move from integ tests to stateless tests
* address more comments
* add finalize on min_max_idx files
* Add sync option to DistinctPartitionExpCloner
* just a temp test
* revert temp change
* Use echoOn to distinguish test queries
* remove comment
* fix build issue during rebase
* atempt to fix build after rebase
* finally fix build
* clear minmaxidx hyperrectangle before loading it
* Fix error on min_max files deletion where it was being assumed that partition expression contained all columns
* get it to the state it was previously
* add missing include
* getting functional?
* refactoring and renaming
* some more refactoring
* extern bad arguments
* try to fix style
* improvements and docs
* remove duplicate includes
* fix crash
* make tests more stable by ordering
* rebase once again..
* fix
* make ci happy?
* fix rebase issues
* docs
* rebase, but prolly needs to be improved
* refactor out from nasty inheritance to static methods
* fix style
* work around optional
* refactor & integrate some changes
* update column_type
* add tests by dencrane
* set utc
* fix ref file
* fix tests
* use MergeTree instead of SummingMergeTree
* mark MergeTreeDataPart::getBlock as const
* address a few comments
* compute module function name size at compile time
* simplify branching in getPartitionAstFieldsCount
* remove column_indexes argument
* merge getBlock with buildBlock
* add some const specifiers
* small adjustments
* remove no longer needed isNull check
* use std::min and max to update global min max idx
* add some assertions
* forward declare some symbols
* fix grammar
* forward decl
* try to fix build..
* remove IFunction forward decl
* Revert "use std::min and max to update global min max idx"
This reverts commit b2fe79dda7.
* Revert "remove no longer needed isNull check"
This reverts commit 129db2610f.
* Revert "Revert "remove no longer needed isNull check""
This reverts commit 9416087dd8.
* Revert "Revert "use std::min and max to update global min max idx""
This reverts commit 20246d4416.
* remove some comments
* partial use of MonotonicityCheckMatcher
* ranges
* remove KeyDescriptionMonotonicityChecker
* remove duplication of applyfunction
* move functions to anonymous namespace
* move functions to cpp
* Relax partition compatibility requirements by accepting subset, add tests from partitioned to unpartitioned
* updte reference file
* Support for partition by a, b, c to partition by a, b
* refactoring part 1
* refactoring part 2, use hyperrectangle, still not complete
* refactoring part 3, build hyperrectangle with intersection of source & destination min max columns
* Support attaching to table with partition expression of multiple expressions
* add tests
* rename method
* remove some code duplication
* draft impl for replicatedmergetree, need to dive deeper
* ship ref file
* fix impl for replicatedmergetree..
* forbid attach empty partition replicatedmergetree
* Add replicated merge tree integration tests
* add test missing files
* fix black
* do not check for monotonicity of empty partition
* add empty tests & fix replicated
* remove no longer needed buildBlockWithMinMaxINdexes
* remove column logic in buildHyperrectangle
* simplify implementation by using existing methods
* further simplify implementation
* move all MergeTreeDataPartClone private methods to .cpp file
* decrease decomposition
* use different namespaces
* reduce code duplication
* fix style
* address a few comments
* add chassert to assert arguments size on MonotonicityCheckVisitor
* remove deleteMinMaxFiles method
* remove useless checks from sanitycheck
* add tests for attach partition (not id)
* Remove sanityCheckASTPartition and bring back conditional getPartitionIDFromQuery
* remove empty block comment
* small fixes
* fix formatting
* add missing include
* remove duplicate iuncludes
* trigger ci
* reduce some code duplication
* use updated partition id on replicatedmergetree
* fix build
* fix build
* small refactor
* do not use insert increment on fetch part
* remove duplicate includes
* add one more integ test
* black
* black
* rely on partition exp instead of partition id on replicated part fetch to decide if it is a different partition exp
* add one more integ test
* add order by clause
* fix black
---------
Co-authored-by: Alexey Milovidov <milovidov@clickhouse.com>
This commit fixes the data race that was introduced in 3067ca6. That commit
added scope lock for `getInstance` and `setInstance`, but `getInstance`
returned a pointer that could be invalidated between call for `getInstance` and
actual usage of the pointer.
This commit changes `SensitiveDataMasker` singleton from `unique_ptr` to
`shared_ptr`, so even if `setInstance` will be called right after
`getInstance`, the pointer returned from `getInstance` will live long enough to
call `wipeSensitiveData` on it.