Merge branch 'master' of github.com:ClickHouse/ClickHouse into system-symbols

This commit is contained in:
Alexey Milovidov 2023-11-07 19:43:19 +01:00
commit ca83da14f2
179 changed files with 2765 additions and 1922 deletions

View File

@@ -33,7 +33,12 @@ jobs:
       - name: Python unit tests
         run: |
           cd "$GITHUB_WORKSPACE/tests/ci"
-          python3 -m unittest discover -s . -p '*_test.py'
+          echo "Testing the main ci directory"
+          python3 -m unittest discover -s . -p 'test_*.py'
+          for dir in *_lambda/; do
+            echo "Testing $dir"
+            python3 -m unittest discover -s "$dir" -p 'test_*.py'
+          done
   DockerHubPushAarch64:
     runs-on: [self-hosted, style-checker-aarch64]
     needs: CheckLabels
@@ -69,7 +74,7 @@ jobs:
           name: changed_images_amd64
           path: ${{ runner.temp }}/docker_images_check/changed_images_amd64.json
   DockerHubPush:
-    needs: [DockerHubPushAmd64, DockerHubPushAarch64]
+    needs: [DockerHubPushAmd64, DockerHubPushAarch64, PythonUnitTests]
     runs-on: [self-hosted, style-checker]
     steps:
       - name: Check out repository code

View File

@@ -19,7 +19,12 @@ jobs:
       - name: Python unit tests
         run: |
           cd "$GITHUB_WORKSPACE/tests/ci"
-          python3 -m unittest discover -s . -p '*_test.py'
+          echo "Testing the main ci directory"
+          python3 -m unittest discover -s . -p 'test_*.py'
+          for dir in *_lambda/; do
+            echo "Testing $dir"
+            python3 -m unittest discover -s "$dir" -p 'test_*.py'
+          done
   DockerHubPushAarch64:
     runs-on: [self-hosted, style-checker-aarch64]
     steps:

View File

@@ -47,10 +47,10 @@ jobs:
         run: |
           cd "$GITHUB_WORKSPACE/tests/ci"
           echo "Testing the main ci directory"
-          python3 -m unittest discover -s . -p '*_test.py'
+          python3 -m unittest discover -s . -p 'test_*.py'
           for dir in *_lambda/; do
             echo "Testing $dir"
-            python3 -m unittest discover -s "$dir" -p '*_test.py'
+            python3 -m unittest discover -s "$dir" -p 'test_*.py'
           done
   DockerHubPushAarch64:
     needs: CheckLabels

View File

@@ -67,22 +67,30 @@ Implementations of `ReadBuffer`/`WriteBuffer` are used for working with files an
 Read/WriteBuffers only deal with bytes. There are functions from `ReadHelpers` and `WriteHelpers` header files to help with formatting input/output. For example, there are helpers to write a number in decimal format.
 
-Lets look at what happens when you want to write a result set in `JSON` format to stdout. You have a result set ready to be fetched from `IBlockInputStream`. You create `WriteBufferFromFileDescriptor(STDOUT_FILENO)` to write bytes to stdout. You create `JSONRowOutputStream`, initialized with that `WriteBuffer`, to write rows in `JSON` to stdout. You create `BlockOutputStreamFromRowOutputStream` on top of it, to represent it as `IBlockOutputStream`. Then you call `copyData` to transfer data from `IBlockInputStream` to `IBlockOutputStream`, and everything works. Internally, `JSONRowOutputStream` will write various JSON delimiters and call the `IDataType::serializeTextJSON` method with a reference to `IColumn` and the row number as arguments. Consequently, `IDataType::serializeTextJSON` will call a method from `WriteHelpers.h`: for example, `writeText` for numeric types and `writeJSONString` for `DataTypeString`.
+Let's examine what happens when you want to write a result set in `JSON` format to stdout.
+You have a result set ready to be fetched from a pulling `QueryPipeline`.
+First, you create a `WriteBufferFromFileDescriptor(STDOUT_FILENO)` to write bytes to stdout.
+Next, you connect the result from the query pipeline to `JSONRowOutputFormat`, which is initialized with that `WriteBuffer`, to write rows in `JSON` format to stdout.
+This can be done via the `complete` method, which turns a pulling `QueryPipeline` into a completed `QueryPipeline`.
+Internally, `JSONRowOutputFormat` will write various JSON delimiters and call the `IDataType::serializeTextJSON` method with a reference to `IColumn` and the row number as arguments. Consequently, `IDataType::serializeTextJSON` will call a method from `WriteHelpers.h`: for example, `writeText` for numeric types and `writeJSONString` for `DataTypeString`.
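Expressed as code, the flow above looks roughly like the sketch below. It is a hedged illustration, not the exact API: `pipeline` (a pulling `QueryPipeline`) and `context` are assumed to exist, and the argument lists are simplified relative to the real `FormatFactory` and executor interfaces.

```cpp
/// Sketch only: includes and setup omitted; signatures approximate.
WriteBufferFromFileDescriptor out(STDOUT_FILENO);        /// bytes -> stdout
auto format = FormatFactory::instance().getOutputFormat(
    "JSON", out, pipeline.getHeader(), context);         /// rows -> JSON text
pipeline.complete(std::move(format));                    /// pulling -> completed
CompletedPipelineExecutor executor(pipeline);
executor.execute();                                      /// drive the whole pipeline
```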
 ## Tables {#tables}
 
 The `IStorage` interface represents tables. Different implementations of that interface are different table engines. Examples are `StorageMergeTree`, `StorageMemory`, and so on. Instances of these classes are just tables.
 
-The key `IStorage` methods are `read` and `write`. There are also `alter`, `rename`, `drop`, and so on. The `read` method accepts the following arguments: the set of columns to read from a table, the `AST` query to consider, and the desired number of streams to return. It returns one or multiple `IBlockInputStream` objects and information about the stage of data processing that was completed inside a table engine during query execution.
+The key methods in `IStorage` are `read` and `write`, along with others such as `alter`, `rename`, and `drop`. The `read` method accepts the following arguments: a set of columns to read from a table, the `AST` query to consider, and the desired number of streams. It returns a `Pipe`.
 
-In most cases, the read method is only responsible for reading the specified columns from a table, not for any further data processing. All further data processing is done by the query interpreter and is outside the responsibility of `IStorage`.
+In most cases, the read method is responsible only for reading the specified columns from a table, not for any further data processing.
+All subsequent data processing is handled by another part of the pipeline, which falls outside the responsibility of `IStorage`.
 
 But there are notable exceptions:
 
 - The AST query is passed to the `read` method, and the table engine can use it to derive index usage and to read less data from a table.
 - Sometimes the table engine can process data itself to a specific stage. For example, `StorageDistributed` can send a query to remote servers, ask them to process data to a stage where data from different remote servers can be merged, and return that preprocessed data. The query interpreter then finishes processing the data.
 
-The table's `read` method can return multiple `IBlockInputStream` objects to allow parallel data processing. These multiple block input streams can read from a table in parallel. Then you can wrap these streams with various transformations (such as expression evaluation or filtering) that can be calculated independently and create a `UnionBlockInputStream` on top of them, to read from multiple streams in parallel.
+The table's `read` method can return a `Pipe` consisting of multiple `Processors`. These `Processors` can read from a table in parallel.
+Then, you can connect these processors with various other transformations (such as expression evaluation or filtering), which can be calculated independently.
+You can then create a `QueryPipeline` on top of them and execute it via `PipelineExecutor`.
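As a hypothetical sketch of that arrangement (the `read` call below abbreviates the real argument list, and `storage`, `storage_snapshot`, `query_info`, and `context` are assumed to be prepared by the caller):

```cpp
/// Sketch only: a Pipe with several parallel sources, wrapped into a pipeline.
Pipe pipe = storage->read(
    column_names, storage_snapshot, query_info, context,
    QueryProcessingStage::FetchColumns, max_block_size,
    /*num_streams=*/ 4);                      /// up to 4 parallel reading processors
QueryPipeline pipeline(std::move(pipe));      /// the processors become a pipeline
/// ... connect transforms (filtering, expressions) and execute via an executor ...
```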
 There are also `TableFunction`s. These are functions that return a temporary `IStorage` object to use in the `FROM` clause of a query.
@@ -98,9 +106,19 @@ A hand-written recursive descent parser parses a query. For example, `ParserSele
 ## Interpreters {#interpreters}
 
-Interpreters are responsible for creating the query execution pipeline from an `AST`. There are simple interpreters, such as `InterpreterExistsQuery` and `InterpreterDropQuery`, or the more sophisticated `InterpreterSelectQuery`. The query execution pipeline is a combination of block input or output streams. For example, the result of interpreting the `SELECT` query is the `IBlockInputStream` to read the result set from; the result of the `INSERT` query is the `IBlockOutputStream` to write data for insertion to, and the result of interpreting the `INSERT SELECT` query is the `IBlockInputStream` that returns an empty result set on the first read, but that copies data from `SELECT` to `INSERT` at the same time.
+Interpreters are responsible for creating the query execution pipeline from an AST. There are simple interpreters, such as `InterpreterExistsQuery` and `InterpreterDropQuery`, as well as the more sophisticated `InterpreterSelectQuery`.
+
+The query execution pipeline is a combination of processors that can consume and produce chunks (sets of columns with specific types).
+A processor communicates via ports and can have multiple input ports and multiple output ports.
+A more detailed description can be found in [src/Processors/IProcessor.h](https://github.com/ClickHouse/ClickHouse/blob/master/src/Processors/IProcessor.h).
+
+For example, the result of interpreting the `SELECT` query is a "pulling" `QueryPipeline` which has a special output port to read the result set from.
+The result of the `INSERT` query is a "pushing" `QueryPipeline` with an input port to write data for insertion.
+And the result of interpreting the `INSERT SELECT` query is a "completed" `QueryPipeline` that has no inputs or outputs but copies data from `SELECT` to `INSERT` simultaneously.
 
-`InterpreterSelectQuery` uses `ExpressionAnalyzer` and `ExpressionActions` machinery for query analysis and transformations. This is where most rule-based query optimizations are done. `ExpressionAnalyzer` is quite messy and should be rewritten: various query transformations and optimizations should be extracted to separate classes to allow modular transformations of query.
+`InterpreterSelectQuery` uses `ExpressionAnalyzer` and `ExpressionActions` machinery for query analysis and transformations. This is where most rule-based query optimizations are performed. `ExpressionAnalyzer` is quite messy and should be rewritten: various query transformations and optimizations should be extracted into separate classes to allow for modular transformations of the query.
+
+To address current problems that exist in interpreters, a new `InterpreterSelectQueryAnalyzer` is being developed. It is a new version of `InterpreterSelectQuery` that does not use `ExpressionAnalyzer` and introduces an additional abstraction level between `AST` and `QueryPipeline` called `QueryTree`. It is not production-ready yet, but it can be tested with the `allow_experimental_analyzer` flag.
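A caller that drains a "pulling" pipeline might look like this sketch; `PullingPipelineExecutor` lives in `src/Processors/Executors`, but the setup here is simplified and `process(block)` is a hypothetical consumer:

```cpp
/// Sketch only: build a SELECT pipeline and pull result blocks from it.
InterpreterSelectQuery interpreter(query_ast, context, SelectQueryOptions());
BlockIO io = interpreter.execute();               /// constructs the QueryPipeline
PullingPipelineExecutor executor(io.pipeline);    /// pull-style execution
Block block;
while (executor.pull(block))                      /// one result block at a time
    process(block);                               /// hypothetical consumer
```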
 ## Functions {#functions}

View File

@@ -345,7 +345,7 @@ struct ExtractDomain
 **7.** For abstract classes (interfaces) you can add the `I` prefix.
 
 ``` cpp
-class IBlockInputStream
+class IProcessor
 ```
 
 **8.** If you use a variable locally, you can use the short name.

View File

@@ -897,6 +897,12 @@ Use DOS/Windows-style line separator (CRLF) in CSV instead of Unix style (LF).
 Disabled by default.
 
+### input_format_csv_allow_cr_end_of_line {#input_format_csv_allow_cr_end_of_line}
+
+If set to true, CR (\\r) is allowed at the end of a line when not followed by LF (\\n).
+
+Disabled by default.
+
 ### input_format_csv_enum_as_number {#input_format_csv_enum_as_number}
 
 When enabled, always treat enum values as enum ids for CSV input format. It's recommended to enable this setting if data contains only enum ids to optimize enum parsing.

View File

@@ -3310,22 +3310,11 @@ Possible values:
 Default value: `0`.
 
-## use_mysql_types_in_show_columns {#use_mysql_types_in_show_columns}
-
-Show the names of MySQL data types corresponding to ClickHouse data types in [SHOW COLUMNS](../../sql-reference/statements/show.md#show_columns).
-
-Possible values:
-
-- 0 - Show names of native ClickHouse data types.
-- 1 - Show names of MySQL data types corresponding to ClickHouse data types.
-
-Default value: `0`.
-
 ## mysql_map_string_to_text_in_show_columns {#mysql_map_string_to_text_in_show_columns}
 
 When enabled, [String](../../sql-reference/data-types/string.md) ClickHouse data type will be displayed as `TEXT` in [SHOW COLUMNS](../../sql-reference/statements/show.md#show_columns).
 
-Has effect only when [use_mysql_types_in_show_columns](#use_mysql_types_in_show_columns) is enabled.
+Has an effect only when the connection is made through the MySQL wire protocol.
 
 - 0 - Use `BLOB`.
 - 1 - Use `TEXT`.
 
@@ -3336,7 +3325,7 @@ Default value: `0`.
 When enabled, [FixedString](../../sql-reference/data-types/fixedstring.md) ClickHouse data type will be displayed as `TEXT` in [SHOW COLUMNS](../../sql-reference/statements/show.md#show_columns).
 
-Has effect only when [use_mysql_types_in_show_columns](#use_mysql_types_in_show_columns) is enabled.
+Has an effect only when the connection is made through the MySQL wire protocol.
 
 - 0 - Use `BLOB`.
 - 1 - Use `TEXT`.
 
@@ -4812,3 +4801,10 @@ LIFETIME(MIN 0 MAX 3600)
 LAYOUT(COMPLEX_KEY_HASHED_ARRAY())
 SETTINGS(dictionary_use_async_executor=1, max_threads=8);
 ```
+
+## storage_metadata_write_full_object_key {#storage_metadata_write_full_object_key}
+
+When set to `true`, the metadata files are written with the `VERSION_FULL_OBJECT_KEY` format version. With that format, full object storage key names are written to the metadata files.
+
+When set to `false`, the metadata files are written with the previous format version, `VERSION_INLINE_DATA`. With that format, only the suffixes of object storage key names are written to the metadata files. The prefix for all object storage key names is set in configuration files in the `storage_configuration.disks` section.
+
+Default value: `false`.

View File

@@ -35,27 +35,25 @@ WITH arrayMap(x -> demangle(addressToSymbol(x)), trace) AS all SELECT thread_nam
 ``` text
 Row 1:
 ──────
-thread_name: clickhouse-serv
-
-thread_id: 686
-query_id: 1a11f70b-626d-47c1-b948-f9c7b206395d
-res: sigqueue
-DB::StorageSystemStackTrace::fillData(std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, DB::Context const&, DB::SelectQueryInfo const&) const
-DB::IStorageSystemOneBlock<DB::StorageSystemStackTrace>::read(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, DB::SelectQueryInfo const&, DB::Context const&, DB::QueryProcessingStage::Enum, unsigned long, unsigned int)
-DB::InterpreterSelectQuery::executeFetchColumns(DB::QueryProcessingStage::Enum, DB::QueryPipeline&, std::__1::shared_ptr<DB::PrewhereInfo> const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&)
-DB::InterpreterSelectQuery::executeImpl(DB::QueryPipeline&, std::__1::shared_ptr<DB::IBlockInputStream> const&, std::__1::optional<DB::Pipe>)
-DB::InterpreterSelectQuery::execute()
-DB::InterpreterSelectWithUnionQuery::execute()
-DB::executeQueryImpl(char const*, char const*, DB::Context&, bool, DB::QueryProcessingStage::Enum, bool, DB::ReadBuffer*)
-DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::Context&, bool, DB::QueryProcessingStage::Enum, bool)
-DB::TCPHandler::runImpl()
-DB::TCPHandler::run()
-Poco::Net::TCPServerConnection::start()
-Poco::Net::TCPServerDispatcher::run()
-Poco::PooledThread::run()
-Poco::ThreadImpl::runnableEntry(void*)
-start_thread
-__clone
+thread_name: QueryPipelineEx
+thread_id: 743490
+query_id: dc55a564-febb-4e37-95bb-090ef182c6f1
+res: memcpy
+large_ralloc
+arena_ralloc
+do_rallocx
+Allocator<true, true>::realloc(void*, unsigned long, unsigned long, unsigned long)
+HashTable<unsigned long, HashMapCell<unsigned long, char*, HashCRC32<unsigned long>, HashTableNoState, PairNoInit<unsigned long, char*>>, HashCRC32<unsigned long>, HashTableGrowerWithPrecalculation<8ul>, Allocator<true, true>>::resize(unsigned long, unsigned long)
+void DB::Aggregator::executeImplBatch<false, false, true, DB::AggregationMethodOneNumber<unsigned long, HashMapTable<unsigned long, HashMapCell<unsigned long, char*, HashCRC32<unsigned long>, HashTableNoState, PairNoInit<unsigned long, char*>>, HashCRC32<unsigned long>, HashTableGrowerWithPrecalculation<8ul>, Allocator<true, true>>, true, false>>(DB::AggregationMethodOneNumber<unsigned long, HashMapTable<unsigned long, HashMapCell<unsigned long, char*, HashCRC32<unsigned long>, HashTableNoState, PairNoInit<unsigned long, char*>>, HashCRC32<unsigned long>, HashTableGrowerWithPrecalculation<8ul>, Allocator<true, true>>, true, false>&, DB::AggregationMethodOneNumber<unsigned long, HashMapTable<unsigned long, HashMapCell<unsigned long, char*, HashCRC32<unsigned long>, HashTableNoState, PairNoInit<unsigned long, char*>>, HashCRC32<unsigned long>, HashTableGrowerWithPrecalculation<8ul>, Allocator<true, true>>, true, false>::State&, DB::Arena*, unsigned long, unsigned long, DB::Aggregator::AggregateFunctionInstruction*, bool, char*) const
+DB::Aggregator::executeImpl(DB::AggregatedDataVariants&, unsigned long, unsigned long, std::__1::vector<DB::IColumn const*, std::__1::allocator<DB::IColumn const*>>&, DB::Aggregator::AggregateFunctionInstruction*, bool, bool, char*) const
+DB::Aggregator::executeOnBlock(std::__1::vector<COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::immutable_ptr<DB::IColumn>>>, unsigned long, unsigned long, DB::AggregatedDataVariants&, std::__1::vector<DB::IColumn const*, std::__1::allocator<DB::IColumn const*>>&, std::__1::vector<std::__1::vector<DB::IColumn const*, std::__1::allocator<DB::IColumn const*>>, std::__1::allocator<std::__1::vector<DB::IColumn const*, std::__1::allocator<DB::IColumn const*>>>>&, bool&) const
+DB::AggregatingTransform::work()
+DB::ExecutionThreadContext::executeTask()
+DB::PipelineExecutor::executeStepImpl(unsigned long, std::__1::atomic<bool>*)
+void std::__1::__function::__policy_invoker<void ()>::__call_impl<std::__1::__function::__default_alloc_func<DB::PipelineExecutor::spawnThreads()::$_0, void ()>>(std::__1::__function::__policy_storage const*)
+ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::worker(std::__1::__list_iterator<ThreadFromGlobalPoolImpl<false>, void*>)
+void std::__1::__function::__policy_invoker<void ()>::__call_impl<std::__1::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false>::ThreadFromGlobalPoolImpl<void ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::scheduleImpl<void>(std::__1::function<void ()>, Priority, std::__1::optional<unsigned long>, bool)::'lambda0'()>(void&&)::'lambda'(), void ()>>(std::__1::__function::__policy_storage const*)
+void* std::__1::__thread_proxy[abi:v15000]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void ThreadPoolImpl<std::__1::thread>::scheduleImpl<void>(std::__1::function<void ()>, Priority, std::__1::optional<unsigned long>, bool)::'lambda0'()>>(void*)
 ```
 
 Getting filenames and line numbers in ClickHouse source code:

View File

@@ -12,6 +12,7 @@ A client application to interact with clickhouse-keeper by its native protocol.
 - `-q QUERY`, `--query=QUERY` — Query to execute. If this parameter is not passed, `clickhouse-keeper-client` will start in interactive mode.
 - `-h HOST`, `--host=HOST` — Server host. Default value: `localhost`.
 - `-p N`, `--port=N` — Server port. Default value: 9181
+- `-c FILE_PATH`, `--config-file=FILE_PATH` — Set the path to the config file from which to get the connection string. Default value: `config.xml`.
 - `--connection-timeout=TIMEOUT` — Set connection timeout in seconds. Default value: 10s.
 - `--session-timeout=TIMEOUT` — Set session timeout in seconds. Default value: 10s.
 - `--operation-timeout=TIMEOUT` — Set operation timeout in seconds. Default value: 10s.

View File

@@ -2172,80 +2172,6 @@ Result:
 └─────────────────────┘
 ```
 
-## arrayRandomSample
-
-Function `arrayRandomSample` returns a subset with `samples`-many random elements of an input array. If `samples` exceeds the size of the input array, the sample size is limited to the size of the array. In this case, all elements of the input array are returned, but the order is not guaranteed. The function can handle both flat arrays and nested arrays.
-
-**Syntax**
-
-```sql
-arrayRandomSample(arr, samples)
-```
-
-**Arguments**
-
-- `arr` — The input array from which to sample elements. This may be flat or nested arrays.
-- `samples` — An unsigned integer specifying the number of elements to include in the random sample.
-
-**Returned Value**
-
-- An array containing a random sample of elements from the input array.
-
-**Examples**
-
-Query:
-
-```sql
-SELECT arrayRandomSample(['apple', 'banana', 'cherry', 'date'], 2) as res;
-```
-
-Result:
-
-```
-┌─res────────────────┐
-│ ['banana','apple'] │
-└────────────────────┘
-```
-
-Query:
-
-```sql
-SELECT arrayRandomSample([[1, 2], [3, 4], [5, 6]], 2) as res;
-```
-
-Result:
-
-```
-┌─res───────────┐
-│ [[3,4],[5,6]] │
-└───────────────┘
-```
-
-Query:
-
-```sql
-SELECT arrayRandomSample([1, 2, 3, 4, 5], 0) as res;
-```
-
-Result:
-
-```
-┌─res─┐
-│ []  │
-└─────┘
-```
-
-Query:
-
-```sql
-SELECT arrayRandomSample([1, 2, 3], 5) as res;
-```
-
-Result:
-
-```
-┌─res─────┐
-│ [3,1,2] │
-└─────────┘
-```
-
 ## Distance functions
 
 All supported functions are described in [distance functions documentation](../../sql-reference/functions/distance-functions.md).

View File

@@ -2760,10 +2760,13 @@ message Root
 Returns a formatted, possibly multi-line, version of the given SQL query.
 
+Throws an exception if the query is not well-formed. To return `NULL` instead, function `formatQueryOrNull()` may be used.
+
 **Syntax**
 
 ```sql
 formatQuery(query)
+formatQueryOrNull(query)
 ```
 
 **Arguments**
 
@@ -2796,10 +2799,13 @@ WHERE (a > 3) AND (b < 3) │
 Like formatQuery() but the returned formatted string contains no line breaks.
 
+Throws an exception if the query is not well-formed. To return `NULL` instead, function `formatQuerySingleLineOrNull()` may be used.
+
 **Syntax**
 
 ```sql
 formatQuerySingleLine(query)
+formatQuerySingleLineOrNull(query)
 ```
 
 **Arguments**

View File

@@ -1371,6 +1371,86 @@ Result:
 └──────────────────┘
 ```
 
+## byteHammingDistance
+
+Calculates the [hamming distance](https://en.wikipedia.org/wiki/Hamming_distance) between two byte strings.
+
+**Syntax**
+
+```sql
+byteHammingDistance(string1, string2)
+```
+
+**Examples**
+
+``` sql
+SELECT byteHammingDistance('karolin', 'kathrin');
+```
+
+Result:
+
+``` text
+┌─byteHammingDistance('karolin', 'kathrin')─┐
+│                                         3 │
+└───────────────────────────────────────────┘
+```
+
+Alias: mismatches
+
+## stringJaccardIndex
+
+Calculates the [Jaccard similarity index](https://en.wikipedia.org/wiki/Jaccard_index) between two byte strings.
+
+**Syntax**
+
+```sql
+stringJaccardIndex(string1, string2)
+```
+
+**Examples**
+
+``` sql
+SELECT stringJaccardIndex('clickhouse', 'mouse');
+```
+
+Result:
+
+``` text
+┌─stringJaccardIndex('clickhouse', 'mouse')─┐
+│                                       0.4 │
+└───────────────────────────────────────────┘
+```
+
+## stringJaccardIndexUTF8
+
+Like [stringJaccardIndex](#stringJaccardIndex) but for UTF8-encoded strings.
+
+## editDistance
+
+Calculates the [edit distance](https://en.wikipedia.org/wiki/Edit_distance) between two byte strings.
+
+**Syntax**
+
+```sql
+editDistance(string1, string2)
+```
+
+**Examples**
+
+``` sql
+SELECT editDistance('clickhouse', 'mouse');
+```
+
+Result:
+
+``` text
+┌─editDistance('clickhouse', 'mouse')─┐
+│                                   6 │
+└─────────────────────────────────────┘
+```
+
+Alias: levenshteinDistance
 
 ## initcap
 
 Convert the first letter of each word to upper case and the rest to lower case. Words are sequences of alphanumeric characters separated by non-alphanumeric characters.

View File

@@ -681,79 +681,3 @@ Like [hasSubsequence](#hasSubsequence) but assumes `haystack` and `needle` are U
 ## hasSubsequenceCaseInsensitiveUTF8
 
 Like [hasSubsequenceUTF8](#hasSubsequenceUTF8) but searches case-insensitively.
-
-## byteHammingDistance
-
-Calculates the [hamming distance](https://en.wikipedia.org/wiki/Hamming_distance) between two byte strings.
-
-**Syntax**
-
-```sql
-byteHammingDistance(string2, string2)
-```
-
-**Examples**
-
-``` sql
-SELECT byteHammingDistance('abc', 'ab') ;
-```
-
-Result:
-
-``` text
-┌─byteHammingDistance('abc', 'ab')─┐
-│                                1 │
-└──────────────────────────────────┘
-```
-
-- Alias: mismatches
-
-## jaccardIndex
-
-Calculates the [Jaccard similarity index](https://en.wikipedia.org/wiki/Jaccard_index) between two byte strings.
-
-**Syntax**
-
-```sql
-byteJaccardIndex(string1, string2)
-```
-
-**Examples**
-
-``` sql
-SELECT jaccardIndex('clickhouse', 'mouse');
-```
-
-Result:
-
-``` text
-┌─jaccardIndex('clickhouse', 'mouse')─┐
-│ 0.4 │
-└─────────────────────────────────────────┘
-```
-
-## editDistance
-
-Calculates the [edit distance](https://en.wikipedia.org/wiki/Edit_distance) between two byte strings.
-
-**Syntax**
-
-```sql
-editDistance(string1, string2)
-```
-
-**Examples**
-
-``` sql
-SELECT editDistance('clickhouse', 'mouse');
-```
-
-Result:
-
-``` text
-┌─editDistance('clickhouse', 'mouse')─┐
-│ 6 │
-└─────────────────────────────────────────┘
-```
-
-- Alias: levenshteinDistance

View File

@@ -207,7 +207,7 @@ The optional keyword `FULL` causes the output to include the collation, comment
 The statement produces a result table with the following structure:
 
 - `field` - The name of the column (String)
-- `type` - The column data type. If setting `[use_mysql_types_in_show_columns](../../operations/settings/settings.md#use_mysql_types_in_show_columns) = 1` (default: 0), then the equivalent type name in MySQL is shown. (String)
+- `type` - The column data type. If the query was made through the MySQL wire protocol, then the equivalent type name in MySQL is shown. (String)
 - `null` - `YES` if the column data type is Nullable, `NO` otherwise (String)
 - `key` - `PRI` if the column is part of the primary key, `SOR` if the column is part of the sorting key, empty otherwise (String)
 - `default` - Default expression of the column if it is of type `ALIAS`, `DEFAULT`, or `MATERIALIZED`, otherwise `NULL`. (Nullable(String))

View File

@@ -7,7 +7,7 @@ keywords: [gcs, bucket]
 # gcs Table Function
 
-Provides a table-like interface to select/insert files in [Google Cloud Storage](https://cloud.google.com/storage/).
+Provides a table-like interface to `SELECT` and `INSERT` data from [Google Cloud Storage](https://cloud.google.com/storage/). Requires the [`Storage Object User` IAM role](https://cloud.google.com/storage/docs/access-control/iam-roles).
 
 **Syntax**

View File

@@ -49,21 +49,9 @@ ClickHouse is a full-fledged column-oriented DBMS. Data
 Blocks are created for every processed chunk of data. Note that the same types of computations, column names, and column types are reused across different blocks, and only the column data changes. It is better to separate the block data from the block header, because small blocks have a high overhead of temporary strings for copying shared_ptrs and column names.
 
-## Block Streams {#block-streams}
+## Processors
 
-Block streams process data. We use block streams to read data, transform data, or write data somewhere. `IBlockInputStream` provides a `read` method to fetch the next block while one is available, and a `write` method to push a block somewhere.
-
-Streams are responsible for:
-
-1. Reading from and writing to a table. The table just returns a stream for reading or writing blocks.
-2. Implementing data formats. For example, when outputting data to a terminal in `Pretty` format, you create a block output stream that formats the blocks pushed into it.
-3. Transforming data. Say you have an `IBlockInputStream` and want to create a filtered stream. You create a `FilterBlockInputStream` and initialize it with your stream. Then, when you pull a block from `FilterBlockInputStream`, it pulls a block from the source stream, filters it, and returns the filtered block to you. Query execution pipelines are built this way.
-
-There are also more complex transformations. For example, when you pull blocks from `AggregatingBlockInputStream`, it reads all the data from its source, aggregates it, and returns a stream of aggregated data to you. Another example: the `UnionBlockInputStream` constructor accepts many input sources and a number of threads. Such a stream works in several threads and reads from its sources in parallel.
-
-> Block streams use a "pulling" approach to control flow: when you pull a block from the first stream, it in turn pulls the required blocks from the nested streams, and that is how the whole execution pipeline works. Neither "pull" nor "push" has an obvious advantage, because the control flow is implicit, which limits the implementation of features such as simultaneous execution of multiple queries (merging several pipelines together). This limitation can be overcome with coroutines, or simply by running extra threads that wait for each other. We may have more possibilities if we make the control flow explicit: if we move the logic for passing data from one computational unit to another outside of those computational units. Read this [article](http://journal.stuffwithstuff.com/2013/01/13/iteration-inside-and-out/) for a deeper look.
-
-It should be noted that the query execution pipeline creates temporary data at each step. We try to keep the block size small enough that the temporary data fits in the CPU cache. Under that assumption, writing and reading temporary data is almost free compared to other computations. We could consider an alternative: fusing many operations in the pipeline together. This could make the pipeline as short as possible and remove much of the temporary data, which could be an advantage, but such an approach also has drawbacks. For example, a split pipeline makes it easy to implement caching of intermediate data, reusing intermediate data from similar queries running at the same time, and merging pipelines for similar queries.
+See the description in [src/Processors/IProcessor.h](https://github.com/ClickHouse/ClickHouse/blob/master/src/Processors/IProcessor.h) in the source code.
 
 ## Formats {#formats}
 
@@ -81,13 +69,16 @@ ClickHouse is a full-fledged column-oriented DBMS. Data
 Read/write buffers deal only with bytes. The `ReadHelpers` and `WriteHelpers` header files declare functions that help with formatting input/output. For example, there are helpers to write a number in decimal format.
 
-Let's look at what happens when you want to write a result set in `JSON` format to stdout. You have a result set ready to be fetched from `IBlockInputStream`. You create `WriteBufferFromFileDescriptor(STDOUT_FILENO)` to write bytes to stdout. You create `JSONRowOutputStream`, initialized with that `WriteBuffer`, to write rows in `JSON` to stdout. In addition, you create `BlockOutputStreamFromRowOutputStream`, implementing `IBlockOutputStream`. Then `copyData` is called to transfer data from `IBlockInputStream` to `IBlockOutputStream`, and everything works. Internally, `JSONRowOutputStream` writes various JSON delimiters and calls the `IDataType::serializeTextJSON` method with a reference to `IColumn` and the row number as arguments. Consequently, `IDataType::serializeTextJSON` calls a method from `WriteHelpers.h`: for example, `writeText` for numeric types and `writeJSONString` for `DataTypeString`.
+Let's look at what happens when you want to write a result set in `JSON` format to stdout. You have a result set ready to be fetched from a `QueryPipeline`. You create `WriteBufferFromFileDescriptor(STDOUT_FILENO)` to write bytes to stdout. You create `JSONRowOutputFormat`, initialized with that `WriteBuffer`, to write rows in `JSON` to stdout.
+To connect the output of the `QueryPipeline` to the format, you can use the `complete` method, which turns the `QueryPipeline` into a completed `QueryPipeline`.
+Internally, `JSONRowOutputFormat` writes various JSON delimiters and calls the `IDataType::serializeTextJSON` method with a reference to `IColumn` and the row number as arguments. Consequently, `IDataType::serializeTextJSON` calls a method from `WriteHelpers.h`: for example, `writeText` for numeric types and `writeJSONString` for `DataTypeString`.
 
 ## Tables {#tables}
 
 The `IStorage` interface represents a table. Different table engines are implementations of this interface, for example `StorageMergeTree`, `StorageMemory`, and so on. Instances of these classes are simply tables.
 
-The key `IStorage` methods are `read` and `write`. There are also other options — `alter`, `rename`, `drop`, and so on. The `read` method accepts the following arguments: a set of columns to read from the table, the `AST` query, and the desired number of output streams. It returns one or several `IBlockInputStream` objects and information about the stage of data processing that was completed inside the table engine during query execution.
+The key `IStorage` methods are `read` and `write`. There are also other options — `alter`, `rename`, `drop`, and so on.
+The `read` method accepts the following arguments: a set of columns to read from the table, the `AST` query, and the desired number of output streams; it returns a `Pipe`.
 
 In most cases, the read method is responsible only for reading the specified columns from the table, not for any further data processing. All further data processing is done by the query interpreter and is outside the responsibility of `IStorage`.
 
@@ -96,7 +87,9 @@ ClickHouse is a full-fledged column-oriented DBMS. Data
 - The AST query passed to the `read` method can be used by the table engine to determine whether an index can be used and to read less data from the table.
 - Sometimes the table engine can itself process data up to a certain stage. For example, `StorageDistributed` can send a query to remote servers, ask them to process the data up to a stage where data from the different remote servers can be merged, and return that preprocessed data. The query interpreter then finishes processing the data.
 
-The `read` method can return several `IBlockInputStream` objects, allowing parallel data processing. These block input streams can read from the table in parallel. You can then wrap these streams in various transformations (such as expression evaluation or filtering) that can be computed independently, and create a `UnionBlockInputStream` on top of them to read from several streams in parallel.
+The `read` method can return a `Pipe` consisting of several processors. Each of these processors can read data in parallel.
+You can then connect these processors with other transformations (such as expression evaluation or filtering), which can be computed independently.
+Then, having created a `QueryPipeline` on top of them, you can execute the pipeline with `PipelineExecutor`.
 
 There are other options as well. For example, `TableFunction` returns a temporary `IStorage` object that can be used in the `FROM` clause.
 
@@ -112,10 +105,18 @@ ClickHouse is a full-fledged column-oriented DBMS. Data
 ## Interpreters {#interpreters}
 
-Interpreters are responsible for creating the query execution pipeline from an `AST`. There are simple interpreters, such as `InterpreterExistsQuery` and `InterpreterDropQuery`, and the more complex `InterpreterSelectQuery`. The query execution pipeline is a combination of block input and output streams. For example, the result of interpreting a `SELECT` query is an `IBlockInputStream` for reading the result set; the result of interpreting an `INSERT` query is an `IBlockOutputStream` for writing the data to be inserted; and the result of interpreting an `INSERT SELECT` query is an `IBlockInputStream` that returns an empty result set on the first read but copies data from `SELECT` to `INSERT`.
+Interpreters are responsible for creating the query execution pipeline from an `AST`. There are simple interpreters, such as `InterpreterExistsQuery` and `InterpreterDropQuery`, and the more complex `InterpreterSelectQuery`.
+The query execution pipeline is a combination of processors that can consume and also return chunks (sets of columns with their types).
+Processors exchange data via ports and can have several input and output ports.
+A more detailed description can be found in [src/Processors/IProcessor.h](https://github.com/ClickHouse/ClickHouse/blob/master/src/Processors/IProcessor.h).
+
+For example, the result of interpreting a `SELECT` query is a `QueryPipeline` with a special output port for reading the result set. The result of interpreting an `INSERT` query is a `QueryPipeline` with an input port for writing the data to be inserted. And the result of interpreting an `INSERT SELECT` query is a completed `QueryPipeline` that has no inputs or outputs but copies data from `SELECT` to `INSERT` at the same time.
 
 `InterpreterSelectQuery` uses the `ExpressionAnalyzer` and `ExpressionActions` machinery for query analysis and transformations. This is where most rule-based query optimizations are performed. `ExpressionAnalyzer` is quite messy and should be rewritten: various query transformations and optimizations should be extracted into separate classes to allow modular transformations of the query.
+
+To address the current problems in interpreters, a new `InterpreterSelectQueryAnalyzer` is being developed. It is a new version of `InterpreterSelectQuery` that does not use `ExpressionAnalyzer` and introduces an additional abstraction level between `AST` and `QueryPipeline`, called `QueryTree`. It is not yet production-ready, but it can be tested with the `allow_experimental_analyzer` flag.
 
 ## Functions {#functions}
 
 There are ordinary functions and aggregate functions. For aggregate functions, see the next section.

View File

@@ -345,7 +345,7 @@ struct ExtractDomain
 **7.** For abstract classes (interfaces), you can add the letter `I` to the beginning of the name.
 
 ``` cpp
-class IBlockInputStream
+class IProcessor
 ```
 
 **8.** If a variable is used fairly locally, you can use a short name.

View File

@@ -31,27 +31,25 @@ WITH arrayMap(x -> demangle(addressToSymbol(x)), trace) AS all SELECT thread_nam
 ``` text
 Row 1:
 ──────
-thread_name: clickhouse-serv
-
-thread_id: 686
-query_id: 1a11f70b-626d-47c1-b948-f9c7b206395d
-res: sigqueue
-DB::StorageSystemStackTrace::fillData(std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, DB::Context const&, DB::SelectQueryInfo const&) const
-DB::IStorageSystemOneBlock<DB::StorageSystemStackTrace>::read(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&, DB::SelectQueryInfo const&, DB::Context const&, DB::QueryProcessingStage::Enum, unsigned long, unsigned int)
-DB::InterpreterSelectQuery::executeFetchColumns(DB::QueryProcessingStage::Enum, DB::QueryPipeline&, std::__1::shared_ptr<DB::PrewhereInfo> const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&)
-DB::InterpreterSelectQuery::executeImpl(DB::QueryPipeline&, std::__1::shared_ptr<DB::IBlockInputStream> const&, std::__1::optional<DB::Pipe>)
-DB::InterpreterSelectQuery::execute()
-DB::InterpreterSelectWithUnionQuery::execute()
-DB::executeQueryImpl(char const*, char const*, DB::Context&, bool, DB::QueryProcessingStage::Enum, bool, DB::ReadBuffer*)
-DB::executeQuery(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, DB::Context&, bool, DB::QueryProcessingStage::Enum, bool)
-DB::TCPHandler::runImpl()
-DB::TCPHandler::run()
-Poco::Net::TCPServerConnection::start()
-Poco::Net::TCPServerDispatcher::run()
-Poco::PooledThread::run()
-Poco::ThreadImpl::runnableEntry(void*)
-start_thread
-__clone
+thread_name: QueryPipelineEx
+thread_id: 743490
+query_id: dc55a564-febb-4e37-95bb-090ef182c6f1
+res: memcpy
+large_ralloc
+arena_ralloc
+do_rallocx
+Allocator<true, true>::realloc(void*, unsigned long, unsigned long, unsigned long)
+HashTable<unsigned long, HashMapCell<unsigned long, char*, HashCRC32<unsigned long>, HashTableNoState, PairNoInit<unsigned long, char*>>, HashCRC32<unsigned long>, HashTableGrowerWithPrecalculation<8ul>, Allocator<true, true>>::resize(unsigned long, unsigned long)
+void DB::Aggregator::executeImplBatch<false, false, true, DB::AggregationMethodOneNumber<unsigned long, HashMapTable<unsigned long, HashMapCell<unsigned long, char*, HashCRC32<unsigned long>, HashTableNoState, PairNoInit<unsigned long, char*>>, HashCRC32<unsigned long>, HashTableGrowerWithPrecalculation<8ul>, Allocator<true, true>>, true, false>>(DB::AggregationMethodOneNumber<unsigned long, HashMapTable<unsigned long, HashMapCell<unsigned long, char*, HashCRC32<unsigned long>, HashTableNoState, PairNoInit<unsigned long, char*>>, HashCRC32<unsigned long>, HashTableGrowerWithPrecalculation<8ul>, Allocator<true, true>>, true, false>&, DB::AggregationMethodOneNumber<unsigned long, HashMapTable<unsigned long, HashMapCell<unsigned long, char*, HashCRC32<unsigned long>, HashTableNoState, PairNoInit<unsigned long, char*>>, HashCRC32<unsigned long>, HashTableGrowerWithPrecalculation<8ul>, Allocator<true, true>>, true, false>::State&, DB::Arena*, unsigned long, unsigned long, DB::Aggregator::AggregateFunctionInstruction*, bool, char*) const
+DB::Aggregator::executeImpl(DB::AggregatedDataVariants&, unsigned long, unsigned long, std::__1::vector<DB::IColumn const*, std::__1::allocator<DB::IColumn const*>>&, DB::Aggregator::AggregateFunctionInstruction*, bool, bool, char*) const
+DB::Aggregator::executeOnBlock(std::__1::vector<COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::immutable_ptr<DB::IColumn>>>, unsigned long, unsigned long, DB::AggregatedDataVariants&, std::__1::vector<DB::IColumn const*, std::__1::allocator<DB::IColumn const*>>&, std::__1::vector<std::__1::vector<DB::IColumn const*, std::__1::allocator<DB::IColumn const*>>, std::__1::allocator<std::__1::vector<DB::IColumn const*, std::__1::allocator<DB::IColumn const*>>>>&, bool&) const
+DB::AggregatingTransform::work()
+DB::ExecutionThreadContext::executeTask()
+DB::PipelineExecutor::executeStepImpl(unsigned long, std::__1::atomic<bool>*)
+void std::__1::__function::__policy_invoker<void ()>::__call_impl<std::__1::__function::__default_alloc_func<DB::PipelineExecutor::spawnThreads()::$_0, void ()>>(std::__1::__function::__policy_storage const*)
+ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::worker(std::__1::__list_iterator<ThreadFromGlobalPoolImpl<false>, void*>)
+void std::__1::__function::__policy_invoker<void ()>::__call_impl<std::__1::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false>::ThreadFromGlobalPoolImpl<void ThreadPoolImpl<ThreadFromGlobalPoolImpl<false>>::scheduleImpl<void>(std::__1::function<void ()>, Priority, std::__1::optional<unsigned long>, bool)::'lambda0'()>(void&&)::'lambda'(), void ()>>(std::__1::__function::__policy_storage const*)
+void* std::__1::__thread_proxy[abi:v15000]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, void ThreadPoolImpl<std::__1::thread>::scheduleImpl<void>(std::__1::function<void ()>, Priority, std::__1::optional<unsigned long>, bool)::'lambda0'()>>(void*)
 ```
 
 Getting file names and line numbers in the ClickHouse source code:

View File

@@ -1106,7 +1106,7 @@ public:
     {
         if (isInteger(data_type))
         {
-            if (isUnsignedInteger(data_type))
+            if (isUInt(data_type))
                 return std::make_unique<UnsignedIntegerModel>(seed);
             else
                 return std::make_unique<SignedIntegerModel>(seed);

View File

@@ -239,7 +239,7 @@ public:
         if constexpr (has_second_arg)
         {
             assertBinary(Name::name, types);
-            if (!isUnsignedInteger(types[1]))
+            if (!isUInt(types[1]))
                 throw Exception(
                     ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
                     "Second argument (weight) for function {} must be unsigned integer, but it has type {}",

View File

@@ -0,0 +1,68 @@
#include "ObjectStorageKey.h"

#include <Common/Exception.h>

#include <filesystem>

namespace fs = std::filesystem;

namespace DB
{

namespace ErrorCodes
{
    extern const int LOGICAL_ERROR;
}

const String & ObjectStorageKey::getPrefix() const
{
    if (!is_relative)
        throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "object key has no prefix, key: {}", key);

    return prefix;
}

const String & ObjectStorageKey::getSuffix() const
{
    if (!is_relative)
        throw DB::Exception(DB::ErrorCodes::LOGICAL_ERROR, "object key has no suffix, key: {}", key);

    return suffix;
}

const String & ObjectStorageKey::serialize() const
{
    return key;
}

ObjectStorageKey ObjectStorageKey::createAsRelative(String key_)
{
    ObjectStorageKey object_key;
    object_key.suffix = std::move(key_);
    object_key.key = object_key.suffix;
    object_key.is_relative = true;
    return object_key;
}

ObjectStorageKey ObjectStorageKey::createAsRelative(String prefix_, String suffix_)
{
    ObjectStorageKey object_key;
    object_key.prefix = std::move(prefix_);
    object_key.suffix = std::move(suffix_);

    if (object_key.prefix.empty())
        object_key.key = object_key.suffix;
    else
        object_key.key = fs::path(object_key.prefix) / object_key.suffix;

    object_key.is_relative = true;
    return object_key;
}

ObjectStorageKey ObjectStorageKey::createAsAbsolute(String key_)
{
    ObjectStorageKey object_key;
    object_key.key = std::move(key_);
    object_key.is_relative = false;
    return object_key;
}
}

View File

@@ -0,0 +1,29 @@
#pragma once

#include <base/types.h>
#include <memory>

namespace DB
{

struct ObjectStorageKey
{
    ObjectStorageKey() = default;

    bool hasPrefix() const { return is_relative; }
    const String & getPrefix() const;
    const String & getSuffix() const;
    const String & serialize() const;

    static ObjectStorageKey createAsRelative(String prefix_, String suffix_);
    static ObjectStorageKey createAsRelative(String key_);
    static ObjectStorageKey createAsAbsolute(String key_);

private:
    String prefix;
    String suffix;
    String key;
    bool is_relative = false;
};

}
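A brief usage sketch of this interface (the key values are made up for illustration):

```cpp
/// Relative keys carry a prefix and a suffix; the full key joins them.
auto rel = ObjectStorageKey::createAsRelative("data/store", "abc/all_1_1_0.bin");
rel.hasPrefix();    /// true
rel.getPrefix();    /// "data/store"
rel.getSuffix();    /// "abc/all_1_1_0.bin"
rel.serialize();    /// "data/store/abc/all_1_1_0.bin"

/// Absolute keys are stored verbatim; getPrefix()/getSuffix() throw LOGICAL_ERROR.
auto abs = ObjectStorageKey::createAsAbsolute("data/store/abc/all_1_1_0.bin");
abs.hasPrefix();    /// false
abs.serialize();    /// "data/store/abc/all_1_1_0.bin"
```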

View File

@@ -101,6 +101,7 @@
     M(ReplicatedPartChecks, "Number of times we had to perform advanced search for a data part on replicas or to clarify the need of an existing data part.") \
     M(ReplicatedPartChecksFailed, "Number of times the advanced search for a data part on replicas did not give result or when unexpected part has been found and moved away.") \
     M(ReplicatedDataLoss, "Number of times a data part that we wanted doesn't exist on any replica (even on replicas that are offline right now). That data parts are definitely lost. This is normal due to asynchronous replication (if quorum inserts were not enabled), when the replica on which the data part was written was failed and when it became online after fail it doesn't contain that data part.") \
+    M(ReplicatedCoveredPartsInZooKeeperOnStart, "For debugging purposes. Number of parts in ZooKeeper that have a covering part, but doesn't exist on disk. Checked on server start.") \
     \
     M(InsertedRows, "Number of rows INSERTed to all tables.") \
     M(InsertedBytes, "Number of bytes (uncompressed; for columns as they stored in memory) INSERTed to all tables.") \

View File

@@ -16,8 +16,8 @@ private:
     bool nextImpl() override;
 
 public:
-    explicit CompressedReadBuffer(ReadBuffer & in_, bool allow_different_codecs_ = false)
-        : CompressedReadBufferBase(&in_, allow_different_codecs_), BufferWithOwnMemory<ReadBuffer>(0)
+    explicit CompressedReadBuffer(ReadBuffer & in_, bool allow_different_codecs_ = false, bool external_data_ = false)
+        : CompressedReadBufferBase(&in_, allow_different_codecs_, external_data_), BufferWithOwnMemory<ReadBuffer>(0)
     {
     }
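A minimal usage sketch of the extended constructor, assuming the data was previously written through `CompressedWriteBuffer` with `writeStringBinary` (the file name and setup are illustrative):

```cpp
/// Sketch only: wrap a raw file stream with transparent decompression.
ReadBufferFromFile file_in("/tmp/data.bin");
CompressedReadBuffer in(
    file_in,
    /*allow_different_codecs_=*/ false,
    /*external_data_=*/ false);   /// new flag introduced by this commit

String s;
readStringBinary(s, in);          /// reads from the decompressed stream
```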

View File

@@ -114,7 +114,8 @@ static void readHeaderAndGetCodecAndSize(
     CompressionCodecPtr & codec,
     size_t & size_decompressed,
     size_t & size_compressed_without_checksum,
-    bool allow_different_codecs)
+    bool allow_different_codecs,
+    bool external_data)
 {
     uint8_t method = ICompressionCodec::readMethod(compressed_buffer);
@@ -136,8 +137,11 @@
         }
     }
 
-    size_compressed_without_checksum = ICompressionCodec::readCompressedBlockSize(compressed_buffer);
-    size_decompressed = ICompressionCodec::readDecompressedBlockSize(compressed_buffer);
+    if (external_data)
+        codec->setExternalDataFlag();
+
+    size_compressed_without_checksum = codec->readCompressedBlockSize(compressed_buffer);
+    size_decompressed = codec->readDecompressedBlockSize(compressed_buffer);
 
     /// This is for clang static analyzer.
     assert(size_decompressed > 0);
@@ -170,7 +174,8 @@ size_t CompressedReadBufferBase::readCompressedData(size_t & size_decompressed,
         codec,
         size_decompressed,
         size_compressed_without_checksum,
-        allow_different_codecs);
+        allow_different_codecs,
+        external_data);
 
     auto additional_size_at_the_end_of_buffer = codec->getAdditionalSizeAtTheEndOfBuffer();
@@ -221,7 +226,8 @@ size_t CompressedReadBufferBase::readCompressedDataBlockForAsynchronous(size_t &
         codec,
         size_decompressed,
         size_compressed_without_checksum,
-        allow_different_codecs);
+        allow_different_codecs,
+        external_data);
 
     auto additional_size_at_the_end_of_buffer = codec->getAdditionalSizeAtTheEndOfBuffer();
@@ -254,7 +260,8 @@ size_t CompressedReadBufferBase::readCompressedDataBlockForAsynchronous(size_t &
     }
 }
 
-static void readHeaderAndGetCodec(const char * compressed_buffer, size_t size_decompressed, CompressionCodecPtr & codec, bool allow_different_codecs)
+static void readHeaderAndGetCodec(const char * compressed_buffer, size_t size_decompressed, CompressionCodecPtr & codec,
+    bool allow_different_codecs, bool external_data)
 {
     ProfileEvents::increment(ProfileEvents::CompressedReadBufferBlocks);
     ProfileEvents::increment(ProfileEvents::CompressedReadBufferBytes, size_decompressed);
@@ -278,17 +285,20 @@ static void readHeaderAndGetCodec(const char * compressed_buffer, size_t size_de
             getHexUIntLowercase(method), getHexUIntLowercase(codec->getMethodByte()));
         }
     }
+
+    if (external_data)
+        codec->setExternalDataFlag();
 }
 
 void CompressedReadBufferBase::decompressTo(char * to, size_t size_decompressed, size_t size_compressed_without_checksum)
 {
-    readHeaderAndGetCodec(compressed_buffer, size_decompressed, codec, allow_different_codecs);
+    readHeaderAndGetCodec(compressed_buffer, size_decompressed, codec, allow_different_codecs, external_data);
     codec->decompress(compressed_buffer, static_cast<UInt32>(size_compressed_without_checksum), to);
 }
 
 void CompressedReadBufferBase::decompress(BufferBase::Buffer & to, size_t size_decompressed, size_t size_compressed_without_checksum)
 {
-    readHeaderAndGetCodec(compressed_buffer, size_decompressed, codec, allow_different_codecs);
+    readHeaderAndGetCodec(compressed_buffer, size_decompressed, codec, allow_different_codecs, external_data);
 
     if (codec->isNone())
     {
@@ -320,8 +330,8 @@ void CompressedReadBufferBase::setDecompressMode(ICompressionCodec::CodecMode mo
 /// 'compressed_in' could be initialized lazily, but before first call of 'readCompressedData'.
/// 'compressed_in' could be initialized lazily, but before first call of 'readCompressedData'. /// 'compressed_in' could be initialized lazily, but before first call of 'readCompressedData'.
CompressedReadBufferBase::CompressedReadBufferBase(ReadBuffer * in, bool allow_different_codecs_) CompressedReadBufferBase::CompressedReadBufferBase(ReadBuffer * in, bool allow_different_codecs_, bool external_data_)
: compressed_in(in), own_compressed_buffer(0), allow_different_codecs(allow_different_codecs_) : compressed_in(in), own_compressed_buffer(0), allow_different_codecs(allow_different_codecs_), external_data(external_data_)
{ {
} }

View File

@ -30,6 +30,9 @@ protected:
/// Allow reading data, compressed by different codecs from one file. /// Allow reading data, compressed by different codecs from one file.
bool allow_different_codecs; bool allow_different_codecs;
/// Report decompression errors as CANNOT_DECOMPRESS, not CORRUPTED_DATA
bool external_data;
/// Read compressed data into compressed_buffer. Get size of decompressed data from block header. Checksum if need. /// Read compressed data into compressed_buffer. Get size of decompressed data from block header. Checksum if need.
/// ///
/// If always_copy is true then even if the compressed block is already stored in compressed_in.buffer() /// If always_copy is true then even if the compressed block is already stored in compressed_in.buffer()
@ -67,7 +70,7 @@ protected:
public: public:
/// 'compressed_in' could be initialized lazily, but before first call of 'readCompressedData'. /// 'compressed_in' could be initialized lazily, but before first call of 'readCompressedData'.
explicit CompressedReadBufferBase(ReadBuffer * in = nullptr, bool allow_different_codecs_ = false); explicit CompressedReadBufferBase(ReadBuffer * in = nullptr, bool allow_different_codecs_ = false, bool external_data_ = false);
virtual ~CompressedReadBufferBase(); virtual ~CompressedReadBufferBase();
/** Disable checksums. /** Disable checksums.
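
With the new third constructor argument, a caller that decompresses user-supplied bytes (for example, data arriving over the wire in compressed form) can mark the stream as external, so a bad header or checksum surfaces as CANNOT_DECOMPRESS (bad input) rather than CORRUPTED_DATA (broken storage). A minimal sketch of that call pattern, using only the signatures visible in this diff plus the standard IO helpers:

#include <Compression/CompressedReadBuffer.h>
#include <IO/ReadBufferFromString.h>
#include <IO/WriteBufferFromOwnString.h>
#include <IO/copyData.h>

using namespace DB;

String decompressUntrusted(const String & bytes)
{
    ReadBufferFromString in(bytes);
    /// external_data_ = true: a malformed header or checksum now throws
    /// CANNOT_DECOMPRESS ("your input is broken") instead of
    /// CORRUPTED_DATA ("our storage is broken").
    CompressedReadBuffer decompressing_in(in, /* allow_different_codecs_ */ false, /* external_data_ */ true);
    WriteBufferFromOwnString out;
    copyData(decompressing_in, out);
    return out.str();
}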

View File

@ -14,12 +14,6 @@
namespace DB namespace DB
{ {
namespace ErrorCodes
{
extern const int CORRUPTED_DATA;
}
CompressionCodecMultiple::CompressionCodecMultiple(Codecs codecs_) CompressionCodecMultiple::CompressionCodecMultiple(Codecs codecs_)
: codecs(codecs_) : codecs(codecs_)
{ {
@ -79,7 +73,7 @@ UInt32 CompressionCodecMultiple::doCompressData(const char * source, UInt32 sour
void CompressionCodecMultiple::doDecompressData(const char * source, UInt32 source_size, char * dest, UInt32 decompressed_size) const void CompressionCodecMultiple::doDecompressData(const char * source, UInt32 source_size, char * dest, UInt32 decompressed_size) const
{ {
if (source_size < 1 || !source[0]) if (source_size < 1 || !source[0])
throw Exception(ErrorCodes::CORRUPTED_DATA, "Wrong compression methods list"); throw Exception(decompression_error_code, "Wrong compression methods list");
UInt8 compression_methods_size = source[0]; UInt8 compression_methods_size = source[0];
@ -95,10 +89,10 @@ void CompressionCodecMultiple::doDecompressData(const char * source, UInt32 sour
auto additional_size_at_the_end_of_buffer = codec->getAdditionalSizeAtTheEndOfBuffer(); auto additional_size_at_the_end_of_buffer = codec->getAdditionalSizeAtTheEndOfBuffer();
compressed_buf.resize(compressed_buf.size() + additional_size_at_the_end_of_buffer); compressed_buf.resize(compressed_buf.size() + additional_size_at_the_end_of_buffer);
UInt32 uncompressed_size = ICompressionCodec::readDecompressedBlockSize(compressed_buf.data()); UInt32 uncompressed_size = readDecompressedBlockSize(compressed_buf.data());
if (idx == 0 && uncompressed_size != decompressed_size) if (idx == 0 && uncompressed_size != decompressed_size)
throw Exception(ErrorCodes::CORRUPTED_DATA, "Wrong final decompressed size in codec Multiple, got {}, expected {}", throw Exception(decompression_error_code, "Wrong final decompressed size in codec Multiple, got {}, expected {}",
uncompressed_size, decompressed_size); uncompressed_size, decompressed_size);
uncompressed_buf.resize(uncompressed_size + additional_size_at_the_end_of_buffer); uncompressed_buf.resize(uncompressed_size + additional_size_at_the_end_of_buffer);

View File

@ -15,8 +15,6 @@ namespace DB
namespace ErrorCodes namespace ErrorCodes
{ {
extern const int CANNOT_DECOMPRESS;
extern const int CORRUPTED_DATA;
extern const int LOGICAL_ERROR; extern const int LOGICAL_ERROR;
} }
@ -97,14 +95,14 @@ UInt32 ICompressionCodec::decompress(const char * source, UInt32 source_size, ch
UInt8 header_size = getHeaderSize(); UInt8 header_size = getHeaderSize();
if (source_size < header_size) if (source_size < header_size)
throw Exception(ErrorCodes::CORRUPTED_DATA, throw Exception(decompression_error_code,
"Can't decompress data: the compressed data size ({}, this should include header size) " "Can't decompress data: the compressed data size ({}, this should include header size) "
"is less than the header size ({})", source_size, static_cast<size_t>(header_size)); "is less than the header size ({})", source_size, static_cast<size_t>(header_size));
uint8_t our_method = getMethodByte(); uint8_t our_method = getMethodByte();
uint8_t method = source[0]; uint8_t method = source[0];
if (method != our_method) if (method != our_method)
throw Exception(ErrorCodes::CANNOT_DECOMPRESS, "Can't decompress data with codec byte {} using codec with byte {}", method, our_method); throw Exception(decompression_error_code, "Can't decompress data with codec byte {} using codec with byte {}", method, our_method);
UInt32 decompressed_size = readDecompressedBlockSize(source); UInt32 decompressed_size = readDecompressedBlockSize(source);
doDecompressData(&source[header_size], source_size - header_size, dest, decompressed_size); doDecompressData(&source[header_size], source_size - header_size, dest, decompressed_size);
@ -112,20 +110,20 @@ UInt32 ICompressionCodec::decompress(const char * source, UInt32 source_size, ch
return decompressed_size; return decompressed_size;
} }
UInt32 ICompressionCodec::readCompressedBlockSize(const char * source) UInt32 ICompressionCodec::readCompressedBlockSize(const char * source) const
{ {
UInt32 compressed_block_size = unalignedLoadLittleEndian<UInt32>(&source[1]); UInt32 compressed_block_size = unalignedLoadLittleEndian<UInt32>(&source[1]);
if (compressed_block_size == 0) if (compressed_block_size == 0)
throw Exception(ErrorCodes::CORRUPTED_DATA, "Can't decompress data: header is corrupt with compressed block size 0"); throw Exception(decompression_error_code, "Can't decompress data: header is corrupt with compressed block size 0");
return compressed_block_size; return compressed_block_size;
} }
UInt32 ICompressionCodec::readDecompressedBlockSize(const char * source) UInt32 ICompressionCodec::readDecompressedBlockSize(const char * source) const
{ {
UInt32 decompressed_block_size = unalignedLoadLittleEndian<UInt32>(&source[5]); UInt32 decompressed_block_size = unalignedLoadLittleEndian<UInt32>(&source[5]);
if (decompressed_block_size == 0) if (decompressed_block_size == 0)
throw Exception(ErrorCodes::CORRUPTED_DATA, "Can't decompress data: header is corrupt with decompressed block size 0"); throw Exception(decompression_error_code, "Can't decompress data: header is corrupt with decompressed block size 0");
return decompressed_block_size; return decompressed_block_size;
} }

View File

@ -13,6 +13,12 @@ namespace DB
extern "C" int LLVMFuzzerTestOneInput(const uint8_t * data, size_t size); extern "C" int LLVMFuzzerTestOneInput(const uint8_t * data, size_t size);
namespace ErrorCodes
{
extern const int CANNOT_DECOMPRESS;
extern const int CORRUPTED_DATA;
}
/** /**
* Represents interface for compression codecs like LZ4, ZSTD, etc. * Represents interface for compression codecs like LZ4, ZSTD, etc.
*/ */
@ -59,7 +65,10 @@ public:
CodecMode getDecompressMode() const{ return decompressMode; } CodecMode getDecompressMode() const{ return decompressMode; }
/// if set mode to CodecMode::Asynchronous, must be followed with flushAsynchronousDecompressRequests /// if set mode to CodecMode::Asynchronous, must be followed with flushAsynchronousDecompressRequests
void setDecompressMode(CodecMode mode){ decompressMode = mode; } void setDecompressMode(CodecMode mode) { decompressMode = mode; }
/// Report decompression errors as CANNOT_DECOMPRESS, not CORRUPTED_DATA
void setExternalDataFlag() { decompression_error_code = ErrorCodes::CANNOT_DECOMPRESS; }
/// Flush result for previous asynchronous decompression requests. /// Flush result for previous asynchronous decompression requests.
/// This function must be called following several requests offload to HW. /// This function must be called following several requests offload to HW.
@ -82,10 +91,10 @@ public:
static constexpr UInt8 getHeaderSize() { return COMPRESSED_BLOCK_HEADER_SIZE; } static constexpr UInt8 getHeaderSize() { return COMPRESSED_BLOCK_HEADER_SIZE; }
/// Read size of compressed block from compressed source /// Read size of compressed block from compressed source
static UInt32 readCompressedBlockSize(const char * source); UInt32 readCompressedBlockSize(const char * source) const;
/// Read size of decompressed block from compressed source /// Read size of decompressed block from compressed source
static UInt32 readDecompressedBlockSize(const char * source); UInt32 readDecompressedBlockSize(const char * source) const;
/// Read method byte from compressed source /// Read method byte from compressed source
static uint8_t readMethod(const char * source); static uint8_t readMethod(const char * source);
@ -131,6 +140,8 @@ protected:
/// Construct and set codec description from codec name and arguments. Must be called in codec constructor. /// Construct and set codec description from codec name and arguments. Must be called in codec constructor.
void setCodecDescription(const String & name, const ASTs & arguments = {}); void setCodecDescription(const String & name, const ASTs & arguments = {});
int decompression_error_code = ErrorCodes::CORRUPTED_DATA;
private: private:
ASTPtr full_codec_desc; ASTPtr full_codec_desc;
CodecMode decompressMode{CodecMode::Synchronous}; CodecMode decompressMode{CodecMode::Synchronous};

View File

@ -36,7 +36,7 @@ void CoordinationSettings::loadFromConfig(const String & config_elem, const Poco
} }
const String KeeperConfigurationAndSettings::DEFAULT_FOUR_LETTER_WORD_CMD = "conf,cons,crst,envi,ruok,srst,srvr,stat,wchs,dirs,mntr,isro,rcvr,apiv,csnp,lgif,rqld,rclc,clrs,ftfl"; const String KeeperConfigurationAndSettings::DEFAULT_FOUR_LETTER_WORD_CMD = "conf,cons,crst,envi,ruok,srst,srvr,stat,wchs,dirs,mntr,isro,rcvr,apiv,csnp,lgif,rqld,rclc,clrs,ftfl,ydld";
KeeperConfigurationAndSettings::KeeperConfigurationAndSettings() KeeperConfigurationAndSettings::KeeperConfigurationAndSettings()
: server_id(NOT_EXIST) : server_id(NOT_EXIST)

View File

@ -172,6 +172,9 @@ void FourLetterCommandFactory::registerCommands(KeeperDispatcher & keeper_dispat
FourLetterCommandPtr feature_flags_command = std::make_shared<FeatureFlagsCommand>(keeper_dispatcher); FourLetterCommandPtr feature_flags_command = std::make_shared<FeatureFlagsCommand>(keeper_dispatcher);
factory.registerCommand(feature_flags_command); factory.registerCommand(feature_flags_command);
FourLetterCommandPtr yield_leadership_command = std::make_shared<YieldLeadershipCommand>(keeper_dispatcher);
factory.registerCommand(yield_leadership_command);
factory.initializeAllowList(keeper_dispatcher); factory.initializeAllowList(keeper_dispatcher);
factory.setInitialize(true); factory.setInitialize(true);
} }
@ -579,4 +582,10 @@ String FeatureFlagsCommand::run()
return ret.str(); return ret.str();
} }
String YieldLeadershipCommand::run()
{
keeper_dispatcher.yieldLeadership();
return "Sent yield leadership request to leader.";
}
} }

View File

@ -415,4 +415,17 @@ struct FeatureFlagsCommand : public IFourLetterCommand
~FeatureFlagsCommand() override = default; ~FeatureFlagsCommand() override = default;
}; };
/// Yield leadership and become follower.
struct YieldLeadershipCommand : public IFourLetterCommand
{
explicit YieldLeadershipCommand(KeeperDispatcher & keeper_dispatcher_)
: IFourLetterCommand(keeper_dispatcher_)
{
}
String name() override { return "ydld"; }
String run() override;
~YieldLeadershipCommand() override = default;
};
} }
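
Like the other four-letter words, ydld is driven over a plain TCP connection to the Keeper client port: write the four bytes, read the reply. The command must be in the allow list; the default list extended above now includes it. A minimal POSIX client sketch; host 127.0.0.1 and port 9181 are assumptions for a default local Keeper:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>

int main()
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return 1;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(9181);                       /// assumed default Keeper client port
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);   /// assumed local server

    if (connect(fd, reinterpret_cast<sockaddr *>(&addr), sizeof(addr)) != 0)
        return 1;

    (void)write(fd, "ydld", 4);   /// the four-letter command itself

    char reply[256];
    ssize_t n = read(fd, reply, sizeof(reply) - 1);
    if (n > 0)
    {
        reply[n] = '\0';
        printf("%s\n", reply);    /// expect: "Sent yield leadership request to leader."
    }
    close(fd);
    return 0;
}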

View File

@ -237,6 +237,12 @@ public:
return server->requestLeader(); return server->requestLeader();
} }
/// Yield leadership and become follower.
void yieldLeadership()
{
return server->yieldLeadership();
}
void recalculateStorageStats() void recalculateStorageStats()
{ {
return server->recalculateStorageStats(); return server->recalculateStorageStats();

View File

@ -1087,6 +1087,12 @@ bool KeeperServer::requestLeader()
return isLeader() || raft_instance->request_leadership(); return isLeader() || raft_instance->request_leadership();
} }
void KeeperServer::yieldLeadership()
{
if (isLeader())
raft_instance->yield_leadership();
}
void KeeperServer::recalculateStorageStats() void KeeperServer::recalculateStorageStats()
{ {
state_machine->recalculateStorageStats(); state_machine->recalculateStorageStats();

View File

@ -141,6 +141,8 @@ public:
bool requestLeader(); bool requestLeader();
void yieldLeadership();
void recalculateStorageStats(); void recalculateStorageStats();
}; };

View File

@ -208,9 +208,8 @@ class IColumn;
M(Bool, allow_experimental_inverted_index, false, "If it is set to true, allow to use experimental inverted index.", 0) \ M(Bool, allow_experimental_inverted_index, false, "If it is set to true, allow to use experimental inverted index.", 0) \
\ \
M(UInt64, mysql_max_rows_to_insert, 65536, "The maximum number of rows in MySQL batch insertion of the MySQL storage engine", 0) \ M(UInt64, mysql_max_rows_to_insert, 65536, "The maximum number of rows in MySQL batch insertion of the MySQL storage engine", 0) \
M(Bool, use_mysql_types_in_show_columns, false, "Show native MySQL types in SHOW [FULL] COLUMNS", 0) \ M(Bool, mysql_map_string_to_text_in_show_columns, false, "If enabled, String type will be mapped to TEXT in SHOW [FULL] COLUMNS, BLOB otherwise.", 0) \
M(Bool, mysql_map_string_to_text_in_show_columns, false, "If enabled, String type will be mapped to TEXT in SHOW [FULL] COLUMNS, BLOB otherwise. Will only take effect if use_mysql_types_in_show_columns is enabled too", 0) \ M(Bool, mysql_map_fixed_string_to_text_in_show_columns, false, "If enabled, FixedString type will be mapped to TEXT in SHOW [FULL] COLUMNS, BLOB otherwise.", 0) \
M(Bool, mysql_map_fixed_string_to_text_in_show_columns, false, "If enabled, FixedString type will be mapped to TEXT in SHOW [FULL] COLUMNS, BLOB otherwise. Will only take effect if use_mysql_types_in_show_columns is enabled too", 0) \
\ \
M(UInt64, optimize_min_equality_disjunction_chain_length, 3, "The minimum length of the expression `expr = x1 OR ... expr = xN` for optimization ", 0) \ M(UInt64, optimize_min_equality_disjunction_chain_length, 3, "The minimum length of the expression `expr = x1 OR ... expr = xN` for optimization ", 0) \
\ \
@ -288,7 +287,8 @@ class IColumn;
M(Bool, http_write_exception_in_output_format, true, "Write exception in output format to produce valid output. Works with JSON and XML formats.", 0) \ M(Bool, http_write_exception_in_output_format, true, "Write exception in output format to produce valid output. Works with JSON and XML formats.", 0) \
M(UInt64, http_response_buffer_size, 0, "The number of bytes to buffer in the server memory before sending a HTTP response to the client or flushing to disk (when http_wait_end_of_query is enabled).", 0) \ M(UInt64, http_response_buffer_size, 0, "The number of bytes to buffer in the server memory before sending a HTTP response to the client or flushing to disk (when http_wait_end_of_query is enabled).", 0) \
\ \
M(Bool, fsync_metadata, true, "Do fsync after changing metadata for tables and databases (.sql files). Could be disabled in case of poor latency on server with high load of DDL queries and high load of disk subsystem.", 0) \ M(Bool, fsync_metadata, true, "Do fsync after changing metadata for tables and databases (.sql files). Could be disabled in case of poor latency on server with high load of DDL queries and high load of disk subsystem.", 0) \
M(Bool, storage_metadata_write_full_object_key, false, "Write disk metadata files with VERSION_FULL_OBJECT_KEY format", 0) \
\ \
M(Bool, join_use_nulls, false, "Use NULLs for non-joined rows of outer JOINs for types that can be inside Nullable. If false, use default value of corresponding columns data type.", IMPORTANT) \ M(Bool, join_use_nulls, false, "Use NULLs for non-joined rows of outer JOINs for types that can be inside Nullable. If false, use default value of corresponding columns data type.", IMPORTANT) \
\ \
@ -837,6 +837,7 @@ class IColumn;
MAKE_OBSOLETE(M, Bool, allow_experimental_bigint_types, true) \ MAKE_OBSOLETE(M, Bool, allow_experimental_bigint_types, true) \
MAKE_OBSOLETE(M, Bool, allow_experimental_window_functions, true) \ MAKE_OBSOLETE(M, Bool, allow_experimental_window_functions, true) \
MAKE_OBSOLETE(M, Bool, allow_experimental_geo_types, true) \ MAKE_OBSOLETE(M, Bool, allow_experimental_geo_types, true) \
MAKE_OBSOLETE(M, Bool, allow_experimental_query_cache, true) \
\ \
MAKE_OBSOLETE(M, Milliseconds, async_insert_stale_timeout_ms, 0) \ MAKE_OBSOLETE(M, Milliseconds, async_insert_stale_timeout_ms, 0) \
MAKE_OBSOLETE(M, StreamingHandleErrorMode, handle_kafka_error_mode, StreamingHandleErrorMode::DEFAULT) \ MAKE_OBSOLETE(M, StreamingHandleErrorMode, handle_kafka_error_mode, StreamingHandleErrorMode::DEFAULT) \
@ -848,6 +849,7 @@ class IColumn;
MAKE_OBSOLETE(M, UInt64, merge_tree_clear_old_parts_interval_seconds, 1) \ MAKE_OBSOLETE(M, UInt64, merge_tree_clear_old_parts_interval_seconds, 1) \
MAKE_OBSOLETE(M, UInt64, partial_merge_join_optimizations, 0) \ MAKE_OBSOLETE(M, UInt64, partial_merge_join_optimizations, 0) \
MAKE_OBSOLETE(M, MaxThreads, max_alter_threads, 0) \ MAKE_OBSOLETE(M, MaxThreads, max_alter_threads, 0) \
MAKE_OBSOLETE(M, Bool, use_mysql_types_in_show_columns, false) \
/* moved to config.xml: see also src/Core/ServerSettings.h */ \ /* moved to config.xml: see also src/Core/ServerSettings.h */ \
MAKE_DEPRECATED_BY_SERVER_CONFIG(M, UInt64, background_buffer_flush_schedule_pool_size, 16) \ MAKE_DEPRECATED_BY_SERVER_CONFIG(M, UInt64, background_buffer_flush_schedule_pool_size, 16) \
MAKE_DEPRECATED_BY_SERVER_CONFIG(M, UInt64, background_pool_size, 16) \ MAKE_DEPRECATED_BY_SERVER_CONFIG(M, UInt64, background_pool_size, 16) \
@ -884,6 +886,7 @@ class IColumn;
M(Bool, format_csv_allow_single_quotes, false, "If it is set to true, allow strings in single quotes.", 0) \ M(Bool, format_csv_allow_single_quotes, false, "If it is set to true, allow strings in single quotes.", 0) \
M(Bool, format_csv_allow_double_quotes, true, "If it is set to true, allow strings in double quotes.", 0) \ M(Bool, format_csv_allow_double_quotes, true, "If it is set to true, allow strings in double quotes.", 0) \
M(Bool, output_format_csv_crlf_end_of_line, false, "If it is set true, end of line in CSV format will be \\r\\n instead of \\n.", 0) \ M(Bool, output_format_csv_crlf_end_of_line, false, "If it is set true, end of line in CSV format will be \\r\\n instead of \\n.", 0) \
M(Bool, input_format_csv_allow_cr_end_of_line, false, "If it is set true, \\r will be allowed at end of line not followed by \\n", 0) \
M(Bool, input_format_csv_enum_as_number, false, "Treat inserted enum values in CSV formats as enum indices", 0) \ M(Bool, input_format_csv_enum_as_number, false, "Treat inserted enum values in CSV formats as enum indices", 0) \
M(Bool, input_format_csv_arrays_as_nested_csv, false, R"(When reading Array from CSV, expect that its elements were serialized in nested CSV and then put into string. Example: "[""Hello"", ""world"", ""42"""" TV""]". Braces around array can be omitted.)", 0) \ M(Bool, input_format_csv_arrays_as_nested_csv, false, R"(When reading Array from CSV, expect that its elements were serialized in nested CSV and then put into string. Example: "[""Hello"", ""world"", ""42"""" TV""]". Braces around array can be omitted.)", 0) \
M(Bool, input_format_skip_unknown_fields, true, "Skip columns with unknown names from input data (it works for JSONEachRow, -WithNames, -WithNamesAndTypes and TSKV formats).", 0) \ M(Bool, input_format_skip_unknown_fields, true, "Skip columns with unknown names from input data (it works for JSONEachRow, -WithNames, -WithNamesAndTypes and TSKV formats).", 0) \

View File

@ -363,6 +363,9 @@ struct WhichDataType
constexpr bool isNativeInt() const { return isInt8() || isInt16() || isInt32() || isInt64(); } constexpr bool isNativeInt() const { return isInt8() || isInt16() || isInt32() || isInt64(); }
constexpr bool isInt() const { return isNativeInt() || isInt128() || isInt256(); } constexpr bool isInt() const { return isNativeInt() || isInt128() || isInt256(); }
constexpr bool isNativeInteger() const { return isNativeInt() || isNativeUInt(); }
constexpr bool isInteger() const { return isInt() || isUInt(); }
constexpr bool isDecimal32() const { return idx == TypeIndex::Decimal32; } constexpr bool isDecimal32() const { return idx == TypeIndex::Decimal32; }
constexpr bool isDecimal64() const { return idx == TypeIndex::Decimal64; } constexpr bool isDecimal64() const { return idx == TypeIndex::Decimal64; }
constexpr bool isDecimal128() const { return idx == TypeIndex::Decimal128; } constexpr bool isDecimal128() const { return idx == TypeIndex::Decimal128; }
@ -373,6 +376,9 @@ struct WhichDataType
constexpr bool isFloat64() const { return idx == TypeIndex::Float64; } constexpr bool isFloat64() const { return idx == TypeIndex::Float64; }
constexpr bool isFloat() const { return isFloat32() || isFloat64(); } constexpr bool isFloat() const { return isFloat32() || isFloat64(); }
constexpr bool isNativeNumber() const { return isNativeInteger() || isFloat(); }
constexpr bool isNumber() const { return isInteger() || isFloat() || isDecimal(); }
constexpr bool isEnum8() const { return idx == TypeIndex::Enum8; } constexpr bool isEnum8() const { return idx == TypeIndex::Enum8; }
constexpr bool isEnum16() const { return idx == TypeIndex::Enum16; } constexpr bool isEnum16() const { return idx == TypeIndex::Enum16; }
constexpr bool isEnum() const { return isEnum8() || isEnum16(); } constexpr bool isEnum() const { return isEnum8() || isEnum16(); }
@ -410,110 +416,60 @@ struct WhichDataType
/// IDataType helpers (alternative for IDataType virtual methods with single point of truth) /// IDataType helpers (alternative for IDataType virtual methods with single point of truth)
template <typename T> template <typename T> inline bool isUInt8(const T & data_type) { return WhichDataType(data_type).isUInt8(); }
inline bool isDate(const T & data_type) { return WhichDataType(data_type).isDate(); } template <typename T> inline bool isUInt16(const T & data_type) { return WhichDataType(data_type).isUInt16(); }
template <typename T> template <typename T> inline bool isUInt32(const T & data_type) { return WhichDataType(data_type).isUInt32(); }
inline bool isDate32(const T & data_type) { return WhichDataType(data_type).isDate32(); } template <typename T> inline bool isUInt64(const T & data_type) { return WhichDataType(data_type).isUInt64(); }
template <typename T> template <typename T> inline bool isNativeUInt(const T & data_type) { return WhichDataType(data_type).isNativeUInt(); }
inline bool isDateOrDate32(const T & data_type) { return WhichDataType(data_type).isDateOrDate32(); } template <typename T> inline bool isUInt(const T & data_type) { return WhichDataType(data_type).isUInt(); }
template <typename T>
inline bool isDateTime(const T & data_type) { return WhichDataType(data_type).isDateTime(); }
template <typename T>
inline bool isDateTime64(const T & data_type) { return WhichDataType(data_type).isDateTime64(); }
template <typename T>
inline bool isDateTimeOrDateTime64(const T & data_type) { return WhichDataType(data_type).isDateTimeOrDateTime64(); }
template <typename T>
inline bool isDateOrDate32OrDateTimeOrDateTime64(const T & data_type) { return WhichDataType(data_type).isDateOrDate32OrDateTimeOrDateTime64(); }
template <typename T> template <typename T> inline bool isInt8(const T & data_type) { return WhichDataType(data_type).isInt8(); }
inline bool isEnum(const T & data_type) { return WhichDataType(data_type).isEnum(); } template <typename T> inline bool isInt16(const T & data_type) { return WhichDataType(data_type).isInt16(); }
template <typename T> template <typename T> inline bool isInt32(const T & data_type) { return WhichDataType(data_type).isInt32(); }
inline bool isDecimal(const T & data_type) { return WhichDataType(data_type).isDecimal(); } template <typename T> inline bool isInt64(const T & data_type) { return WhichDataType(data_type).isInt64(); }
template <typename T> template <typename T> inline bool isNativeInt(const T & data_type) { return WhichDataType(data_type).isNativeInt(); }
inline bool isTuple(const T & data_type) { return WhichDataType(data_type).isTuple(); } template <typename T> inline bool isInt(const T & data_type) { return WhichDataType(data_type).isInt(); }
template <typename T>
inline bool isArray(const T & data_type) { return WhichDataType(data_type).isArray(); }
template <typename T>
inline bool isMap(const T & data_type) {return WhichDataType(data_type).isMap(); }
template <typename T>
inline bool isInterval(const T & data_type) {return WhichDataType(data_type).isInterval(); }
template <typename T>
inline bool isNothing(const T & data_type) { return WhichDataType(data_type).isNothing(); }
template <typename T>
inline bool isUUID(const T & data_type) { return WhichDataType(data_type).isUUID(); }
template <typename T>
inline bool isIPv4(const T & data_type) { return WhichDataType(data_type).isIPv4(); }
template <typename T>
inline bool isIPv6(const T & data_type) { return WhichDataType(data_type).isIPv6(); }
template <typename T> template <typename T> inline bool isInteger(const T & data_type) { return WhichDataType(data_type).isInteger(); }
inline bool isObject(const T & data_type) { return WhichDataType(data_type).isObject(); } template <typename T> inline bool isNativeInteger(const T & data_type) { return WhichDataType(data_type).isNativeInteger(); }
template <typename T> template <typename T> inline bool isDecimal(const T & data_type) { return WhichDataType(data_type).isDecimal(); }
inline bool isUInt8(const T & data_type) { return WhichDataType(data_type).isUInt8(); }
template <typename T>
inline bool isUInt16(const T & data_type) { return WhichDataType(data_type).isUInt16(); }
template <typename T>
inline bool isUInt32(const T & data_type) { return WhichDataType(data_type).isUInt32(); }
template <typename T>
inline bool isUInt64(const T & data_type) { return WhichDataType(data_type).isUInt64(); }
template <typename T>
inline bool isNativeUnsignedInteger(const T & data_type) { return WhichDataType(data_type).isNativeUInt(); }
template <typename T>
inline bool isUnsignedInteger(const T & data_type) { return WhichDataType(data_type).isUInt(); }
template <typename T> template <typename T> inline bool isFloat(const T & data_type) { return WhichDataType(data_type).isFloat(); }
inline bool isInt8(const T & data_type) { return WhichDataType(data_type).isInt8(); }
template <typename T>
inline bool isInt16(const T & data_type) { return WhichDataType(data_type).isInt16(); }
template <typename T>
inline bool isInt32(const T & data_type) { return WhichDataType(data_type).isInt32(); }
template <typename T>
inline bool isInt64(const T & data_type) { return WhichDataType(data_type).isInt64(); }
template <typename T>
inline bool isInt(const T & data_type) { return WhichDataType(data_type).isInt(); }
template <typename T> template <typename T> inline bool isNativeNumber(const T & data_type) { return WhichDataType(data_type).isNativeNumber(); }
inline bool isInteger(const T & data_type) template <typename T> inline bool isNumber(const T & data_type) { return WhichDataType(data_type).isNumber(); }
{
WhichDataType which(data_type);
return which.isInt() || which.isUInt();
}
template <typename T> template <typename T> inline bool isEnum(const T & data_type) { return WhichDataType(data_type).isEnum(); }
inline bool isFloat(const T & data_type)
{
WhichDataType which(data_type);
return which.isFloat();
}
template <typename T> template <typename T> inline bool isDate(const T & data_type) { return WhichDataType(data_type).isDate(); }
inline bool isNativeInteger(const T & data_type) template <typename T> inline bool isDate32(const T & data_type) { return WhichDataType(data_type).isDate32(); }
{ template <typename T> inline bool isDateOrDate32(const T & data_type) { return WhichDataType(data_type).isDateOrDate32(); }
WhichDataType which(data_type); template <typename T> inline bool isDateTime(const T & data_type) { return WhichDataType(data_type).isDateTime(); }
return which.isNativeInt() || which.isNativeUInt(); template <typename T> inline bool isDateTime64(const T & data_type) { return WhichDataType(data_type).isDateTime64(); }
} template <typename T> inline bool isDateTimeOrDateTime64(const T & data_type) { return WhichDataType(data_type).isDateTimeOrDateTime64(); }
template <typename T> inline bool isDateOrDate32OrDateTimeOrDateTime64(const T & data_type) { return WhichDataType(data_type).isDateOrDate32OrDateTimeOrDateTime64(); }
template <typename T> inline bool isString(const T & data_type) { return WhichDataType(data_type).isString(); }
template <typename T> inline bool isFixedString(const T & data_type) { return WhichDataType(data_type).isFixedString(); }
template <typename T> inline bool isStringOrFixedString(const T & data_type) { return WhichDataType(data_type).isStringOrFixedString(); }
template <typename T> template <typename T> inline bool isUUID(const T & data_type) { return WhichDataType(data_type).isUUID(); }
inline bool isNativeNumber(const T & data_type) template <typename T> inline bool isIPv4(const T & data_type) { return WhichDataType(data_type).isIPv4(); }
{ template <typename T> inline bool isIPv6(const T & data_type) { return WhichDataType(data_type).isIPv6(); }
WhichDataType which(data_type); template <typename T> inline bool isArray(const T & data_type) { return WhichDataType(data_type).isArray(); }
return which.isNativeInt() || which.isNativeUInt() || which.isFloat(); template <typename T> inline bool isTuple(const T & data_type) { return WhichDataType(data_type).isTuple(); }
} template <typename T> inline bool isMap(const T & data_type) {return WhichDataType(data_type).isMap(); }
template <typename T> inline bool isInterval(const T & data_type) {return WhichDataType(data_type).isInterval(); }
template <typename T> inline bool isObject(const T & data_type) { return WhichDataType(data_type).isObject(); }
template <typename T> template <typename T> inline bool isNothing(const T & data_type) { return WhichDataType(data_type).isNothing(); }
inline bool isNumber(const T & data_type)
{
WhichDataType which(data_type);
return which.isInt() || which.isUInt() || which.isFloat() || which.isDecimal();
}
template <typename T> template <typename T>
inline bool isColumnedAsNumber(const T & data_type) inline bool isColumnedAsNumber(const T & data_type)
{ {
WhichDataType which(data_type); WhichDataType which(data_type);
return which.isInt() || which.isUInt() || which.isFloat() || which.isDateOrDate32() || which.isDateTime() || which.isDateTime64() || which.isUUID() || which.isIPv4() || which.isIPv6(); return which.isInteger() || which.isFloat() || which.isDateOrDate32OrDateTimeOrDateTime64() || which.isUUID() || which.isIPv4() || which.isIPv6();
} }
template <typename T> template <typename T>
@ -531,24 +487,6 @@ inline bool isColumnedAsDecimalT(const DataType & data_type)
return (which.isDecimal() || which.isDateTime64()) && which.idx == TypeToTypeIndex<T>; return (which.isDecimal() || which.isDateTime64()) && which.idx == TypeToTypeIndex<T>;
} }
template <typename T>
inline bool isString(const T & data_type)
{
return WhichDataType(data_type).isString();
}
template <typename T>
inline bool isFixedString(const T & data_type)
{
return WhichDataType(data_type).isFixedString();
}
template <typename T>
inline bool isStringOrFixedString(const T & data_type)
{
return WhichDataType(data_type).isStringOrFixedString();
}
template <typename T> template <typename T>
inline bool isNotCreatable(const T & data_type) inline bool isNotCreatable(const T & data_type)
{ {
@ -567,12 +505,6 @@ inline bool isBool(const DataTypePtr & data_type)
return data_type->getName() == "Bool"; return data_type->getName() == "Bool";
} }
inline bool isAggregateFunction(const DataTypePtr & data_type)
{
WhichDataType which(data_type);
return which.isAggregateFunction();
}
inline bool isNullableOrLowCardinalityNullable(const DataTypePtr & data_type) inline bool isNullableOrLowCardinalityNullable(const DataTypePtr & data_type)
{ {
return data_type->isNullable() || data_type->isLowCardinalityNullable(); return data_type->isNullable() || data_type->isLowCardinalityNullable();
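
After this consolidation every free helper is a one-line forwarder to WhichDataType, and the composite predicates (isInteger, isNativeNumber, isNumber) have a single definition instead of ad-hoc re-derivations at each call site. A small sketch of the resulting call sites (illustrative, for code inside the codebase):

#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/DataTypeString.h>

using namespace DB;

void typePredicateExample()
{
    DataTypePtr t = std::make_shared<DataTypeUInt64>();
    chassert(isNativeNumber(t));           /// native ints/uints plus Float32/Float64
    chassert(isInteger(t) && isUInt(t));   /// isInteger == isInt() || isUInt()

    DataTypePtr s = std::make_shared<DataTypeString>();
    chassert(!isNumber(s));
}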

View File

@ -396,18 +396,20 @@ std::string ExternalQueryBuilder::composeLoadKeysQuery(
} }
else else
{ {
writeString(query, out);
auto condition_position = query.find(CONDITION_PLACEHOLDER_TO_REPLACE_VALUE); auto condition_position = query.find(CONDITION_PLACEHOLDER_TO_REPLACE_VALUE);
if (condition_position == std::string::npos) if (condition_position == std::string::npos)
{ {
writeString(" WHERE ", out); writeString("SELECT * FROM (", out);
writeString(query, out);
writeString(") WHERE ", out);
composeKeysCondition(key_columns, requested_rows, method, partition_key_prefix, out); composeKeysCondition(key_columns, requested_rows, method, partition_key_prefix, out);
writeString(";", out); writeString(";", out);
return out.str(); return out.str();
} }
writeString(query, out);
WriteBufferFromOwnString condition_value_buffer; WriteBufferFromOwnString condition_value_buffer;
composeKeysCondition(key_columns, requested_rows, method, partition_key_prefix, condition_value_buffer); composeKeysCondition(key_columns, requested_rows, method, partition_key_prefix, condition_value_buffer);
const auto & condition_value = condition_value_buffer.str(); const auto & condition_value = condition_value_buffer.str();
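
The change above affects dictionary source queries that lack a {condition} placeholder: instead of appending " WHERE ..." to the raw query, where the generated key filter could collide with the query's own WHERE/ORDER BY/LIMIT clauses, the builder now wraps the query in a subquery first. A standalone sketch of the two code paths (the placeholder spelling is an assumption; the real builder also handles partition-key prefixes and writes into a WriteBuffer):

#include <string>

std::string composeKeysQuery(const std::string & query, const std::string & keys_condition)
{
    static const std::string placeholder = "{condition}";

    auto pos = query.find(placeholder);
    if (pos == std::string::npos)
        /// No placeholder: wrap the user query so the generated condition
        /// cannot clash with its own WHERE/ORDER BY/LIMIT clauses.
        return "SELECT * FROM (" + query + ") WHERE " + keys_condition + ";";

    /// Placeholder present: splice the condition in where the user asked.
    std::string result = query;
    result.replace(pos, placeholder.size(), keys_condition);
    return result;
}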

View File

@ -302,12 +302,14 @@ public:
struct LocalPathWithObjectStoragePaths struct LocalPathWithObjectStoragePaths
{ {
std::string local_path; std::string local_path;
std::string common_prefix_for_objects;
StoredObjects objects; StoredObjects objects;
LocalPathWithObjectStoragePaths( LocalPathWithObjectStoragePaths(
const std::string & local_path_, const std::string & common_prefix_for_objects_, StoredObjects && objects_) const std::string & local_path_,
: local_path(local_path_), common_prefix_for_objects(common_prefix_for_objects_), objects(std::move(objects_)) {} StoredObjects && objects_)
: local_path(local_path_)
, objects(std::move(objects_))
{}
}; };
virtual void getRemotePathsRecursive(const String &, std::vector<LocalPathWithObjectStoragePaths> &) virtual void getRemotePathsRecursive(const String &, std::vector<LocalPathWithObjectStoragePaths> &)

View File

@ -102,9 +102,9 @@ AzureObjectStorage::AzureObjectStorage(
data_source_description.is_encrypted = false; data_source_description.is_encrypted = false;
} }
std::string AzureObjectStorage::generateBlobNameForPath(const std::string & /* path */) ObjectStorageKey AzureObjectStorage::generateObjectKeyForPath(const std::string & /* path */) const
{ {
return getRandomASCIIString(32); return ObjectStorageKey::createAsRelative(getRandomASCIIString(32));
} }
bool AzureObjectStorage::exists(const StoredObject & object) const bool AzureObjectStorage::exists(const StoredObject & object) const
@ -320,18 +320,7 @@ void AzureObjectStorage::removeObjectsIfExist(const StoredObjects & objects)
auto client_ptr = client.get(); auto client_ptr = client.get();
for (const auto & object : objects) for (const auto & object : objects)
{ {
try removeObjectIfExists(object);
{
auto delete_info = client_ptr->DeleteBlob(object.remote_path);
}
catch (const Azure::Storage::StorageException & e)
{
/// If object doesn't exist...
if (e.StatusCode == Azure::Core::Http::HttpStatusCode::NotFound)
return;
tryLogCurrentException(__PRETTY_FUNCTION__);
throw;
}
} }
} }

View File

@ -121,7 +121,7 @@ public:
const std::string & config_prefix, const std::string & config_prefix,
ContextPtr context) override; ContextPtr context) override;
std::string generateBlobNameForPath(const std::string & path) override; ObjectStorageKey generateObjectKeyForPath(const std::string & path) const override;
bool isRemote() const override { return true; } bool isRemote() const override { return true; }

View File

@ -31,11 +31,12 @@ void registerDiskAzureBlobStorage(DiskFactory & factory, bool global_skip_access
getAzureBlobContainerClient(config, config_prefix), getAzureBlobContainerClient(config, config_prefix),
getAzureBlobStorageSettings(config, config_prefix, context)); getAzureBlobStorageSettings(config, config_prefix, context));
auto metadata_storage = std::make_shared<MetadataStorageFromDisk>(metadata_disk, ""); String key_prefix;
auto metadata_storage = std::make_shared<MetadataStorageFromDisk>(metadata_disk, key_prefix);
std::shared_ptr<IDisk> azure_blob_storage_disk = std::make_shared<DiskObjectStorage>( std::shared_ptr<IDisk> azure_blob_storage_disk = std::make_shared<DiskObjectStorage>(
name, name,
/* no namespaces */"", /* no namespaces */ key_prefix,
"DiskAzureBlobStorage", "DiskAzureBlobStorage",
std::move(metadata_storage), std::move(metadata_storage),
std::move(azure_object_storage), std::move(azure_object_storage),

View File

@ -42,9 +42,9 @@ FileCache::Key CachedObjectStorage::getCacheKey(const std::string & path) const
return cache->createKeyForPath(path); return cache->createKeyForPath(path);
} }
std::string CachedObjectStorage::generateBlobNameForPath(const std::string & path) ObjectStorageKey CachedObjectStorage::generateObjectKeyForPath(const std::string & path) const
{ {
return object_storage->generateBlobNameForPath(path); return object_storage->generateObjectKeyForPath(path);
} }
ReadSettings CachedObjectStorage::patchSettings(const ReadSettings & read_settings) const ReadSettings CachedObjectStorage::patchSettings(const ReadSettings & read_settings) const

View File

@ -92,7 +92,7 @@ public:
const std::string & getCacheName() const override { return cache_config_name; } const std::string & getCacheName() const override { return cache_config_name; }
std::string generateBlobNameForPath(const std::string & path) override; ObjectStorageKey generateObjectKeyForPath(const std::string & path) const override;
bool isRemote() const override { return object_storage->isRemote(); } bool isRemote() const override { return object_storage->isRemote(); }

View File

@ -48,14 +48,14 @@ DiskTransactionPtr DiskObjectStorage::createObjectStorageTransaction()
DiskObjectStorage::DiskObjectStorage( DiskObjectStorage::DiskObjectStorage(
const String & name_, const String & name_,
const String & object_storage_root_path_, const String & object_key_prefix_,
const String & log_name, const String & log_name,
MetadataStoragePtr metadata_storage_, MetadataStoragePtr metadata_storage_,
ObjectStoragePtr object_storage_, ObjectStoragePtr object_storage_,
const Poco::Util::AbstractConfiguration & config, const Poco::Util::AbstractConfiguration & config,
const String & config_prefix) const String & config_prefix)
: IDisk(name_, config, config_prefix) : IDisk(name_, config, config_prefix)
, object_storage_root_path(object_storage_root_path_) , object_key_prefix(object_key_prefix_)
, log (&Poco::Logger::get("DiskObjectStorage(" + log_name + ")")) , log (&Poco::Logger::get("DiskObjectStorage(" + log_name + ")"))
, metadata_storage(std::move(metadata_storage_)) , metadata_storage(std::move(metadata_storage_))
, object_storage(std::move(object_storage_)) , object_storage(std::move(object_storage_))
@ -80,7 +80,7 @@ void DiskObjectStorage::getRemotePathsRecursive(const String & local_path, std::
{ {
try try
{ {
paths_map.emplace_back(local_path, metadata_storage->getObjectStorageRootPath(), getStorageObjects(local_path)); paths_map.emplace_back(local_path, getStorageObjects(local_path));
} }
catch (const Exception & e) catch (const Exception & e)
{ {
@ -243,9 +243,9 @@ String DiskObjectStorage::getUniqueId(const String & path) const
bool DiskObjectStorage::checkUniqueId(const String & id) const bool DiskObjectStorage::checkUniqueId(const String & id) const
{ {
if (!id.starts_with(object_storage_root_path)) if (!id.starts_with(object_key_prefix))
{ {
LOG_DEBUG(log, "Blob with id {} doesn't start with blob storage prefix {}, Stack {}", id, object_storage_root_path, StackTrace().toString()); LOG_DEBUG(log, "Blob with id {} doesn't start with blob storage prefix {}, Stack {}", id, object_key_prefix, StackTrace().toString());
return false; return false;
} }
@ -470,7 +470,7 @@ DiskObjectStoragePtr DiskObjectStorage::createDiskObjectStorage()
const auto config_prefix = "storage_configuration.disks." + name; const auto config_prefix = "storage_configuration.disks." + name;
return std::make_shared<DiskObjectStorage>( return std::make_shared<DiskObjectStorage>(
getName(), getName(),
object_storage_root_path, object_key_prefix,
getName(), getName(),
metadata_storage, metadata_storage,
object_storage, object_storage,
@ -586,7 +586,7 @@ void DiskObjectStorage::restoreMetadataIfNeeded(
{ {
metadata_helper->restore(config, config_prefix, context); metadata_helper->restore(config, config_prefix, context);
auto current_schema_version = metadata_helper->readSchemaVersion(object_storage.get(), object_storage_root_path); auto current_schema_version = metadata_helper->readSchemaVersion(object_storage.get(), object_key_prefix);
if (current_schema_version < DiskObjectStorageRemoteMetadataRestoreHelper::RESTORABLE_SCHEMA_VERSION) if (current_schema_version < DiskObjectStorageRemoteMetadataRestoreHelper::RESTORABLE_SCHEMA_VERSION)
metadata_helper->migrateToRestorableSchema(); metadata_helper->migrateToRestorableSchema();

View File

@ -37,7 +37,7 @@ friend class DiskObjectStorageRemoteMetadataRestoreHelper;
public: public:
DiskObjectStorage( DiskObjectStorage(
const String & name_, const String & name_,
const String & object_storage_root_path_, const String & object_key_prefix_,
const String & log_name, const String & log_name,
MetadataStoragePtr metadata_storage_, MetadataStoragePtr metadata_storage_,
ObjectStoragePtr object_storage_, ObjectStoragePtr object_storage_,
@ -224,7 +224,7 @@ private:
String getReadResourceName() const; String getReadResourceName() const;
String getWriteResourceName() const; String getWriteResourceName() const;
const String object_storage_root_path; const String object_key_prefix;
Poco::Logger * log; Poco::Logger * log;
MetadataStoragePtr metadata_storage; MetadataStoragePtr metadata_storage;

View File

@ -7,6 +7,8 @@
#include <IO/WriteBufferFromFileBase.h> #include <IO/WriteBufferFromFileBase.h>
#include <Common/logger_useful.h> #include <Common/logger_useful.h>
#include <Interpreters/Context.h>
namespace DB namespace DB
{ {
@ -17,44 +19,57 @@ namespace ErrorCodes
void DiskObjectStorageMetadata::deserialize(ReadBuffer & buf) void DiskObjectStorageMetadata::deserialize(ReadBuffer & buf)
{ {
UInt32 version;
readIntText(version, buf); readIntText(version, buf);
if (version < VERSION_ABSOLUTE_PATHS || version > VERSION_INLINE_DATA) if (version < VERSION_ABSOLUTE_PATHS || version > VERSION_FULL_OBJECT_KEY)
throw Exception( throw Exception(
ErrorCodes::UNKNOWN_FORMAT, ErrorCodes::UNKNOWN_FORMAT,
"Unknown metadata file version. Path: {}. Version: {}. Maximum expected version: {}", "Unknown metadata file version. Path: {}. Version: {}. Maximum expected version: {}",
common_metadata_path + metadata_file_path, toString(version), toString(VERSION_READ_ONLY_FLAG)); metadata_file_path, toString(version), toString(VERSION_FULL_OBJECT_KEY));
assertChar('\n', buf); assertChar('\n', buf);
UInt32 storage_objects_count; UInt32 keys_count;
readIntText(storage_objects_count, buf); readIntText(keys_count, buf);
assertChar('\t', buf); assertChar('\t', buf);
keys_with_meta.resize(keys_count);
readIntText(total_size, buf); readIntText(total_size, buf);
assertChar('\n', buf); assertChar('\n', buf);
storage_objects.resize(storage_objects_count);
for (size_t i = 0; i < storage_objects_count; ++i) for (UInt32 i = 0; i < keys_count; ++i)
{ {
String object_relative_path; UInt64 object_size;
size_t object_size;
readIntText(object_size, buf); readIntText(object_size, buf);
assertChar('\t', buf); assertChar('\t', buf);
readEscapedString(object_relative_path, buf);
if (version == VERSION_ABSOLUTE_PATHS)
{
if (!object_relative_path.starts_with(object_storage_root_path))
throw Exception(ErrorCodes::UNKNOWN_FORMAT,
"Path in metadata does not correspond to root path. Path: {}, root path: {}, disk path: {}",
object_relative_path, object_storage_root_path, common_metadata_path);
object_relative_path = object_relative_path.substr(object_storage_root_path.size()); keys_with_meta[i].metadata.size_bytes = object_size;
}
String key_value;
readEscapedString(key_value, buf);
assertChar('\n', buf); assertChar('\n', buf);
storage_objects[i].relative_path = object_relative_path; if (version == VERSION_ABSOLUTE_PATHS)
storage_objects[i].metadata.size_bytes = object_size; {
if (!key_value.starts_with(compatible_key_prefix))
throw Exception(
ErrorCodes::UNKNOWN_FORMAT,
"Path in metadata does not correspond to root path. Path: {}, root path: {}, disk path: {}",
key_value,
compatible_key_prefix,
metadata_file_path);
keys_with_meta[i].key = ObjectStorageKey::createAsRelative(
compatible_key_prefix, key_value.substr(compatible_key_prefix.size()));
}
else if (version < VERSION_FULL_OBJECT_KEY)
{
keys_with_meta[i].key = ObjectStorageKey::createAsRelative(compatible_key_prefix, key_value);
}
else if (version >= VERSION_FULL_OBJECT_KEY)
{
keys_with_meta[i].key = ObjectStorageKey::createAsAbsolute(key_value);
}
} }
readIntText(ref_count, buf); readIntText(ref_count, buf);
@ -73,7 +88,7 @@ void DiskObjectStorageMetadata::deserialize(ReadBuffer & buf)
} }
} }
void DiskObjectStorageMetadata::deserializeFromString(const std::string & data) void DiskObjectStorageMetadata::deserializeFromString(const String & data)
{ {
ReadBufferFromString buf(data); ReadBufferFromString buf(data);
deserialize(buf); deserialize(buf);
@ -81,21 +96,55 @@ void DiskObjectStorageMetadata::deserializeFromString(const std::string & data)
void DiskObjectStorageMetadata::serialize(WriteBuffer & buf, bool sync) const void DiskObjectStorageMetadata::serialize(WriteBuffer & buf, bool sync) const
{ {
writeIntText(VERSION_INLINE_DATA, buf); /// These are the changes for backward compatibility
/// No new file should be written as VERSION_FULL_OBJECT_KEY until the storage_metadata_write_full_object_key setting is enabled.
/// However, even after a rollback of that setting, once a file has been written as VERSION_FULL_OBJECT_KEY
/// it must always be rewritten as VERSION_FULL_OBJECT_KEY
bool storage_metadata_write_full_object_key = getWriteFullObjectKeySetting();
if (version == VERSION_FULL_OBJECT_KEY && !storage_metadata_write_full_object_key)
{
Poco::Logger * logger = &Poco::Logger::get("DiskObjectStorageMetadata");
LOG_WARNING(
logger,
"Metadata file {} is written with VERSION_FULL_OBJECT_KEY version"
"However storage_metadata_write_full_object_key is off.",
metadata_file_path);
}
UInt32 write_version = version;
if (storage_metadata_write_full_object_key)
write_version = VERSION_FULL_OBJECT_KEY;
chassert(write_version >= VERSION_ABSOLUTE_PATHS && write_version <= VERSION_FULL_OBJECT_KEY);
writeIntText(write_version, buf);
writeChar('\n', buf); writeChar('\n', buf);
writeIntText(storage_objects.size(), buf); writeIntText(keys_with_meta.size(), buf);
writeChar('\t', buf); writeChar('\t', buf);
writeIntText(total_size, buf); writeIntText(total_size, buf);
writeChar('\n', buf); writeChar('\n', buf);
for (const auto & [object_relative_path, object_metadata] : storage_objects) for (const auto & [object_key, object_meta] : keys_with_meta)
{ {
writeIntText(object_metadata.size_bytes, buf); writeIntText(object_meta.size_bytes, buf);
writeChar('\t', buf); writeChar('\t', buf);
writeEscapedString(object_relative_path, buf);
writeChar('\n', buf); if (write_version == VERSION_FULL_OBJECT_KEY)
{
/// if the metadata file has VERSION_FULL_OBJECT_KEY version
/// all keys inside are written as absolute paths
writeEscapedString(object_key.serialize(), buf);
writeChar('\n', buf);
}
else
{
/// otherwise keys are written as relative paths
/// therefore keys have to have suffix and prefix
writeEscapedString(object_key.getSuffix(), buf);
writeChar('\n', buf);
}
} }
writeIntText(ref_count, buf); writeIntText(ref_count, buf);
@ -104,11 +153,6 @@ void DiskObjectStorageMetadata::serialize(WriteBuffer & buf, bool sync) const
writeBoolText(read_only, buf); writeBoolText(read_only, buf);
writeChar('\n', buf); writeChar('\n', buf);
/// Metadata version describes the format of the file
/// It determines the possibility of writing and reading a particular set of fields from the file, no matter the fields' values.
/// It should not be dependent on field values.
/// We always write inline_data in the file when we declare VERSION_INLINE_DATA as a file version,
/// unless it is impossible to introduce the next version of the format.
writeEscapedString(inline_data, buf); writeEscapedString(inline_data, buf);
writeChar('\n', buf); writeChar('\n', buf);
@ -117,7 +161,7 @@ void DiskObjectStorageMetadata::serialize(WriteBuffer & buf, bool sync) const
buf.sync(); buf.sync();
} }
std::string DiskObjectStorageMetadata::serializeToString() const String DiskObjectStorageMetadata::serializeToString() const
{ {
WriteBufferFromOwnString result; WriteBufferFromOwnString result;
serialize(result, false); serialize(result, false);
@ -126,20 +170,44 @@ std::string DiskObjectStorageMetadata::serializeToString() const
/// Load metadata by path or create empty if `create` flag is set. /// Load metadata by path or create empty if `create` flag is set.
DiskObjectStorageMetadata::DiskObjectStorageMetadata( DiskObjectStorageMetadata::DiskObjectStorageMetadata(
const std::string & common_metadata_path_, String compatible_key_prefix_,
const String & object_storage_root_path_, String metadata_file_path_)
const String & metadata_file_path_) : compatible_key_prefix(std::move(compatible_key_prefix_))
: common_metadata_path(common_metadata_path_) , metadata_file_path(std::move(metadata_file_path_))
, object_storage_root_path(object_storage_root_path_)
, metadata_file_path(metadata_file_path_)
{ {
} }
void DiskObjectStorageMetadata::addObject(const String & path, size_t size) void DiskObjectStorageMetadata::addObject(ObjectStorageKey key, size_t size)
{ {
if (!key.hasPrefix())
{
version = VERSION_FULL_OBJECT_KEY;
bool storage_metadata_write_full_object_key = getWriteFullObjectKeySetting();
if (!storage_metadata_write_full_object_key)
{
Poco::Logger * logger = &Poco::Logger::get("DiskObjectStorageMetadata");
LOG_WARNING(
logger,
"Metadata file {} has at least one key {} without fixed common key prefix."
"That forces using VERSION_FULL_OBJECT_KEY version for that metadata file."
"However storage_metadata_write_full_object_key is off.",
metadata_file_path,
key.serialize());
}
}
total_size += size; total_size += size;
storage_objects.emplace_back(path, ObjectMetadata{size, {}, {}}); keys_with_meta.emplace_back(std::move(key), ObjectMetadata{size, {}, {}});
} }
bool DiskObjectStorageMetadata::getWriteFullObjectKeySetting()
{
#ifndef CLICKHOUSE_KEEPER_STANDALONE_BUILD
return Context::getGlobalContextInstance()->getSettings().storage_metadata_write_full_object_key;
#else
return false;
#endif
}
} }
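
Tying the pieces together: addObject upgrades the in-memory version to VERSION_FULL_OBJECT_KEY as soon as a single key arrives without a common prefix, and serialize then picks the written version from that state plus the server-side storage_metadata_write_full_object_key setting. A sketch of building one metadata file in memory (runnable only inside the server, where the global context behind getWriteFullObjectKeySetting exists; header paths assumed):

#include <Common/ObjectStorageKey.h>
#include <Disks/ObjectStorages/DiskObjectStorageMetadata.h>
#include <IO/WriteBufferFromOwnString.h>

using namespace DB;

String buildMetadataFile()
{
    DiskObjectStorageMetadata meta(/* compatible_key_prefix */ "store/abc/", /* metadata_file_path */ "data/db/table/col.bin");
    meta.addObject(ObjectStorageKey::createAsRelative("store/abc/", "rnd_suffix"), 1024);

    WriteBufferFromOwnString out;
    meta.serialize(out, /* sync */ false);
    /// With storage_metadata_write_full_object_key = 0 the buffer now holds:
    ///   4                     <- VERSION_INLINE_DATA
    ///   1<TAB>1024            <- object count, total size in bytes
    ///   1024<TAB>rnd_suffix   <- per object: size, then key suffix (a full key under version 5)
    ///   0                     <- ref_count
    ///   0                     <- read_only flag
    ///   (empty line)          <- inline_data
    return out.str();
}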

View File

@ -13,29 +13,30 @@ struct DiskObjectStorageMetadata
{ {
private: private:
/// Metadata file version. /// Metadata file version.
static constexpr uint32_t VERSION_ABSOLUTE_PATHS = 1; static constexpr UInt32 VERSION_ABSOLUTE_PATHS = 1;
static constexpr uint32_t VERSION_RELATIVE_PATHS = 2; static constexpr UInt32 VERSION_RELATIVE_PATHS = 2;
static constexpr uint32_t VERSION_READ_ONLY_FLAG = 3; static constexpr UInt32 VERSION_READ_ONLY_FLAG = 3;
static constexpr uint32_t VERSION_INLINE_DATA = 4; static constexpr UInt32 VERSION_INLINE_DATA = 4;
static constexpr UInt32 VERSION_FULL_OBJECT_KEY = 5; /// only for reading data
const std::string & common_metadata_path; UInt32 version = VERSION_INLINE_DATA;
/// Relative paths of blobs. /// Absolute paths of blobs
RelativePathsWithMetadata storage_objects; ObjectKeysWithMetadata keys_with_meta;
const std::string object_storage_root_path; const std::string compatible_key_prefix;
/// Relative path to metadata file on local FS. /// Relative path to metadata file on local FS.
const std::string metadata_file_path; const std::string metadata_file_path;
/// Total size of all remote FS (S3, HDFS) objects. /// Total size of all remote FS (S3, HDFS) objects.
size_t total_size = 0; UInt64 total_size = 0;
/// Number of references (hardlinks) to this metadata file. /// Number of references (hardlinks) to this metadata file.
/// ///
/// FIXME: Why we are tracking it explicitly, without /// FIXME: Why we are tracking it explicitly, without
/// info from filesystem???? /// info from filesystem????
uint32_t ref_count = 0; UInt32 ref_count = 0;
/// Flag indicates that file is read only. /// Flag indicates that file is read only.
bool read_only = false; bool read_only = false;
@ -46,11 +47,11 @@ private:
public: public:
DiskObjectStorageMetadata( DiskObjectStorageMetadata(
const std::string & common_metadata_path_, String compatible_key_prefix_,
const std::string & object_storage_root_path_, String metadata_file_path_);
const std::string & metadata_file_path_);
void addObject(ObjectStorageKey key, size_t size);
void addObject(const std::string & path, size_t size);
void deserialize(ReadBuffer & buf); void deserialize(ReadBuffer & buf);
void deserializeFromString(const std::string & data); void deserializeFromString(const std::string & data);
@ -58,14 +59,9 @@ public:
void serialize(WriteBuffer & buf, bool sync) const; void serialize(WriteBuffer & buf, bool sync) const;
std::string serializeToString() const; std::string serializeToString() const;
std::string getBlobsCommonPrefix() const const ObjectKeysWithMetadata & getKeysWithMeta() const
{ {
return object_storage_root_path; return keys_with_meta;
}
RelativePathsWithMetadata getBlobsRelativePaths() const
{
return storage_objects;
} }
bool isReadOnly() const bool isReadOnly() const
@ -73,12 +69,12 @@ public:
return read_only; return read_only;
} }
uint32_t getRefCount() const UInt32 getRefCount() const
{ {
return ref_count; return ref_count;
} }
uint64_t getTotalSizeBytes() const UInt64 getTotalSizeBytes() const
{ {
return total_size; return total_size;
} }
@ -112,6 +108,8 @@ public:
{ {
return inline_data; return inline_data;
} }
static bool getWriteFullObjectKeySetting();
}; };
using DiskObjectStorageMetadataPtr = std::unique_ptr<DiskObjectStorageMetadata>; using DiskObjectStorageMetadataPtr = std::unique_ptr<DiskObjectStorageMetadata>;
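As a usage sketch of the rewritten interface (the prefixes, paths and sizes below are invented):

    /// Sketch: constructing metadata and registering one blob under the common prefix.
    auto metadata = std::make_unique<DiskObjectStorageMetadata>(
        /* compatible_key_prefix_ */ "data/store/",
        /* metadata_file_path_    */ "store/abc/file.bin");
    metadata->addObject(ObjectStorageKey::createAsRelative("data/store/", "xyz/rnd32"), 1024);
    /// A key whose hasPrefix() is false would flip the version to
    /// VERSION_FULL_OBJECT_KEY and, with storage_metadata_write_full_object_key
    /// off, emit the warning shown in the .cpp hunk above.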


@ -34,7 +34,7 @@ void DiskObjectStorageRemoteMetadataRestoreHelper::createFileOperationObject(
const String & operation_name, UInt64 revision, const ObjectAttributes & metadata) const const String & operation_name, UInt64 revision, const ObjectAttributes & metadata) const
{ {
const String relative_path = "operations/r" + revisionToString(revision) + operation_log_suffix + "-" + operation_name; const String relative_path = "operations/r" + revisionToString(revision) + operation_log_suffix + "-" + operation_name;
StoredObject object(fs::path(disk->object_storage_root_path) / relative_path); StoredObject object(fs::path(disk->object_key_prefix) / relative_path);
auto buf = disk->object_storage->writeObject(object, WriteMode::Rewrite, metadata); auto buf = disk->object_storage->writeObject(object, WriteMode::Rewrite, metadata);
buf->write('0'); buf->write('0');
buf->finalize(); buf->finalize();
@ -52,8 +52,8 @@ void DiskObjectStorageRemoteMetadataRestoreHelper::findLastRevision()
LOG_TRACE(disk->log, "Check object exists with revision prefix {}", revision_prefix); LOG_TRACE(disk->log, "Check object exists with revision prefix {}", revision_prefix);
const auto & object_storage = disk->object_storage; const auto & object_storage = disk->object_storage;
StoredObject revision_object{disk->object_storage_root_path + "r" + revision_prefix}; StoredObject revision_object{disk->object_key_prefix + "r" + revision_prefix};
StoredObject revision_operation_object{disk->object_storage_root_path + "operations/r" + revision_prefix}; StoredObject revision_operation_object{disk->object_key_prefix + "operations/r" + revision_prefix};
/// Check file or operation with such revision prefix exists. /// Check file or operation with such revision prefix exists.
if (object_storage->exists(revision_object) || object_storage->exists(revision_operation_object)) if (object_storage->exists(revision_object) || object_storage->exists(revision_operation_object))
@ -80,7 +80,7 @@ int DiskObjectStorageRemoteMetadataRestoreHelper::readSchemaVersion(IObjectStora
void DiskObjectStorageRemoteMetadataRestoreHelper::saveSchemaVersion(const int & version) const void DiskObjectStorageRemoteMetadataRestoreHelper::saveSchemaVersion(const int & version) const
{ {
StoredObject object{fs::path(disk->object_storage_root_path) / SCHEMA_VERSION_OBJECT}; StoredObject object{fs::path(disk->object_key_prefix) / SCHEMA_VERSION_OBJECT};
auto buf = disk->object_storage->writeObject(object, WriteMode::Rewrite, /* attributes= */ {}, /* buf_size= */ DBMS_DEFAULT_BUFFER_SIZE, write_settings); auto buf = disk->object_storage->writeObject(object, WriteMode::Rewrite, /* attributes= */ {}, /* buf_size= */ DBMS_DEFAULT_BUFFER_SIZE, write_settings);
writeIntText(version, *buf); writeIntText(version, *buf);
@ -187,7 +187,7 @@ void DiskObjectStorageRemoteMetadataRestoreHelper::restore(const Poco::Util::Abs
try try
{ {
RestoreInformation information; RestoreInformation information;
information.source_path = disk->object_storage_root_path; information.source_path = disk->object_key_prefix;
information.source_namespace = disk->object_storage->getObjectsNamespace(); information.source_namespace = disk->object_storage->getObjectsNamespace();
readRestoreInformation(information); readRestoreInformation(information);
@ -201,11 +201,11 @@ void DiskObjectStorageRemoteMetadataRestoreHelper::restore(const Poco::Util::Abs
{ {
/// In this case we need to additionally cleanup S3 from objects with later revision. /// In this case we need to additionally cleanup S3 from objects with later revision.
/// It will simply be restored to a different path. /// It will simply be restored to a different path.
if (information.source_path == disk->object_storage_root_path && information.revision != LATEST_REVISION) if (information.source_path == disk->object_key_prefix && information.revision != LATEST_REVISION)
throw Exception(ErrorCodes::BAD_ARGUMENTS, "Restoring to the same bucket and path is allowed if revision is latest (0)"); throw Exception(ErrorCodes::BAD_ARGUMENTS, "Restoring to the same bucket and path is allowed if revision is latest (0)");
/// This case complicates S3 cleanup in case of unsuccessful restore. /// This case complicates S3 cleanup in case of unsuccessful restore.
if (information.source_path != disk->object_storage_root_path && disk->object_storage_root_path.starts_with(information.source_path)) if (information.source_path != disk->object_key_prefix && disk->object_key_prefix.starts_with(information.source_path))
throw Exception( throw Exception(
ErrorCodes::BAD_ARGUMENTS, ErrorCodes::BAD_ARGUMENTS,
"Restoring to the same bucket is allowed only if source path is not a sub-path of configured path in S3 disk"); "Restoring to the same bucket is allowed only if source path is not a sub-path of configured path in S3 disk");
@ -224,7 +224,7 @@ void DiskObjectStorageRemoteMetadataRestoreHelper::restore(const Poco::Util::Abs
LOG_INFO(disk->log, "Removing old metadata..."); LOG_INFO(disk->log, "Removing old metadata...");
bool cleanup_s3 = information.source_path != disk->object_storage_root_path; bool cleanup_s3 = information.source_path != disk->object_key_prefix;
for (const auto & root : data_roots) for (const auto & root : data_roots)
if (disk->exists(root)) if (disk->exists(root))
disk->removeSharedRecursive(root + '/', !cleanup_s3, {}); disk->removeSharedRecursive(root + '/', !cleanup_s3, {});
@ -424,18 +424,17 @@ void DiskObjectStorageRemoteMetadataRestoreHelper::processRestoreFiles(
continue; continue;
disk->createDirectories(directoryPath(path)); disk->createDirectories(directoryPath(path));
auto relative_key = shrinkKey(source_path, key); auto object_key = ObjectStorageKey::createAsRelative(disk->object_key_prefix, shrinkKey(source_path, key));
auto full_path = fs::path(disk->object_storage_root_path) / relative_key;
StoredObject object_from{key}; StoredObject object_from{key};
StoredObject object_to{fs::path(disk->object_storage_root_path) / relative_key}; StoredObject object_to{object_key.serialize()};
/// Copy object if we restore to different bucket / path. /// Copy object if we restore to different bucket / path.
if (source_object_storage->getObjectsNamespace() != disk->object_storage->getObjectsNamespace() || disk->object_storage_root_path != source_path) if (source_object_storage->getObjectsNamespace() != disk->object_storage->getObjectsNamespace() || disk->object_key_prefix != source_path)
source_object_storage->copyObjectToAnotherObjectStorage(object_from, object_to, read_settings, write_settings, *disk->object_storage); source_object_storage->copyObjectToAnotherObjectStorage(object_from, object_to, read_settings, write_settings, *disk->object_storage);
auto tx = disk->metadata_storage->createTransaction(); auto tx = disk->metadata_storage->createTransaction();
tx->addBlobToMetadata(path, relative_key, meta.size_bytes); tx->addBlobToMetadata(path, object_key, meta.size_bytes);
tx->commit(); tx->commit();
LOG_TRACE(disk->log, "Restored file {}", path); LOG_TRACE(disk->log, "Restored file {}", path);
@ -464,7 +463,7 @@ void DiskObjectStorageRemoteMetadataRestoreHelper::restoreFileOperations(IObject
{ {
/// Enable recording file operations if we restore to different bucket / path. /// Enable recording file operations if we restore to different bucket / path.
bool send_metadata = source_object_storage->getObjectsNamespace() != disk->object_storage->getObjectsNamespace() bool send_metadata = source_object_storage->getObjectsNamespace() != disk->object_storage->getObjectsNamespace()
|| disk->object_storage_root_path != restore_information.source_path; || disk->object_key_prefix != restore_information.source_path;
std::set<String> renames; std::set<String> renames;
auto restore_file_operations = [this, &source_object_storage, &restore_information, &renames, &send_metadata](const RelativePathsWithMetadata & objects) auto restore_file_operations = [this, &source_object_storage, &restore_information, &renames, &send_metadata](const RelativePathsWithMetadata & objects)


@ -25,6 +25,7 @@ namespace ErrorCodes
extern const int BAD_FILE_TYPE; extern const int BAD_FILE_TYPE;
extern const int FILE_ALREADY_EXISTS; extern const int FILE_ALREADY_EXISTS;
extern const int CANNOT_PARSE_INPUT_ASSERTION_FAILED; extern const int CANNOT_PARSE_INPUT_ASSERTION_FAILED;
extern const int LOGICAL_ERROR;
} }
DiskObjectStorageTransaction::DiskObjectStorageTransaction( DiskObjectStorageTransaction::DiskObjectStorageTransaction(
@ -511,12 +512,12 @@ struct CopyFileObjectStorageOperation final : public IDiskObjectStorageOperation
for (const auto & object_from : source_blobs) for (const auto & object_from : source_blobs)
{ {
std::string blob_name = object_storage.generateBlobNameForPath(to_path); auto object_key = object_storage.generateObjectKeyForPath(to_path);
auto object_to = StoredObject(fs::path(metadata_storage.getObjectStorageRootPath()) / blob_name); auto object_to = StoredObject(object_key.serialize());
object_storage.copyObject(object_from, object_to, read_settings, write_settings); object_storage.copyObject(object_from, object_to, read_settings, write_settings);
tx->addBlobToMetadata(to_path, blob_name, object_from.bytes_size); tx->addBlobToMetadata(to_path, object_key, object_from.bytes_size);
created_objects.push_back(object_to); created_objects.push_back(object_to);
} }
@ -663,46 +664,53 @@ std::unique_ptr<WriteBufferFromFileBase> DiskObjectStorageTransaction::writeFile
const WriteSettings & settings, const WriteSettings & settings,
bool autocommit) bool autocommit)
{ {
String blob_name; auto object_key = object_storage.generateObjectKeyForPath(path);
std::optional<ObjectAttributes> object_attributes; std::optional<ObjectAttributes> object_attributes;
blob_name = object_storage.generateBlobNameForPath(path);
if (metadata_helper) if (metadata_helper)
{ {
if (!object_key.hasPrefix())
throw Exception(ErrorCodes::LOGICAL_ERROR, "metadata helper is not supported with absolute paths");
auto revision = metadata_helper->revision_counter + 1; auto revision = metadata_helper->revision_counter + 1;
metadata_helper->revision_counter++; metadata_helper->revision_counter++;
object_attributes = { object_attributes = {
{"path", path} {"path", path}
}; };
blob_name = "r" + revisionToString(revision) + "-file-" + blob_name;
object_key = ObjectStorageKey::createAsRelative(
object_key.getPrefix(),
"r" + revisionToString(revision) + "-file-" + object_key.getSuffix());
} }
auto object = StoredObject(fs::path(metadata_storage.getObjectStorageRootPath()) / blob_name); /// seems ok
auto write_operation = std::make_unique<WriteFileObjectStorageOperation>(object_storage, metadata_storage, object); auto object = StoredObject(object_key.serialize());
std::function<void(size_t count)> create_metadata_callback; std::function<void(size_t count)> create_metadata_callback;
if (autocommit) if (autocommit)
{ {
create_metadata_callback = [tx = shared_from_this(), mode, path, blob_name](size_t count) create_metadata_callback = [tx = shared_from_this(), mode, path, key_ = std::move(object_key)](size_t count)
{ {
if (mode == WriteMode::Rewrite) if (mode == WriteMode::Rewrite)
{ {
// Otherwise we will produce lost blobs which nobody points to /// Otherwise we will produce lost blobs which nobody points to
/// WriteOnce storages are not affected by the issue /// WriteOnce storages are not affected by the issue
if (!tx->object_storage.isWriteOnce() && tx->metadata_storage.exists(path)) if (!tx->object_storage.isWriteOnce() && tx->metadata_storage.exists(path))
tx->object_storage.removeObjectsIfExist(tx->metadata_storage.getStorageObjects(path)); tx->object_storage.removeObjectsIfExist(tx->metadata_storage.getStorageObjects(path));
tx->metadata_transaction->createMetadataFile(path, blob_name, count); tx->metadata_transaction->createMetadataFile(path, key_, count);
} }
else else
tx->metadata_transaction->addBlobToMetadata(path, blob_name, count); tx->metadata_transaction->addBlobToMetadata(path, key_, count);
tx->metadata_transaction->commit(); tx->metadata_transaction->commit();
}; };
} }
else else
{ {
create_metadata_callback = [object_storage_tx = shared_from_this(), write_op = write_operation.get(), mode, path, blob_name](size_t count) auto write_operation = std::make_unique<WriteFileObjectStorageOperation>(object_storage, metadata_storage, object);
create_metadata_callback = [object_storage_tx = shared_from_this(), write_op = write_operation.get(), mode, path, key_ = std::move(object_key)](size_t count)
{ {
/// This callback called in WriteBuffer finalize method -- only there we actually know /// This callback called in WriteBuffer finalize method -- only there we actually know
/// how many bytes were written. We don't control when this finalize method will be called /// how many bytes were written. We don't control when this finalize method will be called
@ -714,7 +722,7 @@ std::unique_ptr<WriteBufferFromFileBase> DiskObjectStorageTransaction::writeFile
/// ... /// ...
/// buf1->finalize() // shouldn't do anything with metadata operations, just memoize what to do /// buf1->finalize() // shouldn't do anything with metadata operations, just memoize what to do
/// tx->commit() /// tx->commit()
write_op->setOnExecute([object_storage_tx, mode, path, blob_name, count](MetadataTransactionPtr tx) write_op->setOnExecute([object_storage_tx, mode, path, key_, count](MetadataTransactionPtr tx)
{ {
if (mode == WriteMode::Rewrite) if (mode == WriteMode::Rewrite)
{ {
@ -726,15 +734,16 @@ std::unique_ptr<WriteBufferFromFileBase> DiskObjectStorageTransaction::writeFile
object_storage_tx->metadata_storage.getStorageObjects(path)); object_storage_tx->metadata_storage.getStorageObjects(path));
} }
tx->createMetadataFile(path, blob_name, count); tx->createMetadataFile(path, key_, count);
} }
else else
tx->addBlobToMetadata(path, blob_name, count); tx->addBlobToMetadata(path, key_, count);
}); });
}; };
operations_to_execute.emplace_back(std::move(write_operation));
} }
operations_to_execute.emplace_back(std::move(write_operation));
auto impl = object_storage.writeObject( auto impl = object_storage.writeObject(
object, object,
@ -753,20 +762,27 @@ void DiskObjectStorageTransaction::writeFileUsingBlobWritingFunction(
const String & path, WriteMode mode, WriteBlobFunction && write_blob_function) const String & path, WriteMode mode, WriteBlobFunction && write_blob_function)
{ {
/// This function is a simplified and adapted version of DiskObjectStorageTransaction::writeFile(). /// This function is a simplified and adapted version of DiskObjectStorageTransaction::writeFile().
auto blob_name = object_storage.generateBlobNameForPath(path); auto object_key = object_storage.generateObjectKeyForPath(path);
std::optional<ObjectAttributes> object_attributes; std::optional<ObjectAttributes> object_attributes;
if (metadata_helper) if (metadata_helper)
{ {
if (!object_key.hasPrefix())
throw Exception(ErrorCodes::LOGICAL_ERROR, "metadata helper is not supported with abs paths");
auto revision = metadata_helper->revision_counter + 1; auto revision = metadata_helper->revision_counter + 1;
metadata_helper->revision_counter++; metadata_helper->revision_counter++;
object_attributes = { object_attributes = {
{"path", path} {"path", path}
}; };
blob_name = "r" + revisionToString(revision) + "-file-" + blob_name;
object_key = ObjectStorageKey::createAsRelative(
object_key.getPrefix(),
"r" + revisionToString(revision) + "-file-" + object_key.getSuffix());
} }
auto object = StoredObject(fs::path(metadata_storage.getObjectStorageRootPath()) / blob_name); /// seems ok
auto object = StoredObject(object_key.serialize());
auto write_operation = std::make_unique<WriteFileObjectStorageOperation>(object_storage, metadata_storage, object); auto write_operation = std::make_unique<WriteFileObjectStorageOperation>(object_storage, metadata_storage, object);
operations_to_execute.emplace_back(std::move(write_operation)); operations_to_execute.emplace_back(std::move(write_operation));
@ -788,10 +804,10 @@ void DiskObjectStorageTransaction::writeFileUsingBlobWritingFunction(
if (!object_storage.isWriteOnce() && metadata_storage.exists(path)) if (!object_storage.isWriteOnce() && metadata_storage.exists(path))
object_storage.removeObjectsIfExist(metadata_storage.getStorageObjects(path)); object_storage.removeObjectsIfExist(metadata_storage.getStorageObjects(path));
metadata_transaction->createMetadataFile(path, blob_name, object_size); metadata_transaction->createMetadataFile(path, std::move(object_key), object_size);
} }
else else
metadata_transaction->addBlobToMetadata(path, blob_name, object_size); metadata_transaction->addBlobToMetadata(path, std::move(object_key), object_size);
} }


@ -28,9 +28,10 @@ void HDFSObjectStorage::startup()
{ {
} }
std::string HDFSObjectStorage::generateBlobNameForPath(const std::string & /* path */) ObjectStorageKey HDFSObjectStorage::generateObjectKeyForPath(const std::string & /* path */) const
{ {
return getRandomASCIIString(32); /// Whatever the data_source_description.description value is, treat that key as relative
return ObjectStorageKey::createAsRelative(data_source_description.description, getRandomASCIIString(32));
} }
bool HDFSObjectStorage::exists(const StoredObject & object) const bool HDFSObjectStorage::exists(const StoredObject & object) const
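Serialized, an HDFS key generated this way would look roughly like the following (the endpoint is an invented example):

    /// data_source_description.description is assumed to hold the HDFS root URI.
    auto key = ObjectStorageKey::createAsRelative(
        "hdfs://namenode:9000/clickhouse/", getRandomASCIIString(32));
    /// key.serialize() == "hdfs://namenode:9000/clickhouse/<32 random ASCII characters>"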


@ -114,7 +114,7 @@ public:
const std::string & config_prefix, const std::string & config_prefix,
ContextPtr context) override; ContextPtr context) override;
std::string generateBlobNameForPath(const std::string & path) override; ObjectStorageKey generateObjectKeyForPath(const std::string & path) const override;
bool isRemote() const override { return true; } bool isRemote() const override { return true; }


@ -126,10 +126,10 @@ public:
virtual void createEmptyMetadataFile(const std::string & path) = 0; virtual void createEmptyMetadataFile(const std::string & path) = 0;
/// Create a metadata file at the given path with content (blob_name, size_in_bytes) /// Create a metadata file at the given path with content (blob_name, size_in_bytes)
virtual void createMetadataFile(const std::string & path, const std::string & blob_name, uint64_t size_in_bytes) = 0; virtual void createMetadataFile(const std::string & path, ObjectStorageKey key, uint64_t size_in_bytes) = 0;
/// Add a new blob to the metadata file (a way to implement appends) /// Add a new blob to the metadata file (a way to implement appends)
virtual void addBlobToMetadata(const std::string & /* path */, const std::string & /* blob_name */, uint64_t /* size_in_bytes */) virtual void addBlobToMetadata(const std::string & /* path */, ObjectStorageKey /* key */, uint64_t /* size_in_bytes */)
{ {
throwNotImplemented(); throwNotImplemented();
} }
@ -221,8 +221,6 @@ public:
/// object_storage_path is absolute. /// object_storage_path is absolute.
virtual StoredObjects getStorageObjects(const std::string & path) const = 0; virtual StoredObjects getStorageObjects(const std::string & path) const = 0;
virtual std::string getObjectStorageRootPath() const = 0;
private: private:
[[noreturn]] static void throwNotImplemented() [[noreturn]] static void throwNotImplemented()
{ {


@ -1,6 +1,6 @@
#include <Disks/ObjectStorages/IObjectStorage.h> #include <Disks/ObjectStorages/IObjectStorage.h>
#include <Disks/IO/ThreadPoolRemoteFSReader.h> #include <Disks/IO/ThreadPoolRemoteFSReader.h>
#include <Common/getRandomASCIIString.h> #include <Common/Exception.h>
#include <IO/WriteBufferFromFileBase.h> #include <IO/WriteBufferFromFileBase.h>
#include <IO/copyData.h> #include <IO/copyData.h>
#include <IO/ReadBufferFromFileBase.h> #include <IO/ReadBufferFromFileBase.h>
@ -95,21 +95,4 @@ WriteSettings IObjectStorage::patchSettings(const WriteSettings & write_settings
return settings; return settings;
} }
std::string IObjectStorage::generateBlobNameForPath(const std::string & /* path */)
{
/// Path to store the new S3 object.
/// Total length is 32 a-z characters for enough randomness.
/// First 3 characters are used as a prefix for
/// https://aws.amazon.com/premiumsupport/knowledge-center/s3-object-key-naming-pattern/
constexpr size_t key_name_total_size = 32;
constexpr size_t key_name_prefix_size = 3;
/// Path to store new S3 object.
return fmt::format("{}/{}",
getRandomASCIIString(key_name_prefix_size),
getRandomASCIIString(key_name_total_size - key_name_prefix_size));
}
} }


@ -9,7 +9,6 @@
#include <Poco/Timestamp.h> #include <Poco/Timestamp.h>
#include <Poco/Util/AbstractConfiguration.h> #include <Poco/Util/AbstractConfiguration.h>
#include <Core/Defines.h> #include <Core/Defines.h>
#include <Common/Exception.h>
#include <IO/ReadSettings.h> #include <IO/ReadSettings.h>
#include <IO/WriteSettings.h> #include <IO/WriteSettings.h>
#include <IO/copyData.h> #include <IO/copyData.h>
@ -17,6 +16,7 @@
#include <Disks/ObjectStorages/StoredObject.h> #include <Disks/ObjectStorages/StoredObject.h>
#include <Disks/DiskType.h> #include <Disks/DiskType.h>
#include <Common/ThreadPool_fwd.h> #include <Common/ThreadPool_fwd.h>
#include <Common/ObjectStorageKey.h>
#include <Disks/WriteMode.h> #include <Disks/WriteMode.h>
#include <Interpreters/Context_fwd.h> #include <Interpreters/Context_fwd.h>
#include <Core/Types.h> #include <Core/Types.h>
@ -35,7 +35,7 @@ using ObjectAttributes = std::map<std::string, std::string>;
struct ObjectMetadata struct ObjectMetadata
{ {
uint64_t size_bytes; uint64_t size_bytes = 0;
std::optional<Poco::Timestamp> last_modified; std::optional<Poco::Timestamp> last_modified;
std::optional<ObjectAttributes> attributes; std::optional<ObjectAttributes> attributes;
}; };
@ -43,16 +43,31 @@ struct ObjectMetadata
struct RelativePathWithMetadata struct RelativePathWithMetadata
{ {
String relative_path; String relative_path;
ObjectMetadata metadata{}; ObjectMetadata metadata;
RelativePathWithMetadata() = default; RelativePathWithMetadata() = default;
RelativePathWithMetadata(const String & relative_path_, const ObjectMetadata & metadata_) RelativePathWithMetadata(String relative_path_, ObjectMetadata metadata_)
: relative_path(relative_path_), metadata(metadata_) : relative_path(std::move(relative_path_))
, metadata(std::move(metadata_))
{}
};
struct ObjectKeyWithMetadata
{
ObjectStorageKey key;
ObjectMetadata metadata;
ObjectKeyWithMetadata() = default;
ObjectKeyWithMetadata(ObjectStorageKey key_, ObjectMetadata metadata_)
: key(std::move(key_))
, metadata(std::move(metadata_))
{} {}
}; };
using RelativePathsWithMetadata = std::vector<RelativePathWithMetadata>; using RelativePathsWithMetadata = std::vector<RelativePathWithMetadata>;
using ObjectKeysWithMetadata = std::vector<ObjectKeyWithMetadata>;
class IObjectStorageIterator; class IObjectStorageIterator;
using ObjectStorageIteratorPtr = std::shared_ptr<IObjectStorageIterator>; using ObjectStorageIteratorPtr = std::shared_ptr<IObjectStorageIterator>;
@ -176,7 +191,7 @@ public:
/// Generate blob name for passed absolute local path. /// Generate blob name for passed absolute local path.
/// Path can be generated either independently or based on `path`. /// Path can be generated either independently or based on `path`.
virtual std::string generateBlobNameForPath(const std::string & path); virtual ObjectStorageKey generateObjectKeyForPath(const std::string & path) const = 0;
/// Get unique id for passed absolute path in object storage. /// Get unique id for passed absolute path in object storage.
virtual std::string getUniqueId(const std::string & path) const { return path; } virtual std::string getUniqueId(const std::string & path) const { return path; }
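Because generateObjectKeyForPath is now pure virtual, every IObjectStorage implementation must provide it. A hypothetical minimal override, with an invented class name and prefix:

    /// Sketch of a custom storage; all other IObjectStorage methods are elided.
    class MyObjectStorage : public IObjectStorage
    {
    public:
        ObjectStorageKey generateObjectKeyForPath(const std::string & path) const override
        {
            /// Derive the key directly from the logical path, relative to a fixed prefix.
            return ObjectStorageKey::createAsRelative("my-prefix", path);
        }
    };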


@ -24,8 +24,9 @@ namespace ErrorCodes
extern const int CANNOT_UNLINK; extern const int CANNOT_UNLINK;
} }
LocalObjectStorage::LocalObjectStorage() LocalObjectStorage::LocalObjectStorage(String key_prefix_)
: log(&Poco::Logger::get("LocalObjectStorage")) : key_prefix(std::move(key_prefix_))
, log(&Poco::Logger::get("LocalObjectStorage"))
{ {
data_source_description.type = DataSourceType::Local; data_source_description.type = DataSourceType::Local;
if (auto block_device_id = tryGetBlockDeviceId("/"); block_device_id.has_value()) if (auto block_device_id = tryGetBlockDeviceId("/"); block_device_id.has_value())
@ -200,10 +201,10 @@ void LocalObjectStorage::applyNewSettings(
{ {
} }
std::string LocalObjectStorage::generateBlobNameForPath(const std::string & /* path */) ObjectStorageKey LocalObjectStorage::generateObjectKeyForPath(const std::string & /* path */) const
{ {
constexpr size_t key_name_total_size = 32; constexpr size_t key_name_total_size = 32;
return getRandomASCIIString(key_name_total_size); return ObjectStorageKey::createAsRelative(key_prefix, getRandomASCIIString(key_name_total_size));
} }
} }


@ -16,7 +16,7 @@ namespace DB
class LocalObjectStorage : public IObjectStorage class LocalObjectStorage : public IObjectStorage
{ {
public: public:
LocalObjectStorage(); LocalObjectStorage(String key_prefix_);
DataSourceDescription getDataSourceDescription() const override { return data_source_description; } DataSourceDescription getDataSourceDescription() const override { return data_source_description; }
@ -78,13 +78,14 @@ public:
const std::string & config_prefix, const std::string & config_prefix,
ContextPtr context) override; ContextPtr context) override;
std::string generateBlobNameForPath(const std::string & path) override; ObjectStorageKey generateObjectKeyForPath(const std::string & path) const override;
bool isRemote() const override { return false; } bool isRemote() const override { return false; }
ReadSettings patchSettings(const ReadSettings & read_settings) const override; ReadSettings patchSettings(const ReadSettings & read_settings) const override;
private: private:
String key_prefix;
Poco::Logger * log; Poco::Logger * log;
DataSourceDescription data_source_description; DataSourceDescription data_source_description;
}; };


@ -20,23 +20,25 @@ void registerDiskLocalObjectStorage(DiskFactory & factory, bool global_skip_acce
ContextPtr context, ContextPtr context,
const DisksMap & /*map*/) -> DiskPtr const DisksMap & /*map*/) -> DiskPtr
{ {
String path; String object_key_prefix;
UInt64 keep_free_space_bytes; UInt64 keep_free_space_bytes;
loadDiskLocalConfig(name, config, config_prefix, context, path, keep_free_space_bytes); loadDiskLocalConfig(name, config, config_prefix, context, object_key_prefix, keep_free_space_bytes);
fs::create_directories(path); /// keys are mapped to the fs, object_key_prefix is a directory also
fs::create_directories(object_key_prefix);
String type = config.getString(config_prefix + ".type"); String type = config.getString(config_prefix + ".type");
chassert(type == "local_blob_storage"); chassert(type == "local_blob_storage");
std::shared_ptr<LocalObjectStorage> local_storage = std::make_shared<LocalObjectStorage>(); std::shared_ptr<LocalObjectStorage> local_storage = std::make_shared<LocalObjectStorage>(object_key_prefix);
MetadataStoragePtr metadata_storage; MetadataStoragePtr metadata_storage;
auto [metadata_path, metadata_disk] = prepareForLocalMetadata(name, config, config_prefix, context); auto [metadata_path, metadata_disk] = prepareForLocalMetadata(name, config, config_prefix, context);
metadata_storage = std::make_shared<MetadataStorageFromDisk>(metadata_disk, path); metadata_storage = std::make_shared<MetadataStorageFromDisk>(metadata_disk, object_key_prefix);
auto disk = std::make_shared<DiskObjectStorage>( auto disk = std::make_shared<DiskObjectStorage>(
name, path, "Local", metadata_storage, local_storage, config, config_prefix); name, object_key_prefix, "Local", metadata_storage, local_storage, config, config_prefix);
disk->startup(context, global_skip_access_check); disk->startup(context, global_skip_access_check);
return disk; return disk;
}; };
factory.registerDiskType("local_blob_storage", creator); factory.registerDiskType("local_blob_storage", creator);
} }
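For this disk type the key prefix doubles as a real directory, so a serialized key is also a valid local filesystem path (illustrative values):

    LocalObjectStorage local("/var/lib/clickhouse/disks/local_blob/");
    auto key = local.generateObjectKeyForPath("store/abc/file.bin");   /// the path is ignored here
    /// key.serialize() == "/var/lib/clickhouse/disks/local_blob/<32 random characters>"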


@ -15,9 +15,9 @@ namespace ErrorCodes
extern const int FS_METADATA_ERROR; extern const int FS_METADATA_ERROR;
} }
MetadataStorageFromDisk::MetadataStorageFromDisk(DiskPtr disk_, const std::string & object_storage_root_path_) MetadataStorageFromDisk::MetadataStorageFromDisk(DiskPtr disk_, String compatible_key_prefix_)
: disk(disk_) : disk(disk_)
, object_storage_root_path(object_storage_root_path_) , compatible_key_prefix(compatible_key_prefix_)
{ {
} }
@ -85,7 +85,7 @@ std::string MetadataStorageFromDisk::readInlineDataToString(const std::string &
DiskObjectStorageMetadataPtr MetadataStorageFromDisk::readMetadataUnlocked(const std::string & path, std::shared_lock<SharedMutex> &) const DiskObjectStorageMetadataPtr MetadataStorageFromDisk::readMetadataUnlocked(const std::string & path, std::shared_lock<SharedMutex> &) const
{ {
auto metadata = std::make_unique<DiskObjectStorageMetadata>(disk->getPath(), object_storage_root_path, path); auto metadata = std::make_unique<DiskObjectStorageMetadata>(compatible_key_prefix, path);
auto str = readFileToString(path); auto str = readFileToString(path);
metadata->deserializeFromString(str); metadata->deserializeFromString(str);
return metadata; return metadata;
@ -93,7 +93,7 @@ DiskObjectStorageMetadataPtr MetadataStorageFromDisk::readMetadataUnlocked(const
DiskObjectStorageMetadataPtr MetadataStorageFromDisk::readMetadataUnlocked(const std::string & path, std::unique_lock<SharedMutex> &) const DiskObjectStorageMetadataPtr MetadataStorageFromDisk::readMetadataUnlocked(const std::string & path, std::unique_lock<SharedMutex> &) const
{ {
auto metadata = std::make_unique<DiskObjectStorageMetadata>(disk->getPath(), object_storage_root_path, path); auto metadata = std::make_unique<DiskObjectStorageMetadata>(compatible_key_prefix, path);
auto str = readFileToString(path); auto str = readFileToString(path);
metadata->deserializeFromString(str); metadata->deserializeFromString(str);
return metadata; return metadata;
@ -135,21 +135,16 @@ MetadataTransactionPtr MetadataStorageFromDisk::createTransaction()
StoredObjects MetadataStorageFromDisk::getStorageObjects(const std::string & path) const StoredObjects MetadataStorageFromDisk::getStorageObjects(const std::string & path) const
{ {
auto metadata = readMetadata(path); auto metadata = readMetadata(path);
const auto & keys_with_meta = metadata->getKeysWithMeta();
auto object_storage_relative_paths = metadata->getBlobsRelativePaths(); /// Relative paths. StoredObjects objects;
objects.reserve(keys_with_meta.size());
StoredObjects object_storage_paths; for (const auto & [object_key, object_meta] : keys_with_meta)
object_storage_paths.reserve(object_storage_relative_paths.size());
/// Relative paths -> absolute.
for (auto & [object_relative_path, object_meta] : object_storage_relative_paths)
{ {
auto object_path = fs::path(metadata->getBlobsCommonPrefix()) / object_relative_path; objects.emplace_back(object_key.serialize(), object_meta.size_bytes, path);
StoredObject object{ object_path, object_meta.size_bytes, path };
object_storage_paths.push_back(object);
} }
return object_storage_paths; return objects;
} }
uint32_t MetadataStorageFromDisk::getHardlinkCount(const std::string & path) const uint32_t MetadataStorageFromDisk::getHardlinkCount(const std::string & path) const
@ -253,8 +248,7 @@ void MetadataStorageFromDiskTransaction::writeInlineDataToFile(
const std::string & path, const std::string & path,
const std::string & data) const std::string & data)
{ {
auto metadata = std::make_unique<DiskObjectStorageMetadata>( auto metadata = std::make_unique<DiskObjectStorageMetadata>(metadata_storage.compatible_key_prefix, path);
metadata_storage.getDisk()->getPath(), metadata_storage.getObjectStorageRootPath(), path);
metadata->setInlineData(data); metadata->setInlineData(data);
writeStringToFile(path, metadata->serializeToString()); writeStringToFile(path, metadata->serializeToString());
} }
@ -318,26 +312,23 @@ void MetadataStorageFromDiskTransaction::setReadOnly(const std::string & path)
void MetadataStorageFromDiskTransaction::createEmptyMetadataFile(const std::string & path) void MetadataStorageFromDiskTransaction::createEmptyMetadataFile(const std::string & path)
{ {
auto metadata = std::make_unique<DiskObjectStorageMetadata>( auto metadata = std::make_unique<DiskObjectStorageMetadata>(metadata_storage.compatible_key_prefix, path);
metadata_storage.getDisk()->getPath(), metadata_storage.getObjectStorageRootPath(), path);
writeStringToFile(path, metadata->serializeToString()); writeStringToFile(path, metadata->serializeToString());
} }
void MetadataStorageFromDiskTransaction::createMetadataFile(const std::string & path, const std::string & blob_name, uint64_t size_in_bytes) void MetadataStorageFromDiskTransaction::createMetadataFile(const std::string & path, ObjectStorageKey object_key, uint64_t size_in_bytes)
{ {
DiskObjectStorageMetadataPtr metadata = std::make_unique<DiskObjectStorageMetadata>( auto metadata = std::make_unique<DiskObjectStorageMetadata>(metadata_storage.compatible_key_prefix, path);
metadata_storage.getDisk()->getPath(), metadata_storage.getObjectStorageRootPath(), path); metadata->addObject(std::move(object_key), size_in_bytes);
metadata->addObject(blob_name, size_in_bytes);
auto data = metadata->serializeToString(); auto data = metadata->serializeToString();
if (!data.empty()) if (!data.empty())
addOperation(std::make_unique<WriteFileOperation>(path, *metadata_storage.getDisk(), data)); addOperation(std::make_unique<WriteFileOperation>(path, *metadata_storage.getDisk(), data));
} }
void MetadataStorageFromDiskTransaction::addBlobToMetadata(const std::string & path, const std::string & blob_name, uint64_t size_in_bytes) void MetadataStorageFromDiskTransaction::addBlobToMetadata(const std::string & path, ObjectStorageKey object_key, uint64_t size_in_bytes)
{ {
addOperation(std::make_unique<AddBlobOperation>(path, blob_name, metadata_storage.object_storage_root_path, size_in_bytes, *metadata_storage.disk, metadata_storage)); addOperation(std::make_unique<AddBlobOperation>(path, std::move(object_key), size_in_bytes, *metadata_storage.disk, metadata_storage));
} }
UnlinkMetadataFileOperationOutcomePtr MetadataStorageFromDiskTransaction::unlinkMetadata(const std::string & path) UnlinkMetadataFileOperationOutcomePtr MetadataStorageFromDiskTransaction::unlinkMetadata(const std::string & path)
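Taken together, the two transaction methods are used the way the writeFile() callback above does (usage sketch with invented path and size):

    auto tx = metadata_storage.createTransaction();
    auto key = object_storage.generateObjectKeyForPath("store/abc/file.bin");
    if (mode == WriteMode::Rewrite)
        tx->createMetadataFile("store/abc/file.bin", std::move(key), /* size_in_bytes */ 1024);
    else
        tx->addBlobToMetadata("store/abc/file.bin", std::move(key), /* size_in_bytes */ 1024);
    tx->commit();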


@ -22,12 +22,11 @@ private:
friend class MetadataStorageFromDiskTransaction; friend class MetadataStorageFromDiskTransaction;
mutable SharedMutex metadata_mutex; mutable SharedMutex metadata_mutex;
DiskPtr disk; DiskPtr disk;
std::string object_storage_root_path; String compatible_key_prefix;
public: public:
MetadataStorageFromDisk(DiskPtr disk_, const std::string & object_storage_root_path_); MetadataStorageFromDisk(DiskPtr disk_, String compatible_key_prefix);
MetadataTransactionPtr createTransaction() override; MetadataTransactionPtr createTransaction() override;
@ -67,8 +66,6 @@ public:
StoredObjects getStorageObjects(const std::string & path) const override; StoredObjects getStorageObjects(const std::string & path) const override;
std::string getObjectStorageRootPath() const override { return object_storage_root_path; }
DiskObjectStorageMetadataPtr readMetadata(const std::string & path) const; DiskObjectStorageMetadataPtr readMetadata(const std::string & path) const;
DiskObjectStorageMetadataPtr readMetadataUnlocked(const std::string & path, std::unique_lock<SharedMutex> & lock) const; DiskObjectStorageMetadataPtr readMetadataUnlocked(const std::string & path, std::unique_lock<SharedMutex> & lock) const;
@ -104,9 +101,9 @@ public:
void createEmptyMetadataFile(const std::string & path) override; void createEmptyMetadataFile(const std::string & path) override;
void createMetadataFile(const std::string & path, const std::string & blob_name, uint64_t size_in_bytes) override; void createMetadataFile(const std::string & path, ObjectStorageKey object_key, uint64_t size_in_bytes) override;
void addBlobToMetadata(const std::string & path, const std::string & blob_name, uint64_t size_in_bytes) override; void addBlobToMetadata(const std::string & path, ObjectStorageKey object_key, uint64_t size_in_bytes) override;
void setLastModified(const std::string & path, const Poco::Timestamp & timestamp) override; void setLastModified(const std::string & path, const Poco::Timestamp & timestamp) override;


@ -294,9 +294,9 @@ void AddBlobOperation::execute(std::unique_lock<SharedMutex> & metadata_lock)
if (metadata_storage.exists(path)) if (metadata_storage.exists(path))
metadata = metadata_storage.readMetadataUnlocked(path, metadata_lock); metadata = metadata_storage.readMetadataUnlocked(path, metadata_lock);
else else
metadata = std::make_unique<DiskObjectStorageMetadata>(disk.getPath(), root_path, path); metadata = std::make_unique<DiskObjectStorageMetadata>(disk.getPath(), path);
metadata->addObject(blob_name, size_in_bytes); metadata->addObject(object_key, size_in_bytes);
write_operation = std::make_unique<WriteFileOperation>(path, disk, metadata->serializeToString()); write_operation = std::make_unique<WriteFileOperation>(path, disk, metadata->serializeToString());


@ -216,14 +216,12 @@ struct AddBlobOperation final : public IMetadataOperation
{ {
AddBlobOperation( AddBlobOperation(
const std::string & path_, const std::string & path_,
const std::string & blob_name_, ObjectStorageKey object_key_,
const std::string & root_path_,
uint64_t size_in_bytes_, uint64_t size_in_bytes_,
IDisk & disk_, IDisk & disk_,
const MetadataStorageFromDisk & metadata_storage_) const MetadataStorageFromDisk & metadata_storage_)
: path(path_) : path(path_)
, blob_name(blob_name_) , object_key(std::move(object_key_))
, root_path(root_path_)
, size_in_bytes(size_in_bytes_) , size_in_bytes(size_in_bytes_)
, disk(disk_) , disk(disk_)
, metadata_storage(metadata_storage_) , metadata_storage(metadata_storage_)
@ -235,8 +233,7 @@ struct AddBlobOperation final : public IMetadataOperation
private: private:
std::string path; std::string path;
std::string blob_name; ObjectStorageKey object_key;
std::string root_path;
uint64_t size_in_bytes; uint64_t size_in_bytes;
IDisk & disk; IDisk & disk;
const MetadataStorageFromDisk & metadata_storage; const MetadataStorageFromDisk & metadata_storage;


@ -12,9 +12,9 @@ namespace DB
MetadataStorageFromPlainObjectStorage::MetadataStorageFromPlainObjectStorage( MetadataStorageFromPlainObjectStorage::MetadataStorageFromPlainObjectStorage(
ObjectStoragePtr object_storage_, ObjectStoragePtr object_storage_,
const std::string & object_storage_root_path_) String storage_path_prefix_)
: object_storage(object_storage_) : object_storage(object_storage_)
, object_storage_root_path(object_storage_root_path_) , storage_path_prefix(std::move(storage_path_prefix_))
{ {
} }
@ -25,19 +25,15 @@ MetadataTransactionPtr MetadataStorageFromPlainObjectStorage::createTransaction(
const std::string & MetadataStorageFromPlainObjectStorage::getPath() const const std::string & MetadataStorageFromPlainObjectStorage::getPath() const
{ {
return object_storage_root_path; return storage_path_prefix;
}
std::filesystem::path MetadataStorageFromPlainObjectStorage::getAbsolutePath(const std::string & path) const
{
return fs::path(object_storage_root_path) / path;
} }
bool MetadataStorageFromPlainObjectStorage::exists(const std::string & path) const bool MetadataStorageFromPlainObjectStorage::exists(const std::string & path) const
{ {
/// NOTE: exists() cannot be used here since it works only for existing /// NOTE: exists() cannot be used here since it works only for existing
/// key, and does not work for some intermediate path. /// key, and does not work for some intermediate path.
std::string abs_path = getAbsolutePath(path); auto object_key = object_storage->generateObjectKeyForPath(path);
return object_storage->existsOrHasAnyChild(abs_path); return object_storage->existsOrHasAnyChild(object_key.serialize());
} }
bool MetadataStorageFromPlainObjectStorage::isFile(const std::string & path) const bool MetadataStorageFromPlainObjectStorage::isFile(const std::string & path) const
@ -48,7 +44,8 @@ bool MetadataStorageFromPlainObjectStorage::isFile(const std::string & path) con
bool MetadataStorageFromPlainObjectStorage::isDirectory(const std::string & path) const bool MetadataStorageFromPlainObjectStorage::isDirectory(const std::string & path) const
{ {
std::string directory = getAbsolutePath(path); auto object_key = object_storage->generateObjectKeyForPath(path);
std::string directory = object_key.serialize();
if (!directory.ends_with('/')) if (!directory.ends_with('/'))
directory += '/'; directory += '/';
@ -59,8 +56,8 @@ bool MetadataStorageFromPlainObjectStorage::isDirectory(const std::string & path
uint64_t MetadataStorageFromPlainObjectStorage::getFileSize(const String & path) const uint64_t MetadataStorageFromPlainObjectStorage::getFileSize(const String & path) const
{ {
RelativePathsWithMetadata children; auto object_key = object_storage->generateObjectKeyForPath(path);
auto metadata = object_storage->tryGetObjectMetadata(getAbsolutePath(path)); auto metadata = object_storage->tryGetObjectMetadata(object_key.serialize());
if (metadata) if (metadata)
return metadata->size_bytes; return metadata->size_bytes;
return 0; return 0;
@ -68,12 +65,14 @@ uint64_t MetadataStorageFromPlainObjectStorage::getFileSize(const String & path)
std::vector<std::string> MetadataStorageFromPlainObjectStorage::listDirectory(const std::string & path) const std::vector<std::string> MetadataStorageFromPlainObjectStorage::listDirectory(const std::string & path) const
{ {
RelativePathsWithMetadata files; auto object_key = object_storage->generateObjectKeyForPath(path);
std::string abs_path = getAbsolutePath(path);
if (!abs_path.ends_with('/'))
abs_path += '/';
object_storage->listObjects(abs_path, files, 0); RelativePathsWithMetadata files;
std::string abs_key = object_key.serialize();
if (!abs_key.ends_with('/'))
abs_key += '/';
object_storage->listObjects(abs_key, files, 0);
std::vector<std::string> result; std::vector<std::string> result;
for (const auto & path_size : files) for (const auto & path_size : files)
@ -84,8 +83,8 @@ std::vector<std::string> MetadataStorageFromPlainObjectStorage::listDirectory(co
std::unordered_set<std::string> duplicates_filter; std::unordered_set<std::string> duplicates_filter;
for (auto & row : result) for (auto & row : result)
{ {
chassert(row.starts_with(abs_path)); chassert(row.starts_with(abs_key));
row.erase(0, abs_path.size()); row.erase(0, abs_key.size());
auto slash_pos = row.find_first_of('/'); auto slash_pos = row.find_first_of('/');
if (slash_pos != std::string::npos) if (slash_pos != std::string::npos)
row.erase(slash_pos, row.size() - slash_pos); row.erase(slash_pos, row.size() - slash_pos);
@ -105,10 +104,9 @@ DirectoryIteratorPtr MetadataStorageFromPlainObjectStorage::iterateDirectory(con
StoredObjects MetadataStorageFromPlainObjectStorage::getStorageObjects(const std::string & path) const StoredObjects MetadataStorageFromPlainObjectStorage::getStorageObjects(const std::string & path) const
{ {
std::string blob_name = object_storage->generateBlobNameForPath(path); size_t object_size = getFileSize(path);
size_t object_size = getFileSize(blob_name); auto object_key = object_storage->generateObjectKeyForPath(path);
auto object = StoredObject(getAbsolutePath(blob_name), object_size, path); return {StoredObject(object_key.serialize(), object_size, path)};
return {std::move(object)};
} }
const IMetadataStorage & MetadataStorageFromPlainObjectStorageTransaction::getStorageForNonTransactionalReads() const const IMetadataStorage & MetadataStorageFromPlainObjectStorageTransaction::getStorageForNonTransactionalReads() const
@ -118,7 +116,8 @@ const IMetadataStorage & MetadataStorageFromPlainObjectStorageTransaction::getSt
void MetadataStorageFromPlainObjectStorageTransaction::unlinkFile(const std::string & path) void MetadataStorageFromPlainObjectStorageTransaction::unlinkFile(const std::string & path)
{ {
auto object = StoredObject(metadata_storage.getAbsolutePath(path)); auto object_key = metadata_storage.object_storage->generateObjectKeyForPath(path);
auto object = StoredObject(object_key.serialize());
metadata_storage.object_storage->removeObject(object); metadata_storage.object_storage->removeObject(object);
} }
@ -131,7 +130,7 @@ void MetadataStorageFromPlainObjectStorageTransaction::createDirectoryRecursive(
/// Noop. It is an object storage, not a filesystem. /// Noop. It is an object storage, not a filesystem.
} }
void MetadataStorageFromPlainObjectStorageTransaction::addBlobToMetadata( void MetadataStorageFromPlainObjectStorageTransaction::addBlobToMetadata(
const std::string &, const std::string & /* blob_name */, uint64_t /* size_in_bytes */) const std::string &, ObjectStorageKey /* object_key */, uint64_t /* size_in_bytes */)
{ {
/// Noop: the local metadata is just a single file, the metadata file itself. /// Noop: the local metadata is just a single file, the metadata file itself.
} }
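On a plain disk, then, every operation reduces to key generation (sketch; the prefix and path are examples):

    /// With S3PlainObjectStorage and a key_prefix of "data/", the logical path maps 1:1:
    auto object_key = object_storage->generateObjectKeyForPath("store/abc/file.bin");
    /// object_key.serialize() == "data/store/abc/file.bin", the exact string that
    /// exists(), getFileSize(), listDirectory() and unlinkFile() above operate on.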


@ -29,12 +29,10 @@ private:
friend class MetadataStorageFromPlainObjectStorageTransaction; friend class MetadataStorageFromPlainObjectStorageTransaction;
ObjectStoragePtr object_storage; ObjectStoragePtr object_storage;
std::string object_storage_root_path; String storage_path_prefix;
public: public:
MetadataStorageFromPlainObjectStorage( MetadataStorageFromPlainObjectStorage(ObjectStoragePtr object_storage_, String storage_path_prefix_);
ObjectStoragePtr object_storage_,
const std::string & object_storage_root_path_);
MetadataTransactionPtr createTransaction() override; MetadataTransactionPtr createTransaction() override;
@ -56,8 +54,6 @@ public:
StoredObjects getStorageObjects(const std::string & path) const override; StoredObjects getStorageObjects(const std::string & path) const override;
std::string getObjectStorageRootPath() const override { return object_storage_root_path; }
Poco::Timestamp getLastModified(const std::string & /* path */) const override Poco::Timestamp getLastModified(const std::string & /* path */) const override
{ {
/// Required by MergeTree /// Required by MergeTree
@ -71,9 +67,6 @@ public:
bool supportsChmod() const override { return false; } bool supportsChmod() const override { return false; }
bool supportsStat() const override { return false; } bool supportsStat() const override { return false; }
private:
std::filesystem::path getAbsolutePath(const std::string & path) const;
}; };
class MetadataStorageFromPlainObjectStorageTransaction final : public IMetadataTransaction class MetadataStorageFromPlainObjectStorageTransaction final : public IMetadataTransaction
@ -89,14 +82,14 @@ public:
const IMetadataStorage & getStorageForNonTransactionalReads() const override; const IMetadataStorage & getStorageForNonTransactionalReads() const override;
void addBlobToMetadata(const std::string & path, const std::string & blob_name, uint64_t size_in_bytes) override; void addBlobToMetadata(const std::string & path, ObjectStorageKey object_key, uint64_t size_in_bytes) override;
void createEmptyMetadataFile(const std::string & /* path */) override void createEmptyMetadataFile(const std::string & /* path */) override
{ {
/// No metadata, no need to create anything. /// No metadata, no need to create anything.
} }
void createMetadataFile(const std::string & /* path */, const std::string & /* blob_name */, uint64_t /* size_in_bytes */) override void createMetadataFile(const std::string & /* path */, ObjectStorageKey /* object_key */, uint64_t /* size_in_bytes */) override
{ {
/// Noop /// Noop
} }


@ -17,6 +17,7 @@
#include <Interpreters/threadPoolCallbackRunner.h> #include <Interpreters/threadPoolCallbackRunner.h>
#include <Disks/ObjectStorages/S3/diskSettings.h> #include <Disks/ObjectStorages/S3/diskSettings.h>
#include <Common/getRandomASCIIString.h>
#include <Common/ProfileEvents.h> #include <Common/ProfileEvents.h>
#include <Common/StringUtils/StringUtils.h> #include <Common/StringUtils/StringUtils.h>
#include <Common/logger_useful.h> #include <Common/logger_useful.h>
@ -127,7 +128,10 @@ private:
result = !objects.empty(); result = !objects.empty();
for (const auto & object : objects) for (const auto & object : objects)
batch.emplace_back(object.GetKey(), ObjectMetadata{static_cast<uint64_t>(object.GetSize()), Poco::Timestamp::fromEpochTime(object.GetLastModified().Seconds()), {}}); batch.emplace_back(
object.GetKey(),
ObjectMetadata{static_cast<uint64_t>(object.GetSize()), Poco::Timestamp::fromEpochTime(object.GetLastModified().Seconds()), {}}
);
if (result) if (result)
request.SetContinuationToken(outcome.GetResult().GetNextContinuationToken()); request.SetContinuationToken(outcome.GetResult().GetNextContinuationToken());
@ -293,7 +297,12 @@ void S3ObjectStorage::listObjects(const std::string & path, RelativePathsWithMet
break; break;
for (const auto & object : objects) for (const auto & object : objects)
children.emplace_back(object.GetKey(), ObjectMetadata{static_cast<uint64_t>(object.GetSize()), Poco::Timestamp::fromEpochTime(object.GetLastModified().Seconds()), {}}); children.emplace_back(
object.GetKey(),
ObjectMetadata{
static_cast<uint64_t>(object.GetSize()),
Poco::Timestamp::fromEpochTime(object.GetLastModified().Seconds()),
{}});
if (max_keys) if (max_keys)
{ {
@ -524,12 +533,33 @@ std::unique_ptr<IObjectStorage> S3ObjectStorage::cloneObjectStorage(
return std::make_unique<S3ObjectStorage>( return std::make_unique<S3ObjectStorage>(
std::move(new_client), std::move(new_s3_settings), std::move(new_client), std::move(new_s3_settings),
version_id, s3_capabilities, new_namespace, version_id, s3_capabilities, new_namespace,
endpoint); endpoint, object_key_prefix);
} }
S3ObjectStorage::Clients::Clients(std::shared_ptr<S3::Client> client_, const S3ObjectStorageSettings & settings) S3ObjectStorage::Clients::Clients(std::shared_ptr<S3::Client> client_, const S3ObjectStorageSettings & settings)
: client(std::move(client_)), client_with_long_timeout(client->clone(std::nullopt, settings.request_settings.long_request_timeout_ms)) {} : client(std::move(client_)), client_with_long_timeout(client->clone(std::nullopt, settings.request_settings.long_request_timeout_ms)) {}
ObjectStorageKey S3ObjectStorage::generateObjectKeyForPath(const std::string &) const
{
/// Path to store the new S3 object.
/// Total length is 32 a-z characters for enough randomness.
/// First 3 characters are used as a prefix for
/// https://aws.amazon.com/premiumsupport/knowledge-center/s3-object-key-naming-pattern/
constexpr size_t key_name_total_size = 32;
constexpr size_t key_name_prefix_size = 3;
/// Path to store new S3 object.
String key = fmt::format("{}/{}",
getRandomASCIIString(key_name_prefix_size),
getRandomASCIIString(key_name_total_size - key_name_prefix_size));
/// Whatever the key_prefix value is, treat that key as relative
return ObjectStorageKey::createAsRelative(object_key_prefix, key);
}
} }
#endif #endif
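A key produced here therefore has the shape "<3 random chars>/<29 random chars>" under the configured prefix; for instance (invented output):

    auto key = s3_storage.generateObjectKeyForPath("store/abc/file.bin");   /// the path is unused
    /// key.getPrefix()  == object_key_prefix, e.g. "data/"
    /// key.getSuffix()  == e.g. "xyz/abcdefghijklmnopqrstuvwxyzabc"
    /// key.serialize()  == "data/xyz/abcdefghijklmnopqrstuvwxyzabc"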


@ -59,8 +59,10 @@ private:
String version_id_, String version_id_,
const S3Capabilities & s3_capabilities_, const S3Capabilities & s3_capabilities_,
String bucket_, String bucket_,
String connection_string) String connection_string,
: bucket(bucket_) String object_key_prefix_)
: bucket(std::move(bucket_))
, object_key_prefix(std::move(object_key_prefix_))
, clients(std::make_unique<Clients>(std::move(client_), *s3_settings_)) , clients(std::make_unique<Clients>(std::move(client_), *s3_settings_))
, s3_settings(std::move(s3_settings_)) , s3_settings(std::move(s3_settings_))
, s3_capabilities(s3_capabilities_) , s3_capabilities(s3_capabilities_)
@ -170,13 +172,17 @@ public:
bool supportParallelWrite() const override { return true; } bool supportParallelWrite() const override { return true; }
ObjectStorageKey generateObjectKeyForPath(const std::string & path) const override;
private: private:
void setNewSettings(std::unique_ptr<S3ObjectStorageSettings> && s3_settings_); void setNewSettings(std::unique_ptr<S3ObjectStorageSettings> && s3_settings_);
void removeObjectImpl(const StoredObject & object, bool if_exists); void removeObjectImpl(const StoredObject & object, bool if_exists);
void removeObjectsImpl(const StoredObjects & objects, bool if_exists); void removeObjectsImpl(const StoredObjects & objects, bool if_exists);
private:
std::string bucket; std::string bucket;
String object_key_prefix;
MultiVersion<Clients> clients; MultiVersion<Clients> clients;
MultiVersion<S3ObjectStorageSettings> s3_settings; MultiVersion<S3ObjectStorageSettings> s3_settings;
@ -195,7 +201,11 @@ private:
class S3PlainObjectStorage : public S3ObjectStorage class S3PlainObjectStorage : public S3ObjectStorage
{ {
public: public:
std::string generateBlobNameForPath(const std::string & path) override { return path; } ObjectStorageKey generateObjectKeyForPath(const std::string & path) const override
{
return ObjectStorageKey::createAsRelative(object_key_prefix, path);
}
std::string getName() const override { return "S3PlainObjectStorage"; } std::string getName() const override { return "S3PlainObjectStorage"; }
template <class ...Args> template <class ...Args>


@ -126,12 +126,15 @@ void registerDiskS3(DiskFactory & factory, bool global_skip_access_check)
if (config.getBool(config_prefix + ".send_metadata", false)) if (config.getBool(config_prefix + ".send_metadata", false))
throw Exception(ErrorCodes::BAD_ARGUMENTS, "s3_plain does not supports send_metadata"); throw Exception(ErrorCodes::BAD_ARGUMENTS, "s3_plain does not supports send_metadata");
s3_storage = std::make_shared<S3PlainObjectStorage>(std::move(client), std::move(settings), uri.version_id, s3_capabilities, uri.bucket, uri.endpoint); s3_storage = std::make_shared<S3PlainObjectStorage>(
std::move(client), std::move(settings), uri.version_id, s3_capabilities, uri.bucket, uri.endpoint, uri.key);
metadata_storage = std::make_shared<MetadataStorageFromPlainObjectStorage>(s3_storage, uri.key); metadata_storage = std::make_shared<MetadataStorageFromPlainObjectStorage>(s3_storage, uri.key);
} }
else else
{ {
s3_storage = std::make_shared<S3ObjectStorage>(std::move(client), std::move(settings), uri.version_id, s3_capabilities, uri.bucket, uri.endpoint); s3_storage = std::make_shared<S3ObjectStorage>(
std::move(client), std::move(settings), uri.version_id, s3_capabilities, uri.bucket, uri.endpoint, uri.key);
auto [metadata_path, metadata_disk] = prepareForLocalMetadata(name, config, config_prefix, context); auto [metadata_path, metadata_disk] = prepareForLocalMetadata(name, config, config_prefix, context);
metadata_storage = std::make_shared<MetadataStorageFromDisk>(metadata_disk, uri.key); metadata_storage = std::make_shared<MetadataStorageFromDisk>(metadata_disk, uri.key);
} }


@@ -1,8 +1,11 @@
 #pragma once

+#include <base/types.h>
+#include <Disks/ObjectStorages/IObjectStorage_fwd.h>
+
 #include <functional>
 #include <string>
-#include <Disks/ObjectStorages/IObjectStorage_fwd.h>

 namespace DB
@@ -11,20 +14,32 @@ namespace DB
 /// Object metadata: path, size, path_key_for_cache.
 struct StoredObject
 {
-    std::string remote_path;
-    std::string local_path; /// or equivalent "metadata_path"
+    String remote_path; /// abs path
+    String local_path; /// or equivalent "metadata_path"

     uint64_t bytes_size = 0;

     StoredObject() = default;

-    explicit StoredObject(
-        const std::string & remote_path_,
-        uint64_t bytes_size_ = 0,
-        const std::string & local_path_ = "")
-        : remote_path(remote_path_)
-        , local_path(local_path_)
-        , bytes_size(bytes_size_) {}
+    explicit StoredObject(String remote_path_)
+        : remote_path(std::move(remote_path_))
+    {}
+
+    StoredObject(
+        String remote_path_,
+        uint64_t bytes_size_)
+        : remote_path(std::move(remote_path_))
+        , bytes_size(bytes_size_)
+    {}
+
+    StoredObject(
+        String remote_path_,
+        uint64_t bytes_size_,
+        String local_path_)
+        : remote_path(std::move(remote_path_))
+        , local_path(std::move(local_path_))
+        , bytes_size(bytes_size_)
+    {}
 };

 using StoredObjects = std::vector<StoredObject>;

View File

@@ -28,7 +28,8 @@ MetadataTransactionPtr MetadataStorageFromStaticFilesWebServer::createTransactio
 const std::string & MetadataStorageFromStaticFilesWebServer::getPath() const
 {
-    return root_path;
+    static const String no_root;
+    return no_root;
 }

 bool MetadataStorageFromStaticFilesWebServer::exists(const std::string & path) const
@@ -96,7 +97,7 @@ std::vector<std::string> MetadataStorageFromStaticFilesWebServer::listDirectory(
     for (const auto & [file_path, _] : object_storage.files)
     {
         if (file_path.starts_with(path))
-            result.push_back(file_path);
+            result.push_back(file_path); /// It looks more like recursive listing, not sure it is right
     }
     return result;
 }

View File

@@ -16,12 +16,9 @@ private:
     using FileType = WebObjectStorage::FileType;

     const WebObjectStorage & object_storage;
-    std::string root_path;

     void assertExists(const std::string & path) const;
-    void initializeImpl(const String & uri_path, const std::unique_lock<std::shared_mutex> &) const;

 public:
     explicit MetadataStorageFromStaticFilesWebServer(const WebObjectStorage & object_storage_);
@@ -43,8 +40,6 @@ public:
     StoredObjects getStorageObjects(const std::string & path) const override;

-    std::string getObjectStorageRootPath() const override { return ""; }
-
     struct stat stat(const String & /* path */) const override { return {}; }

     Poco::Timestamp getLastModified(const std::string & /* path */) const override
@@ -80,7 +75,7 @@ public:
         /// No metadata, no need to create anything.
     }

-    void createMetadataFile(const std::string & /* path */, const std::string & /* blob_name */, uint64_t /* size_in_bytes */) override
+    void createMetadataFile(const std::string & /* path */, ObjectStorageKey /* object_key */, uint64_t /* size_in_bytes */) override
     {
         /// Noop
     }

View File

@@ -89,7 +89,10 @@ public:
         const std::string & config_prefix,
         ContextPtr context) override;

-    std::string generateBlobNameForPath(const std::string & path) override { return path; }
+    ObjectStorageKey generateObjectKeyForPath(const std::string & path) const override
+    {
+        return ObjectStorageKey::createAsRelative(path);
+    }

     bool isRemote() const override { return true; }

View File

@@ -59,6 +59,7 @@ FormatSettings getFormatSettings(ContextPtr context, const Settings & settings)
     format_settings.csv.allow_double_quotes = settings.format_csv_allow_double_quotes;
     format_settings.csv.allow_single_quotes = settings.format_csv_allow_single_quotes;
     format_settings.csv.crlf_end_of_line = settings.output_format_csv_crlf_end_of_line;
+    format_settings.csv.allow_cr_end_of_line = settings.input_format_csv_allow_cr_end_of_line;
     format_settings.csv.delimiter = settings.format_csv_delimiter;
     format_settings.csv.tuple_delimiter = settings.format_csv_delimiter;
     format_settings.csv.empty_as_default = settings.input_format_csv_empty_as_default;

View File

@@ -150,6 +150,7 @@ struct FormatSettings
         bool allow_double_quotes = true;
         bool empty_as_default = false;
         bool crlf_end_of_line = false;
+        bool allow_cr_end_of_line = false;
         bool enum_as_number = false;
         bool arrays_as_nested_csv = false;
         String null_representation = "\\N";

View File

@@ -49,7 +49,7 @@ public:
     {
         const auto & pos_arg = arguments[i];

-        if (!isUnsignedInteger(pos_arg))
+        if (!isUInt(pos_arg))
             throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type {} of {} argument of function {}", pos_arg->getName(), i, getName());
     }

View File

@@ -365,7 +365,7 @@ DataTypePtr FunctionGenerateRandomStructure::getReturnTypeImpl(const DataTypes &
     for (size_t i = 0; i != arguments.size(); ++i)
     {
-        if (!isUnsignedInteger(arguments[i]) && !arguments[i]->onlyNull())
+        if (!isUInt(arguments[i]) && !arguments[i]->onlyNull())
         {
             throw Exception(
                 ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,

View File

@@ -2033,7 +2033,7 @@ static inline bool isDateTime64(const ColumnsWithTypeAndName & arguments)
     else if constexpr (std::is_same_v<Name, NameToDateTime> || std::is_same_v<Name, NameParseDateTimeBestEffort>
         || std::is_same_v<Name, NameParseDateTimeBestEffortOrZero> || std::is_same_v<Name, NameParseDateTimeBestEffortOrNull>)
     {
-        return (arguments.size() == 2 && isUnsignedInteger(arguments[1].type)) || arguments.size() == 3;
+        return (arguments.size() == 2 && isUInt(arguments[1].type)) || arguments.size() == 3;
     }

     return false;

View File

@@ -60,7 +60,7 @@ public:
     if (!isString(arguments[0]))
         throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type {} of argument of function {}", arguments[0]->getName(), getName());

-    if (!isUnsignedInteger(arguments[1]))
+    if (!isUInt(arguments[1]))
         throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type {} of argument of function {}", arguments[1]->getName(), getName());

     const DataTypeArray * array_type = checkAndGetDataType<DataTypeArray>(arguments[2].get());

View File

@@ -5,6 +5,7 @@
 #include <Functions/FunctionFactory.h>
 #include <Functions/FunctionsStringSimilarity.h>
 #include <Common/PODArray.h>
+#include <Common/UTF8Helpers.h>

 #ifdef __SSE4_2__
 #    include <nmmintrin.h>
@@ -14,6 +15,7 @@ namespace DB
 {
 namespace ErrorCodes
 {
+    extern const int BAD_ARGUMENTS;
     extern const int TOO_LARGE_STRING_SIZE;
 }
@@ -59,8 +61,8 @@ struct FunctionStringDistanceImpl
         size_t size = res.size();
         for (size_t i = 0; i < size; ++i)
         {
-            res[i]
-                = Op::process(haystack_data, haystack_size, needle + needle_offsets[i - 1], needle_offsets[i] - needle_offsets[i - 1] - 1);
+            res[i] = Op::process(haystack_data, haystack_size,
+                needle + needle_offsets[i - 1], needle_offsets[i] - needle_offsets[i - 1] - 1);
         }
     }
@@ -108,6 +110,117 @@ struct ByteHammingDistanceImpl
     }
 };

+template <bool is_utf8>
+struct ByteJaccardIndexImpl
+{
+    using ResultType = Float64;
+    static ResultType inline process(
+        const char * __restrict haystack, size_t haystack_size, const char * __restrict needle, size_t needle_size)
+    {
+        if (haystack_size == 0 || needle_size == 0)
+            return 0;
+
+        const char * haystack_end = haystack + haystack_size;
+        const char * needle_end = needle + needle_size;
+
+        /// For byte strings use plain array as a set
+        constexpr size_t max_size = std::numeric_limits<unsigned char>::max() + 1;
+        std::array<UInt8, max_size> haystack_set;
+        std::array<UInt8, max_size> needle_set;
+
+        /// For UTF-8 strings we also use sets of code points greater than max_size
+        std::set<UInt32> haystack_utf8_set;
+        std::set<UInt32> needle_utf8_set;
+
+        haystack_set.fill(0);
+        needle_set.fill(0);
+
+        while (haystack < haystack_end)
+        {
+            size_t len = 1;
+            if constexpr (is_utf8)
+                len = UTF8::seqLength(*haystack);
+
+            if (len == 1)
+            {
+                haystack_set[static_cast<unsigned char>(*haystack)] = 1;
+                ++haystack;
+            }
+            else
+            {
+                auto code_point = UTF8::convertUTF8ToCodePoint(haystack, haystack_end - haystack);
+                if (code_point.has_value())
+                {
+                    haystack_utf8_set.insert(code_point.value());
+                    haystack += len;
+                }
+                else
+                {
+                    throw Exception(ErrorCodes::BAD_ARGUMENTS, "Illegal UTF-8 sequence while processing '{}'", StringRef(haystack, haystack_end - haystack));
+                }
+            }
+        }
+
+        while (needle < needle_end)
+        {
+            size_t len = 1;
+            if constexpr (is_utf8)
+                len = UTF8::seqLength(*needle);
+
+            if (len == 1)
+            {
+                needle_set[static_cast<unsigned char>(*needle)] = 1;
+                ++needle;
+            }
+            else
+            {
+                auto code_point = UTF8::convertUTF8ToCodePoint(needle, needle_end - needle);
+                if (code_point.has_value())
+                {
+                    needle_utf8_set.insert(code_point.value());
+                    needle += len;
+                }
+                else
+                {
+                    throw Exception(ErrorCodes::BAD_ARGUMENTS, "Illegal UTF-8 sequence while processing '{}'", StringRef(needle, needle_end - needle));
+                }
+            }
+        }
+
+        /// The union of two byte sets can hold up to 256 elements, so the counters must be wider than UInt8
+        UInt32 intersection = 0;
+        UInt32 union_size = 0;
+
+        if constexpr (is_utf8)
+        {
+            auto lit = haystack_utf8_set.begin();
+            auto rit = needle_utf8_set.begin();
+            while (lit != haystack_utf8_set.end() && rit != needle_utf8_set.end())
+            {
+                if (*lit == *rit)
+                {
+                    ++intersection;
+                    ++lit;
+                    ++rit;
+                }
+                else if (*lit < *rit)
+                    ++lit;
+                else
+                    ++rit;
+            }
+            union_size = static_cast<UInt32>(haystack_utf8_set.size() + needle_utf8_set.size()) - intersection;
+        }
+
+        for (size_t i = 0; i < max_size; ++i)
+        {
+            intersection += haystack_set[i] & needle_set[i];
+            union_size += haystack_set[i] | needle_set[i];
+        }
+
+        return static_cast<ResultType>(intersection) / static_cast<ResultType>(union_size);
+    }
+};
 struct ByteEditDistanceImpl
 {
     using ResultType = UInt64;
@@ -123,9 +236,8 @@ struct ByteEditDistanceImpl
     if (haystack_size > max_string_size || needle_size > max_string_size)
         throw Exception(
             ErrorCodes::TOO_LARGE_STRING_SIZE,
-            "The string size is too big for function byteEditDistance. "
-            "Should be at most {}",
-            max_string_size);
+            "The string size is too big for function editDistance, "
+            "should be at most {}", max_string_size);

     PaddedPODArray<ResultType> distances0(haystack_size + 1, 0);
     PaddedPODArray<ResultType> distances1(haystack_size + 1, 0);
@@ -163,15 +275,25 @@ struct NameByteHammingDistance
 {
     static constexpr auto name = "byteHammingDistance";
 };
+using FunctionByteHammingDistance = FunctionsStringSimilarity<FunctionStringDistanceImpl<ByteHammingDistanceImpl>, NameByteHammingDistance>;

 struct NameEditDistance
 {
     static constexpr auto name = "editDistance";
 };
+using FunctionEditDistance = FunctionsStringSimilarity<FunctionStringDistanceImpl<ByteEditDistanceImpl>, NameEditDistance>;

-using FunctionByteHammingDistance = FunctionsStringSimilarity<FunctionStringDistanceImpl<ByteHammingDistanceImpl>, NameByteHammingDistance>;
+struct NameJaccardIndex
+{
+    static constexpr auto name = "stringJaccardIndex";
+};
+using FunctionStringJaccardIndex = FunctionsStringSimilarity<FunctionStringDistanceImpl<ByteJaccardIndexImpl<false>>, NameJaccardIndex>;

-using FunctionByteEditDistance = FunctionsStringSimilarity<FunctionStringDistanceImpl<ByteEditDistanceImpl>, NameEditDistance>;
+struct NameJaccardIndexUTF8
+{
+    static constexpr auto name = "stringJaccardIndexUTF8";
+};
+using FunctionStringJaccardIndexUTF8 = FunctionsStringSimilarity<FunctionStringDistanceImpl<ByteJaccardIndexImpl<true>>, NameJaccardIndexUTF8>;

 REGISTER_FUNCTION(StringDistance)
 {
@@ -179,9 +301,13 @@ REGISTER_FUNCTION(StringDistance)
         FunctionDocumentation{.description = R"(Calculates Hamming distance between two byte-strings.)"});
     factory.registerAlias("mismatches", NameByteHammingDistance::name);

-    factory.registerFunction<FunctionByteEditDistance>(
+    factory.registerFunction<FunctionEditDistance>(
         FunctionDocumentation{.description = R"(Calculates the edit distance between two byte-strings.)"});
     factory.registerAlias("levenshteinDistance", NameEditDistance::name);

+    factory.registerFunction<FunctionStringJaccardIndex>(
+        FunctionDocumentation{.description = R"(Calculates the [Jaccard similarity index](https://en.wikipedia.org/wiki/Jaccard_index) between two byte strings.)"});
+    factory.registerFunction<FunctionStringJaccardIndexUTF8>(
+        FunctionDocumentation{.description = R"(Calculates the [Jaccard similarity index](https://en.wikipedia.org/wiki/Jaccard_index) between two UTF8 strings.)"});
 }

 }
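A quick sanity check of the two newly registered functions, as a hypothetical ClickHouse session (the scores follow from the byte-set semantics implemented above; 'clickhouse' has 9 distinct bytes, 'mouse' has 5, and they share {o, u, s, e}):

SELECT stringJaccardIndex('clickhouse', 'mouse');
-- 0.4, i.e. 4 / (9 + 5 - 4)

SELECT editDistance('clickhouse', 'mouse');
-- 6: delete 'c', 'l', 'i', 'c', 'k', then substitute 'h' with 'm'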

View File

@@ -64,7 +64,7 @@ public:
     if (arguments.size() > 1)
     {
-        if (!isUnsignedInteger(arguments[1].type))
+        if (!isUInt(arguments[1].type))
             throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
                 "Second argument (shingle size) of function {} must be unsigned integer, got {}",
                 getName(), arguments[1].type->getName());
@@ -85,7 +85,7 @@ public:
             "Function {} expects no more than two arguments (text, shingle size), got {}",
             getName(), arguments.size());

-        if (!isUnsignedInteger(arguments[2].type))
+        if (!isUInt(arguments[2].type))
             throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
                 "Third argument (num hashes) of function {} must be unsigned integer, got {}",
                 getName(), arguments[2].type->getName());

View File

@@ -119,7 +119,7 @@ public:
     if (arguments.size() >= 3)
     {
-        if (!isUnsignedInteger(arguments[2]))
+        if (!isUInt(arguments[2]))
             throw Exception(
                 ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
                 "Illegal type {} of argument of function {}",

View File

@@ -1,118 +0,0 @@
#include <random>
#include <Columns/ColumnArray.h>
#include <DataTypes/DataTypeArray.h>
#include <Functions/FunctionFactory.h>
#include <Functions/FunctionHelpers.h>
#include <Functions/IFunction.h>
#include <Poco/Logger.h>
#include "Columns/ColumnsNumber.h"
namespace DB
{
namespace ErrorCodes
{
extern const int ILLEGAL_COLUMN;
extern const int ILLEGAL_TYPE_OF_ARGUMENT;
}
/// arrayRandomSample(arr, k) - Returns k random elements from the input array
class FunctionArrayRandomSample : public IFunction
{
public:
static constexpr auto name = "arrayRandomSample";
static FunctionPtr create(ContextPtr) { return std::make_shared<FunctionArrayRandomSample>(); }
String getName() const override { return name; }
size_t getNumberOfArguments() const override { return 2; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {1}; }
bool useDefaultImplementationForConstants() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{
FunctionArgumentDescriptors args{
{"array", &isArray<IDataType>, nullptr, "Array"},
{"samples", &isUnsignedInteger<IDataType>, isColumnConst, "const UInt*"},
};
validateFunctionArgumentTypes(*this, arguments, args);
// Return an array with the same nested type as the input array
const DataTypePtr & array_type = arguments[0].type;
const DataTypeArray * array_data_type = checkAndGetDataType<DataTypeArray>(array_type.get());
// Get the nested data type of the array
const DataTypePtr & nested_type = array_data_type->getNestedType();
return std::make_shared<DataTypeArray>(nested_type);
}
ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t input_rows_count) const override
{
const ColumnArray * column_array = checkAndGetColumn<ColumnArray>(arguments[0].column.get());
if (!column_array)
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "First argument must be an array");
const IColumn * col_samples = arguments[1].column.get();
if (!col_samples)
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "The second argument is empty or null, type = {}", arguments[1].type->getName());
UInt64 samples;
try
{
samples = col_samples->getUInt(0);
}
catch (...)
{
throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,
"Failed to fetch UInt64 from the second argument column, type = {}",
arguments[1].type->getName());
}
std::random_device rd;
std::mt19937 gen(rd());
auto nested_column = column_array->getDataPtr()->cloneEmpty();
auto offsets_column = ColumnUInt64::create();
auto res_data = ColumnArray::create(std::move(nested_column), std::move(offsets_column));
const auto & input_offsets = column_array->getOffsets();
auto & res_offsets = res_data->getOffsets();
res_offsets.resize(input_rows_count);
UInt64 cur_samples;
size_t current_offset = 0;
for (size_t row = 0; row < input_rows_count; row++)
{
size_t row_size = input_offsets[row] - current_offset;
std::vector<size_t> indices(row_size);
std::iota(indices.begin(), indices.end(), 0);
std::shuffle(indices.begin(), indices.end(), gen);
cur_samples = std::min(samples, static_cast<UInt64>(row_size));
for (UInt64 j = 0; j < cur_samples; j++)
{
size_t source_index = indices[j];
res_data->getData().insertFrom(column_array->getData(), source_index);
}
res_offsets[row] = current_offset + cur_samples;
current_offset += cur_samples;
}
return res_data;
}
};
REGISTER_FUNCTION(ArrayRandomSample)
{
factory.registerFunction<FunctionArrayRandomSample>();
}
}

View File

@@ -1,7 +1,9 @@
+#include <Columns/ColumnNullable.h>
 #include <Columns/ColumnString.h>
+#include <DataTypes/DataTypeString.h>
 #include <Functions/FunctionFactory.h>
 #include <Functions/FunctionHelpers.h>
-#include <IO/WriteBufferFromVector.h>
+#include <IO/WriteBufferFromString.h>
 #include <Interpreters/Context.h>
 #include <Parsers/ParserQuery.h>
 #include <Parsers/formatAST.h>
@@ -15,7 +17,19 @@ namespace ErrorCodes
     extern const int ILLEGAL_COLUMN;
 }

-template <bool one_line, typename Name>
+enum class OutputFormatting
+{
+    SingleLine,
+    MultiLine
+};
+
+enum class ErrorHandling
+{
+    Exception,
+    Null
+};
+
+template <OutputFormatting output_formatting, ErrorHandling error_handling, typename Name>
 class FunctionFormatQuery : public IFunction
 {
 public:
@@ -27,70 +41,127 @@ public:
     }

     FunctionFormatQuery(size_t max_query_size_, size_t max_parser_depth_)
-        : max_query_size(max_query_size_), max_parser_depth(max_parser_depth_)
+        : max_query_size(max_query_size_)
+        , max_parser_depth(max_parser_depth_)
     {
     }

     String getName() const override { return name; }
     size_t getNumberOfArguments() const override { return 1; }
     bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
+    bool useDefaultImplementationForConstants() const override { return true; }

     DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
     {
-        FunctionArgumentDescriptors mandatory_args{{"query", &isString<IDataType>, nullptr, "String"}};
-        validateFunctionArgumentTypes(*this, arguments, mandatory_args);
-        return arguments[0].type;
+        FunctionArgumentDescriptors args{
+            {"query", &isString<IDataType>, nullptr, "String"}
+        };
+        validateFunctionArgumentTypes(*this, arguments, args);
+
+        DataTypePtr string_type = std::make_shared<DataTypeString>();
+        if constexpr (error_handling == ErrorHandling::Null)
+            return std::make_shared<DataTypeNullable>(string_type);
+        else
+            return string_type;
     }

-    bool useDefaultImplementationForConstants() const override { return true; }
-
-    ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t /*input_rows_count*/) const override
+    ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t input_rows_count) const override
     {
-        const ColumnPtr column = arguments[0].column;
-        if (const ColumnString * col = checkAndGetColumn<ColumnString>(column.get()))
+        const ColumnPtr col_query = arguments[0].column;
+
+        ColumnUInt8::MutablePtr col_null_map;
+        if constexpr (error_handling == ErrorHandling::Null)
+            col_null_map = ColumnUInt8::create(input_rows_count, 0);
+
+        if (const ColumnString * col_query_string = checkAndGetColumn<ColumnString>(col_query.get()))
         {
             auto col_res = ColumnString::create();
-            formatVector(col->getChars(), col->getOffsets(), col_res->getChars(), col_res->getOffsets());
-            return col_res;
+            formatVector(col_query_string->getChars(), col_query_string->getOffsets(), col_res->getChars(), col_res->getOffsets(), col_null_map);
+
+            if constexpr (error_handling == ErrorHandling::Null)
+                return ColumnNullable::create(std::move(col_res), std::move(col_null_map));
+            else
+                return col_res;
         }
         else
-            throw Exception(
-                ErrorCodes::ILLEGAL_COLUMN, "Illegal column {} of argument of function {}", arguments[0].column->getName(), getName());
+            throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Illegal column {} of argument of function {}", col_query->getName(), getName());
     }

 private:
-    void formatQueryImpl(const char * begin, const char * end, ColumnString::Chars & output) const
-    {
-        ParserQuery parser{end};
-        auto ast = parseQuery(parser, begin, end, {}, max_query_size, max_parser_depth);
-        WriteBufferFromVector buf(output, AppendModeTag{});
-        formatAST(*ast, buf, /* hilite */ false, /* one_line */ one_line);
-        buf.finalize();
-    }
-
     void formatVector(
         const ColumnString::Chars & data,
         const ColumnString::Offsets & offsets,
         ColumnString::Chars & res_data,
-        ColumnString::Offsets & res_offsets) const
+        ColumnString::Offsets & res_offsets,
+        ColumnUInt8::MutablePtr & res_null_map) const
     {
         const size_t size = offsets.size();
         res_offsets.resize(size);
-        res_data.reserve(data.size());
-        size_t prev_in_offset = 0;
+        res_data.resize(data.size());
+
+        size_t prev_offset = 0;
+        size_t res_data_size = 0;

         for (size_t i = 0; i < size; ++i)
         {
-            const auto * begin = reinterpret_cast<const char *>(&data[prev_in_offset]);
-            const char * end = begin + offsets[i] - 1;
-            formatQueryImpl(begin, end, res_data);
-            res_offsets[i] = res_data.size() + 1;
-            prev_in_offset = offsets[i];
+            const char * begin = reinterpret_cast<const char *>(&data[prev_offset]);
+            const char * end = begin + offsets[i] - prev_offset - 1;
+
+            ParserQuery parser(end);
+            ASTPtr ast;
+            WriteBufferFromOwnString buf;
+
+            try
+            {
+                ast = parseQuery(parser, begin, end, /*query_description*/ {}, max_query_size, max_parser_depth);
+            }
+            catch (...)
+            {
+                if constexpr (error_handling == ErrorHandling::Null)
+                {
+                    const size_t res_data_new_size = res_data_size + 1;
+                    if (res_data_new_size > res_data.size())
+                        res_data.resize(2 * res_data_new_size);
+
+                    res_data[res_data_size] = '\0';
+                    res_data_size += 1;
+
+                    res_offsets[i] = res_data_size;
+                    prev_offset = offsets[i];
+
+                    res_null_map->getData()[i] = 1;
+
+                    continue;
+                }
+                else
+                {
+                    static_assert(error_handling == ErrorHandling::Exception);
+                    throw;
+                }
+            }
+
+            formatAST(*ast, buf, /*hilite*/ false, /*single_line*/ output_formatting == OutputFormatting::SingleLine);
+            auto formatted = buf.stringView();
+
+            const size_t res_data_new_size = res_data_size + formatted.size() + 1;
+            if (res_data_new_size > res_data.size())
+                res_data.resize(2 * res_data_new_size);
+
+            memcpy(&res_data[res_data_size], formatted.begin(), formatted.size());
+            res_data_size += formatted.size();
+
+            res_data[res_data_size] = '\0';
+            res_data_size += 1;
+
+            res_offsets[i] = res_data_size;
+            prev_offset = offsets[i];
         }
+
+        res_data.resize(res_data_size);
     }

-    size_t max_query_size;
-    size_t max_parser_depth;
+    const size_t max_query_size;
+    const size_t max_parser_depth;
 };

 struct NameFormatQuery
@@ -98,15 +169,25 @@ struct NameFormatQuery
 {
     static constexpr auto name = "formatQuery";
 };

+struct NameFormatQueryOrNull
+{
+    static constexpr auto name = "formatQueryOrNull";
+};
+
 struct NameFormatQuerySingleLine
 {
     static constexpr auto name = "formatQuerySingleLine";
 };

+struct NameFormatQuerySingleLineOrNull
+{
+    static constexpr auto name = "formatQuerySingleLineOrNull";
+};
+
 REGISTER_FUNCTION(formatQuery)
 {
-    factory.registerFunction<FunctionFormatQuery<false, NameFormatQuery>>(FunctionDocumentation{
-        .description = "Returns a formatted, possibly multi-line, version of the given SQL query.\n[example:multiline]",
+    factory.registerFunction<FunctionFormatQuery<OutputFormatting::MultiLine, ErrorHandling::Exception, NameFormatQuery>>(FunctionDocumentation{
+        .description = "Returns a formatted, possibly multi-line, version of the given SQL query. Throws in case of a parsing error.\n[example:multiline]",
         .syntax = "formatQuery(query)",
         .arguments = {{"query", "The SQL query to be formatted. [String](../../sql-reference/data-types/string.md)"}},
         .returned_value = "The formatted query. [String](../../sql-reference/data-types/string.md).",
@@ -121,10 +202,28 @@ REGISTER_FUNCTION(formatQuery)
         .categories{"Other"}});
 }

+REGISTER_FUNCTION(formatQueryOrNull)
+{
+    factory.registerFunction<FunctionFormatQuery<OutputFormatting::MultiLine, ErrorHandling::Null, NameFormatQueryOrNull>>(FunctionDocumentation{
+        .description = "Returns a formatted, possibly multi-line, version of the given SQL query. Returns NULL in case of a parsing error.\n[example:multiline]",
+        .syntax = "formatQueryOrNull(query)",
+        .arguments = {{"query", "The SQL query to be formatted. [String](../../sql-reference/data-types/string.md)"}},
+        .returned_value = "The formatted query. [String](../../sql-reference/data-types/string.md).",
+        .examples{
+            {"multiline",
+             "SELECT formatQueryOrNull('select a, b FRom tab WHERE a > 3 and b < 3');",
+             "SELECT\n"
+             "    a,\n"
+             "    b\n"
+             "FROM tab\n"
+             "WHERE (a > 3) AND (b < 3)"}},
+        .categories{"Other"}});
+}
+
 REGISTER_FUNCTION(formatQuerySingleLine)
 {
-    factory.registerFunction<FunctionFormatQuery<true, NameFormatQuerySingleLine>>(FunctionDocumentation{
-        .description = "Like formatQuery() but the returned formatted string contains no line breaks.\n[example:multiline]",
+    factory.registerFunction<FunctionFormatQuery<OutputFormatting::SingleLine, ErrorHandling::Exception, NameFormatQuerySingleLine>>(FunctionDocumentation{
+        .description = "Like formatQuery() but the returned formatted string contains no line breaks. Throws in case of a parsing error.\n[example:multiline]",
         .syntax = "formatQuerySingleLine(query)",
         .arguments = {{"query", "The SQL query to be formatted. [String](../../sql-reference/data-types/string.md)"}},
         .returned_value = "The formatted query. [String](../../sql-reference/data-types/string.md).",
@@ -134,4 +233,19 @@ REGISTER_FUNCTION(formatQuerySingleLine)
              "SELECT a, b FROM tab WHERE (a > 3) AND (b < 3)"}},
         .categories{"Other"}});
 }

+REGISTER_FUNCTION(formatQuerySingleLineOrNull)
+{
+    factory.registerFunction<FunctionFormatQuery<OutputFormatting::SingleLine, ErrorHandling::Null, NameFormatQuerySingleLineOrNull>>(FunctionDocumentation{
+        .description = "Like formatQuery() but the returned formatted string contains no line breaks. Returns NULL in case of a parsing error.\n[example:multiline]",
+        .syntax = "formatQuerySingleLineOrNull(query)",
+        .arguments = {{"query", "The SQL query to be formatted. [String](../../sql-reference/data-types/string.md)"}},
+        .returned_value = "The formatted query. [String](../../sql-reference/data-types/string.md).",
+        .examples{
+            {"multiline",
+             "SELECT formatQuerySingleLineOrNull('select a, b FRom tab WHERE a > 3 and b < 3');",
+             "SELECT a, b FROM tab WHERE (a > 3) AND (b < 3)"}},
+        .categories{"Other"}});
+}
+
 }
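A hypothetical session contrasting the throwing and the OrNull variants registered above:

SELECT formatQueryOrNull('SELEC 1');
-- NULL: the parse error is swallowed and the null map entry is set

SELECT formatQuery('SELEC 1');
-- throws a parsing exception instead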

View File

@@ -59,7 +59,7 @@ public:
     DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
     {
         FunctionArgumentDescriptors args{
-            {"days", &isNativeUnsignedInteger<IDataType>, nullptr, "UInt*"}
+            {"days", &isNativeUInt<IDataType>, nullptr, "UInt*"}
         };
         validateFunctionArgumentTypes(*this, arguments, args);

View File

@@ -41,7 +41,7 @@ public:
     DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
     {
-        if (!isUnsignedInteger(arguments[0].type))
+        if (!isUInt(arguments[0].type))
             throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "First argument for function {} must be unsigned integer", getName());

         if (!arguments[0].column || !isColumnConst(*arguments[0].column))

View File

@@ -47,7 +47,7 @@ public:
     DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
     {
-        if (!isUnsignedInteger(arguments[1].type))
+        if (!isUInt(arguments[1].type))
             throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Second argument for function {} must be unsigned integer", getName());

         if (!arguments[1].column)
             throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Second argument for function {} must be constant", getName());

View File

@@ -1147,7 +1147,7 @@ public:
     double p;
     if (isFloat(p_column.column->getDataType()))
         p = p_column.column->getFloat64(0);
-    else if (isUnsignedInteger(p_column.column->getDataType()))
+    else if (isUInt(p_column.column->getDataType()))
         p = p_column.column->getUInt(0);
     else
         throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Second argument for function {} must be either constant Float64 or constant UInt", getName());

View File

@@ -57,7 +57,7 @@ public:
     {
         for (size_t i = 0; i < 4; ++i)
         {
-            if (!isUnsignedInteger(arguments[i].type))
+            if (!isUInt(arguments[i].type))
             {
                 throw Exception(
                     ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT,

View File

@@ -13,7 +13,7 @@ namespace ErrorCodes
 {
     extern const int ARGUMENT_OUT_OF_BOUND;
     extern const int UNEXPECTED_END_OF_FILE;
-    extern const int CORRUPTED_DATA;
+    extern const int BAD_REQUEST_PARAMETER;
 }

 size_t HTTPChunkedReadBuffer::readChunkHeader()
@@ -22,7 +22,7 @@ size_t HTTPChunkedReadBuffer::readChunkHeader()
         throw Exception(ErrorCodes::UNEXPECTED_END_OF_FILE, "Unexpected end of file while reading chunk header of HTTP chunked data");

     if (!isHexDigit(*in->position()))
-        throw Exception(ErrorCodes::CORRUPTED_DATA, "Unexpected data instead of HTTP chunk header");
+        throw Exception(ErrorCodes::BAD_REQUEST_PARAMETER, "Unexpected data instead of HTTP chunk header");

     size_t res = 0;
     do
View File

@@ -835,7 +835,7 @@ void readCSVStringInto(Vector & s, ReadBuffer & buf, const FormatSettings::CSV &
     /// Check for single '\r' not followed by '\n'
     /// We should not stop in this case.
-    if (*buf.position() == '\r')
+    if (*buf.position() == '\r' && !settings.allow_cr_end_of_line)
     {
         ++buf.position();
         if (!buf.eof() && *buf.position() != '\n')

View File

@@ -984,20 +984,31 @@ inline ReturnType readDateTimeTextImpl(time_t & datetime, ReadBuffer & buf, cons
 template <typename ReturnType>
 inline ReturnType readDateTimeTextImpl(DateTime64 & datetime64, UInt32 scale, ReadBuffer & buf, const DateLUTImpl & date_lut)
 {
+    static constexpr bool throw_exception = std::is_same_v<ReturnType, void>;
+
     time_t whole = 0;
     bool is_negative_timestamp = (!buf.eof() && *buf.position() == '-');
     bool is_empty = buf.eof();

     if (!is_empty)
     {
-        try
+        if constexpr (throw_exception)
         {
-            readDateTimeTextImpl<ReturnType, true>(whole, buf, date_lut);
+            try
+            {
+                readDateTimeTextImpl<ReturnType, true>(whole, buf, date_lut);
+            }
+            catch (const DB::ParsingException &)
+            {
+                if (buf.eof() || *buf.position() != '.')
+                    throw;
+            }
         }
-        catch (const DB::ParsingException & exception)
+        else
         {
-            if (buf.eof() || *buf.position() != '.')
-                throw exception;
+            auto ok = readDateTimeTextImpl<ReturnType, true>(whole, buf, date_lut);
+            if (!ok && (buf.eof() || *buf.position() != '.'))
+                return ReturnType(false);
         }
     }

View File

@@ -513,6 +513,7 @@ bool DDLWorker::tryExecuteQuery(DDLTaskBase & task, const ZooKeeperPtr & zookeep
     /// get the same exception again. So we return false only for several special exception codes,
     /// and consider query as executed with status "failed" and return true in other cases.
     bool no_sense_to_retry = e.code() != ErrorCodes::KEEPER_EXCEPTION &&
+        e.code() != ErrorCodes::UNFINISHED &&
         e.code() != ErrorCodes::NOT_A_LEADER &&
         e.code() != ErrorCodes::TABLE_IS_READ_ONLY &&
         e.code() != ErrorCodes::CANNOT_ASSIGN_ALTER &&
@@ -793,11 +794,15 @@ bool DDLWorker::tryExecuteQueryOnLeaderReplica(
     // Has to get with zk fields to get active replicas field
     replicated_storage->getStatus(status, true);

-    // Should return as soon as possible if the table is dropped.
+    // Should return as soon as possible if the table is dropped or detached, so we will release StoragePtr
     bool replica_dropped = storage->is_dropped;
     bool all_replicas_likely_detached = status.active_replicas == 0 && !DatabaseCatalog::instance().isTableExist(storage->getStorageID(), context);
     if (replica_dropped || all_replicas_likely_detached)
     {
+        /// We have to exit (and release StoragePtr) if the replica is being restarted,
+        /// but we can retry in this case, so don't write execution status
+        if (storage->is_being_restarted)
+            throw Exception(ErrorCodes::UNFINISHED, "Cannot execute replicated DDL query, table is dropped or detached permanently");
         LOG_WARNING(log, ", task {} will not be executed.", task.entry_name);
         task.execution_status = ExecutionStatus(ErrorCodes::UNFINISHED, "Cannot execute replicated DDL query, table is dropped or detached permanently");
         return false;

View File

@@ -6,6 +6,7 @@
 #include <IO/WriteBufferFromString.h>
 #include <Parsers/ASTShowColumnsQuery.h>
 #include <Parsers/formatAST.h>
+#include <Interpreters/ClientInfo.h>
 #include <Interpreters/Context.h>
 #include <Interpreters/executeQuery.h>
@@ -25,8 +26,10 @@ String InterpreterShowColumnsQuery::getRewrittenQuery()
 {
     const auto & query = query_ptr->as<ASTShowColumnsQuery &>();

+    ClientInfo::Interface client_interface = getContext()->getClientInfo().interface;
+    const bool use_mysql_types = (client_interface == ClientInfo::Interface::MYSQL); // connection made through MySQL wire protocol
+
     const auto & settings = getContext()->getSettingsRef();
-    const bool use_mysql_types = settings.use_mysql_types_in_show_columns;
     const bool remap_string_as_text = settings.mysql_map_string_to_text_in_show_columns;
     const bool remap_fixed_string_as_text = settings.mysql_map_fixed_string_to_text_in_show_columns;
@@ -39,7 +42,6 @@ String InterpreterShowColumnsQuery::getRewrittenQuery()
     if (use_mysql_types)
     {
         /// Cheapskate SQL-based mapping from native types to MySQL types, see https://dev.mysql.com/doc/refman/8.0/en/data-types.html
-        /// Only used with setting 'use_mysql_types_in_show_columns = 1'
         /// Known issues:
         /// - Enums are translated to TEXT
         rewritten_query += fmt::format(
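The practical effect of this change, sketched as a hypothetical session against a table tab (9004 is the default port of the MySQL wire protocol, assuming it is enabled on the server):

-- native protocol (clickhouse-client): native type names
SHOW COLUMNS FROM tab;  -- e.g. UInt64, String

-- MySQL wire protocol (mysql --port 9004): MySQL type names, no setting required anymore
SHOW COLUMNS FROM tab;  -- e.g. BIGINT UNSIGNED, BLOB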

View File

@@ -728,6 +728,11 @@ StoragePtr InterpreterSystemQuery::tryRestartReplica(const StorageID & replica,
     if (!table || !dynamic_cast<const StorageReplicatedMergeTree *>(table.get()))
         return nullptr;

+    SCOPE_EXIT({
+        if (table)
+            table->is_being_restarted = false;
+    });
+    table->is_being_restarted = true;
     table->flushAndShutdown();
     {
         /// If table was already dropped by anyone, an exception will be thrown

View File

@@ -1146,12 +1146,13 @@ JoinTreeQueryPlan buildQueryPlanForJoinNode(const QueryTreeNodePtr & join_table_
     const auto & join_clause_right_key_nodes = join_clause.getRightKeyNodes();

     size_t join_clause_key_nodes_size = join_clause_left_key_nodes.size();
-    assert(join_clause_key_nodes_size == join_clause_right_key_nodes.size());
+    chassert(join_clause_key_nodes_size == join_clause_right_key_nodes.size());

     for (size_t i = 0; i < join_clause_key_nodes_size; ++i)
     {
-        table_join_clause.key_names_left.push_back(join_clause_left_key_nodes[i]->result_name);
-        table_join_clause.key_names_right.push_back(join_clause_right_key_nodes[i]->result_name);
+        table_join_clause.addKey(join_clause_left_key_nodes[i]->result_name,
+                                 join_clause_right_key_nodes[i]->result_name,
+                                 join_clause.isNullsafeCompareKey(i));
     }

     const auto & join_clause_get_left_filter_condition_nodes = join_clause.getLeftFilterConditionNodes();

View File

@@ -191,7 +191,7 @@ void buildJoinClause(ActionsDAGPtr join_expression_dag,
     auto asof_inequality = getASOFJoinInequality(function_name);
     bool is_asof_join_inequality = join_node.getStrictness() == JoinStrictness::Asof && asof_inequality != ASOFJoinInequality::None;

-    if (function_name == "equals" || is_asof_join_inequality)
+    if (function_name == "equals" || function_name == "isNotDistinctFrom" || is_asof_join_inequality)
     {
         const auto * left_child = join_expressions_actions_node->children.at(0);
         const auto * right_child = join_expressions_actions_node->children.at(1);
@@ -253,7 +253,8 @@ void buildJoinClause(ActionsDAGPtr join_expression_dag,
             }
             else
             {
-                join_clause.addKey(left_key, right_key);
+                bool null_safe_comparison = function_name == "isNotDistinctFrom";
+                join_clause.addKey(left_key, right_key, null_safe_comparison);
             }
         }
         else
@@ -474,6 +475,24 @@ JoinClausesAndActions buildJoinClausesAndActions(const ColumnsWithTypeAndName &
             right_key_node = &join_expression_actions->addCast(*right_key_node, common_type, {});
         }

+        if (join_clause.isNullsafeCompareKey(i) && left_key_node->result_type->isNullable() && right_key_node->result_type->isNullable())
+        {
+            /**
+              * In case of null-safe comparison (a IS NOT DISTINCT FROM b),
+              * we need to wrap keys with a non-nullable type.
+              * The type `tuple` can be used for this purpose,
+              * because the value tuple(NULL) is not NULL itself (moreover, it has type Tuple(Nullable(T)), which is not Nullable).
+              * Thus, the join algorithm will match keys with values tuple(NULL).
+              * Example:
+              *   SELECT * FROM t1 JOIN t2 ON t1.a <=> t2.b
+              * This will be semantically transformed to:
+              *   SELECT * FROM t1 JOIN t2 ON tuple(t1.a) == tuple(t2.b)
+              */
+            auto wrap_nullsafe_function = FunctionFactory::instance().get("tuple", planner_context->getQueryContext());
+            left_key_node = &join_expression_actions->addFunction(wrap_nullsafe_function, {left_key_node}, {});
+            right_key_node = &join_expression_actions->addFunction(wrap_nullsafe_function, {right_key_node}, {});
+        }
+
         join_expression_actions->addOrReplaceInOutputs(*left_key_node);
         join_expression_actions->addOrReplaceInOutputs(*right_key_node);
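A hypothetical pair of queries showing what the tuple wrapping enables (assuming t1.a and t2.b are Nullable):

SELECT * FROM t1 JOIN t2 ON t1.a = t2.b;
-- rows where either key is NULL never match

SELECT * FROM t1 JOIN t2 ON t1.a <=> t2.b;
-- NULL on both sides now matches, since tuple(NULL) = tuple(NULL) holds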

View File

@@ -53,10 +53,12 @@ class JoinClause
 {
 public:
     /// Add keys
-    void addKey(const ActionsDAG::Node * left_key_node, const ActionsDAG::Node * right_key_node)
+    void addKey(const ActionsDAG::Node * left_key_node, const ActionsDAG::Node * right_key_node, bool null_safe_comparison = false)
     {
         left_key_nodes.emplace_back(left_key_node);
         right_key_nodes.emplace_back(right_key_node);
+        if (null_safe_comparison)
+            nullsafe_compare_key_indexes.emplace(left_key_nodes.size() - 1);
     }

     void addASOFKey(const ActionsDAG::Node * left_key_node, const ActionsDAG::Node * right_key_node, ASOFJoinInequality asof_inequality)
@@ -97,6 +99,11 @@ public:
         return right_key_nodes;
     }

+    bool isNullsafeCompareKey(size_t idx) const
+    {
+        return nullsafe_compare_key_indexes.contains(idx);
+    }
+
     /// Returns true if JOIN clause has ASOF conditions, false otherwise
     bool hasASOF() const
     {
@@ -147,6 +154,8 @@ private:
     ActionsDAG::NodeRawConstPtrs left_filter_condition_nodes;
     ActionsDAG::NodeRawConstPtrs right_filter_condition_nodes;

+    std::unordered_set<size_t> nullsafe_compare_key_indexes;
 };

 using JoinClauses = std::vector<JoinClause>;

View File

@@ -73,7 +73,7 @@ void Chunk::checkNumRowsIsConsistent()
     auto & column = columns[i];
     if (column->size() != num_rows)
         throw Exception(ErrorCodes::LOGICAL_ERROR, "Invalid number of rows in Chunk column {}: expected {}, got {}",
-            column->getName()+ " position " + toString(i), toString(num_rows), toString(column->size()));
+            column->getName() + " position " + toString(i), toString(num_rows), toString(column->size()));
 }

View File

@@ -177,7 +177,7 @@ void CSVFormatReader::skipRow()
     }
 }

-static void skipEndOfLine(ReadBuffer & in)
+static void skipEndOfLine(ReadBuffer & in, bool allow_cr_end_of_line)
 {
     /// \n (Unix) or \r\n (DOS/Windows) or \n\r (Mac OS Classic)
@@ -192,7 +192,7 @@ static void skipEndOfLine(ReadBuffer & in)
         ++in.position();
         if (!in.eof() && *in.position() == '\n')
             ++in.position();
-        else
+        else if (!allow_cr_end_of_line)
             throw Exception(ErrorCodes::INCORRECT_DATA,
                 "Cannot parse CSV format: found \\r (CR) not followed by \\n (LF)."
                 " Line must end by \\n (LF) or \\r\\n (CR LF) or \\n\\r.");
@@ -258,7 +258,7 @@ void CSVFormatReader::skipRowEndDelimiter()
     if (buf->eof())
         return;

-    skipEndOfLine(*buf);
+    skipEndOfLine(*buf, format_settings.csv.allow_cr_end_of_line);
 }

 void CSVFormatReader::skipHeaderRow()
@@ -343,7 +343,7 @@ bool CSVFormatReader::parseRowEndWithDiagnosticInfo(WriteBuffer & out)
         return false;
     }

-    skipEndOfLine(*buf);
+    skipEndOfLine(*buf, format_settings.csv.allow_cr_end_of_line);
     return true;
 }
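End to end, the new knob can be exercised like this (hypothetical data.csv whose rows end with a bare CR):

SELECT * FROM file('data.csv', CSV);
-- default: throws "found \r (CR) not followed by \n (LF)"

SELECT * FROM file('data.csv', CSV)
SETTINGS input_format_csv_allow_cr_end_of_line = 1;
-- the lone CR is now tolerated as a line ending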

View File

@@ -226,7 +226,7 @@ FillingTransform::FillingTransform(
         throw Exception(ErrorCodes::INVALID_WITH_FILL_EXPRESSION,
             "Incompatible types of WITH FILL expression values with column type {}", type->getName());

-    if (isUnsignedInteger(type) &&
+    if (isUInt(type) &&
         ((!descr.fill_from.isNull() && less(descr.fill_from, Field{0}, 1)) ||
          (!descr.fill_to.isNull() && less(descr.fill_to, Field{0}, 1))))
     {
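The guarded case, as a hypothetical query (number is UInt64, so a negative bound cannot be represented and the branch above rejects it):

SELECT number FROM numbers(3) ORDER BY number WITH FILL FROM -2;
-- expected to fail with an INVALID_WITH_FILL_EXPRESSION error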

View File

@@ -44,6 +44,7 @@ namespace ErrorCodes
     extern const int LOGICAL_ERROR;
     extern const int DUPLICATE_COLUMN;
     extern const int NOT_IMPLEMENTED;
+    extern const int SUPPORT_IS_DISABLED;
 }

 namespace
@@ -1083,6 +1084,13 @@ void AlterCommands::validate(const StoragePtr & table, ContextPtr context) const
             throw Exception(ErrorCodes::BAD_ARGUMENTS,
                 "Data type has to be specified for column {} to add", backQuote(column_name));

+        /// FIXME: Adding a new column of type Object(JSON) is broken.
+        /// Looks like there is something around default expression for this column (method `getDefault` is not implemented for the data type Object).
+        /// But after ALTER TABLE ADD COLUMN we need to fill existing rows with something (exactly the default value).
+        /// So we don't allow to do it for now.
+        if (command.data_type->hasDynamicSubcolumns())
+            throw Exception(ErrorCodes::SUPPORT_IS_DISABLED, "Adding a new column of a type which has dynamic subcolumns to an existing table is not allowed. It has known bugs");
+
         if (column_name == LightweightDeleteDescription::FILTER_COLUMN.name && std::dynamic_pointer_cast<MergeTreeData>(table))
             throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Cannot add column {}: "
                 "this column name is reserved for lightweight delete feature", backQuote(column_name));
@@ -1145,17 +1153,22 @@ void AlterCommands::validate(const StoragePtr & table, ContextPtr context) const
         }
     }

-    /// The change of data type to/from Object is broken, so disable it for now
+    /// FIXME: Modifying the column to/from Object(JSON) is broken.
+    /// Looks like there is something around default expression for this column (method `getDefault` is not implemented for the data type Object).
+    /// But after ALTER TABLE MODIFY COLUMN we need to fill existing rows with something (exactly the default value) or calculate the common type for it.
+    /// So we don't allow to do it for now.
     if (command.data_type)
     {
         const GetColumnsOptions options(GetColumnsOptions::AllPhysical);
         const auto old_data_type = all_columns.getColumn(options, column_name).type;

-        if (command.data_type->getName().contains("Object")
-            || old_data_type->getName().contains("Object"))
+        bool new_type_has_object = command.data_type->hasDynamicSubcolumns();
+        bool old_type_has_object = old_data_type->hasDynamicSubcolumns();
+
+        if (new_type_has_object || old_type_has_object)
             throw Exception(
                 ErrorCodes::BAD_ARGUMENTS,
-                "The change of data type {} of column {} to {} is not allowed",
+                "The change of data type {} of column {} to {} is not allowed. It has known bugs",
                 old_data_type->getName(), backQuote(column_name), command.data_type->getName());
     }
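What the two new guards reject, as hypothetical statements (assuming a column type with dynamic subcolumns, such as Object('json')):

ALTER TABLE tab ADD COLUMN payload Object('json');
-- now fails with SUPPORT_IS_DISABLED: "... It has known bugs"

ALTER TABLE tab MODIFY COLUMN payload Object('json');
-- fails with BAD_ARGUMENTS: "The change of data type ... is not allowed. It has known bugs"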

Some files were not shown because too many files have changed in this diff.