Merge branch 'master' into better_data_part_storage_builder

alesapin 2022-06-30 22:54:42 +02:00
commit 979565edf0
34 changed files with 227 additions and 76 deletions

@@ -15,5 +15,8 @@ ClickHouse® is an open-source column-oriented database management system that a
* [Contacts](https://clickhouse.com/company/#contact) can help to get your questions answered if there are any.
## Upcoming events
* [Paris Meetup](https://www.meetup.com/clickhouse-france-user-group/events/286304312/) Please join us for an evening of talks (in English), food, and discussion. Featuring talks on ClickHouse in production and at least one on the deep internals of ClickHouse itself.
* [v22.7 Release Webinar](https://clickhouse.com/company/events/v22-7-release-webinar/) Original creator, co-founder, and CTO of ClickHouse Alexey Milovidov will walk us through the highlights of the release, provide live demos, and share his vision of what is coming on the roadmap.
* [ClickHouse Meetup at the Cloudflare office in London](https://www.meetup.com/clickhouse-london-user-group/events/286891586/) ClickHouse meetup at the Cloudflare office space in central London
* [ClickHouse Meetup at the Metoda office in Munich](https://www.meetup.com/clickhouse-meetup-munich/events/286891667/) ClickHouse meetup at the Metoda office in Munich

@@ -42,6 +42,7 @@ function install_packages()
function configure()
{
# install test configs
export USE_DATABASE_ORDINARY=1
/usr/share/clickhouse-test/config/install.sh
# we mount tests folder from repo to /usr/share

@@ -136,4 +136,3 @@ DESCRIBE TABLE test_database.test_table;
└────────┴───────────────────┘
```
[Original article](https://clickhouse.com/docs/en/database-engines/postgresql/) <!--hide-->

@@ -43,4 +43,3 @@ The `TinyLog` engine is the simplest in the family and provides the poorest func
The `Log` and `StripeLog` engines support parallel data reading. When reading data, ClickHouse uses multiple threads. Each thread processes a separate data block. The `Log` engine uses a separate file for each column of the table. `StripeLog` stores all the data in one file. As a result, the `StripeLog` engine uses fewer file descriptors, but the `Log` engine provides higher efficiency when reading data.
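For illustration, a minimal sketch with hypothetical table names:
```sql
-- Log keeps one file per column; StripeLog packs all columns into one file.
CREATE TABLE log_example (id UInt64, msg String) ENGINE = Log;
CREATE TABLE stripe_example (id UInt64, msg String) ENGINE = StripeLog;
```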
[Original article](https://clickhouse.com/docs/en/operations/table_engines/log_family/) <!--hide-->

@@ -68,40 +68,42 @@ For a description of parameters, see the [CREATE query description](../../../sql
`ORDER BY` — The sorting key.
A tuple of column names or arbitrary expressions. Example: `ORDER BY (CounterID, EventDate)`.
ClickHouse uses the sorting key as a primary key if the primary key is not defined explicitly by the `PRIMARY KEY` clause.
Use the `ORDER BY tuple()` syntax, if you do not need sorting. See [Selecting the Primary Key](#selecting-the-primary-key).
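For illustration, a minimal sketch of both forms (the table and column names are hypothetical):
```sql
CREATE TABLE events
(
    CounterID UInt32,
    EventDate Date
)
ENGINE = MergeTree
ORDER BY (CounterID, EventDate);

-- When no sorting is needed:
CREATE TABLE unsorted (x UInt64) ENGINE = MergeTree ORDER BY tuple();
```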
#### PARTITION BY
`PARTITION BY` — The [partitioning key](../../../engines/table-engines/mergetree-family/custom-partitioning-key.md). Optional. In most cases you don't need a partition key, and when you do, you rarely need one more granular than by month. Partitioning does not speed up queries (in contrast to the `ORDER BY` expression). You should never use partitioning that is too granular. Don't partition your data by client identifiers or names (instead, make the client identifier or name the first column in the `ORDER BY` expression).
For partitioning by month, use the `toYYYYMM(date_column)` expression, where `date_column` is a column with a date of the type [Date](../../../sql-reference/data-types/date.md). The partition names here have the `"YYYYMM"` format.
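For example, a sketch of monthly partitioning (hypothetical table and columns):
```sql
CREATE TABLE visits
(
    EventDate Date,
    UserID UInt64
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(EventDate)
ORDER BY (UserID, EventDate);
```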
#### PRIMARY KEY
`PRIMARY KEY` — The primary key if it [differs from the sorting key](#choosing-a-primary-key-that-differs-from-the-sorting-key). Optional.
By default the primary key is the same as the sorting key (which is specified by the `ORDER BY` clause). Thus in most cases it is unnecessary to specify a separate `PRIMARY KEY` clause.
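A minimal sketch of a primary key that differs from the sorting key (hypothetical table; note the `PRIMARY KEY` columns form a prefix of `ORDER BY`):
```sql
CREATE TABLE hits
(
    CounterID UInt32,
    EventDate Date,
    UserID UInt64
)
ENGINE = SummingMergeTree
ORDER BY (CounterID, EventDate, UserID)
PRIMARY KEY (CounterID, EventDate);
```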
#### SAMPLE BY
`SAMPLE BY` — An expression for sampling. Optional.
If a sampling expression is used, the primary key must contain it. The result of a sampling expression must be an unsigned integer. Example: `SAMPLE BY intHash32(UserID) ORDER BY (CounterID, EventDate, intHash32(UserID))`.
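A minimal sketch (hypothetical table) that keeps the sampling expression in the primary key:
```sql
CREATE TABLE hits_sampled
(
    CounterID UInt32,
    EventDate Date,
    UserID UInt64
)
ENGINE = MergeTree
ORDER BY (CounterID, EventDate, intHash32(UserID))
SAMPLE BY intHash32(UserID);

-- Read an approximate 10% sample:
SELECT count() FROM hits_sampled SAMPLE 0.1;
```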
#### TTL
`TTL` — A list of rules specifying storage duration of rows and defining logic of automatic parts movement [between disks and volumes](#table_engine-mergetree-multiple-volumes). Optional.
Expression must have one `Date` or `DateTime` column as a result. Example:
```
TTL date + INTERVAL 1 DAY
```
Type of the rule `DELETE|TO DISK 'xxx'|TO VOLUME 'xxx'|GROUP BY` specifies an action to be done with the part if the expression is satisfied (reaches current time): removal of expired rows, moving a part (if expression is satisfied for all rows in a part) to specified disk (`TO DISK 'xxx'`) or to volume (`TO VOLUME 'xxx'`), or aggregating values in expired rows. Default type of the rule is removal (`DELETE`). List of multiple rules can be specified, but there should be no more than one `DELETE` rule.
For more details, see [TTL for columns and tables](#table_engine-mergetree-ttl)
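A minimal sketch combining several TTL rules (hypothetical table; the `TO VOLUME 'slow'` rule assumes a storage policy defining such a volume):
```sql
CREATE TABLE ttl_example
(
    d Date,
    value UInt64
)
ENGINE = MergeTree
ORDER BY d
TTL d + INTERVAL 1 MONTH TO VOLUME 'slow',
    d + INTERVAL 1 YEAR DELETE;
```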
### SETTINGS
Additional parameters that control the behavior of the `MergeTree` (optional):
@@ -129,7 +131,6 @@ Additional parameters that control the behavior of the `MergeTree` (optional):
#### min_merge_bytes_to_use_direct_io
`min_merge_bytes_to_use_direct_io` — The minimum data volume for merge operation that is required for using direct I/O access to the storage disk. When merging data parts, ClickHouse calculates the total storage volume of all the data to be merged. If the volume exceeds `min_merge_bytes_to_use_direct_io` bytes, ClickHouse reads and writes the data to the storage disk using the direct I/O interface (`O_DIRECT` option). If `min_merge_bytes_to_use_direct_io = 0`, then direct I/O is disabled. Default value: `10 * 1024 * 1024 * 1024` bytes.
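For example, a sketch that lowers the threshold to 1 GiB for a hypothetical table:
```sql
CREATE TABLE t (x UInt64) ENGINE = MergeTree ORDER BY x
SETTINGS min_merge_bytes_to_use_direct_io = 1073741824;
```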
<a name="mergetree_setting-merge_with_ttl_timeout"></a>
#### merge_with_ttl_timeout
@@ -305,15 +306,29 @@ For `SELECT` queries, ClickHouse analyzes whether an index can be used. An index
Thus, it is possible to quickly run queries on one or many ranges of the primary key. In this example, queries will be fast when run for a specific tracking tag, for a specific tag and date range, for a specific tag and date, for multiple tags with a date range, and so on.
Let's look at the engine configured as follows:
```sql
ENGINE MergeTree()
PARTITION BY toYYYYMM(EventDate)
ORDER BY (CounterID, EventDate)
SETTINGS index_granularity=8192
```
In this case, in queries:
``` sql
SELECT count() FROM table
WHERE EventDate = toDate(now())
AND CounterID = 34

SELECT count() FROM table
WHERE EventDate = toDate(now())
AND (CounterID = 34 OR CounterID = 42)

SELECT count() FROM table
WHERE ((EventDate >= toDate('2014-01-01')
AND EventDate <= toDate('2014-01-31')) OR EventDate = toDate('2014-05-01'))
AND CounterID IN (101500, 731962, 160656)
AND (CounterID = 101500 OR EventDate != toDate('2014-05-01'))
```
ClickHouse will use the primary key index to prune data that cannot match the conditions, and the monthly partitioning key to prune partitions that fall outside the requested date ranges.
@@ -376,36 +391,36 @@ SELECT count() FROM table WHERE u64 * i32 == 10 AND u64 * length(s) >= 1234
#### `minmax`
Stores extremes of the specified expression (if the expression is `tuple`, then it stores extremes for each element of `tuple`), uses stored info for skipping blocks of data like the primary key.
#### `set(max_rows)`
Stores unique values of the specified expression (no more than `max_rows` rows, `max_rows=0` means “no limits”). Uses the values to check if the `WHERE` expression is not satisfiable on a block of data.
#### `ngrambf_v1(n, size_of_bloom_filter_in_bytes, number_of_hash_functions, random_seed)`
Stores a [Bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) that contains all ngrams from a block of data. Works only with datatypes: [String](../../../sql-reference/data-types/string.md), [FixedString](../../../sql-reference/data-types/fixedstring.md) and [Map](../../../sql-reference/data-types/map.md). Can be used for optimization of `EQUALS`, `LIKE` and `IN` expressions. See the sketch after the parameter list.
- `n` — ngram size.
- `size_of_bloom_filter_in_bytes` — Bloom filter size in bytes (you can use large values here, for example, 256 or 512, because it can be compressed well).
- `number_of_hash_functions` — The number of hash functions used in the Bloom filter.
- `random_seed` — The seed for Bloom filter hash functions.
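A minimal sketch of such an index (hypothetical table; the parameter values are illustrative):
```sql
CREATE TABLE strings
(
    s String,
    INDEX idx_ngram s TYPE ngrambf_v1(4, 1024, 3, 0) GRANULARITY 1
)
ENGINE = MergeTree
ORDER BY s;
```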
#### `tokenbf_v1(size_of_bloom_filter_in_bytes, number_of_hash_functions, random_seed)`
The same as `ngrambf_v1`, but stores tokens instead of ngrams. Tokens are sequences separated by non-alphanumeric characters.
#### `bloom_filter([false_positive])`
Stores a [Bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) for the specified columns.
The optional `false_positive` parameter is the probability of receiving a false positive response from the filter. Possible values: (0, 1). Default value: 0.025.
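For example, extending the hypothetical `strings` table from the sketch above:
```sql
ALTER TABLE strings ADD INDEX idx_bf s TYPE bloom_filter(0.01) GRANULARITY 3;
```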
Supported data types: `Int*`, `UInt*`, `Float*`, `Enum`, `Date`, `DateTime`, `String`, `FixedString`, `Array`, `LowCardinality`, `Nullable`, `UUID`, `Map`.
For the `Map` data type, the client can specify whether the index should be created for keys or for values, using the [mapKeys](../../../sql-reference/functions/tuple-map-functions.md#mapkeys) or [mapValues](../../../sql-reference/functions/tuple-map-functions.md#mapvalues) function.
The following functions can use the filter: [equals](../../../sql-reference/functions/comparison-functions.md), [notEquals](../../../sql-reference/functions/comparison-functions.md), [in](../../../sql-reference/functions/in-functions), [notIn](../../../sql-reference/functions/in-functions), [has](../../../sql-reference/functions/array-functions#hasarr-elem), [hasAny](../../../sql-reference/functions/array-functions#hasany), [hasAll](../../../sql-reference/functions/array-functions#hasall).
Example of index creation for `Map` data type:
```
INDEX map_key_index mapKeys(map_column) TYPE bloom_filter GRANULARITY 1
```

@@ -86,4 +86,3 @@ $ echo -e "1,2\n3,4" | clickhouse-local -q "CREATE TABLE table (a Int64, b Int64
- Indices
- Replication
[Original article](https://clickhouse.com/docs/en/operations/table_engines/special/file/) <!--hide-->

@@ -151,4 +151,3 @@ ALTER TABLE id_val_join DELETE WHERE id = 3;
└────┴─────┘
```
[Original article](https://clickhouse.com/docs/en/operations/table_engines/special/join/) <!--hide-->

@@ -86,4 +86,3 @@ SELECT * FROM WatchLog;
- [Virtual columns](../../../engines/table-engines/special/index.md#table_engines-virtual_columns)
- [merge](../../../sql-reference/table-functions/merge.md) table function
[Original article](https://clickhouse.com/docs/en/operations/table_engines/special/merge/) <!--hide-->

@@ -10,6 +10,3 @@ When writing to a `Null` table, data is ignored. When reading from a `Null` tabl
:::note
If you are wondering why this is useful, note that you can create a materialized view on a `Null` table, so the data written to the table ends up affecting the view while the original raw data is still discarded. A minimal sketch follows this note.
:::
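A minimal sketch of that pattern (hypothetical names):
```sql
CREATE TABLE raw_events (ts DateTime, msg String) ENGINE = Null;

CREATE MATERIALIZED VIEW events_per_minute
ENGINE = MergeTree ORDER BY m AS
SELECT toStartOfMinute(ts) AS m, count() AS c
FROM raw_events
GROUP BY m;

-- Rows inserted into raw_events feed the view and are then discarded.
```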
[Original article](https://clickhouse.com/docs/en/operations/table_engines/special/null/) <!--hide-->

@@ -20,4 +20,3 @@ When creating a table, the following settings are applied:
- [persistent](../../../operations/settings/settings.md#persistent)
[Original article](https://clickhouse.com/docs/en/operations/table_engines/special/set/) <!--hide-->

@@ -89,4 +89,3 @@ SELECT * FROM url_engine_table
- Indexes.
- Replication.
[Original article](https://clickhouse.com/docs/en/operations/table_engines/special/url/) <!--hide-->

@@ -13,4 +13,3 @@ The following external authenticators and directories are supported:
- Kerberos [Authenticator](./kerberos.md#external-authenticators-kerberos)
- [SSL X.509 authentication](./ssl-x509.md#ssl-external-authentication)
[Original article](https://clickhouse.com/docs/en/operations/external-authenticators/index/) <!--hide-->

@@ -61,4 +61,3 @@ exception_code: ZOK
2 rows in set. Elapsed: 0.025 sec.
```
[Original article](https://clickhouse.com/docs/en/operations/system_tables/distributed_ddl_queue) <!--hide-->

@@ -47,4 +47,3 @@ last_exception:
- [Distributed table engine](../../engines/table-engines/special/distributed.md)
[Original article](https://clickhouse.com/docs/en/operations/system_tables/distribution_queue) <!--hide-->

@@ -50,4 +50,3 @@ attribute.values: []
- [OpenTelemetry](../../operations/opentelemetry.md)
[Original article](https://clickhouse.com/docs/en/operations/system_tables/opentelemetry_span_log) <!--hide-->

@@ -145,4 +145,3 @@ column_marks_bytes: 48
- [MergeTree family](../../engines/table-engines/mergetree-family/mergetree.md)
[Original article](https://clickhouse.com/docs/en/operations/system_tables/parts_columns) <!--hide-->

@@ -88,4 +88,3 @@ last_postpone_time: 1970-01-01 03:00:00
- [Managing ReplicatedMergeTree Tables](../../sql-reference/statements/system.md#query-language-system-replicated)
[Original article](https://clickhouse.com/docs/en/operations/system_tables/replication_queue) <!--hide-->

@@ -66,5 +66,3 @@ Result:
└──────────────────────────────────────────────────────────────────────────────────┘
```
[Original article](https://clickhouse.com/docs/en/sql-reference/aggregate-functions/reference/meanZTest/) <!--hide-->

@@ -69,4 +69,3 @@ Result:
- [Welch's t-test](https://en.wikipedia.org/wiki/Welch%27s_t-test)
- [studentTTest function](studentttest.md#studentttest)
[Original article](https://clickhouse.com/docs/en/sql-reference/aggregate-functions/reference/welchTTest/) <!--hide-->

@@ -27,4 +27,3 @@ You can use domains anywhere corresponding base type can be used, for example:
- Can't implicitly convert string values into domain values when inserting data from another column or table.
- Domain adds no constraints on stored values.
[Original article](https://clickhouse.com/docs/en/data_types/domains/) <!--hide-->

@@ -104,4 +104,3 @@ Result:
└─────────────────────────────────────────────────────────────────────────────────────────────────┴─────────────────┘
```
[Original article](https://clickhouse.com/docs/en/data-types/geo/) <!--hide-->

@@ -108,4 +108,3 @@ Result:
- [map()](../../sql-reference/functions/tuple-map-functions.md#function-map) function
- [CAST()](../../sql-reference/functions/type-conversion-functions.md#type_conversion_function-cast) function
[Original article](https://clickhouse.com/docs/en/data-types/map/) <!--hide-->

@@ -39,4 +39,3 @@ Values of the `SimpleAggregateFunction(func, Type)` look and stored the same way
CREATE TABLE simple (id UInt64, val SimpleAggregateFunction(sum, Double)) ENGINE=AggregatingMergeTree ORDER BY id;
```
[Original article](https://clickhouse.com/docs/en/data_types/simpleaggregatefunction/) <!--hide-->

@@ -355,4 +355,3 @@ Result:
└───────────┘
```
[Original article](https://clickhouse.com/docs/en/sql-reference/functions/encryption_functions/) <!--hide-->

@@ -111,4 +111,3 @@ SELECT * FROM mysql('localhost:3306', 'test', 'test', 'bayonet', '123');
- [The MySQL table engine](../../engines/table-engines/integrations/mysql.md)
- [Using MySQL as a source of external dictionary](../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-sources.md#dicts-external_dicts_dict_sources-mysql)
[Original article](https://clickhouse.com/docs/en/sql-reference/table_functions/mysql/) <!--hide-->

@@ -18,6 +18,12 @@ option (ENABLE_CLICKHOUSE_SERVER "Server mode (main mode)" ${ENABLE_CLICKHOUSE_A
option (ENABLE_CLICKHOUSE_CLIENT "Client mode (interactive tui/shell that connects to the server)"
    ${ENABLE_CLICKHOUSE_ALL})
if (CLICKHOUSE_SPLIT_BINARY OR NOT ENABLE_UTILS)
    option (ENABLE_CLICKHOUSE_SELF_EXTRACTING "Self-extracting executable" OFF)
else ()
    option (ENABLE_CLICKHOUSE_SELF_EXTRACTING "Self-extracting executable" ON)
endif ()
# https://clickhouse.com/docs/en/operations/utilities/clickhouse-local/
option (ENABLE_CLICKHOUSE_LOCAL "Local files fast processing mode" ${ENABLE_CLICKHOUSE_ALL})
@@ -101,6 +107,12 @@ else()
message(STATUS "Local mode: OFF")
endif()
if (ENABLE_CLICKHOUSE_SELF_EXTRACTING)
    message(STATUS "Self-extracting executable: ON")
else()
    message(STATUS "Self-extracting executable: OFF")
endif()
if (ENABLE_CLICKHOUSE_BENCHMARK)
    message(STATUS "Benchmark mode: ON")
else()
@@ -266,6 +278,10 @@ if (ENABLE_CLICKHOUSE_LIBRARY_BRIDGE)
    add_subdirectory (library-bridge)
endif ()
if (ENABLE_CLICKHOUSE_SELF_EXTRACTING)
    add_subdirectory (self-extracting)
endif ()
if (CLICKHOUSE_ONE_SHARED)
    add_library(clickhouse-lib SHARED
        ${CLICKHOUSE_SERVER_SOURCES}

@@ -0,0 +1,6 @@
add_custom_target (self-extracting ALL
    ${CMAKE_COMMAND} -E remove clickhouse
    COMMAND ${CMAKE_BINARY_DIR}/utils/self-extracting-executable/compressor clickhouse ../clickhouse
    DEPENDS clickhouse compressor
)

@@ -489,10 +489,10 @@ ColumnPtr ColumnVector<T>::filter(const IColumn::Filter & filt, ssize_t result_s
    const T * data_pos = data.data();

    /** A slightly more optimized version.
      * Based on the assumption that often pieces of consecutive values
      * completely pass or do not pass the filter.
      * Therefore, we will optimistically check the parts of `SIMD_BYTES` values.
      */
    static constexpr size_t SIMD_BYTES = 64;
    const UInt8 * filt_end_aligned = filt_pos + size / SIMD_BYTES * SIMD_BYTES;
@@ -577,6 +577,115 @@ ColumnPtr ColumnVector<T>::index(const IColumn & indexes, size_t limit) const
    return selectIndexImpl(*this, indexes, limit);
}
#ifdef __SSE2__

namespace
{

/** Optimization for ColumnVector replicate using SIMD instructions.
  * For such optimization it is important that data is right padded with 15 bytes.
  *
  * Replicate span size is offsets[i] - offsets[i - 1].
  *
  * Split spans into 3 categories.
  * 1. Span with 0 size. Continue iteration.
  *
  * 2. Span with 1 size. Update pointer from which data must be copied into result.
  * Then if we see span with size 1 or greater than 1 copy data directly into result data and reset pointer.
  * Example:
  * Data: 1 2 3 4
  * Offsets: 1 2 3 4
  * Result data: 1 2 3 4
  *
  * 3. Span with size greater than 1. Save single data element into register and copy it into result data.
  * Example:
  * Data: 1 2 3 4
  * Offsets: 4 4 4 4
  * Result data: 1 1 1 1
  *
  * Additional handling for tail is needed if pointer from which data must be copied from span with size 1 is not null.
  */
template<typename IntType>
requires (std::is_same_v<IntType, Int32> || std::is_same_v<IntType, UInt32>)
void replicateSSE42Int32(const IntType * __restrict data, IntType * __restrict result_data, const IColumn::Offsets & offsets)
{
    const IntType * data_copy_begin_ptr = nullptr;
    size_t offsets_size = offsets.size();

    for (size_t offset_index = 0; offset_index < offsets_size; ++offset_index)
    {
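        /// offsets is a PaddedPODArray, so reading offsets[-1] on the first iteration is safe and yields 0.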
        size_t span = offsets[offset_index] - offsets[offset_index - 1];

        if (span == 1)
        {
            if (!data_copy_begin_ptr)
                data_copy_begin_ptr = data + offset_index;

            continue;
        }

        /// Copy data
        if (data_copy_begin_ptr)
        {
            size_t copy_size = (data + offset_index) - data_copy_begin_ptr;
            bool remainder = copy_size % 4;
            size_t sse_copy_counter = (copy_size / 4) + remainder;
            auto * result_data_copy = result_data;

            while (sse_copy_counter)
            {
                __m128i copy_batch = _mm_loadu_si128(reinterpret_cast<const __m128i *>(data_copy_begin_ptr));
                _mm_storeu_si128(reinterpret_cast<__m128i *>(result_data_copy), copy_batch);
                result_data_copy += 4;
                data_copy_begin_ptr += 4;
                --sse_copy_counter;
            }

            result_data += copy_size;
            data_copy_begin_ptr = nullptr;
        }

        if (span == 0)
            continue;

        /// Copy single data element into result data
        bool span_remainder = span % 4;
        size_t copy_counter = (span / 4) + span_remainder;
        auto * result_data_tmp = result_data;
        __m128i copy_element_data = _mm_set1_epi32(data[offset_index]);

        while (copy_counter)
        {
            _mm_storeu_si128(reinterpret_cast<__m128i *>(result_data_tmp), copy_element_data);
            result_data_tmp += 4;
            --copy_counter;
        }

        result_data += span;
    }

    /// Copy tail if needed
    if (data_copy_begin_ptr)
    {
        size_t copy_size = (data + offsets_size) - data_copy_begin_ptr;
        bool remainder = copy_size % 4;
        size_t sse_copy_counter = (copy_size / 4) + remainder;

        while (sse_copy_counter)
        {
            __m128i copy_batch = _mm_loadu_si128(reinterpret_cast<const __m128i *>(data_copy_begin_ptr));
            _mm_storeu_si128(reinterpret_cast<__m128i *>(result_data), copy_batch);
            result_data += 4;
            data_copy_begin_ptr += 4;
            --sse_copy_counter;
        }
    }
}

}
#endif
template <typename T>
ColumnPtr ColumnVector<T>::replicate(const IColumn::Offsets & offsets) const
{
@@ -589,6 +698,14 @@ ColumnPtr ColumnVector<T>::replicate(const IColumn::Offsets & offsets) const
    auto res = this->create(offsets.back());

#ifdef __SSE2__
    if constexpr (std::is_same_v<T, UInt32>)
    {
        replicateSSE42Int32(getData().data(), res->getData().data(), offsets);
        return res;
    }
#endif

    auto it = res->getData().begin(); // NOLINT
    for (size_t i = 0; i < size; ++i)
    {

@@ -68,9 +68,9 @@ void CacheDictionaryUpdateQueue<dictionary_key_type>::waitForCurrentUpdateFinish
    if (update_queue.isFinished())
        throw Exception(ErrorCodes::UNSUPPORTED_METHOD, "CacheDictionaryUpdateQueue finished");

    std::unique_lock<std::mutex> update_lock(update_unit_ptr->update_mutex);

    bool result = update_unit_ptr->is_update_finished.wait_for(
        update_lock,
        std::chrono::milliseconds(configuration.query_wait_timeout_milliseconds),
        [&]
@@ -133,19 +133,23 @@ void CacheDictionaryUpdateQueue<dictionary_key_type>::updateThreadFunction()
            /// Update
            update_func(unit_to_update);

            {
                /// Notify thread about finished updating the bunch of ids
                /// where their own ids were included.
                std::lock_guard lock(unit_to_update->update_mutex);
                unit_to_update->is_done = true;
            }

            unit_to_update->is_update_finished.notify_all();
        }
        catch (...)
        {
            {
                std::lock_guard lock(unit_to_update->update_mutex);
                unit_to_update->current_exception = std::current_exception(); // NOLINT(bugprone-throw-keyword-missing)
            }

            unit_to_update->is_update_finished.notify_all();
        }
    }
}

@@ -74,7 +74,10 @@ private:
    template <DictionaryKeyType>
    friend class CacheDictionaryUpdateQueue;
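    /// Synchronization is per update unit: waiters on one unit do not contend with updates of other units.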
    mutable std::mutex update_mutex;
    mutable std::condition_variable is_update_finished;

    bool is_done{false};
    std::exception_ptr current_exception{nullptr}; /// NOLINT

    /// While UpdateUnit is alive, it is accounted in update_queue size.
@@ -159,9 +162,6 @@
    UpdateQueue update_queue;
    ThreadPool update_pool;
};
extern template class CacheDictionaryUpdateQueue<DictionaryKeyType::Simple>;

@@ -328,7 +328,7 @@ StorageLiveView::StorageLiveView(
    blocks_metadata_ptr = std::make_shared<BlocksMetadataPtr>();
    active_ptr = std::make_shared<bool>(true);

    periodic_refresh_task = getContext()->getSchedulePool().createTask("LiveViewPeriodicRefreshTask", [this]{ periodicRefreshTaskFunc(); });
    periodic_refresh_task->deactivate();
}

@@ -1103,7 +1103,9 @@ bool ReplicatedMergeTreeQueue::isCoveredByFuturePartsImpl(const LogEntry & entry
            continue;

        /// Parts are not disjoint, so new_part_name either contains or covers future_part.
        if (!(future_part.contains(result_part) || result_part.contains(future_part)))
            throw Exception(ErrorCodes::LOGICAL_ERROR, "Got unexpected non-disjoint parts: {} and {}", future_part_elem.first, new_part_name);

        /// We cannot execute `entry` (or upgrade its actual_part_name to `new_part_name`)
        /// while any covered or covering parts are processed.
        /// But we also cannot simply return true and postpone entry processing, because it may lead to kind of livelock.

@@ -600,6 +600,18 @@ void StorageReplicatedMergeTree::createNewZooKeeperNodes()
    std::vector<zkutil::ZooKeeper::FutureCreate> futures;

    /// These 4 nodes used to be created in createNewZookeeperNodes() and they were moved to createTable()
    /// This means that if the first replica creating the table metadata has an older version of CH (22.3 or previous)
    /// there will be a time between its calls to `createTable` and `createNewZookeeperNodes` where the nodes won't exist
    /// and that will cause issues in newer replicas
    /// See https://github.com/ClickHouse/ClickHouse/issues/38600 for example
    futures.push_back(zookeeper->asyncTryCreateNoThrow(zookeeper_path + "/quorum", String(), zkutil::CreateMode::Persistent));
    futures.push_back(zookeeper->asyncTryCreateNoThrow(zookeeper_path + "/quorum/last_part", String(), zkutil::CreateMode::Persistent));
    futures.push_back(zookeeper->asyncTryCreateNoThrow(zookeeper_path + "/quorum/failed_parts", String(), zkutil::CreateMode::Persistent));
    futures.push_back(zookeeper->asyncTryCreateNoThrow(zookeeper_path + "/mutations", String(), zkutil::CreateMode::Persistent));

    futures.push_back(zookeeper->asyncTryCreateNoThrow(zookeeper_path + "/quorum/parallel", String(), zkutil::CreateMode::Persistent));

    /// Nodes for remote fs zero-copy replication
    const auto settings = getSettings();
    if (settings->allow_remote_fs_zero_copy_replication)

@@ -1,4 +1,4 @@
-- Tags: no-backward-compatibility-check
DROP TABLE IF EXISTS partslost_0;
DROP TABLE IF EXISTS partslost_1;