Mirror of https://github.com/ClickHouse/ClickHouse.git (synced 2024-12-12 01:12:12 +00:00)

Commit c7650850fe: Merge remote-tracking branch 'origin/master' into igor/remove_redundant_order_by

@@ -17,6 +17,9 @@

### <a id="2212"></a> ClickHouse release 22.12, 2022-12-15

#### Backward Incompatible Change

* Add `GROUP BY ALL` syntax: [#37631](https://github.com/ClickHouse/ClickHouse/issues/37631). [#42265](https://github.com/ClickHouse/ClickHouse/pull/42265) ([刘陶峰](https://github.com/taofengliu)). If you have a column or an alias named `all` and write `GROUP BY all` without intending to group by all the columns, the query will have different semantics. To keep the old semantics, put `all` into backticks or double quotes (`"all"`) to make it an identifier instead of a keyword; see the sketch below.
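A minimal sketch of the changed semantics, using a hypothetical table `t` with a column actually named `all`:

```sql
CREATE TABLE t (`all` UInt32, v UInt32) ENGINE = Memory;

-- 22.12 semantics: after GROUP BY, bare `all` is a keyword and groups by
-- every non-aggregated expression in the SELECT list.
SELECT `all`, sum(v) FROM t GROUP BY all;

-- Old semantics: backticks make it an identifier, grouping by the column `all`.
SELECT `all`, sum(v) FROM t GROUP BY `all`;
```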

#### Upgrade Notes

* Fixed backward incompatibility in (de)serialization of states of `min`, `max`, `any*`, `argMin`, `argMax` aggregate functions with `String` argument. The incompatibility affects the 22.9, 22.10, and 22.11 branches (fixed since 22.9.6, 22.10.4, and 22.11.2 respectively). Some minor releases of the 22.3, 22.7, and 22.8 branches are also affected: 22.3.13...22.3.14 (fixed since 22.3.15), 22.8.6...22.8.9 (fixed since 22.8.10), 22.7.6 and newer (will not be fixed in 22.7; we recommend upgrading from 22.7.* to 22.8.10 or newer). This release note does not concern users who have never used affected versions. Incompatible versions append an extra `'\0'` to strings when reading states of the aggregate functions mentioned above. For example, if an older version saved the state of `anyState('foobar')` to `state_column`, then an incompatible version will print `'foobar\0'` on `anyMerge(state_column)`. Incompatible versions also write states of these aggregate functions without the trailing `'\0'`. Newer versions (that have the fix) can correctly read data written by all versions, including incompatible ones, except for one corner case: if an incompatible version saved a state with a string that actually ends with a null character, a newer version will trim the trailing `'\0'` when reading the state of the affected aggregate function. For example, if an incompatible version saved the state of `anyState('abrac\0dabra\0')` to `state_column`, then newer versions will print `'abrac\0dabra'` on `anyMerge(state_column)`. The issue also affects distributed queries when an incompatible version works in a cluster together with older or newer versions. [#43038](https://github.com/ClickHouse/ClickHouse/pull/43038) ([Alexander Tokmakov](https://github.com/tavplubix), [Raúl Marín](https://github.com/Algunenano)). Note: all official ClickHouse builds already include the patches; this is not necessarily true for unofficial third-party builds, which should be avoided. A minimal sketch of the affected pattern follows.
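This is illustration only: the extra `'\0'` appears when a state written by one version is read by an incompatible one (for example via a distributed query, or states stored in a table), not within a single up-to-date server:

```sql
-- State written as 'foobar'; an incompatible reader merges it as 'foobar\0'.
SELECT anyMerge(s) FROM (SELECT anyState('foobar') AS s);
```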

@@ -16,6 +16,6 @@ ClickHouse® is an open-source column-oriented database management system that a

* [Contacts](https://clickhouse.com/company/contact) can help to get your questions answered if there are any.

## Upcoming events

-* [**v22.12 Release Webinar**](https://clickhouse.com/company/events/v22-12-release-webinar) 22.12 is the ClickHouse Christmas release. There are plenty of gifts (a new JOIN algorithm among them) and we adopted something from MongoDB. Original creator, co-founder, and CTO of ClickHouse Alexey Milovidov will walk us through the highlights of the release.
+* **Recording available**: [**v22.12 Release Webinar**](https://www.youtube.com/watch?v=sREupr6uc2k) 22.12 is the ClickHouse Christmas release. There are plenty of gifts (a new JOIN algorithm among them) and we adopted something from MongoDB. Original creator, co-founder, and CTO of ClickHouse Alexey Milovidov will walk us through the highlights of the release.
* [**ClickHouse Meetup at the CHEQ office in Tel Aviv**](https://www.meetup.com/clickhouse-tel-aviv-user-group/events/289599423/) - Jan 16 - We are very excited to be holding our next in-person ClickHouse meetup at the CHEQ office in Tel Aviv! Hear from CHEQ, ServiceNow and Contentsquare, as well as a deep dive presentation from ClickHouse CTO Alexey Milovidov. Join us for a fun evening of talks, food and discussion!
* [**ClickHouse Meetup at Microsoft Office in Seattle**](https://www.meetup.com/clickhouse-seattle-user-group/events/290310025/) - Jan 18 - Keep an eye on this space as we will be announcing speakers soon!

@@ -3447,13 +3447,45 @@ Default value: 2.

## compatibility {#compatibility}

-This setting changes other settings according to provided ClickHouse version.
-If a behaviour in ClickHouse was changed by using a different default value for some setting, this compatibility setting allows you to use default values from previous versions for all the settings that were not set by the user.
+The `compatibility` setting causes ClickHouse to use the default settings of a previous version of ClickHouse, where the previous version is provided as the setting.

-This setting takes ClickHouse version number as a string, like `21.3`, `21.8`. Empty value means that this setting is disabled.
+If settings are set to non-default values, then those settings are honored (only settings that have not been modified are affected by the `compatibility` setting).
+
+This setting takes a ClickHouse version number as a string, like `22.3`, `22.8`. An empty value means that this setting is disabled.

Disabled by default.

+:::note
+In ClickHouse Cloud the compatibility setting must be set by ClickHouse Cloud support. Please [open a case](https://clickhouse.cloud/support) to have it set.
+:::
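A minimal usage sketch (the version string here is an arbitrary example):

```sql
-- Use the default values of all settings as they were in ClickHouse 22.3,
-- except for settings the user has changed explicitly.
SET compatibility = '22.3';

-- An empty string disables the setting again.
SET compatibility = '';
```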

## allow_settings_after_format_in_insert {#allow_settings_after_format_in_insert}

Control whether `SETTINGS` after `FORMAT` in `INSERT` queries is allowed. Using it is not recommended, since part of the `SETTINGS` clause may be interpreted as values.

Example:

```sql
INSERT INTO FUNCTION null('foo String') SETTINGS max_threads=1 VALUES ('bar');
```

But the following query will work only with `allow_settings_after_format_in_insert`:

```sql
SET allow_settings_after_format_in_insert=1;
INSERT INTO FUNCTION null('foo String') VALUES ('bar') SETTINGS max_threads=1;
```

Possible values:

- 0 — Disallow.
- 1 — Allow.

Default value: `0`.

!!! note "Warning"
    Use this setting only for backward compatibility if your use cases depend on the old syntax.

# Format settings {#format-settings}

## input_format_skip_unknown_fields {#input_format_skip_unknown_fields}

@@ -7,6 +7,8 @@ namespace DB
{

static const uint8_t BSON_DOCUMENT_END = 0x00;
+static const size_t BSON_OBJECT_ID_SIZE = 12;
+static const size_t BSON_DB_POINTER_SIZE = 12;
using BSONSizeT = uint32_t;
static const BSONSizeT MAX_BSON_SIZE = std::numeric_limits<BSONSizeT>::max();
@@ -685,37 +685,27 @@ public:
        }
        else if constexpr (std::is_same_v<ResultDataType, DataTypeDateTime64>)
        {
-            if (typeid_cast<const DataTypeDateTime64 *>(arguments[0].type.get()))
+            static constexpr auto target_scale = std::invoke(
+                []() -> std::optional<UInt32>
+                {
+                    if constexpr (std::is_base_of_v<AddNanosecondsImpl, Transform>)
+                        return 9;
+                    else if constexpr (std::is_base_of_v<AddMicrosecondsImpl, Transform>)
+                        return 6;
+                    else if constexpr (std::is_base_of_v<AddMillisecondsImpl, Transform>)
+                        return 3;
+
+                    return {};
+                });
+
+            auto timezone = extractTimeZoneNameFromFunctionArguments(arguments, 2, 0);
+            if (const auto * datetime64_type = typeid_cast<const DataTypeDateTime64 *>(arguments[0].type.get()))
            {
-                const auto & datetime64_type = assert_cast<const DataTypeDateTime64 &>(*arguments[0].type);
-
-                auto from_scale = datetime64_type.getScale();
-                auto scale = from_scale;
-
-                if (std::is_same_v<Transform, AddNanosecondsImpl>)
-                    scale = 9;
-                else if (std::is_same_v<Transform, AddMicrosecondsImpl>)
-                    scale = 6;
-                else if (std::is_same_v<Transform, AddMillisecondsImpl>)
-                    scale = 3;
-
-                scale = std::max(scale, from_scale);
-
-                return std::make_shared<DataTypeDateTime64>(scale, extractTimeZoneNameFromFunctionArguments(arguments, 2, 0));
+                const auto from_scale = datetime64_type->getScale();
+                return std::make_shared<DataTypeDateTime64>(std::max(from_scale, target_scale.value_or(from_scale)), std::move(timezone));
            }
-            else
-            {
-                auto scale = DataTypeDateTime64::default_scale;
-
-                if (std::is_same_v<Transform, AddNanosecondsImpl>)
-                    scale = 9;
-                else if (std::is_same_v<Transform, AddMicrosecondsImpl>)
-                    scale = 6;
-                else if (std::is_same_v<Transform, AddMillisecondsImpl>)
-                    scale = 3;
-
-                return std::make_shared<DataTypeDateTime64>(scale, extractTimeZoneNameFromFunctionArguments(arguments, 2, 0));
-            }
+
+            return std::make_shared<DataTypeDateTime64>(target_scale.value_or(DataTypeDateTime64::default_scale), std::move(timezone));
        }

        throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected result type in datetime add interval function");
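The observable effect of the scale selection above can be sketched in SQL: the result scale is the larger of the source `DateTime64` scale and the scale implied by the interval unit (3 for milliseconds, 6 for microseconds, 9 for nanoseconds):

```sql
-- Source scale 1 is below the 3 implied by milliseconds:
-- the result type is DateTime64(3, 'UTC').
SELECT toTypeName(addMilliseconds(toDateTime64('2023-01-01 00:00:00.0', 1, 'UTC'), 1));

-- Source scale 6 already exceeds 3: the source scale is kept, DateTime64(6, 'UTC').
SELECT toTypeName(addMilliseconds(toDateTime64('2023-01-01 00:00:00.000000', 6, 'UTC'), 1));
```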

@@ -18,6 +18,7 @@
#include <Columns/ColumnMap.h>

#include <DataTypes/DataTypeString.h>
+#include <DataTypes/DataTypeFixedString.h>
#include <DataTypes/DataTypeUUID.h>
#include <DataTypes/DataTypeDateTime64.h>
#include <DataTypes/DataTypeLowCardinality.h>
@@ -282,7 +283,7 @@ static void readAndInsertString(ReadBuffer & in, IColumn & column, BSONType bson
    }
    else if (bson_type == BSONType::OBJECT_ID)
    {
-        readAndInsertStringImpl<is_fixed_string>(in, column, 12);
+        readAndInsertStringImpl<is_fixed_string>(in, column, BSON_OBJECT_ID_SIZE);
    }
    else
    {
@@ -664,7 +665,7 @@ static void skipBSONField(ReadBuffer & in, BSONType type)
        }
        case BSONType::OBJECT_ID:
        {
-            in.ignore(12);
+            in.ignore(BSON_OBJECT_ID_SIZE);
            break;
        }
        case BSONType::REGEXP:
@@ -677,7 +678,7 @@ static void skipBSONField(ReadBuffer & in, BSONType type)
        {
            BSONSizeT size;
            readBinary(size, in);
-            in.ignore(size + 12);
+            in.ignore(size + BSON_DB_POINTER_SIZE);
            break;
        }
        case BSONType::JAVA_SCRIPT_CODE_W_SCOPE:
@@ -796,7 +797,6 @@ DataTypePtr BSONEachRowSchemaReader::getDataTypeFromBSONField(BSONType type, boo
        }
        case BSONType::SYMBOL: [[fallthrough]];
        case BSONType::JAVA_SCRIPT_CODE: [[fallthrough]];
-        case BSONType::OBJECT_ID: [[fallthrough]];
        case BSONType::STRING:
        {
            BSONSizeT size;
@@ -804,6 +804,11 @@ DataTypePtr BSONEachRowSchemaReader::getDataTypeFromBSONField(BSONType type, boo
            in.ignore(size);
            return std::make_shared<DataTypeString>();
        }
+        case BSONType::OBJECT_ID:
+        {
+            in.ignore(BSON_OBJECT_ID_SIZE);
+            return makeNullable(std::make_shared<DataTypeFixedString>(BSON_OBJECT_ID_SIZE));
+        }
        case BSONType::DOCUMENT:
        {
            auto nested_names_and_types = getDataTypesFromBSONDocument(false);
@@ -954,6 +959,7 @@ void registerInputFormatBSONEachRow(FormatFactory & factory)
        "BSONEachRow",
        [](ReadBuffer & buf, const Block & sample, IRowInputFormat::Params params, const FormatSettings & settings)
        { return std::make_shared<BSONEachRowRowInputFormat>(buf, sample, std::move(params), settings); });
+    factory.registerFileExtension("bson", "BSONEachRow");
}

void registerFileSegmentationEngineBSONEachRow(FormatFactory & factory)
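With the `bson` extension registered above, format detection by file extension applies; a small sketch with a hypothetical file name:

```sql
-- The .bson extension now implies the BSONEachRow format, so this...
SELECT * FROM file('data.bson');
-- ...is shorthand for:
SELECT * FROM file('data.bson', 'BSONEachRow');
```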

@@ -456,6 +456,7 @@ class SettingsRandomizer:
        "merge_tree_coarse_index_granularity": lambda: random.randint(2, 32),
        "optimize_distinct_in_order": lambda: random.randint(0, 1),
        "optimize_sorting_by_input_stream_properties": lambda: random.randint(0, 1),
+        "enable_memory_bound_merging_of_aggregation_results": lambda: random.randint(0, 1),
    }

    @staticmethod

@@ -33,6 +33,9 @@ instance = cluster.add_instance(
    ],
    user_configs=["configs/default_passwd.xml"],
    with_zookeeper=True,
+    # A bug in TSAN reproduces in this test: https://github.com/grpc/grpc/issues/29550#issuecomment-1188085387
+    # second_deadlock_stack is just an ordinary option we use everywhere; we don't want to overwrite it.
+    env_variables={"TSAN_OPTIONS": "report_atomic_races=0 second_deadlock_stack=1"},
)
@@ -1,4 +1,5 @@
#!/usr/bin/env bash
+# Tags: no-random-settings, no-parallel

CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
# shellcheck source=../shell_config.sh

@@ -18,7 +19,7 @@ INSERT INTO mt VALUES ('test1', 'test2');
EOF

while true; do
-    $CLICKHOUSE_CLIENT --query="SELECT count(*) FROM dst" | grep -q "1" && break || sleep .5 ||:
+    $CLICKHOUSE_CLIENT --query="SELECT count(*) FROM dst" | grep -q "1" && break || sleep .1 ||:
done

$CLICKHOUSE_CLIENT --query="SELECT colA, colB FROM dst"
@@ -1,4 +1,4 @@
-- Tags: no-replicated-database

-SET max_memory_usage = '100M';
+SET max_memory_usage = '75M';
SELECT cityHash64(rand() % 1000) as n, groupBitmapState(number) FROM numbers_mt(2000000000) GROUP BY n FORMAT Null; -- { serverError 241 }
@@ -60,3 +60,19 @@ test add[...]seconds()
2220-12-12 12:12:12.124
2220-12-12 12:12:12.121
2220-12-12 12:12:12.124456
+test subtract[...]seconds()
+- test nanoseconds
+2022-12-31 23:59:59.999999999
+2022-12-31 23:59:59.999999900
+2023-01-01 00:00:00.000000001
+2023-01-01 00:00:00.000000100
+- test microseconds
+2022-12-31 23:59:59.999999
+2022-12-31 23:59:59.999900
+2023-01-01 00:00:00.000001
+2023-01-01 00:00:00.000100
+- test milliseconds
+2022-12-31 23:59:59.999
+2022-12-31 23:59:59.900
+2023-01-01 00:00:00.001
+2023-01-01 00:00:00.100
@@ -92,3 +92,22 @@ select addMilliseconds(toDateTime64('1930-12-12 12:12:12.123456', 6), 1); -- Bel
select addMilliseconds(toDateTime64('2220-12-12 12:12:12.123', 3), 1); -- Above normal range, source scale matches result
select addMilliseconds(toDateTime64('2220-12-12 12:12:12.12', 2), 1); -- Above normal range, source scale less than result
select addMilliseconds(toDateTime64('2220-12-12 12:12:12.123456', 6), 1); -- Above normal range, source scale greater than result
+
+select 'test subtract[...]seconds()';
+select '- test nanoseconds';
+select subtractNanoseconds(toDateTime64('2023-01-01 00:00:00.0000000', 7, 'UTC'), 1);
+select subtractNanoseconds(toDateTime64('2023-01-01 00:00:00.0000000', 7, 'UTC'), 100);
+select subtractNanoseconds(toDateTime64('2023-01-01 00:00:00.0000000', 7, 'UTC'), -1);
+select subtractNanoseconds(toDateTime64('2023-01-01 00:00:00.0000000', 7, 'UTC'), -100);
+
+select '- test microseconds';
+select subtractMicroseconds(toDateTime64('2023-01-01 00:00:00.0000', 4, 'UTC'), 1);
+select subtractMicroseconds(toDateTime64('2023-01-01 00:00:00.0000', 4, 'UTC'), 100);
+select subtractMicroseconds(toDateTime64('2023-01-01 00:00:00.0000', 4, 'UTC'), -1);
+select subtractMicroseconds(toDateTime64('2023-01-01 00:00:00.0000', 4, 'UTC'), -100);
+
+select '- test milliseconds';
+select subtractMilliseconds(toDateTime64('2023-01-01 00:00:00.0', 1, 'UTC'), 1);
+select subtractMilliseconds(toDateTime64('2023-01-01 00:00:00.0', 1, 'UTC'), 100);
+select subtractMilliseconds(toDateTime64('2023-01-01 00:00:00.0', 1, 'UTC'), -1);
+select subtractMilliseconds(toDateTime64('2023-01-01 00:00:00.0', 1, 'UTC'), -100);
@@ -1,3 +1,6 @@
+-- produces a different pipeline if enabled
+set enable_memory_bound_merging_of_aggregation_results = 0;
+
set max_threads = 16;
set prefer_localhost_replica = 1;
set optimize_aggregation_in_order = 0;
@@ -0,0 +1,6 @@
+_id Nullable(FixedString(12))
+name Nullable(String)
+email Nullable(String)
+movie_id Nullable(FixedString(12))
+text Nullable(String)
+date Nullable(DateTime64(6, \'UTC\'))

tests/queries/0_stateless/02500_bson_read_object_id.sh (new executable file, 10 lines)
@@ -0,0 +1,10 @@
+#!/usr/bin/env bash
+# Tags: no-fasttest
+
+CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
+# shellcheck source=../shell_config.sh
+. "$CURDIR"/../shell_config.sh
+
+$CLICKHOUSE_LOCAL -q "desc file('$CURDIR/data_bson/comments.bson')"
+$CLICKHOUSE_LOCAL -q "select _id from file('$CURDIR/data_bson/comments.bson') format Null"
+

tests/queries/0_stateless/data_bson/comments.bson (new binary file, not shown)