Merge branch 'master' into vzakaznikov-liveview

Commit cdf1ce3171 by Nikolai Kochetov, 2019-08-20 10:41:31 +03:00
111 changed files with 1684 additions and 573 deletions

.github/label-pr.yml (new file)

@ -0,0 +1,2 @@
- regExp: ".*\\.md$"
labels: ["documentation", "pr-documentation"]

.github/main.workflow (new file)

@ -0,0 +1,9 @@
workflow "Main workflow" {
resolves = ["Label PR"]
on = "pull_request"
}
action "Label PR" {
uses = "decathlon/pull-request-labeler-action@v1.0.0"
secrets = ["GITHUB_TOKEN"]
}


@ -1,3 +1,54 @@
## ClickHouse release 19.13.2.19, 2019-08-14
### New Feature
* Sampling profiler on query level. [Example](https://gist.github.com/alexey-milovidov/92758583dd41c24c360fdb8d6a4da194). [#4247](https://github.com/yandex/ClickHouse/issues/4247) ([laplab](https://github.com/laplab)) [#6124](https://github.com/yandex/ClickHouse/pull/6124) ([alexey-milovidov](https://github.com/alexey-milovidov)) [#6250](https://github.com/yandex/ClickHouse/pull/6250) [#6283](https://github.com/yandex/ClickHouse/pull/6283) [#6386](https://github.com/yandex/ClickHouse/pull/6386)
* Allow specifying a list of columns with the `COLUMNS('regexp')` expression, which works like a more sophisticated variant of the `*` asterisk (see the example after this list). [#5951](https://github.com/yandex/ClickHouse/pull/5951) ([mfridental](https://github.com/mfridental)), ([alexey-milovidov](https://github.com/alexey-milovidov))
* `CREATE TABLE AS table_function()` is now possible. [#6057](https://github.com/yandex/ClickHouse/pull/6057) ([dimarub2000](https://github.com/dimarub2000))
* The Adam optimizer for stochastic gradient descent is used by default in the `stochasticLinearRegression()` and `stochasticLogisticRegression()` aggregate functions, because it shows good quality with almost no tuning. [#6000](https://github.com/yandex/ClickHouse/pull/6000) ([Quid37](https://github.com/Quid37))
* Added functions for working with custom week numbers. [#5212](https://github.com/yandex/ClickHouse/pull/5212) ([Andy Yang](https://github.com/andyyzh))
* `RENAME` queries now work with all storages. [#5953](https://github.com/yandex/ClickHouse/pull/5953) ([Ivan](https://github.com/abyss7))
* The client now receives logs from the server at any desired level by setting `send_logs_level`, regardless of the log level specified in the server settings. [#5964](https://github.com/yandex/ClickHouse/pull/5964) ([Nikita Mikhaylov](https://github.com/nikitamikhaylov))
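For illustration, a minimal sketch of the `COLUMNS('regexp')` expression mentioned above (hypothetical table and column names, not from this commit):

CREATE TABLE t (id UInt64, metric_a Float64, metric_b Float64) ENGINE = Memory;
SELECT COLUMNS('^metric_') FROM t;  -- expands to the columns metric_a, metric_b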
### Experimental features
* New query processing pipeline. Use the `experimental_use_processors=1` setting to enable it. Use it at your own risk. [#4914](https://github.com/yandex/ClickHouse/pull/4914) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
### Bug Fix
* Kafka integration has been fixed in this version.
* Fixed `DoubleDelta` encoding of `Int64` for large `DoubleDelta` values, and improved `DoubleDelta` encoding of random data for `Int32` (codec usage is sketched after this list). [#5998](https://github.com/yandex/ClickHouse/pull/5998) ([Vasily Nemkov](https://github.com/Enmk))
* Fixed overestimation of `max_rows_to_read` if the setting `merge_tree_uniform_read_distribution` is set to 0. [#6019](https://github.com/yandex/ClickHouse/pull/6019) ([alexey-milovidov](https://github.com/alexey-milovidov))
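For context, `DoubleDelta` is applied as a per-column codec; a minimal sketch of a table that exercises this encoding (illustrative names, assuming the codec syntax of this release):

CREATE TABLE metrics
(
    ts DateTime CODEC(DoubleDelta),
    value Int64 CODEC(DoubleDelta)
) ENGINE = MergeTree() ORDER BY ts;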
### Improvement
* The setting `input_format_defaults_for_omitted_fields` is enabled by default. It enables calculation of complex default expressions for omitted fields in the `JSONEachRow` and `CSV*` formats (see the sketch after this list). This is the expected behaviour, but it may lead to a negligible performance difference or subtle incompatibilities. [#6043](https://github.com/yandex/ClickHouse/pull/6043) ([Artem Zuikov](https://github.com/4ertus2)), [#5625](https://github.com/yandex/ClickHouse/pull/5625) ([akuzm](https://github.com/akuzm))
* Throw an exception if a `config.d` file doesn't have the same root element as the config file. [#6123](https://github.com/yandex/ClickHouse/pull/6123) ([dimarub2000](https://github.com/dimarub2000))
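A minimal sketch of the behaviour enabled by `input_format_defaults_for_omitted_fields` (hypothetical table): the omitted column is computed from its default expression instead of being zero-filled.

CREATE TABLE events (a UInt32, b UInt32 DEFAULT a * 2) ENGINE = Memory;
INSERT INTO events FORMAT JSONEachRow {"a": 3}
SELECT * FROM events;  -- returns a = 3, b = 6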
### Performance Improvement
* Optimize `count()`. Now it uses the smallest column (if possible). [#6028](https://github.com/yandex/ClickHouse/pull/6028) ([Amos Bird](https://github.com/amosbird))
### Build/Testing/Packaging Improvement
* Report memory usage in performance tests. [#5899](https://github.com/yandex/ClickHouse/pull/5899) ([akuzm](https://github.com/akuzm))
* Fix build with external `libcxx` [#6010](https://github.com/yandex/ClickHouse/pull/6010) ([Ivan](https://github.com/abyss7))
* Fix shared build with `rdkafka` library [#6101](https://github.com/yandex/ClickHouse/pull/6101) ([Ivan](https://github.com/abyss7))
## ClickHouse release 19.11.7.40, 2019-08-14
### Bug Fix
* Kafka integration has been fixed in this version.
* Fix segfault when using `arrayReduce` for constant arguments. [#6326](https://github.com/yandex/ClickHouse/pull/6326) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed `toFloat()` monotonicity. [#6374](https://github.com/yandex/ClickHouse/pull/6374) ([dimarub2000](https://github.com/dimarub2000))
* Fix segfault with enabled `optimize_skip_unused_shards` and missing sharding key. [#6384](https://github.com/yandex/ClickHouse/pull/6384) ([CurtizJ](https://github.com/CurtizJ))
* Fixed logic of `arrayEnumerateUniqRanked` function. [#6423](https://github.com/yandex/ClickHouse/pull/6423) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Removed extra verbose logging from MySQL handler. [#6389](https://github.com/yandex/ClickHouse/pull/6389) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix wrong behavior and possible segfaults in `topK` and `topKWeighted` aggregate functions. [#6404](https://github.com/yandex/ClickHouse/pull/6404) ([CurtizJ](https://github.com/CurtizJ))
* Do not expose virtual columns in `system.columns` table. This is required for backward compatibility. [#6406](https://github.com/yandex/ClickHouse/pull/6406) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fix bug with memory allocation for string fields in complex key cache dictionary. [#6447](https://github.com/yandex/ClickHouse/pull/6447) ([alesapin](https://github.com/alesapin))
* Fix bug with enabling adaptive granularity when creating new replica for `Replicated*MergeTree` table. [#6452](https://github.com/yandex/ClickHouse/pull/6452) ([alesapin](https://github.com/alesapin))
* Fix infinite loop when reading Kafka messages. [#6354](https://github.com/yandex/ClickHouse/pull/6354) ([abyss7](https://github.com/abyss7))
* Fixed the possibility of a fabricated query causing a server crash due to stack overflow in the SQL parser, and the possibility of stack overflow in `Merge` and `Distributed` tables. [#6433](https://github.com/yandex/ClickHouse/pull/6433) ([alexey-milovidov](https://github.com/alexey-milovidov))
* Fixed Gorilla encoding error on small sequences. [#6444](https://github.com/yandex/ClickHouse/pull/6444) ([Enmk](https://github.com/Enmk))
### Improvement
* Allow the user to override the `poll_interval` and `idle_connection_timeout` settings on connection. [#6230](https://github.com/yandex/ClickHouse/pull/6230) ([alexey-milovidov](https://github.com/alexey-milovidov))
## ClickHouse release 19.11.5.28, 2019-08-05
### Bug Fix
@ -299,7 +350,7 @@ It allows to set commit mode: after every batch of messages is handled, or after
* Renamed functions `leastSqr` to `simpleLinearRegression`, `LinearRegression` to `linearRegression`, `LogisticRegression` to `logisticRegression`. [#5391](https://github.com/yandex/ClickHouse/pull/5391) ([Nikolai Kochetov](https://github.com/KochetovNicolai))
### Performance Improvements
* Parallelize processing of parts in `ALTER MODIFY` queries. [#4639](https://github.com/yandex/ClickHouse/pull/4639) ([Ivan Kush](https://github.com/IvanKush))
* Parallelize processing of parts of non-replicated MergeTree tables in `ALTER MODIFY` queries. [#4639](https://github.com/yandex/ClickHouse/pull/4639) ([Ivan Kush](https://github.com/IvanKush))
* Optimizations in regular expression extraction. [#5193](https://github.com/yandex/ClickHouse/pull/5193) [#5191](https://github.com/yandex/ClickHouse/pull/5191) ([Danila Kutenin](https://github.com/danlark1))
* Do not add the right join key column to the join result if it's used only in the `JOIN ON` section. [#5260](https://github.com/yandex/ClickHouse/pull/5260) ([Artem Zuikov](https://github.com/4ertus2))
* Freeze the Kafka buffer after the first empty response. This avoids multiple invocations of `ReadBuffer::next()` for an empty result in some row-parsing streams. [#5283](https://github.com/yandex/ClickHouse/pull/5283) ([Ivan](https://github.com/abyss7))


@ -1,8 +1,8 @@
option(USE_INTERNAL_CPUINFO_LIBRARY "Set to FALSE to use system cpuinfo library instead of bundled" ${NOT_UNBUNDLED})
# Now we have no contrib/libcpuinfo, use from system.
if (USE_INTERNAL_CPUINFO_LIBRARY AND NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libcpuinfo/include")
#message (WARNING "submodule contrib/libcpuid is missing. to fix try run: \n git submodule update --init --recursive")
if (NOT EXISTS "${ClickHouse_SOURCE_DIR}/contrib/libcpuinfo/include")
#message (WARNING "submodule contrib/libcpuinfo is missing. to fix try run: \n git submodule update --init --recursive")
set (USE_INTERNAL_CPUINFO_LIBRARY 0)
set (MISSING_INTERNAL_CPUINFO_LIBRARY 1)
endif ()
@ -12,7 +12,7 @@ if(NOT USE_INTERNAL_CPUINFO_LIBRARY)
find_path(CPUINFO_INCLUDE_DIR NAMES cpuinfo.h PATHS ${CPUINFO_INCLUDE_PATHS})
endif()
if(CPUID_LIBRARY AND CPUID_INCLUDE_DIR)
if(CPUINFO_LIBRARY AND CPUINFO_INCLUDE_DIR)
set(USE_CPUINFO 1)
elseif(NOT MISSING_INTERNAL_CPUINFO_LIBRARY)
set(CPUINFO_INCLUDE_DIR ${ClickHouse_SOURCE_DIR}/contrib/libcpuinfo/include)

@ -1 +1 @@
Subproject commit d85d0e98999cd9e28ceb66645999b4a9ce85370e
Subproject commit c6503d3acc85ca1a7f5e7e38b605d7c9410aac1e


@ -31,6 +31,7 @@ ${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/ma_stmt_codec.c
${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/ma_string.c
${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/ma_time.c
${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/ma_tls.c
${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/secure/openssl_crypt.c
#${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/secure/gnutls.c
#${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/secure/ma_schannel.c
#${MARIADB_CLIENT_SOURCE_DIR}/libmariadb/secure/schannel.c
@ -42,6 +43,7 @@ ${MARIADB_CLIENT_SOURCE_DIR}/plugins/auth/mariadb_cleartext.c
${MARIADB_CLIENT_SOURCE_DIR}/plugins/auth/my_auth.c
${MARIADB_CLIENT_SOURCE_DIR}/plugins/auth/old_password.c
${MARIADB_CLIENT_SOURCE_DIR}/plugins/auth/sha256_pw.c
${MARIADB_CLIENT_SOURCE_DIR}/plugins/auth/caching_sha2_pw.c
#${MARIADB_CLIENT_SOURCE_DIR}/plugins/auth/sspi_client.c
#${MARIADB_CLIENT_SOURCE_DIR}/plugins/auth/sspi_errmsg.c
${MARIADB_CLIENT_SOURCE_DIR}/plugins/connection/aurora.c


@ -76,17 +76,20 @@ struct st_client_plugin_int *plugin_list[MYSQL_CLIENT_MAX_PLUGINS + MARIADB_CLIE
static pthread_mutex_t LOCK_load_client_plugin;
#endif
extern struct st_mysql_client_plugin mysql_native_password_client_plugin;
extern struct st_mysql_client_plugin mysql_old_password_client_plugin;
extern struct st_mysql_client_plugin pvio_socket_client_plugin;
extern struct st_mysql_client_plugin mysql_native_password_client_plugin;
extern struct st_mysql_client_plugin mysql_old_password_client_plugin;
extern struct st_mysql_client_plugin pvio_socket_client_plugin;
extern struct st_mysql_client_plugin sha256_password_client_plugin;
extern struct st_mysql_client_plugin caching_sha2_password_client_plugin;
struct st_mysql_client_plugin *mysql_client_builtins[]=
{
(struct st_mysql_client_plugin *)&mysql_native_password_client_plugin,
(struct st_mysql_client_plugin *)&mysql_old_password_client_plugin,
(struct st_mysql_client_plugin *)&pvio_socket_client_plugin,
(struct st_mysql_client_plugin *)&mysql_native_password_client_plugin,
(struct st_mysql_client_plugin *)&mysql_old_password_client_plugin,
(struct st_mysql_client_plugin *)&pvio_socket_client_plugin,
(struct st_mysql_client_plugin *)&sha256_password_client_plugin,
(struct st_mysql_client_plugin *)&caching_sha2_password_client_plugin,
0
};

contrib/simdjson (vendored submodule)

@ -1 +1 @@
Subproject commit 9dfab9d9a4c111690a101ea0a7506a2b2f3fa414
Subproject commit e9be643db5cf1c29a69bc80ee72d220124a9c50e


@ -1,11 +1,11 @@
# This strings autochanged from release_lib.sh:
set(VERSION_REVISION 54425)
set(VERSION_MAJOR 19)
set(VERSION_MINOR 13)
set(VERSION_MINOR 14)
set(VERSION_PATCH 1)
set(VERSION_GITHASH adfc36917222bdb03eba069f0cad0f4f5b8f1c94)
set(VERSION_DESCRIBE v19.13.1.1-prestable)
set(VERSION_STRING 19.13.1.1)
set(VERSION_DESCRIBE v19.14.1.1-prestable)
set(VERSION_STRING 19.14.1.1)
# end of autochange
set(VERSION_EXTRA "" CACHE STRING "")


@ -5,7 +5,6 @@ set(CLICKHOUSE_ODBC_BRIDGE_SOURCES
${CMAKE_CURRENT_SOURCE_DIR}/IdentifierQuoteHandler.cpp
${CMAKE_CURRENT_SOURCE_DIR}/MainHandler.cpp
${CMAKE_CURRENT_SOURCE_DIR}/ODBCBlockInputStream.cpp
${CMAKE_CURRENT_SOURCE_DIR}/odbc-bridge.cpp
${CMAKE_CURRENT_SOURCE_DIR}/ODBCBridge.cpp
${CMAKE_CURRENT_SOURCE_DIR}/PingHandler.cpp
${CMAKE_CURRENT_SOURCE_DIR}/validateODBCConnectionString.cpp


@ -2,7 +2,6 @@
#include <limits>
#include <ext/scope_guard.h>
#include <openssl/rsa.h>
#include <Columns/ColumnVector.h>
#include <Common/config_version.h>
#include <Common/NetException.h>
@ -45,6 +44,7 @@ MySQLHandler::MySQLHandler(IServer & server_, const Poco::Net::StreamSocket & so
, connection_id(connection_id_)
, public_key(public_key_)
, private_key(private_key_)
, auth_plugin(new Authentication::Native41())
{
server_capability_flags = CLIENT_PROTOCOL_41 | CLIENT_SECURE_CONNECTION | CLIENT_PLUGIN_AUTH | CLIENT_PLUGIN_AUTH_LENENC_CLIENT_DATA | CLIENT_CONNECT_WITH_DB | CLIENT_DEPRECATE_EOF;
if (ssl_enabled)
@ -62,9 +62,7 @@ void MySQLHandler::run()
try
{
String scramble = generateScramble();
Handshake handshake(server_capability_flags, connection_id, VERSION_STRING + String("-") + VERSION_NAME, Authentication::Native, scramble + '\0');
Handshake handshake(server_capability_flags, connection_id, VERSION_STRING + String("-") + VERSION_NAME, auth_plugin->getName(), auth_plugin->getAuthPluginData());
packet_sender->sendPacket<Handshake>(handshake, true);
LOG_TRACE(log, "Sent handshake");
@ -96,10 +94,21 @@ void MySQLHandler::run()
client_capability_flags = handshake_response.capability_flags;
if (!(client_capability_flags & CLIENT_PROTOCOL_41))
throw Exception("Required capability: CLIENT_PROTOCOL_41.", ErrorCodes::MYSQL_CLIENT_INSUFFICIENT_CAPABILITIES);
if (!(client_capability_flags & CLIENT_PLUGIN_AUTH))
throw Exception("Required capability: CLIENT_PLUGIN_AUTH.", ErrorCodes::MYSQL_CLIENT_INSUFFICIENT_CAPABILITIES);
authenticate(handshake_response, scramble);
authenticate(handshake_response.username, handshake_response.auth_plugin_name, handshake_response.auth_response);
try
{
if (!handshake_response.database.empty())
connection_context.setCurrentDatabase(handshake_response.database);
connection_context.setCurrentQueryId("");
}
catch (const Exception & exc)
{
log->log(exc);
packet_sender->sendPacket(ERR_Packet(exc.code(), "00000", exc.message()), true);
}
OK_Packet ok_packet(0, handshake_response.capability_flags, 0, 0, 0);
packet_sender->sendPacket(ok_packet, true);
@ -216,121 +225,24 @@ void MySQLHandler::finishHandshake(MySQLProtocol::HandshakeResponse & packet)
}
}
String MySQLHandler::generateScramble()
void MySQLHandler::authenticate(const String & user_name, const String & auth_plugin_name, const String & initial_auth_response)
{
String scramble(MySQLProtocol::SCRAMBLE_LENGTH, 0);
Poco::RandomInputStream generator;
for (size_t i = 0; i < scramble.size(); i++)
{
generator >> scramble[i];
}
return scramble;
}
// For compatibility with JavaScript MySQL client, Native41 authentication plugin is used when possible (if password is specified using double SHA1). Otherwise SHA256 plugin is used.
auto user = connection_context.getUser(user_name);
if (user->password_double_sha1_hex.empty())
auth_plugin = std::make_unique<Authentication::Sha256Password>(public_key, private_key, log);
void MySQLHandler::authenticate(const HandshakeResponse & handshake_response, const String & scramble)
{
String auth_response;
AuthSwitchResponse response;
if (handshake_response.auth_plugin_name != Authentication::SHA256)
{
/** Native authentication sent 20 bytes + '\0' character = 21 bytes.
* This plugin must do the same to stay consistent with historical behavior if it is set to operate as a default plugin.
* https://github.com/mysql/mysql-server/blob/8.0/sql/auth/sql_authentication.cc#L3994
*/
packet_sender->sendPacket(AuthSwitchRequest(Authentication::SHA256, scramble + '\0'), true);
if (in->eof())
throw Exception(
"Client doesn't support authentication method " + String(Authentication::SHA256) + " used by ClickHouse",
ErrorCodes::MYSQL_CLIENT_INSUFFICIENT_CAPABILITIES);
packet_sender->receivePacket(response);
auth_response = response.value;
LOG_TRACE(log, "Authentication method mismatch.");
}
else
{
auth_response = handshake_response.auth_response;
LOG_TRACE(log, "Authentication method match.");
}
if (auth_response == "\1")
{
LOG_TRACE(log, "Client requests public key.");
BIO * mem = BIO_new(BIO_s_mem());
SCOPE_EXIT(BIO_free(mem));
if (PEM_write_bio_RSA_PUBKEY(mem, &public_key) != 1)
{
throw Exception("Failed to write public key to memory. Error: " + getOpenSSLErrors(), ErrorCodes::OPENSSL_ERROR);
}
char * pem_buf = nullptr;
long pem_size = BIO_get_mem_data(mem, &pem_buf);
String pem(pem_buf, pem_size);
LOG_TRACE(log, "Key: " << pem);
AuthMoreData data(pem);
packet_sender->sendPacket(data, true);
packet_sender->receivePacket(response);
auth_response = response.value;
}
else
{
LOG_TRACE(log, "Client didn't request public key.");
}
String password;
/** Decrypt password, if it's not empty.
* The original intention was that the password is a string[NUL] but this never got enforced properly so now we have to accept that
* an empty packet is a blank password, thus the check for auth_response.empty() has to be made too.
* https://github.com/mysql/mysql-server/blob/8.0/sql/auth/sql_authentication.cc#L4017
*/
if (!secure_connection && !auth_response.empty() && auth_response != String("\0", 1))
{
LOG_TRACE(log, "Received nonempty password");
auto ciphertext = reinterpret_cast<unsigned char *>(auth_response.data());
unsigned char plaintext[RSA_size(&private_key)];
int plaintext_size = RSA_private_decrypt(auth_response.size(), ciphertext, plaintext, &private_key, RSA_PKCS1_OAEP_PADDING);
if (plaintext_size == -1)
{
throw Exception("Failed to decrypt auth data. Error: " + getOpenSSLErrors(), ErrorCodes::OPENSSL_ERROR);
}
password.resize(plaintext_size);
for (int i = 0; i < plaintext_size; ++i)
{
password[i] = plaintext[i] ^ static_cast<unsigned char>(scramble[i % scramble.size()]);
}
}
else if (secure_connection)
{
password = auth_response;
}
else
{
LOG_TRACE(log, "Received empty password");
}
if (!password.empty() && password.back() == 0)
{
password.pop_back();
}
try
{
connection_context.setUser(handshake_response.username, password, socket().address(), "");
if (!handshake_response.database.empty()) connection_context.setCurrentDatabase(handshake_response.database);
connection_context.setCurrentQueryId("");
LOG_INFO(log, "Authentication for user " << handshake_response.username << " succeeded.");
try {
std::optional<String> auth_response = auth_plugin_name == auth_plugin->getName() ? std::make_optional<String>(initial_auth_response) : std::nullopt;
auth_plugin->authenticate(user_name, auth_response, connection_context, packet_sender, secure_connection, socket().address());
}
catch (const Exception & exc)
{
LOG_ERROR(log, "Authentication for user " << handshake_response.username << " failed.");
LOG_ERROR(log, "Authentication for user " << user_name << " failed.");
packet_sender->sendPacket(ERR_Packet(exc.code(), "00000", exc.message()), true);
throw;
}
LOG_INFO(log, "Authentication for user " << user_name << " succeeded.");
}
void MySQLHandler::comInitDB(ReadBuffer & payload)


@ -30,9 +30,7 @@ private:
void comInitDB(ReadBuffer & payload);
static String generateScramble();
void authenticate(const MySQLProtocol::HandshakeResponse &, const String & scramble);
void authenticate(const String & user_name, const String & auth_plugin_name, const String & auth_response);
IServer & server;
Poco::Logger * log;
@ -48,6 +46,8 @@ private:
RSA & public_key;
RSA & private_key;
std::unique_ptr<MySQLProtocol::Authentication::IPlugin> auth_plugin;
std::shared_ptr<Poco::Net::SecureStreamSocket> ss;
std::shared_ptr<ReadBuffer> in;
std::shared_ptr<WriteBuffer> out;


@ -39,10 +39,18 @@
If you want to specify SHA256, place it in the 'password_sha256_hex' element.
Example: <password_sha256_hex>65e84be33532fb784c48129675f9eff3a682b27168c0ea744b2cf58ee02337c5</password_sha256_hex>
Restriction of SHA256: it is impossible to connect to ClickHouse using the MySQL JS client (as of July 2019).
If you want to specify double SHA1, place it in the 'password_double_sha1_hex' element.
Example: <password_double_sha1_hex>e395796d6546b1b65db9d665cd43f0e858dd4303</password_double_sha1_hex>
How to generate a decent password:
Execute: PASSWORD=$(base64 < /dev/urandom | head -c8); echo "$PASSWORD"; echo -n "$PASSWORD" | sha256sum | tr -d '-'
The first line will be the password and the second the corresponding SHA256.
How to generate double SHA1:
Execute: PASSWORD=$(base64 < /dev/urandom | head -c8); echo "$PASSWORD"; echo -n "$PASSWORD" | openssl dgst -sha1 -binary | openssl dgst -sha1
The first line will be the password and the second the corresponding double SHA1.
-->
<password></password>


@ -24,6 +24,12 @@ template <typename Value, bool FloatReturn> using FuncQuantilesDeterministic = A
template <typename Value, bool _> using FuncQuantileExact = AggregateFunctionQuantile<Value, QuantileExact<Value>, NameQuantileExact, false, void, false>;
template <typename Value, bool _> using FuncQuantilesExact = AggregateFunctionQuantile<Value, QuantileExact<Value>, NameQuantilesExact, false, void, true>;
template <typename Value, bool _> using FuncQuantileExactExclusive = AggregateFunctionQuantile<Value, QuantileExactExclusive<Value>, NameQuantileExactExclusive, false, Float64, false>;
template <typename Value, bool _> using FuncQuantilesExactExclusive = AggregateFunctionQuantile<Value, QuantileExactExclusive<Value>, NameQuantilesExactExclusive, false, Float64, true>;
template <typename Value, bool _> using FuncQuantileExactInclusive = AggregateFunctionQuantile<Value, QuantileExactInclusive<Value>, NameQuantileExactInclusive, false, Float64, false>;
template <typename Value, bool _> using FuncQuantilesExactInclusive = AggregateFunctionQuantile<Value, QuantileExactInclusive<Value>, NameQuantilesExactInclusive, false, Float64, true>;
template <typename Value, bool _> using FuncQuantileExactWeighted = AggregateFunctionQuantile<Value, QuantileExactWeighted<Value>, NameQuantileExactWeighted, true, void, false>;
template <typename Value, bool _> using FuncQuantilesExactWeighted = AggregateFunctionQuantile<Value, QuantileExactWeighted<Value>, NameQuantilesExactWeighted, true, void, true>;
@ -92,6 +98,12 @@ void registerAggregateFunctionsQuantile(AggregateFunctionFactory & factory)
factory.registerFunction(NameQuantileExact::name, createAggregateFunctionQuantile<FuncQuantileExact>);
factory.registerFunction(NameQuantilesExact::name, createAggregateFunctionQuantile<FuncQuantilesExact>);
factory.registerFunction(NameQuantileExactExclusive::name, createAggregateFunctionQuantile<FuncQuantileExactExclusive>);
factory.registerFunction(NameQuantilesExactExclusive::name, createAggregateFunctionQuantile<FuncQuantilesExactExclusive>);
factory.registerFunction(NameQuantileExactInclusive::name, createAggregateFunctionQuantile<FuncQuantileExactInclusive>);
factory.registerFunction(NameQuantilesExactInclusive::name, createAggregateFunctionQuantile<FuncQuantilesExactInclusive>);
factory.registerFunction(NameQuantileExactWeighted::name, createAggregateFunctionQuantile<FuncQuantileExactWeighted>);
factory.registerFunction(NameQuantilesExactWeighted::name, createAggregateFunctionQuantile<FuncQuantilesExactWeighted>);
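A usage sketch for the newly registered functions (hypothetical data); for the exclusive variant the level must lie strictly inside (0, 1), as enforced in QuantileExact.h below:

SELECT
    quantileExactExclusive(0.25)(x),
    quantilesExactInclusive(0.25, 0.5, 0.75)(x)
FROM (SELECT number AS x FROM numbers(10));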


@ -199,8 +199,15 @@ struct NameQuantileDeterministic { static constexpr auto name = "quantileDetermi
struct NameQuantilesDeterministic { static constexpr auto name = "quantilesDeterministic"; };
struct NameQuantileExact { static constexpr auto name = "quantileExact"; };
struct NameQuantileExactWeighted { static constexpr auto name = "quantileExactWeighted"; };
struct NameQuantilesExact { static constexpr auto name = "quantilesExact"; };
struct NameQuantileExactExclusive { static constexpr auto name = "quantileExactExclusive"; };
struct NameQuantilesExactExclusive { static constexpr auto name = "quantilesExactExclusive"; };
struct NameQuantileExactInclusive { static constexpr auto name = "quantileExactInclusive"; };
struct NameQuantilesExactInclusive { static constexpr auto name = "quantilesExactInclusive"; };
struct NameQuantileExactWeighted { static constexpr auto name = "quantileExactWeighted"; };
struct NameQuantilesExactWeighted { static constexpr auto name = "quantilesExactWeighted"; };
struct NameQuantileTiming { static constexpr auto name = "quantileTiming"; };


@ -17,8 +17,8 @@ namespace
template <template <typename> class Data>
AggregateFunctionPtr createAggregateFunctionWindowFunnel(const std::string & name, const DataTypes & arguments, const Array & params)
{
if (params.size() != 1)
throw Exception{"Aggregate function " + name + " requires exactly one parameter.", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH};
if (params.size() < 1)
throw Exception{"Aggregate function " + name + " requires at least one parameter: <window>, [option, [option, ...]]", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH};
if (arguments.size() < 2)
throw Exception("Aggregate function " + name + " requires one timestamp argument and at least one event condition.", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);


@ -139,6 +139,7 @@ class AggregateFunctionWindowFunnel final
private:
UInt64 window;
UInt8 events_size;
UInt8 strict;
// Loop through the entire events_list, update the event timestamp value
@ -165,6 +166,10 @@ private:
if (event_idx == 0)
events_timestamp[0] = timestamp;
else if (strict && events_timestamp[event_idx] >= 0)
{
return event_idx + 1;
}
else if (events_timestamp[event_idx - 1] >= 0 && timestamp <= events_timestamp[event_idx - 1] + window)
{
events_timestamp[event_idx] = events_timestamp[event_idx - 1];
@ -191,8 +196,17 @@ public:
{
events_size = arguments.size() - 1;
window = params.at(0).safeGet<UInt64>();
}
strict = 0;
for (size_t i = 1; i < params.size(); ++i)
{
String option = params.at(i).safeGet<String>();
if (option.compare("strict") == 0)
strict = 1;
else
throw Exception{"Aggregate function " + getName() + " doesn't support a parameter: " + option, ErrorCodes::BAD_ARGUMENTS};
}
}
DataTypePtr getReturnType() const override
{


@ -14,6 +14,7 @@ namespace DB
namespace ErrorCodes
{
extern const int NOT_IMPLEMENTED;
extern const int BAD_ARGUMENTS;
}
/** Calculates quantile by collecting all values into array
@ -106,16 +107,134 @@ struct QuantileExact
result[i] = Value();
}
}
};
/// The same, but in the case of an empty state, NaN is returned.
Float64 getFloat(Float64) const
/// QuantileExactExclusive is equivalent to Excel PERCENTILE.EXC, R-6, SAS-4, SciPy-(0,0)
template <typename Value>
struct QuantileExactExclusive : public QuantileExact<Value>
{
using QuantileExact<Value>::array;
/// Get the value of the `level` quantile. The level must be between 0 and 1 excluding bounds.
Float64 getFloat(Float64 level)
{
throw Exception("Method getFloat is not implemented for QuantileExact", ErrorCodes::NOT_IMPLEMENTED);
if (!array.empty())
{
if (level == 0. || level == 1.)
throw Exception("QuantileExactExclusive cannot interpolate for the percentiles 1 and 0", ErrorCodes::BAD_ARGUMENTS);
Float64 h = level * (array.size() + 1);
auto n = static_cast<size_t>(h);
if (n >= array.size())
return array[array.size() - 1];
else if (n < 1)
return array[0];
std::nth_element(array.begin(), array.begin() + n - 1, array.end());
auto nth_element = std::min_element(array.begin() + n, array.end());
return array[n - 1] + (h - n) * (*nth_element - array[n - 1]);
}
return std::numeric_limits<Float64>::quiet_NaN();
}
void getManyFloat(const Float64 *, const size_t *, size_t, Float64 *) const
void getManyFloat(const Float64 * levels, const size_t * indices, size_t size, Float64 * result)
{
throw Exception("Method getManyFloat is not implemented for QuantileExact", ErrorCodes::NOT_IMPLEMENTED);
if (!array.empty())
{
size_t prev_n = 0;
for (size_t i = 0; i < size; ++i)
{
auto level = levels[indices[i]];
if (level == 0. || level == 1.)
throw Exception("QuantileExactExclusive cannot interpolate for the percentiles 1 and 0", ErrorCodes::BAD_ARGUMENTS);
Float64 h = level * (array.size() + 1);
auto n = static_cast<size_t>(h);
if (n >= array.size())
result[indices[i]] = array[array.size() - 1];
else if (n < 1)
result[indices[i]] = array[0];
else
{
std::nth_element(array.begin() + prev_n, array.begin() + n - 1, array.end());
auto nth_element = std::min_element(array.begin() + n, array.end());
result[indices[i]] = array[n - 1] + (h - n) * (*nth_element - array[n - 1]);
prev_n = n - 1;
}
}
}
else
{
for (size_t i = 0; i < size; ++i)
result[i] = std::numeric_limits<Float64>::quiet_NaN();
}
}
};
/// QuantileExactInclusive is equivalent to Excel PERCENTILE and PERCENTILE.INC, R-7, SciPy-(1,1)
template <typename Value>
struct QuantileExactInclusive : public QuantileExact<Value>
{
using QuantileExact<Value>::array;
/// Get the value of the `level` quantile. The level must be between 0 and 1 including bounds.
Float64 getFloat(Float64 level)
{
if (!array.empty())
{
Float64 h = level * (array.size() - 1) + 1;
auto n = static_cast<size_t>(h);
if (n >= array.size())
return array[array.size() - 1];
else if (n < 1)
return array[0];
std::nth_element(array.begin(), array.begin() + n - 1, array.end());
auto nth_element = std::min_element(array.begin() + n, array.end());
return array[n - 1] + (h - n) * (*nth_element - array[n - 1]);
}
return std::numeric_limits<Float64>::quiet_NaN();
}
void getManyFloat(const Float64 * levels, const size_t * indices, size_t size, Float64 * result)
{
if (!array.empty())
{
size_t prev_n = 0;
for (size_t i = 0; i < size; ++i)
{
auto level = levels[indices[i]];
Float64 h = level * (array.size() - 1) + 1;
auto n = static_cast<size_t>(h);
if (n >= array.size())
result[indices[i]] = array[array.size() - 1];
else if (n < 1)
result[indices[i]] = array[0];
else
{
std::nth_element(array.begin() + prev_n, array.begin() + n - 1, array.end());
auto nth_element = std::min_element(array.begin() + n, array.end());
result[indices[i]] = array[n - 1] + (h - n) * (*nth_element - array[n - 1]);
prev_n = n - 1;
}
}
}
else
{
for (size_t i = 0; i < size; ++i)
result[i] = std::numeric_limits<Float64>::quiet_NaN();
}
}
};
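In formula form, both variants compute a fractional rank h from the level p over the N sorted values x_{(1)} <= ... <= x_{(N)} and linearly interpolate between adjacent order statistics, clamping to the extremes when h falls outside [1, N]:

\[ h_{\mathrm{exc}} = p\,(N + 1), \qquad h_{\mathrm{inc}} = p\,(N - 1) + 1, \qquad Q = x_{(\lfloor h\rfloor)} + (h - \lfloor h\rfloor)\,\bigl(x_{(\lfloor h\rfloor + 1)} - x_{(\lfloor h\rfloor)}\bigr) \]

This matches the code above: `n = static_cast<size_t>(h)` is the 1-based lower index, `nth_element` places x_{(n)} at `array[n - 1]`, and the following `min_element` over the remaining tail finds x_{(n+1)}.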


@ -172,7 +172,7 @@
M(OSWriteChars, "Number of bytes written to filesystem, including page cache.") \
M(CreatedHTTPConnections, "Total amount of created HTTP connections (closed or opened).") \
\
M(QueryProfilerCannotWriteTrace, "Number of stack traces dropped by query profiler because pipe is full or cannot write to pipe.") \
M(CannotWriteToWriteBufferDiscard, "Number of stack traces dropped by query profiler or signal handler because pipe is full or cannot write to pipe.") \
M(QueryProfilerSignalOverruns, "Number of times we drop processing of a signal due to overrun plus the number of signals that OS has not delivered due to overrun.") \
namespace ProfileEvents


@ -11,12 +11,11 @@
#include <Common/Exception.h>
#include <Common/thread_local_rng.h>
#include <IO/WriteHelpers.h>
#include <IO/WriteBufferFromFileDescriptor.h>
#include <IO/WriteBufferFromFileDescriptorDiscardOnFailure.h>
namespace ProfileEvents
{
extern const Event QueryProfilerCannotWriteTrace;
extern const Event QueryProfilerSignalOverruns;
}
@ -27,36 +26,6 @@ extern LazyPipe trace_pipe;
namespace
{
/** Write to file descriptor but drop the data if write would block or fail.
* To use within signal handler. Motivating example: a signal handler invoked during execution of malloc
* should not block because some mutex (or even worse - a spinlock) may be held.
*/
class WriteBufferDiscardOnFailure : public WriteBufferFromFileDescriptor
{
protected:
void nextImpl() override
{
size_t bytes_written = 0;
while (bytes_written != offset())
{
ssize_t res = ::write(fd, working_buffer.begin() + bytes_written, offset() - bytes_written);
if ((-1 == res || 0 == res) && errno != EINTR)
{
ProfileEvents::increment(ProfileEvents::QueryProfilerCannotWriteTrace);
break; /// Discard
}
if (res > 0)
bytes_written += res;
}
}
public:
using WriteBufferFromFileDescriptor::WriteBufferFromFileDescriptor;
~WriteBufferDiscardOnFailure() override {}
};
/// Normally query_id is a UUID (string with a fixed length) but user can provide custom query_id.
/// Thus upper bound on query_id length should be introduced to avoid buffer overflow in signal handler.
constexpr size_t QUERY_ID_MAX_LEN = 1024;
@ -90,7 +59,7 @@ namespace
sizeof(TimerType) + // timer type
sizeof(UInt32); // thread_number
char buffer[buf_size];
WriteBufferDiscardOnFailure out(trace_pipe.fds_rw[1], buf_size, buffer);
WriteBufferFromFileDescriptorDiscardOnFailure out(trace_pipe.fds_rw[1], buf_size, buffer);
StringRef query_id = CurrentThread::getQueryId();
query_id.size = std::min(query_id.size, QUERY_ID_MAX_LEN);
@ -204,11 +173,13 @@ QueryProfilerBase<ProfilerImpl>::~QueryProfilerBase()
template <typename ProfilerImpl>
void QueryProfilerBase<ProfilerImpl>::tryCleanup()
{
#if USE_INTERNAL_UNWIND_LIBRARY
if (timer_id != nullptr && timer_delete(timer_id))
LOG_ERROR(log, "Failed to delete query profiler timer " + errnoToString(ErrorCodes::CANNOT_DELETE_TIMER));
if (previous_handler != nullptr && sigaction(pause_signal, previous_handler, nullptr))
LOG_ERROR(log, "Failed to restore signal handler after query profiler " + errnoToString(ErrorCodes::CANNOT_SET_SIGNAL_HANDLER));
#endif
}
template class QueryProfilerBase<QueryProfilerReal>;


@ -1,7 +1,7 @@
#pragma once
#include <Core/Types.h>
#include <common/config_common.h>
#include <signal.h>
#include <time.h>
@ -43,8 +43,10 @@ private:
Poco::Logger * log;
#if USE_INTERNAL_UNWIND_LIBRARY
/// Timer id from timer_create(2)
timer_t timer_id = nullptr;
#endif
/// Pause signal to interrupt threads to get traces
int pause_signal;


@ -151,6 +151,12 @@ std::string signalToErrorMessage(int sig, const siginfo_t & info, const ucontext
}
break;
}
case SIGPROF:
{
error << "This is a signal used for debugging purposes by the user.";
break;
}
}
return error.str();
@ -239,10 +245,10 @@ const StackTrace::Frames & StackTrace::getFrames() const
}
static std::string toStringImpl(const StackTrace::Frames & frames, size_t offset, size_t size)
static void toStringEveryLineImpl(const StackTrace::Frames & frames, size_t offset, size_t size, std::function<void(const std::string &)> callback)
{
if (size == 0)
return "<Empty trace>";
return callback("<Empty trace>");
const DB::SymbolIndex & symbol_index = DB::SymbolIndex::instance();
std::unordered_map<std::string, DB::Dwarf> dwarfs;
@ -281,12 +287,23 @@ static std::string toStringImpl(const StackTrace::Frames & frames, size_t offset
else
out << "?";
out << "\n";
callback(out.str());
out.str({});
}
}
static std::string toStringImpl(const StackTrace::Frames & frames, size_t offset, size_t size)
{
std::stringstream out;
toStringEveryLineImpl(frames, offset, size, [&](const std::string & str) { out << str << '\n'; });
return out.str();
}
void StackTrace::toStringEveryLine(std::function<void(const std::string &)> callback) const
{
toStringEveryLineImpl(frames, offset, size, std::move(callback));
}
std::string StackTrace::toString() const
{
/// Calculation of stack trace text is extremely slow.


@ -4,6 +4,7 @@
#include <vector>
#include <array>
#include <optional>
#include <functional>
#include <signal.h>
#ifdef __APPLE__
@ -39,6 +40,8 @@ public:
const Frames & getFrames() const;
std::string toString() const;
void toStringEveryLine(std::function<void(const std::string &)> callback) const;
protected:
void tryCapture();


@ -36,7 +36,7 @@ target_include_directories (simple_cache PRIVATE ${DBMS_INCLUDE_DIR})
target_link_libraries (simple_cache PRIVATE common)
add_executable (compact_array compact_array.cpp)
target_link_libraries (compact_array PRIVATE clickhouse_common_io ${Boost_FILESYSTEM_LIBRARY})
target_link_libraries (compact_array PRIVATE clickhouse_common_io stdc++fs)
add_executable (radix_sort radix_sort.cpp)
target_link_libraries (radix_sort PRIVATE clickhouse_common_io)


@ -27,7 +27,7 @@
/// For the expansion of gtest macros.
#if defined(__clang__)
#pragma clang diagnostic ignored "-Wdeprecated"
#elif defined (__GNUC__)
#elif defined (__GNUC__) && __GNUC__ >= 9
#pragma GCC diagnostic ignored "-Wdeprecated-copy"
#endif


@ -96,6 +96,7 @@
#endif
/// Check for presence of address sanitizer
#if !defined(ADDRESS_SANITIZER)
#if defined(__has_feature)
#if __has_feature(address_sanitizer)
#define ADDRESS_SANITIZER 1
@ -103,7 +104,9 @@
#elif defined(__SANITIZE_ADDRESS__)
#define ADDRESS_SANITIZER 1
#endif
#endif
#if !defined(THREAD_SANITIZER)
#if defined(__has_feature)
#if __has_feature(thread_sanitizer)
#define THREAD_SANITIZER 1
@ -111,7 +114,9 @@
#elif defined(__SANITIZE_THREAD__)
#define THREAD_SANITIZER 1
#endif
#endif
#if !defined(MEMORY_SANITIZER)
#if defined(__has_feature)
#if __has_feature(memory_sanitizer)
#define MEMORY_SANITIZER 1
@ -119,6 +124,7 @@
#elif defined(__MEMORY_SANITIZER__)
#define MEMORY_SANITIZER 1
#endif
#endif
/// Explicitly allow undefined behaviour for certain functions. Use it as a function attribute.
/// It is useful in case when compiler cannot see (and exploit) it, but UBSan can.


@ -1,9 +1,17 @@
#pragma once
#include <ext/scope_guard.h>
#include <openssl/pem.h>
#include <openssl/rsa.h>
#include <random>
#include <sstream>
#include <Common/MemoryTracker.h>
#include <Common/OpenSSLHelpers.h>
#include <Common/PODArray.h>
#include <Core/Types.h>
#include <Interpreters/Context.h>
#include <IO/copyData.h>
#include <IO/LimitReadBuffer.h>
#include <IO/ReadBuffer.h>
#include <IO/ReadBufferFromMemory.h>
#include <IO/ReadBufferFromPocoSocket.h>
@ -14,9 +22,7 @@
#include <IO/WriteHelpers.h>
#include <Poco/Net/StreamSocket.h>
#include <Poco/RandomStream.h>
#include <random>
#include <sstream>
#include <IO/LimitReadBuffer.h>
#include <Poco/SHA1Engine.h>
/// Implementation of MySQL wire protocol.
/// Works only on little-endian architecture.
@ -27,6 +33,9 @@ namespace DB
namespace ErrorCodes
{
extern const int UNKNOWN_PACKET_FROM_CLIENT;
extern const int MYSQL_CLIENT_INSUFFICIENT_CAPABILITIES;
extern const int OPENSSL_ERROR;
extern const int UNKNOWN_EXCEPTION;
}
namespace MySQLProtocol
@ -39,11 +48,6 @@ const size_t MYSQL_ERRMSG_SIZE = 512;
const size_t PACKET_HEADER_SIZE = 4;
const size_t SSL_REQUEST_PAYLOAD_SIZE = 32;
namespace Authentication
{
const String Native = "mysql_native_password";
const String SHA256 = "sha256_password"; /// Caching SHA2 plugin is not used because it would be possible to authenticate knowing hash from users.xml.
}
enum CharacterSet
{
@ -149,6 +153,8 @@ private:
uint8_t & sequence_id;
const size_t max_packet_size = MAX_PACKET_LENGTH;
bool has_read_header = false;
// Size of packet which is being read now.
size_t payload_length = 0;
@ -158,8 +164,9 @@ private:
protected:
bool nextImpl() override
{
if (payload_length == 0 || (payload_length == max_packet_size && offset == payload_length))
if (!has_read_header || (payload_length == max_packet_size && offset == payload_length))
{
has_read_header = true;
working_buffer.resize(0);
offset = 0;
payload_length = 0;
@ -171,10 +178,6 @@ protected:
tmp << "Received packet with payload larger than max_packet_size: " << payload_length;
throw ProtocolError(tmp.str(), ErrorCodes::UNKNOWN_PACKET_FROM_CLIENT);
}
else if (payload_length == 0)
{
return false;
}
size_t packet_sequence_id = 0;
in.read(reinterpret_cast<char &>(packet_sequence_id));
@ -185,6 +188,9 @@ protected:
throw ProtocolError(tmp.str(), ErrorCodes::UNKNOWN_PACKET_FROM_CLIENT);
}
sequence_id++;
if (payload_length == 0)
return false;
}
else if (offset == payload_length)
{
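Protocol background for the framing logic above (standard MySQL wire format, not introduced by this diff): every packet is a 3-byte little-endian payload length, a 1-byte sequence id, then the payload. A payload of exactly MAX_PACKET_LENGTH (0xFFFFFF) signals that the payload continues in the next packet, and a subsequent zero-length packet legitimately terminates such a sequence, which is why the reader now tracks `has_read_header` instead of treating `payload_length == 0` as "nothing read yet". For example, the 3-byte payload "abc" with sequence id 0 is framed as:

03 00 00 00 61 62 63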
@ -208,6 +214,7 @@ class ClientPacket
{
public:
ClientPacket() = default;
ClientPacket(ClientPacket &&) = default;
virtual void read(ReadBuffer & in, uint8_t & sequence_id)
@ -257,6 +264,7 @@ public:
{
return total_left;
}
private:
WriteBuffer & out;
uint8_t & sequence_id;
@ -452,9 +460,6 @@ protected:
buffer.write(static_cast<char>(auth_plugin_data.size()));
writeChar(0x0, 10, buffer);
writeString(auth_plugin_data.substr(AUTH_PLUGIN_DATA_PART_1_LENGTH, auth_plugin_data.size() - AUTH_PLUGIN_DATA_PART_1_LENGTH), buffer);
// A workaround for PHP mysqlnd extension bug which occurs when sha256_password is used as a default authentication plugin.
// Instead of using client response for mysql_native_password plugin, the server will always generate authentication method mismatch
// and switch to sha256_password to simulate that mysql_native_password is used as a default plugin.
writeString(auth_plugin_name, buffer);
writeChar(0x0, 1, buffer);
}
@ -843,5 +848,230 @@ protected:
}
};
namespace Authentication
{
class IPlugin
{
public:
virtual String getName() = 0;
virtual String getAuthPluginData() = 0;
virtual void authenticate(const String & user_name, std::optional<String> auth_response, Context & context, std::shared_ptr<PacketSender> packet_sender, bool is_secure_connection,
const Poco::Net::SocketAddress & address) = 0;
virtual ~IPlugin() = default;
};
/// https://dev.mysql.com/doc/internals/en/secure-password-authentication.html
class Native41 : public IPlugin
{
public:
Native41()
{
scramble.resize(SCRAMBLE_LENGTH + 1, 0);
Poco::RandomInputStream generator;
for (size_t i = 0; i < SCRAMBLE_LENGTH; i++)
generator >> scramble[i];
}
String getName() override
{
return "mysql_native_password";
}
String getAuthPluginData() override
{
return scramble;
}
void authenticate(
const String & user_name,
std::optional<String> auth_response,
Context & context,
std::shared_ptr<PacketSender> packet_sender,
bool /* is_secure_connection */,
const Poco::Net::SocketAddress & address) override
{
if (!auth_response)
{
packet_sender->sendPacket(AuthSwitchRequest(getName(), scramble), true);
AuthSwitchResponse response;
packet_sender->receivePacket(response);
auth_response = response.value;
}
if (auth_response->empty())
{
context.setUser(user_name, "", address, "");
return;
}
if (auth_response->size() != Poco::SHA1Engine::DIGEST_SIZE)
throw Exception("Wrong size of auth response. Expected: " + std::to_string(Poco::SHA1Engine::DIGEST_SIZE) + " bytes, received: " + std::to_string(auth_response->size()) + " bytes.",
ErrorCodes::UNKNOWN_EXCEPTION);
auto user = context.getUser(user_name);
if (user->password_double_sha1_hex.empty())
throw Exception("Cannot use " + getName() + " auth plugin for user " + user_name + " since its password isn't specified using double SHA1.", ErrorCodes::UNKNOWN_EXCEPTION);
Poco::SHA1Engine::Digest double_sha1_value = Poco::DigestEngine::digestFromHex(user->password_double_sha1_hex);
assert(double_sha1_value.size() == Poco::SHA1Engine::DIGEST_SIZE);
Poco::SHA1Engine engine;
engine.update(scramble.data(), SCRAMBLE_LENGTH);
engine.update(double_sha1_value.data(), double_sha1_value.size());
String password_sha1(Poco::SHA1Engine::DIGEST_SIZE, 0x0);
const Poco::SHA1Engine::Digest & digest = engine.digest();
for (size_t i = 0; i < password_sha1.size(); i++)
{
password_sha1[i] = digest[i] ^ static_cast<unsigned char>((*auth_response)[i]);
}
context.setUser(user_name, password_sha1, address, "");
}
private:
String scramble;
};
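For reference, the XOR above follows the documented mysql_native_password scheme: the client sends token = SHA1(password) XOR SHA1(scramble + SHA1(SHA1(password))), and the server, which stores only the double SHA1, recovers the single hash as

\[ \mathrm{SHA1}(\mathit{pwd}) = \mathit{token} \oplus \mathrm{SHA1}\bigl(\mathit{scramble} \parallel \mathrm{SHA1}(\mathrm{SHA1}(\mathit{pwd}))\bigr) \]

which is exactly the loop computing `password_sha1[i] = digest[i] ^ auth_response[i]`; the recovered hash is then verified via `context.setUser`.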
/// Caching SHA2 plugin is not used because it would be possible to authenticate knowing hash from users.xml.
/// https://dev.mysql.com/doc/internals/en/sha256.html
class Sha256Password : public IPlugin
{
public:
Sha256Password(RSA & public_key_, RSA & private_key_, Logger * log_)
: public_key(public_key_)
, private_key(private_key_)
, log(log_)
{
/** Native authentication sent 20 bytes + '\0' character = 21 bytes.
* This plugin must do the same to stay consistent with historical behavior if it is set to operate as a default plugin. [1]
* https://github.com/mysql/mysql-server/blob/8.0/sql/auth/sql_authentication.cc#L3994
*/
scramble.resize(SCRAMBLE_LENGTH + 1, 0);
Poco::RandomInputStream generator;
for (size_t i = 0; i < SCRAMBLE_LENGTH; i++)
generator >> scramble[i];
}
String getName() override
{
return "sha256_password";
}
String getAuthPluginData() override
{
return scramble;
}
void authenticate(
const String & user_name,
std::optional<String> auth_response,
Context & context,
std::shared_ptr<PacketSender> packet_sender,
bool is_secure_connection,
const Poco::Net::SocketAddress & address) override
{
if (!auth_response)
{
packet_sender->sendPacket(AuthSwitchRequest(getName(), scramble), true);
if (packet_sender->in->eof())
throw Exception("Client doesn't support authentication method " + getName() + " used by ClickHouse. Specifying user password using 'password_double_sha1_hex' may fix the problem.",
ErrorCodes::MYSQL_CLIENT_INSUFFICIENT_CAPABILITIES);
AuthSwitchResponse response;
packet_sender->receivePacket(response);
auth_response = response.value;
LOG_TRACE(log, "Authentication method mismatch.");
}
else
{
LOG_TRACE(log, "Authentication method match.");
}
if (auth_response == "\1")
{
LOG_TRACE(log, "Client requests public key.");
BIO * mem = BIO_new(BIO_s_mem());
SCOPE_EXIT(BIO_free(mem));
if (PEM_write_bio_RSA_PUBKEY(mem, &public_key) != 1)
{
throw Exception("Failed to write public key to memory. Error: " + getOpenSSLErrors(), ErrorCodes::OPENSSL_ERROR);
}
char * pem_buf = nullptr;
long pem_size = BIO_get_mem_data(mem, &pem_buf);
String pem(pem_buf, pem_size);
LOG_TRACE(log, "Key: " << pem);
AuthMoreData data(pem);
packet_sender->sendPacket(data, true);
AuthSwitchResponse response;
packet_sender->receivePacket(response);
auth_response = response.value;
}
else
{
LOG_TRACE(log, "Client didn't request public key.");
}
String password;
/** Decrypt password, if it's not empty.
* The original intention was that the password is a string[NUL] but this never got enforced properly so now we have to accept that
* an empty packet is a blank password, thus the check for auth_response.empty() has to be made too.
* https://github.com/mysql/mysql-server/blob/8.0/sql/auth/sql_authentication.cc#L4017
*/
if (!is_secure_connection && !auth_response->empty() && auth_response != String("\0", 1))
{
LOG_TRACE(log, "Received nonempty password");
auto ciphertext = reinterpret_cast<unsigned char *>(auth_response->data());
unsigned char plaintext[RSA_size(&private_key)];
int plaintext_size = RSA_private_decrypt(auth_response->size(), ciphertext, plaintext, &private_key, RSA_PKCS1_OAEP_PADDING);
if (plaintext_size == -1)
{
throw Exception("Failed to decrypt auth data. Error: " + getOpenSSLErrors(), ErrorCodes::OPENSSL_ERROR);
}
password.resize(plaintext_size);
for (int i = 0; i < plaintext_size; i++)
{
password[i] = plaintext[i] ^ static_cast<unsigned char>(scramble[i % scramble.size()]);
}
}
else if (is_secure_connection)
{
password = *auth_response;
}
else
{
LOG_TRACE(log, "Received empty password");
}
if (!password.empty() && password.back() == 0)
{
password.pop_back();
}
context.setUser(user_name, password, address, "");
}
private:
RSA & public_key;
RSA & private_key;
Logger * log;
String scramble;
};
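In short, the sha256_password exchange above has two paths: over a secure connection the client sends the password directly, otherwise it may request the server's RSA public key with a single 0x01 byte and reply with RSA_PKCS1_OAEP(password XOR scramble), which the server inverts after `RSA_private_decrypt` as

\[ \mathit{password}[i] = \mathit{plaintext}[i] \oplus \mathit{scramble}[\,i \bmod |\mathit{scramble}|\,] \]

stripping a trailing NUL if present.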
}
}
}


@ -555,7 +555,7 @@ public:
for (const auto & member : members())
{
if (member.isChanged(castToDerived()))
found_changes.emplace_back(member.name.toString(), member.get_field(castToDerived()));
found_changes.push_back({member.name.toString(), member.get_field(castToDerived())});
}
return found_changes;
}


@ -348,10 +348,9 @@ bool OPTIMIZE(1) CSVRowInputStream::parseRowAndPrintDiagnosticInfo(MutableColumn
const auto & current_column_type = data_types[table_column];
const bool is_last_file_column =
file_column + 1 == column_indexes_for_input_fields.size();
const bool at_delimiter = *istr.position() == delimiter;
const bool at_delimiter = !istr.eof() && *istr.position() == delimiter;
const bool at_last_column_line_end = is_last_file_column
&& (*istr.position() == '\n' || *istr.position() == '\r'
|| istr.eof());
&& (istr.eof() || *istr.position() == '\n' || *istr.position() == '\r');
out << "Column " << file_column << ", " << std::string((file_column < 10 ? 2 : file_column < 100 ? 1 : 0), ' ')
<< "name: " << header.safeGetByPosition(table_column).name << ", " << std::string(max_length_of_column_name - header.safeGetByPosition(table_column).name.size(), ' ')
@ -514,10 +513,9 @@ void CSVRowInputStream::updateDiagnosticInfo()
bool CSVRowInputStream::readField(IColumn & column, const DataTypePtr & type, bool is_last_file_column, size_t column_idx)
{
const bool at_delimiter = *istr.position() == format_settings.csv.delimiter;
const bool at_delimiter = !istr.eof() && *istr.position() == format_settings.csv.delimiter;
const bool at_last_column_line_end = is_last_file_column
&& (*istr.position() == '\n' || *istr.position() == '\r'
|| istr.eof());
&& (istr.eof() || *istr.position() == '\n' || *istr.position() == '\r');
if (format_settings.csv.empty_as_default
&& (at_delimiter || at_last_column_line_end))


@ -25,6 +25,7 @@ target_link_libraries(clickhouse_functions
PRIVATE
${ZLIB_LIBRARIES}
${Boost_FILESYSTEM_LIBRARY}
${CMAKE_DL_LIBS}
)
if (OPENSSL_CRYPTO_LIBRARY)


@ -4,6 +4,10 @@
namespace DB
{
namespace ErrorCodes
{
extern const int BAD_CAST;
}
/// Working with UInt8: last bit = can be true, previous = can be false (Like dbms/src/Storages/MergeTree/BoolMask.h).
/// This function provides "AND" operation for BoolMasks.
@ -17,6 +21,8 @@ namespace DB
template <typename Result = ResultType>
static inline Result apply(A left, B right)
{
if constexpr (!std::is_same_v<A, ResultType> || !std::is_same_v<B, ResultType>)
throw DB::Exception("It's a bug! Only UInt8 type is supported by __bitBoolMaskAnd.", ErrorCodes::BAD_CAST);
return static_cast<ResultType>(
((static_cast<ResultType>(left) & static_cast<ResultType>(right)) & 1)
| ((((static_cast<ResultType>(left) >> 1) | (static_cast<ResultType>(right) >> 1)) & 1) << 1));


@ -4,6 +4,10 @@
namespace DB
{
namespace ErrorCodes
{
extern const int BAD_CAST;
}
/// Working with UInt8: last bit = can be true, previous = can be false (Like dbms/src/Storages/MergeTree/BoolMask.h).
/// This function provides "OR" operation for BoolMasks.
@ -17,6 +21,8 @@ namespace DB
template <typename Result = ResultType>
static inline Result apply(A left, B right)
{
if constexpr (!std::is_same_v<A, ResultType> || !std::is_same_v<B, ResultType>)
throw DB::Exception("It's a bug! Only UInt8 type is supported by __bitBoolMaskOr.", ErrorCodes::BAD_CAST);
return static_cast<ResultType>(
((static_cast<ResultType>(left) | static_cast<ResultType>(right)) & 1)
| ((((static_cast<ResultType>(left) >> 1) & (static_cast<ResultType>(right) >> 1)) & 1) << 1));


@ -4,6 +4,10 @@
namespace DB
{
namespace ErrorCodes
{
extern const int BAD_CAST;
}
/// Working with UInt8: last bit = can be true, previous = can be false (Like dbms/src/Storages/MergeTree/BoolMask.h).
/// This function provides "NOT" operation for BoolMasks by swapping last two bits ("can be true" <-> "can be false").
@ -14,6 +18,8 @@ namespace DB
static inline ResultType NO_SANITIZE_UNDEFINED apply(A a)
{
if constexpr (!std::is_same_v<A, ResultType>)
throw DB::Exception("It's a bug! Only UInt8 type is supported by __bitSwapLastTwo.", ErrorCodes::BAD_CAST);
return static_cast<ResultType>(
((static_cast<ResultType>(a) & 1) << 1) | ((static_cast<ResultType>(a) >> 1) & 1));
}


@ -4,6 +4,10 @@
namespace DB
{
namespace ErrorCodes
{
extern const int BAD_CAST;
}
/// Working with UInt8: last bit = can be true, previous = can be false (Like dbms/src/Storages/MergeTree/BoolMask.h).
/// This function wraps bool atomic functions
@ -15,7 +19,9 @@ namespace DB
static inline ResultType NO_SANITIZE_UNDEFINED apply(A a)
{
return a == static_cast<UInt8>(0) ? static_cast<ResultType>(0b10) : static_cast<ResultType >(0b1);
if constexpr (!std::is_integral_v<A>)
throw DB::Exception("It's a bug! Only integer types are supported by __bitWrapperFunc.", ErrorCodes::BAD_CAST);
return a == 0 ? static_cast<ResultType>(0b10) : static_cast<ResultType >(0b1);
}
#if USE_EMBEDDED_COMPILER


@ -0,0 +1,29 @@
#include <IO/WriteBufferFromFileDescriptorDiscardOnFailure.h>
namespace ProfileEvents
{
extern const Event CannotWriteToWriteBufferDiscard;
}
namespace DB
{
void WriteBufferFromFileDescriptorDiscardOnFailure::nextImpl()
{
size_t bytes_written = 0;
while (bytes_written != offset())
{
ssize_t res = ::write(fd, working_buffer.begin() + bytes_written, offset() - bytes_written);
if ((-1 == res || 0 == res) && errno != EINTR)
{
ProfileEvents::increment(ProfileEvents::CannotWriteToWriteBufferDiscard);
break; /// Discard
}
if (res > 0)
bytes_written += res;
}
}
}


@ -0,0 +1,23 @@
#pragma once
#include <IO/WriteBufferFromFileDescriptor.h>
namespace DB
{
/** Write to file descriptor but drop the data if write would block or fail.
* To use within signal handler. Motivating example: a signal handler invoked during execution of malloc
* should not block because some mutex (or even worse - a spinlock) may be held.
*/
class WriteBufferFromFileDescriptorDiscardOnFailure : public WriteBufferFromFileDescriptor
{
protected:
void nextImpl() override;
public:
using WriteBufferFromFileDescriptor::WriteBufferFromFileDescriptor;
~WriteBufferFromFileDescriptorDiscardOnFailure() override {}
};
}


@ -59,10 +59,10 @@ target_link_libraries (write_int PRIVATE clickhouse_common_io)
if (OS_LINUX OR OS_FREEBSD)
add_executable(write_buffer_aio write_buffer_aio.cpp)
target_link_libraries (write_buffer_aio PRIVATE clickhouse_common_io ${Boost_FILESYSTEM_LIBRARY})
target_link_libraries (write_buffer_aio PRIVATE clickhouse_common_io stdc++fs)
add_executable(read_buffer_aio read_buffer_aio.cpp)
target_link_libraries (read_buffer_aio PRIVATE clickhouse_common_io ${Boost_FILESYSTEM_LIBRARY})
target_link_libraries (read_buffer_aio PRIVATE clickhouse_common_io stdc++fs)
endif ()
add_executable (zlib_buffers zlib_buffers.cpp)


@ -1,7 +1,5 @@
#include <Interpreters/AnalyzedJoin.h>
#include <Interpreters/DatabaseAndTableWithAlias.h>
#include <Interpreters/SyntaxAnalyzer.h>
#include <Interpreters/ExpressionAnalyzer.h>
#include <Interpreters/InterpreterSelectWithUnionQuery.h>
#include <Parsers/ASTExpressionList.h>
@ -50,40 +48,6 @@ size_t AnalyzedJoin::rightKeyInclusion(const String & name) const
return count;
}
ExpressionActionsPtr AnalyzedJoin::createJoinedBlockActions(
const NamesAndTypesList & columns_added_by_join,
const ASTSelectQuery * select_query_with_join,
const Context & context) const
{
if (!select_query_with_join)
return nullptr;
const ASTTablesInSelectQueryElement * join = select_query_with_join->join();
if (!join)
return nullptr;
const auto & join_params = join->table_join->as<ASTTableJoin &>();
/// Create custom expression list with join keys from right table.
auto expression_list = std::make_shared<ASTExpressionList>();
ASTs & children = expression_list->children;
if (join_params.on_expression)
for (const auto & join_right_key : key_asts_right)
children.emplace_back(join_right_key);
NameSet required_columns_set(key_names_right.begin(), key_names_right.end());
for (const auto & joined_column : columns_added_by_join)
required_columns_set.insert(joined_column.name);
Names required_columns(required_columns_set.begin(), required_columns_set.end());
ASTPtr query = expression_list;
auto syntax_result = SyntaxAnalyzer(context).analyze(query, columns_from_joined_table, required_columns);
ExpressionAnalyzer analyzer(query, syntax_result, context);
return analyzer.getActions(true, false);
}
void AnalyzedJoin::deduplicateAndQualifyColumnNames(const NameSet & left_table_columns, const String & right_table_prefix)
{
NameSet joined_columns;


@ -14,9 +14,6 @@ class Context;
class ASTSelectQuery;
struct DatabaseAndTableWithAlias;
class ExpressionActions;
using ExpressionActionsPtr = std::shared_ptr<ExpressionActions>;
struct AnalyzedJoin
{
/** Query of the form `SELECT expr(x) AS k FROM t1 ANY LEFT JOIN (SELECT expr(x) AS k FROM t2) USING k`
@ -56,10 +53,8 @@ public:
void addUsingKey(const ASTPtr & ast);
void addOnKeys(ASTPtr & left_table_ast, ASTPtr & right_table_ast);
ExpressionActionsPtr createJoinedBlockActions(
const NamesAndTypesList & columns_added_by_join, /// Subset of available_joined_columns.
const ASTSelectQuery * select_query_with_join,
const Context & context) const;
bool hasUsing() const { return with_using; }
bool hasOn() const { return !with_using; }
NameSet getQualifiedColumnsSet() const;
NameSet getOriginalColumnsSet() const;

View File

@ -180,19 +180,22 @@ void AsynchronousMetrics::update()
calculateMaxAndSum(max_inserts_in_queue, sum_inserts_in_queue, status.queue.inserts_in_queue);
calculateMaxAndSum(max_merges_in_queue, sum_merges_in_queue, status.queue.merges_in_queue);
try
if (!status.is_readonly)
{
time_t absolute_delay = 0;
time_t relative_delay = 0;
table_replicated_merge_tree->getReplicaDelays(absolute_delay, relative_delay);
try
{
time_t absolute_delay = 0;
time_t relative_delay = 0;
table_replicated_merge_tree->getReplicaDelays(absolute_delay, relative_delay);
calculateMax(max_absolute_delay, absolute_delay);
calculateMax(max_relative_delay, relative_delay);
}
catch (...)
{
tryLogCurrentException(__PRETTY_FUNCTION__,
"Cannot get replica delay for table: " + backQuoteIfNeed(db.first) + "." + backQuoteIfNeed(iterator->name()));
calculateMax(max_absolute_delay, absolute_delay);
calculateMax(max_relative_delay, relative_delay);
}
catch (...)
{
tryLogCurrentException(__PRETTY_FUNCTION__,
"Cannot get replica delay for table: " + backQuoteIfNeed(db.first) + "." + backQuoteIfNeed(iterator->name()));
}
}
calculateMax(max_part_count_for_partition, table_replicated_merge_tree->getMaxPartsCountForPartition());

View File

@ -655,6 +655,10 @@ void Context::setProfile(const String & profile)
settings_constraints = std::move(new_constraints);
}
std::shared_ptr<const User> Context::getUser(const String & user_name)
{
return shared->users_manager->getUser(user_name);
}
void Context::setUser(const String & name, const String & password, const Poco::Net::SocketAddress & address, const String & quota_key)
{

View File

@ -6,6 +6,7 @@
#include <Core/Types.h>
#include <DataStreams/IBlockStream_fwd.h>
#include <Interpreters/ClientInfo.h>
#include <Interpreters/Users.h>
#include <Parsers/IAST_fwd.h>
#include <Common/LRUCache.h>
#include <Common/MultiVersion.h>
@ -200,6 +201,10 @@ public:
/// Must be called before getClientInfo.
void setUser(const String & name, const String & password, const Poco::Net::SocketAddress & address, const String & quota_key);
/// Used by MySQL Secure Password Authentication plugin.
std::shared_ptr<const User> getUser(const String & user_name);
/// Compute and set actual user settings, client_info.current_user should be set
void calculateUserSettings();

View File

@ -69,7 +69,7 @@ using LogAST = DebugASTLog<false>; /// set to true to enable logs
namespace ErrorCodes
{
extern const int UNKNOWN_IDENTIFIER;
extern const int EXPECTED_ALL_OR_ANY;
extern const int LOGICAL_ERROR;
}
ExpressionAnalyzer::ExpressionAnalyzer(
@ -140,7 +140,7 @@ void ExpressionAnalyzer::analyzeAggregation()
for (const auto & key_ast : analyzedJoin().key_asts_left)
getRootActions(key_ast, true, temp_actions);
addJoinAction(temp_actions, true);
addJoinAction(temp_actions);
}
}
@ -411,17 +411,6 @@ bool SelectQueryExpressionAnalyzer::appendArrayJoin(ExpressionActionsChain & cha
return true;
}
void ExpressionAnalyzer::addJoinAction(ExpressionActionsPtr & actions, bool only_types) const
{
if (only_types)
actions->add(ExpressionAction::ordinaryJoin(nullptr, analyzedJoin().key_names_left, columnsAddedByJoin()));
else
for (auto & subquery_for_set : subqueries_for_sets)
if (subquery_for_set.second.join)
actions->add(ExpressionAction::ordinaryJoin(subquery_for_set.second.join, analyzedJoin().key_names_left,
columnsAddedByJoin()));
}
static void appendRequiredColumns(
NameSet & required_columns, const Block & sample, const Names & key_names_right, const NamesAndTypesList & columns_added_by_join)
{
@ -434,48 +423,37 @@ static void appendRequiredColumns(
required_columns.insert(column.name);
}
/// It's possible to pass nullptr as the join in only_types mode
void ExpressionAnalyzer::addJoinAction(ExpressionActionsPtr & actions, JoinPtr join) const
{
actions->add(ExpressionAction::ordinaryJoin(join, analyzedJoin().key_names_left, columnsAddedByJoin()));
}
bool SelectQueryExpressionAnalyzer::appendJoin(ExpressionActionsChain & chain, bool only_types)
{
const auto * select_query = getSelectQuery();
if (!select_query->join())
const ASTTablesInSelectQueryElement * ast_join = getSelectQuery()->join();
if (!ast_join)
return false;
SubqueryForSet & subquery_for_set = getSubqueryForJoin(*ast_join);
ASTPtr left_keys_list = std::make_shared<ASTExpressionList>();
left_keys_list->children = analyzedJoin().key_asts_left;
initChain(chain, sourceColumns());
ExpressionActionsChain::Step & step = chain.steps.back();
const auto & join_element = select_query->join()->as<ASTTablesInSelectQueryElement &>();
getRootActions(left_keys_list, only_types, step.actions);
addJoinAction(step.actions, subquery_for_set.join);
return true;
}
static JoinPtr tryGetStorageJoin(const ASTTablesInSelectQueryElement & join_element, const Context & context)
{
const auto & table_to_join = join_element.table_expression->as<ASTTableExpression &>();
auto & join_params = join_element.table_join->as<ASTTableJoin &>();
if (join_params.strictness == ASTTableJoin::Strictness::Unspecified && join_params.kind != ASTTableJoin::Kind::Cross)
{
if (settings.join_default_strictness == "ANY")
join_params.strictness = ASTTableJoin::Strictness::Any;
else if (settings.join_default_strictness == "ALL")
join_params.strictness = ASTTableJoin::Strictness::All;
else
throw Exception("Expected ANY or ALL in JOIN section, because setting (join_default_strictness) is empty", DB::ErrorCodes::EXPECTED_ALL_OR_ANY);
}
if (join_params.using_expression_list)
{
getRootActions(join_params.using_expression_list, only_types, step.actions);
}
else if (join_params.on_expression)
{
auto list = std::make_shared<ASTExpressionList>();
list->children = analyzedJoin().key_asts_left;
getRootActions(list, only_types, step.actions);
}
const auto & table_to_join = join_element.table_expression->as<ASTTableExpression &>();
/// Two JOINs with the same subquery but different USING clauses are not supported.
auto join_hash = join_element.getTreeHash();
SubqueryForSet & subquery_for_set = subqueries_for_sets[toString(join_hash.first) + "_" + toString(join_hash.second)];
/// Special case - if table name is specified on the right of JOIN, then the table has the type Join (the previously prepared mapping).
/// TODO This syntax does not support specifying a database name.
if (table_to_join.database_and_table_name)
{
@ -491,64 +469,100 @@ bool SelectQueryExpressionAnalyzer::appendJoin(ExpressionActionsChain & chain, b
storage_join->assertCompatible(join_params.kind, join_params.strictness);
/// TODO Check the set of keys.
JoinPtr & join = storage_join->getJoin();
subquery_for_set.join = join;
return storage_join->getJoin();
}
}
}
return {};
}
SubqueryForSet & SelectQueryExpressionAnalyzer::getSubqueryForJoin(const ASTTablesInSelectQueryElement & join_element)
{
/// Two JOINs with the same subquery but different USING clauses are not supported.
auto join_hash = join_element.getTreeHash();
String join_subquery_id = toString(join_hash.first) + "_" + toString(join_hash.second);
SubqueryForSet & subquery_for_set = subqueries_for_sets[join_subquery_id];
/// Special case - if table name is specified on the right of JOIN, then the table has the type Join (the previously prepared mapping).
if (!subquery_for_set.join)
subquery_for_set.join = tryGetStorageJoin(join_element, context);
if (!subquery_for_set.join)
makeHashJoin(join_element, subquery_for_set);
return subquery_for_set;
}
void SelectQueryExpressionAnalyzer::makeHashJoin(const ASTTablesInSelectQueryElement & join_element,
SubqueryForSet & subquery_for_set) const
{
/// Actions which need to be calculated on joined block.
ExpressionActionsPtr joined_block_actions = createJoinedBlockActions();
/** For GLOBAL JOINs (in the case, for example, of the push method for executing GLOBAL subqueries), the following occurs
* - in the addExternalStorage function, the JOIN (SELECT ...) subquery is replaced with JOIN _data1,
* in the subquery_for_set object this subquery is exposed as source and the temporary table _data1 as the `table`.
* - this function sees the expression JOIN _data1.
*/
if (!subquery_for_set.source)
{
ASTPtr table;
auto & table_to_join = join_element.table_expression->as<ASTTableExpression &>();
if (table_to_join.subquery)
table = table_to_join.subquery;
else if (table_to_join.table_function)
table = table_to_join.table_function;
else if (table_to_join.database_and_table_name)
table = table_to_join.database_and_table_name;
Names action_columns = joined_block_actions->getRequiredColumns();
NameSet required_columns(action_columns.begin(), action_columns.end());
auto & analyzed_join = analyzedJoin();
/// Actions which need to be calculated on joined block.
ExpressionActionsPtr joined_block_actions =
analyzed_join.createJoinedBlockActions(columnsAddedByJoin(), select_query, context);
appendRequiredColumns(
required_columns, joined_block_actions->getSampleBlock(), analyzed_join.key_names_right, columnsAddedByJoin());
/** For GLOBAL JOINs (in the case, for example, of the push method for executing GLOBAL subqueries), the following occurs
* - in the addExternalStorage function, the JOIN (SELECT ...) subquery is replaced with JOIN _data1,
* in the subquery_for_set object this subquery is exposed as source and the temporary table _data1 as the `table`.
* - this function sees the expression JOIN _data1.
*/
if (!subquery_for_set.source)
{
ASTPtr table;
auto original_map = analyzed_join.getOriginalColumnsMap(required_columns);
Names original_columns;
for (auto & pr : original_map)
original_columns.push_back(pr.second);
if (table_to_join.subquery)
table = table_to_join.subquery;
else if (table_to_join.table_function)
table = table_to_join.table_function;
else if (table_to_join.database_and_table_name)
table = table_to_join.database_and_table_name;
auto interpreter = interpretSubquery(table, context, subquery_depth, original_columns);
Names action_columns = joined_block_actions->getRequiredColumns();
NameSet required_columns(action_columns.begin(), action_columns.end());
appendRequiredColumns(
required_columns, joined_block_actions->getSampleBlock(), analyzed_join.key_names_right, columnsAddedByJoin());
auto original_map = analyzed_join.getOriginalColumnsMap(required_columns);
Names original_columns;
for (auto & pr : original_map)
original_columns.push_back(pr.second);
auto interpreter = interpretSubquery(table, context, subquery_depth, original_columns);
subquery_for_set.makeSource(interpreter, original_map);
}
Block sample_block = subquery_for_set.renamedSampleBlock();
joined_block_actions->execute(sample_block);
/// TODO You do not need to set this up when JOIN is only needed on remote servers.
subquery_for_set.join = std::make_shared<Join>(analyzedJoin().key_names_right, settings.join_use_nulls,
settings.size_limits_for_join, join_params.kind, join_params.strictness);
subquery_for_set.join->setSampleBlock(sample_block);
subquery_for_set.joined_block_actions = joined_block_actions;
subquery_for_set.makeSource(interpreter, original_map);
}
addJoinAction(step.actions, false);
Block sample_block = subquery_for_set.renamedSampleBlock();
joined_block_actions->execute(sample_block);
return true;
/// TODO You do not need to set this up when JOIN is only needed on remote servers.
auto & join_params = join_element.table_join->as<ASTTableJoin &>();
subquery_for_set.join = std::make_shared<Join>(analyzedJoin().key_names_right, settings.join_use_nulls,
settings.size_limits_for_join, join_params.kind, join_params.strictness);
subquery_for_set.join->setSampleBlock(sample_block);
subquery_for_set.joined_block_actions = joined_block_actions;
}
ExpressionActionsPtr SelectQueryExpressionAnalyzer::createJoinedBlockActions() const
{
/// Create custom expression list with join keys from right table.
ASTPtr expression_list = std::make_shared<ASTExpressionList>();
ASTs & children = expression_list->children;
if (analyzedJoin().hasOn())
for (const auto & join_right_key : analyzedJoin().key_asts_right)
children.emplace_back(join_right_key);
NameSet required_columns_set(analyzedJoin().key_names_right.begin(), analyzedJoin().key_names_right.end());
for (const auto & joined_column : columnsAddedByJoin())
required_columns_set.insert(joined_column.name);
Names required_columns(required_columns_set.begin(), required_columns_set.end());
auto syntax_result = SyntaxAnalyzer(context).analyze(expression_list, analyzedJoin().columns_from_joined_table, required_columns);
return ExpressionAnalyzer(expression_list, syntax_result, context).getActions(true, false);
}
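
getSubqueryForJoin above keys the subqueries_for_sets cache by the tree hash of the JOIN element (getTreeHash yields a pair of UInt64), so structurally identical JOIN subtrees reuse one SubqueryForSet. A minimal standalone sketch of that caching pattern, with illustrative names:

#include <cstdint>
#include <map>
#include <string>
#include <utility>

struct SubqueryState { bool prepared = false; /* join, source, ... */ };

/// The first lookup creates an empty entry; later lookups with the same tree hash
/// return the same entry, so the join is built only once per distinct subtree.
SubqueryState & getOrCreate(std::map<std::string, SubqueryState> & cache,
                            std::pair<uint64_t, uint64_t> tree_hash)
{
    std::string key = std::to_string(tree_hash.first) + "_" + std::to_string(tree_hash.second);
    return cache[key];
}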
bool SelectQueryExpressionAnalyzer::appendPrewhere(

View File

@ -24,6 +24,7 @@ struct ASTTableJoin;
class ASTFunction;
class ASTExpressionList;
class ASTSelectQuery;
struct ASTTablesInSelectQueryElement;
struct SyntaxAnalyzerResult;
using SyntaxAnalyzerResultPtr = std::shared_ptr<const SyntaxAnalyzerResult>;
@ -63,14 +64,12 @@ private:
const bool join_use_nulls;
const SizeLimits size_limits_for_set;
const SizeLimits size_limits_for_join;
const String join_default_strictness;
ExtractedSettings(const Settings & settings_)
: use_index_for_in_with_subqueries(settings_.use_index_for_in_with_subqueries),
join_use_nulls(settings_.join_use_nulls),
size_limits_for_set(settings_.max_rows_in_set, settings_.max_bytes_in_set, settings_.set_overflow_mode),
size_limits_for_join(settings_.max_rows_in_join, settings_.max_bytes_in_join, settings_.join_overflow_mode),
join_default_strictness(settings_.join_default_strictness.toString())
size_limits_for_join(settings_.max_rows_in_join, settings_.max_bytes_in_join, settings_.join_overflow_mode)
{}
};
@ -132,7 +131,7 @@ protected:
void addMultipleArrayJoinAction(ExpressionActionsPtr & actions, bool is_left) const;
void addJoinAction(ExpressionActionsPtr & actions, bool only_types) const;
void addJoinAction(ExpressionActionsPtr & actions, JoinPtr join = {}) const;
void getRootActions(const ASTPtr & ast, bool no_subqueries, ExpressionActionsPtr & actions, bool only_consts = false);
@ -224,6 +223,10 @@ private:
*/
void tryMakeSetForIndexFromSubquery(const ASTPtr & subquery_or_table_name);
SubqueryForSet & getSubqueryForJoin(const ASTTablesInSelectQueryElement & join_element);
ExpressionActionsPtr createJoinedBlockActions() const;
void makeHashJoin(const ASTTablesInSelectQueryElement & join_element, SubqueryForSet & subquery_for_set) const;
const ASTSelectQuery * getAggregatingQuery() const;
};

View File

@ -1,9 +1,9 @@
#include <Interpreters/MetricLog.h>
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/DataTypeString.h>
#include <DataTypes/DataTypeDate.h>
#include <DataTypes/DataTypeDateTime.h>
namespace DB
{
@ -11,11 +11,10 @@ Block MetricLogElement::createBlock()
{
ColumnsWithTypeAndName columns_with_type_and_name;
columns_with_type_and_name.emplace_back(std::make_shared<DataTypeDate>(), "event_date");
columns_with_type_and_name.emplace_back(std::make_shared<DataTypeDateTime>(), "event_time");
columns_with_type_and_name.emplace_back(std::make_shared<DataTypeUInt64>(), "milliseconds");
columns_with_type_and_name.emplace_back(std::make_shared<DataTypeDate>(), "event_date");
columns_with_type_and_name.emplace_back(std::make_shared<DataTypeDateTime>(), "event_time");
columns_with_type_and_name.emplace_back(std::make_shared<DataTypeUInt64>(), "milliseconds");
//ProfileEvents
for (size_t i = 0, end = ProfileEvents::end(); i < end; ++i)
{
std::string name;
@ -24,7 +23,6 @@ Block MetricLogElement::createBlock()
columns_with_type_and_name.emplace_back(std::make_shared<DataTypeUInt64>(), std::move(name));
}
//CurrentMetrics
for (size_t i = 0, end = CurrentMetrics::end(); i < end; ++i)
{
std::string name;
@ -36,31 +34,25 @@ Block MetricLogElement::createBlock()
return Block(columns_with_type_and_name);
}
void MetricLogElement::appendToBlock(Block & block) const
{
MutableColumns columns = block.mutateColumns();
size_t iter = 0;
size_t column_idx = 0;
columns[iter++]->insert(DateLUT::instance().toDayNum(event_time));
columns[iter++]->insert(event_time);
columns[iter++]->insert(milliseconds);
columns[column_idx++]->insert(DateLUT::instance().toDayNum(event_time));
columns[column_idx++]->insert(event_time);
columns[column_idx++]->insert(milliseconds);
//ProfileEvents
for (size_t i = 0, end = ProfileEvents::end(); i < end; ++i)
{
const UInt64 value = ProfileEvents::global_counters[i].load(std::memory_order_relaxed);
columns[iter++]->insert(value);
}
columns[column_idx++]->insert(profile_events[i]);
//CurrentMetrics
for (size_t i = 0, end = CurrentMetrics::end(); i < end; ++i)
{
const UInt64 value = CurrentMetrics::values[i];
columns[iter++]->insert(value);
}
columns[column_idx++]->insert(current_metrics[i]);
}
void MetricLog::startCollectMetric(size_t collect_interval_milliseconds_)
{
collect_interval_milliseconds = collect_interval_milliseconds_;
@ -68,6 +60,7 @@ void MetricLog::startCollectMetric(size_t collect_interval_milliseconds_)
metric_flush_thread = ThreadFromGlobalPool([this] { metricThreadFunction(); });
}
void MetricLog::stopCollectMetric()
{
bool old_val = false;
@ -76,28 +69,51 @@ void MetricLog::stopCollectMetric()
metric_flush_thread.join();
}
inline UInt64 time_in_milliseconds(std::chrono::time_point<std::chrono::system_clock> timepoint)
{
return std::chrono::duration_cast<std::chrono::milliseconds>(timepoint.time_since_epoch()).count();
}
inline UInt64 time_in_seconds(std::chrono::time_point<std::chrono::system_clock> timepoint)
{
return std::chrono::duration_cast<std::chrono::seconds>(timepoint.time_since_epoch()).count();
}
void MetricLog::metricThreadFunction()
{
auto desired_timepoint = std::chrono::system_clock::now();
/// For differentiation of ProfileEvents counters.
std::vector<ProfileEvents::Count> prev_profile_events(ProfileEvents::end());
while (!is_shutdown_metric_thread)
{
try
{
MetricLogElement elem;
const auto current_time = std::chrono::system_clock::now();
MetricLogElement elem;
elem.event_time = std::chrono::system_clock::to_time_t(current_time);
elem.milliseconds = time_in_milliseconds(current_time) - time_in_seconds(current_time) * 1000;
elem.profile_events.resize(ProfileEvents::end());
for (size_t i = 0, end = ProfileEvents::end(); i < end; ++i)
{
const ProfileEvents::Count new_value = ProfileEvents::global_counters[i].load(std::memory_order_relaxed);
UInt64 & old_value = prev_profile_events[i];
elem.profile_events[i] = new_value - old_value;
old_value = new_value;
}
elem.current_metrics.resize(CurrentMetrics::end());
for (size_t i = 0, end = CurrentMetrics::end(); i < end; ++i)
{
elem.current_metrics[i] = CurrentMetrics::values[i];
}
this->add(elem);
/// We will record the current time into the table, but align it to regular time intervals to avoid time drift.
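
The loop above converts the monotonically growing ProfileEvents counters into per-interval increments by keeping the previous snapshot, and splits the timestamp into whole seconds plus a millisecond remainder. A self-contained sketch of both calculations (plain atomics stand in for ProfileEvents::global_counters):

#include <atomic>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <ctime>
#include <vector>

std::atomic<uint64_t> counters[2]; /// stand-in for the global, ever-growing counters

struct Sample { std::time_t event_time; uint64_t milliseconds; std::vector<uint64_t> deltas; };

Sample collect(std::vector<uint64_t> & prev) /// prev holds one slot per counter
{
    const auto now = std::chrono::system_clock::now();
    const uint64_t ms = std::chrono::duration_cast<std::chrono::milliseconds>(now.time_since_epoch()).count();

    Sample s;
    s.event_time = std::chrono::system_clock::to_time_t(now);
    s.milliseconds = ms % 1000; /// same value as time_in_milliseconds(now) - time_in_seconds(now) * 1000
    s.deltas.resize(prev.size());
    for (size_t i = 0; i < prev.size(); ++i)
    {
        const uint64_t value = counters[i].load(std::memory_order_relaxed);
        s.deltas[i] = value - prev[i]; /// counters only grow, so this is the per-interval increment
        prev[i] = value;
    }
    return s;
}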

View File

@ -1,22 +1,34 @@
#pragma once
#include <Interpreters/SystemLog.h>
#include <Interpreters/AsynchronousMetrics.h>
#include <Common/ProfileEvents.h>
#include <Common/CurrentMetrics.h>
#include <vector>
#include <atomic>
#include <ctime>
namespace DB
{
using Poco::Message;
/** MetricLog is a log of metric values measured at regular time intervals.
*/
struct MetricLogElement
{
time_t event_time{};
UInt64 milliseconds{};
std::vector<ProfileEvents::Count> profile_events;
std::vector<CurrentMetrics::Metric> current_metrics;
static std::string name() { return "MetricLog"; }
static Block createBlock();
void appendToBlock(Block & block) const;
};
class MetricLog : public SystemLog<MetricLogElement>
{
using SystemLog<MetricLogElement>::SystemLog;

View File

@ -320,13 +320,7 @@ ColumnPtr Set::execute(const Block & block, bool negative) const
return res;
}
if (data_types.size() != num_key_columns)
{
std::stringstream message;
message << "Number of columns in section IN doesn't match. "
<< num_key_columns << " at left, " << data_types.size() << " at right.";
throw Exception(message.str(), ErrorCodes::NUMBER_OF_COLUMNS_DOESNT_MATCH);
}
checkColumnsNumber(num_key_columns);
/// Remember the columns we will work with. Also check that the data types are correct.
ColumnRawPtrs key_columns;
@ -337,11 +331,7 @@ ColumnPtr Set::execute(const Block & block, bool negative) const
for (size_t i = 0; i < num_key_columns; ++i)
{
if (!removeNullable(data_types[i])->equals(*removeNullable(block.safeGetByPosition(i).type)))
throw Exception("Types of column " + toString(i + 1) + " in section IN don't match: "
+ data_types[i]->getName() + " on the right, " + block.safeGetByPosition(i).type->getName() +
" on the left.", ErrorCodes::TYPE_MISMATCH);
checkTypesEqual(i, block.safeGetByPosition(i).type);
materialized_columns.emplace_back(block.safeGetByPosition(i).column->convertToFullColumnIfConst());
key_columns.emplace_back() = materialized_columns.back().get();
}
@ -421,6 +411,24 @@ void Set::executeOrdinary(
}
}
void Set::checkColumnsNumber(size_t num_key_columns) const
{
if (data_types.size() != num_key_columns)
{
std::stringstream message;
message << "Number of columns in section IN doesn't match. "
<< num_key_columns << " at left, " << data_types.size() << " at right.";
throw Exception(message.str(), ErrorCodes::NUMBER_OF_COLUMNS_DOESNT_MATCH);
}
}
void Set::checkTypesEqual(size_t set_type_idx, const DataTypePtr & other_type) const
{
if (!removeNullable(data_types[set_type_idx])->equals(*removeNullable(other_type)))
throw Exception("Types of column " + toString(set_type_idx + 1) + " in section IN don't match: "
+ data_types[set_type_idx]->getName() + " on the right, " + other_type->getName() +
" on the left.", ErrorCodes::TYPE_MISMATCH);
}
MergeTreeSetIndex::MergeTreeSetIndex(const Columns & set_elements, std::vector<KeyTuplePositionMapping> && index_mapping_)
: indexes_mapping(std::move(index_mapping_))

View File

@ -70,6 +70,9 @@ public:
bool hasExplicitSetElements() const { return fill_set_elements; }
Columns getSetElements() const { return { set_elements.begin(), set_elements.end() }; }
void checkColumnsNumber(size_t num_key_columns) const;
void checkTypesEqual(size_t set_type_idx, const DataTypePtr & other_type) const;
private:
size_t keys_size = 0;
Sizes key_sizes;

View File

@ -1,7 +1,9 @@
#include <Core/Settings.h>
#include <Core/NamesAndTypes.h>
#include <Interpreters/SyntaxAnalyzer.h>
#include <Interpreters/InJoinSubqueriesPreprocessor.h>
#include <Interpreters/LogicalExpressionsOptimizer.h>
#include <Core/Settings.h>
#include <Interpreters/QueryAliasesVisitor.h>
#include <Interpreters/InterpreterSelectWithUnionQuery.h>
#include <Interpreters/ArrayJoinedColumnsVisitor.h>
@ -29,10 +31,8 @@
#include <DataTypes/NestedUtils.h>
#include <Core/NamesAndTypes.h>
#include <IO/WriteHelpers.h>
#include <Storages/IStorage.h>
#include <Common/typeid_cast.h>
#include <functional>
@ -48,6 +48,7 @@ namespace ErrorCodes
extern const int EMPTY_LIST_OF_COLUMNS_QUERIED;
extern const int NOT_IMPLEMENTED;
extern const int UNKNOWN_IDENTIFIER;
extern const int EXPECTED_ALL_OR_ANY;
}
NameSet removeDuplicateColumns(NamesAndTypesList & columns)
@ -487,6 +488,27 @@ void getArrayJoinedColumns(ASTPtr & query, SyntaxAnalyzerResult & result, const
}
}
void setJoinStrictness(ASTSelectQuery & select_query, JoinStrictness join_default_strictness)
{
const ASTTablesInSelectQueryElement * node = select_query.join();
if (!node)
return;
auto & table_join = const_cast<ASTTablesInSelectQueryElement *>(node)->table_join->as<ASTTableJoin &>();
if (table_join.strictness == ASTTableJoin::Strictness::Unspecified &&
table_join.kind != ASTTableJoin::Kind::Cross)
{
if (join_default_strictness == JoinStrictness::ANY)
table_join.strictness = ASTTableJoin::Strictness::Any;
else if (join_default_strictness == JoinStrictness::ALL)
table_join.strictness = ASTTableJoin::Strictness::All;
else
throw Exception("Expected ANY or ALL in JOIN section, because setting (join_default_strictness) is empty",
DB::ErrorCodes::EXPECTED_ALL_OR_ANY);
}
}
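
A standalone restatement of the rule above (enum and function names are illustrative, not the real API), useful to see the three outcomes at a glance: explicit strictness and CROSS joins pass through untouched, an unspecified strictness picks up join_default_strictness, and an empty setting is a hard error.

#include <stdexcept>

enum class Strictness { Unspecified, Any, All };
enum class DefaultStrictness { Empty, ANY, ALL };

Strictness resolve(Strictness explicit_strictness, bool is_cross, DefaultStrictness def)
{
    /// Explicit ANY/ALL and CROSS joins are left exactly as written in the query.
    if (explicit_strictness != Strictness::Unspecified || is_cross)
        return explicit_strictness;
    if (def == DefaultStrictness::ANY)
        return Strictness::Any;
    if (def == DefaultStrictness::ALL)
        return Strictness::All;
    throw std::runtime_error("Expected ANY or ALL in JOIN section, because setting (join_default_strictness) is empty");
}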
/// Find the columns that are obtained by JOIN.
void collectJoinedColumns(AnalyzedJoin & analyzed_join, const ASTSelectQuery & select_query, const NameSet & source_columns,
const Aliases & aliases, bool join_use_nulls)
@ -863,6 +885,7 @@ SyntaxAnalyzerResultPtr SyntaxAnalyzer::analyze(
/// Push the predicate expression down to the subqueries.
result.rewrite_subqueries = PredicateExpressionsOptimizer(select_query, settings, context).optimize();
setJoinStrictness(*select_query, settings.join_default_strictness);
collectJoinedColumns(result.analyzed_join, *select_query, source_columns_set, result.aliases, settings.join_use_nulls);
}

View File

@ -278,12 +278,14 @@ User::User(const String & name_, const String & config_elem, const Poco::Util::A
{
bool has_password = config.has(config_elem + ".password");
bool has_password_sha256_hex = config.has(config_elem + ".password_sha256_hex");
bool has_password_double_sha1_hex = config.has(config_elem + ".password_double_sha1_hex");
if (has_password && has_password_sha256_hex)
throw Exception("Both fields 'password' and 'password_sha256_hex' are specified for user " + name + ". Must be only one of them.", ErrorCodes::BAD_ARGUMENTS);
if (has_password + has_password_sha256_hex + has_password_double_sha1_hex > 1)
throw Exception("More than one field of 'password', 'password_sha256_hex', 'password_double_sha1_hex' is used to specify password for user " + name + ". Must be only one of them.",
ErrorCodes::BAD_ARGUMENTS);
if (!has_password && !has_password_sha256_hex)
throw Exception("Either 'password' or 'password_sha256_hex' must be specified for user " + name + ".", ErrorCodes::BAD_ARGUMENTS);
if (!has_password && !has_password_sha256_hex && !has_password_double_sha1_hex)
throw Exception("Either 'password' or 'password_sha256_hex' or 'password_double_sha1_hex' must be specified for user " + name + ".", ErrorCodes::BAD_ARGUMENTS);
if (has_password)
password = config.getString(config_elem + ".password");
@ -296,6 +298,14 @@ User::User(const String & name_, const String & config_elem, const Poco::Util::A
throw Exception("password_sha256_hex for user " + name + " has length " + toString(password_sha256_hex.size()) + " but must be exactly 64 symbols.", ErrorCodes::BAD_ARGUMENTS);
}
if (has_password_double_sha1_hex)
{
password_double_sha1_hex = Poco::toLower(config.getString(config_elem + ".password_double_sha1_hex"));
if (password_double_sha1_hex.size() != 40)
throw Exception("password_double_sha1_hex for user " + name + " has length " + toString(password_double_sha1_hex.size()) + " but must be exactly 40 symbols.", ErrorCodes::BAD_ARGUMENTS);
}
profile = config.getString(config_elem + ".profile");
quota = config.getString(config_elem + ".quota");

View File

@ -56,6 +56,7 @@ struct User
/// Required password. Could be stored in plaintext or in SHA256.
String password;
String password_sha256_hex;
String password_double_sha1_hex;
String profile;
String quota;

View File

@ -1,14 +1,15 @@
#include <Interpreters/UsersManager.h>
#include <Poco/Net/IPAddress.h>
#include <Poco/Util/AbstractConfiguration.h>
#include <Poco/String.h>
#include "config_core.h"
#include <Common/Exception.h>
#include <common/logger_useful.h>
#include <IO/HexWriteBuffer.h>
#include <IO/WriteBufferFromString.h>
#include <IO/WriteHelpers.h>
#include <common/logger_useful.h>
#include "config_core.h"
#include <Poco/Net/IPAddress.h>
#include <Poco/SHA1Engine.h>
#include <Poco/String.h>
#include <Poco/Util/AbstractConfiguration.h>
#if USE_SSL
# include <openssl/sha.h>
#endif
@ -93,6 +94,21 @@ UserPtr UsersManager::authorizeAndGetUser(
throw DB::Exception("SHA256 passwords support is disabled, because ClickHouse was built without SSL library", DB::ErrorCodes::SUPPORT_IS_DISABLED);
#endif
}
else if (!it->second->password_double_sha1_hex.empty())
{
Poco::SHA1Engine engine;
engine.update(password);
const auto & first_sha1 = engine.digest();
/// If it was MySQL compatibility server, then first_sha1 already contains double SHA1.
if (Poco::SHA1Engine::digestToHex(first_sha1) == it->second->password_double_sha1_hex)
return it->second;
engine.update(first_sha1.data(), first_sha1.size());
if (Poco::SHA1Engine::digestToHex(engine.digest()) != it->second->password_double_sha1_hex)
on_wrong_password();
}
else if (password != it->second->password)
{
on_wrong_password();
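
The stored secret is hex(SHA1(SHA1(password))) - the same scheme as MySQL's mysql_native_password, which is why a MySQL-protocol client that already hands over a single SHA1 matches on the first comparison above, while a plaintext password needs the second hashing. A standalone sketch of producing the stored digest with Poco, equivalent to the openssl pipeline quoted in the integration-test config further below:

#include <Poco/SHA1Engine.h>
#include <iostream>
#include <string>

/// hex(SHA1(SHA1(password))) - the value that goes into password_double_sha1_hex.
std::string doubleSha1Hex(const std::string & password)
{
    Poco::SHA1Engine engine;
    engine.update(password);
    const auto first = engine.digest(); /// SHA1(password), 20 raw bytes (copied)

    engine.reset(); /// a no-op if digest() already reset the engine, as the code above relies on
    engine.update(first.data(), first.size());
    return Poco::SHA1Engine::digestToHex(engine.digest()); /// 40 hex characters
}

int main()
{
    /// For "abacaba" this prints e395796d6546b1b65db9d665cd43f0e858dd4303,
    /// matching the test users.xml below.
    std::cout << doubleSha1Hex("abacaba") << '\n';
}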

View File

@ -56,7 +56,7 @@ target_link_libraries (expression_analyzer PRIVATE dbms clickhouse_storages_syst
add_check(expression_analyzer)
add_executable (users users.cpp)
target_link_libraries (users PRIVATE dbms clickhouse_common_config ${Boost_FILESYSTEM_LIBRARY})
target_link_libraries (users PRIVATE dbms clickhouse_common_config stdc++fs)
if (OS_LINUX)
add_executable (internal_iotop internal_iotop.cpp)

View File

@ -349,10 +349,9 @@ bool OPTIMIZE(1) CSVRowInputFormat::parseRowAndPrintDiagnosticInfo(MutableColumn
const auto & current_column_type = data_types[table_column];
const bool is_last_file_column =
file_column + 1 == column_indexes_for_input_fields.size();
const bool at_delimiter = *in.position() == delimiter;
const bool at_delimiter = !in.eof() && *in.position() == delimiter;
const bool at_last_column_line_end = is_last_file_column
&& (*in.position() == '\n' || *in.position() == '\r'
|| in.eof());
&& (in.eof() || *in.position() == '\n' || *in.position() == '\r');
auto & header = getPort().getHeader();
out << "Column " << file_column << ", " << std::string((file_column < 10 ? 2 : file_column < 100 ? 1 : 0), ' ')
@ -516,10 +515,9 @@ void CSVRowInputFormat::updateDiagnosticInfo()
bool CSVRowInputFormat::readField(IColumn & column, const DataTypePtr & type, bool is_last_file_column, size_t column_idx)
{
const bool at_delimiter = *in.position() == format_settings.csv.delimiter;
const bool at_delimiter = !in.eof() && *in.position() == format_settings.csv.delimiter;
const bool at_last_column_line_end = is_last_file_column
&& (*in.position() == '\n' || *in.position() == '\r'
|| in.eof());
&& (in.eof() || *in.position() == '\n' || *in.position() == '\r');
if (format_settings.csv.empty_as_default
&& (at_delimiter || at_last_column_line_end))
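
Both fixes reorder the short-circuit so that in.position() is never dereferenced once the buffer is exhausted; eof() is asked first (and, for a ReadBuffer, asking also pulls in the next chunk when more data exists). The safe pattern in isolation, over a ReadBuffer-like interface:

/// Dereferencing position() is only legal after eof() has returned false.
template <typename Buffer>
bool atChar(Buffer & in, char expected)
{
    return !in.eof() && *in.position() == expected; /// && guards the dereference
}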

View File

@ -1,14 +1,13 @@
#include "config_formats.h"
#include <Processors/Formats/Impl/CapnProtoRowInputFormat.h> // Y_IGNORE
#include "CapnProtoRowInputFormat.h"
#if USE_CAPNP
#include <IO/ReadBuffer.h>
#include <Interpreters/Context.h>
#include <Formats/FormatFactory.h>
#include <Formats/FormatSchemaInfo.h>
#include <capnp/serialize.h> // Y_IGNORE
#include <capnp/dynamic.h> // Y_IGNORE
#include <capnp/common.h> // Y_IGNORE
#include <capnp/serialize.h>
#include <capnp/dynamic.h>
#include <capnp/common.h>
#include <boost/algorithm/string.hpp>
#include <boost/range/join.hpp>
#include <common/logger_useful.h>

View File

@ -1,10 +1,10 @@
#pragma once
#include <Common/config.h>
#include "config_formats.h"
#if USE_CAPNP
#include <Core/Block.h>
#include <Processors/Formats/IRowInputFormat.h>
#include <capnp/schema-parser.h>
namespace DB

View File

@ -1,7 +1,6 @@
#include "config_formats.h"
#include <Processors/Formats/Impl/ParquetBlockInputFormat.h>
#include "ParquetBlockInputFormat.h"
#if USE_PARQUET
#include <algorithm>
#include <iterator>
#include <vector>

View File

@ -1,5 +1,4 @@
#include "config_formats.h"
#include <Processors/Formats/Impl/ParquetBlockOutputFormat.h>
#include "ParquetBlockOutputFormat.h"
#if USE_PARQUET

View File

@ -1,5 +1,4 @@
#include "config_formats.h"
#include <Processors/Formats/Impl/ProtobufRowInputFormat.h>
#include "ProtobufRowInputFormat.h"
#if USE_PROTOBUF
#include <Core/Block.h>

View File

@ -1,6 +1,6 @@
#pragma once
#include <Common/config.h>
#include "config_formats.h"
#if USE_PROTOBUF
#include <DataTypes/IDataType.h>

View File

@ -1,10 +1,8 @@
#include <Formats/FormatFactory.h>
#include "ProtobufRowOutputFormat.h"
#include "config_formats.h"
#if USE_PROTOBUF
#include <Processors/Formats/Impl/ProtobufRowOutputFormat.h>
#include <Core/Block.h>
#include <Formats/FormatSchemaInfo.h>
#include <Formats/ProtobufSchemas.h>

View File

@ -1,6 +1,6 @@
#pragma once
#include <Common/config.h>
#include "config_formats.h"
#if USE_PROTOBUF
#include <Core/Block.h>

View File

@ -25,25 +25,26 @@ IMergedBlockOutputStream::IMergedBlockOutputStream(
size_t aio_threshold_,
bool blocks_are_granules_size_,
const std::vector<MergeTreeIndexPtr> & indices_to_recalc,
const MergeTreeIndexGranularity & index_granularity_)
const MergeTreeIndexGranularity & index_granularity_,
const MergeTreeIndexGranularityInfo * index_granularity_info_)
: storage(storage_)
, part_path(part_path_)
, min_compress_block_size(min_compress_block_size_)
, max_compress_block_size(max_compress_block_size_)
, aio_threshold(aio_threshold_)
, marks_file_extension(storage.canUseAdaptiveGranularity() ? getAdaptiveMrkExtension() : getNonAdaptiveMrkExtension())
, can_use_adaptive_granularity(index_granularity_info_ ? index_granularity_info_->is_adaptive : storage.canUseAdaptiveGranularity())
, marks_file_extension(can_use_adaptive_granularity ? getAdaptiveMrkExtension() : getNonAdaptiveMrkExtension())
, blocks_are_granules_size(blocks_are_granules_size_)
, index_granularity(index_granularity_)
, compute_granularity(index_granularity.empty())
, codec(std::move(codec_))
, skip_indices(indices_to_recalc)
, with_final_mark(storage.settings.write_final_mark && storage.canUseAdaptiveGranularity())
, with_final_mark(storage.settings.write_final_mark && can_use_adaptive_granularity)
{
if (blocks_are_granules_size && !index_granularity.empty())
throw Exception("Can't take information about index granularity from blocks, when non empty index_granularity array specified", ErrorCodes::LOGICAL_ERROR);
}
void IMergedBlockOutputStream::addStreams(
const String & path,
const String & name,
@ -145,7 +146,7 @@ void IMergedBlockOutputStream::fillIndexGranularity(const Block & block)
blocks_are_granules_size,
index_offset,
index_granularity,
storage.canUseAdaptiveGranularity());
can_use_adaptive_granularity);
}
void IMergedBlockOutputStream::writeSingleMark(
@ -176,7 +177,7 @@ void IMergedBlockOutputStream::writeSingleMark(
writeIntBinary(stream.plain_hashing.count(), stream.marks);
writeIntBinary(stream.compressed.offset(), stream.marks);
if (storage.canUseAdaptiveGranularity())
if (can_use_adaptive_granularity)
writeIntBinary(number_of_rows, stream.marks);
}, path);
}
@ -362,7 +363,7 @@ void IMergedBlockOutputStream::calculateAndSerializeSkipIndices(
writeIntBinary(stream.compressed.offset(), stream.marks);
/// Actually these numbers are redundant, but we have to store them
/// to be compatible with the normal .mrk2 file format
if (storage.canUseAdaptiveGranularity())
if (can_use_adaptive_granularity)
writeIntBinary(1UL, stream.marks);
++skip_index_current_mark;

View File

@ -1,6 +1,7 @@
#pragma once
#include <Storages/MergeTree/MergeTreeIndexGranularity.h>
#include <Storages/MergeTree/MergeTreeIndexGranularityInfo.h>
#include <IO/WriteBufferFromFile.h>
#include <Compression/CompressedWriteBuffer.h>
#include <IO/HashingWriteBuffer.h>
@ -23,7 +24,8 @@ public:
size_t aio_threshold_,
bool blocks_are_granules_size_,
const std::vector<MergeTreeIndexPtr> & indices_to_recalc,
const MergeTreeIndexGranularity & index_granularity_);
const MergeTreeIndexGranularity & index_granularity_,
const MergeTreeIndexGranularityInfo * index_granularity_info_ = nullptr);
using WrittenOffsetColumns = std::set<std::string>;
@ -141,6 +143,7 @@ protected:
size_t current_mark = 0;
size_t skip_index_mark = 0;
const bool can_use_adaptive_granularity;
const std::string marks_file_extension;
const bool blocks_are_granules_size;
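
Note that can_use_adaptive_granularity is declared before marks_file_extension: C++ initializes members in declaration order, and the extension's initializer in the constructor reads the flag. The rule in isolation:

#include <string>

/// Members are initialized in declaration order, not in initializer-list order,
/// so 'adaptive' must be declared before the member whose initializer reads it.
struct MarksNaming
{
    const bool adaptive;
    const std::string ext;

    explicit MarksNaming(bool adaptive_)
        : adaptive(adaptive_)
        , ext(adaptive ? ".mrk2" : ".mrk") /// reads the already-initialized flag
    {}
};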

View File

@ -546,11 +546,13 @@ bool KeyCondition::tryPrepareSetIndex(
}
};
size_t left_args_count = 1;
const auto * left_arg_tuple = left_arg->as<ASTFunction>();
if (left_arg_tuple && left_arg_tuple->name == "tuple")
{
const auto & tuple_elements = left_arg_tuple->arguments->children;
for (size_t i = 0; i < tuple_elements.size(); ++i)
left_args_count = tuple_elements.size();
for (size_t i = 0; i < left_args_count; ++i)
get_key_tuple_position_mapping(tuple_elements[i], i);
}
else
@ -577,6 +579,10 @@ bool KeyCondition::tryPrepareSetIndex(
if (!prepared_set->hasExplicitSetElements())
return false;
prepared_set->checkColumnsNumber(left_args_count);
for (size_t i = 0; i < indexes_mapping.size(); ++i)
prepared_set->checkTypesEqual(indexes_mapping[i].tuple_index, removeLowCardinality(data_types[i]));
out.set_index = std::make_shared<MergeTreeSetIndex>(prepared_set->getSetElements(), std::move(indexes_mapping));
return true;
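
The new checks validate the prepared set against the number and the (Nullable/LowCardinality-stripped) types of the key columns actually used on the left of IN. The mapping passed to MergeTreeSetIndex records which primary key column each tuple element corresponds to; a sketch of its shape (field names are assumptions, values illustrative):

#include <cstddef>
#include <vector>

/// For a condition (b, a) IN (...) over a primary key (a, b, ...), tuple element 0
/// ('b') maps to key column 1 and tuple element 1 ('a') maps to key column 0.
struct KeyTuplePositionMapping
{
    size_t tuple_index; /// position inside the IN tuple
    size_t key_index;   /// position of the matching primary key column
};

std::vector<KeyTuplePositionMapping> example_mapping{{0, 1}, {1, 0}};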

View File

@ -1581,7 +1581,8 @@ void MergeTreeData::alterDataPart(
true /* skip_offsets */,
{},
unused_written_offsets,
part->index_granularity);
part->index_granularity,
&part->index_granularity_info);
in.readPrefix();
out.writePrefix();

View File

@ -934,6 +934,7 @@ MergeTreeData::MutableDataPartPtr MergeTreeDataMergerMutator::mutatePartToTempor
new_data_part->relative_path = "tmp_mut_" + future_part.name;
new_data_part->is_temp = true;
new_data_part->ttl_infos = source_part->ttl_infos;
new_data_part->index_granularity_info = source_part->index_granularity_info;
String new_part_tmp_path = new_data_part->getFullPath();
@ -1069,7 +1070,8 @@ MergeTreeData::MutableDataPartPtr MergeTreeDataMergerMutator::mutatePartToTempor
/* skip_offsets = */ false,
std::vector<MergeTreeIndexPtr>(indices_to_recalc.begin(), indices_to_recalc.end()),
unused_written_offsets,
source_part->index_granularity
source_part->index_granularity,
&source_part->index_granularity_info
);
in->readPrefix();

View File

@ -409,6 +409,8 @@ void MergeTreeDataPart::remove() const
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-variable"
#endif
std::shared_lock<std::shared_mutex> lock(columns_lock);
for (const auto & [file, _] : checksums.files)
{
String path_to_remove = to + "/" + file;

View File

@ -8,14 +8,16 @@ MergedColumnOnlyOutputStream::MergedColumnOnlyOutputStream(
CompressionCodecPtr default_codec_, bool skip_offsets_,
const std::vector<MergeTreeIndexPtr> & indices_to_recalc_,
WrittenOffsetColumns & already_written_offset_columns_,
const MergeTreeIndexGranularity & index_granularity_)
const MergeTreeIndexGranularity & index_granularity_,
const MergeTreeIndexGranularityInfo * index_granularity_info_)
: IMergedBlockOutputStream(
storage_, part_path_, storage_.global_context.getSettings().min_compress_block_size,
storage_.global_context.getSettings().max_compress_block_size, default_codec_,
storage_.global_context.getSettings().min_bytes_to_use_direct_io,
false,
indices_to_recalc_,
index_granularity_),
index_granularity_,
index_granularity_info_),
header(header_), sync(sync_), skip_offsets(skip_offsets_),
already_written_offset_columns(already_written_offset_columns_)
{

View File

@ -17,7 +17,8 @@ public:
CompressionCodecPtr default_codec_, bool skip_offsets_,
const std::vector<MergeTreeIndexPtr> & indices_to_recalc_,
WrittenOffsetColumns & already_written_offset_columns_,
const MergeTreeIndexGranularity & index_granularity_);
const MergeTreeIndexGranularity & index_granularity_,
const MergeTreeIndexGranularityInfo * index_granularity_info_ = nullptr);
Block getHeader() const override { return header; }
void write(const Block & block) override;

View File

@ -176,6 +176,22 @@ IColumn::Selector createSelector(const ClusterPtr cluster, const ColumnWithTypeA
throw Exception{"Sharding key expression does not evaluate to an integer type", ErrorCodes::TYPE_MISMATCH};
}
std::string makeFormattedListOfShards(const ClusterPtr & cluster)
{
std::ostringstream os;
bool head = true;
os << "[";
for (const auto & shard_info : cluster->getShardsInfo())
{
(head ? os : os << ", ") << shard_info.shard_num;
head = false;
}
os << "]";
return os.str();
}
}
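
makeFormattedListOfShards renders the shard numbers as a bracketed, comma-separated list for the debug message below; e.g. shards 1 and 3 come out as "[1, 3]". The same first-element-without-separator pattern as a standalone program:

#include <iostream>
#include <sstream>
#include <string>
#include <vector>

std::string formatList(const std::vector<int> & xs)
{
    std::ostringstream os;
    bool head = true;
    os << "[";
    for (int x : xs)
    {
        (head ? os : os << ", ") << x; /// no separator before the first element
        head = false;
    }
    os << "]";
    return os.str();
}

int main()
{
    std::cout << formatList({1, 3}) << '\n'; /// prints: [1, 3]
}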
@ -312,10 +328,23 @@ BlockInputStreams StorageDistributed::read(
if (settings.optimize_skip_unused_shards)
{
auto smaller_cluster = skipUnusedShards(cluster, query_info);
if (has_sharding_key)
{
auto smaller_cluster = skipUnusedShards(cluster, query_info);
if (smaller_cluster)
cluster = smaller_cluster;
if (smaller_cluster)
{
cluster = smaller_cluster;
LOG_DEBUG(log, "Reading from " << database_name << "." << table_name << ": "
"Skipping irrelevant shards - the query will be sent to the following shards of the cluster (shard numbers): "
" " << makeFormattedListOfShards(cluster));
}
else
{
LOG_DEBUG(log, "Reading from " << database_name << "." << table_name << ": "
"Unable to figure out irrelevant shards from WHERE/PREWHERE clauses - the query will be sent to all shards of the cluster");
}
}
}
return ClusterProxy::executeQuery(
@ -488,15 +517,32 @@ void StorageDistributed::ClusterNodeData::shutdownAndDropAllData()
}
/// Returns a new cluster with fewer shards if constant folding for `sharding_key_expr` is possible
/// using constraints from "WHERE" condition, otherwise returns `nullptr`
/// using constraints from "PREWHERE" and "WHERE" conditions, otherwise returns `nullptr`
ClusterPtr StorageDistributed::skipUnusedShards(ClusterPtr cluster, const SelectQueryInfo & query_info)
{
if (!has_sharding_key)
{
throw Exception("Internal error: cannot determine shards of a distributed table if no sharding expression is supplied", ErrorCodes::LOGICAL_ERROR);
}
const auto & select = query_info.query->as<ASTSelectQuery &>();
if (!select.where() || !sharding_key_expr)
if (!select.prewhere() && !select.where())
{
return nullptr;
}
const auto & blocks = evaluateExpressionOverConstantCondition(select.where(), sharding_key_expr);
ASTPtr condition_ast;
if (select.prewhere() && select.where())
{
condition_ast = makeASTFunction("and", select.prewhere()->clone(), select.where()->clone());
}
else
{
condition_ast = select.prewhere() ? select.prewhere()->clone() : select.where()->clone();
}
const auto blocks = evaluateExpressionOverConstantCondition(condition_ast, sharding_key_expr);
// Can't get a definite answer about whether we can skip any shards
if (!blocks)

View File

@ -1956,10 +1956,37 @@ void StorageReplicatedMergeTree::cloneReplica(const String & source_replica, Coo
}
/// Add to the queue jobs to receive all the active parts that the reference/master replica has.
Strings parts = zookeeper->getChildren(source_path + "/parts");
ActiveDataPartSet active_parts_set(format_version, parts);
Strings source_replica_parts = zookeeper->getChildren(source_path + "/parts");
ActiveDataPartSet active_parts_set(format_version, source_replica_parts);
Strings active_parts = active_parts_set.getParts();
/// Remove local parts if source replica does not have them, because such parts will never be fetched by other replicas.
Strings local_parts_in_zk = zookeeper->getChildren(replica_path + "/parts");
Strings parts_to_remove_from_zk;
for (const auto & part : local_parts_in_zk)
{
if (active_parts_set.getContainingPart(part).empty())
{
queue.remove(zookeeper, part);
parts_to_remove_from_zk.emplace_back(part);
LOG_WARNING(log, "Source replica does not have part " << part << ". Removing it from ZooKeeper.");
}
}
tryRemovePartsFromZooKeeperWithRetries(parts_to_remove_from_zk);
auto local_active_parts = getDataParts();
DataPartsVector parts_to_remove_from_working_set;
for (const auto & part : local_active_parts)
{
if (active_parts_set.getContainingPart(part->name).empty())
{
parts_to_remove_from_working_set.emplace_back(part);
LOG_WARNING(log, "Source replica does not have part " << part->name << ". Removing it from working set.");
}
}
removePartsFromWorkingSet(parts_to_remove_from_working_set, true);
for (const String & name : active_parts)
{
LogEntry log_entry;

View File

@ -114,7 +114,11 @@ void StorageSystemParts::processNextStorage(MutableColumns & columns_, const Sto
columns_[i++]->insert(part->stateString());
MinimalisticDataPartChecksums helper;
helper.computeTotalChecksums(part->checksums);
{
/// TODO: MergeTreeDataPart structure is too error-prone.
std::shared_lock<std::shared_mutex> lock(part->columns_lock);
helper.computeTotalChecksums(part->checksums);
}
auto checksum = helper.hash_of_all_files;
columns_[i++]->insert(getHexUIntLowercase(checksum.first) + getHexUIntLowercase(checksum.second));
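
system.parts now reads the checksums under the part's columns_lock, so a concurrent ALTER that rewrites the part cannot race with the snapshot. The reader/writer idiom in isolation (the mutex is a stand-in for MergeTreeDataPart::columns_lock):

#include <mutex>
#include <shared_mutex>

std::shared_mutex columns_lock;

void readSide()
{
    std::shared_lock<std::shared_mutex> lock(columns_lock); /// many readers at once
    /// ... read checksums / column metadata ...
}

void alterSide()
{
    std::unique_lock<std::shared_mutex> lock(columns_lock); /// writer excludes readers
    /// ... rewrite the part ...
}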

View File

@ -148,10 +148,9 @@ StoragesInfoStream::StoragesInfoStream(const SelectQueryInfo & query_info, const
StoragesInfo StoragesInfoStream::next()
{
StoragesInfo info;
while (next_row < rows)
{
StoragesInfo info;
info.database = (*database_column)[next_row].get<String>();
info.table = (*table_column)[next_row].get<String>();
@ -198,10 +197,10 @@ StoragesInfo StoragesInfoStream::next()
if (!info.data)
throw Exception("Unknown engine " + info.engine, ErrorCodes::LOGICAL_ERROR);
break;
return info;
}
return info;
return {};
}
BlockInputStreams StorageSystemPartsBase::read(

View File

@ -288,6 +288,17 @@ def test_mixed_granularity_single_node(start_dynamic_cluster, node):
node.exec_in_container(["bash", "-c", "find {p} -name '*.mrk' | grep '.*'".format(p=path_to_old_part)]) # check that we have non-adaptive files
node.query("ALTER TABLE table_with_default_granularity UPDATE dummy = dummy + 1 WHERE 1")
# still works
assert node.query("SELECT count() from table_with_default_granularity") == '6\n'
node.query("ALTER TABLE table_with_default_granularity MODIFY COLUMN dummy String")
node.query("ALTER TABLE table_with_default_granularity ADD COLUMN dummy2 Float64")
# still works
assert node.query("SELECT count() from table_with_default_granularity") == '6\n'
def test_version_update_two_nodes(start_dynamic_cluster):
node11.query("INSERT INTO table_with_default_granularity VALUES (toDate('2018-10-01'), 1, 333), (toDate('2018-10-02'), 2, 444)")
node12.query("SYSTEM SYNC REPLICA table_with_default_granularity")

View File

@ -0,0 +1,19 @@
<yandex>
<remote_servers>
<test_cluster>
<shard>
<internal_replication>true</internal_replication>
<replica>
<default_database>shard_0</default_database>
<host>node1</host>
<port>9000</port>
</replica>
<replica>
<default_database>shard_0</default_database>
<host>node2</host>
<port>9000</port>
</replica>
</shard>
</test_cluster>
</remote_servers>
</yandex>

View File

@ -0,0 +1,60 @@
import pytest
from helpers.cluster import ClickHouseCluster
from helpers.network import PartitionManager
from helpers.test_tools import assert_eq_with_retry
def fill_nodes(nodes, shard):
for node in nodes:
node.query(
'''
CREATE DATABASE test;
CREATE TABLE test_table(date Date, id UInt32)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/test{shard}/replicated', '{replica}')
ORDER BY id PARTITION BY toYYYYMM(date)
SETTINGS min_replicated_logs_to_keep=3, max_replicated_logs_to_keep=5, cleanup_delay_period=0, cleanup_delay_period_random_add=0;
'''.format(shard=shard, replica=node.name))
cluster = ClickHouseCluster(__file__)
node1 = cluster.add_instance('node1', main_configs=['configs/remote_servers.xml'], with_zookeeper=True)
node2 = cluster.add_instance('node2', main_configs=['configs/remote_servers.xml'], with_zookeeper=True)
@pytest.fixture(scope="module")
def start_cluster():
try:
cluster.start()
fill_nodes([node1, node2], 1)
yield cluster
except Exception as ex:
print ex
finally:
cluster.shutdown()
def test_inconsistent_parts_if_drop_while_replica_not_active(start_cluster):
with PartitionManager() as pm:
# insert into all replicas
for i in range(50):
node1.query("INSERT INTO test_table VALUES ('2019-08-16', {})".format(i))
assert_eq_with_retry(node2, "SELECT count(*) FROM test_table", node1.query("SELECT count(*) FROM test_table"))
# disable network on the first replica
pm.partition_instances(node1, node2)
pm.drop_instance_zk_connections(node1)
# drop all parts on the second replica
node2.query_with_retry("ALTER TABLE test_table DROP PARTITION 201908")
assert_eq_with_retry(node2, "SELECT count(*) FROM test_table", "0")
# insert into the second replica
# DROP_RANGE will be removed from the replication log and the first replica will be lost
for i in range(50):
node2.query("INSERT INTO test_table VALUES ('2019-08-16', {})".format(50 + i))
# the first replica will be cloned from the second
pm.heal_all()
assert_eq_with_retry(node1, "SELECT count(*) FROM test_table", node2.query("SELECT count(*) FROM test_table"))

View File

@ -0,0 +1,5 @@
FROM node:8
RUN npm install mysql
COPY ./test.js test.js

View File

@ -0,0 +1,8 @@
version: '2.2'
services:
mysqljs1:
build:
context: ./
network: host
# to keep container running
command: sleep infinity

View File

@ -0,0 +1,21 @@
var mysql = require('mysql');
var connection = mysql.createConnection({
host : process.argv[2],
port : process.argv[3],
user : process.argv[4],
password : process.argv[5],
database : 'system',
});
connection.connect();
connection.query('SELECT 1 + 1 AS solution', function (error, results, fields) {
if (error) throw error;
if (results[0].solution.toString() !== '2') {
throw Error('Wrong result of a query. Expected: "2", received: ' + results[0].solution + '.')
}
});
connection.end();

View File

@ -14,6 +14,26 @@
<profile>default</profile>
<quota>default</quota>
</default>
<user_with_double_sha1>
<!-- echo -n abacaba | openssl dgst -sha1 -binary | openssl dgst -sha1 !-->
<password_double_sha1_hex>e395796d6546b1b65db9d665cd43f0e858dd4303</password_double_sha1_hex>
<networks incl="networks" replace="replace">
<ip>::/0</ip>
</networks>
<profile>default</profile>
<quota>default</quota>
</user_with_double_sha1>
<user_with_empty_password>
<password></password>
<networks incl="networks" replace="replace">
<ip>::/0</ip>
</networks>
<profile>default</profile>
<quota>default</quota>
</user_with_empty_password>
</users>
<quotas>

View File

@ -50,8 +50,22 @@ def php_container():
yield docker.from_env().containers.get(cluster.project_name + '_php1_1')
@pytest.fixture(scope='module')
def nodejs_container():
docker_compose = os.path.join(SCRIPT_DIR, 'clients', 'mysqljs', 'docker_compose.yml')
subprocess.check_call(['docker-compose', '-p', cluster.project_name, '-f', docker_compose, 'up', '--no-recreate', '-d', '--build'])
yield docker.from_env().containers.get(cluster.project_name + '_mysqljs1_1')
def test_mysql_client(mysql_client, server_address):
# type: (Container, str) -> None
code, (stdout, stderr) = mysql_client.exec_run('''
mysql --protocol tcp -h {host} -P {port} default -u user_with_double_sha1 --password=abacaba
-e "SELECT 1;"
'''.format(host=server_address, port=server_port), demux=True)
assert stdout == '\n'.join(['1', '1', ''])
code, (stdout, stderr) = mysql_client.exec_run('''
mysql --protocol tcp -h {host} -P {port} default -u default --password=123
-e "SELECT 1 as a;"
@ -149,10 +163,26 @@ def test_golang_client(server_address, golang_container):
def test_php_client(server_address, php_container):
# type: (str, Container) -> None
code, (stdout, stderr) = php_container.exec_run('php -f test.php {host} {port} default 123 '.format(host=server_address, port=server_port), demux=True)
code, (stdout, stderr) = php_container.exec_run('php -f test.php {host} {port} default 123'.format(host=server_address, port=server_port), demux=True)
assert code == 0
assert stdout == 'tables\n'
code, (stdout, stderr) = php_container.exec_run('php -f test_ssl.php {host} {port} default 123 '.format(host=server_address, port=server_port), demux=True)
code, (stdout, stderr) = php_container.exec_run('php -f test_ssl.php {host} {port} default 123'.format(host=server_address, port=server_port), demux=True)
assert code == 0
assert stdout == 'tables\n'
def test_mysqljs_client(server_address, nodejs_container):
code, (_, stderr) = nodejs_container.exec_run('node test.js {host} {port} default 123'.format(host=server_address, port=server_port), demux=True)
assert code == 1
assert 'MySQL is requesting the sha256_password authentication method, which is not supported.' in stderr
code, (_, stderr) = nodejs_container.exec_run('node test.js {host} {port} user_with_empty_password ""'.format(host=server_address, port=server_port), demux=True)
assert code == 1
assert 'MySQL is requesting the sha256_password authentication method, which is not supported.' in stderr
code, (_, _) = nodejs_container.exec_run('node test.js {host} {port} user_with_double_sha1 abacaba'.format(host=server_address, port=server_port), demux=True)
assert code == 0
code, (_, _) = nodejs_container.exec_run('node test.js {host} {port} user_with_empty_password 123'.format(host=server_address, port=server_port), demux=True)
assert code == 1

View File

@ -1,18 +1,19 @@
<test>
<type>once</type>
<type>loop</type>
<stop_conditions>
<all_of>
<iterations>3</iterations>
<min_time_not_changing_for_ms>10000</min_time_not_changing_for_ms>
</all_of>
<any_of>
<average_speed_not_changing_for_ms>1000</average_speed_not_changing_for_ms>
<total_time_ms>2000</total_time_ms>
<iterations>5</iterations>
<total_time_ms>60000</total_time_ms>
</any_of>
</stop_conditions>
<main_metric>
<max_rows_per_second />
<max_bytes_per_second />
<avg_rows_per_second />
<avg_bytes_per_second />
<min_time/>
</main_metric>
<substitutions>
@ -34,18 +35,18 @@
<allow_simdjson>0</allow_simdjson>
</settings>
<query>SELECT 'rapidjson-1', count() FROM system.numbers WHERE NOT ignore(JSONExtractString(materialize({json}), 'sparam'))</query>
<query>SELECT 'rapidjson-2', count() FROM system.numbers WHERE NOT ignore(JSONExtractString(materialize({json}), 'sparam', 'nested_1'))</query>
<query>SELECT 'rapidjson-3', count() FROM system.numbers WHERE NOT ignore(JSONExtractInt(materialize({json}), 'nparam'))</query>
<query>SELECT 'rapidjson-4', count() FROM system.numbers WHERE NOT ignore(JSONExtractUInt(materialize({json}), 'nparam'))</query>
<query>SELECT 'rapidjson-5', count() FROM system.numbers WHERE NOT ignore(JSONExtractFloat(materialize({json}), 'fparam'))</query>
<query>SELECT 'rapidjson-6', count() FROM system.numbers WHERE NOT ignore(JSONExtractString(materialize({long_json}), 'sparam'))</query>
<query>SELECT 'rapidjson-7', count() FROM system.numbers WHERE NOT ignore(JSONExtractString(materialize({long_json}), 'sparam', 'nested_1'))</query>
<query>SELECT 'rapidjson-8', count() FROM system.numbers WHERE NOT ignore(JSONExtractInt(materialize({long_json}), 'nparam'))</query>
<query>SELECT 'rapidjson-9', count() FROM system.numbers WHERE NOT ignore(JSONExtractUInt(materialize({long_json}), 'nparam'))</query>
<query>SELECT 'rapidjson-10', count() FROM system.numbers WHERE NOT ignore(JSONExtractRaw(materialize({long_json}), 'fparam'))</query>
<query>SELECT 'rapidjson-11', count() FROM system.numbers WHERE NOT ignore(JSONExtractFloat(materialize({long_json}), 'fparam'))</query>
<query>SELECT 'rapidjson-12', count() FROM system.numbers WHERE NOT ignore(JSONExtractFloat(materialize({long_json}), 'fparam', 'nested_2', -2))</query>
<query>SELECT 'rapidjson-13', count() FROM system.numbers WHERE NOT ignore(JSONExtractBool(materialize({long_json}), 'bparam'))</query>
<query>SELECT 'rapidjson-1', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractString(materialize({json}), 'sparam'))</query>
<query>SELECT 'rapidjson-2', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractString(materialize({json}), 'sparam', 'nested_1'))</query>
<query>SELECT 'rapidjson-3', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractInt(materialize({json}), 'nparam'))</query>
<query>SELECT 'rapidjson-4', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractUInt(materialize({json}), 'nparam'))</query>
<query>SELECT 'rapidjson-5', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractFloat(materialize({json}), 'fparam'))</query>
<query>SELECT 'rapidjson-6', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractString(materialize({long_json}), 'sparam'))</query>
<query>SELECT 'rapidjson-7', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractString(materialize({long_json}), 'sparam', 'nested_1'))</query>
<query>SELECT 'rapidjson-8', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractInt(materialize({long_json}), 'nparam'))</query>
<query>SELECT 'rapidjson-9', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractUInt(materialize({long_json}), 'nparam'))</query>
<query>SELECT 'rapidjson-10', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractRaw(materialize({long_json}), 'fparam'))</query>
<query>SELECT 'rapidjson-11', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractFloat(materialize({long_json}), 'fparam'))</query>
<query>SELECT 'rapidjson-12', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractFloat(materialize({long_json}), 'fparam', 'nested_2', -2))</query>
<query>SELECT 'rapidjson-13', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractBool(materialize({long_json}), 'bparam'))</query>
</test>

View File

@ -1,23 +1,20 @@
<test>
<type>once</type>
<type>loop</type>
<preconditions>
<cpu>AVX2</cpu>
</preconditions>
<stop_conditions>
<all_of>
<iterations>3</iterations>
<min_time_not_changing_for_ms>10000</min_time_not_changing_for_ms>
</all_of>
<any_of>
<iterations>5</iterations>
<total_time_ms>60000</total_time_ms>
</any_of>
</stop_conditions>
<stop_conditions>
<any_of>
<average_speed_not_changing_for_ms>1000</average_speed_not_changing_for_ms>
<total_time_ms>2000</total_time_ms>
</any_of>
</stop_conditions>
<main_metric>
<max_rows_per_second />
<max_bytes_per_second />
<avg_rows_per_second />
<avg_bytes_per_second />
</main_metric>
<main_metric>
<min_time/>
</main_metric>
<substitutions>
<substitution>
@ -38,19 +35,19 @@
<allow_simdjson>1</allow_simdjson>
</settings>
<query>SELECT 'simdjson-1', count() FROM system.numbers WHERE NOT ignore(JSONExtractString(materialize({json}), 'sparam'))</query>
<query>SELECT 'simdjson-2', count() FROM system.numbers WHERE NOT ignore(JSONExtractString(materialize({json}), 'sparam', 'nested_1'))</query>
<query>SELECT 'simdjson-3', count() FROM system.numbers WHERE NOT ignore(JSONExtractInt(materialize({json}), 'nparam'))</query>
<query>SELECT 'simdjson-4', count() FROM system.numbers WHERE NOT ignore(JSONExtractUInt(materialize({json}), 'nparam'))</query>
<query>SELECT 'simdjson-5', count() FROM system.numbers WHERE NOT ignore(JSONExtractFloat(materialize({json}), 'fparam'))</query>
<query>SELECT 'simdjson-6', count() FROM system.numbers WHERE NOT ignore(JSONExtractString(materialize({long_json}), 'sparam'))</query>
<query>SELECT 'simdjson-7', count() FROM system.numbers WHERE NOT ignore(JSONExtractString(materialize({long_json}), 'sparam', 'nested_1'))</query>
<query>SELECT 'simdjson-8', count() FROM system.numbers WHERE NOT ignore(JSONExtractInt(materialize({long_json}), 'nparam'))</query>
<query>SELECT 'simdjson-9', count() FROM system.numbers WHERE NOT ignore(JSONExtractUInt(materialize({long_json}), 'nparam'))</query>
<query>SELECT 'simdjson-10', count() FROM system.numbers WHERE NOT ignore(JSONExtractRaw(materialize({long_json}), 'fparam'))</query>
<query>SELECT 'simdjson-11', count() FROM system.numbers WHERE NOT ignore(JSONExtractFloat(materialize({long_json}), 'fparam'))</query>
<query>SELECT 'simdjson-12', count() FROM system.numbers WHERE NOT ignore(JSONExtractFloat(materialize({long_json}), 'fparam', 'nested_2', -2))</query>
<query>SELECT 'simdjson-13', count() FROM system.numbers WHERE NOT ignore(JSONExtractBool(materialize({long_json}), 'bparam'))</query>
<query>SELECT 'simdjson-1', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractString(materialize({json}), 'sparam'))</query>
<query>SELECT 'simdjson-2', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractString(materialize({json}), 'sparam', 'nested_1'))</query>
<query>SELECT 'simdjson-3', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractInt(materialize({json}), 'nparam'))</query>
<query>SELECT 'simdjson-4', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractUInt(materialize({json}), 'nparam'))</query>
<query>SELECT 'simdjson-5', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractFloat(materialize({json}), 'fparam'))</query>
<query>SELECT 'simdjson-6', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractString(materialize({long_json}), 'sparam'))</query>
<query>SELECT 'simdjson-7', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractString(materialize({long_json}), 'sparam', 'nested_1'))</query>
<query>SELECT 'simdjson-8', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractInt(materialize({long_json}), 'nparam'))</query>
<query>SELECT 'simdjson-9', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractUInt(materialize({long_json}), 'nparam'))</query>
<query>SELECT 'simdjson-10', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractRaw(materialize({long_json}), 'fparam'))</query>
<query>SELECT 'simdjson-11', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractFloat(materialize({long_json}), 'fparam'))</query>
<query>SELECT 'simdjson-12', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractFloat(materialize({long_json}), 'fparam', 'nested_2', -2))</query>
<query>SELECT 'simdjson-13', count() FROM numbers(1000000) WHERE NOT ignore(JSONExtractBool(materialize({long_json}), 'bparam'))</query>
</test>

View File

@ -9,7 +9,6 @@ select 3 = windowFunnel(10000)(timestamp, event = 1000, event = 1001, event = 10
select 4 = windowFunnel(10000)(timestamp, event = 1000, event = 1001, event = 1002, event = 1008) from funnel_test;
select 1 = windowFunnel(1)(timestamp, event = 1000) from funnel_test;
select 3 = windowFunnel(2)(timestamp, event = 1003, event = 1004, event = 1005, event = 1006, event = 1007) from funnel_test;
select 4 = windowFunnel(3)(timestamp, event = 1003, event = 1004, event = 1005, event = 1006, event = 1007) from funnel_test;
@ -39,6 +38,16 @@ select 1 = windowFunnel(10000)(timestamp, event = 1008, event = 1001) from funne
select 5 = windowFunnel(4)(timestamp, event = 1003, event = 1004, event = 1005, event = 1006, event = 1007) from funnel_test_u64;
select 4 = windowFunnel(4)(timestamp, event <= 1007, event >= 1002, event <= 1006, event >= 1004) from funnel_test_u64;
drop table if exists funnel_test_strict;
create table funnel_test_strict (timestamp UInt32, event UInt32) engine=Memory;
insert into funnel_test_strict values (00,1000),(10,1001),(20,1002),(30,1003),(40,1004),(50,1005),(51,1005),(60,1006),(70,1007),(80,1008);
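-- event 1005 occurs twice (t=50 and t=51): in 'strict' mode the repeated value keeps the funnel at level 6, while the default mode reaches level 7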
select 6 = windowFunnel(10000, 'strict')(timestamp, event = 1000, event = 1001, event = 1002, event = 1003, event = 1004, event = 1005, event = 1006) from funnel_test_strict;
select 7 = windowFunnel(10000)(timestamp, event = 1000, event = 1001, event = 1002, event = 1003, event = 1004, event = 1005, event = 1006) from funnel_test_strict;
drop table funnel_test;
drop table funnel_test2;
drop table funnel_test_u64;
drop table funnel_test_strict;

View File

@ -0,0 +1,15 @@
OK
OK
1
OK
0
1
4
4
2
4
OK
OK
OK
OK
OK

View File

@ -0,0 +1,100 @@
#!/usr/bin/env bash
CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
. $CURDIR/../shell_config.sh
${CLICKHOUSE_CLIENT} --query "DROP TABLE IF EXISTS distributed_00754;"
${CLICKHOUSE_CLIENT} --query "DROP TABLE IF EXISTS mergetree_00754;"
${CLICKHOUSE_CLIENT} --query "
CREATE TABLE mergetree_00754 (a Int64, b Int64, c String) ENGINE = MergeTree ORDER BY (a, b);
"
${CLICKHOUSE_CLIENT} --query "
CREATE TABLE distributed_00754 AS mergetree_00754
ENGINE = Distributed(test_unavailable_shard, ${CLICKHOUSE_DATABASE}, mergetree_00754, jumpConsistentHash(a+b, 2));
"
${CLICKHOUSE_CLIENT} --query "INSERT INTO mergetree_00754 VALUES (0, 0, 'Hello');"
${CLICKHOUSE_CLIENT} --query "INSERT INTO mergetree_00754 VALUES (1, 0, 'World');"
${CLICKHOUSE_CLIENT} --query "INSERT INTO mergetree_00754 VALUES (0, 1, 'Hello');"
${CLICKHOUSE_CLIENT} --query "INSERT INTO mergetree_00754 VALUES (1, 1, 'World');"
# Should fail because the second shard is unavailable
${CLICKHOUSE_CLIENT} --query "SELECT count(*) FROM distributed_00754;" 2>&1 \
| fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'
# Should fail without setting `optimize_skip_unused_shards` = 1
${CLICKHOUSE_CLIENT} --query "SELECT count(*) FROM distributed_00754 PREWHERE a = 0 AND b = 0;" 2>&1 \
| fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'
# Should pass now
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM distributed_00754 PREWHERE a = 0 AND b = 0;
"
# Should still fail because of matching unavailable shard
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM distributed_00754 PREWHERE a = 2 AND b = 2;
" 2>&1 \ | fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'
# Try more complex expressions for constant folding - all should pass.
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM distributed_00754 PREWHERE a = 1 AND a = 0 WHERE b = 0;
"
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM distributed_00754 PREWHERE a = 1 WHERE b = 1 AND length(c) = 5;
"
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM distributed_00754 PREWHERE a IN (0, 1) AND b IN (0, 1) WHERE c LIKE '%l%';
"
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM distributed_00754 PREWHERE a IN (0, 1) WHERE b IN (0, 1) AND c LIKE '%l%';
"
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM distributed_00754 PREWHERE a = 0 AND b = 0 OR a = 1 AND b = 1 WHERE c LIKE '%l%';
"
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM distributed_00754 PREWHERE (a = 0 OR a = 1) WHERE (b = 0 OR b = 1);
"
# These should fail.
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM distributed_00754 PREWHERE a = 0 AND b <= 1;
" 2>&1 \ | fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM distributed_00754 PREWHERE a = 0 WHERE c LIKE '%l%';
" 2>&1 \ | fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM distributed_00754 PREWHERE a = 0 OR a = 1 AND b = 0;
" 2>&1 \ | fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM distributed_00754 PREWHERE a = 0 AND b = 0 OR a = 2 AND b = 2;
" 2>&1 \ | fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'
${CLICKHOUSE_CLIENT} -n --query="
SET optimize_skip_unused_shards = 1;
SELECT count(*) FROM distributed_00754 PREWHERE a = 0 AND b = 0 OR c LIKE '%l%';
" 2>&1 \ | fgrep -q "All connection tries failed" && echo 'OK' || echo 'FAIL'

View File

@ -1 +1 @@
SELECT DISTINCT JSONExtractRaw(concat('{"x":', rand() % 2 ? 'true' : 'false', '}'), 'x') AS res FROM numbers(100000) ORDER BY res;
SELECT DISTINCT JSONExtractRaw(concat('{"x":', rand() % 2 ? 'true' : 'false', '}'), 'x') AS res FROM numbers(1000000) ORDER BY res;

View File

@ -0,0 +1,6 @@
[249.25,499.5,749.75,899.9,949.9499999999999,989.99,998.999]
[249.75,499.5,749.25,899.1,949.05,989.01,998.001]
[250,500,750,900,950,990,999]
599.6
599.4
600

View File

@ -0,0 +1,12 @@
DROP TABLE IF EXISTS num;
CREATE TABLE num AS numbers(1000);
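-- CREATE TABLE AS table_function(): num takes its structure from numbers(1000) and serves the same rows 0..999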
SELECT quantilesExactExclusive(0.25, 0.5, 0.75, 0.9, 0.95, 0.99, 0.999)(x) FROM (SELECT number AS x FROM num);
SELECT quantilesExactInclusive(0.25, 0.5, 0.75, 0.9, 0.95, 0.99, 0.999)(x) FROM (SELECT number AS x FROM num);
SELECT quantilesExact(0.25, 0.5, 0.75, 0.9, 0.95, 0.99, 0.999)(x) FROM (SELECT number AS x FROM num);
SELECT quantileExactExclusive(0.6)(x) FROM (SELECT number AS x FROM num);
SELECT quantileExactInclusive(0.6)(x) FROM (SELECT number AS x FROM num);
SELECT quantileExact(0.6)(x) FROM (SELECT number AS x FROM num);
DROP TABLE num;
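The reference values for level `0.6` in the expected output above (599.6, 599.4, 600) can be checked by hand with the standard exclusive/inclusive percentile formulas (a sketch; x = 0..999, N = 1000, 1-based order statistics):

```latex
h_{\mathrm{exc}} = p(N+1) = 0.6 \cdot 1001 = 600.6
  \Rightarrow x_{(600)} + 0.6\,(x_{(601)} - x_{(600)}) = 599 + 0.6 = 599.6
h_{\mathrm{inc}} = p(N-1) + 1 = 0.6 \cdot 999 + 1 = 600.4
  \Rightarrow x_{(600)} + 0.4\,(x_{(601)} - x_{(600)}) = 599 + 0.4 = 599.4
```

`quantileExact` interpolates nothing and returns a single element of the set, here `600`.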

View File

@ -0,0 +1,7 @@
OK1
OK2
OK3
OK4
OK5
2019-08-11 world
2019-08-12 hello

View File

@ -0,0 +1,21 @@
#!/usr/bin/env bash
CURDIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
. $CURDIR/../shell_config.sh
$CLICKHOUSE_CLIENT --query="DROP TABLE IF EXISTS bug";
$CLICKHOUSE_CLIENT --query="CREATE TABLE bug (d Date, s String) ENGINE = MergeTree(d, s, 8192)";
$CLICKHOUSE_CLIENT --query="INSERT INTO bug VALUES ('2019-08-09', 'hello'), ('2019-08-10', 'world'), ('2019-08-11', 'world'), ('2019-08-12', 'hello')";
#SET force_primary_key = 1;
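# Each of the five queries below is expected to fail; the grep checks for the exact error text and prints OKn on a match.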
$CLICKHOUSE_CLIENT --query="SELECT * FROM bug WHERE (s, d) IN (SELECT (s, max(d)) FROM bug GROUP BY s) ORDER BY d" 2>&1 | grep "Number of columns in section IN doesn't match" > /dev/null && echo "OK1";
$CLICKHOUSE_CLIENT --query="SELECT * FROM bug WHERE (s, d, s) IN (SELECT s, max(d) FROM bug GROUP BY s)" 2>&1 | grep "Number of columns in section IN doesn't match" > /dev/null && echo "OK2";
$CLICKHOUSE_CLIENT --query="SELECT * FROM bug WHERE (s, d) IN (SELECT s, max(d), s FROM bug GROUP BY s)" 2>&1 | grep "Number of columns in section IN doesn't match" > /dev/null && echo "OK3";
$CLICKHOUSE_CLIENT --query="SELECT * FROM bug WHERE (s, toDateTime(d)) IN (SELECT s, max(d) FROM bug GROUP BY s)" 2>&1 | grep "Types of column 2 in section IN don't match" > /dev/null && echo "OK4";
$CLICKHOUSE_CLIENT --query="SELECT * FROM bug WHERE (s, d) IN (SELECT s, toDateTime(max(d)) FROM bug GROUP BY s)" 2>&1 | grep "Types of column 2 in section IN don't match" > /dev/null && echo "OK5";
$CLICKHOUSE_CLIENT --query="SELECT * FROM bug WHERE (s, d) IN (SELECT s, max(d) FROM bug GROUP BY s) ORDER BY d";
$CLICKHOUSE_CLIENT --query="DROP TABLE bug";

View File

@ -7,9 +7,9 @@
"docker/test/performance": "yandex/clickhouse-performance-test",
"docker/test/pvs": "yandex/clickhouse-pvs-test",
"docker/test/stateful": "yandex/clickhouse-stateful-test",
"docker/test/stateful_with_coverage": "yandex/clickhouse-stateful-with-coverage-test",
"docker/test/stateful_with_coverage": "yandex/clickhouse-stateful-test-with-coverage",
"docker/test/stateless": "yandex/clickhouse-stateless-test",
"docker/test/stateless_with_coverage": "yandex/clickhouse-stateless-with-coverage-test",
"docker/test/stateless_with_coverage": "yandex/clickhouse-stateless-test-with-coverage",
"docker/test/unit": "yandex/clickhouse-unit-test",
"docker/test/stress": "yandex/clickhouse-stress-test",
"dbms/tests/integration/image": "yandex/clickhouse-integration-tests-runner"

View File

@ -1,7 +1,7 @@
# docker build -t yandex/clickhouse-stateful-test .
FROM yandex/clickhouse-stateless-test
RUN echo "deb [trusted=yes] http://apt.llvm.org/bionic/ llvm-toolchain-bionic main" >> /etc/apt/sources.list
RUN echo "deb [trusted=yes] http://apt.llvm.org/bionic/ llvm-toolchain-bionic-9 main" >> /etc/apt/sources.list
RUN apt-get update -y \
&& env DEBIAN_FRONTEND=noninteractive \

View File

@ -1,7 +1,7 @@
# docker build -t yandex/clickhouse-stateless-with-coverage-test .
FROM yandex/clickhouse-deb-builder
RUN echo "deb [trusted=yes] http://apt.llvm.org/bionic/ llvm-toolchain-bionic main" >> /etc/apt/sources.list
RUN echo "deb [trusted=yes] http://apt.llvm.org/bionic/ llvm-toolchain-bionic-9 main" >> /etc/apt/sources.list
RUN apt-get update -y \
&& env DEBIAN_FRONTEND=noninteractive \

View File

@ -1,9 +1,6 @@
# Formats for input and output data {#formats}
ClickHouse can accept and return data in various formats. A format supported
for input can be used to parse the data provided to `INSERT`s, to perform
`SELECT`s from a file-backed table such as File, URL or HDFS, or to read an
external dictionary. A format supported for output can be used to arrange the
ClickHouse can accept and return data in various formats. A format supported for input can be used to parse the data provided to `INSERT`s, to perform `SELECT`s from a file-backed table such as File, URL or HDFS, or to read an external dictionary. A format supported for output can be used to arrange the
results of a `SELECT`, and to perform `INSERT`s into a file-backed table.
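For instance (a minimal sketch, not taken from the documentation; the table `t` is hypothetical), the format name appears directly in the query:

```sql
-- parse CSV data provided to an INSERT
INSERT INTO t FORMAT CSV
1,"hello"

-- arrange the results of a SELECT as JSON
SELECT * FROM t FORMAT JSON
```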
The supported formats are:
@ -388,7 +385,7 @@ Unlike the [JSON](#json) format, there is no substitution of invalid UTF-8 seque
### Usage of Nested Structures {#jsoneachrow-nested}
If you have a table with the [Nested](../data_types/nested_data_structures/nested.md) data type columns, you can insert JSON data having the same structure. Enable this functionality with the [input_format_import_nested_json](../operations/settings/settings.md#settings-input_format_import_nested_json) setting.
If you have a table with [Nested](../data_types/nested_data_structures/nested.md) data type columns, you can insert JSON data with the same structure. Enable this feature with the [input_format_import_nested_json](../operations/settings/settings.md#settings-input_format_import_nested_json) setting.
For example, consider the following table:
@ -396,13 +393,13 @@ For example, consider the following table:
CREATE TABLE json_each_row_nested (n Nested (s String, i Int32) ) ENGINE = Memory
```
As you can find in the `Nested` data type description, ClickHouse treats each component of the nested structure as a separate column, `n.s` and `n.i` for our table. So you can insert the data the following way:
As you can see in the `Nested` data type description, ClickHouse treats each component of the nested structure as a separate column (`n.s` and `n.i` for our table). You can insert data in the following way:
```sql
INSERT INTO json_each_row_nested FORMAT JSONEachRow {"n.s": ["abc", "def"], "n.i": [1, 23]}
```
To insert data as hierarchical JSON object set [input_format_import_nested_json=1](../operations/settings/settings.md#settings-input_format_import_nested_json).
To insert data as a hierarchical JSON object, set [input_format_import_nested_json=1](../operations/settings/settings.md#settings-input_format_import_nested_json).
```json
{
@ -413,7 +410,7 @@ To insert data as hierarchical JSON object set [input_format_import_nested_json=
}
```
Without this setting ClickHouse throws the exception.
Without this setting, ClickHouse throws an exception.
```sql
SELECT name, value FROM system.settings WHERE name = 'input_format_import_nested_json'

View File

@ -238,7 +238,7 @@ Default value: 0.
## input_format_import_nested_json {#settings-input_format_import_nested_json}
Enables or disables inserting of JSON data with nested objects.
Enables or disables the insertion of JSON data with nested objects.
Supported formats:
@ -275,7 +275,7 @@ Default value: 1.
## date_time_input_format {#settings-date_time_input_format}
Enables or disables extended parsing of date and time formatted strings.
Allows choosing a parser for the text representation of date and time.
The setting doesn't apply to [date and time functions](../../query_language/functions/date_time_functions.md).
@ -283,11 +283,13 @@ Possible values:
- `'best_effort'` — Enables extended parsing.
ClickHouse can parse the basic format `YYYY-MM-DD HH:MM:SS` and all the [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) date and time formats. For example, `'2018-06-08T01:02:03.000Z'`.
ClickHouse can parse the basic `YYYY-MM-DD HH:MM:SS` format and all [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) date and time formats. For example, `'2018-06-08T01:02:03.000Z'`.
- `'basic'` — Use basic parser.
ClickHouse can parse only the basic format.
ClickHouse can parse only the basic `YYYY-MM-DD HH:MM:SS` format. For example, `'2019-08-20 10:18:56'`.
Default value: `'basic'`.
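A quick way to see the difference between the two parsers (a sketch; the table is hypothetical and the `Memory` engine is used only for brevity):

```sql
CREATE TABLE dt (d DateTime) ENGINE = Memory;

SET date_time_input_format = 'best_effort';
INSERT INTO dt FORMAT CSV
"2018-06-08T01:02:03.000Z"

-- with date_time_input_format = 'basic', the same INSERT would fail to parse,
-- since only the YYYY-MM-DD HH:MM:SS form is accepted
```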
**See Also**

View File

@ -45,7 +45,10 @@ CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
[SETTINGS name=value, ...]
```
For descriptions of request parameters, see the [request description](../../query_language/create.md).
For a description of parameters, see the [CREATE query description](../../query_language/create.md).
!!! note "Note"
`INDEX` is an experimental feature, see [Data Skipping Indexes](#table_engine-mergetree-data_skipping-indexes).
### Query Clauses
@ -236,7 +239,7 @@ ClickHouse cannot use an index if the values of the primary key in the query par
ClickHouse uses this logic not only for days of the month sequences, but for any primary key that represents a partially-monotonic sequence.
### Data Skipping Indices (Experimental)
### Data Skipping Indexes (Experimental) {#table_engine-mergetree-data_skipping-indexes}
To use indices, set `allow_experimental_data_skipping_indices` to 1 (run `SET allow_experimental_data_skipping_indices = 1`).
@ -295,6 +298,14 @@ SELECT count() FROM table WHERE u64 * i32 == 10 AND u64 * length(s) >= 1234
The same as `ngrambf_v1`, but stores tokens instead of ngrams. Tokens are sequences separated by non-alphanumeric characters.
- `bloom_filter([false_positive])` — Stores [bloom filter](https://en.wikipedia.org/wiki/Bloom_filter) for the specified columns.
The optional `false_positive` parameter is the probability of a false-positive response from the filter. Possible values: (0, 1). Default value: 0.025.
Supported data types: `Int*`, `UInt*`, `Float*`, `Enum`, `Date`, `DateTime`, `String`, `FixedString`.
Supported for the following functions: [equals](../../query_language/functions/comparison_functions.md), [notEquals](../../query_language/functions/comparison_functions.md), [in](../../query_language/functions/in_functions.md), [notIn](../../query_language/functions/in_functions.md).
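A hypothetical index definition using this type (a sketch; the column names follow the examples below):

```sql
INDEX sample_index3 (u64, str) TYPE bloom_filter(0.01) GRANULARITY 4
```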
```sql
INDEX sample_index (u64 * length(s)) TYPE minmax GRANULARITY 4
INDEX sample_index2 (u64 * length(str), i32 + f64 * 100, date, str) TYPE set(100) GRANULARITY 4

Some files were not shown because too many files have changed in this diff.