diff --git a/CHANGELOG.md b/CHANGELOG.md index 23062ae1e73..ed71baf8046 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,3 +1,36 @@ +## ClickHouse release 18.1.0, 2018-07-23 + +### New features: + +* Support for the `ALTER TABLE t DELETE WHERE` query for non-replicated MergeTree tables ([#2634](https://github.com/yandex/ClickHouse/pull/2634)). +* Support for arbitrary types for the `uniq*` family of aggregate functions ([#2010](https://github.com/yandex/ClickHouse/issues/2010)). +* Support for arbitrary types in comparison operators ([#2026](https://github.com/yandex/ClickHouse/issues/2026)). +* The `users.xml` file allows setting a subnet mask in the format `10.0.0.1/255.255.255.0`. This is necessary for using masks for IPv6 networks with zeros in the middle ([#2637](https://github.com/yandex/ClickHouse/pull/2637)). +* Added the `arrayDistinct` function ([#2670](https://github.com/yandex/ClickHouse/pull/2670)). +* The SummingMergeTree engine can now work with AggregateFunction type columns ([Constantin S. Pan](https://github.com/yandex/ClickHouse/pull/2566)). + +### Improvements: + +* Changed the numbering scheme for release versions. Now the first part contains the year of release (A.D., Moscow timezone, minus 2000), the second part contains the number for major changes (increases for most releases), and the third part is the patch version. Releases are still backwards compatible, unless otherwise stated in the changelog. +* Faster conversions of floating-point numbers to a string ([Amos Bird](https://github.com/yandex/ClickHouse/pull/2664)). +* If some rows were skipped during an insert due to parsing errors (this is possible with the `input_allow_errors_num` and `input_allow_errors_ratio` settings enabled), the number of skipped rows is now written to the server log ([Leonardo Cecchi](https://github.com/yandex/ClickHouse/pull/2669)). + +### Bug fixes: + +* Fixed the TRUNCATE command for temporary tables ([Amos Bird](https://github.com/yandex/ClickHouse/pull/2624)). +* Fixed a rare deadlock in the ZooKeeper client library that occurred when there was a network error while reading the response ([c315200](https://github.com/yandex/ClickHouse/commit/c315200e64b87e44bdf740707fc857d1fdf7e947)). +* Fixed an error during a CAST to Nullable types ([#1322](https://github.com/yandex/ClickHouse/issues/1322)). +* Fixed the incorrect result of the `maxIntersection()` function when the boundaries of intervals coincided ([Michael Furmur](https://github.com/yandex/ClickHouse/pull/2657)). +* Fixed incorrect transformation of the OR expression chain in a function argument ([chenxing-xc](https://github.com/yandex/ClickHouse/pull/2663)). +* Fixed performance degradation for queries containing `IN (subquery)` expressions inside another subquery ([#2571](https://github.com/yandex/ClickHouse/issues/2571)). +* Fixed incompatibility between servers with different versions in distributed queries that use a `CAST` function that isn't in uppercase letters ([fe8c4d6](https://github.com/yandex/ClickHouse/commit/fe8c4d64e434cacd4ceef34faa9005129f2190a5)). +* Added missing quoting of identifiers for queries to an external DBMS ([#2635](https://github.com/yandex/ClickHouse/issues/2635)). + +### Backward incompatible changes: + +* Converting a string containing the number zero to DateTime does not work. Example: `SELECT toDateTime('0')`. This is also the reason that `DateTime DEFAULT '0'` does not work in tables, as well as `0` in dictionaries. Solution: replace `0` with `0000-00-00 00:00:00`. 
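For readers decoding the scheme described above: in `18.1.0`, the first component is the release year minus 2000, so 18 maps back to 2018. A throwaway sketch of that arithmetic (illustrative code, not part of this patch):

```cpp
#include <cstdio>

// Decompose a version under the new scheme: <year - 2000>.<major changes>.<patch>.
// For this release, 18.1.0 => year 2018, first major release of the year, patch 0.
int main()
{
    int year_part = 0, major_changes = 0, patch = 0;
    if (std::sscanf("18.1.0", "%d.%d.%d", &year_part, &major_changes, &patch) == 3)
        std::printf("year: %d, major: %d, patch: %d\n", 2000 + year_part, major_changes, patch);
}
```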
+ + ## ClickHouse release 1.1.54394, 2018-07-12 ### New features: diff --git a/CHANGELOG_RU.md b/CHANGELOG_RU.md index 5282e10a556..8150e1f5a57 100644 --- a/CHANGELOG_RU.md +++ b/CHANGELOG_RU.md @@ -1,3 +1,32 @@ +## ClickHouse release 18.1.0, 2018-07-23 + +### New features: +* Support for the `ALTER TABLE t DELETE WHERE` query for non-replicated MergeTree tables ([#2634](https://github.com/yandex/ClickHouse/pull/2634)). +* Support for arbitrary types for the `uniq*` family of aggregate functions ([#2010](https://github.com/yandex/ClickHouse/issues/2010)). +* Support for arbitrary types in comparison operators ([#2026](https://github.com/yandex/ClickHouse/issues/2026)). +* The ability to specify a subnet mask in `users.xml` in the format `10.0.0.1/255.255.255.0`. This is necessary for using "holey" masks for IPv6 networks, i.e. masks with zeros in the middle ([#2637](https://github.com/yandex/ClickHouse/pull/2637)). +* Added the `arrayDistinct` function ([#2670](https://github.com/yandex/ClickHouse/pull/2670)). +* The SummingMergeTree engine can now work with AggregateFunction type columns ([Constantin S. Pan](https://github.com/yandex/ClickHouse/pull/2566)). + +### Improvements: +* Changed the release versioning scheme. Now the first component contains the year of the release (A.D.; Moscow timezone; minus 2000), the second one contains the number for major changes (it increases for most releases), and the third one is the patch version. Releases are still backwards compatible, unless stated otherwise in the changelog. +* Faster conversion of floating-point numbers to a string ([Amos Bird](https://github.com/yandex/ClickHouse/pull/2664)). +* Now, if some rows were skipped during an insert due to parsing errors (this is possible with the `input_allow_errors_num` and `input_allow_errors_ratio` settings enabled), this number is written to the server log ([Leonardo Cecchi](https://github.com/yandex/ClickHouse/pull/2669)). + +### Bug fixes: +* Fixed the TRUNCATE command for temporary tables ([Amos Bird](https://github.com/yandex/ClickHouse/pull/2624)). +* Fixed a rare deadlock in the ZooKeeper client library that occurred when there was a network error while reading the response ([c315200](https://github.com/yandex/ClickHouse/commit/c315200e64b87e44bdf740707fc857d1fdf7e947)). +* Fixed an error during a CAST to Nullable types ([#1322](https://github.com/yandex/ClickHouse/issues/1322)). +* Fixed the incorrect result of the `maxIntersection()` function when the boundaries of intervals coincided ([Michael Furmur](https://github.com/yandex/ClickHouse/pull/2657)). +* Fixed incorrect transformation of an OR expression chain in a function argument ([chenxing-xc](https://github.com/yandex/ClickHouse/pull/2663)). +* Fixed performance degradation for queries containing an `IN (subquery)` expression inside another subquery ([#2571](https://github.com/yandex/ClickHouse/issues/2571)). +* Fixed incompatibility between servers of different versions in distributed queries that use a `CAST` function that isn't in uppercase letters ([fe8c4d6](https://github.com/yandex/ClickHouse/commit/fe8c4d64e434cacd4ceef34faa9005129f2190a5)). +* Added missing quoting of identifiers for queries to external DBMSs ([#2635](https://github.com/yandex/ClickHouse/issues/2635)). + +### Backward incompatible changes: +* Converting a string containing the number zero to DateTime does not work. Example: `SELECT toDateTime('0')`. For the same reason, `DateTime DEFAULT '0'` does not work in tables, nor does `0` in dictionaries. Solution: replace `0` with `0000-00-00 00:00:00`.
+ + ## ClickHouse release 1.1.54394, 2018-07-12 ### New features: diff --git a/README.md b/README.md index 905e6e5ba90..8cb9fa3379e 100644 --- a/README.md +++ b/README.md @@ -2,6 +2,8 @@ ClickHouse is an open-source column-oriented database management system that allows generating analytical data reports in real time. +[![Build Status](https://travis-ci.org/yandex/ClickHouse.svg?branch=master)](https://travis-ci.org/yandex/ClickHouse) + ## Useful links * [Official website](https://clickhouse.yandex/) has quick high-level overview of ClickHouse on main page. @@ -9,5 +11,3 @@ ClickHouse is an open-source column-oriented database management system that all * [Documentation](https://clickhouse.yandex/docs/en/) provides more in-depth information. * [Contacts](https://clickhouse.yandex/#contacts) can help to get your questions answered if there are any. - -[![Build Status](https://travis-ci.org/yandex/ClickHouse.svg?branch=master)](https://travis-ci.org/yandex/ClickHouse) diff --git a/cmake/find_llvm.cmake b/cmake/find_llvm.cmake index a398255d5d7..b10a8cb87d4 100644 --- a/cmake/find_llvm.cmake +++ b/cmake/find_llvm.cmake @@ -26,10 +26,10 @@ if (ENABLE_EMBEDDED_COMPILER) if (LLVM_FOUND) find_library (LLD_LIBRARY_TEST lldCore PATHS ${LLVM_LIBRARY_DIRS}) - find_path (LLD_INCLUDE_DIR_TEST NAMES lld/Common/Driver.h PATHS ${LLVM_INCLUDE_DIRS}) + find_path (LLD_INCLUDE_DIR_TEST NAMES lld/Core/AbsoluteAtom.h PATHS ${LLVM_INCLUDE_DIRS}) if (NOT LLD_LIBRARY_TEST OR NOT LLD_INCLUDE_DIR_TEST) set (LLVM_FOUND 0) - message(WARNING "liblld not found in ${LLVM_INCLUDE_DIRS} ${LLVM_LIBRARY_DIRS}. Disabling internal compiler.") + message(WARNING "liblld (${LLD_LIBRARY_TEST}, ${LLD_INCLUDE_DIR_TEST}) not found in ${LLVM_INCLUDE_DIRS} ${LLVM_LIBRARY_DIRS}. Disabling internal compiler.") endif () endif () diff --git a/dbms/cmake/version.cmake b/dbms/cmake/version.cmake index 131e6f26aaa..9b8de90f6b7 100644 --- a/dbms/cmake/version.cmake +++ b/dbms/cmake/version.cmake @@ -1,11 +1,11 @@ # This strings autochanged from release_lib.sh: -set(VERSION_REVISION 54396 CACHE STRING "") +set(VERSION_REVISION 54397 CACHE STRING "") set(VERSION_MAJOR 18 CACHE STRING "") -set(VERSION_MINOR 1 CACHE STRING "") +set(VERSION_MINOR 2 CACHE STRING "") set(VERSION_PATCH 0 CACHE STRING "") -set(VERSION_GITHASH 550f41bc65cb03201acad489e7b96ea346ed8259 CACHE STRING "") -set(VERSION_DESCRIBE v18.1.0-testing CACHE STRING "") -set(VERSION_STRING 18.1.0 CACHE STRING "") +set(VERSION_GITHASH 6ad677d7d6961a0c9088ccd9eff55779cfdaa654 CACHE STRING "") +set(VERSION_DESCRIBE v18.2.0-testing CACHE STRING "") +set(VERSION_STRING 18.2.0 CACHE STRING "") # end of autochange set(VERSION_EXTRA "" CACHE STRING "") diff --git a/dbms/programs/obfuscator/Obfuscator.cpp b/dbms/programs/obfuscator/Obfuscator.cpp index 854771b3b26..3ba6d76179e 100644 --- a/dbms/programs/obfuscator/Obfuscator.cpp +++ b/dbms/programs/obfuscator/Obfuscator.cpp @@ -58,13 +58,13 @@ It is designed to retain the following properties of data: Most of the properties above are viable for performance testing: - reading data, filtering, aggregation and sorting will work at almost the same speed - as on original data due to saved cardinalities, magnitudes, compression ratios, etc. + as on original data due to saved cardinalities, magnitudes, compression ratios, etc. It works in deterministic fashion: you define a seed value and transform is totally determined by input data and by seed.
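The determinism claim above means the obfuscator's output is a pure function of (seed, input data). A toy illustration of such a keyed per-value transform (this is not the actual obfuscator algorithm, only the property it guarantees):

```cpp
#include <cstdint>
#include <functional>
#include <iostream>
#include <random>

// Output depends only on (seed, value): rerunning with the same seed
// reproduces the same mapping, which is the property described above.
uint64_t keyed_transform(uint64_t seed, uint64_t value)
{
    std::mt19937_64 generator(seed ^ std::hash<uint64_t>{}(value));
    return generator();
}

int main()
{
    std::cout << keyed_transform(42, 123) << "\n";  // prints the same value
    std::cout << keyed_transform(42, 123) << "\n";  // on every run
}
```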
Some transforms are one to one and could be reversed, so you need to have large enough seed and keep it in secret. It use some cryptographic primitives to transform data, but from the cryptographic point of view, - it doesn't do anything properly and you should never consider the result as secure, unless you have other reasons for it. + it doesn't do anything properly and you should never consider the result as secure, unless you have other reasons for it. It may retain some data you don't want to publish. @@ -74,7 +74,7 @@ So, the user will be able to count exact ratio of mobile traffic. Another example, suppose you have some private data in your table, like user email and you don't want to publish any single email address. If your table is large enough and contain multiple different emails and there is no email that have very high frequency than all others, - it will perfectly anonymize all data. But if you have small amount of different values in a column, it can possibly reproduce some of them. + it will perfectly anonymize all data. But if you have small amount of different values in a column, it can possibly reproduce some of them. And you should take care and look at exact algorithm, how this tool works, and probably fine tune some of it command line parameters. This tool works fine only with reasonable amount of data (at least 1000s of rows). diff --git a/dbms/programs/server/InterserverIOHTTPHandler.cpp b/dbms/programs/server/InterserverIOHTTPHandler.cpp index 3cdbaa69b64..39d214503ba 100644 --- a/dbms/programs/server/InterserverIOHTTPHandler.cpp +++ b/dbms/programs/server/InterserverIOHTTPHandler.cpp @@ -1,3 +1,4 @@ +#include #include #include @@ -23,14 +24,40 @@ namespace ErrorCodes extern const int TOO_MANY_SIMULTANEOUS_QUERIES; } +std::pair InterserverIOHTTPHandler::checkAuthentication(Poco::Net::HTTPServerRequest & request) const +{ + const auto & config = server.config(); + + if (config.has("interserver_http_credentials.user")) + { + if (!request.hasCredentials()) + return {"Server requires HTTP Basic authentication, but client doesn't provide it", false}; + String scheme, info; + request.getCredentials(scheme, info); + + if (scheme != "Basic") + return {"Server requires HTTP Basic authentication but client provides another method", false}; + + String user = config.getString("interserver_http_credentials.user"); + String password = config.getString("interserver_http_credentials.password", ""); + + Poco::Net::HTTPBasicCredentials credentials(info); + if (std::make_pair(user, password) != std::make_pair(credentials.getUsername(), credentials.getPassword())) + return {"Incorrect user or password in HTTP Basic authentication", false}; + } + else if (request.hasCredentials()) + { + return {"Client provides HTTP Basic authentication, but server doesn't require it", false}; + } + return {"", true}; +} + void InterserverIOHTTPHandler::processQuery(Poco::Net::HTTPServerRequest & request, Poco::Net::HTTPServerResponse & response) { HTMLForm params(request); LOG_TRACE(log, "Request URI: " << request.getURI()); - /// NOTE: You can do authentication here if you need to.
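checkAuthentication() above relies on Poco's Basic-auth helpers: getCredentials() splits the Authorization header into scheme and base64 payload, and HTTPBasicCredentials decodes the payload into user and password. A minimal round-trip sketch using only those Poco calls (the values are illustrative):

```cpp
#include <Poco/Net/HTTPBasicCredentials.h>
#include <Poco/Net/HTTPRequest.h>
#include <iostream>
#include <string>

int main()
{
    // Sending side: attach "Authorization: Basic <base64(user:password)>".
    Poco::Net::HTTPRequest request(Poco::Net::HTTPRequest::HTTP_GET, "/");
    Poco::Net::HTTPBasicCredentials sender("replica_user", "secret");
    sender.authenticate(request);

    // Receiving side, mirroring checkAuthentication(): split the header,
    // check the scheme, then decode the base64 payload.
    std::string scheme, info;
    request.getCredentials(scheme, info);
    Poco::Net::HTTPBasicCredentials receiver(info);
    std::cout << scheme << ": " << receiver.getUsername()
              << " / " << receiver.getPassword() << "\n";  // Basic: replica_user / secret
}
```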
- String endpoint_name = params.get("endpoint"); bool compress = params.get("compress") == "true"; @@ -65,8 +92,18 @@ void InterserverIOHTTPHandler::handleRequest(Poco::Net::HTTPServerRequest & requ try { - processQuery(request, response); - LOG_INFO(log, "Done processing query"); + if (auto [msg, success] = checkAuthentication(request); success) + { + processQuery(request, response); + LOG_INFO(log, "Done processing query"); + } + else + { + response.setStatusAndReason(Poco::Net::HTTPServerResponse::HTTP_UNAUTHORIZED); + if (!response.sent()) + response.send() << msg << std::endl; + LOG_WARNING(log, "Query processing failed request: '" << request.getURI() << "' authentication failed"); + } } catch (Exception & e) { diff --git a/dbms/programs/server/InterserverIOHTTPHandler.h b/dbms/programs/server/InterserverIOHTTPHandler.h index bf9fef59982..fbaf432d4f9 100644 --- a/dbms/programs/server/InterserverIOHTTPHandler.h +++ b/dbms/programs/server/InterserverIOHTTPHandler.h @@ -34,6 +34,8 @@ private: CurrentMetrics::Increment metric_increment{CurrentMetrics::InterserverConnection}; void processQuery(Poco::Net::HTTPServerRequest & request, Poco::Net::HTTPServerResponse & response); + + std::pair checkAuthentication(Poco::Net::HTTPServerRequest & request) const; }; } diff --git a/dbms/programs/server/Server.cpp b/dbms/programs/server/Server.cpp index 9a3db8bdb12..c4b0c77a026 100644 --- a/dbms/programs/server/Server.cpp +++ b/dbms/programs/server/Server.cpp @@ -230,6 +230,17 @@ int Server::main(const std::vector & /*args*/) global_context->setInterserverIOAddress(this_host, port); } + if (config().has("interserver_http_credentials")) + { + String user = config().getString("interserver_http_credentials.user", ""); + String password = config().getString("interserver_http_credentials.password", ""); + + if (user.empty()) + throw Exception("Configuration parameter interserver_http_credentials.user can't be empty", ErrorCodes::NO_ELEMENTS_IN_CONFIG); + + global_context->setInterserverCredentials(user, password); + } + if (config().has("macros")) global_context->setMacros(std::make_unique(config(), "macros")); diff --git a/dbms/src/AggregateFunctions/AggregateFunctionArray.cpp b/dbms/src/AggregateFunctions/AggregateFunctionArray.cpp index f42c5b6d142..9cb7d03bf69 100644 --- a/dbms/src/AggregateFunctions/AggregateFunctionArray.cpp +++ b/dbms/src/AggregateFunctions/AggregateFunctionArray.cpp @@ -18,6 +18,9 @@ public: DataTypes transformArguments(const DataTypes & arguments) const override { + if (0 == arguments.size()) + throw Exception("-Array aggregate functions require at least one argument", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH); + DataTypes nested_arguments; for (const auto & type : arguments) { diff --git a/dbms/src/AggregateFunctions/AggregateFunctionBitwise.cpp b/dbms/src/AggregateFunctions/AggregateFunctionBitwise.cpp index 762baf2451b..8c188bcbb8e 100644 --- a/dbms/src/AggregateFunctions/AggregateFunctionBitwise.cpp +++ b/dbms/src/AggregateFunctions/AggregateFunctionBitwise.cpp @@ -38,9 +38,9 @@ void registerAggregateFunctionsBitwise(AggregateFunctionFactory & factory) factory.registerFunction("groupBitXor", createAggregateFunctionBitwise); /// Aliases for compatibility with MySQL.
- factory.registerFunction("BIT_OR", createAggregateFunctionBitwise, AggregateFunctionFactory::CaseInsensitive); - factory.registerFunction("BIT_AND", createAggregateFunctionBitwise, AggregateFunctionFactory::CaseInsensitive); - factory.registerFunction("BIT_XOR", createAggregateFunctionBitwise, AggregateFunctionFactory::CaseInsensitive); + factory.registerAlias("BIT_OR", "groupBitOr", AggregateFunctionFactory::CaseInsensitive); + factory.registerAlias("BIT_AND", "groupBitAnd", AggregateFunctionFactory::CaseInsensitive); + factory.registerAlias("BIT_XOR", "groupBitXor", AggregateFunctionFactory::CaseInsensitive); } } diff --git a/dbms/src/AggregateFunctions/AggregateFunctionFactory.cpp b/dbms/src/AggregateFunctions/AggregateFunctionFactory.cpp index eca854a031b..353b5a213b3 100644 --- a/dbms/src/AggregateFunctions/AggregateFunctionFactory.cpp +++ b/dbms/src/AggregateFunctions/AggregateFunctionFactory.cpp @@ -78,11 +78,12 @@ AggregateFunctionPtr AggregateFunctionFactory::get( AggregateFunctionPtr AggregateFunctionFactory::getImpl( - const String & name, + const String & name_param, const DataTypes & argument_types, const Array & parameters, int recursion_level) const { + String name = getAliasToOrName(name_param); /// Find by exact match. auto it = aggregate_functions.find(name); if (it != aggregate_functions.end()) @@ -103,8 +104,8 @@ AggregateFunctionPtr AggregateFunctionFactory::getImpl( if (AggregateFunctionCombinatorPtr combinator = AggregateFunctionCombinatorFactory::instance().tryFindSuffix(name)) { - if (combinator->getName() == "Null") - throw Exception("Aggregate function combinator 'Null' is only for internal usage", ErrorCodes::UNKNOWN_AGGREGATE_FUNCTION); + if (combinator->isForInternalUsageOnly()) + throw Exception("Aggregate function combinator '" + combinator->getName() + "' is only for internal usage", ErrorCodes::UNKNOWN_AGGREGATE_FUNCTION); String nested_name = name.substr(0, name.size() - combinator->getName().size()); DataTypes nested_types = combinator->transformArguments(argument_types); @@ -126,10 +127,11 @@ AggregateFunctionPtr AggregateFunctionFactory::tryGet(const String & name, const bool AggregateFunctionFactory::isAggregateFunctionName(const String & name, int recursion_level) const { - if (aggregate_functions.count(name)) + if (aggregate_functions.count(name) || isAlias(name)) return true; - if (recursion_level == 0 && case_insensitive_aggregate_functions.count(Poco::toLower(name))) + String name_lowercase = Poco::toLower(name); + if (recursion_level == 0 && (case_insensitive_aggregate_functions.count(name_lowercase) || isAlias(name_lowercase))) return true; if (AggregateFunctionCombinatorPtr combinator = AggregateFunctionCombinatorFactory::instance().tryFindSuffix(name)) diff --git a/dbms/src/AggregateFunctions/AggregateFunctionFactory.h b/dbms/src/AggregateFunctions/AggregateFunctionFactory.h index bc36e76c11f..92598e52509 100644 --- a/dbms/src/AggregateFunctions/AggregateFunctionFactory.h +++ b/dbms/src/AggregateFunctions/AggregateFunctionFactory.h @@ -1,6 +1,7 @@ #pragma once #include +#include #include @@ -20,27 +21,18 @@ class IDataType; using DataTypePtr = std::shared_ptr; using DataTypes = std::vector; +/** Creator have arguments: name of aggregate function, types of arguments, values of parameters. + * Parameters are for "parametric" aggregate functions. + * For example, in quantileWeighted(0.9)(x, weight), 0.9 is "parameter" and x, weight are "arguments". 
+ */ +using AggregateFunctionCreator = std::function; + /** Creates an aggregate function by name. */ -class AggregateFunctionFactory final : public ext::singleton +class AggregateFunctionFactory final : public ext::singleton, public IFactoryWithAliases { - friend class StorageSystemFunctions; - public: - /** Creator have arguments: name of aggregate function, types of arguments, values of parameters. - * Parameters are for "parametric" aggregate functions. - * For example, in quantileWeighted(0.9)(x, weight), 0.9 is "parameter" and x, weight are "arguments". - */ - using Creator = std::function; - - /// For compatibility with SQL, it's possible to specify that certain aggregate function name is case insensitive. - enum CaseSensitiveness - { - CaseSensitive, - CaseInsensitive - }; - /// Register a function by its name. /// No locking, you must register all functions before usage of get. void registerFunction( @@ -77,6 +69,13 @@ private: /// Case insensitive aggregate functions will be additionally added here with lowercased name. AggregateFunctions case_insensitive_aggregate_functions; + + const AggregateFunctions & getCreatorMap() const override { return aggregate_functions; } + + const AggregateFunctions & getCaseInsensitiveCreatorMap() const override { return case_insensitive_aggregate_functions; } + + String getFactoryName() const override { return "AggregateFunctionFactory"; } + }; } diff --git a/dbms/src/AggregateFunctions/AggregateFunctionNull.cpp b/dbms/src/AggregateFunctions/AggregateFunctionNull.cpp index 46a46a2370a..6ce7d94d970 100644 --- a/dbms/src/AggregateFunctions/AggregateFunctionNull.cpp +++ b/dbms/src/AggregateFunctions/AggregateFunctionNull.cpp @@ -18,6 +18,8 @@ class AggregateFunctionCombinatorNull final : public IAggregateFunctionCombinato public: String getName() const override { return "Null"; } + bool isForInternalUsageOnly() const override { return true; } + DataTypes transformArguments(const DataTypes & arguments) const override { size_t size = arguments.size(); diff --git a/dbms/src/AggregateFunctions/AggregateFunctionQuantile.cpp b/dbms/src/AggregateFunctions/AggregateFunctionQuantile.cpp index 250ee422e8b..62455af6353 100644 --- a/dbms/src/AggregateFunctions/AggregateFunctionQuantile.cpp +++ b/dbms/src/AggregateFunctions/AggregateFunctionQuantile.cpp @@ -93,30 +93,14 @@ void registerAggregateFunctionsQuantile(AggregateFunctionFactory & factory) createAggregateFunctionQuantile); /// 'median' is an alias for 'quantile' - - factory.registerFunction("median", - createAggregateFunctionQuantile); - - factory.registerFunction("medianDeterministic", - createAggregateFunctionQuantile); - - factory.registerFunction("medianExact", - createAggregateFunctionQuantile); - - factory.registerFunction("medianExactWeighted", - createAggregateFunctionQuantile); - - factory.registerFunction("medianTiming", - createAggregateFunctionQuantile); - - factory.registerFunction("medianTimingWeighted", - createAggregateFunctionQuantile); - - factory.registerFunction("medianTDigest", - createAggregateFunctionQuantile); - - factory.registerFunction("medianTDigestWeighted", - createAggregateFunctionQuantile); + factory.registerAlias("median", NameQuantile::name); + factory.registerAlias("medianDeterministic", NameQuantileDeterministic::name); + factory.registerAlias("medianExact", NameQuantileExact::name); + factory.registerAlias("medianExactWeighted", NameQuantileExactWeighted::name); + factory.registerAlias("medianTiming", NameQuantileTiming::name); + 
factory.registerAlias("medianTimingWeighted", NameQuantileTimingWeighted::name); + factory.registerAlias("medianTDigest", NameQuantileTDigest::name); + factory.registerAlias("medianTDigestWeighted", NameQuantileTDigestWeighted::name); } } diff --git a/dbms/src/AggregateFunctions/AggregateFunctionWindowFunnel.h b/dbms/src/AggregateFunctions/AggregateFunctionWindowFunnel.h index 4ad0400d160..b62755ef00c 100644 --- a/dbms/src/AggregateFunctions/AggregateFunctionWindowFunnel.h +++ b/dbms/src/AggregateFunctions/AggregateFunctionWindowFunnel.h @@ -116,7 +116,7 @@ struct AggregateFunctionWindowFunnelData /// TODO Protection against huge size events_list.clear(); - events_list.resize(size); + events_list.reserve(size); UInt32 timestamp; UInt8 event; diff --git a/dbms/src/AggregateFunctions/AggregateFunctionsStatisticsSimple.cpp b/dbms/src/AggregateFunctions/AggregateFunctionsStatisticsSimple.cpp index 089ea59cd79..c42372187bc 100644 --- a/dbms/src/AggregateFunctions/AggregateFunctionsStatisticsSimple.cpp +++ b/dbms/src/AggregateFunctions/AggregateFunctionsStatisticsSimple.cpp @@ -56,12 +56,12 @@ void registerAggregateFunctionsStatisticsSimple(AggregateFunctionFactory & facto factory.registerFunction("corr", createAggregateFunctionStatisticsBinary, AggregateFunctionFactory::CaseInsensitive); /// Synonims for compatibility. - factory.registerFunction("VAR_SAMP", createAggregateFunctionStatisticsUnary, AggregateFunctionFactory::CaseInsensitive); - factory.registerFunction("VAR_POP", createAggregateFunctionStatisticsUnary, AggregateFunctionFactory::CaseInsensitive); - factory.registerFunction("STDDEV_SAMP", createAggregateFunctionStatisticsUnary, AggregateFunctionFactory::CaseInsensitive); - factory.registerFunction("STDDEV_POP", createAggregateFunctionStatisticsUnary, AggregateFunctionFactory::CaseInsensitive); - factory.registerFunction("COVAR_SAMP", createAggregateFunctionStatisticsBinary, AggregateFunctionFactory::CaseInsensitive); - factory.registerFunction("COVAR_POP", createAggregateFunctionStatisticsBinary, AggregateFunctionFactory::CaseInsensitive); + factory.registerAlias("VAR_SAMP", "varSamp", AggregateFunctionFactory::CaseInsensitive); + factory.registerAlias("VAR_POP", "varPop", AggregateFunctionFactory::CaseInsensitive); + factory.registerAlias("STDDEV_SAMP", "stddevSamp", AggregateFunctionFactory::CaseInsensitive); + factory.registerAlias("STDDEV_POP", "stddevPop", AggregateFunctionFactory::CaseInsensitive); + factory.registerAlias("COVAR_SAMP", "covarSamp", AggregateFunctionFactory::CaseInsensitive); + factory.registerAlias("COVAR_POP", "covarPop", AggregateFunctionFactory::CaseInsensitive); } } diff --git a/dbms/src/AggregateFunctions/IAggregateFunctionCombinator.h b/dbms/src/AggregateFunctions/IAggregateFunctionCombinator.h index ba28026b1cd..0ac9a3d41cd 100644 --- a/dbms/src/AggregateFunctions/IAggregateFunctionCombinator.h +++ b/dbms/src/AggregateFunctions/IAggregateFunctionCombinator.h @@ -32,6 +32,8 @@ class IAggregateFunctionCombinator public: virtual String getName() const = 0; + virtual bool isForInternalUsageOnly() const { return false; } + /** From the arguments for combined function (ex: UInt64, UInt8 for sumIf), * get the arguments for nested function (ex: UInt64 for sum). * If arguments are not suitable for combined function, throw an exception. 
diff --git a/dbms/src/Client/ConnectionPoolWithFailover.cpp b/dbms/src/Client/ConnectionPoolWithFailover.cpp index ee8c3607c43..a311dac95b1 100644 --- a/dbms/src/Client/ConnectionPoolWithFailover.cpp +++ b/dbms/src/Client/ConnectionPoolWithFailover.cpp @@ -83,6 +83,16 @@ std::vector ConnectionPoolWithFailover::getMany(const Se return entries; } +std::vector ConnectionPoolWithFailover::getManyForTableFunction(const Settings * settings, PoolMode pool_mode) +{ + TryGetEntryFunc try_get_entry = [&](NestedPool & pool, std::string & fail_message) + { + return tryGetEntry(pool, fail_message, settings); + }; + + return getManyImpl(settings, pool_mode, try_get_entry); +} + std::vector ConnectionPoolWithFailover::getManyChecked( const Settings * settings, PoolMode pool_mode, const QualifiedTableName & table_to_check) { @@ -90,6 +100,7 @@ std::vector ConnectionPoolWithFailover::g { return tryGetEntry(pool, fail_message, settings, &table_to_check); }; + return getManyImpl(settings, pool_mode, try_get_entry); } diff --git a/dbms/src/Client/ConnectionPoolWithFailover.h b/dbms/src/Client/ConnectionPoolWithFailover.h index b61fa03d711..62ca75859ba 100644 --- a/dbms/src/Client/ConnectionPoolWithFailover.h +++ b/dbms/src/Client/ConnectionPoolWithFailover.h @@ -47,6 +47,9 @@ public: */ std::vector getMany(const Settings * settings, PoolMode pool_mode); + /// The same as getMany(), but returns std::vector. + std::vector getManyForTableFunction(const Settings * settings, PoolMode pool_mode); + using Base = PoolWithFailoverBase; using TryResult = Base::TryResult; diff --git a/dbms/src/Common/BackgroundSchedulePool.cpp b/dbms/src/Common/BackgroundSchedulePool.cpp index 84eecdad7ff..9556c9a037b 100644 --- a/dbms/src/Common/BackgroundSchedulePool.cpp +++ b/dbms/src/Common/BackgroundSchedulePool.cpp @@ -128,7 +128,8 @@ void BackgroundSchedulePool::TaskInfo::execute() zkutil::WatchCallback BackgroundSchedulePool::TaskInfo::getWatchCallback() { - return [t=shared_from_this()](const ZooKeeperImpl::ZooKeeper::WatchResponse &) { + return [t = shared_from_this()](const ZooKeeperImpl::ZooKeeper::WatchResponse &) + { t->schedule(); }; } diff --git a/dbms/src/Common/IFactoryWithAliases.h b/dbms/src/Common/IFactoryWithAliases.h new file mode 100644 index 00000000000..9006a3c7cfd --- /dev/null +++ b/dbms/src/Common/IFactoryWithAliases.h @@ -0,0 +1,125 @@ +#pragma once + +#include +#include +#include + +#include + +namespace DB +{ + +namespace ErrorCodes +{ + extern const int LOGICAL_ERROR; +} + +/** If stored objects may have several names (aliases), + * this interface may be helpful. + * The template parameter is available as Creator. + */ +template +class IFactoryWithAliases +{ +protected: + using Creator = CreatorFunc; + + String getAliasToOrName(const String & name) const + { + if (aliases.count(name)) + return aliases.at(name); + else if (String name_lowercase = Poco::toLower(name); case_insensitive_aliases.count(name_lowercase)) + return case_insensitive_aliases.at(name_lowercase); + else + return name; + } + +public: + /// For compatibility with SQL, it's possible to specify that certain function name is case insensitive. + enum CaseSensitiveness + { + CaseSensitive, + CaseInsensitive + }; + + /** Register an additional name (alias) for a creator. + * real_name has to be registered already.
+ */ + void registerAlias(const String & alias_name, const String & real_name, CaseSensitiveness case_sensitiveness = CaseSensitive) + { + const auto & creator_map = getCreatorMap(); + const auto & case_insensitive_creator_map = getCaseInsensitiveCreatorMap(); + const String factory_name = getFactoryName(); + + String real_dict_name; + if (creator_map.count(real_name)) + real_dict_name = real_name; + else if (auto real_name_lowercase = Poco::toLower(real_name); case_insensitive_creator_map.count(real_name_lowercase)) + real_dict_name = real_name_lowercase; + else + throw Exception(factory_name + ": can't create alias '" + alias_name + "', the real name '" + real_name + "' is not registered", + ErrorCodes::LOGICAL_ERROR); + + String alias_name_lowercase = Poco::toLower(alias_name); + + if (creator_map.count(alias_name) || case_insensitive_creator_map.count(alias_name_lowercase)) + throw Exception( + factory_name + ": the alias name '" + alias_name + "' is already registered as a real name", ErrorCodes::LOGICAL_ERROR); + + if (case_sensitiveness == CaseInsensitive) + if (!case_insensitive_aliases.emplace(alias_name_lowercase, real_dict_name).second) + throw Exception( + factory_name + ": case insensitive alias name '" + alias_name + "' is not unique", ErrorCodes::LOGICAL_ERROR); + + if (!aliases.emplace(alias_name, real_dict_name).second) + throw Exception(factory_name + ": alias name '" + alias_name + "' is not unique", ErrorCodes::LOGICAL_ERROR); + } + + std::vector getAllRegisteredNames() const + { + std::vector result; + auto getter = [](const auto & pair) { return pair.first; }; + std::transform(getCreatorMap().begin(), getCreatorMap().end(), std::back_inserter(result), getter); + std::transform(aliases.begin(), aliases.end(), std::back_inserter(result), getter); + return result; + } + + bool isCaseInsensitive(const String & name) const + { + String name_lowercase = Poco::toLower(name); + return getCaseInsensitiveCreatorMap().count(name_lowercase) || case_insensitive_aliases.count(name_lowercase); + } + + const String & aliasTo(const String & name) const + { + if (auto it = aliases.find(name); it != aliases.end()) + return it->second; + else if (auto it = case_insensitive_aliases.find(Poco::toLower(name)); it != case_insensitive_aliases.end()) + return it->second; + + throw Exception(getFactoryName() + ": name '" + name + "' is not an alias", ErrorCodes::LOGICAL_ERROR); + } + + bool isAlias(const String & name) const + { + return aliases.count(name) || case_insensitive_aliases.count(name); + } + + virtual ~IFactoryWithAliases() {} + +private: + using InnerMap = std::unordered_map; // name -> creator + using AliasMap = std::unordered_map; // alias -> real name + + virtual const InnerMap & getCreatorMap() const = 0; + virtual const InnerMap & getCaseInsensitiveCreatorMap() const = 0; + virtual String getFactoryName() const = 0; + + /// Maps an alias to a real name from the two maps above + AliasMap aliases; + + /// Case insensitive aliases + AliasMap case_insensitive_aliases; +}; + +} diff --git a/dbms/src/DataTypes/DataTypeFactory.cpp b/dbms/src/DataTypes/DataTypeFactory.cpp index f1a12d75868..9706ecf4944 100644 --- a/dbms/src/DataTypes/DataTypeFactory.cpp +++ b/dbms/src/DataTypes/DataTypeFactory.cpp @@ -51,16 +51,19 @@ DataTypePtr DataTypeFactory::get(const ASTPtr & ast) const throw Exception("Unexpected AST element for data type.", ErrorCodes::UNEXPECTED_AST_STRUCTURE); } -DataTypePtr DataTypeFactory::get(const String & family_name, const ASTPtr & parameters) const +DataTypePtr
DataTypeFactory::get(const String & family_name_param, const ASTPtr & parameters) const { + String family_name = getAliasToOrName(family_name_param); + { DataTypesDictionary::const_iterator it = data_types.find(family_name); if (data_types.end() != it) return it->second(parameters); } + String family_name_lowercase = Poco::toLower(family_name); + { - String family_name_lowercase = Poco::toLower(family_name); DataTypesDictionary::const_iterator it = case_insensitive_data_types.find(family_name_lowercase); if (case_insensitive_data_types.end() != it) return it->second(parameters); @@ -76,11 +79,16 @@ void DataTypeFactory::registerDataType(const String & family_name, Creator creat throw Exception("DataTypeFactory: the data type family " + family_name + " has been provided " " a null constructor", ErrorCodes::LOGICAL_ERROR); + String family_name_lowercase = Poco::toLower(family_name); + + if (isAlias(family_name) || isAlias(family_name_lowercase)) + throw Exception("DataTypeFactory: the data type family name '" + family_name + "' is already registered as alias", + ErrorCodes::LOGICAL_ERROR); + if (!data_types.emplace(family_name, creator).second) throw Exception("DataTypeFactory: the data type family name '" + family_name + "' is not unique", ErrorCodes::LOGICAL_ERROR); - String family_name_lowercase = Poco::toLower(family_name); if (case_sensitiveness == CaseInsensitive && !case_insensitive_data_types.emplace(family_name_lowercase, creator).second) @@ -88,7 +96,6 @@ void DataTypeFactory::registerDataType(const String & family_name, Creator creat ErrorCodes::LOGICAL_ERROR); } - void DataTypeFactory::registerSimpleDataType(const String & name, SimpleCreator creator, CaseSensitiveness case_sensitiveness) { if (creator == nullptr) @@ -103,7 +110,6 @@ void DataTypeFactory::registerSimpleDataType(const String & name, SimpleCreator }, case_sensitiveness); } - void registerDataTypeNumbers(DataTypeFactory & factory); void registerDataTypeDate(DataTypeFactory & factory); void registerDataTypeDateTime(DataTypeFactory & factory); diff --git a/dbms/src/DataTypes/DataTypeFactory.h b/dbms/src/DataTypes/DataTypeFactory.h index e6c873ba724..21d22cf932e 100644 --- a/dbms/src/DataTypes/DataTypeFactory.h +++ b/dbms/src/DataTypes/DataTypeFactory.h @@ -3,6 +3,7 @@ #include #include #include +#include #include #include @@ -19,10 +20,9 @@ using ASTPtr = std::shared_ptr; /** Creates a data type by name of data type family and parameters. */ -class DataTypeFactory final : public ext::singleton +class DataTypeFactory final : public ext::singleton, public IFactoryWithAliases> { private: - using Creator = std::function; using SimpleCreator = std::function; using DataTypesDictionary = std::unordered_map; @@ -31,24 +31,12 @@ public: DataTypePtr get(const String & family_name, const ASTPtr & parameters) const; DataTypePtr get(const ASTPtr & ast) const; - /// For compatibility with SQL, it's possible to specify that certain data type name is case insensitive. - enum CaseSensitiveness - { - CaseSensitive, - CaseInsensitive - }; - /// Register a type family by its name. void registerDataType(const String & family_name, Creator creator, CaseSensitiveness case_sensitiveness = CaseSensitive); /// Register a simple data type, that have no parameters. 
void registerSimpleDataType(const String & name, SimpleCreator creator, CaseSensitiveness case_sensitiveness = CaseSensitive); - const DataTypesDictionary & getAllDataTypes() const - { - return data_types; - } - private: DataTypesDictionary data_types; @@ -56,6 +44,13 @@ private: DataTypesDictionary case_insensitive_data_types; DataTypeFactory(); + + const DataTypesDictionary & getCreatorMap() const override { return data_types; } + + const DataTypesDictionary & getCaseInsensitiveCreatorMap() const override { return case_insensitive_data_types; } + + String getFactoryName() const override { return "DataTypeFactory"; } + friend class ext::singleton; }; diff --git a/dbms/src/DataTypes/DataTypeFixedString.cpp b/dbms/src/DataTypes/DataTypeFixedString.cpp index 05fdd34c464..ad875c4f85e 100644 --- a/dbms/src/DataTypes/DataTypeFixedString.cpp +++ b/dbms/src/DataTypes/DataTypeFixedString.cpp @@ -231,7 +231,7 @@ void registerDataTypeFixedString(DataTypeFactory & factory) factory.registerDataType("FixedString", create); /// Compatibility alias. - factory.registerDataType("BINARY", create, DataTypeFactory::CaseInsensitive); + factory.registerAlias("BINARY", "FixedString", DataTypeFactory::CaseInsensitive); } } diff --git a/dbms/src/DataTypes/DataTypeString.cpp b/dbms/src/DataTypes/DataTypeString.cpp index 4ffda6f2099..671d1b2d3a5 100644 --- a/dbms/src/DataTypes/DataTypeString.cpp +++ b/dbms/src/DataTypes/DataTypeString.cpp @@ -312,16 +312,16 @@ void registerDataTypeString(DataTypeFactory & factory) /// These synonims are added for compatibility. - factory.registerSimpleDataType("CHAR", creator, DataTypeFactory::CaseInsensitive); - factory.registerSimpleDataType("VARCHAR", creator, DataTypeFactory::CaseInsensitive); - factory.registerSimpleDataType("TEXT", creator, DataTypeFactory::CaseInsensitive); - factory.registerSimpleDataType("TINYTEXT", creator, DataTypeFactory::CaseInsensitive); - factory.registerSimpleDataType("MEDIUMTEXT", creator, DataTypeFactory::CaseInsensitive); - factory.registerSimpleDataType("LONGTEXT", creator, DataTypeFactory::CaseInsensitive); - factory.registerSimpleDataType("BLOB", creator, DataTypeFactory::CaseInsensitive); - factory.registerSimpleDataType("TINYBLOB", creator, DataTypeFactory::CaseInsensitive); - factory.registerSimpleDataType("MEDIUMBLOB", creator, DataTypeFactory::CaseInsensitive); - factory.registerSimpleDataType("LONGBLOB", creator, DataTypeFactory::CaseInsensitive); + factory.registerAlias("CHAR", "String", DataTypeFactory::CaseInsensitive); + factory.registerAlias("VARCHAR", "String", DataTypeFactory::CaseInsensitive); + factory.registerAlias("TEXT", "String", DataTypeFactory::CaseInsensitive); + factory.registerAlias("TINYTEXT", "String", DataTypeFactory::CaseInsensitive); + factory.registerAlias("MEDIUMTEXT", "String", DataTypeFactory::CaseInsensitive); + factory.registerAlias("LONGTEXT", "String", DataTypeFactory::CaseInsensitive); + factory.registerAlias("BLOB", "String", DataTypeFactory::CaseInsensitive); + factory.registerAlias("TINYBLOB", "String", DataTypeFactory::CaseInsensitive); + factory.registerAlias("MEDIUMBLOB", "String", DataTypeFactory::CaseInsensitive); + factory.registerAlias("LONGBLOB", "String", DataTypeFactory::CaseInsensitive); } } diff --git a/dbms/src/DataTypes/DataTypesNumber.cpp b/dbms/src/DataTypes/DataTypesNumber.cpp index 72861eff3ac..254d6ba6852 100644 --- a/dbms/src/DataTypes/DataTypesNumber.cpp +++ b/dbms/src/DataTypes/DataTypesNumber.cpp @@ -22,13 +22,13 @@ void registerDataTypeNumbers(DataTypeFactory & factory) /// These 
synonims are added for compatibility. - factory.registerSimpleDataType("TINYINT", [] { return DataTypePtr(std::make_shared()); }, DataTypeFactory::CaseInsensitive); - factory.registerSimpleDataType("SMALLINT", [] { return DataTypePtr(std::make_shared()); }, DataTypeFactory::CaseInsensitive); - factory.registerSimpleDataType("INT", [] { return DataTypePtr(std::make_shared()); }, DataTypeFactory::CaseInsensitive); - factory.registerSimpleDataType("INTEGER", [] { return DataTypePtr(std::make_shared()); }, DataTypeFactory::CaseInsensitive); - factory.registerSimpleDataType("BIGINT", [] { return DataTypePtr(std::make_shared()); }, DataTypeFactory::CaseInsensitive); - factory.registerSimpleDataType("FLOAT", [] { return DataTypePtr(std::make_shared()); }, DataTypeFactory::CaseInsensitive); - factory.registerSimpleDataType("DOUBLE", [] { return DataTypePtr(std::make_shared()); }, DataTypeFactory::CaseInsensitive); + factory.registerAlias("TINYINT", "Int8", DataTypeFactory::CaseInsensitive); + factory.registerAlias("SMALLINT", "Int16", DataTypeFactory::CaseInsensitive); + factory.registerAlias("INT", "Int32", DataTypeFactory::CaseInsensitive); + factory.registerAlias("INTEGER", "Int32", DataTypeFactory::CaseInsensitive); + factory.registerAlias("BIGINT", "Int64", DataTypeFactory::CaseInsensitive); + factory.registerAlias("FLOAT", "Float32", DataTypeFactory::CaseInsensitive); + factory.registerAlias("DOUBLE", "Float64", DataTypeFactory::CaseInsensitive); } } diff --git a/dbms/src/Dictionaries/Embedded/RegionsHierarchy.cpp b/dbms/src/Dictionaries/Embedded/RegionsHierarchy.cpp index 2dbab26acc1..978d7b9e496 100644 --- a/dbms/src/Dictionaries/Embedded/RegionsHierarchy.cpp +++ b/dbms/src/Dictionaries/Embedded/RegionsHierarchy.cpp @@ -41,7 +41,6 @@ void RegionsHierarchy::reload() RegionID max_region_id = 0; - auto regions_reader = data_source->createReader(); RegionEntry region_entry; diff --git a/dbms/src/Formats/ValuesRowInputStream.cpp b/dbms/src/Formats/ValuesRowInputStream.cpp index c291f147184..559ac658a6a 100644 --- a/dbms/src/Formats/ValuesRowInputStream.cpp +++ b/dbms/src/Formats/ValuesRowInputStream.cpp @@ -1,5 +1,6 @@ #include #include +#include #include #include #include @@ -29,7 +30,7 @@ namespace ErrorCodes ValuesRowInputStream::ValuesRowInputStream(ReadBuffer & istr_, const Block & header_, const Context & context_, const FormatSettings & format_settings) - : istr(istr_), header(header_), context(context_), format_settings(format_settings) + : istr(istr_), header(header_), context(std::make_unique(context_)), format_settings(format_settings) { /// In this format, BOM at beginning of stream cannot be confused with value, so it is safe to skip it. 
skipBOMIfExists(istr); @@ -112,7 +113,7 @@ bool ValuesRowInputStream::read(MutableColumns & columns) istr.position() = const_cast(token_iterator->begin); - std::pair value_raw = evaluateConstantExpression(ast, context); + std::pair value_raw = evaluateConstantExpression(ast, *context); Field value = convertFieldToType(value_raw.first, type, value_raw.second.get()); if (value.isNull()) diff --git a/dbms/src/Formats/ValuesRowInputStream.h b/dbms/src/Formats/ValuesRowInputStream.h index 00fa9071947..49775861746 100644 --- a/dbms/src/Formats/ValuesRowInputStream.h +++ b/dbms/src/Formats/ValuesRowInputStream.h @@ -28,7 +28,7 @@ public: private: ReadBuffer & istr; Block header; - const Context & context; + std::unique_ptr context; /// pimpl const FormatSettings format_settings; }; diff --git a/dbms/src/Functions/FunctionFactory.cpp b/dbms/src/Functions/FunctionFactory.cpp index 9bb2abbb013..0b2f042089d 100644 --- a/dbms/src/Functions/FunctionFactory.cpp +++ b/dbms/src/Functions/FunctionFactory.cpp @@ -6,7 +6,6 @@ #include - namespace DB { @@ -26,8 +25,13 @@ void FunctionFactory::registerFunction(const throw Exception("FunctionFactory: the function name '" + name + "' is not unique", ErrorCodes::LOGICAL_ERROR); + String function_name_lowercase = Poco::toLower(name); + if (isAlias(name) || isAlias(function_name_lowercase)) + throw Exception("FunctionFactory: the function name '" + name + "' is already registered as alias", + ErrorCodes::LOGICAL_ERROR); + if (case_sensitiveness == CaseInsensitive - && !case_insensitive_functions.emplace(Poco::toLower(name), creator).second) + && !case_insensitive_functions.emplace(function_name_lowercase, creator).second) throw Exception("FunctionFactory: the case insensitive function name '" + name + "' is not unique", ErrorCodes::LOGICAL_ERROR); } @@ -45,9 +49,11 @@ FunctionBuilderPtr FunctionFactory::get( FunctionBuilderPtr FunctionFactory::tryGet( - const std::string & name, + const std::string & name_param, const Context & context) const { + String name = getAliasToOrName(name_param); + auto it = functions.find(name); if (functions.end() != it) return it->second(context); diff --git a/dbms/src/Functions/FunctionFactory.h b/dbms/src/Functions/FunctionFactory.h index a061c3103fd..7fa0f81f475 100644 --- a/dbms/src/Functions/FunctionFactory.h +++ b/dbms/src/Functions/FunctionFactory.h @@ -1,6 +1,7 @@ #pragma once #include +#include #include @@ -20,19 +21,9 @@ class Context; * Function could use for initialization (take ownership of shared_ptr, for example) * some dictionaries from Context. */ -class FunctionFactory : public ext::singleton +class FunctionFactory : public ext::singleton, public IFactoryWithAliases> { - friend class StorageSystemFunctions; - public: - using Creator = std::function; - - /// For compatibility with SQL, it's possible to specify that certain function name is case insensitive. - enum CaseSensitiveness - { - CaseSensitive, - CaseInsensitive - }; template void registerFunction(CaseSensitiveness case_sensitiveness = CaseSensitive) @@ -67,6 +58,12 @@ private: return std::make_shared(Function::create(context)); } + const Functions & getCreatorMap() const override { return functions; } + + const Functions & getCaseInsensitiveCreatorMap() const override { return case_insensitive_functions; } + + String getFactoryName() const override { return "FunctionFactory"; } + /// Register a function by its name. /// No locking, you must register all functions before usage of get. 
void registerFunction( diff --git a/dbms/src/Functions/FunctionsArray.cpp b/dbms/src/Functions/FunctionsArray.cpp index d72dcf6f670..466610bcd45 100644 --- a/dbms/src/Functions/FunctionsArray.cpp +++ b/dbms/src/Functions/FunctionsArray.cpp @@ -1286,12 +1286,12 @@ DataTypePtr FunctionArrayDistinct::getReturnTypeImpl(const DataTypes & arguments { const DataTypeArray * array_type = checkAndGetDataType(arguments[0].get()); if (!array_type) - throw Exception("Argument for function " + getName() + " must be array but it " + throw Exception("Argument for function " + getName() + " must be array but it " " has type " + arguments[0]->getName() + ".", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT); - + auto nested_type = removeNullable(array_type->getNestedType()); - + return std::make_shared(nested_type); } @@ -1307,7 +1307,7 @@ void FunctionArrayDistinct::executeImpl(Block & block, const ColumnNumbers & arg const IColumn & src_data = array->getData(); const ColumnArray::Offsets & offsets = array->getOffsets(); - + ColumnRawPtrs original_data_columns; original_data_columns.push_back(&src_data); @@ -1416,7 +1416,7 @@ bool FunctionArrayDistinct::executeString( HashTableAllocatorWithStackMemory<(1ULL << INITIAL_SIZE_DEGREE) * sizeof(StringRef)>>; const PaddedPODArray * src_null_map = nullptr; - + if (nullable_col) { src_null_map = &static_cast(&nullable_col->getNullMapColumn())->getData(); @@ -1471,7 +1471,7 @@ void FunctionArrayDistinct::executeHashed( res_data_col.insertFrom(*columns[0], j); } } - + res_offsets.emplace_back(set.size() + prev_off); prev_off = off; } diff --git a/dbms/src/Functions/FunctionsArray.h b/dbms/src/Functions/FunctionsArray.h index 3cd1a8968f7..15fd5b420e2 100644 --- a/dbms/src/Functions/FunctionsArray.h +++ b/dbms/src/Functions/FunctionsArray.h @@ -1011,10 +1011,11 @@ public: DataTypePtr observed_type0 = removeNullable(array_type->getNestedType()); DataTypePtr observed_type1 = removeNullable(arguments[1]); - if (!(observed_type0->isNumber() && observed_type1->isNumber()) + /// We also support arrays of Enum type (which are represented by a number) when searching for numeric values. + if (!(observed_type0->isValueRepresentedByNumber() && observed_type1->isNumber()) && !observed_type0->equals(*observed_type1)) throw Exception("Types of array and 2nd argument of function " - + getName() + " must be identical up to nullability. Passed: " + + getName() + " must be identical up to nullability, or numeric types, or an Enum and a numeric type. Passed: " + arguments[0]->getName() + " and " + arguments[1]->getName() + ".", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT); } @@ -1249,7 +1250,7 @@ private: IColumn & res_data_col, ColumnArray::Offsets & res_offsets, const ColumnNullable * nullable_col); - + void executeHashed( const ColumnArray::Offsets & offsets, const ColumnRawPtrs & columns, diff --git a/dbms/src/Functions/FunctionsRound.cpp b/dbms/src/Functions/FunctionsRound.cpp index 7bf7eb791ad..9cb9e1001ae 100644 --- a/dbms/src/Functions/FunctionsRound.cpp +++ b/dbms/src/Functions/FunctionsRound.cpp @@ -16,8 +16,8 @@ void registerFunctionsRound(FunctionFactory & factory) factory.registerFunction("trunc", FunctionFactory::CaseInsensitive); /// Compatibility aliases.
- factory.registerFunction("ceiling", FunctionFactory::CaseInsensitive); - factory.registerFunction("truncate", FunctionFactory::CaseInsensitive); + factory.registerAlias("ceiling", "ceil", FunctionFactory::CaseInsensitive); + factory.registerAlias("truncate", "trunc", FunctionFactory::CaseInsensitive); } } diff --git a/dbms/src/IO/ReadWriteBufferFromHTTP.cpp b/dbms/src/IO/ReadWriteBufferFromHTTP.cpp index af0f34babbf..52ec808bd68 100644 --- a/dbms/src/IO/ReadWriteBufferFromHTTP.cpp +++ b/dbms/src/IO/ReadWriteBufferFromHTTP.cpp @@ -18,6 +18,7 @@ ReadWriteBufferFromHTTP::ReadWriteBufferFromHTTP(const Poco::URI & uri, const std::string & method_, OutStreamCallback out_stream_callback, const ConnectionTimeouts & timeouts, + const Poco::Net::HTTPBasicCredentials & credentials, size_t buffer_size_) : ReadBuffer(nullptr, 0), uri{uri}, @@ -30,6 +31,9 @@ ReadWriteBufferFromHTTP::ReadWriteBufferFromHTTP(const Poco::URI & uri, if (out_stream_callback) request.setChunkedTransferEncoding(true); + if (!credentials.getUsername().empty()) + credentials.authenticate(request); + Poco::Net::HTTPResponse response; LOG_TRACE((&Logger::get("ReadWriteBufferFromHTTP")), "Sending request to " << uri.toString()); diff --git a/dbms/src/IO/ReadWriteBufferFromHTTP.h b/dbms/src/IO/ReadWriteBufferFromHTTP.h index 93a8232f93d..d370bb3d4c7 100644 --- a/dbms/src/IO/ReadWriteBufferFromHTTP.h +++ b/dbms/src/IO/ReadWriteBufferFromHTTP.h @@ -1,6 +1,7 @@ #pragma once #include +#include #include #include #include @@ -32,6 +33,7 @@ public: const std::string & method = {}, OutStreamCallback out_stream_callback = {}, const ConnectionTimeouts & timeouts = {}, + const Poco::Net::HTTPBasicCredentials & credentials = {}, size_t buffer_size_ = DBMS_DEFAULT_BUFFER_SIZE); bool nextImpl() override; diff --git a/dbms/src/Interpreters/ClusterProxy/SelectStreamFactory.cpp b/dbms/src/Interpreters/ClusterProxy/SelectStreamFactory.cpp index c5ffd0ef4f7..eb1d54c457e 100644 --- a/dbms/src/Interpreters/ClusterProxy/SelectStreamFactory.cpp +++ b/dbms/src/Interpreters/ClusterProxy/SelectStreamFactory.cpp @@ -6,6 +6,7 @@ #include #include #include +#include #include @@ -28,13 +29,26 @@ namespace ClusterProxy { SelectStreamFactory::SelectStreamFactory( - const Block & header, + const Block & header_, QueryProcessingStage::Enum processed_stage_, QualifiedTableName main_table_, const Tables & external_tables_) - : header(header), + : header(header_), processed_stage{processed_stage_}, main_table(std::move(main_table_)), + table_func_ptr{nullptr}, + external_tables{external_tables_} +{ +} + +SelectStreamFactory::SelectStreamFactory( + const Block & header_, + QueryProcessingStage::Enum processed_stage_, + ASTPtr table_func_ptr_, + const Tables & external_tables_) + : header(header_), + processed_stage{processed_stage_}, + table_func_ptr{table_func_ptr_}, external_tables{external_tables_} { } @@ -71,13 +85,24 @@ void SelectStreamFactory::createForShard( { auto stream = std::make_shared(shard_info.pool, query, header, context, nullptr, throttler, external_tables, processed_stage); stream->setPoolMode(PoolMode::GET_MANY); - stream->setMainTable(main_table); + if (!table_func_ptr) + stream->setMainTable(main_table); res.emplace_back(std::move(stream)); }; if (shard_info.isLocal()) { - StoragePtr main_table_storage = context.tryGetTable(main_table.database, main_table.table); + StoragePtr main_table_storage; + + if (table_func_ptr) + { + auto table_function = static_cast(table_func_ptr.get()); + main_table_storage = 
TableFunctionFactory::instance().get(table_function->name, context)->execute(table_func_ptr, context); + } + else + main_table_storage = context.tryGetTable(main_table.database, main_table.table); + + if (!main_table_storage) /// Table is absent on a local server. { ProfileEvents::increment(ProfileEvents::DistributedConnectionMissingTable); @@ -158,14 +183,17 @@ void SelectStreamFactory::createForShard( auto lazily_create_stream = [ pool = shard_info.pool, shard_num = shard_info.shard_num, query, header = header, query_ast, context, throttler, - main_table = main_table, external_tables = external_tables, stage = processed_stage, + main_table = main_table, table_func_ptr = table_func_ptr, external_tables = external_tables, stage = processed_stage, local_delay]() -> BlockInputStreamPtr { std::vector try_results; try { - try_results = pool->getManyChecked(&context.getSettingsRef(), PoolMode::GET_MANY, main_table); + if (table_func_ptr) + try_results = pool->getManyForTableFunction(&context.getSettingsRef(), PoolMode::GET_MANY); + else + try_results = pool->getManyChecked(&context.getSettingsRef(), PoolMode::GET_MANY, main_table); } catch (const Exception & ex) { diff --git a/dbms/src/Interpreters/ClusterProxy/SelectStreamFactory.h b/dbms/src/Interpreters/ClusterProxy/SelectStreamFactory.h index 5325e5d463c..38dabf82dcc 100644 --- a/dbms/src/Interpreters/ClusterProxy/SelectStreamFactory.h +++ b/dbms/src/Interpreters/ClusterProxy/SelectStreamFactory.h @@ -13,11 +13,19 @@ namespace ClusterProxy class SelectStreamFactory final : public IStreamFactory { public: + /// Database in a query. SelectStreamFactory( - const Block & header, - QueryProcessingStage::Enum processed_stage, - QualifiedTableName main_table, + const Block & header_, + QueryProcessingStage::Enum processed_stage_, + QualifiedTableName main_table_, const Tables & external_tables); + + /// TableFunction in a query. + SelectStreamFactory( + const Block & header_, + QueryProcessingStage::Enum processed_stage_, + ASTPtr table_func_ptr_, + const Tables & external_tables_); void createForShard( const Cluster::ShardInfo & shard_info, @@ -29,6 +37,7 @@ private: const Block header; QueryProcessingStage::Enum processed_stage; QualifiedTableName main_table; + ASTPtr table_func_ptr; Tables external_tables; }; diff --git a/dbms/src/Interpreters/Context.cpp b/dbms/src/Interpreters/Context.cpp index 9fed370cfbc..0561c2f11c2 100644 --- a/dbms/src/Interpreters/Context.cpp +++ b/dbms/src/Interpreters/Context.cpp @@ -109,6 +109,8 @@ struct ContextShared String interserver_io_host; /// The host name by which this server is available for other servers. UInt16 interserver_io_port = 0; /// and port. + String interserver_io_user; + String interserver_io_password; String path; /// Path to the data directory, with a slash at the end. String tmp_path; /// The path to the temporary files that occur when processing the request. 
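The two `interserver_io_user` / `interserver_io_password` fields added to ContextShared here are filled from the `interserver_http_credentials` config section read in Server.cpp earlier in the patch. A minimal sketch of that lookup with Poco's configuration API, assuming an illustrative config.xml containing `<interserver_http_credentials><user>replica</user><password>secret</password></interserver_http_credentials>`:

```cpp
#include <Poco/AutoPtr.h>
#include <Poco/Util/XMLConfiguration.h>
#include <iostream>
#include <string>

int main()
{
    Poco::AutoPtr<Poco::Util::XMLConfiguration> config(
        new Poco::Util::XMLConfiguration("config.xml"));  // illustrative path

    if (config->has("interserver_http_credentials"))
    {
        // Mirrors the Server.cpp logic: user is mandatory, password may be empty.
        std::string user = config->getString("interserver_http_credentials.user", "");
        std::string password = config->getString("interserver_http_credentials.password", "");
        if (user.empty())
            std::cerr << "interserver_http_credentials.user can't be empty\n";
        else
            std::cout << "interserver user: " << user << "\n";
    }
}
```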
@@ -1378,6 +1380,17 @@ void Context::setInterserverIOAddress(const String & host, UInt16 port) shared->interserver_io_port = port; } +void Context::setInterserverCredentials(const String & user, const String & password) +{ + shared->interserver_io_user = user; + shared->interserver_io_password = password; +} + +std::pair Context::getInterserverCredentials() const +{ + return { shared->interserver_io_user, shared->interserver_io_password }; +} + std::pair Context::getInterserverIOAddress() const { diff --git a/dbms/src/Interpreters/Context.h b/dbms/src/Interpreters/Context.h index 1c867d65e8f..38a0e7cb4bc 100644 --- a/dbms/src/Interpreters/Context.h +++ b/dbms/src/Interpreters/Context.h @@ -249,6 +249,11 @@ public: /// How other servers can access this for downloading replicated data. void setInterserverIOAddress(const String & host, UInt16 port); std::pair getInterserverIOAddress() const; + + // Credentials that this server will use to communicate with other servers + void setInterserverCredentials(const String & user, const String & password); + std::pair getInterserverCredentials() const; + /// The port that the server listens for executing SQL queries. UInt16 getTCPPort() const; diff --git a/dbms/src/Interpreters/ExpressionAnalyzer.cpp b/dbms/src/Interpreters/ExpressionAnalyzer.cpp index 31a6a566ae8..b23160f133f 100644 --- a/dbms/src/Interpreters/ExpressionAnalyzer.cpp +++ b/dbms/src/Interpreters/ExpressionAnalyzer.cpp @@ -64,6 +64,7 @@ #include #include #include +#include namespace DB diff --git a/dbms/src/Interpreters/InterpreterKillQueryQuery.cpp b/dbms/src/Interpreters/InterpreterKillQueryQuery.cpp index 336b45e7a5c..f0add31dc38 100644 --- a/dbms/src/Interpreters/InterpreterKillQueryQuery.cpp +++ b/dbms/src/Interpreters/InterpreterKillQueryQuery.cpp @@ -2,6 +2,7 @@ #include #include #include +#include #include #include #include @@ -172,6 +173,9 @@ BlockIO InterpreterKillQueryQuery::execute() { ASTKillQueryQuery & query = typeid_cast(*query_ptr); + if (!query.cluster.empty()) + return executeDDLQueryOnCluster(query_ptr, context, {"system"}); + BlockIO res_io; Block processes_block = getSelectFromSystemProcessesResult(); if (!processes_block) diff --git a/dbms/src/Interpreters/InterpreterOptimizeQuery.cpp b/dbms/src/Interpreters/InterpreterOptimizeQuery.cpp index 2472cff1876..80a64d83f90 100644 --- a/dbms/src/Interpreters/InterpreterOptimizeQuery.cpp +++ b/dbms/src/Interpreters/InterpreterOptimizeQuery.cpp @@ -1,6 +1,7 @@ #include #include #include +#include #include #include @@ -18,6 +19,9 @@ BlockIO InterpreterOptimizeQuery::execute() { const ASTOptimizeQuery & ast = typeid_cast(*query_ptr); + if (!ast.cluster.empty()) + return executeDDLQueryOnCluster(query_ptr, context, {ast.database}); + StoragePtr table = context.getTable(ast.database, ast.table); auto table_lock = table->lockStructure(true, __PRETTY_FUNCTION__); table->optimize(query_ptr, ast.partition, ast.final, ast.deduplicate, context); diff --git a/dbms/src/Interpreters/evaluateConstantExpression.cpp b/dbms/src/Interpreters/evaluateConstantExpression.cpp index 8ab3ca7bf1a..6dcff35e6a4 100644 --- a/dbms/src/Interpreters/evaluateConstantExpression.cpp +++ b/dbms/src/Interpreters/evaluateConstantExpression.cpp @@ -3,6 +3,7 @@ #include #include #include +#include #include #include #include @@ -10,6 +11,7 @@ #include #include #include +#include namespace DB @@ -52,13 +54,19 @@ std::pair> evaluateConstantExpression(co ASTPtr evaluateConstantExpressionAsLiteral(const ASTPtr & node, const Context & context) { + /// Branch with string in
query. if (typeid_cast(node.get())) return node; - + + /// Branch with TableFunction in query. + if (auto table_func_ptr = typeid_cast(node.get())) + if (TableFunctionFactory::instance().isTableFunctionName(table_func_ptr->name)) + return node; + return std::make_shared(evaluateConstantExpression(node, context).first); } - ASTPtr evaluateConstantExpressionOrIdentifierAsLiteral(const ASTPtr & node, const Context & context) { if (auto id = typeid_cast(node.get())) diff --git a/dbms/src/Parsers/ASTKillQueryQuery.cpp b/dbms/src/Parsers/ASTKillQueryQuery.cpp index 8be944e8481..0f3e5406fdd 100644 --- a/dbms/src/Parsers/ASTKillQueryQuery.cpp +++ b/dbms/src/Parsers/ASTKillQueryQuery.cpp @@ -8,9 +8,22 @@ String ASTKillQueryQuery::getID() const return "KillQueryQuery_" + (where_expression ? where_expression->getID() : "") + "_" + String(sync ? "SYNC" : "ASYNC"); } +ASTPtr ASTKillQueryQuery::getRewrittenASTWithoutOnCluster(const std::string & /*new_database*/) const +{ + auto query_ptr = clone(); + ASTKillQueryQuery & query = static_cast(*query_ptr); + + query.cluster.clear(); + + return query_ptr; +} + void ASTKillQueryQuery::formatQueryImpl(const FormatSettings & settings, FormatState & state, FormatStateStacked frame) const { - settings.ostr << (settings.hilite ? hilite_keyword : "") << "KILL QUERY WHERE " << (settings.hilite ? hilite_none : ""); + settings.ostr << (settings.hilite ? hilite_keyword : "") << "KILL QUERY "; + + formatOnCluster(settings); + settings.ostr << " WHERE " << (settings.hilite ? hilite_none : ""); if (where_expression) where_expression->formatImpl(settings, state, frame); diff --git a/dbms/src/Parsers/ASTKillQueryQuery.h b/dbms/src/Parsers/ASTKillQueryQuery.h index 4df1f28f733..086ee55e3bd 100644 --- a/dbms/src/Parsers/ASTKillQueryQuery.h +++ b/dbms/src/Parsers/ASTKillQueryQuery.h @@ -1,10 +1,11 @@ #include #include +#include namespace DB { -class ASTKillQueryQuery : public ASTQueryWithOutput +class ASTKillQueryQuery : public ASTQueryWithOutput, public ASTQueryWithOnCluster { public: ASTPtr where_expression; // expression to filter processes from system.processes table @@ -22,6 +23,8 @@ public: String getID() const override; void formatQueryImpl(const FormatSettings & settings, FormatState & state, FormatStateStacked frame) const override; + + ASTPtr getRewrittenASTWithoutOnCluster(const std::string &new_database) const override; }; } diff --git a/dbms/src/Parsers/ASTOptimizeQuery.cpp b/dbms/src/Parsers/ASTOptimizeQuery.cpp new file mode 100644 index 00000000000..dd37b665173 --- /dev/null +++ b/dbms/src/Parsers/ASTOptimizeQuery.cpp @@ -0,0 +1,39 @@ +#include + +namespace DB +{ + + +ASTPtr ASTOptimizeQuery::getRewrittenASTWithoutOnCluster(const std::string & new_database) const +{ + auto query_ptr = clone(); + ASTOptimizeQuery & query = static_cast(*query_ptr); + + query.cluster.clear(); + if (query.database.empty()) + query.database = new_database; + + return query_ptr; +} + +void ASTOptimizeQuery::formatQueryImpl(const FormatSettings & settings, FormatState & state, FormatStateStacked frame) const +{ + settings.ostr << (settings.hilite ? hilite_keyword : "") << "OPTIMIZE TABLE " << (settings.hilite ? hilite_none : "") + << (!database.empty() ? backQuoteIfNeed(database) + "." : "") << backQuoteIfNeed(table); + + formatOnCluster(settings); + + if (partition) + { + settings.ostr << (settings.hilite ? hilite_keyword : "") << " PARTITION " << (settings.hilite ? 
hilite_none : ""); + partition->formatImpl(settings, state, frame); + } + + if (final) + settings.ostr << (settings.hilite ? hilite_keyword : "") << " FINAL" << (settings.hilite ? hilite_none : ""); + + if (deduplicate) + settings.ostr << (settings.hilite ? hilite_keyword : "") << " DEDUPLICATE" << (settings.hilite ? hilite_none : ""); +} + +} diff --git a/dbms/src/Parsers/ASTOptimizeQuery.h b/dbms/src/Parsers/ASTOptimizeQuery.h index 571b04d22ef..0b329d59559 100644 --- a/dbms/src/Parsers/ASTOptimizeQuery.h +++ b/dbms/src/Parsers/ASTOptimizeQuery.h @@ -1,7 +1,8 @@ #pragma once #include - +#include +#include namespace DB { @@ -9,7 +10,7 @@ namespace DB /** OPTIMIZE query */ -class ASTOptimizeQuery : public IAST +class ASTOptimizeQuery : public ASTQueryWithOutput, public ASTQueryWithOnCluster { public: String database; @@ -23,7 +24,8 @@ public: bool deduplicate; /** Get the text that identifies this element. */ - String getID() const override { return "OptimizeQuery_" + database + "_" + table + (final ? "_final" : "") + (deduplicate ? "_deduplicate" : ""); } + String getID() const override + { return "OptimizeQuery_" + database + "_" + table + (final ? "_final" : "") + (deduplicate ? "_deduplicate" : ""); } ASTPtr clone() const override { @@ -39,24 +41,10 @@ public: return res; } -protected: - void formatImpl(const FormatSettings & settings, FormatState & state, FormatStateStacked frame) const override - { - settings.ostr << (settings.hilite ? hilite_keyword : "") << "OPTIMIZE TABLE " << (settings.hilite ? hilite_none : "") - << (!database.empty() ? backQuoteIfNeed(database) + "." : "") << backQuoteIfNeed(table); + void formatQueryImpl(const FormatSettings & settings, FormatState & state, FormatStateStacked frame) const override; - if (partition) - { - settings.ostr << (settings.hilite ? hilite_keyword : "") << " PARTITION " << (settings.hilite ? hilite_none : ""); - partition->formatImpl(settings, state, frame); - } + ASTPtr getRewrittenASTWithoutOnCluster(const std::string &new_database) const override; - if (final) - settings.ostr << (settings.hilite ? hilite_keyword : "") << " FINAL" << (settings.hilite ? hilite_none : ""); - - if (deduplicate) - settings.ostr << (settings.hilite ? hilite_keyword : "") << " DEDUPLICATE" << (settings.hilite ? 
hilite_none : ""); - } }; } diff --git a/dbms/src/Parsers/ASTSelectQuery.cpp b/dbms/src/Parsers/ASTSelectQuery.cpp index f234b0ae4b5..8bb5f2488d8 100644 --- a/dbms/src/Parsers/ASTSelectQuery.cpp +++ b/dbms/src/Parsers/ASTSelectQuery.cpp @@ -372,9 +372,9 @@ void ASTSelectQuery::replaceDatabaseAndTable(const String & database_name, const children.emplace_back(tables_list); table_expression = table_expr.get(); } - + ASTPtr table = std::make_shared(table_name, ASTIdentifier::Table); - + if (!database_name.empty()) { ASTPtr database = std::make_shared(database_name, ASTIdentifier::Database); @@ -388,5 +388,27 @@ void ASTSelectQuery::replaceDatabaseAndTable(const String & database_name, const } } + +void ASTSelectQuery::addTableFunction(ASTPtr & table_function_ptr) +{ + ASTTableExpression * table_expression = getFirstTableExpression(*this); + + if (!table_expression) + { + auto tables_list = std::make_shared(); + auto element = std::make_shared(); + auto table_expr = std::make_shared(); + element->table_expression = table_expr; + element->children.emplace_back(table_expr); + tables_list->children.emplace_back(element); + tables = tables_list; + children.emplace_back(tables_list); + table_expression = table_expr.get(); + } + + table_expression->table_function = table_function_ptr; + table_expression->database_and_table_name = nullptr; +} + }; diff --git a/dbms/src/Parsers/ASTSelectQuery.h b/dbms/src/Parsers/ASTSelectQuery.h index d45f45c34d8..91d8d52172c 100644 --- a/dbms/src/Parsers/ASTSelectQuery.h +++ b/dbms/src/Parsers/ASTSelectQuery.h @@ -47,6 +47,7 @@ public: bool final() const; void setDatabaseIfNeeded(const String & database_name); void replaceDatabaseAndTable(const String & database_name, const String & table_name); + void addTableFunction(ASTPtr & table_function_ptr); protected: void formatImpl(const FormatSettings & settings, FormatState & state, FormatStateStacked frame) const override; diff --git a/dbms/src/Parsers/ParserKillQueryQuery.cpp b/dbms/src/Parsers/ParserKillQueryQuery.cpp index e6d1bae2e05..5e674d9da83 100644 --- a/dbms/src/Parsers/ParserKillQueryQuery.cpp +++ b/dbms/src/Parsers/ParserKillQueryQuery.cpp @@ -11,29 +11,36 @@ namespace DB bool ParserKillQueryQuery::parseImpl(Pos & pos, ASTPtr & node, Expected & expected) { + String cluster_str; auto query = std::make_shared(); - if (!ParserKeyword{"KILL QUERY"}.ignore(pos, expected)) - return false; - - if (!ParserKeyword{"WHERE"}.ignore(pos, expected)) - return false; - + ParserKeyword p_on{"ON"}; + ParserKeyword p_test{"TEST"}; + ParserKeyword p_sync{"SYNC"}; + ParserKeyword p_async{"ASYNC"}; + ParserKeyword p_where{"WHERE"}; + ParserKeyword p_kill_query{"KILL QUERY"}; ParserExpression p_where_expression; - if (!p_where_expression.parse(pos, query->where_expression, expected)) + + if (!p_kill_query.ignore(pos, expected)) return false; - query->children.emplace_back(query->where_expression); + if (p_on.ignore(pos, expected) && !ASTQueryWithOnCluster::parse(pos, cluster_str, expected)) + return false; - if (ParserKeyword{"SYNC"}.ignore(pos)) + if (p_where.ignore(pos, expected) && !p_where_expression.parse(pos, query->where_expression, expected)) + return false; + + if (p_sync.ignore(pos, expected)) query->sync = true; - else if (ParserKeyword{"ASYNC"}.ignore(pos)) + else if (p_async.ignore(pos, expected)) query->sync = false; - else if (ParserKeyword{"TEST"}.ignore(pos)) + else if (p_test.ignore(pos, expected)) query->test = true; + query->cluster = cluster_str; + query->children.emplace_back(query->where_expression); node = 
std::move(query); - return true; } diff --git a/dbms/src/Parsers/ParserOptimizeQuery.cpp b/dbms/src/Parsers/ParserOptimizeQuery.cpp index c01a1a7b5df..e0dcf7ffb47 100644 --- a/dbms/src/Parsers/ParserOptimizeQuery.cpp +++ b/dbms/src/Parsers/ParserOptimizeQuery.cpp @@ -28,6 +28,7 @@ bool ParserOptimizeQuery::parseImpl(Pos & pos, ASTPtr & node, Expected & expecte ASTPtr partition; bool final = false; bool deduplicate = false; + String cluster_str; if (!s_optimize_table.ignore(pos, expected)) return false; @@ -42,6 +43,9 @@ bool ParserOptimizeQuery::parseImpl(Pos & pos, ASTPtr & node, Expected & expecte return false; } + if (ParserKeyword{"ON"}.ignore(pos, expected) && !ASTQueryWithOnCluster::parse(pos, cluster_str, expected)) + return false; + if (s_partition.ignore(pos, expected)) { if (!partition_p.parse(pos, partition, expected)) @@ -61,6 +65,8 @@ bool ParserOptimizeQuery::parseImpl(Pos & pos, ASTPtr & node, Expected & expecte query->database = typeid_cast(*database).name; if (table) query->table = typeid_cast(*table).name; + + query->cluster = cluster_str; query->partition = partition; query->final = final; query->deduplicate = deduplicate; diff --git a/dbms/src/Parsers/ParserQuery.cpp b/dbms/src/Parsers/ParserQuery.cpp index efdac16d74c..7285e03bad7 100644 --- a/dbms/src/Parsers/ParserQuery.cpp +++ b/dbms/src/Parsers/ParserQuery.cpp @@ -21,14 +21,12 @@ bool ParserQuery::parseImpl(Pos & pos, ASTPtr & node, Expected & expected) ParserInsertQuery insert_p(end); ParserUseQuery use_p; ParserSetQuery set_p; - ParserOptimizeQuery optimize_p; ParserSystemQuery system_p; bool res = query_with_output_p.parse(pos, node, expected) || insert_p.parse(pos, node, expected) || use_p.parse(pos, node, expected) || set_p.parse(pos, node, expected) - || optimize_p.parse(pos, node, expected) || system_p.parse(pos, node, expected); return res; diff --git a/dbms/src/Parsers/ParserQueryWithOutput.cpp b/dbms/src/Parsers/ParserQueryWithOutput.cpp index e7fdc390dd6..3ec71de5f0c 100644 --- a/dbms/src/Parsers/ParserQueryWithOutput.cpp +++ b/dbms/src/Parsers/ParserQueryWithOutput.cpp @@ -10,6 +10,7 @@ #include #include #include +#include namespace DB @@ -27,6 +28,7 @@ bool ParserQueryWithOutput::parseImpl(Pos & pos, ASTPtr & node, Expected & expec ParserRenameQuery rename_p; ParserDropQuery drop_p; ParserCheckQuery check_p; + ParserOptimizeQuery optimize_p; ParserKillQueryQuery kill_query_p; ASTPtr query; @@ -41,7 +43,8 @@ bool ParserQueryWithOutput::parseImpl(Pos & pos, ASTPtr & node, Expected & expec || rename_p.parse(pos, query, expected) || drop_p.parse(pos, query, expected) || check_p.parse(pos, query, expected) - || kill_query_p.parse(pos, query, expected); + || kill_query_p.parse(pos, query, expected) + || optimize_p.parse(pos, query, expected); if (!parsed) return false; diff --git a/dbms/src/Storages/ITableDeclaration.h b/dbms/src/Storages/ITableDeclaration.h index 74d5b6db6d7..5f15ad626f7 100644 --- a/dbms/src/Storages/ITableDeclaration.h +++ b/dbms/src/Storages/ITableDeclaration.h @@ -39,9 +39,9 @@ public: */ void check(const NamesAndTypesList & columns, const Names & column_names) const; - /** Check that the data block for the record contains all the columns of the table with the correct types, + /** Check that the data block contains all the columns of the table with the correct types, * contains only the columns of the table, and all the columns are different. - * If need_all, still checks that all the columns of the table are in the block. 
+ * If need_all, checks that all the columns of the table are in the block. */ void check(const Block & block, bool need_all = false) const; diff --git a/dbms/src/Storages/MergeTree/DataPartsExchange.cpp b/dbms/src/Storages/MergeTree/DataPartsExchange.cpp index 15d1c56b051..39db6142605 100644 --- a/dbms/src/Storages/MergeTree/DataPartsExchange.cpp +++ b/dbms/src/Storages/MergeTree/DataPartsExchange.cpp @@ -161,6 +161,8 @@ MergeTreeData::MutableDataPartPtr Fetcher::fetchPart( const String & host, int port, const ConnectionTimeouts & timeouts, + const String & user, + const String & password, bool to_detached, const String & tmp_prefix_) { @@ -175,7 +177,14 @@ MergeTreeData::MutableDataPartPtr Fetcher::fetchPart( {"compress", "false"} }); - ReadWriteBufferFromHTTP in{uri, Poco::Net::HTTPRequest::HTTP_POST, {}, timeouts}; + Poco::Net::HTTPBasicCredentials creds{}; + if (!user.empty()) + { + creds.setUsername(user); + creds.setPassword(password); + } + + ReadWriteBufferFromHTTP in{uri, Poco::Net::HTTPRequest::HTTP_POST, {}, timeouts, creds}; static const String TMP_PREFIX = "tmp_fetch_"; String tmp_prefix = tmp_prefix_.empty() ? TMP_PREFIX : tmp_prefix_; diff --git a/dbms/src/Storages/MergeTree/DataPartsExchange.h b/dbms/src/Storages/MergeTree/DataPartsExchange.h index 0ebc2ec358a..32eb80e96ca 100644 --- a/dbms/src/Storages/MergeTree/DataPartsExchange.h +++ b/dbms/src/Storages/MergeTree/DataPartsExchange.h @@ -54,6 +54,8 @@ public: const String & host, int port, const ConnectionTimeouts & timeouts, + const String & user, + const String & password, bool to_detached = false, const String & tmp_prefix_ = ""); diff --git a/dbms/src/Storages/MergeTree/MergeTreeSettings.h b/dbms/src/Storages/MergeTree/MergeTreeSettings.h index aa29dccc195..43276a6dd34 100644 --- a/dbms/src/Storages/MergeTree/MergeTreeSettings.h +++ b/dbms/src/Storages/MergeTree/MergeTreeSettings.h @@ -139,7 +139,7 @@ struct MergeTreeSettings * instead of ordinary ones (dozens KB). \ * Before enabling check that all replicas support new format. \ */ \ - M(SettingBool, use_minimalistic_checksums_in_zookeeper, false) + M(SettingBool, use_minimalistic_checksums_in_zookeeper, true) /// Settings that should not change after the creation of a table. #define APPLY_FOR_IMMUTABLE_MERGE_TREE_SETTINGS(M) \ diff --git a/dbms/src/Storages/StorageDistributed.cpp b/dbms/src/Storages/StorageDistributed.cpp index 53ec36fe2c4..5805ea439f3 100644 --- a/dbms/src/Storages/StorageDistributed.cpp +++ b/dbms/src/Storages/StorageDistributed.cpp @@ -65,12 +65,15 @@ namespace ErrorCodes namespace { -/// select query has database and table names as AST pointers -/// Creates a copy of query, changes database and table names. -ASTPtr rewriteSelectQuery(const ASTPtr & query, const std::string & database, const std::string & table) +/// select query has database, table and table function names as AST pointers +/// Creates a copy of query, changes database, table and table function names. 
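The `fetchPart` change above threads an interserver user and password into the part-download HTTP request through `Poco::Net::HTTPBasicCredentials`; when the user is empty, the credentials object stays blank and no `Authorization` header is sent. A condensed sketch of the intended call path (the caller side is taken from the `StorageReplicatedMergeTree` hunks later in this patch; the surrounding declarations are assumptions):

```cpp
/// Sketch: a replica obtains the shared credentials from the Context
/// and passes them to the fetcher, which attaches them to the request.
auto timeouts = ConnectionTimeouts::getHTTPTimeouts(context.getSettingsRef());
auto [user, password] = context.getInterserverCredentials();

part = fetcher.fetchPart(part_name, replica_path,
                         address.host, address.replication_port, timeouts,
                         user, password, to_detached);
```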
+ASTPtr rewriteSelectQuery(const ASTPtr & query, const std::string & database, const std::string & table, ASTPtr table_function_ptr = nullptr) { auto modified_query_ast = query->clone(); - typeid_cast(*modified_query_ast).replaceDatabaseAndTable(database, table); + if (table_function_ptr) + typeid_cast(*modified_query_ast).addTableFunction(table_function_ptr); + else + typeid_cast(*modified_query_ast).replaceDatabaseAndTable(database, table); return modified_query_ast; } @@ -170,16 +173,48 @@ StorageDistributed::StorageDistributed( } -StoragePtr StorageDistributed::createWithOwnCluster( - const std::string & name_, +StorageDistributed::StorageDistributed( + const String & database_name, + const String & table_name_, const ColumnsDescription & columns_, - const String & remote_database_, - const String & remote_table_, + ASTPtr remote_table_function_ptr_, + const String & cluster_name_, + const Context & context_, + const ASTPtr & sharding_key_, + const String & data_path_, + bool attach) + : StorageDistributed(database_name, table_name_, columns_, String{}, String{}, cluster_name_, context_, sharding_key_, data_path_, attach) +{ + remote_table_function_ptr = remote_table_function_ptr_; +} + + +StoragePtr StorageDistributed::createWithOwnCluster( + const std::string & table_name_, + const ColumnsDescription & columns_, + const String & remote_database_, /// database on remote servers. + const String & remote_table_, /// The name of the table on the remote servers. + ClusterPtr owned_cluster_, + const Context & context_) +{ + auto res = ext::shared_ptr_helper::create( + String{}, table_name_, columns_, remote_database_, remote_table_, String{}, context_, ASTPtr(), String(), false); + + res->owned_cluster = owned_cluster_; + + return res; +} + + +StoragePtr StorageDistributed::createWithOwnCluster( + const std::string & table_name_, + const ColumnsDescription & columns_, + ASTPtr & remote_table_function_ptr_, ClusterPtr & owned_cluster_, const Context & context_) { auto res = ext::shared_ptr_helper::create( - String{}, name_, columns_, remote_database_, remote_table_, String{}, context_, ASTPtr(), String(), false); + String{}, table_name_, columns_, remote_table_function_ptr_, String{}, context_, ASTPtr(), String(), false); res->owned_cluster = owned_cluster_; @@ -209,15 +244,19 @@ BlockInputStreams StorageDistributed::read( processed_stage = result_size == 1 ? QueryProcessingStage::Complete : QueryProcessingStage::WithMergeableState; + const auto & modified_query_ast = rewriteSelectQuery( - query_info.query, remote_database, remote_table); + query_info.query, remote_database, remote_table, remote_table_function_ptr); Block header = materializeBlock(InterpreterSelectQuery(query_info.query, context, Names{}, processed_stage).getSampleBlock()); - - ClusterProxy::SelectStreamFactory select_stream_factory( - header, processed_stage, QualifiedTableName{remote_database, remote_table}, context.getExternalTables()); - + + ClusterProxy::SelectStreamFactory select_stream_factory = remote_table_function_ptr ? 
+ ClusterProxy::SelectStreamFactory( + header, processed_stage, remote_table_function_ptr, context.getExternalTables()) + : ClusterProxy::SelectStreamFactory( + header, processed_stage, QualifiedTableName{remote_database, remote_table}, context.getExternalTables()); + return ClusterProxy::executeQuery( select_stream_factory, cluster, modified_query_ast, context, settings); } diff --git a/dbms/src/Storages/StorageDistributed.h b/dbms/src/Storages/StorageDistributed.h index bdfd654ea6e..dda56bb3312 100644 --- a/dbms/src/Storages/StorageDistributed.h +++ b/dbms/src/Storages/StorageDistributed.h @@ -9,6 +9,7 @@ #include #include #include +#include #include @@ -36,8 +37,15 @@ public: static StoragePtr createWithOwnCluster( const std::string & table_name_, const ColumnsDescription & columns_, - const String & remote_database_, /// database on remote servers. - const String & remote_table_, /// The name of the table on the remote servers. + const String & remote_database_, /// database on remote servers. + const String & remote_table_, /// The name of the table on the remote servers. + ClusterPtr owned_cluster_, + const Context & context_); + + static StoragePtr createWithOwnCluster( + const std::string & table_name_, + const ColumnsDescription & columns_, + ASTPtr & remote_table_function_ptr_, /// Table function ptr. ClusterPtr & owned_cluster_, const Context & context_); @@ -101,6 +109,7 @@ public: String table_name; String remote_database; String remote_table; + ASTPtr remote_table_function_ptr; const Context & context; Logger * log = &Logger::get("StorageDistributed"); @@ -146,6 +155,17 @@ protected: const ASTPtr & sharding_key_, const String & data_path_, bool attach); + + StorageDistributed( + const String & database_name, + const String & table_name_, + const ColumnsDescription & columns_, + ASTPtr remote_table_function_ptr_, + const String & cluster_name_, + const Context & context_, + const ASTPtr & sharding_key_, + const String & data_path_, + bool attach); }; } diff --git a/dbms/src/Storages/StorageFactory.h b/dbms/src/Storages/StorageFactory.h index 2acb9fb7c00..4addfcd9794 100644 --- a/dbms/src/Storages/StorageFactory.h +++ b/dbms/src/Storages/StorageFactory.h @@ -53,6 +53,11 @@ public: /// No locking, you must register all engines before usage of get. void registerStorage(const std::string & name, Creator creator); + const auto & getAllStorages() const + { + return storages; + } + private: using Storages = std::unordered_map; Storages storages; diff --git a/dbms/src/Storages/StorageKafka.cpp b/dbms/src/Storages/StorageKafka.cpp index a9666bab22c..7823dfdd65a 100644 --- a/dbms/src/Storages/StorageKafka.cpp +++ b/dbms/src/Storages/StorageKafka.cpp @@ -62,12 +62,19 @@ class ReadBufferFromKafkaConsumer : public ReadBuffer { rd_kafka_t * consumer; rd_kafka_message_t * current; + bool current_pending; Poco::Logger * log; size_t read_messages; + char row_delimiter; bool nextImpl() override { - reset(); + if (current_pending) + { + BufferBase::set(reinterpret_cast(current->payload), current->len, 0); + current_pending = false; + return true; + } // Process next buffered message rd_kafka_message_t * msg = rd_kafka_consumer_poll(consumer, READ_POLL_MS); @@ -88,13 +95,24 @@ class ReadBufferFromKafkaConsumer : public ReadBuffer rd_kafka_message_destroy(msg); return nextImpl(); } + ++read_messages; + + // Now we've received a new message. 
Check if we need to produce a delimiter + if (row_delimiter != '\0' && current != nullptr) + { + BufferBase::set(&row_delimiter, 1, 0); + reset(); + current = msg; + current_pending = true; + return true; + } // Consume message and mark the topic/partition offset - // The offsets will be committed in the insertSuffix() method after the block is completed - // If an exception is thrown before that would occur, the client will rejoin without comitting offsets - BufferBase::set(reinterpret_cast(msg->payload), msg->len, 0); + // The offsets will be committed in the readSuffix() method after the block is completed + // If an exception is thrown before that would occur, the client will rejoin without committing offsets + reset(); current = msg; - ++read_messages; + BufferBase::set(reinterpret_cast(current->payload), current->len, 0); return true; } @@ -108,8 +126,11 @@ class ReadBufferFromKafkaConsumer : public ReadBuffer } public: - ReadBufferFromKafkaConsumer(rd_kafka_t * consumer_, Poco::Logger * log_) - : ReadBuffer(nullptr, 0), consumer(consumer_), current(nullptr), log(log_), read_messages(0) {} + ReadBufferFromKafkaConsumer(rd_kafka_t * consumer_, Poco::Logger * log_, char row_delimiter_) + : ReadBuffer(nullptr, 0), consumer(consumer_), current(nullptr), + current_pending(false), log(log_), read_messages(0), row_delimiter(row_delimiter_) { + LOG_TRACE(log, "row delimiter is :" << row_delimiter); + } ~ReadBufferFromKafkaConsumer() { reset(); } @@ -143,7 +164,7 @@ public: // Create a formatted reader on Kafka messages LOG_TRACE(storage.log, "Creating formatted reader"); - read_buf = std::make_unique(consumer->stream, storage.log); + read_buf = std::make_unique(consumer->stream, storage.log, storage.row_delimiter); reader = FormatFactory::instance().getInput(storage.format_name, *read_buf, storage.getSampleBlock(), context, max_block_size); } @@ -226,13 +247,14 @@ StorageKafka::StorageKafka( Context & context_, const ColumnsDescription & columns_, const String & brokers_, const String & group_, const Names & topics_, - const String & format_name_, const String & schema_name_, size_t num_consumers_) + const String & format_name_, char row_delimiter_, const String & schema_name_, size_t num_consumers_) : IStorage{columns_}, table_name(table_name_), database_name(database_name_), context(context_), topics(context.getMacros()->expand(topics_)), brokers(context.getMacros()->expand(brokers_)), group(context.getMacros()->expand(group_)), format_name(context.getMacros()->expand(format_name_)), + row_delimiter(row_delimiter_), schema_name(context.getMacros()->expand(schema_name_)), num_consumers(num_consumers_), log(&Logger::get("StorageKafka (" + table_name_ + ")")), semaphore(0, num_consumers_), mutex(), consumers(), event_update() @@ -552,10 +574,10 @@ void registerStorageKafka(StorageFactory & factory) * - Schema (optional, if the format supports it) */ - if (engine_args.size() < 3 || engine_args.size() > 6) + if (engine_args.size() < 3 || engine_args.size() > 7) throw Exception( - "Storage Kafka requires 3-6 parameters" - " - Kafka broker list, list of topics to consume, consumer group ID, message format, schema, number of consumers", + "Storage Kafka requires 3-7 parameters" + " - Kafka broker list, list of topics to consume, consumer group ID, message format, row delimiter, schema, number of consumers", ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH); String brokers; @@ -569,13 +591,33 @@ void registerStorageKafka(StorageFactory & factory) engine_args[2] = 
evaluateConstantExpressionOrIdentifierAsLiteral(engine_args[2], args.local_context); engine_args[3] = evaluateConstantExpressionOrIdentifierAsLiteral(engine_args[3], args.local_context); - // Parse format schema if supported (optional) - String schema; + // Parse row delimiter (optional) + char row_delimiter = '\0'; if (engine_args.size() >= 5) { engine_args[4] = evaluateConstantExpressionOrIdentifierAsLiteral(engine_args[4], args.local_context); auto ast = typeid_cast(engine_args[4].get()); + String arg; + if (ast && ast->value.getType() == Field::Types::String) + arg = safeGet(ast->value); + else + throw Exception("Row delimiter must be a char", ErrorCodes::BAD_ARGUMENTS); + if (arg.size() > 1) + throw Exception("Row delimiter must be a char", ErrorCodes::BAD_ARGUMENTS); + else if (arg.size() == 0) + row_delimiter = '\0'; + else + row_delimiter = arg[0]; + } + + // Parse format schema if supported (optional) + String schema; + if (engine_args.size() >= 6) + { + engine_args[5] = evaluateConstantExpressionOrIdentifierAsLiteral(engine_args[5], args.local_context); + + auto ast = typeid_cast(engine_args[5].get()); if (ast && ast->value.getType() == Field::Types::String) schema = safeGet(ast->value); else @@ -584,9 +626,9 @@ // Parse number of consumers (optional) UInt64 num_consumers = 1; - if (engine_args.size() >= 6) + if (engine_args.size() >= 7) { - auto ast = typeid_cast(engine_args[5].get()); + auto ast = typeid_cast(engine_args[6].get()); if (ast && ast->value.getType() == Field::Types::UInt64) num_consumers = safeGet(ast->value); else @@ -613,7 +655,7 @@ return StorageKafka::create( args.table_name, args.database_name, args.context, args.columns, - brokers, group, topics, format, schema, num_consumers); + brokers, group, topics, format, row_delimiter, schema, num_consumers); }); } diff --git a/dbms/src/Storages/StorageKafka.h b/dbms/src/Storages/StorageKafka.h index 45530517e94..9652d1d6a46 100644 --- a/dbms/src/Storages/StorageKafka.h +++ b/dbms/src/Storages/StorageKafka.h @@ -75,6 +75,9 @@ private: const String brokers; const String group; const String format_name; + // Optional row delimiter for generating a char-delimited stream + // in order to make various input stream parsers happy. 
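The row-delimiter support added above works by handing out a one-byte buffer between two consecutive Kafka messages, so that row-oriented format parsers see the terminator they expect. A simplified sketch of the state machine in `nextImpl()` (error handling and offset bookkeeping from the hunk are omitted; names follow the diff):

```cpp
/// Sketch: alternate between the delimiter byte and the message payload.
bool nextImpl()
{
    if (current_pending)
    {
        /// The previous call returned the delimiter; now hand out the payload.
        BufferBase::set(reinterpret_cast<char *>(current->payload), current->len, 0);
        current_pending = false;
        return true;
    }

    rd_kafka_message_t * msg = rd_kafka_consumer_poll(consumer, READ_POLL_MS);
    if (!msg)
        return false;

    if (row_delimiter != '\0' && current != nullptr)
    {
        /// Between two messages: emit the single delimiter byte first and
        /// remember the new message for the next call.
        BufferBase::set(&row_delimiter, 1, 0);
        reset();
        current = msg;
        current_pending = true;
        return true;
    }

    reset();
    current = msg;
    BufferBase::set(reinterpret_cast<char *>(current->payload), current->len, 0);
    return true;
}
```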
+ char row_delimiter; const String schema_name; /// Total number of consumers size_t num_consumers; @@ -109,7 +112,7 @@ protected: Context & context_, const ColumnsDescription & columns_, const String & brokers_, const String & group_, const Names & topics_, - const String & format_name_, const String & schema_name_, size_t num_consumers_); + const String & format_name_, char row_delimiter_, const String & schema_name_, size_t num_consumers_); }; } diff --git a/dbms/src/Storages/StorageMergeTree.cpp b/dbms/src/Storages/StorageMergeTree.cpp index e3f78e746b5..42774b06f1d 100644 --- a/dbms/src/Storages/StorageMergeTree.cpp +++ b/dbms/src/Storages/StorageMergeTree.cpp @@ -311,6 +311,7 @@ std::vector StorageMergeTree::getMutationsStatus() cons part_data_versions.reserve(data_parts.size()); for (const auto & part : data_parts) part_data_versions.push_back(part->info.getDataVersion()); + std::sort(part_data_versions.begin(), part_data_versions.end()); std::vector result; for (const auto & kv : current_mutations_by_version) diff --git a/dbms/src/Storages/StorageReplicatedMergeTree.cpp b/dbms/src/Storages/StorageReplicatedMergeTree.cpp index c8b8b6d9706..09fd8bbba8a 100644 --- a/dbms/src/Storages/StorageReplicatedMergeTree.cpp +++ b/dbms/src/Storages/StorageReplicatedMergeTree.cpp @@ -1971,9 +1971,10 @@ bool StorageReplicatedMergeTree::executeReplaceRange(const LogEntry & entry) String replica_path = zookeeper_path + "/replicas/" + part_desc->replica; ReplicatedMergeTreeAddress address(getZooKeeper()->get(replica_path + "/host")); auto timeouts = ConnectionTimeouts::getHTTPTimeouts(context.getSettingsRef()); + auto [user, password] = context.getInterserverCredentials(); part_desc->res_part = fetcher.fetchPart(part_desc->found_new_part_name, replica_path, - address.host, address.replication_port, timeouts, false, TMP_PREFIX + "fetch_"); + address.host, address.replication_port, timeouts, user, password, false, TMP_PREFIX + "fetch_"); /// TODO: check columns_version of fetched part @@ -2706,10 +2707,11 @@ bool StorageReplicatedMergeTree::fetchPart(const String & part_name, const Strin ReplicatedMergeTreeAddress address(getZooKeeper()->get(replica_path + "/host")); auto timeouts = ConnectionTimeouts::getHTTPTimeouts(context.getSettingsRef()); + auto [user, password] = context.getInterserverCredentials(); try { - part = fetcher.fetchPart(part_name, replica_path, address.host, address.replication_port, timeouts, to_detached); + part = fetcher.fetchPart(part_name, replica_path, address.host, address.replication_port, timeouts, user, password, to_detached); if (!to_detached) { diff --git a/dbms/src/Storages/System/IStorageSystemWithStringColumns.h b/dbms/src/Storages/System/IStorageSystemOneBlock.h similarity index 63% rename from dbms/src/Storages/System/IStorageSystemWithStringColumns.h rename to dbms/src/Storages/System/IStorageSystemOneBlock.h index 08e2f0a7bf5..96286f56eee 100644 --- a/dbms/src/Storages/System/IStorageSystemWithStringColumns.h +++ b/dbms/src/Storages/System/IStorageSystemOneBlock.h @@ -14,21 +14,15 @@ class Context; /** Base class for system tables whose all columns have String type. 
*/ template -class IStorageSystemWithStringColumns : public IStorage +class IStorageSystemOneBlock : public IStorage { protected: - virtual void fillData(MutableColumns & res_columns) const = 0; + virtual void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const = 0; public: - IStorageSystemWithStringColumns(const String & name_) : name(name_) + IStorageSystemOneBlock(const String & name_) : name(name_) { - auto names = Self::getColumnNames(); - NamesAndTypesList name_list; - for (const auto & name : names) - { - name_list.push_back(NameAndTypePair{name, std::make_shared()}); - } - setColumns(ColumnsDescription(name_list)); + setColumns(ColumnsDescription(Self::getNamesAndTypes())); } std::string getTableName() const override @@ -37,8 +31,8 @@ public: } BlockInputStreams read(const Names & column_names, - const SelectQueryInfo & /*query_info*/, - const Context & /*context*/, + const SelectQueryInfo & query_info, + const Context & context, QueryProcessingStage::Enum & processed_stage, size_t /*max_block_size*/, unsigned /*num_streams*/) override @@ -48,7 +42,7 @@ public: Block sample_block = getSampleBlock(); MutableColumns res_columns = sample_block.cloneEmptyColumns(); - fillData(res_columns); + fillData(res_columns, context, query_info); return BlockInputStreams(1, std::make_shared(sample_block.cloneWithColumns(std::move(res_columns)))); } diff --git a/dbms/src/Storages/System/StorageSystemAggregateFunctionCombinators.cpp b/dbms/src/Storages/System/StorageSystemAggregateFunctionCombinators.cpp index 9dd106ce2d7..8fa335faceb 100644 --- a/dbms/src/Storages/System/StorageSystemAggregateFunctionCombinators.cpp +++ b/dbms/src/Storages/System/StorageSystemAggregateFunctionCombinators.cpp @@ -3,12 +3,23 @@ namespace DB { -void StorageSystemAggregateFunctionCombinators::fillData(MutableColumns & res_columns) const + +NamesAndTypesList StorageSystemAggregateFunctionCombinators::getNamesAndTypes() +{ + return { + {"name", std::make_shared()}, + {"is_internal", std::make_shared()}, + }; +} + +void StorageSystemAggregateFunctionCombinators::fillData(MutableColumns & res_columns, const Context &, const SelectQueryInfo &) const { const auto & combinators = AggregateFunctionCombinatorFactory::instance().getAllAggregateFunctionCombinators(); for (const auto & pair : combinators) { res_columns[0]->insert(pair.first); + res_columns[1]->insert(UInt64(pair.second->isForInternalUsageOnly())); } } + } diff --git a/dbms/src/Storages/System/StorageSystemAggregateFunctionCombinators.h b/dbms/src/Storages/System/StorageSystemAggregateFunctionCombinators.h index 097fe93666e..1d7226eda8b 100644 --- a/dbms/src/Storages/System/StorageSystemAggregateFunctionCombinators.h +++ b/dbms/src/Storages/System/StorageSystemAggregateFunctionCombinators.h @@ -1,26 +1,25 @@ #pragma once -#include +#include +#include +#include #include namespace DB { class StorageSystemAggregateFunctionCombinators : public ext::shared_ptr_helper, - public IStorageSystemWithStringColumns + public IStorageSystemOneBlock { protected: - void fillData(MutableColumns & res_columns) const override; + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; + using IStorageSystemOneBlock::IStorageSystemOneBlock; public: - using IStorageSystemWithStringColumns::IStorageSystemWithStringColumns; std::string getName() const override { return "SystemAggregateFunctionCombinators"; } - static std::vector getColumnNames() - { - return {"name"}; - } 
+ static NamesAndTypesList getNamesAndTypes(); }; } diff --git a/dbms/src/Storages/System/StorageSystemAsynchronousMetrics.cpp b/dbms/src/Storages/System/StorageSystemAsynchronousMetrics.cpp index bc2f76379e9..059ef708a81 100644 --- a/dbms/src/Storages/System/StorageSystemAsynchronousMetrics.cpp +++ b/dbms/src/Storages/System/StorageSystemAsynchronousMetrics.cpp @@ -1,51 +1,34 @@ -#include - -#include -#include -#include #include #include -#include +#include +#include namespace DB { - -StorageSystemAsynchronousMetrics::StorageSystemAsynchronousMetrics(const std::string & name_, const AsynchronousMetrics & async_metrics_) - : name(name_), - async_metrics(async_metrics_) +NamesAndTypesList StorageSystemAsynchronousMetrics::getNamesAndTypes() { - setColumns(ColumnsDescription({ + return { {"metric", std::make_shared()}, {"value", std::make_shared()}, - })); + }; } -BlockInputStreams StorageSystemAsynchronousMetrics::read( - const Names & column_names, - const SelectQueryInfo &, - const Context &, - QueryProcessingStage::Enum & processed_stage, - const size_t /*max_block_size*/, - const unsigned /*num_streams*/) +StorageSystemAsynchronousMetrics::StorageSystemAsynchronousMetrics(const std::string & name_, const AsynchronousMetrics & async_metrics_) + : IStorageSystemOneBlock(name_), async_metrics(async_metrics_) { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); +} +void StorageSystemAsynchronousMetrics::fillData(MutableColumns & res_columns, const Context &, const SelectQueryInfo &) const +{ auto async_metrics_values = async_metrics.getValues(); - for (const auto & name_value : async_metrics_values) { res_columns[0]->insert(name_value.first); res_columns[1]->insert(name_value.second); } - - return BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } - } diff --git a/dbms/src/Storages/System/StorageSystemAsynchronousMetrics.h b/dbms/src/Storages/System/StorageSystemAsynchronousMetrics.h index 60e50096143..853cb97c974 100644 --- a/dbms/src/Storages/System/StorageSystemAsynchronousMetrics.h +++ b/dbms/src/Storages/System/StorageSystemAsynchronousMetrics.h @@ -1,8 +1,7 @@ #pragma once #include -#include - +#include namespace DB { @@ -13,26 +12,20 @@ class Context; /** Implements system table asynchronous_metrics, which allows to get values of periodically (asynchronously) updated metrics. 
*/ -class StorageSystemAsynchronousMetrics : public ext::shared_ptr_helper, public IStorage +class StorageSystemAsynchronousMetrics : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: std::string getName() const override { return "SystemAsynchronousMetrics"; } - std::string getTableName() const override { return name; } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; + static NamesAndTypesList getNamesAndTypes(); private: - const std::string name; const AsynchronousMetrics & async_metrics; protected: StorageSystemAsynchronousMetrics(const std::string & name_, const AsynchronousMetrics & async_metrics_); + + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; }; } diff --git a/dbms/src/Storages/System/StorageSystemBuildOptions.cpp b/dbms/src/Storages/System/StorageSystemBuildOptions.cpp index e62e6e9bbfd..2a8ffc947be 100644 --- a/dbms/src/Storages/System/StorageSystemBuildOptions.cpp +++ b/dbms/src/Storages/System/StorageSystemBuildOptions.cpp @@ -1,46 +1,26 @@ -#include +#include #include -#include -#include #include #include -#include namespace DB { - -StorageSystemBuildOptions::StorageSystemBuildOptions(const std::string & name_) - : name(name_) +NamesAndTypesList StorageSystemBuildOptions::getNamesAndTypes() { - setColumns(ColumnsDescription({ - { "name", std::make_shared() }, - { "value", std::make_shared() }, - })); + return { + {"name", std::make_shared()}, + {"value", std::make_shared()}, + }; } - -BlockInputStreams StorageSystemBuildOptions::read( - const Names & column_names, - const SelectQueryInfo &, - const Context &, - QueryProcessingStage::Enum & processed_stage, - const size_t /*max_block_size*/, - const unsigned /*num_streams*/) +void StorageSystemBuildOptions::fillData(MutableColumns & res_columns, const Context &, const SelectQueryInfo &) const { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); - for (auto it = auto_config_build; *it; it += 2) { res_columns[0]->insert(String(it[0])); res_columns[1]->insert(String(it[1])); } - - return BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } - } diff --git a/dbms/src/Storages/System/StorageSystemBuildOptions.h b/dbms/src/Storages/System/StorageSystemBuildOptions.h index d772b255383..749ffbddbaf 100644 --- a/dbms/src/Storages/System/StorageSystemBuildOptions.h +++ b/dbms/src/Storages/System/StorageSystemBuildOptions.h @@ -1,7 +1,7 @@ #pragma once #include -#include +#include namespace DB @@ -12,25 +12,18 @@ class Context; /** System table "build_options" with many params used for clickhouse building */ -class StorageSystemBuildOptions : public ext::shared_ptr_helper, public IStorage +class StorageSystemBuildOptions : public ext::shared_ptr_helper, public IStorageSystemOneBlock { -public: - std::string getName() const override { return "SystemBuildOptions"; } - std::string getTableName() const override { return name; } - - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; - -private: - const std::string name; - protected: - 
StorageSystemBuildOptions(const std::string & name_); + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; + + using IStorageSystemOneBlock::IStorageSystemOneBlock; + +public: + + std::string getName() const override { return "SystemBuildOptions"; } + + static NamesAndTypesList getNamesAndTypes(); }; } diff --git a/dbms/src/Storages/System/StorageSystemClusters.cpp b/dbms/src/Storages/System/StorageSystemClusters.cpp index fb5c4e41b82..3527de302a1 100644 --- a/dbms/src/Storages/System/StorageSystemClusters.cpp +++ b/dbms/src/Storages/System/StorageSystemClusters.cpp @@ -1,50 +1,32 @@ -#include -#include -#include -#include +#include #include #include -#include -#include +#include #include +#include namespace DB { - -StorageSystemClusters::StorageSystemClusters(const std::string & name_) - : name(name_) +NamesAndTypesList StorageSystemClusters::getNamesAndTypes() { - setColumns(ColumnsDescription({ - { "cluster", std::make_shared() }, - { "shard_num", std::make_shared() }, - { "shard_weight", std::make_shared() }, - { "replica_num", std::make_shared() }, - { "host_name", std::make_shared() }, - { "host_address", std::make_shared() }, - { "port", std::make_shared() }, - { "is_local", std::make_shared() }, - { "user", std::make_shared() }, - { "default_database", std::make_shared() }, - })); + return { + {"cluster", std::make_shared()}, + {"shard_num", std::make_shared()}, + {"shard_weight", std::make_shared()}, + {"replica_num", std::make_shared()}, + {"host_name", std::make_shared()}, + {"host_address", std::make_shared()}, + {"port", std::make_shared()}, + {"is_local", std::make_shared()}, + {"user", std::make_shared()}, + {"default_database", std::make_shared()}, + }; } - -BlockInputStreams StorageSystemClusters::read( - const Names & column_names, - const SelectQueryInfo &, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - const size_t /*max_block_size*/, - const unsigned /*num_streams*/) +void StorageSystemClusters::fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo &) const { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); - - auto updateColumns = [&](const std::string & cluster_name, const Cluster::ShardInfo & shard_info, - const Cluster::Address & address) + auto updateColumns = [&](const std::string & cluster_name, const Cluster::ShardInfo & shard_info, const Cluster::Address & address) { size_t i = 0; res_columns[i++]->insert(cluster_name); @@ -85,8 +67,5 @@ BlockInputStreams StorageSystemClusters::read( } } } - - return BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } - } diff --git a/dbms/src/Storages/System/StorageSystemClusters.h b/dbms/src/Storages/System/StorageSystemClusters.h index 1e36269ded2..dde9e53b626 100644 --- a/dbms/src/Storages/System/StorageSystemClusters.h +++ b/dbms/src/Storages/System/StorageSystemClusters.h @@ -1,7 +1,9 @@ #pragma once +#include +#include #include -#include +#include namespace DB @@ -13,25 +15,17 @@ class Context; * that allows to obtain information about available clusters * (which may be specified in Distributed tables). 
*/ -class StorageSystemClusters : public ext::shared_ptr_helper, public IStorage +class StorageSystemClusters : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: std::string getName() const override { return "SystemClusters"; } - std::string getTableName() const override { return name; } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; - -private: - const std::string name; + static NamesAndTypesList getNamesAndTypes(); protected: - StorageSystemClusters(const std::string & name_); + using IStorageSystemOneBlock::IStorageSystemOneBlock; + + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; }; } diff --git a/dbms/src/Storages/System/StorageSystemCollations.cpp b/dbms/src/Storages/System/StorageSystemCollations.cpp index b75a2e70298..f2a7f5e8184 100644 --- a/dbms/src/Storages/System/StorageSystemCollations.cpp +++ b/dbms/src/Storages/System/StorageSystemCollations.cpp @@ -3,11 +3,18 @@ namespace DB { -void StorageSystemCollations::fillData(MutableColumns & res_columns) const + +NamesAndTypesList StorageSystemCollations::getNamesAndTypes() { - for (const auto & collation : Collator::getAvailableCollations()) - { - res_columns[0]->insert(collation); - } + return { + {"name", std::make_shared()}, + }; } + +void StorageSystemCollations::fillData(MutableColumns & res_columns, const Context &, const SelectQueryInfo &) const +{ + for (const auto & collation_name : Collator::getAvailableCollations()) + res_columns[0]->insert(collation_name); +} + } diff --git a/dbms/src/Storages/System/StorageSystemCollations.h b/dbms/src/Storages/System/StorageSystemCollations.h index a10b19bb954..f8b7b6ee3af 100644 --- a/dbms/src/Storages/System/StorageSystemCollations.h +++ b/dbms/src/Storages/System/StorageSystemCollations.h @@ -1,26 +1,22 @@ #pragma once -#include +#include #include namespace DB { + class StorageSystemCollations : public ext::shared_ptr_helper, - public IStorageSystemWithStringColumns + public IStorageSystemOneBlock { protected: - void fillData(MutableColumns & res_columns) const override; + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; + using IStorageSystemOneBlock::IStorageSystemOneBlock; public: - using IStorageSystemWithStringColumns::IStorageSystemWithStringColumns; - std::string getName() const override - { - return "SystemTableCollations"; - } + std::string getName() const override { return "SystemTableCollations"; } - static std::vector getColumnNames() - { - return {"name"}; - } + static NamesAndTypesList getNamesAndTypes(); }; + } diff --git a/dbms/src/Storages/System/StorageSystemColumns.cpp b/dbms/src/Storages/System/StorageSystemColumns.cpp index d42d8a80394..1a5fc74d324 100644 --- a/dbms/src/Storages/System/StorageSystemColumns.cpp +++ b/dbms/src/Storages/System/StorageSystemColumns.cpp @@ -15,10 +15,9 @@ namespace DB { -StorageSystemColumns::StorageSystemColumns(const std::string & name_) - : name(name_) +NamesAndTypesList StorageSystemColumns::getNamesAndTypes() { - setColumns(ColumnsDescription({ + return { { "database", std::make_shared() }, { "table", std::make_shared() }, { "name", std::make_shared() }, @@ -28,21 +27,11 @@ StorageSystemColumns::StorageSystemColumns(const std::string & name_) { "data_compressed_bytes", 
std::make_shared() }, { "data_uncompressed_bytes", std::make_shared() }, { "marks_bytes", std::make_shared() }, - })); + }; } - -BlockInputStreams StorageSystemColumns::read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - const size_t /*max_block_size*/, - const unsigned /*num_streams*/) +void StorageSystemColumns::fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - Block block_to_filter; std::map, StoragePtr> storages; @@ -60,7 +49,7 @@ BlockInputStreams StorageSystemColumns::read( VirtualColumnUtils::filterBlockWithQuery(query_info.query, block_to_filter, context); if (!block_to_filter.rows()) - return BlockInputStreams(); + return; ColumnPtr database_column = block_to_filter.getByName("database").column; size_t rows = database_column->size(); @@ -98,14 +87,12 @@ BlockInputStreams StorageSystemColumns::read( VirtualColumnUtils::filterBlockWithQuery(query_info.query, block_to_filter, context); if (!block_to_filter.rows()) - return BlockInputStreams(); + return; ColumnPtr filtered_database_column = block_to_filter.getByName("database").column; ColumnPtr filtered_table_column = block_to_filter.getByName("table").column; /// We compose the result. - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); - size_t rows = filtered_database_column->size(); for (size_t i = 0; i < rows; ++i) { @@ -193,8 +180,6 @@ BlockInputStreams StorageSystemColumns::read( } } } - - return BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } } diff --git a/dbms/src/Storages/System/StorageSystemColumns.h b/dbms/src/Storages/System/StorageSystemColumns.h index ba187f7306f..dc1afc3f71a 100644 --- a/dbms/src/Storages/System/StorageSystemColumns.h +++ b/dbms/src/Storages/System/StorageSystemColumns.h @@ -1,7 +1,7 @@ #pragma once #include -#include +#include namespace DB @@ -11,25 +11,17 @@ class Context; /** Implements system table 'columns', that allows to get information about columns for every table. 
*/ -class StorageSystemColumns : public ext::shared_ptr_helper, public IStorage +class StorageSystemColumns : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: std::string getName() const override { return "SystemColumns"; } - std::string getTableName() const override { return name; } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; + static NamesAndTypesList getNamesAndTypes(); protected: - StorageSystemColumns(const std::string & name_); + using IStorageSystemOneBlock::IStorageSystemOneBlock; -private: - const std::string name; + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; }; } diff --git a/dbms/src/Storages/System/StorageSystemDataTypeFamilies.cpp b/dbms/src/Storages/System/StorageSystemDataTypeFamilies.cpp index c2041ac044c..c8d692fddd8 100644 --- a/dbms/src/Storages/System/StorageSystemDataTypeFamilies.cpp +++ b/dbms/src/Storages/System/StorageSystemDataTypeFamilies.cpp @@ -1,91 +1,35 @@ #include #include -#include -#include +#include +#include #include -#include -#include - -#include namespace DB { -namespace + +NamesAndTypesList StorageSystemDataTypeFamilies::getNamesAndTypes() { - String getPropertiesAsString(const DataTypePtr data_type) - { - std::vector properties; - if (data_type->isParametric()) - properties.push_back("parametric"); - if (data_type->haveSubtypes()) - properties.push_back("have_subtypes"); - if (data_type->cannotBeStoredInTables()) - properties.push_back("cannot_be_stored_in_tables"); - if (data_type->isComparable()) - properties.push_back("comparable"); - if (data_type->canBeComparedWithCollation()) - properties.push_back("can_be_compared_with_collation"); - if (data_type->canBeUsedAsVersion()) - properties.push_back("can_be_used_as_version"); - if (data_type->isSummable()) - properties.push_back("summable"); - if (data_type->canBeUsedInBitOperations()) - properties.push_back("can_be_used_in_bit_operations"); - if (data_type->canBeUsedInBooleanContext()) - properties.push_back("can_be_used_in_boolean_context"); - if (data_type->isValueRepresentedByNumber()) - properties.push_back("value_represented_by_number"); - if (data_type->isCategorial()) - properties.push_back("categorial"); - if (data_type->isNullable()) - properties.push_back("nullable"); - if (data_type->onlyNull()) - properties.push_back("only_null"); - if (data_type->canBeInsideNullable()) - properties.push_back("can_be_inside_nullable"); - return boost::algorithm::join(properties, ","); - } - ASTPtr createFakeEnumCreationAst() - { - String fakename{"e"}; - ASTPtr name = std::make_shared(Field(fakename.c_str(), fakename.size())); - ASTPtr value = std::make_shared(Field(UInt64(1))); - ASTPtr ast_func = makeASTFunction("equals", name, value); - ASTPtr clone = ast_func->clone(); - clone->children.clear(); - clone->children.push_back(ast_func); - return clone; - } + return { + {"name", std::make_shared()}, + {"case_insensitive", std::make_shared()}, + {"alias_to", std::make_shared()}, + }; } -void StorageSystemDataTypeFamilies::fillData(MutableColumns & res_columns) const +void StorageSystemDataTypeFamilies::fillData(MutableColumns & res_columns, const Context &, const SelectQueryInfo &) const { const auto & factory = DataTypeFactory::instance(); - const auto & data_types = factory.getAllDataTypes(); - for (const auto & pair : 
data_types) + auto names = factory.getAllRegisteredNames(); + for (const auto & name : names) { - res_columns[0]->insert(pair.first); + res_columns[0]->insert(name); + res_columns[1]->insert(UInt64(factory.isCaseInsensitive(name))); - try - { - DataTypePtr type_ptr; - //special case with enum, because it has arguments but it's properties doesn't - //depend on arguments - if (boost::starts_with(pair.first, "Enum")) - { - type_ptr = factory.get(pair.first, createFakeEnumCreationAst()); - } - else - { - type_ptr = factory.get(pair.first); - } - - res_columns[1]->insert(getPropertiesAsString(type_ptr)); - } - catch (Exception & ex) - { - res_columns[1]->insert(String{"depends_on_arguments"}); - } + if (factory.isAlias(name)) + res_columns[2]->insert(factory.aliasTo(name)); + else + res_columns[2]->insert(String("")); } } + } diff --git a/dbms/src/Storages/System/StorageSystemDataTypeFamilies.h b/dbms/src/Storages/System/StorageSystemDataTypeFamilies.h index 38b769e6e1c..365e2790699 100644 --- a/dbms/src/Storages/System/StorageSystemDataTypeFamilies.h +++ b/dbms/src/Storages/System/StorageSystemDataTypeFamilies.h @@ -1,25 +1,23 @@ #pragma once -#include + #include +#include + namespace DB { + class StorageSystemDataTypeFamilies : public ext::shared_ptr_helper, - public IStorageSystemWithStringColumns + public IStorageSystemOneBlock { protected: - void fillData(MutableColumns & res_columns) const override; + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; + + using IStorageSystemOneBlock::IStorageSystemOneBlock; public: - using IStorageSystemWithStringColumns::IStorageSystemWithStringColumns; + std::string getName() const override { return "SystemTableDataTypeFamilies"; } - std::string getName() const override - { - return "SystemTableDataTypeFamilies"; - } - - static std::vector getColumnNames() - { - return {"name", "properties"}; - } + static NamesAndTypesList getNamesAndTypes(); }; + } diff --git a/dbms/src/Storages/System/StorageSystemDatabases.cpp b/dbms/src/Storages/System/StorageSystemDatabases.cpp index 49c78688616..4df3d360a3b 100644 --- a/dbms/src/Storages/System/StorageSystemDatabases.cpp +++ b/dbms/src/Storages/System/StorageSystemDatabases.cpp @@ -1,40 +1,24 @@ -#include -#include -#include #include -#include +#include #include +#include namespace DB { - -StorageSystemDatabases::StorageSystemDatabases(const std::string & name_) - : name(name_) +NamesAndTypesList StorageSystemDatabases::getNamesAndTypes() { - setColumns(ColumnsDescription({ + return { {"name", std::make_shared()}, {"engine", std::make_shared()}, {"data_path", std::make_shared()}, {"metadata_path", std::make_shared()}, - })); + }; } - -BlockInputStreams StorageSystemDatabases::read( - const Names & column_names, - const SelectQueryInfo &, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - const size_t /*max_block_size*/, - const unsigned /*num_streams*/) +void StorageSystemDatabases::fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo &) const { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); - auto databases = context.getDatabases(); for (const auto & database : databases) { @@ -43,9 +27,6 @@ BlockInputStreams StorageSystemDatabases::read( res_columns[2]->insert(database.second->getDataPath()); res_columns[3]->insert(database.second->getMetadataPath()); } - - return 
BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } - } diff --git a/dbms/src/Storages/System/StorageSystemDatabases.h b/dbms/src/Storages/System/StorageSystemDatabases.h index 621e490963a..c83f5a72efc 100644 --- a/dbms/src/Storages/System/StorageSystemDatabases.h +++ b/dbms/src/Storages/System/StorageSystemDatabases.h @@ -1,7 +1,7 @@ #pragma once #include -#include +#include namespace DB @@ -12,25 +12,20 @@ class Context; /** Implements `databases` system table, which allows you to get information about all databases. */ -class StorageSystemDatabases : public ext::shared_ptr_helper, public IStorage +class StorageSystemDatabases : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: - std::string getName() const override { return "SystemDatabases"; } - std::string getTableName() const override { return name; } + std::string getName() const override + { + return "SystemDatabases"; + } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; - -private: - const std::string name; + static NamesAndTypesList getNamesAndTypes(); protected: - StorageSystemDatabases(const std::string & name_); + using IStorageSystemOneBlock::IStorageSystemOneBlock; + + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo &) const override; }; } diff --git a/dbms/src/Storages/System/StorageSystemDictionaries.cpp b/dbms/src/Storages/System/StorageSystemDictionaries.cpp index c57c1c7f459..665b992c829 100644 --- a/dbms/src/Storages/System/StorageSystemDictionaries.cpp +++ b/dbms/src/Storages/System/StorageSystemDictionaries.cpp @@ -1,27 +1,23 @@ -#include -#include -#include #include #include -#include -#include -#include -#include -#include +#include +#include #include #include #include +#include #include +#include + #include #include namespace DB { -StorageSystemDictionaries::StorageSystemDictionaries(const std::string & name) - : name{name} +NamesAndTypesList StorageSystemDictionaries::getNamesAndTypes() { - setColumns(ColumnsDescription({ + return { { "name", std::make_shared() }, { "origin", std::make_shared() }, { "type", std::make_shared() }, @@ -36,27 +32,14 @@ StorageSystemDictionaries::StorageSystemDictionaries(const std::string & name) { "creation_time", std::make_shared() }, { "source", std::make_shared() }, { "last_exception", std::make_shared() }, - })); + }; } - -BlockInputStreams StorageSystemDictionaries::read( - const Names & column_names, - const SelectQueryInfo &, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - const size_t, - const unsigned) +void StorageSystemDictionaries::fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo &) const { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - const auto & external_dictionaries = context.getExternalDictionaries(); auto objects_map = external_dictionaries.getObjectsMap(); const auto & dictionaries = objects_map.get(); - - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); - for (const auto & dict_info : dictionaries) { size_t i = 0; @@ -102,8 +85,6 @@ BlockInputStreams StorageSystemDictionaries::read( else res_columns[i++]->insertDefault(); } - - return BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } } diff --git 
a/dbms/src/Storages/System/StorageSystemDictionaries.h b/dbms/src/Storages/System/StorageSystemDictionaries.h index 57ac9b0b6eb..87df9ceada7 100644 --- a/dbms/src/Storages/System/StorageSystemDictionaries.h +++ b/dbms/src/Storages/System/StorageSystemDictionaries.h @@ -1,7 +1,7 @@ #pragma once #include -#include +#include namespace DB @@ -10,25 +10,17 @@ namespace DB class Context; -class StorageSystemDictionaries : public ext::shared_ptr_helper, public IStorage +class StorageSystemDictionaries : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: std::string getName() const override { return "SystemDictionaries"; } - std::string getTableName() const override { return name; } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; - -private: - const std::string name; + static NamesAndTypesList getNamesAndTypes(); protected: - StorageSystemDictionaries(const std::string & name); + using IStorageSystemOneBlock::IStorageSystemOneBlock; + + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; }; } diff --git a/dbms/src/Storages/System/StorageSystemEvents.cpp b/dbms/src/Storages/System/StorageSystemEvents.cpp index 1dc49ad37b2..eb4832c0c92 100644 --- a/dbms/src/Storages/System/StorageSystemEvents.cpp +++ b/dbms/src/Storages/System/StorageSystemEvents.cpp @@ -1,39 +1,21 @@ #include -#include #include #include -#include #include - namespace DB { - -StorageSystemEvents::StorageSystemEvents(const std::string & name_) - : name(name_) +NamesAndTypesList StorageSystemEvents::getNamesAndTypes() { - setColumns(ColumnsDescription( - { + return { {"event", std::make_shared()}, {"value", std::make_shared()}, - })); + }; } - -BlockInputStreams StorageSystemEvents::read( - const Names & column_names, - const SelectQueryInfo &, - const Context &, - QueryProcessingStage::Enum & processed_stage, - const size_t /*max_block_size*/, - const unsigned /*num_streams*/) +void StorageSystemEvents::fillData(MutableColumns & res_columns, const Context &, const SelectQueryInfo &) const { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); - for (size_t i = 0, end = ProfileEvents::end(); i < end; ++i) { UInt64 value = ProfileEvents::counters[i]; @@ -44,9 +26,6 @@ BlockInputStreams StorageSystemEvents::read( res_columns[1]->insert(value); } } - - return BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } - } diff --git a/dbms/src/Storages/System/StorageSystemEvents.h b/dbms/src/Storages/System/StorageSystemEvents.h index b987151e400..5b02b7739f1 100644 --- a/dbms/src/Storages/System/StorageSystemEvents.h +++ b/dbms/src/Storages/System/StorageSystemEvents.h @@ -1,8 +1,7 @@ #pragma once #include -#include - +#include namespace DB { @@ -12,25 +11,17 @@ class Context; /** Implements `events` system table, which allows you to obtain information for profiling. 
*/ -class StorageSystemEvents : public ext::shared_ptr_helper, public IStorage +class StorageSystemEvents : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: std::string getName() const override { return "SystemEvents"; } - std::string getTableName() const override { return name; } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; - -private: - const std::string name; + static NamesAndTypesList getNamesAndTypes(); protected: - StorageSystemEvents(const std::string & name_); + using IStorageSystemOneBlock::IStorageSystemOneBlock; + + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; }; } diff --git a/dbms/src/Storages/System/StorageSystemFormats.cpp b/dbms/src/Storages/System/StorageSystemFormats.cpp index b029e354cc2..96ce7ea7ed9 100644 --- a/dbms/src/Storages/System/StorageSystemFormats.cpp +++ b/dbms/src/Storages/System/StorageSystemFormats.cpp @@ -1,30 +1,32 @@ +#include +#include #include #include namespace DB { -void StorageSystemFormats::fillData(MutableColumns & res_columns) const + +NamesAndTypesList StorageSystemFormats::getNamesAndTypes() +{ + return { + {"name", std::make_shared()}, + {"is_input", std::make_shared()}, + {"is_output", std::make_shared()}, + }; +} + +void StorageSystemFormats::fillData(MutableColumns & res_columns, const Context &, const SelectQueryInfo &) const { const auto & formats = FormatFactory::instance().getAllFormats(); for (const auto & pair : formats) { const auto & [name, creator_pair] = pair; - bool has_input_format = (creator_pair.first != nullptr); - bool has_output_format = (creator_pair.second != nullptr); + UInt64 has_input_format(creator_pair.first != nullptr); + UInt64 has_output_format(creator_pair.second != nullptr); res_columns[0]->insert(name); - std::string format_type; - if (has_input_format) - format_type = "input"; - - if (has_output_format) - { - if (!format_type.empty()) - format_type += "/output"; - else - format_type = "output"; - } - - res_columns[1]->insert(format_type); + res_columns[1]->insert(has_input_format); + res_columns[2]->insert(has_output_format); } } + } diff --git a/dbms/src/Storages/System/StorageSystemFormats.h b/dbms/src/Storages/System/StorageSystemFormats.h index 35634bdbc21..82f8303b5b0 100644 --- a/dbms/src/Storages/System/StorageSystemFormats.h +++ b/dbms/src/Storages/System/StorageSystemFormats.h @@ -1,26 +1,23 @@ #pragma once -#include +#include #include namespace DB { -class StorageSystemFormats : public ext::shared_ptr_helper, public IStorageSystemWithStringColumns +class StorageSystemFormats : public ext::shared_ptr_helper, public IStorageSystemOneBlock { protected: - void fillData(MutableColumns & res_columns) const override; + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; + + using IStorageSystemOneBlock::IStorageSystemOneBlock; public: - using IStorageSystemWithStringColumns::IStorageSystemWithStringColumns; std::string getName() const override { return "SystemFormats"; } - static std::vector getColumnNames() - { - return {"name", "type"}; - } - + static NamesAndTypesList getNamesAndTypes(); }; } diff --git a/dbms/src/Storages/System/StorageSystemFunctions.cpp b/dbms/src/Storages/System/StorageSystemFunctions.cpp index 909bf1d8089..f63d0b9b932 100644 --- 
a/dbms/src/Storages/System/StorageSystemFunctions.cpp +++ b/dbms/src/Storages/System/StorageSystemFunctions.cpp @@ -1,56 +1,53 @@ -#include -#include -#include #include -#include -#include #include #include -#include +#include +#include #include +#include namespace DB { - -StorageSystemFunctions::StorageSystemFunctions(const std::string & name_) - : name(name_) +namespace { - setColumns(ColumnsDescription({ - { "name", std::make_shared() }, - { "is_aggregate", std::make_shared() }, - })); + template + void fillRow(MutableColumns & res_columns, const String & name, UInt64 is_aggregate, const Factory & f) + { + res_columns[0]->insert(name); + res_columns[1]->insert(is_aggregate); + res_columns[2]->insert(UInt64(f.isCaseInsensitive(name))); + if (f.isAlias(name)) + res_columns[3]->insert(f.aliasTo(name)); + else + res_columns[3]->insert(String{}); + } } - -BlockInputStreams StorageSystemFunctions::read( - const Names & column_names, - const SelectQueryInfo &, - const Context &, - QueryProcessingStage::Enum & processed_stage, - const size_t /*max_block_size*/, - const unsigned /*num_streams*/) +NamesAndTypesList StorageSystemFunctions::getNamesAndTypes() { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; + return { + {"name", std::make_shared()}, + {"is_aggregate", std::make_shared()}, + {"case_insensitive", std::make_shared()}, + {"alias_to", std::make_shared()}, + }; +} - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); - - const auto & functions = FunctionFactory::instance().functions; - for (const auto & it : functions) +void StorageSystemFunctions::fillData(MutableColumns & res_columns, const Context &, const SelectQueryInfo &) const +{ + const auto & functions_factory = FunctionFactory::instance(); + const auto & function_names = functions_factory.getAllRegisteredNames(); + for (const auto & name : function_names) { - res_columns[0]->insert(it.first); - res_columns[1]->insert(UInt64(0)); + fillRow(res_columns, name, UInt64(0), functions_factory); } - const auto & aggregate_functions = AggregateFunctionFactory::instance().aggregate_functions; - for (const auto & it : aggregate_functions) + const auto & aggregate_functions_factory = AggregateFunctionFactory::instance(); + const auto & aggregate_function_names = aggregate_functions_factory.getAllRegisteredNames(); + for (const auto & name : aggregate_function_names) { - res_columns[0]->insert(it.first); - res_columns[1]->insert(UInt64(1)); + fillRow(res_columns, name, UInt64(1), aggregate_functions_factory); } - - return BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } - } diff --git a/dbms/src/Storages/System/StorageSystemFunctions.h b/dbms/src/Storages/System/StorageSystemFunctions.h index f77b9536453..baead3d8186 100644 --- a/dbms/src/Storages/System/StorageSystemFunctions.h +++ b/dbms/src/Storages/System/StorageSystemFunctions.h @@ -1,7 +1,7 @@ #pragma once #include -#include +#include namespace DB @@ -13,25 +13,17 @@ class Context; /** Implements `functions`system table, which allows you to get a list * all normal and aggregate functions. 
*/ -class StorageSystemFunctions : public ext::shared_ptr_helper, public IStorage +class StorageSystemFunctions : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: std::string getName() const override { return "SystemFunctions"; } - std::string getTableName() const override { return name; } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; + static NamesAndTypesList getNamesAndTypes(); protected: - StorageSystemFunctions(const std::string & name_); + using IStorageSystemOneBlock::IStorageSystemOneBlock; -private: - const std::string name; + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; }; } diff --git a/dbms/src/Storages/System/StorageSystemGraphite.cpp b/dbms/src/Storages/System/StorageSystemGraphite.cpp index c9ea685366b..7eab731bd12 100644 --- a/dbms/src/Storages/System/StorageSystemGraphite.cpp +++ b/dbms/src/Storages/System/StorageSystemGraphite.cpp @@ -124,10 +124,9 @@ static Strings getAllGraphiteSections(const AbstractConfiguration & config) } // namespace -StorageSystemGraphite::StorageSystemGraphite(const std::string & name_) - : name(name_) +NamesAndTypesList StorageSystemGraphite::getNamesAndTypes() { - setColumns(ColumnsDescription({ + return { {"config_name", std::make_shared()}, {"regexp", std::make_shared()}, {"function", std::make_shared()}, @@ -135,23 +134,12 @@ StorageSystemGraphite::StorageSystemGraphite(const std::string & name_) {"precision", std::make_shared()}, {"priority", std::make_shared()}, {"is_default", std::make_shared()}, - })); + }; } -BlockInputStreams StorageSystemGraphite::read( - const Names & column_names, - const SelectQueryInfo &, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t /*max_block_size*/, - unsigned /*num_streams*/) +void StorageSystemGraphite::fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo &) const { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); - const auto & config = context.getConfigRef(); Strings sections = getAllGraphiteSections(config); @@ -172,8 +160,6 @@ BlockInputStreams StorageSystemGraphite::read( } } } - - return BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } } diff --git a/dbms/src/Storages/System/StorageSystemGraphite.h b/dbms/src/Storages/System/StorageSystemGraphite.h index 8c7a625de54..fa63c839857 100644 --- a/dbms/src/Storages/System/StorageSystemGraphite.h +++ b/dbms/src/Storages/System/StorageSystemGraphite.h @@ -1,31 +1,24 @@ #pragma once -#include +#include +#include #include namespace DB { /// Provides information about Graphite configuration. 
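Every storage converted in this patch follows the same shape. A minimal sketch of that pattern, using a hypothetical table name; the shared_ptr_helper and IStorageSystemOneBlock template arguments and the exact include paths are assumptions filled in here, not taken verbatim from the patch:

// Sketch only: a hypothetical system table in the new style.
#include <Core/NamesAndTypes.h>
#include <DataTypes/DataTypeString.h>
#include <Storages/System/IStorageSystemOneBlock.h>
#include <ext/shared_ptr_helper.h>

namespace DB
{

class StorageSystemExample : public ext::shared_ptr_helper<StorageSystemExample>,
                             public IStorageSystemOneBlock<StorageSystemExample>
{
public:
    std::string getName() const override { return "SystemExample"; }

    /// The base class builds the sample block from this list and implements read() on top of it.
    static NamesAndTypesList getNamesAndTypes()
    {
        return {{"name", std::make_shared<DataTypeString>()}};
    }

protected:
    using IStorageSystemOneBlock::IStorageSystemOneBlock;

    /// Only the row-producing logic is left to each table.
    void fillData(MutableColumns & res_columns, const Context &, const SelectQueryInfo &) const override
    {
        res_columns[0]->insert(String("example"));
    }
};

}

The Graphite storage that follows is converted along exactly these lines.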
-class StorageSystemGraphite : public ext::shared_ptr_helper, public IStorage +class StorageSystemGraphite : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: std::string getName() const override { return "SystemGraphite"; } - std::string getTableName() const override { return name; } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; - -private: - const std::string name; + static NamesAndTypesList getNamesAndTypes(); protected: - StorageSystemGraphite(const std::string & name_); + using IStorageSystemOneBlock::IStorageSystemOneBlock; + + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; }; } diff --git a/dbms/src/Storages/System/StorageSystemMacros.cpp b/dbms/src/Storages/System/StorageSystemMacros.cpp index 456730bde4b..8e6420add8b 100644 --- a/dbms/src/Storages/System/StorageSystemMacros.cpp +++ b/dbms/src/Storages/System/StorageSystemMacros.cpp @@ -1,38 +1,21 @@ #include -#include -#include -#include -#include #include +#include namespace DB { - -StorageSystemMacros::StorageSystemMacros(const std::string & name_) - : name(name_) +NamesAndTypesList StorageSystemMacros::getNamesAndTypes() { - setColumns(ColumnsDescription({ - {"macro", std::make_shared()}, - {"substitution", std::make_shared()}, - })); + return { + {"macro", std::make_shared()}, + {"substitution", std::make_shared()}, + }; } - -BlockInputStreams StorageSystemMacros::read( - const Names & column_names, - const SelectQueryInfo &, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - const size_t /*max_block_size*/, - const unsigned /*num_streams*/) +void StorageSystemMacros::fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo &) const { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); - auto macros = context.getMacros(); for (const auto & macro : macros->getMacroMap()) @@ -40,9 +23,6 @@ BlockInputStreams StorageSystemMacros::read( res_columns[0]->insert(macro.first); res_columns[1]->insert(macro.second); } - - return BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } - } diff --git a/dbms/src/Storages/System/StorageSystemMacros.h b/dbms/src/Storages/System/StorageSystemMacros.h index d4bb5ab3732..fdc091dfe1b 100644 --- a/dbms/src/Storages/System/StorageSystemMacros.h +++ b/dbms/src/Storages/System/StorageSystemMacros.h @@ -1,7 +1,8 @@ #pragma once +#include #include -#include +#include namespace DB @@ -12,25 +13,17 @@ class Context; /** Information about macros for introspection. 
*/ -class StorageSystemMacros : public ext::shared_ptr_helper, public IStorage +class StorageSystemMacros : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: std::string getName() const override { return "SystemMacros"; } - std::string getTableName() const override { return name; } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; - -private: - const std::string name; + static NamesAndTypesList getNamesAndTypes(); protected: - StorageSystemMacros(const std::string & name_); + using IStorageSystemOneBlock::IStorageSystemOneBlock; + + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; }; } diff --git a/dbms/src/Storages/System/StorageSystemMerges.cpp b/dbms/src/Storages/System/StorageSystemMerges.cpp index d3af993e29e..29d54b83e32 100644 --- a/dbms/src/Storages/System/StorageSystemMerges.cpp +++ b/dbms/src/Storages/System/StorageSystemMerges.cpp @@ -1,53 +1,36 @@ -#include -#include -#include -#include -#include -#include #include #include +#include namespace DB { -StorageSystemMerges::StorageSystemMerges(const std::string & name) - : name{name} +NamesAndTypesList StorageSystemMerges::getNamesAndTypes() { - setColumns(ColumnsDescription({ - { "database", std::make_shared() }, - { "table", std::make_shared() }, - { "elapsed", std::make_shared() }, - { "progress", std::make_shared() }, - { "num_parts", std::make_shared() }, - { "source_part_names", std::make_shared(std::make_shared()) }, - { "result_part_name", std::make_shared() }, - { "total_size_bytes_compressed", std::make_shared() }, - { "total_size_marks", std::make_shared() }, - { "bytes_read_uncompressed", std::make_shared() }, - { "rows_read", std::make_shared() }, - { "bytes_written_uncompressed", std::make_shared() }, - { "rows_written", std::make_shared() }, - { "columns_written", std::make_shared() }, - { "memory_usage", std::make_shared() }, - { "thread_number", std::make_shared() }, - })); + return { + {"database", std::make_shared()}, + {"table", std::make_shared()}, + {"elapsed", std::make_shared()}, + {"progress", std::make_shared()}, + {"num_parts", std::make_shared()}, + {"source_part_names", std::make_shared(std::make_shared())}, + {"result_part_name", std::make_shared()}, + {"total_size_bytes_compressed", std::make_shared()}, + {"total_size_marks", std::make_shared()}, + {"bytes_read_uncompressed", std::make_shared()}, + {"rows_read", std::make_shared()}, + {"bytes_written_uncompressed", std::make_shared()}, + {"rows_written", std::make_shared()}, + {"columns_written", std::make_shared()}, + {"memory_usage", std::make_shared()}, + {"thread_number", std::make_shared()}, + }; } -BlockInputStreams StorageSystemMerges::read( - const Names & column_names, - const SelectQueryInfo &, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - const size_t, - const unsigned) +void StorageSystemMerges::fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo &) const { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); - for (const auto & merge : context.getMergeList().get()) { size_t i = 0; @@ -68,8 +51,6 @@ BlockInputStreams StorageSystemMerges::read( res_columns[i++]->insert(merge.memory_usage); 
res_columns[i++]->insert(merge.thread_number); } - - return BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } } diff --git a/dbms/src/Storages/System/StorageSystemMerges.h b/dbms/src/Storages/System/StorageSystemMerges.h index d48c97bfa17..f45f895d661 100644 --- a/dbms/src/Storages/System/StorageSystemMerges.h +++ b/dbms/src/Storages/System/StorageSystemMerges.h @@ -1,7 +1,10 @@ #pragma once +#include +#include +#include #include -#include +#include namespace DB @@ -10,25 +13,17 @@ namespace DB class Context; -class StorageSystemMerges : public ext::shared_ptr_helper, public IStorage +class StorageSystemMerges : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: std::string getName() const override { return "SystemMerges"; } - std::string getTableName() const override { return name; } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; - -private: - const std::string name; + static NamesAndTypesList getNamesAndTypes(); protected: - StorageSystemMerges(const std::string & name); + using IStorageSystemOneBlock::IStorageSystemOneBlock; + + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; }; } diff --git a/dbms/src/Storages/System/StorageSystemMetrics.cpp b/dbms/src/Storages/System/StorageSystemMetrics.cpp index 9d3b1cc9fbc..acfbd1b7340 100644 --- a/dbms/src/Storages/System/StorageSystemMetrics.cpp +++ b/dbms/src/Storages/System/StorageSystemMetrics.cpp @@ -1,39 +1,23 @@ + #include -#include -#include #include #include -#include #include namespace DB { - -StorageSystemMetrics::StorageSystemMetrics(const std::string & name_) - : name(name_) +NamesAndTypesList StorageSystemMetrics::getNamesAndTypes() { - setColumns(ColumnsDescription({ + return { {"metric", std::make_shared()}, - {"value", std::make_shared()}, - })); + {"value", std::make_shared()}, + }; } - -BlockInputStreams StorageSystemMetrics::read( - const Names & column_names, - const SelectQueryInfo &, - const Context &, - QueryProcessingStage::Enum & processed_stage, - const size_t /*max_block_size*/, - const unsigned /*num_streams*/) +void StorageSystemMetrics::fillData(MutableColumns & res_columns, const Context &, const SelectQueryInfo &) const { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); - for (size_t i = 0, end = CurrentMetrics::end(); i < end; ++i) { Int64 value = CurrentMetrics::values[i].load(std::memory_order_relaxed); @@ -41,9 +25,6 @@ BlockInputStreams StorageSystemMetrics::read( res_columns[0]->insert(String(CurrentMetrics::getDescription(CurrentMetrics::Metric(i)))); res_columns[1]->insert(value); } - - return BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } - } diff --git a/dbms/src/Storages/System/StorageSystemMetrics.h b/dbms/src/Storages/System/StorageSystemMetrics.h index 7b6058de9e5..f74db926126 100644 --- a/dbms/src/Storages/System/StorageSystemMetrics.h +++ b/dbms/src/Storages/System/StorageSystemMetrics.h @@ -1,7 +1,7 @@ #pragma once #include -#include +#include namespace DB @@ -12,25 +12,17 @@ class Context; /** Implements `metrics` system table, which provides information about the operation of the server. 
*/ -class StorageSystemMetrics : public ext::shared_ptr_helper, public IStorage +class StorageSystemMetrics : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: std::string getName() const override { return "SystemMetrics"; } - std::string getTableName() const override { return name; } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; - -private: - const std::string name; + static NamesAndTypesList getNamesAndTypes(); protected: - StorageSystemMetrics(const std::string & name_); + using IStorageSystemOneBlock::IStorageSystemOneBlock; + + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; }; } diff --git a/dbms/src/Storages/System/StorageSystemModels.cpp b/dbms/src/Storages/System/StorageSystemModels.cpp index 5175989b861..2479742c8ec 100644 --- a/dbms/src/Storages/System/StorageSystemModels.cpp +++ b/dbms/src/Storages/System/StorageSystemModels.cpp @@ -2,45 +2,29 @@ #include #include #include -#include -#include -#include #include #include #include namespace DB { -StorageSystemModels::StorageSystemModels(const std::string & name) - : name{name} +NamesAndTypesList StorageSystemModels::getNamesAndTypes() { - setColumns(ColumnsDescription({ + return { { "name", std::make_shared() }, { "origin", std::make_shared() }, { "type", std::make_shared() }, { "creation_time", std::make_shared() }, { "last_exception", std::make_shared() }, - })); + }; } - -BlockInputStreams StorageSystemModels::read( - const Names & column_names, - const SelectQueryInfo &, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - const size_t, - const unsigned) +void StorageSystemModels::fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo &) const { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - const auto & external_models = context.getExternalModels(); auto objects_map = external_models.getObjectsMap(); const auto & models = objects_map.get(); - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); - for (const auto & model_info : models) { res_columns[0]->insert(model_info.first); @@ -73,8 +57,6 @@ BlockInputStreams StorageSystemModels::read( else res_columns[4]->insertDefault(); } - - return BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } } diff --git a/dbms/src/Storages/System/StorageSystemModels.h b/dbms/src/Storages/System/StorageSystemModels.h index b32c5a804ce..ef30bd511ea 100644 --- a/dbms/src/Storages/System/StorageSystemModels.h +++ b/dbms/src/Storages/System/StorageSystemModels.h @@ -1,7 +1,7 @@ #pragma once #include -#include +#include namespace DB @@ -10,25 +10,17 @@ namespace DB class Context; -class StorageSystemModels : public ext::shared_ptr_helper, public IStorage +class StorageSystemModels : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: std::string getName() const override { return "SystemModels"; } - std::string getTableName() const override { return name; } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; - -private: - const std::string name; + static NamesAndTypesList 
getNamesAndTypes(); protected: - StorageSystemModels(const std::string & name); + using IStorageSystemOneBlock::IStorageSystemOneBlock; + + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; }; } diff --git a/dbms/src/Storages/System/StorageSystemMutations.cpp b/dbms/src/Storages/System/StorageSystemMutations.cpp index 3cf204c6a77..17580c00940 100644 --- a/dbms/src/Storages/System/StorageSystemMutations.cpp +++ b/dbms/src/Storages/System/StorageSystemMutations.cpp @@ -13,10 +13,10 @@ namespace DB { -StorageSystemMutations::StorageSystemMutations(const std::string & name_) - : name(name_) + +NamesAndTypesList StorageSystemMutations::getNamesAndTypes() { - setColumns(ColumnsDescription({ + return { { "database", std::make_shared() }, { "table", std::make_shared() }, { "mutation_id", std::make_shared() }, @@ -28,21 +28,12 @@ StorageSystemMutations::StorageSystemMutations(const std::string & name_) std::make_shared()) }, { "parts_to_do", std::make_shared() }, { "is_done", std::make_shared() }, - })); + }; } -BlockInputStreams StorageSystemMutations::read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - const size_t /*max_block_size*/, - const unsigned /*num_streams*/) +void StorageSystemMutations::fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - /// Collect a set of *MergeTree tables. std::map> merge_tree_tables; for (const auto & db : context.getDatabases()) @@ -83,13 +74,12 @@ BlockInputStreams StorageSystemMutations::read( VirtualColumnUtils::filterBlockWithQuery(query_info.query, filtered_block, context); if (!filtered_block.rows()) - return BlockInputStreams(); + return; col_database = filtered_block.getByName("database").column; col_table = filtered_block.getByName("table").column; } - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); for (size_t i_storage = 0; i_storage < col_database->size(); ++i_storage) { auto database = (*col_database)[i_storage].safeGet(); @@ -129,12 +119,6 @@ BlockInputStreams StorageSystemMutations::read( res_columns[col_num++]->insert(UInt64(status.is_done)); } } - - Block res = getSampleBlock().cloneEmpty(); - for (size_t i_col = 0; i_col < res.columns(); ++i_col) - res.getByPosition(i_col).column = std::move(res_columns[i_col]); - - return BlockInputStreams(1, std::make_shared(res)); } } diff --git a/dbms/src/Storages/System/StorageSystemMutations.h b/dbms/src/Storages/System/StorageSystemMutations.h index 3b82f3f46be..d2dcf99aa46 100644 --- a/dbms/src/Storages/System/StorageSystemMutations.h +++ b/dbms/src/Storages/System/StorageSystemMutations.h @@ -1,7 +1,7 @@ #pragma once #include -#include +#include namespace DB @@ -12,25 +12,17 @@ class Context; /// Implements the `mutations` system table, which provides information about the status of mutations /// in the MergeTree tables. 
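One behavioral detail of the mutations conversion above, shown side by side (condensed from the hunk, not a complete function):

// Before: an empty database/table pre-filter had to be signalled by
// returning an empty set of streams from read().
if (!filtered_block.rows())
    return BlockInputStreams();

// After: fillData() simply returns; IStorageSystemOneBlock still emits
// the one (now empty) block built from the sample block.
if (!filtered_block.rows())
    return;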
-class StorageSystemMutations : public ext::shared_ptr_helper, public IStorage +class StorageSystemMutations : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: String getName() const override { return "SystemMutations"; } - String getTableName() const override { return name; } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; - -private: - const String name; + static NamesAndTypesList getNamesAndTypes(); protected: - StorageSystemMutations(const String & name_); + using IStorageSystemOneBlock::IStorageSystemOneBlock; + + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; }; } diff --git a/dbms/src/Storages/System/StorageSystemProcesses.cpp b/dbms/src/Storages/System/StorageSystemProcesses.cpp index 793e3124a2a..f82c6554492 100644 --- a/dbms/src/Storages/System/StorageSystemProcesses.cpp +++ b/dbms/src/Storages/System/StorageSystemProcesses.cpp @@ -1,75 +1,60 @@ -#include #include #include -#include +#include #include #include -#include namespace DB { - -StorageSystemProcesses::StorageSystemProcesses(const std::string & name_) - : name(name_) +NamesAndTypesList StorageSystemProcesses::getNamesAndTypes() { - setColumns(ColumnsDescription({ - { "is_initial_query", std::make_shared() }, + return { + {"is_initial_query", std::make_shared()}, - { "user", std::make_shared() }, - { "query_id", std::make_shared() }, - { "address", std::make_shared() }, - { "port", std::make_shared() }, + {"user", std::make_shared()}, + {"query_id", std::make_shared()}, + {"address", std::make_shared()}, + {"port", std::make_shared()}, - { "initial_user", std::make_shared() }, - { "initial_query_id", std::make_shared() }, - { "initial_address", std::make_shared() }, - { "initial_port", std::make_shared() }, + {"initial_user", std::make_shared()}, + {"initial_query_id", std::make_shared()}, + {"initial_address", std::make_shared()}, + {"initial_port", std::make_shared()}, - { "interface", std::make_shared() }, + {"interface", std::make_shared()}, - { "os_user", std::make_shared() }, - { "client_hostname", std::make_shared() }, - { "client_name", std::make_shared() }, - { "client_version_major", std::make_shared() }, - { "client_version_minor", std::make_shared() }, - { "client_revision", std::make_shared() }, + {"os_user", std::make_shared()}, + {"client_hostname", std::make_shared()}, + {"client_name", std::make_shared()}, + {"client_version_major", std::make_shared()}, + {"client_version_minor", std::make_shared()}, + {"client_revision", std::make_shared()}, - { "http_method", std::make_shared() }, - { "http_user_agent", std::make_shared() }, + {"http_method", std::make_shared()}, + {"http_user_agent", std::make_shared()}, - { "quota_key", std::make_shared() }, + {"quota_key", std::make_shared()}, - { "elapsed", std::make_shared() }, - { "is_cancelled", std::make_shared() }, - { "read_rows", std::make_shared() }, - { "read_bytes", std::make_shared() }, - { "total_rows_approx", std::make_shared() }, - { "written_rows", std::make_shared() }, - { "written_bytes", std::make_shared() }, - { "memory_usage", std::make_shared() }, - { "peak_memory_usage", std::make_shared() }, - { "query", std::make_shared() }, - })); + {"elapsed", std::make_shared()}, + {"is_cancelled", std::make_shared()}, + {"read_rows", std::make_shared()}, + {"read_bytes", 
std::make_shared()}, + {"total_rows_approx", std::make_shared()}, + {"written_rows", std::make_shared()}, + {"written_bytes", std::make_shared()}, + {"memory_usage", std::make_shared()}, + {"peak_memory_usage", std::make_shared()}, + {"query", std::make_shared()}, + }; } -BlockInputStreams StorageSystemProcesses::read( - const Names & column_names, - const SelectQueryInfo &, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - const size_t /*max_block_size*/, - const unsigned /*num_streams*/) +void StorageSystemProcesses::fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo &) const { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - ProcessList::Info info = context.getProcessList().getInfo(); - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); - for (const auto & process : info) { size_t i = 0; @@ -103,9 +88,6 @@ BlockInputStreams StorageSystemProcesses::read( res_columns[i++]->insert(process.peak_memory_usage); res_columns[i++]->insert(process.query); } - - return BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } - } diff --git a/dbms/src/Storages/System/StorageSystemProcesses.h b/dbms/src/Storages/System/StorageSystemProcesses.h index f8f26d13d35..3cbe0028af3 100644 --- a/dbms/src/Storages/System/StorageSystemProcesses.h +++ b/dbms/src/Storages/System/StorageSystemProcesses.h @@ -1,7 +1,7 @@ #pragma once #include -#include +#include namespace DB @@ -12,25 +12,17 @@ class Context; /** Implements `processes` system table, which allows you to get information about the queries that are currently executing. */ -class StorageSystemProcesses : public ext::shared_ptr_helper, public IStorage +class StorageSystemProcesses : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: std::string getName() const override { return "SystemProcesses"; } - std::string getTableName() const override { return name; } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; - -private: - const std::string name; + static NamesAndTypesList getNamesAndTypes(); protected: - StorageSystemProcesses(const std::string & name_); + using IStorageSystemOneBlock::IStorageSystemOneBlock; + + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; }; } diff --git a/dbms/src/Storages/System/StorageSystemReplicationQueue.cpp b/dbms/src/Storages/System/StorageSystemReplicationQueue.cpp index 69fc73bd89c..51b0805c4c2 100644 --- a/dbms/src/Storages/System/StorageSystemReplicationQueue.cpp +++ b/dbms/src/Storages/System/StorageSystemReplicationQueue.cpp @@ -5,7 +5,6 @@ #include #include #include -#include #include #include #include @@ -17,10 +16,10 @@ namespace DB { -StorageSystemReplicationQueue::StorageSystemReplicationQueue(const std::string & name_) - : name(name_) + +NamesAndTypesList StorageSystemReplicationQueue::getNamesAndTypes() { - setColumns(ColumnsDescription({ + return { /// Table properties. 
{ "database", std::make_shared() }, { "table", std::make_shared() }, @@ -43,21 +42,12 @@ StorageSystemReplicationQueue::StorageSystemReplicationQueue(const std::string & { "num_postponed", std::make_shared() }, { "postpone_reason", std::make_shared() }, { "last_postpone_time", std::make_shared() }, - })); + }; } -BlockInputStreams StorageSystemReplicationQueue::read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - const size_t /*max_block_size*/, - const unsigned /*num_streams*/) +void StorageSystemReplicationQueue::fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - std::map> replicated_tables; for (const auto & db : context.getDatabases()) for (auto iterator = db.second->getIterator(context); iterator->isValid(); iterator->next()) @@ -90,7 +80,7 @@ BlockInputStreams StorageSystemReplicationQueue::read( VirtualColumnUtils::filterBlockWithQuery(query_info.query, filtered_block, context); if (!filtered_block.rows()) - return BlockInputStreams(); + return; col_database_to_filter = filtered_block.getByName("database").column; col_table_to_filter = filtered_block.getByName("table").column; @@ -99,8 +89,6 @@ BlockInputStreams StorageSystemReplicationQueue::read( StorageReplicatedMergeTree::LogEntriesData queue; String replica_name; - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); - for (size_t i = 0, tables_size = col_database_to_filter->size(); i < tables_size; ++i) { String database = (*col_database_to_filter)[i].safeGet(); @@ -139,9 +127,6 @@ BlockInputStreams StorageSystemReplicationQueue::read( res_columns[col_num++]->insert(UInt64(entry.last_postpone_time)); } } - - return BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } - } diff --git a/dbms/src/Storages/System/StorageSystemReplicationQueue.h b/dbms/src/Storages/System/StorageSystemReplicationQueue.h index 8554361e0df..63dc58118cd 100644 --- a/dbms/src/Storages/System/StorageSystemReplicationQueue.h +++ b/dbms/src/Storages/System/StorageSystemReplicationQueue.h @@ -1,7 +1,7 @@ #pragma once #include -#include +#include namespace DB @@ -12,25 +12,17 @@ class Context; /** Implements the `replication_queue` system table, which allows you to view the replication queues for the replicated tables. 
*/ -class StorageSystemReplicationQueue : public ext::shared_ptr_helper, public IStorage +class StorageSystemReplicationQueue : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: std::string getName() const override { return "SystemReplicationQueue"; } - std::string getTableName() const override { return name; } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; - -private: - const std::string name; + static NamesAndTypesList getNamesAndTypes(); protected: - StorageSystemReplicationQueue(const std::string & name_); + using IStorageSystemOneBlock::IStorageSystemOneBlock; + + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; }; } diff --git a/dbms/src/Storages/System/StorageSystemSettings.cpp b/dbms/src/Storages/System/StorageSystemSettings.cpp index efb50c559cc..fee9467f6f9 100644 --- a/dbms/src/Storages/System/StorageSystemSettings.cpp +++ b/dbms/src/Storages/System/StorageSystemSettings.cpp @@ -1,8 +1,5 @@ -#include -#include #include #include -#include #include #include @@ -10,44 +7,27 @@ namespace DB { - -StorageSystemSettings::StorageSystemSettings(const std::string & name_) - : name(name_) +NamesAndTypesList StorageSystemSettings::getNamesAndTypes() { - setColumns(ColumnsDescription({ - { "name", std::make_shared() }, - { "value", std::make_shared() }, - { "changed", std::make_shared() }, - { "description", std::make_shared() }, - })); + return { + {"name", std::make_shared()}, + {"value", std::make_shared()}, + {"changed", std::make_shared()}, + {"description", std::make_shared()}, + }; } - -BlockInputStreams StorageSystemSettings::read( - const Names & column_names, - const SelectQueryInfo &, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - const size_t /*max_block_size*/, - const unsigned /*num_streams*/) +void StorageSystemSettings::fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo &) const { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - const Settings & settings = context.getSettingsRef(); - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); - -#define ADD_SETTING(TYPE, NAME, DEFAULT, DESCRIPTION) \ - res_columns[0]->insert(String(#NAME)); \ - res_columns[1]->insert(settings.NAME.toString()); \ +#define ADD_SETTING(TYPE, NAME, DEFAULT, DESCRIPTION) \ + res_columns[0]->insert(String(#NAME)); \ + res_columns[1]->insert(settings.NAME.toString()); \ res_columns[2]->insert(UInt64(settings.NAME.changed)); \ res_columns[3]->insert(String(DESCRIPTION)); APPLY_FOR_SETTINGS(ADD_SETTING) #undef ADD_SETTING - - return BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } - } diff --git a/dbms/src/Storages/System/StorageSystemSettings.h b/dbms/src/Storages/System/StorageSystemSettings.h index 153b9213ef8..e44e0abbcd4 100644 --- a/dbms/src/Storages/System/StorageSystemSettings.h +++ b/dbms/src/Storages/System/StorageSystemSettings.h @@ -1,7 +1,7 @@ #pragma once #include -#include +#include namespace DB @@ -12,25 +12,17 @@ class Context; /** implements system table "settings", which allows to get information about the current settings. 
*/ -class StorageSystemSettings : public ext::shared_ptr_helper, public IStorage +class StorageSystemSettings : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: std::string getName() const override { return "SystemSettings"; } - std::string getTableName() const override { return name; } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; - -private: - const std::string name; + static NamesAndTypesList getNamesAndTypes(); protected: - StorageSystemSettings(const std::string & name_); + using IStorageSystemOneBlock::IStorageSystemOneBlock; + + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; }; } diff --git a/dbms/src/Storages/System/StorageSystemTableEngines.cpp b/dbms/src/Storages/System/StorageSystemTableEngines.cpp new file mode 100644 index 00000000000..d40fc6fa49e --- /dev/null +++ b/dbms/src/Storages/System/StorageSystemTableEngines.cpp @@ -0,0 +1,22 @@ +#include +#include +#include + +namespace DB +{ + +NamesAndTypesList StorageSystemTableEngines::getNamesAndTypes() +{ + return {{"name", std::make_shared()}}; +} + +void StorageSystemTableEngines::fillData(MutableColumns & res_columns, const Context &, const SelectQueryInfo &) const +{ + const auto & storages = StorageFactory::instance().getAllStorages(); + for (const auto & pair : storages) + { + res_columns[0]->insert(pair.first); + } +} + +} diff --git a/dbms/src/Storages/System/StorageSystemTableEngines.h b/dbms/src/Storages/System/StorageSystemTableEngines.h new file mode 100644 index 00000000000..f0f6b62d59d --- /dev/null +++ b/dbms/src/Storages/System/StorageSystemTableEngines.h @@ -0,0 +1,27 @@ +#pragma once + +#include +#include +#include + +namespace DB +{ + +class StorageSystemTableEngines : public ext::shared_ptr_helper, + public IStorageSystemOneBlock +{ +protected: + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; + + using IStorageSystemOneBlock::IStorageSystemOneBlock; + +public: + std::string getName() const override + { + return "SystemTableEngines"; + } + + static NamesAndTypesList getNamesAndTypes(); +}; + +} diff --git a/dbms/src/Storages/System/StorageSystemTableFunctions.cpp b/dbms/src/Storages/System/StorageSystemTableFunctions.cpp index aaf448a6559..15067bbc41f 100644 --- a/dbms/src/Storages/System/StorageSystemTableFunctions.cpp +++ b/dbms/src/Storages/System/StorageSystemTableFunctions.cpp @@ -3,7 +3,13 @@ #include namespace DB { -void StorageSystemTableFunctions::fillData(MutableColumns & res_columns) const + +NamesAndTypesList StorageSystemTableFunctions::getNamesAndTypes() +{ + return {{"name", std::make_shared()}}; +} + +void StorageSystemTableFunctions::fillData(MutableColumns & res_columns, const Context &, const SelectQueryInfo &) const { const auto & functions = TableFunctionFactory::instance().getAllTableFunctions(); for (const auto & pair : functions) @@ -11,4 +17,5 @@ void StorageSystemTableFunctions::fillData(MutableColumns & res_columns) const res_columns[0]->insert(pair.first); } } + } diff --git a/dbms/src/Storages/System/StorageSystemTableFunctions.h b/dbms/src/Storages/System/StorageSystemTableFunctions.h index 096b3d8f4a4..413af0f5c66 100644 --- a/dbms/src/Storages/System/StorageSystemTableFunctions.h +++ 
b/dbms/src/Storages/System/StorageSystemTableFunctions.h @@ -1,26 +1,29 @@ #pragma once -#include +#include +#include #include namespace DB { + class StorageSystemTableFunctions : public ext::shared_ptr_helper, - public IStorageSystemWithStringColumns + public IStorageSystemOneBlock { protected: - void fillData(MutableColumns & res_columns) const override; + + using IStorageSystemOneBlock::IStorageSystemOneBlock; + + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; public: - using IStorageSystemWithStringColumns::IStorageSystemWithStringColumns; std::string getName() const override { return "SystemTableFunctions"; } - static std::vector getColumnNames() - { - return {"name"}; - } + static NamesAndTypesList getNamesAndTypes(); + }; + } diff --git a/dbms/src/Storages/System/StorageSystemZooKeeper.cpp b/dbms/src/Storages/System/StorageSystemZooKeeper.cpp index e506802ec74..612b0d782d7 100644 --- a/dbms/src/Storages/System/StorageSystemZooKeeper.cpp +++ b/dbms/src/Storages/System/StorageSystemZooKeeper.cpp @@ -1,9 +1,6 @@ -#include -#include #include #include #include -#include #include #include #include @@ -19,10 +16,9 @@ namespace DB { -StorageSystemZooKeeper::StorageSystemZooKeeper(const std::string & name_) - : name(name_) +NamesAndTypesList StorageSystemZooKeeper::getNamesAndTypes() { - setColumns(ColumnsDescription({ + return { { "name", std::make_shared() }, { "value", std::make_shared() }, { "czxid", std::make_shared() }, @@ -37,7 +33,7 @@ StorageSystemZooKeeper::StorageSystemZooKeeper(const std::string & name_) { "numChildren", std::make_shared() }, { "pzxid", std::make_shared() }, { "path", std::make_shared() }, - })); + }; } @@ -103,17 +99,8 @@ static String extractPath(const ASTPtr & query) } -BlockInputStreams StorageSystemZooKeeper::read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - const size_t /*max_block_size*/, - const unsigned /*num_streams*/) +void StorageSystemZooKeeper::fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const { - check(column_names); - processed_stage = QueryProcessingStage::FetchColumns; - String path = extractPath(query_info.query); if (path.empty()) throw Exception("SELECT from system.zookeeper table must contain condition like path = 'path' in WHERE clause."); @@ -136,8 +123,6 @@ BlockInputStreams StorageSystemZooKeeper::read( for (const String & node : nodes) futures.push_back(zookeeper->asyncTryGet(path_part + '/' + node)); - MutableColumns res_columns = getSampleBlock().cloneEmptyColumns(); - for (size_t i = 0, size = nodes.size(); i < size; ++i) { auto res = futures[i].get(); @@ -162,8 +147,6 @@ BlockInputStreams StorageSystemZooKeeper::read( res_columns[col_num++]->insert(Int64(stat.pzxid)); res_columns[col_num++]->insert(path); /// This is the original path. In order to process the request, condition in WHERE should be triggered. 
} - - return BlockInputStreams(1, std::make_shared(getSampleBlock().cloneWithColumns(std::move(res_columns)))); } diff --git a/dbms/src/Storages/System/StorageSystemZooKeeper.h b/dbms/src/Storages/System/StorageSystemZooKeeper.h index 45625ebab12..9644fe96162 100644 --- a/dbms/src/Storages/System/StorageSystemZooKeeper.h +++ b/dbms/src/Storages/System/StorageSystemZooKeeper.h @@ -1,7 +1,7 @@ #pragma once #include -#include +#include namespace DB @@ -12,25 +12,17 @@ class Context; /** Implements `zookeeper` system table, which allows you to view the data in ZooKeeper for debugging purposes. */ -class StorageSystemZooKeeper : public ext::shared_ptr_helper, public IStorage +class StorageSystemZooKeeper : public ext::shared_ptr_helper, public IStorageSystemOneBlock { public: std::string getName() const override { return "SystemZooKeeper"; } - std::string getTableName() const override { return name; } - BlockInputStreams read( - const Names & column_names, - const SelectQueryInfo & query_info, - const Context & context, - QueryProcessingStage::Enum & processed_stage, - size_t max_block_size, - unsigned num_streams) override; - -private: - const std::string name; + static NamesAndTypesList getNamesAndTypes(); protected: - StorageSystemZooKeeper(const std::string & name_); + using IStorageSystemOneBlock::IStorageSystemOneBlock; + + void fillData(MutableColumns & res_columns, const Context & context, const SelectQueryInfo & query_info) const override; }; } diff --git a/dbms/src/Storages/System/attachSystemTables.cpp b/dbms/src/Storages/System/attachSystemTables.cpp index e29a26df4eb..479337d1b41 100644 --- a/dbms/src/Storages/System/attachSystemTables.cpp +++ b/dbms/src/Storages/System/attachSystemTables.cpp @@ -27,6 +27,7 @@ #include #include #include +#include #include #include #include @@ -52,6 +53,7 @@ void attachSystemTablesLocal(IDatabase & system_database) system_database.attachTable("aggregate_function_combinators", StorageSystemAggregateFunctionCombinators::create("aggregate_function_combinators")); system_database.attachTable("data_type_families", StorageSystemDataTypeFamilies::create("data_type_families")); system_database.attachTable("collations", StorageSystemCollations::create("collations")); + system_database.attachTable("table_engines", StorageSystemTableEngines::create("table_engines")); } void attachSystemTablesServer(IDatabase & system_database, bool has_zookeeper) diff --git a/dbms/src/Storages/getStructureOfRemoteTable.cpp b/dbms/src/Storages/getStructureOfRemoteTable.cpp index bdbc04103a9..37cc036c367 100644 --- a/dbms/src/Storages/getStructureOfRemoteTable.cpp +++ b/dbms/src/Storages/getStructureOfRemoteTable.cpp @@ -7,6 +7,8 @@ #include #include #include +#include +#include namespace DB @@ -22,21 +24,40 @@ ColumnsDescription getStructureOfRemoteTable( const Cluster & cluster, const std::string & database, const std::string & table, - const Context & context) + const Context & context, + const ASTPtr & table_func_ptr) { /// Send to the first any remote shard. 
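The getStructureOfRemoteTable() changes that continue below add a table-function path. A condensed outline sketch, with the static_cast target and the String template argument restored by assumption (this copy of the diff drops angle-bracket contents):

// Outline of the new branch; names as in the patch, cast target assumed.
String query;
if (table_func_ptr)
{
    if (shard_info.isLocal())
    {
        /// Execute the table function locally and take its structure.
        auto table_function = static_cast<const ASTFunction *>(table_func_ptr.get());
        return TableFunctionFactory::instance()
            .get(table_function->name, context)->execute(table_func_ptr, context)->getColumns();
    }
    /// Otherwise ask a remote shard to describe the table function's result.
    query = "DESC TABLE " + queryToString(table_func_ptr);
}
else
{
    if (shard_info.isLocal())
        return context.getTable(database, table)->getColumns();
    query = "DESC TABLE " + backQuoteIfNeed(database) + "." + backQuoteIfNeed(table);
}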
const auto & shard_info = cluster.getAnyShardInfo(); + + String query; - if (shard_info.isLocal()) - return context.getTable(database, table)->getColumns(); + if (table_func_ptr) + { + if (shard_info.isLocal()) + { + auto table_function = static_cast(table_func_ptr.get()); + return TableFunctionFactory::instance().get(table_function->name, context)->execute(table_func_ptr, context)->getColumns(); + } + + auto table_func_name = queryToString(table_func_ptr); + query = "DESC TABLE " + table_func_name; + } + else + { + if (shard_info.isLocal()) + return context.getTable(database, table)->getColumns(); + + /// Request for a table description + query = "DESC TABLE " + backQuoteIfNeed(database) + "." + backQuoteIfNeed(table); + } - /// Request for a table description - String query = "DESC TABLE " + backQuoteIfNeed(database) + "." + backQuoteIfNeed(table); ColumnsDescription res; auto input = std::make_shared(shard_info.pool, query, InterpreterDescribeQuery::getSampleBlock(), context); input->setPoolMode(PoolMode::GET_ONE); - input->setMainTable(QualifiedTableName{database, table}); + if (!table_func_ptr) + input->setMainTable(QualifiedTableName{database, table}); input->readPrefix(); const DataTypeFactory & data_type_factory = DataTypeFactory::instance(); diff --git a/dbms/src/Storages/getStructureOfRemoteTable.h b/dbms/src/Storages/getStructureOfRemoteTable.h index 20417ef50e1..9f1769a7096 100644 --- a/dbms/src/Storages/getStructureOfRemoteTable.h +++ b/dbms/src/Storages/getStructureOfRemoteTable.h @@ -1,6 +1,8 @@ #pragma once #include +#include +#include namespace DB @@ -15,6 +17,7 @@ ColumnsDescription getStructureOfRemoteTable( const Cluster & cluster, const std::string & database, const std::string & table, - const Context & context); + const Context & context, + const ASTPtr & table_func_ptr = nullptr); } diff --git a/dbms/src/TableFunctions/TableFunctionFactory.cpp b/dbms/src/TableFunctions/TableFunctionFactory.cpp index b6188ee5967..8fb8533176b 100644 --- a/dbms/src/TableFunctions/TableFunctionFactory.cpp +++ b/dbms/src/TableFunctions/TableFunctionFactory.cpp @@ -37,4 +37,9 @@ TableFunctionPtr TableFunctionFactory::get( return it->second(); } +bool TableFunctionFactory::isTableFunctionName(const std::string & name) const +{ + return functions.count(name); +} + } diff --git a/dbms/src/TableFunctions/TableFunctionFactory.h b/dbms/src/TableFunctions/TableFunctionFactory.h index 725f02f43b1..acc08bacffa 100644 --- a/dbms/src/TableFunctions/TableFunctionFactory.h +++ b/dbms/src/TableFunctions/TableFunctionFactory.h @@ -42,8 +42,11 @@ public: TableFunctionPtr get( const std::string & name, const Context & context) const; + + bool isTableFunctionName(const std::string & name) const; - const TableFunctions & getAllTableFunctions() const { + const TableFunctions & getAllTableFunctions() const + { return functions; } diff --git a/dbms/src/TableFunctions/TableFunctionRemote.cpp b/dbms/src/TableFunctions/TableFunctionRemote.cpp index ac79f0ac2f2..9bea9e881ca 100644 --- a/dbms/src/TableFunctions/TableFunctionRemote.cpp +++ b/dbms/src/TableFunctions/TableFunctionRemote.cpp @@ -198,6 +198,7 @@ StoragePtr TableFunctionRemote::executeImpl(const ASTPtr & ast_function, const C String cluster_description; String remote_database; String remote_table; + ASTPtr remote_table_function_ptr; String username; String password; @@ -230,25 +231,39 @@ StoragePtr TableFunctionRemote::executeImpl(const ASTPtr & ast_function, const C ++arg_num; args[arg_num] = 
evaluateConstantExpressionOrIdentifierAsLiteral(args[arg_num], context); - remote_database = static_cast<const ASTLiteral &>(*args[arg_num]).value.safeGet<String>(); - ++arg_num; - - size_t dot = remote_database.find('.'); - if (dot != String::npos) + + const auto table_function = static_cast<const ASTFunction *>(args[arg_num].get()); + + if (TableFunctionFactory::instance().isTableFunctionName(table_function->name)) { - /// NOTE Bad - do not support identifiers in backquotes. - remote_table = remote_database.substr(dot + 1); - remote_database = remote_database.substr(0, dot); - } - else - { - if (arg_num >= args.size()) - throw Exception(help_message, ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH); - - args[arg_num] = evaluateConstantExpressionOrIdentifierAsLiteral(args[arg_num], context); - remote_table = static_cast<const ASTLiteral &>(*args[arg_num]).value.safeGet<String>(); + remote_table_function_ptr = args[arg_num]; ++arg_num; } + else + { + remote_database = static_cast<const ASTLiteral &>(*args[arg_num]).value.safeGet<String>(); + + ++arg_num; + + size_t dot = remote_database.find('.'); + if (dot != String::npos) + { + /// NOTE Bad - do not support identifiers in backquotes. + remote_table = remote_database.substr(dot + 1); + remote_database = remote_database.substr(0, dot); + } + else + { + if (arg_num >= args.size()) + throw Exception(help_message, ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH); + else + { + args[arg_num] = evaluateConstantExpressionOrIdentifierAsLiteral(args[arg_num], context); + remote_table = static_cast<const ASTLiteral &>(*args[arg_num]).value.safeGet<String>(); + ++arg_num; + } + } + } /// Username and password parameters are prohibited in cluster version of the function if (!is_cluster_function)
+ StorageDistributed::createWithOwnCluster( + getName(), + structure_remote_table, + remote_table_function_ptr, + cluster, + context) + + : StorageDistributed::createWithOwnCluster( + getName(), + structure_remote_table, + remote_database, + remote_table, + cluster, + context); + res->startup(); return res; } - TableFunctionRemote::TableFunctionRemote(const std::string & name_) : name(name_) { diff --git a/dbms/tests/external_dictionaries/dictionary_library/dictionary_library.cpp b/dbms/tests/external_dictionaries/dictionary_library/dictionary_library.cpp index 59b75d0a26f..2a411ebcb00 100644 --- a/dbms/tests/external_dictionaries/dictionary_library/dictionary_library.cpp +++ b/dbms/tests/external_dictionaries/dictionary_library/dictionary_library.cpp @@ -62,7 +62,8 @@ void MakeColumnsFromVector(DataHolder * ptr) ptr->ctable.data = ptr->rowHolder.get(); } -extern "C" { +extern "C" +{ void * ClickHouseDictionary_v3_loadIds(void * data_ptr, ClickHouseLibrary::CStrings * settings, @@ -151,7 +152,8 @@ void * ClickHouseDictionary_v3_loadKeys(void * data_ptr, ClickHouseLibrary::CStr if (requested_keys) { LOG(ptr->lib->log, "requested_keys columns passed: " << requested_keys->size); - for (size_t i = 0; i < requested_keys->size; ++i) { + for (size_t i = 0; i < requested_keys->size; ++i) + { LOG(ptr->lib->log, "requested_keys at column " << i << " passed: " << requested_keys->data[i].size); } } diff --git a/dbms/tests/integration/helpers/cluster.py b/dbms/tests/integration/helpers/cluster.py index 4242fa8fa62..8b5991ad117 100644 --- a/dbms/tests/integration/helpers/cluster.py +++ b/dbms/tests/integration/helpers/cluster.py @@ -49,17 +49,18 @@ class ClickHouseCluster: self.base_cmd = ['docker-compose', '--project-directory', self.base_dir, '--project-name', self.project_name] self.base_zookeeper_cmd = None self.base_mysql_cmd = [] + self.base_kafka_cmd = [] self.pre_zookeeper_commands = [] self.instances = {} self.with_zookeeper = False self.with_mysql = False + self.with_kafka = False self.docker_client = None self.is_up = False - def add_instance(self, name, config_dir=None, main_configs=[], user_configs=[], macroses={}, with_zookeeper=False, with_mysql=False, - clickhouse_path_dir=None, hostname=None): + def add_instance(self, name, config_dir=None, main_configs=[], user_configs=[], macros={}, with_zookeeper=False, with_mysql=False, with_kafka=False, clickhouse_path_dir=None, hostname=None): """Add an instance to the cluster. name - the name of the instance directory and the value of the 'instance' macro in ClickHouse. @@ -76,8 +77,8 @@ class ClickHouseCluster: raise Exception("Can\'t add instance `%s': there is already an instance with the same name!" 
% name) instance = ClickHouseInstance( - self, self.base_dir, name, config_dir, main_configs, user_configs, macroses, with_zookeeper, - self.zookeeper_config_path, with_mysql, self.base_configs_dir, self.server_bin_path, clickhouse_path_dir, hostname=hostname) + self, self.base_dir, name, config_dir, main_configs, user_configs, macros, with_zookeeper, + self.zookeeper_config_path, with_mysql, with_kafka, self.base_configs_dir, self.server_bin_path, clickhouse_path_dir, hostname=hostname) self.instances[name] = instance self.base_cmd.extend(['--file', instance.docker_compose_path]) @@ -93,6 +94,12 @@ class ClickHouseCluster: self.base_mysql_cmd = ['docker-compose', '--project-directory', self.base_dir, '--project-name', self.project_name, '--file', p.join(HELPERS_DIR, 'docker_compose_mysql.yml')] + if with_kafka and not self.with_kafka: + self.with_kafka = True + self.base_cmd.extend(['--file', p.join(HELPERS_DIR, 'docker_compose_kafka.yml')]) + self.base_kafka_cmd = ['docker-compose', '--project-directory', self.base_dir, '--project-name', + self.project_name, '--file', p.join(HELPERS_DIR, 'docker_compose_kafka.yml')] + return instance @@ -135,6 +142,10 @@ class ClickHouseCluster: if self.with_mysql and self.base_mysql_cmd: subprocess.check_call(self.base_mysql_cmd + ['up', '-d', '--no-recreate']) + if self.with_kafka and self.base_kafka_cmd: + subprocess.check_call(self.base_kafka_cmd + ['up', '-d', '--no-recreate']) + self.kafka_docker_id = self.get_instance_docker_id('kafka1') + # Uncomment for debugging #print ' '.join(self.base_cmd + ['up', '--no-recreate']) @@ -206,14 +217,15 @@ services: - server - --config-file=/etc/clickhouse-server/config.xml - --log-file=/var/log/clickhouse-server/clickhouse-server.log + - --errorlog-file=/var/log/clickhouse-server/clickhouse-server.err.log depends_on: {depends_on} ''' class ClickHouseInstance: def __init__( - self, cluster, base_path, name, custom_config_dir, custom_main_configs, custom_user_configs, macroses, - with_zookeeper, zookeeper_config_path, with_mysql, base_configs_dir, server_bin_path, clickhouse_path_dir, hostname=None): + self, cluster, base_path, name, custom_config_dir, custom_main_configs, custom_user_configs, macros, + with_zookeeper, zookeeper_config_path, with_mysql, with_kafka, base_configs_dir, server_bin_path, clickhouse_path_dir, hostname=None): self.name = name self.base_cmd = cluster.base_cmd[:] @@ -225,7 +237,7 @@ class ClickHouseInstance: self.custom_main_config_paths = [p.abspath(p.join(base_path, c)) for c in custom_main_configs] self.custom_user_config_paths = [p.abspath(p.join(base_path, c)) for c in custom_user_configs] self.clickhouse_path_dir = p.abspath(p.join(base_path, clickhouse_path_dir)) if clickhouse_path_dir else None - self.macroses = macroses if macroses is not None else {} + self.macros = macros if macros is not None else {} self.with_zookeeper = with_zookeeper self.zookeeper_config_path = zookeeper_config_path @@ -233,6 +245,7 @@ class ClickHouseInstance: self.server_bin_path = server_bin_path self.with_mysql = with_mysql + self.with_kafka = with_kafka self.path = p.join(self.cluster.instances_dir, name) self.docker_compose_path = p.join(self.path, 'docker_compose.yml') @@ -282,9 +295,10 @@ class ClickHouseInstance: deadline = start_time + timeout while True: - status = self.get_docker_handle().status + handle = self.get_docker_handle() + status = handle.status if status == 'exited': - raise Exception("Instance `{}' failed to start.
Container status: {}".format(self.name, status)) + raise Exception("Instance `{}' failed to start. Container status: {}, logs: {}".format(self.name, status, handle.logs())) current_time = time.time() time_left = deadline - current_time @@ -339,11 +353,11 @@ class ClickHouseInstance: shutil.copy(p.join(HELPERS_DIR, 'common_instance_config.xml'), config_d_dir) - # Generate and write macroses file - macroses = self.macroses.copy() - macroses['instance'] = self.name + # Generate and write macros file + macros = self.macros.copy() + macros['instance'] = self.name with open(p.join(config_d_dir, 'macros.xml'), 'w') as macros_config: - macros_config.write(self.dict_to_xml({"macros" : macroses})) + macros_config.write(self.dict_to_xml({"macros" : macros})) # Put ZooKeeper config if self.with_zookeeper: @@ -374,6 +388,9 @@ class ClickHouseInstance: if self.with_mysql: depends_on.append("mysql1") + if self.with_kafka: + depends_on.append("kafka1") + if self.with_zookeeper: depends_on.append("zoo1") depends_on.append("zoo2") diff --git a/dbms/tests/integration/helpers/docker_compose_kafka.yml b/dbms/tests/integration/helpers/docker_compose_kafka.yml new file mode 100644 index 00000000000..42dd154b1e8 --- /dev/null +++ b/dbms/tests/integration/helpers/docker_compose_kafka.yml @@ -0,0 +1,24 @@ +version: '2' + +services: + kafka_zookeeper: + image: zookeeper:3.4.9 + hostname: kafka_zookeeper + environment: + ZOO_MY_ID: 1 + ZOO_PORT: 2181 + ZOO_SERVERS: server.1=kafka_zookeeper:2888:3888 + + kafka1: + image: confluentinc/cp-kafka:4.1.0 + hostname: kafka1 + ports: + - "9092:9092" + environment: + KAFKA_ADVERTISED_LISTENERS: "PLAINTEXT://kafka1:9092" + KAFKA_ZOOKEEPER_CONNECT: "kafka_zookeeper:2181" + KAFKA_BROKER_ID: 1 + KAFKA_LOG4J_LOGGERS: "kafka.controller=INFO,kafka.producer.async.DefaultEventHandler=INFO,state.change.logger=INFO" + KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1 + depends_on: + - kafka_zookeeper diff --git a/dbms/tests/integration/test_cluster_copier/test.py b/dbms/tests/integration/test_cluster_copier/test.py index a19fa8231cf..3f3c5f31741 100644 --- a/dbms/tests/integration/test_cluster_copier/test.py +++ b/dbms/tests/integration/test_cluster_copier/test.py @@ -58,7 +58,7 @@ def started_cluster(): name = "s{}_{}_{}".format(cluster_name, shard_name, replica_name) cluster.add_instance(name, config_dir="configs", - macroses={"cluster": cluster_name, "shard": shard_name, "replica": replica_name}, + macros={"cluster": cluster_name, "shard": shard_name, "replica": replica_name}, with_zookeeper=True) cluster.start() diff --git a/dbms/tests/integration/test_distributed_ddl/test.py b/dbms/tests/integration/test_distributed_ddl/test.py index 8621f723ac1..c2851438c00 100755 --- a/dbms/tests/integration/test_distributed_ddl/test.py +++ b/dbms/tests/integration/test_distributed_ddl/test.py @@ -72,7 +72,7 @@ def init_cluster(cluster): cluster.add_instance( 'ch{}'.format(i+1), config_dir="configs", - macroses={"layer": 0, "shard": i/2 + 1, "replica": i%2 + 1}, + macros={"layer": 0, "shard": i/2 + 1, "replica": i%2 + 1}, with_zookeeper=True) cluster.start() @@ -332,6 +332,26 @@ def test_allowed_databases(started_cluster): instance.query("DROP DATABASE db1 ON CLUSTER cluster", settings={"user" : "restricted_user"}) +def test_kill_query(started_cluster): + instance = cluster.instances['ch3'] + + ddl_check_query(instance, "KILL QUERY ON CLUSTER 'cluster' WHERE NOT elapsed FORMAT TSV") + +def test_detach_query(started_cluster): + instance = cluster.instances['ch3'] + + ddl_check_query(instance, "DROP TABLE 
IF EXISTS test_attach ON CLUSTER cluster FORMAT TSV") + ddl_check_query(instance, "CREATE TABLE test_attach ON CLUSTER cluster (i Int8) ENGINE = Log") + ddl_check_query(instance, "DETACH TABLE test_attach ON CLUSTER cluster FORMAT TSV") + ddl_check_query(instance, "ATTACH TABLE test_attach ON CLUSTER cluster") + + +def test_optimize_query(started_cluster): + instance = cluster.instances['ch3'] + + ddl_check_query(instance, "DROP TABLE IF EXISTS test_optimize ON CLUSTER cluster FORMAT TSV") + ddl_check_query(instance, "CREATE TABLE test_optimize ON CLUSTER cluster (p Date, i Int32) ENGINE = MergeTree(p, p, 8192)") + ddl_check_query(instance, "OPTIMIZE TABLE test_optimize ON CLUSTER cluster FORMAT TSV") if __name__ == '__main__': with contextmanager(started_cluster)() as cluster: diff --git a/dbms/tests/integration/test_extreme_deduplication/test.py b/dbms/tests/integration/test_extreme_deduplication/test.py index d1a19dc1c60..f8043632ba6 100644 --- a/dbms/tests/integration/test_extreme_deduplication/test.py +++ b/dbms/tests/integration/test_extreme_deduplication/test.py @@ -12,8 +12,8 @@ from helpers.client import QueryTimeoutExceedException cluster = ClickHouseCluster(__file__) -node1 = cluster.add_instance('node1', config_dir='configs', with_zookeeper=True, macroses={"layer": 0, "shard": 0, "replica": 1}) -node2 = cluster.add_instance('node2', config_dir='configs', with_zookeeper=True, macroses={"layer": 0, "shard": 0, "replica": 2}) +node1 = cluster.add_instance('node1', config_dir='configs', with_zookeeper=True, macros={"layer": 0, "shard": 0, "replica": 1}) +node2 = cluster.add_instance('node2', config_dir='configs', with_zookeeper=True, macros={"layer": 0, "shard": 0, "replica": 2}) nodes = [node1, node2] @pytest.fixture(scope="module") diff --git a/dbms/tests/integration/test_random_inserts/test.py b/dbms/tests/integration/test_random_inserts/test.py index 88abd762504..9e5029c5b64 100644 --- a/dbms/tests/integration/test_random_inserts/test.py +++ b/dbms/tests/integration/test_random_inserts/test.py @@ -14,8 +14,8 @@ from helpers.client import CommandRequest cluster = ClickHouseCluster(__file__) -node1 = cluster.add_instance('node1', config_dir='configs', with_zookeeper=True, macroses={"layer": 0, "shard": 0, "replica": 1}) -node2 = cluster.add_instance('node2', config_dir='configs', with_zookeeper=True, macroses={"layer": 0, "shard": 0, "replica": 2}) +node1 = cluster.add_instance('node1', config_dir='configs', with_zookeeper=True, macros={"layer": 0, "shard": 0, "replica": 1}) +node2 = cluster.add_instance('node2', config_dir='configs', with_zookeeper=True, macros={"layer": 0, "shard": 0, "replica": 2}) nodes = [node1, node2] @pytest.fixture(scope="module") diff --git a/dbms/tests/integration/test_replication_credentials/__init__.py b/dbms/tests/integration/test_replication_credentials/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/dbms/tests/integration/test_replication_credentials/configs/credentials1.xml b/dbms/tests/integration/test_replication_credentials/configs/credentials1.xml new file mode 100644 index 00000000000..1a5fbd393d5 --- /dev/null +++ b/dbms/tests/integration/test_replication_credentials/configs/credentials1.xml @@ -0,0 +1,7 @@
+<yandex>
+    <interserver_http_port>9009</interserver_http_port>
+    <interserver_http_credentials>
+        <user>admin</user>
+        <password>222</password>
+    </interserver_http_credentials>
+</yandex>
diff --git a/dbms/tests/integration/test_replication_credentials/configs/credentials2.xml b/dbms/tests/integration/test_replication_credentials/configs/credentials2.xml new file mode 100644 index 00000000000..cf846e7a53d --- /dev/null +++
b/dbms/tests/integration/test_replication_credentials/configs/credentials2.xml @@ -0,0 +1,7 @@
+<yandex>
+    <interserver_http_port>9009</interserver_http_port>
+    <interserver_http_credentials>
+        <user>root</user>
+        <password>111</password>
+    </interserver_http_credentials>
+</yandex>
diff --git a/dbms/tests/integration/test_replication_credentials/configs/no_credentials.xml b/dbms/tests/integration/test_replication_credentials/configs/no_credentials.xml new file mode 100644 index 00000000000..9822058811e --- /dev/null +++ b/dbms/tests/integration/test_replication_credentials/configs/no_credentials.xml @@ -0,0 +1,3 @@
+<yandex>
+    <interserver_http_port>9009</interserver_http_port>
+</yandex>
diff --git a/dbms/tests/integration/test_replication_credentials/configs/remote_servers.xml b/dbms/tests/integration/test_replication_credentials/configs/remote_servers.xml new file mode 100644 index 00000000000..d8b384a6392 --- /dev/null +++ b/dbms/tests/integration/test_replication_credentials/configs/remote_servers.xml @@ -0,0 +1,58 @@
+<yandex>
+    <remote_servers>
+        <test_cluster>
+            <shard>
+                <internal_replication>true</internal_replication>
+                <replica>
+                    <default_database>test</default_database>
+                    <host>node1</host>
+                    <port>9000</port>
+                </replica>
+                <replica>
+                    <default_database>test</default_database>
+                    <host>node2</host>
+                    <port>9000</port>
+                </replica>
+            </shard>
+            <shard>
+                <internal_replication>true</internal_replication>
+                <replica>
+                    <default_database>test</default_database>
+                    <host>node3</host>
+                    <port>9000</port>
+                </replica>
+                <replica>
+                    <default_database>test</default_database>
+                    <host>node4</host>
+                    <port>9000</port>
+                </replica>
+            </shard>
+            <shard>
+                <internal_replication>true</internal_replication>
+                <replica>
+                    <default_database>test</default_database>
+                    <host>node5</host>
+                    <port>9000</port>
+                </replica>
+                <replica>
+                    <default_database>test</default_database>
+                    <host>node6</host>
+                    <port>9000</port>
+                </replica>
+            </shard>
+            <shard>
+                <internal_replication>true</internal_replication>
+                <replica>
+                    <default_database>test</default_database>
+                    <host>node7</host>
+                    <port>9000</port>
+                </replica>
+                <replica>
+                    <default_database>test</default_database>
+                    <host>node8</host>
+                    <port>9000</port>
+                </replica>
+            </shard>
+        </test_cluster>
+    </remote_servers>
+</yandex>
diff --git a/dbms/tests/integration/test_replication_credentials/test.py b/dbms/tests/integration/test_replication_credentials/test.py new file mode 100644 index 00000000000..0b2163c05ad --- /dev/null +++ b/dbms/tests/integration/test_replication_credentials/test.py @@ -0,0 +1,132 @@ +import time +import pytest + +from helpers.cluster import ClickHouseCluster + + +def _fill_nodes(nodes, shard): + for node in nodes: + node.query( + ''' + CREATE DATABASE test; + + CREATE TABLE test_table(date Date, id UInt32, dummy UInt32) + ENGINE = ReplicatedMergeTree('/clickhouse/tables/test{shard}/replicated', '{replica}', date, id, 8192); + '''.format(shard=shard, replica=node.name)) +cluster = ClickHouseCluster(__file__) +node1 = cluster.add_instance('node1', main_configs=['configs/remote_servers.xml', 'configs/credentials1.xml'], with_zookeeper=True) +node2 = cluster.add_instance('node2', main_configs=['configs/remote_servers.xml', 'configs/credentials1.xml'], with_zookeeper=True) + + +@pytest.fixture(scope="module") +def same_credentials_cluster(): + try: + cluster.start() + + _fill_nodes([node1, node2], 1) + + yield cluster + + finally: + cluster.shutdown() +def test_same_credentials(same_credentials_cluster): + node1.query("insert into test_table values ('2017-06-16', 111, 0)") + time.sleep(1) + + assert node1.query("SELECT id FROM test_table order by id") == '111\n' + assert node2.query("SELECT id FROM test_table order by id") == '111\n' + + node2.query("insert into test_table values ('2017-06-17', 222, 1)") + time.sleep(1) + + assert node1.query("SELECT id FROM test_table order by id") == '111\n222\n' + assert node2.query("SELECT id FROM test_table order by id") == '111\n222\n' + + +node3 = cluster.add_instance('node3', main_configs=['configs/remote_servers.xml', 'configs/no_credentials.xml'], with_zookeeper=True) +node4 = cluster.add_instance('node4', main_configs=['configs/remote_servers.xml', 'configs/no_credentials.xml'], with_zookeeper=True) +@pytest.fixture(scope="module") +def no_credentials_cluster(): + try: + cluster.start() + + _fill_nodes([node3, node4], 2) + + yield cluster + + finally: + cluster.shutdown() + + +def test_no_credentials(no_credentials_cluster): + node3.query("insert into test_table values ('2017-06-18', 111, 0)") + time.sleep(1) + + assert node3.query("SELECT id FROM test_table order by id") == '111\n' + assert node4.query("SELECT
id FROM test_table order by id") == '111\n' + + node4.query("insert into test_table values ('2017-06-19', 222, 1)") + time.sleep(1) + + assert node3.query("SELECT id FROM test_table order by id") == '111\n222\n' + assert node4.query("SELECT id FROM test_table order by id") == '111\n222\n' + +node5 = cluster.add_instance('node5', main_configs=['configs/remote_servers.xml', 'configs/credentials1.xml'], with_zookeeper=True) +node6 = cluster.add_instance('node6', main_configs=['configs/remote_servers.xml', 'configs/credentials2.xml'], with_zookeeper=True) + +@pytest.fixture(scope="module") +def different_credentials_cluster(): + try: + cluster.start() + + _fill_nodes([node5, node6], 3) + + yield cluster + + finally: + cluster.shutdown() + +def test_different_credentials(different_credentials_cluster): + node5.query("insert into test_table values ('2017-06-20', 111, 0)") + time.sleep(1) + + assert node5.query("SELECT id FROM test_table order by id") == '111\n' + assert node6.query("SELECT id FROM test_table order by id") == '' + + node6.query("insert into test_table values ('2017-06-21', 222, 1)") + time.sleep(1) + + assert node5.query("SELECT id FROM test_table order by id") == '111\n' + assert node6.query("SELECT id FROM test_table order by id") == '222\n' + +node7 = cluster.add_instance('node7', main_configs=['configs/remote_servers.xml', 'configs/credentials1.xml'], with_zookeeper=True) +node8 = cluster.add_instance('node8', main_configs=['configs/remote_servers.xml', 'configs/no_credentials.xml'], with_zookeeper=True) + +@pytest.fixture(scope="module") +def credentials_and_no_credentials_cluster(): + try: + cluster.start() + + _fill_nodes([node7, node8], 4) + + yield cluster + + finally: + cluster.shutdown() + +def test_credentials_and_no_credentials(credentials_and_no_credentials_cluster): + node7.query("insert into test_table values ('2017-06-21', 111, 0)") + time.sleep(1) + + assert node7.query("SELECT id FROM test_table order by id") == '111\n' + assert node8.query("SELECT id FROM test_table order by id") == '' + + node8.query("insert into test_table values ('2017-06-22', 222, 1)") + time.sleep(1) + + assert node7.query("SELECT id FROM test_table order by id") == '111\n' + assert node8.query("SELECT id FROM test_table order by id") == '222\n' + diff --git a/dbms/tests/integration/test_storage_kafka/__init__.py b/dbms/tests/integration/test_storage_kafka/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/dbms/tests/integration/test_storage_kafka/configs/kafka.xml b/dbms/tests/integration/test_storage_kafka/configs/kafka.xml new file mode 100644 index 00000000000..e5c07881e06 --- /dev/null +++ b/dbms/tests/integration/test_storage_kafka/configs/kafka.xml @@ -0,0 +1,5 @@ + + + earliest + + diff --git a/dbms/tests/integration/test_storage_kafka/test.py b/dbms/tests/integration/test_storage_kafka/test.py new file mode 100644 index 00000000000..85d6f090d58 --- /dev/null +++ b/dbms/tests/integration/test_storage_kafka/test.py @@ -0,0 +1,68 @@ +import os.path as p +import time +import datetime +import pytest + +from helpers.cluster import ClickHouseCluster +from helpers.test_tools import TSV + +import json +import subprocess + + + +cluster = ClickHouseCluster(__file__) +instance = cluster.add_instance('instance', main_configs=['configs/kafka.xml'], with_kafka = True) + +@pytest.fixture(scope="module") +def started_cluster(): + try: + cluster.start() + instance.query('CREATE DATABASE test') + + yield cluster + + finally: + cluster.shutdown() + +def 
kafka_is_available(started_cluster): + p = subprocess.Popen(('docker', 'exec', '-i', started_cluster.kafka_docker_id, '/usr/bin/kafka-broker-api-versions', '--bootstrap-server', 'PLAINTEXT://localhost:9092'), stdout=subprocess.PIPE) + streamdata = p.communicate()[0] + return p.returncode == 0 + +def kafka_produce(started_cluster, topic, messages): + p = subprocess.Popen(('docker', 'exec', '-i', started_cluster.kafka_docker_id, '/usr/bin/kafka-console-producer', '--broker-list', 'localhost:9092', '--topic', topic), stdin=subprocess.PIPE) + p.communicate(messages) + p.stdin.close() + +def test_kafka_json(started_cluster): + instance.query(''' +DROP TABLE IF EXISTS test.kafka; +CREATE TABLE test.kafka (key UInt64, value UInt64) + ENGINE = Kafka('kafka1:9092', 'json', 'json', 'JSONEachRow', '\\n'); +''') + + retries = 0 + while True: + if kafka_is_available(started_cluster): + break + else: + retries += 1 + if retries > 50: + raise Exception('Cannot connect to kafka.') + print("Waiting for kafka to be available...") + time.sleep(1) + messages = '' + for i in xrange(50): + messages += json.dumps({'key': i, 'value': i}) + '\n' + kafka_produce(started_cluster, 'json', messages) + time.sleep(3) + result = instance.query('SELECT * FROM test.kafka;') + with open(p.join(p.dirname(__file__), 'test_kafka_json.reference')) as reference: + assert TSV(result) == TSV(reference) + instance.query('DROP TABLE test.kafka') + +if __name__ == '__main__': + cluster.start() + raw_input("Cluster created, press any key to destroy...") + cluster.shutdown() diff --git a/dbms/tests/integration/test_storage_kafka/test_kafka_json.reference b/dbms/tests/integration/test_storage_kafka/test_kafka_json.reference new file mode 100644 index 00000000000..959bb2aad74 --- /dev/null +++ b/dbms/tests/integration/test_storage_kafka/test_kafka_json.reference @@ -0,0 +1,50 @@ +0 0 +1 1 +2 2 +3 3 +4 4 +5 5 +6 6 +7 7 +8 8 +9 9 +10 10 +11 11 +12 12 +13 13 +14 14 +15 15 +16 16 +17 17 +18 18 +19 19 +20 20 +21 21 +22 22 +23 23 +24 24 +25 25 +26 26 +27 27 +28 28 +29 29 +30 30 +31 31 +32 32 +33 33 +34 34 +35 35 +36 36 +37 37 +38 38 +39 39 +40 40 +41 41 +42 42 +43 43 +44 44 +45 45 +46 46 +47 47 +48 48 +49 49 diff --git a/dbms/tests/queries/0_stateless/00534_filimonov.data b/dbms/tests/queries/0_stateless/00534_filimonov.data index 2dd470403c0..b4c15b01ef4 100644 --- a/dbms/tests/queries/0_stateless/00534_filimonov.data +++ b/dbms/tests/queries/0_stateless/00534_filimonov.data @@ -428,5 +428,7 @@ SELECT COVAR_SAMPArray([CAST( 0 AS Int8)],arrayPopBack([CAST( 0 AS Int8)])); SELECT medianTimingWeightedArray([CAST( 0 AS Int8)],arrayPopBack([CAST( 0 AS Int8)])); SELECT quantilesDeterministicArray([CAST( 0 AS Int8)],arrayPopBack([CAST( 0 AS Int32)])); -SELECT maxIntersections([], []) -SELECT sumMap([], []) +SELECT maxIntersections([], []); +SELECT sumMap([], []); + +SELECT countArray(); diff --git a/dbms/tests/queries/0_stateless/00674_has_array_enum.reference b/dbms/tests/queries/0_stateless/00674_has_array_enum.reference new file mode 100644 index 00000000000..d00491fd7e5 --- /dev/null +++ b/dbms/tests/queries/0_stateless/00674_has_array_enum.reference @@ -0,0 +1 @@ +1 diff --git a/dbms/tests/queries/0_stateless/00674_has_array_enum.sql b/dbms/tests/queries/0_stateless/00674_has_array_enum.sql new file mode 100644 index 00000000000..b8baf602216 --- /dev/null +++ b/dbms/tests/queries/0_stateless/00674_has_array_enum.sql @@ -0,0 +1 @@ +SELECT has([x], 10) FROM (SELECT CAST(10 AS Enum8('hello' = 1, 'world' = 2, 'abc' = 10)) AS x);
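The stateless test introduced next exercises the new `remote` table function behavior from the TableFunctionRemote.cpp hunk above: `INSERT INTO FUNCTION remote(...)` now works, and a table function such as `merge` may stand where a `database.table` pair used to be required. A hedged sketch of the two call forms (the host pattern and table names below are taken verbatim from the test itself):

```sql
-- INSERT through the remote() table function:
INSERT INTO FUNCTION remote('127.0.0.1', test.remote_test) VALUES (1);

-- A table function in place of the database.table argument:
SELECT count(*) FROM remote('127.0.0.{1,2}', merge(test, '^remote_test'));
```

diff --git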
a/dbms/tests/queries/0_stateless/00675_shard_remote_with_table_function.reference b/dbms/tests/queries/0_stateless/00675_shard_remote_with_table_function.reference new file mode 100644 index 00000000000..de78180725a --- /dev/null +++ b/dbms/tests/queries/0_stateless/00675_shard_remote_with_table_function.reference @@ -0,0 +1,2 @@ +4 +8 diff --git a/dbms/tests/queries/0_stateless/00675_shard_remote_with_table_function.sql b/dbms/tests/queries/0_stateless/00675_shard_remote_with_table_function.sql new file mode 100644 index 00000000000..f10ceeaa646 --- /dev/null +++ b/dbms/tests/queries/0_stateless/00675_shard_remote_with_table_function.sql @@ -0,0 +1,9 @@ +DROP TABLE IF EXISTS test.remote_test; +CREATE TABLE test.remote_test(a1 UInt8) ENGINE=Memory; +INSERT INTO FUNCTION remote('127.0.0.1', test.remote_test) VALUES(1); +INSERT INTO FUNCTION remote('127.0.0.1', test.remote_test) VALUES(2); +INSERT INTO FUNCTION remote('127.0.0.1', test.remote_test) VALUES(3); +INSERT INTO FUNCTION remote('127.0.0.1', test.remote_test) VALUES(4); +SELECT COUNT(*) FROM remote('127.0.0.1', test.remote_test); +SELECT count(*) FROM remote('127.0.0.{1,2}', merge(test, '^remote_test')); +DROP TABLE test.remote_test; diff --git a/debian/changelog b/debian/changelog index 28ae11ca4c8..68f349b3c45 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,5 +1,5 @@ -clickhouse (18.1.0) unstable; urgency=low +clickhouse (18.2.0) unstable; urgency=low * Modified source code - -- Fri, 20 Jul 2018 04:00:20 +0300 + -- Mon, 23 Jul 2018 22:38:09 +0300 diff --git a/docs/en/development/build_osx.md b/docs/en/development/build_osx.md index 46ce91b3b9d..b54fd7ac32c 100644 --- a/docs/en/development/build_osx.md +++ b/docs/en/development/build_osx.md @@ -12,7 +12,7 @@ With appropriate changes, it should also work on any other Linux distribution. ## Install required compilers, tools, and libraries ```bash -brew install cmake ninja gcc icu4c mysql openssl unixodbc libtool gettext readline +brew install cmake ninja gcc icu4c mariadb-connector-c openssl unixodbc libtool gettext readline ``` ## Checkout ClickHouse sources diff --git a/docs/en/index.md b/docs/en/index.md index bcc70a4bd3c..df958bdaa7b 100644 --- a/docs/en/index.md +++ b/docs/en/index.md @@ -78,9 +78,9 @@ See the difference? Read further to learn why this happens. For example, the query "count the number of records for each advertising platform" requires reading one "advertising platform ID" column, which takes up 1 byte uncompressed. If most of the traffic was not from advertising platforms, you can expect at least 10-fold compression of this column. When using a quick compression algorithm, data decompression is possible at a speed of at least several gigabytes of uncompressed data per second. In other words, this query can be processed at a speed of approximately several billion rows per second on a single server. This speed is actually achieved in practice. -Example: - -```bash +
<details markdown="1"><summary>Example</summary>
+<p>
+
+```bash
 $ clickhouse-client
 ClickHouse client version 0.0.52053.
 Connecting to localhost:9000.
@@ -122,7 +122,9 @@ LIMIT 20
 20 rows in set. Elapsed: 0.153 sec. Processed 1.00 billion rows, 4.00 GB (6.53 billion rows/s., 26.10 GB/s.)
 
 :)
-```
+```
+
+</p>
+</details>
### CPU diff --git a/docs/en/introduction/distinctive_features.md b/docs/en/introduction/distinctive_features.md index f626f13c274..ad6a7efc6e0 100644 --- a/docs/en/introduction/distinctive_features.md +++ b/docs/en/introduction/distinctive_features.md @@ -2,27 +2,27 @@ ## True column-oriented DBMS -In a true column-oriented DBMS, there isn't any "garbage" stored with the values. Among other things, this means that constant-length values must be supported, to avoid storing their length "number" next to the values. As an example, a billion UInt8-type values should actually consume around 1 GB uncompressed, or this will strongly affect the CPU use. It is very important to store data compactly (without any "garbage") even when uncompressed, since the speed of decompression (CPU usage) depends mainly on the volume of uncompressed data. +In a true column-oriented DBMS, there is no excessive data stored with the values. For example, this means that constant-length values must be supported, to avoid storing their length as an additional integer next to the values. In this case, a billion UInt8 values should actually consume around 1 GB uncompressed, or this will strongly affect the CPU use. It is very important to store data compactly even when uncompressed, since the speed of decompression (CPU usage) depends mainly on the volume of uncompressed data. -This is worth noting because there are systems that can store values of separate columns separately, but that can't effectively process analytical queries due to their optimization for other scenarios. Examples are HBase, BigTable, Cassandra, and HyperTable. In these systems, you will get throughput around a hundred thousand rows per second, but not hundreds of millions of rows per second. +This is worth noting because there are systems that can store values of different columns separately, but that can't effectively process analytical queries due to their optimization for other scenarios. Examples are HBase, BigTable, Cassandra, and HyperTable. In these systems, you will get throughput around a hundred thousand rows per second, but not hundreds of millions of rows per second. -Also note that ClickHouse is a DBMS, not a single database. ClickHouse allows creating tables and databases in runtime, loading data, and running queries without reconfiguring and restarting the server. +Also note that ClickHouse is a database management system, not a single database. ClickHouse allows creating tables and databases in runtime, loading data, and running queries without reconfiguring and restarting the server. ## Data compression -Some column-oriented DBMSs (InfiniDB CE and MonetDB) do not use data compression. However, data compression really improves performance. +Some column-oriented DBMSs (InfiniDB CE and MonetDB) do not use data compression. However, data compression is crucial to achieving excellent performance. ## Disk storage of data -Many column-oriented DBMSs (such as SAP HANA and Google PowerDrill) can only work in RAM. But even on thousands of servers, the RAM is too small for storing all the pageviews and sessions in Yandex.Metrica. +Many column-oriented DBMSs (such as SAP HANA and Google PowerDrill) can only work in RAM. This approach encourages allocating a larger hardware budget than is actually necessary for real-time analysis. ClickHouse is designed to work on regular hard drives, which ensures low cost of ownership per gigabyte of data, but SSD and additional RAM are also utilized fully if available.
## Parallel processing on multiple cores -Large queries are parallelized in a natural way. +Large queries are parallelized in a natural way, utilizing all necessary resources that are available on the current server. ## Distributed processing on multiple servers -Almost none of the columnar DBMSs listed above have support for distributed processing. +Almost none of the columnar DBMSs mentioned above have support for distributed query processing. In ClickHouse, data can reside on different shards. Each shard can be a group of replicas that are used for fault tolerance. The query is processed on all the shards in parallel. This is transparent for the user. ## SQL support @@ -33,30 +33,37 @@ However, this is a declarative query language based on SQL that can't be differe JOINs are supported. Subqueries are supported in FROM, IN, and JOIN clauses, as well as scalar subqueries. Dependent subqueries are not supported. +ClickHouse supports a declarative query language that is based on SQL and complies with the SQL standard in many cases. +GROUP BY, ORDER BY, scalar subqueries and subqueries in FROM, IN and JOIN clauses are supported. +Correlated subqueries and window functions are not supported. + ## Vector engine -Data is not only stored by columns, but is processed by vectors (parts of columns). This allows us to achieve high CPU performance. +Data is not only stored by columns, but is also processed by vectors (parts of columns). This makes it possible to achieve high CPU efficiency. ## Real-time data updates -ClickHouse supports primary key tables. In order to quickly perform queries on the range of the primary key, the data is sorted incrementally using the merge tree. Due to this, data can continually be added to the table. There is no locking when adding data. +ClickHouse supports tables with a primary key. In order to quickly perform queries on the range of the primary key, the data is sorted incrementally using the merge tree. Due to this, data can continually be added to the table. No locks are taken when new data is ingested. -## Indexes +## Index -Having a primary key makes it possible to extract data for specific clients (for instance, Yandex.Metrica tracking tags) for a specific time range, with low latency less than several dozen milliseconds. +Having data physically sorted by primary key makes it possible to extract data for specific values or value ranges with low latency, less than a few dozen milliseconds. ## Suitable for online queries -This lets us use the system as the back-end for a web interface. Low latency means queries can be processed without delay, while the Yandex.Metrica interface page is loading. In other words, in online mode. +Low latency means that queries can be processed without delay and without trying to prepare an answer in advance, right at the moment the user interface page is loading. In other words, online. ## Support for approximated calculations -1. The system contains aggregate functions for approximated calculation of the number of various values, medians, and quantiles. -2. Supports running a query based on a part (sample) of data and getting an approximated result. In this case, proportionally less data is retrieved from the disk. -3. Supports running an aggregation for a limited number of random keys, instead of for all keys. Under certain conditions for key distribution in the data, this provides a reasonably accurate result while using fewer resources.
+ClickHouse provides various ways to trade accuracy for performance: -## Data replication and support for data integrity on replicas +1. Aggregate functions for approximated calculation of the number of distinct values, medians, and quantiles. +2. Running a query based on a part (sample) of data and getting an approximated result. In this case, proportionally less data is retrieved from the disk. +3. Running an aggregation for a limited number of random keys, instead of for all keys. Under certain conditions for key distribution in the data, this provides a reasonably accurate result while using fewer resources. -Uses asynchronous multimaster replication. After being written to any available replica, data is distributed to all the remaining replicas. The system maintains identical data on different replicas. Data is restored automatically after a failure, or using a "button" for complex cases. -For more information, see the section [Data replication](../operations/table_engines/replication.md#table_engines-replication). +## Data replication and integrity + +ClickHouse uses asynchronous multimaster replication. After being written to any available replica, data is distributed to all the other replicas in background. The system maintains identical data on different replicas. Data is restored automatically after most failures, or semiautomatically in complicated cases. + +For more information, see the [Data replication](../operations/table_engines/replication.md#table_engines-replication) section. diff --git a/docs/en/introduction/features_considered_disadvantages.md b/docs/en/introduction/features_considered_disadvantages.md index 80708c02883..54a1f2c7a69 100644 --- a/docs/en/introduction/features_considered_disadvantages.md +++ b/docs/en/introduction/features_considered_disadvantages.md @@ -1,6 +1,6 @@ # ClickHouse features that can be considered disadvantages -1. No transactions. -2. For aggregation, query results must fit in the RAM on a single server. However, the volume of source data for a query may be indefinitely large. -3. Lack of full-fledged UPDATE/DELETE implementation. +1. No full-fledged transactions. +2. Lack of ability to modify or delete already inserted data at a high rate and with low latency. Batch deletes are available to clean up data that is not needed anymore or to comply with [GDPR](https://gdpr-info.eu). Batch updates are in development as of July 2018. +3. The sparse index makes ClickHouse poorly suited for point queries retrieving single rows by their keys. diff --git a/docs/en/operations/configuration_files.md b/docs/en/operations/configuration_files.md index 52e9e10ffea..1551ab47952 100644 --- a/docs/en/operations/configuration_files.md +++ b/docs/en/operations/configuration_files.md @@ -4,7 +4,7 @@ The main server config file is `config.xml`. It resides in the `/etc/clickhouse-server/` directory. -Individual settings can be overridden in the `*.xml`and`*.conf` files in the `conf.d` and `config.d` directories next to the config file. +Individual settings can be overridden in the `*.xml` and `*.conf` files in the `conf.d` and `config.d` directories next to the config file. The `replace` or `remove` attributes can be specified for the elements of these config files. @@ -12,11 +12,11 @@ If neither is specified, it combines the contents of elements recursively, repla If `replace` is specified, it replaces the entire element with the specified one. -If ` remove` is specified, it deletes the element. +If `remove` is specified, it deletes the element.
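A hedged SQL sketch of the three approximation tools listed in the distinctive_features.md hunk above; the `hits` table and its columns are illustrative assumptions, not part of this diff, and the `SAMPLE` clause requires the table to be created with a sampling expression:

```sql
-- Approximate count of distinct values:
SELECT uniq(UserID) FROM hits;

-- Approximate median (quantile):
SELECT quantile(0.5)(SendTiming) FROM hits;

-- Run the query on roughly a tenth of the data:
SELECT count() FROM hits SAMPLE 0.1;
```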
-The config can also define "substitutions". If an element has the `incl` attribute, the corresponding substitution from the file will be used as the value. By default, the path to the file with substitutions is `/etc/metrika.xml`. This can be changed in the [include_from](server_settings/settings.md#server_settings-include_from) element in the server config. The substitution values are specified in `/yandex/substitution_name` elements in this file. If a substitution specified in ` incl` does not exist, it is recorded in the log. To prevent ClickHouse from logging missing substitutions, specify the `optional="true"` attribute (for example, settings for [macros]()server_settings/settings.md#server_settings-macros)). +The config can also define "substitutions". If an element has the `incl` attribute, the corresponding substitution from the file will be used as the value. By default, the path to the file with substitutions is `/etc/metrika.xml`. This can be changed in the [include_from](server_settings/settings.md#server_settings-include_from) element in the server config. The substitution values are specified in `/yandex/substitution_name` elements in this file. If a substitution specified in `incl` does not exist, it is recorded in the log. To prevent ClickHouse from logging missing substitutions, specify the `optional="true"` attribute (for example, settings for [macros](server_settings/settings.md#server_settings-macros)). -Substitutions can also be performed from ZooKeeper. To do this, specify the attribute `from_zk = "/path/to/node"`. The element value is replaced with the contents of the node at ` /path/to/node` in ZooKeeper. You can also put an entire XML subtree on the ZooKeeper node and it will be fully inserted into the source element. +Substitutions can also be performed from ZooKeeper. To do this, specify the attribute `from_zk="/path/to/node"`. The element value is replaced with the contents of the node at `/path/to/node` in ZooKeeper. You can also put an entire XML subtree on the ZooKeeper node and it will be fully inserted into the source element. The `config.xml` file can specify a separate config with user settings, profiles, and quotas. The relative path to this config is set in the 'users_config' element. By default, it is `users.xml`. If `users_config` is omitted, the user settings, profiles, and quotas are specified directly in `config.xml`. diff --git a/docs/en/operations/table_engines/url.md b/docs/en/operations/table_engines/url.md new file mode 100644 index 00000000000..d8dec7dcabd --- /dev/null +++ b/docs/en/operations/table_engines/url.md @@ -0,0 +1,77 @@ + + +# URL(URL, Format) + +This data source operates with data on a remote HTTP/HTTPS server. The engine is +similar to [`File`](./file.md#). + +## Usage in ClickHouse server + +``` +URL(URL, Format) +``` + +`Format` should be supported for `SELECT` and/or `INSERT`. For the full list of +supported formats see [Formats](../../interfaces/formats.md#formats). + +`URL` must match the format of a Uniform Resource Locator. The specified +URL must address a server working with HTTP or HTTPS. The server shouldn't +require any additional HTTP headers. + +`INSERT` and `SELECT` queries are transformed into `POST` and `GET` requests +respectively. For correct `POST` request handling, the remote server must support +[Chunked transfer encoding](https://en.wikipedia.org/wiki/Chunked_transfer_encoding).
+ +**Example:** + +**1.** Create the `url_engine_table` table: + +```sql +CREATE TABLE url_engine_table (word String, value UInt64) +ENGINE=URL('http://127.0.0.1:12345/', CSV) +``` + +**2.** Implement a simple HTTP server using python3: + +```python3 +from http.server import BaseHTTPRequestHandler, HTTPServer + +class CSVHTTPServer(BaseHTTPRequestHandler): + def do_GET(self): + self.send_response(200) + self.send_header('Content-type', 'text/csv') + self.end_headers() + + self.wfile.write(bytes('Hello,1\nWorld,2\n', "utf-8")) + +if __name__ == "__main__": + server_address = ('127.0.0.1', 12345) + HTTPServer(server_address, CSVHTTPServer).serve_forever() +``` + +```bash +python3 server.py +``` + +**3.** Query the data: + +```sql +SELECT * FROM url_engine_table +``` + +```text +┌─word──┬─value─┐
+│ Hello │     1 │
+│ World │     2 │
+└───────┴───────┘
+``` + + +## Details of implementation + +- Reads and writes can be parallel +- Not supported: + - `ALTER` + - `SELECT ... SAMPLE` + - Indices + - Replication diff --git a/docs/en/operations/tips.md b/docs/en/operations/tips.md index 9378c25fab1..4d999062a5d 100644 --- a/docs/en/operations/tips.md +++ b/docs/en/operations/tips.md @@ -113,6 +113,10 @@ With the default settings, ZooKeeper is a time bomb: This bomb must be defused. +If you want to move data between different ZooKeeper clusters, never move it with a hand-written script, because it will produce incorrect data for sequential nodes. Never use the "zkcopy" tool, for the same reason: https://github.com/ksprojects/zkcopy/issues/15 + +If you want to split a ZooKeeper cluster, the proper way is to increase the number of replicas and then reconfigure it as two independent clusters. + The ZooKeeper (3.5.1) configuration below is used in the Yandex.Metrica production environment as of May 20, 2017: zoo.cfg: diff --git a/docs/en/query_language/misc.md b/docs/en/query_language/misc.md index afa6d5164dc..99601086e70 100644 --- a/docs/en/query_language/misc.md +++ b/docs/en/query_language/misc.md @@ -11,7 +11,7 @@ After executing an ATTACH query, the server will know about the existence of the If the table was previously detached (``DETACH``), meaning that its structure is known, you can use shorthand without defining the structure. ```sql -ATTACH TABLE [IF NOT EXISTS] [db.]name +ATTACH TABLE [IF NOT EXISTS] [db.]name [ON CLUSTER cluster] ``` This query is used when starting the server. The server stores table metadata as files with `ATTACH` queries, which it simply runs at launch (with the exception of system tables, which are explicitly created on the server). @@ -39,7 +39,7 @@ If `IF EXISTS` is specified, it doesn't return an error if the table doesn't exi Deletes information about the 'name' table from the server. The server stops knowing about the table's existence. ```sql -DETACH TABLE [IF EXISTS] [db.]name +DETACH TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster] ``` This does not delete the table's data or metadata. On the next server launch, the server will read the metadata and find out about the table again. @@ -167,7 +167,7 @@ To make settings that persist after a server restart, you can only use the serve ## OPTIMIZE ```sql -OPTIMIZE TABLE [db.]name [PARTITION partition] [FINAL] +OPTIMIZE TABLE [db.]name [ON CLUSTER cluster] [PARTITION partition] [FINAL] ``` Asks the table engine to do something for optimization.
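The `ON CLUSTER` forms documented in this misc.md hunk (and in the `KILL QUERY` hunk just below) are exercised by the test_distributed_ddl additions earlier in this diff. A hedged sketch, assuming a cluster named `cluster` and the table names used by that test:

```sql
-- Detach and re-attach a table on every node of the cluster:
DETACH TABLE test_attach ON CLUSTER cluster;
ATTACH TABLE test_attach ON CLUSTER cluster;

-- Cluster-wide OPTIMIZE and KILL QUERY (the WHERE condition below, taken from
-- the test, deliberately matches no running queries):
OPTIMIZE TABLE test_optimize ON CLUSTER cluster;
KILL QUERY ON CLUSTER cluster WHERE NOT elapsed;
```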
@@ -181,7 +181,7 @@ If you specify `FINAL`, optimization will be performed even when all the data is ## KILL QUERY ```sql -KILL QUERY +KILL QUERY [ON CLUSTER cluster] WHERE [SYNC|ASYNC|TEST] [FORMAT format] diff --git a/docs/en/query_language/table_functions/url.md b/docs/en/query_language/table_functions/url.md new file mode 100644 index 00000000000..7e30936bd45 --- /dev/null +++ b/docs/en/query_language/table_functions/url.md @@ -0,0 +1,19 @@ + + +# url + +`url(URL, format, structure)` - returns a table created from the `URL` with given +`format` and `structure`. + +URL - HTTP or HTTPS server address, which can accept `GET` and/or `POST` requests. + +format - [format](../../interfaces/formats.md#formats) of the data. + +structure - table structure in `'UserID UInt64, Name String'` format. Determines column names and types. + +**Example** + +```sql +-- getting the first 3 lines of a table that contains columns of String and UInt32 type from HTTP-server which answers in CSV format. +SELECT * FROM url('http://127.0.0.1:12345/', CSV, 'column1 String, column2 UInt32') LIMIT 3 +``` diff --git a/docs/ru/index.md b/docs/ru/index.md index e6eb4a99bc3..cd83b410f55 100644 --- a/docs/ru/index.md +++ b/docs/ru/index.md @@ -78,9 +78,9 @@ ClickHouse - столбцовая система управления базам Для примера, для запроса "посчитать количество записей для каждой рекламной системы", требуется прочитать один столбец "идентификатор рекламной системы", который занимает 1 байт в несжатом виде. Если большинство переходов было не с рекламных систем, то можно рассчитывать хотя бы на десятикратное сжатие этого столбца. При использовании быстрого алгоритма сжатия, возможно разжатие данных со скоростью более нескольких гигабайт несжатых данных в секунду. То есть, такой запрос может выполняться со скоростью около нескольких миллиардов строк в секунду на одном сервере. На практике, такая скорость действительно достигается. -Пример: - -```bash +
<details markdown="1"><summary>Пример</summary>
+<p>
+
+```bash
 $ clickhouse-client
 ClickHouse client version 0.0.52053.
 Connecting to localhost:9000.
@@ -122,7 +122,9 @@ LIMIT 20
 20 rows in set. Elapsed: 0.153 sec. Processed 1.00 billion rows, 4.00 GB (6.53 billion rows/s., 26.10 GB/s.)
 
 :)
-```
+```
+
+</p>
+</details>
### По вычислениям diff --git a/docs/ru/introduction/distinctive_features.md b/docs/ru/introduction/distinctive_features.md index 031a5c7f6bb..c85d464222b 100644 --- a/docs/ru/introduction/distinctive_features.md +++ b/docs/ru/introduction/distinctive_features.md @@ -2,23 +2,23 @@ ## По-настоящему столбцовая СУБД -В по-настоящему столбцовой СУБД рядом со значениями не хранится никакого "мусора". Например, должны поддерживаться значения постоянной длины, чтобы не хранить рядом со значениями типа "число" их длины. Для примера, миллиард значений типа UInt8 должен действительно занимать в несжатом виде около 1GB, иначе это сильно ударит по эффективности использования CPU. Очень важно хранить данные компактно (без "мусора") в том числе в несжатом виде, так как скорость разжатия (использование CPU) зависит, в основном, от объёма несжатых данных. +В по-настоящему столбцовой СУБД рядом со значениями не хранится никаких лишних данных. Например, должны поддерживаться значения постоянной длины, чтобы не хранить рядом со значениями типа "число" их длины. Для примера, миллиард значений типа UInt8 должен действительно занимать в несжатом виде около 1GB, иначе это сильно ударит по эффективности использования CPU. Очень важно хранить данные компактно (без "мусора") в том числе в несжатом виде, так как скорость разжатия (использование CPU) зависит, в основном, от объёма несжатых данных. -Этот пункт пришлось выделить, так как существуют системы, которые могут хранить значения отдельных столбцов по отдельности, но не могут эффективно выполнять аналитические запросы в силу оптимизации под другой сценарий работы. Примеры: HBase, BigTable, Cassandra, HyperTable. В этих системах вы получите throughput в районе сотен тысяч строк в секунду, но не сотен миллионов строк в секунду. +Этот пункт пришлось выделить, так как существуют системы, которые могут хранить значения отдельных столбцов по отдельности, но не могут эффективно выполнять аналитические запросы в силу оптимизации под другой сценарий работы. Примеры: HBase, BigTable, Cassandra, HyperTable. В этих системах вы получите пропускную способность в районе сотен тысяч строк в секунду, но не сотен миллионов строк в секунду. -Также стоит заметить, что ClickHouse является СУБД, а не одной базой данных. То есть, ClickHouse позволяет создавать таблицы и базы данных в runtime, загружать данные и выполнять запросы без переконфигурирования и перезапуска сервера. +Также стоит заметить, что ClickHouse является системой управления базами данных, а не одной базой данных. То есть, ClickHouse позволяет создавать таблицы и базы данных в runtime, загружать данные и выполнять запросы без переконфигурирования и перезапуска сервера. ## Сжатие данных -Некоторые столбцовые СУБД (InfiniDB CE, MonetDB) не используют сжатие данных. Но сжатие данных действительно серьёзно увеличивает производительность. +Некоторые столбцовые СУБД (InfiniDB CE, MonetDB) не используют сжатие данных. Однако сжатие данных действительно играет одну из ключевых ролей в демонстрации отличной производительности. ## Хранение данных на диске -Многие столбцовые СУБД (SAP HANA, Google PowerDrill) могут работать только в оперативке. Но оперативки (даже на тысячах серверах) слишком мало для хранения всех хитов и визитов в Яндекс.Метрике. +Многие столбцовые СУБД (SAP HANA, Google PowerDrill) могут работать только в оперативной памяти. Такой подход стимулирует выделять больший бюджет на оборудование, чем фактически требуется для анализа в реальном времени. 
ClickHouse спроектирован для работы на обычных жестких дисках, что обеспечивает низкую стоимость хранения на гигабайт данных, но SSD и дополнительная оперативная память тоже полноценно используются, если доступны. ## Параллельная обработка запроса на многих процессорных ядрах -Большие запросы естественным образом распараллеливаются. +Большие запросы естественным образом распараллеливаются, используя все необходимые ресурсы из доступных на сервере. ## Распределённая обработка запроса на многих серверах @@ -27,11 +27,9 @@ ## Поддержка SQL -Если вы знаете, что такое стандартный SQL, то говорить о поддержке SQL всё-таки нельзя. -Все функции названы по-другому. -Тем не менее, это - декларативный язык запросов на основе SQL и во многих случаях не отличимый от SQL. -Поддерживаются JOIN-ы. Поддерживаются подзапросы в секциях FROM, IN, JOIN, а также скалярные подзапросы. -Зависимые подзапросы не поддерживаются. +ClickHouse поддерживает декларативный язык запросов на основе SQL и во многих случаях совпадающий с SQL стандартом. +Поддерживаются GROUP BY, ORDER BY, подзапросы в секциях FROM, IN, JOIN, а также скалярные подзапросы. +Зависимые подзапросы и оконные функции не поддерживаются. ## Векторный движок @@ -41,21 +39,24 @@ ClickHouse поддерживает таблицы с первичным ключом. Для того, чтобы можно было быстро выполнять запросы по диапазону первичного ключа, данные инкрементально сортируются с помощью merge дерева. За счёт этого, поддерживается постоянное добавление данных в таблицу. Блокировки при добавлении данных отсутствуют. -## Наличие индексов +## Наличие индекса -Наличие первичного ключа позволяет, например, вынимать данные для конкретных клиентов (счётчиков Метрики), для заданного диапазона времени, с низкими задержками - менее десятков миллисекунд. +Физическая сортировка данных по первичному ключу позволяет получать данные для конкретных его значений или их диапазонов с низкими задержками - менее десятков миллисекунд. ## Подходит для онлайн запросов -Это позволяет использовать систему в качестве бэкенда для веб-интерфейса. Низкие задержки позволяют не откладывать выполнение запроса, а выполнять его в момент загрузки страницы интерфейса Яндекс.Метрики. То есть, в режиме онлайн. +Низкие задержки позволяют не откладывать выполнение запроса и не подготавливать ответ заранее, а выполнять его именно в момент загрузки страницы пользовательского интерфейса. То есть, в режиме онлайн. ## Поддержка приближённых вычислений +ClickHouse предоставляет различные способы разменять точность вычислений на производительность: + 1. Система содержит агрегатные функции для приближённого вычисления количества различных значений, медианы и квантилей. 2. Поддерживается возможность выполнить запрос на основе части (выборки) данных и получить приближённый результат. При этом, с диска будет считано пропорционально меньше данных. 3. Поддерживается возможность выполнить агрегацию не для всех ключей, а для ограниченного количества первых попавшихся ключей. При выполнении некоторых условий на распределение ключей в данных, это позволяет получить достаточно точный результат с использованием меньшего количества ресурсов. -## Репликация данных, поддержка целостности данных на репликах +## Репликация данных и поддержка целостности + +Используется асинхронная multimaster репликация. После записи на любую доступную реплику, данные распространяются на все остальные реплики в фоне. Система поддерживает полную идентичность данных на разных репликах.
Восстановление после большинства сбоев осуществляется автоматически, а в сложных случаях — полуавтоматически. -Используется асинхронная multimaster репликация. После записи на любую доступную реплику, данные распространяются на все остальные реплики. Система поддерживает полную идентичность данных на разных репликах. Восстановление после сбоя осуществляется автоматически, а в сложных случаях - "по кнопке". Подробнее смотрите раздел [Репликация данных](../operations/table_engines/replication.md#table_engines-replication). diff --git a/docs/ru/introduction/features_considered_disadvantages.md b/docs/ru/introduction/features_considered_disadvantages.md index c26272f4b6c..b7ac877cc32 100644 --- a/docs/ru/introduction/features_considered_disadvantages.md +++ b/docs/ru/introduction/features_considered_disadvantages.md @@ -1,6 +1,6 @@ # Особенности ClickHouse, которые могут считаться недостатками -1. Отсутствие транзакций. -2. Необходимо, чтобы результат выполнения запроса, в случае агрегации, помещался в оперативку на одном сервере. Объём исходных данных для запроса, при этом, может быть сколь угодно большим. -3. Отсутствие полноценной реализации UPDATE/DELETE. - +1. Отсутствие полноценных транзакций. +2. Возможность изменять или удалять ранее записанные данные с низкими задержками и высокой частотой запросов не предоставляется. Есть массовое удаление данных для очистки более не нужного или соответствия [GDPR](https://gdpr-info.eu). Массовое изменение данных находится в разработке (на момент июля 2018). +3. Разреженный индекс делает ClickHouse плохо пригодным для точечных чтений одиночных строк по своим +ключам. diff --git a/docs/ru/operations/table_engines/url.md b/docs/ru/operations/table_engines/url.md new file mode 100644 index 00000000000..b3daae06169 --- /dev/null +++ b/docs/ru/operations/table_engines/url.md @@ -0,0 +1,74 @@ + + +# URL(URL, Format) + +Управляет данными на удаленном HTTP/HTTPS сервере. Данный движок похож +на движок [`File`](./file.md#). + +## Использование движка в сервере ClickHouse + +`Format` должен быть таким, который ClickHouse может использовать в запросах +`SELECT` и, если есть необходимость, `INSERT`. Полный список поддерживаемых форматов смотрите в +разделе [Форматы](../../interfaces/formats.md#formats). + +`URL` должен соответствовать структуре Uniform Resource Locator. По указанному URL должен находиться сервер, +работающий по протоколу HTTP или HTTPS. При этом не должно требоваться никаких +дополнительных заголовков для получения ответа от сервера. + +Запросы `INSERT` и `SELECT` транслируются в `POST` и `GET` запросы +соответственно. Для обработки `POST`-запросов удаленный сервер должен поддерживать +[Chunked transfer encoding](https://ru.wikipedia.org/wiki/Chunked_transfer_encoding).
diff --git a/docs/ru/operations/table_engines/url.md b/docs/ru/operations/table_engines/url.md
new file mode 100644
index 00000000000..b3daae06169
--- /dev/null
+++ b/docs/ru/operations/table_engines/url.md
@@ -0,0 +1,74 @@
+
+# URL(URL, Format)
+
+Управляет данными на удалённом HTTP/HTTPS сервере. Данный движок похож
+на движок [`File`](./file.md#).
+
+## Использование движка в сервере ClickHouse
+
+`Format` должен быть таким, который ClickHouse может использовать в запросах
+`SELECT` и, если есть необходимость, `INSERT`. Полный список поддерживаемых форматов смотрите в
+разделе [Форматы](../../interfaces/formats.md#formats).
+
+`URL` должен соответствовать структуре Uniform Resource Locator. По указанному URL должен находиться сервер,
+работающий по протоколу HTTP или HTTPS. При этом не должно требоваться никаких
+дополнительных заголовков для получения ответа от сервера.
+
+Запросы `INSERT` и `SELECT` транслируются в `POST` и `GET` запросы
+соответственно. Для обработки `POST`-запросов удалённый сервер должен поддерживать
+[Chunked transfer encoding](https://ru.wikipedia.org/wiki/Chunked_transfer_encoding).
+
+**Пример:**
+
+**1.** Создадим на сервере таблицу `url_engine_table`:
+
+```sql
+CREATE TABLE url_engine_table (word String, value UInt64)
+ENGINE=URL('http://127.0.0.1:12345/', CSV)
+```
+
+**2.** Создадим простейший HTTP-сервер стандартными средствами языка python3 и
+запустим его:
+
+```python3
+from http.server import BaseHTTPRequestHandler, HTTPServer
+
+class CSVHTTPServer(BaseHTTPRequestHandler):
+    def do_GET(self):
+        self.send_response(200)
+        self.send_header('Content-type', 'text/csv')
+        self.end_headers()
+
+        self.wfile.write(bytes('Hello,1\nWorld,2\n', "utf-8"))
+
+if __name__ == "__main__":
+    server_address = ('127.0.0.1', 12345)
+    HTTPServer(server_address, CSVHTTPServer).serve_forever()
+```
+
+```bash
+python3 server.py
+```
+
+**3.** Запросим данные:
+
+```sql
+SELECT * FROM url_engine_table
+```
+
+```text
+┌─word──┬─value─┐
+│ Hello │     1 │
+│ World │     2 │
+└───────┴───────┘
+```
+
+## Особенности использования
+
+- Поддерживается многопоточное чтение и запись.
+- Не поддерживается:
+    - использование операций `ALTER` и `SELECT...SAMPLE`;
+    - индексы;
+    - репликация.
+
diff --git a/docs/ru/operations/tips.md b/docs/ru/operations/tips.md
index 315a8fb07fa..a1ddc9246e5 100644
--- a/docs/ru/operations/tips.md
+++ b/docs/ru/operations/tips.md
@@ -107,6 +107,10 @@ XFS также подходит, но не так тщательно проте
Лучше использовать свежую версию ZooKeeper, как минимум 3.4.9. Версия в стабильных дистрибутивах Linux может быть устаревшей.

+Никогда не используйте написанные вручную скрипты для переноса данных между разными ZooKeeper-кластерами: результат будет некорректным для sequential-нод. Никогда не используйте утилиту "zkcopy", по той же причине: https://github.com/ksprojects/zkcopy/issues/15
+
+Если вы хотите разделить существующий ZooKeeper-кластер на два, правильный способ - увеличить количество его реплик, а затем переконфигурировать его как два независимых кластера.
+
С настройками по умолчанию, ZooKeeper является бомбой замедленного действия:

> Сервер ZooKeeper не будет удалять файлы со старыми снепшотами и логами при использовании конфигурации по умолчанию (см. autopurge), это является ответственностью оператора.

diff --git a/docs/ru/query_language/misc.md b/docs/ru/query_language/misc.md
index f7e00babe1a..5385c0f20fd 100644
--- a/docs/ru/query_language/misc.md
+++ b/docs/ru/query_language/misc.md
@@ -13,7 +13,7 @@
Если таблица перед этим была отсоединена (`DETACH`), т.е. её структура известна, то можно использовать сокращённую форму записи без определения структуры.

```sql
-ATTACH TABLE [IF NOT EXISTS] [db.]name
+ATTACH TABLE [IF NOT EXISTS] [db.]name [ON CLUSTER cluster]
```

Этот запрос используется при старте сервера. Сервер хранит метаданные таблиц в виде файлов с запросами `ATTACH`, которые он просто исполняет при запуске (за исключением системных таблиц, создание которых явно вписано в сервер).

@@ -39,7 +39,7 @@ DROP [TEMPORARY] TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]

Удаляет из сервера информацию о таблице name. Сервер перестаёт знать о существовании таблицы.

```sql
-DETACH TABLE [IF EXISTS] [db.]name
+DETACH TABLE [IF EXISTS] [db.]name [ON CLUSTER cluster]
```

Но ни данные, ни метаданные таблицы не удаляются. При следующем запуске сервера, сервер прочитает метаданные и снова узнает о таблице.
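Набросок типичного цикла `DETACH`/`ATTACH` (имя таблицы условное): `DETACH` убирает таблицу из сервера, не трогая данные на диске, а сокращённая форма `ATTACH` возвращает её обратно:

```sql
DETACH TABLE hits;               -- сервер "забывает" таблицу; данные и метаданные остаются на диске
ATTACH TABLE IF NOT EXISTS hits; -- сервер заново читает метаданные, таблица снова доступна
```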
@@ -166,7 +166,7 @@ SET param = value

## OPTIMIZE

```sql
-OPTIMIZE TABLE [db.]name [PARTITION partition] [FINAL]
+OPTIMIZE TABLE [db.]name [ON CLUSTER cluster] [PARTITION partition] [FINAL]
```

Просит движок таблицы сделать что-нибудь, что может привести к более оптимальной работе.

@@ -180,7 +180,7 @@

## KILL QUERY

```sql
-KILL QUERY
+KILL QUERY [ON CLUSTER cluster]
  WHERE <where expression to SELECT FROM system.processes query>
  [SYNC|ASYNC|TEST]
  [FORMAT format]
```
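Гипотетические примеры к двум запросам выше; имена таблицы и кластера, партиция и `query_id` условные:

```sql
-- Внеплановое слияние кусков одной партиции на всех хостах кластера.
OPTIMIZE TABLE hits ON CLUSTER my_cluster PARTITION 201807 FINAL;

-- Остановка запроса с известным query_id (его можно найти в system.processes).
KILL QUERY WHERE query_id = '2-857d-4a57-9ee0-327da5d60a90' ASYNC;
```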
diff --git a/docs/ru/query_language/table_functions/url.md b/docs/ru/query_language/table_functions/url.md
new file mode 100644
index 00000000000..7c5068b3caa
--- /dev/null
+++ b/docs/ru/query_language/table_functions/url.md
@@ -0,0 +1,20 @@
+
+# url
+
+`url(URL, format, structure)` - возвращает таблицу со столбцами, указанными в
+`structure`, созданную из данных, находящихся по `URL`, в формате `format`.
+
+URL - адрес, по которому сервер принимает `GET` и/или `POST` запросы по
+протоколу HTTP или HTTPS.
+
+format - [формат](../../interfaces/formats.md#formats) данных.
+
+structure - структура таблицы в форме `'UserID UInt64, Name String'`. Определяет имена и типы столбцов.
+
+**Пример**
+
+```sql
+-- получение 3-х строк таблицы, состоящей из двух колонок типа String и UInt32, от сервера, отдающего данные в формате CSV
+SELECT * FROM url('http://127.0.0.1:12345/', CSV, 'column1 String, column2 UInt32') LIMIT 3
+```
diff --git a/docs/toc_en.yml b/docs/toc_en.yml
index a55b7426272..7f830eac379 100644
--- a/docs/toc_en.yml
+++ b/docs/toc_en.yml
@@ -94,6 +94,7 @@ pages:
    - 'merge': 'query_language/table_functions/merge.md'
    - 'numbers': 'query_language/table_functions/numbers.md'
    - 'remote': 'query_language/table_functions/remote.md'
+   - 'url': 'query_language/table_functions/url.md'
- 'Dictionaries':
    - 'Introduction': 'query_language/dicts/index.md'
    - 'External dictionaries':
@@ -134,6 +135,7 @@ pages:
    - 'Null': 'operations/table_engines/null.md'
    - 'Set': 'operations/table_engines/set.md'
    - 'Join': 'operations/table_engines/join.md'
+   - 'URL': 'operations/table_engines/url.md'
    - 'View': 'operations/table_engines/view.md'
    - 'MaterializedView': 'operations/table_engines/materializedview.md'
- 'Integrations':
diff --git a/docs/toc_ru.yml b/docs/toc_ru.yml
index 2b088529cbb..9e086eed378 100644
--- a/docs/toc_ru.yml
+++ b/docs/toc_ru.yml
@@ -97,6 +97,7 @@ pages:
    - 'merge': 'query_language/table_functions/merge.md'
    - 'numbers': 'query_language/table_functions/numbers.md'
    - 'remote': 'query_language/table_functions/remote.md'
+   - 'url': 'query_language/table_functions/url.md'
- 'Словари':
    - 'Введение': 'query_language/dicts/index.md'
    - 'Внешние словари':
@@ -138,6 +139,7 @@ pages:
    - 'Null': 'operations/table_engines/null.md'
    - 'Set': 'operations/table_engines/set.md'
    - 'Join': 'operations/table_engines/join.md'
+   - 'URL': 'operations/table_engines/url.md'
    - 'View': 'operations/table_engines/view.md'
    - 'MaterializedView': 'operations/table_engines/materializedview.md'
- 'Интеграции':
diff --git a/docs/tools/build.py b/docs/tools/build.py
index c5a69217b02..ff1551ed8d7 100755
--- a/docs/tools/build.py
+++ b/docs/tools/build.py
@@ -60,6 +60,7 @@ def build_for_lang(lang, args):
        'static_templates': ['404.html'],
        'extra': {
            'single_page': False,
+           'opposite_lang': 'en' if lang == 'ru' else 'ru',
            'search': {
                'language': 'en' if lang == 'en' else 'en, %s' % lang
            }
@@ -108,6 +109,7 @@ def build_single_page_version(lang, args, cfg):
        'site_dir': temp,
        'extra': {
            'single_page': True,
+           'opposite_lang': 'en' if lang == 'ru' else 'ru',
            'search': {
                'language': 'en, ru'
            }
diff --git a/docs/tools/mkdocs-material-theme/base.html b/docs/tools/mkdocs-material-theme/base.html
index 548f57c853c..97283c55078 100644
--- a/docs/tools/mkdocs-material-theme/base.html
+++ b/docs/tools/mkdocs-material-theme/base.html
@@ -224,34 +224,11 @@
        }
      });
    }
-    function drawLanguageSwitch() {
-      var url, text, title;
-      if (window.location.pathname.indexOf('/ru/') >= 0) {
-        url = window.location.pathname.replace('/ru/', '/en/');
-        text = "\n" +
-          "\n" +
-          "\n" +
-          "\n" +
-          "\n" +
-          "";
-        title = "Switch to English"
-      } else {
-        url = window.location.pathname.replace('/en/', '/ru/');
-        text = "\n" +
-          "\n" +
-          "\n" +
-          "\n" +
-          "";
-        title = "Переключить на русский язык"
-      }
-      document.getElementById("md-language-switch").innerHTML = '' + text + '';
-    }
    ready(function () {
      {% if config.extra.single_page and page.content %}
      document.getElementById("content").innerHTML = {{ page.content|tojson|safe }};
      document.getElementsByClassName('md-footer')[0].style.display = 'block';
      {% endif %}
-      drawLanguageSwitch();
      app.initialize({
        version: "{{ mkdocs_version }}",
        url: {
diff --git a/docs/tools/mkdocs-material-theme/partials/header.html b/docs/tools/mkdocs-material-theme/partials/header.html
index 9b81a53de8d..21b95950b88 100644
--- a/docs/tools/mkdocs-material-theme/partials/header.html
+++ b/docs/tools/mkdocs-material-theme/partials/header.html
@@ -40,9 +40,27 @@
      {% endif %}
    {% endblock %}
-    {% if config.repo_url %}
+    {% if page %}
+ {% if config.extra.lang == 'ru' %} + + + + + + + + + {% else %} + + + + + + + + {% endif %}
{% endif %}
diff --git a/docs/tools/mkdocs-material-theme/partials/nav.html b/docs/tools/mkdocs-material-theme/partials/nav.html
index c306ba1746a..66366e6ad40 100644
--- a/docs/tools/mkdocs-material-theme/partials/nav.html
+++ b/docs/tools/mkdocs-material-theme/partials/nav.html
@@ -1,9 +1,9 @@
diff --git a/libs/libmysqlxx/CMakeLists.txt b/libs/libmysqlxx/CMakeLists.txt
index cd80a7b9b24..95c1c44c315 100644
--- a/libs/libmysqlxx/CMakeLists.txt
+++ b/libs/libmysqlxx/CMakeLists.txt
@@ -37,7 +37,7 @@ endif ()

if (APPLE)
    find_library (ICONV_LIBRARY iconv)
-    set (MYSQLCLIENT_LIBRARIES ${STATIC_MYSQLCLIENT_LIB} ${ICONV_LIBRARY})
+    set (MYSQLCLIENT_LIBRARIES ${MYSQLCLIENT_LIBRARIES} ${STATIC_MYSQLCLIENT_LIB} ${ICONV_LIBRARY})
elseif (USE_STATIC_LIBRARIES AND STATIC_MYSQLCLIENT_LIB)
    set (MYSQLCLIENT_LIB ${CMAKE_CURRENT_BINARY_DIR}/libmysqlclient.a)
    add_custom_command (
diff --git a/libs/libmysqlxx/cmake/find_mysqlclient.cmake b/libs/libmysqlxx/cmake/find_mysqlclient.cmake
index 4b37aca436c..d019703e876 100644
--- a/libs/libmysqlxx/cmake/find_mysqlclient.cmake
+++ b/libs/libmysqlxx/cmake/find_mysqlclient.cmake
@@ -5,6 +5,7 @@ if (ENABLE_MYSQL)
        "/usr/local/opt/mysql/lib"
        "/usr/local/lib"
        "/usr/local/lib64"
+        "/usr/local/lib/mariadb" # macos brew mariadb-connector-c
        "/usr/mysql/lib"
        "/usr/mysql/lib64"
        "/usr/lib"
@@ -18,17 +19,21 @@ if (ENABLE_MYSQL)
        "/usr/local/include"
        "/usr/include")

-    find_path (MYSQL_INCLUDE_DIR NAMES mysql/mysql.h PATHS ${MYSQL_INCLUDE_PATHS} PATH_SUFFIXES mysql)
+    find_path (MYSQL_INCLUDE_DIR NAMES mysql/mysql.h mariadb/mysql.h PATHS ${MYSQL_INCLUDE_PATHS} PATH_SUFFIXES mysql)

    if (USE_STATIC_LIBRARIES)
        find_library (STATIC_MYSQLCLIENT_LIB NAMES mariadbclient mysqlclient PATHS ${MYSQL_LIB_PATHS} PATH_SUFFIXES mysql)
    else ()
-        find_library (MYSQLCLIENT_LIBRARIES NAMES mariadbclient mysqlclient PATHS ${MYSQL_LIB_PATHS} PATH_SUFFIXES mysql)
+        find_library (MYSQLCLIENT_LIBRARIES NAMES mariadb mariadbclient mysqlclient PATHS ${MYSQL_LIB_PATHS} PATH_SUFFIXES mysql)
    endif ()

    if (MYSQL_INCLUDE_DIR AND (STATIC_MYSQLCLIENT_LIB OR MYSQLCLIENT_LIBRARIES))
        set (USE_MYSQL 1)
        set (MYSQLXX_LIBRARY mysqlxx)
+        if (APPLE)
+            # /usr/local/include/mysql/mysql_com.h:1011:10: fatal error: mysql/udf_registration_types.h: No such file or directory
+            set(MYSQL_INCLUDE_DIR ${MYSQL_INCLUDE_DIR} ${MYSQL_INCLUDE_DIR}/mysql)
+        endif ()
    endif ()
endif ()
diff --git a/libs/libmysqlxx/src/Connection.cpp b/libs/libmysqlxx/src/Connection.cpp
index 00eeed49616..6d9e7d5b673 100644
--- a/libs/libmysqlxx/src/Connection.cpp
+++ b/libs/libmysqlxx/src/Connection.cpp
@@ -1,4 +1,8 @@
+#if __has_include(<mariadb/mysql.h>)
+#include <mariadb/mysql.h>
+#else
#include <mysql/mysql.h>
+#endif

#include
#include
diff --git a/libs/libmysqlxx/src/Exception.cpp b/libs/libmysqlxx/src/Exception.cpp
index 623f66d720b..dadd37e29e7 100644
--- a/libs/libmysqlxx/src/Exception.cpp
+++ b/libs/libmysqlxx/src/Exception.cpp
@@ -1,4 +1,8 @@
+#if __has_include(<mariadb/mysql.h>)
+#include <mariadb/mysql.h>
+#else
#include <mysql/mysql.h>
+#endif

#include
diff --git a/libs/libmysqlxx/src/Pool.cpp b/libs/libmysqlxx/src/Pool.cpp
index 33065df2bb5..fec612abcad 100644
--- a/libs/libmysqlxx/src/Pool.cpp
+++ b/libs/libmysqlxx/src/Pool.cpp
@@ -1,5 +1,10 @@
+#if __has_include(<mariadb/mysql.h>)
+#include <mariadb/mysql.h>
+#include <mariadb/mysqld_error.h>
+#else
#include <mysql/mysql.h>
#include <mysql/mysqld_error.h>
+#endif

#include
diff --git a/libs/libmysqlxx/src/Query.cpp b/libs/libmysqlxx/src/Query.cpp
index a9dcc3d768b..0bcafa04421 100644
--- a/libs/libmysqlxx/src/Query.cpp
+++ b/libs/libmysqlxx/src/Query.cpp
@@ -1,4 +1,8 @@
+#if __has_include(<mariadb/mysql.h>)
+#include <mariadb/mysql.h>
+#else
#include <mysql/mysql.h>
+#endif

#include
#include
diff --git a/libs/libmysqlxx/src/ResultBase.cpp b/libs/libmysqlxx/src/ResultBase.cpp
index ccfe320bfdc..b03f92e38f2 100644
--- a/libs/libmysqlxx/src/ResultBase.cpp
+++ b/libs/libmysqlxx/src/ResultBase.cpp
@@ -1,4 +1,8 @@
+#if __has_include(<mariadb/mysql.h>)
+#include <mariadb/mysql.h>
+#else
#include <mysql/mysql.h>
+#endif

#include
#include
diff --git a/libs/libmysqlxx/src/Row.cpp b/libs/libmysqlxx/src/Row.cpp
index a24e705f24a..e4baa681d69 100644
--- a/libs/libmysqlxx/src/Row.cpp
+++ b/libs/libmysqlxx/src/Row.cpp
@@ -1,4 +1,8 @@
+#if __has_include(<mariadb/mysql.h>)
+#include <mariadb/mysql.h>
+#else
#include <mysql/mysql.h>
+#endif

#include
diff --git a/libs/libmysqlxx/src/StoreQueryResult.cpp b/libs/libmysqlxx/src/StoreQueryResult.cpp
index 2c8d5d79fbe..05ad4299e17 100644
--- a/libs/libmysqlxx/src/StoreQueryResult.cpp
+++ b/libs/libmysqlxx/src/StoreQueryResult.cpp
@@ -1,4 +1,8 @@
+#if __has_include(<mariadb/mysql.h>)
+#include <mariadb/mysql.h>
+#else
#include <mysql/mysql.h>
+#endif

#include
#include
diff --git a/libs/libmysqlxx/src/UseQueryResult.cpp b/libs/libmysqlxx/src/UseQueryResult.cpp
index a779a4282af..c5c52ffcb9c 100644
--- a/libs/libmysqlxx/src/UseQueryResult.cpp
+++ b/libs/libmysqlxx/src/UseQueryResult.cpp
@@ -1,4 +1,8 @@
+#if __has_include(<mariadb/mysql.h>)
+#include <mariadb/mysql.h>
+#else
#include <mysql/mysql.h>
+#endif

#include
#include
diff --git a/utils/build/build_macos.sh b/utils/build/build_macos.sh
index 4ca50e2d78c..462dd3e528e 100755
--- a/utils/build/build_macos.sh
+++ b/utils/build/build_macos.sh
@@ -12,7 +12,7 @@ fi

## Install required compilers, tools, libraries

-brew install cmake gcc icu4c mysql openssl unixodbc libtool gettext readline
+brew install cmake gcc icu4c mariadb-connector-c openssl unixodbc libtool gettext readline

## Checkout ClickHouse sources

diff --git a/website/index.html b/website/index.html
index 9f9c75308d6..8d2ed3ec7ce 100644
--- a/website/index.html
+++ b/website/index.html
@@ -390,7 +390,7 @@
 sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E0C56BD4    # optional
 
-sudo apt-add-repository "deb http://repo.yandex.ru/clickhouse/deb/stable/ main/"
+echo "deb http://repo.yandex.ru/clickhouse/deb/stable/ main/" | sudo tee /etc/apt/sources.list.d/clickhouse.list
 sudo apt-get update
 
 sudo apt-get install -y clickhouse-server clickhouse-client
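 # Hypothetical follow-up, not part of the original snippet: start the server
 # and run a trivial query to verify the installation (service and client names
 # are those shipped in the Debian packages of that period).
 sudo service clickhouse-server start
 clickhouse-client --query "SELECT version()"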