Since ClickHouse does not support unquoted utf-8 strings but MySQL does.
Instead of fixing Lexer to recognize utf-8 chars as TokenType::BareWord,
suggesting to quote all unrecognized tokens before applying any DDL.
Actual parsing and validating the syntax will be done by particular Parser.
If there is any TokenType::Error, the query is unable to be parsed anyway.
Quoting such tokens can provide the support of utf-8 names.
See `tryQuoteUnrecognizedTokens` and `QuoteUnrecognizedTokensTest`.
mysql> CREATE TABLE 道.渠(...
is converted to
CREATE TABLE `道`.`渠`(...
Also fixed the bug with missing * while doing SELECT in full sync because db or table name are back quoted when not needed.
Since there can be some leftovers:
2023.07.24 07:08:25.238066 [ 140 ] {} <Error> Application: Code: 219. DB::Exception: Cannot drop: filesystem error: in remove: Directory not empty ["/var/lib/clickhouse/data/system/"]. Probably database contain some detached tables or metadata leftovers from Ordinary engine. If you want to remove all data anyway, try to attach database back and drop it again with enabled force_remove_data_recursively_on_drop setting: Exception while trying to convert database system from Ordinary to Atomic. It may be in some intermediate state. You can finish conversion manually by moving the rest tables from system to .tmp_convert.system.9396432095832455195 (using RENAME TABLE) and executing DROP DATABASE system and RENAME DATABASE .tmp_convert.system.9396432095832455195 TO system. (DATABASE_NOT_EMPTY), Stack trace (when copying this message, always include the lines below):
0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000e68af57 in /usr/bin/clickhouse
1. ? @ 0x000000000cab443c in /usr/bin/clickhouse
2. DB::DatabaseOnDisk::drop(std::shared_ptr<DB::Context const>) @ 0x000000001328d617 in /usr/bin/clickhouse
3. DB::DatabaseCatalog::detachDatabase(std::shared_ptr<DB::Context const>, String const&, bool, bool) @ 0x0000000013524a6c in /usr/bin/clickhouse
4. DB::InterpreterDropQuery::executeToDatabaseImpl(DB::ASTDropQuery const&, std::shared_ptr<DB::IDatabase>&, std::vector<StrongTypedef<wide::integer<128ul, unsigned int>, DB::UUIDTag>, std::allocator<StrongTypedef<wide::integer<128ul, unsigned int>, DB::UUIDTag>>>&) @ 0x0000000013bc05e4 in /usr/bin/clickhouse
5. DB::InterpreterDropQuery::executeToDatabase(DB::ASTDropQuery const&) @ 0x0000000013bbc6b8 in /usr/bin/clickhouse
6. DB::InterpreterDropQuery::execute() @ 0x0000000013bbba22 in /usr/bin/clickhouse
7. ? @ 0x00000000140b13a5 in /usr/bin/clickhouse
8. DB::executeQuery(String const&, std::shared_ptr<DB::Context>, bool, DB::QueryProcessingStage::Enum) @ 0x00000000140ad20e in /usr/bin/clickhouse
9. ? @ 0x00000000140d2ef0 in /usr/bin/clickhouse
10. DB::maybeConvertSystemDatabase(std::shared_ptr<DB::Context>) @ 0x00000000140d0aaf in /usr/bin/clickhouse
11. DB::Server::main(std::vector<String, std::allocator<String>> const&) @ 0x000000000e724e55 in /usr/bin/clickhouse
12. Poco::Util::Application::run() @ 0x0000000017ead086 in /usr/bin/clickhouse
13. DB::Server::run() @ 0x000000000e714a5d in /usr/bin/clickhouse
14. Poco::Util::ServerApplication::run(int, char**) @ 0x0000000017ec07b9 in /usr/bin/clickhouse
15. mainEntryClickHouseServer(int, char**) @ 0x000000000e711a26 in /usr/bin/clickhouse
16. main @ 0x0000000008cf13cf in /usr/bin/clickhouse
17. __libc_start_main @ 0x0000000000021b97 in /lib/x86_64-linux-gnu/libc-2.27.so
18. _start @ 0x00000000080705ae in /usr/bin/clickhouse
(version 23.7.1.2012)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
The problem is that old versions of cURL (7.81.0 at least) handle
additional parameters incorrectly if in previous parameter was "/":
$ docker run --rm curlimages/curl:8.1.2 --http1.1 --get -vvv 'http://kernel.org/?bar=foo/baz' --data-urlencode "query=select 1 format Null"; echo
> GET /?bar=foo/baz&query=select+1+format+Null HTTP/1.1
> User-Agent: curl/8.1.2
$ docker run --rm curlimages/curl:7.81.0 --http1.1 --get -vvv 'http://kernel.org/?bar=foo/baz' --data-urlencode "query=select 1 format Null"; echo
> GET /?bar=foo/baz?query=select+1+format+Null HTTP/1.1
> User-Agent: curl/7.81.0-DEV
Note, that I thought about making the same for cli, but it is not that
easy, even after getting rid of sh -c and string contantenation, it
still cannot be done for CLICKHOUSE_CLIENT_OPT.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
There was some cases when some patches to the datetime code leads to
flaky tests, due to the tests itself had been runned using regular
timezone (TZ).
But if you will this tests with something "specific" (that is not
strictly defined around 1970 year), those tests will fail.
So to catch such issues in the PRs itself, let's randomize
session_timezone as well.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>