449 Commits

Author SHA1 Message Date
kssenii
22b515fbc9 Add namespace, simplify names 2021-03-27 20:14:02 +00:00
Azat Khuzhin
82c79fe4ce Fix query cancellation with use_hedged_requests=0 and async_socket_for_remote=1
In #21643 async_socket_for_remote=1 was fixed to avoid leaving the
connection in the unsynchronised state.

But one should not try to wait for the current packet in case of timeout
because this will exceed the timeout.

Anyway, if the timeout is exceeded, then the connection will be shut down
(disconnected), so it will not be left in an unsynchronised state.
2021-03-26 21:24:42 +03:00
Anton Popov
6a15431be7 Merge remote-tracking branch 'upstream/master' into HEAD 2021-03-25 15:57:35 +03:00
Alexander Kuzmenkov
2f5dbf57b6
Merge pull request #21839 from kssenii/add-postgres-connection-pool
Add connection pool for postgres engine
2021-03-22 19:49:51 +03:00
kssenii
9057aad798 Better version 2021-03-19 08:11:36 +00:00
kssenii
f1ef87d966 Fix 2021-03-18 20:04:54 +00:00
Nikolai Kochetov
c3c393a7aa Merge branch 'master' into refactor-actions-dag 2021-03-18 14:33:07 +03:00
kssenii
3903d59d30 Better 2021-03-17 14:34:04 +00:00
kssenii
ae64a24844 Add connection pool 2021-03-17 13:55:47 +00:00
alexey-milovidov
d02726bcac
Merge pull request #9404 from Enmk/DateTime64_extended_range
Date time64 extended range
2021-03-17 11:06:03 +03:00
Alexey Milovidov
3f67f4f47b Saturation for DateTime 2021-03-15 23:40:33 +03:00
Alexey Milovidov
671395e8c8 Most likely improve performance 2021-03-15 22:23:27 +03:00
Anton Popov
81ac6382a3 slightly better performance 2021-03-13 21:05:18 +03:00
Azat Khuzhin
65f90f2ce9 Fix distributed requests cancellation with async_socket_for_remote=1
Before this patch, for distributed queries that require cancellation
(a simple select from multiple shards with a limit, e.g. `select * from
remote('127.{2,3}', system.numbers) limit 100`), it is very easy to
trigger the situation where a remote shard is in the middle of sending a
Data block while the initiator has already sent Cancel and is expecting
a new packet. Instead of a new packet it receives the rest of the Data
block that was being sent before cancellation, and this leads to
various errors, like:
- Unknown packet X from server Y
- Unexpected packet from server Y
- and a lot more...

Fix this, by correctly waiting for the pending packet before
cancellation.

It is not very easy to write a test, since localhost is too fast.

Also note that it is not possible to get these errors with hedged
requests (use_hedged_requests=1), since they handle fibers correctly.

But hedged requests were disabled by default for 21.3 in #21534, while
async_socket_for_remote is enabled by default.
2021-03-11 21:55:21 +03:00
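The desynchronisation described in the commit above can be reproduced with a toy framed protocol. This is only a minimal sketch, not the ClickHouse native protocol: the packet layout (1-byte type, 4-byte length), the `DATA` and `END` constants, and the helper names are all hypothetical. It shows that abandoning a half-read packet makes the next read start in the middle of a payload, while draining the pending packet first keeps the reader on a packet boundary.

```python
import io
import struct

DATA, END = 1, 2  # hypothetical packet types for this toy protocol


def encode(ptype, payload):
    """Frame a packet as: 1-byte type, 4-byte big-endian length, payload."""
    return struct.pack(">BI", ptype, len(payload)) + payload


def read_packet(stream):
    """Read one whole framed packet and return (type, payload)."""
    ptype, size = struct.unpack(">BI", stream.read(5))
    return ptype, stream.read(size)


def next_packet_after_cancel(drain):
    """Cancel while a Data packet is half read; return the next packet type."""
    stream = io.BytesIO(encode(DATA, b"\x07" * 8) + encode(END, b""))
    _, size = struct.unpack(">BI", stream.read(5))
    stream.read(3)  # only part of the Data payload has been consumed so far
    if drain:
        stream.read(size - 3)  # finish the pending packet before moving on
    return read_packet(stream)[0]
```

With `drain=True` the next packet is `END`; with `drain=False` the leftover payload bytes are parsed as a header and an unknown packet type comes back, which is the toy analogue of "Unknown packet X from server Y".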
Anton Popov
bc417cf54a refactoring of serializations 2021-03-09 17:46:52 +03:00
Nikolai Kochetov
a669f7d641 Merge branch 'master' into refactor-actions-dag 2021-03-05 18:21:14 +03:00
Nikolai Kochetov
9a39459888 Refactor ActionsDAG 2021-03-04 20:38:12 +03:00
Alexey Milovidov
4e8239e098 Merge branch 'master' into DateTime64_extended_range 2021-03-03 23:43:20 +03:00
Nikolai Kochetov
4775ea305e Remove index by name from ActionsDAG 2021-03-02 20:51:54 +03:00
Pavel Kruglov
153bfbfc28 Merge branch 'master' of github.com:ClickHouse/ClickHouse into hedged-requests 2021-03-02 11:59:32 +03:00
Maksim Kita
315824978d CheckConstraintsBlockOutputStream optimize nullable column case 2021-02-27 19:19:21 +03:00
Maksim Kita
da321c2bfe Fixed check for null value in null map 2021-02-25 16:08:04 +03:00
Maksim Kita
23af53067d Updated support for Nullable column 2021-02-25 14:27:46 +03:00
Maksim Kita
2eec1d021b Fixed unused code 2021-02-25 14:27:46 +03:00
Maksim Kita
8fec34af12 Constraints complex types support 2021-02-25 14:27:46 +03:00
Vasily Nemkov
2d03d330bc Extended range of DateTime64 to years 1925 - 2238
The Year 1925 is a starting point because most of the timezones
switched to saner (mostly 15-minutes based) offsets somewhere
during 1924 or before. And that significantly simplifies implementation.

2238 is to simplify arithmetic for sanitizing LUT index access;
there are fewer than 0x1ffff days from 1925.

* Extended DateLUTImpl internal LUT to 0x1ffff items, some of which
  represent negative (pre-1970) time values.
  As a collateral benefit, Date now correctly supports dates up to 2149
  (instead of 2106).
* Added a new strong typedef ExtendedDayNum, which represents dates
  pre-1970 and post 2149.
* Functions that used to return DayNum now return ExtendedDayNum.
* Refactored DateLUTImpl to untie DayNum from the dual role of being
  a value and an index (due to negative time). Index is now a different
  type LUTIndex with explicit conversion functions from DayNum, time_t,
  and ExtendedDayNum.
* Updated DateLUTImpl to properly support values close to epoch start
  (1970-01-01 00:00), including negative ones.
* Reduced resolution of DateLUTImpl::Values::time_at_offset_change
  to a multiple of 15 minutes to allow storing 64 bits of time_t in
  DateLUTImpl::Value while keeping the same size.
* Minor performance updates to DateLUTImpl when building month LUT
  by skipping non-start-of-month days.
* Fixed extractTimeZoneFromFunctionArguments to work correctly
  with DateTime64.
* New unit-tests and stateless integration tests for both DateTime
  and DateTime64.
2021-02-24 17:08:35 +02:00
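The range arithmetic in the commit above can be checked with the standard library. This is a rough sketch under stated assumptions: reading "2149" as the limit of a 16-bit day counter starting at the Unix epoch, and "2106" as the limit of an unsigned 32-bit second counter, is an interpretation rather than something the commit message states, and the constant names are hypothetical.

```python
from datetime import date, datetime, timedelta

LUT_SIZE = 0x1FFFF  # number of day slots quoted in the commit message

# Days in the extended range, counted from 1925-01-01 to the end of 2238.
days_1925_to_2238 = (date(2238, 12, 31) - date(1925, 1, 1)).days

# Assumption: a 16-bit day number counted from 1970-01-01.
last_uint16_day = date(1970, 1, 1) + timedelta(days=0xFFFF)

# Assumption: an unsigned 32-bit count of seconds from 1970-01-01.
last_uint32_second = datetime(1970, 1, 1) + timedelta(seconds=2**32 - 1)

# A day holds 96 quarter hours, so a time of day at 15-minute
# resolution needs only 7 bits.
QUARTERS_PER_DAY = 24 * 60 // 15

print(days_1925_to_2238 < LUT_SIZE)  # the whole range fits in the LUT
print(last_uint16_day.year, last_uint32_second.year)
```

The actual field layout of `DateLUTImpl::Values` is not shown in the commit message, so the last constant only illustrates why a coarser resolution saves space.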
Nikolai Kochetov
ed4697cffc Fix timeout in epoll_wait for RemoteQueryExecutorReadContext 2021-02-19 12:20:24 +03:00
Nikolai Kochetov
b9d6df9618 Check for eintr in epoll_wait 2021-02-19 11:49:41 +03:00
Kruglov Pavel
598576ce70
Merge branch 'master' into hedged-requests 2021-02-15 16:51:45 +03:00
alexey-milovidov
dc3ffd3fe2
Merge pull request #19451 from azat/safe-writes
Do not silently ignore write errors
2021-02-11 21:19:11 +03:00
alesapin
c2bb2c2902
Merge pull request #20097 from ClickHouse/remove-adding-missed-step
Build actions dag to evaluate missing defaults.
2021-02-11 10:51:21 +03:00
Nikolai Kochetov
af214e794f Review fixes. 2021-02-10 15:45:39 +03:00
Nikolai Kochetov
27d607a955 Respect header in addMissingDefaults 2021-02-09 22:48:34 +03:00
Pavel Kruglov
9048dc43d4 Fix style and build 2021-02-06 22:13:50 +03:00
Pavel Kruglov
8ff3dde290 Add sendIgnoredPartUUIDs to HedgedRequests 2021-02-06 18:26:36 +03:00
Pavel Kruglov
f946aab759 Merge branch 'master' of github.com:ClickHouse/ClickHouse into hedged-requests 2021-02-06 17:38:56 +03:00
Pavel Kruglov
0704d3cf27 Refactor 2021-02-06 03:54:27 +03:00
alesapin
011109c82a
Merge pull request #17348 from xjewer/alex/CLICKHOUSE-606_deduplication_UUID
CLICKHOUSE-606: query deduplication based on parts' UUID
2021-02-05 22:47:34 +03:00
Nikolai Kochetov
85c175883e Rename functions. 2021-02-05 18:11:26 +03:00
Nikolai Kochetov
9869f70a0d Remove AddMissed step and transform. 2021-02-05 14:41:44 +03:00
alesapin
7cbc135e72 More isolated code 2021-02-05 12:54:34 +03:00
Azat Khuzhin
98e3a99a88 Do not catch exceptions during final flush in writers destructors
This hides real problems: the destructor does the final flush, and if
it fails, data will be lost.

One such example is the MEMORY_LIMIT_EXCEEDED exception, so lock
exceptions from destructors by using
MemoryTracker::LockExceptionInThread to block this exception, and allow
others (so std::terminate will be called, since this is C++11 with
noexcept for destructors by default).

Here is an example that leads to an empty block in the distributed batch:

    2021.01.21 12:43:18.619739 [ 46468 ] {7bd60d75-ebcb-45d2-874d-260df9a4ddac} <Error> virtual DB::CompressedWriteBuffer::~CompressedWriteBuffer(): Code: 241, e.displayText() = DB::Exception: Memory limit (for user) exceeded: would use 332.07 GiB (attempt to allocate chunk of 4355342 bytes), maximum: 256.00 GiB, Stack trace (when copying this message, always include the lines below):

    0. DB::Exception::Exception<>() @ 0x86f7b88 in /usr/bin/clickhouse
    ...
    4. void DB::PODArrayBase<>::resize<>(unsigned long) @ 0xe9e878d in /usr/bin/clickhouse
    5. DB::CompressedWriteBuffer::nextImpl() @ 0xe9f0296 in /usr/bin/clickhouse
    6. DB::CompressedWriteBuffer::~CompressedWriteBuffer() @ 0xe9f0415 in /usr/bin/clickhouse
    7. DB::DistributedBlockOutputStream::writeToShard() @ 0xf6bed4a in /usr/bin/clickhouse
2021-02-05 01:31:45 +03:00
Nikolai Kochetov
d9aa1096cf Build actions dag to evaluate missing defaults. 2021-02-04 23:36:50 +03:00
Azat Khuzhin
c59f22d7b4 Change PushingToViewsBlockOutputStream::process, to accept ref to ViewInfo
It is safe, no need to use quirks with passing view_num.
2021-02-04 00:44:24 +03:00
Azat Khuzhin
50d8a87c27 Fix logging of elapsed time while pushing to views
Logging from PushingToViewsBlockOutputStream::write() is incorrect,
since it does not take squashing into account (defaults are 1<<20 rows
and 256<<20 bytes), while the write will be done from
PushingToViewsBlockOutputStream::writeSuffix().

Fixes: #19378
2021-02-04 00:44:24 +03:00
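Why timing `write()` is misleading under squashing can be seen in a toy accumulator. This is a minimal sketch, not the ClickHouse implementation: the `Squasher` class and its methods are hypothetical, and flushing when either threshold is reached is an assumption. Only the two default values come from the commit message.

```python
MIN_ROWS = 1 << 20     # 1,048,576 rows
MIN_BYTES = 256 << 20  # 268,435,456 bytes (256 MiB)


class Squasher:
    """Toy accumulator that buffers small blocks until a threshold is hit."""

    def __init__(self, min_rows=MIN_ROWS, min_bytes=MIN_BYTES):
        self.min_rows, self.min_bytes = min_rows, min_bytes
        self.rows = self.nbytes = 0

    def write(self, rows, nbytes):
        """Buffer a block; return flushed row count, or None if still buffering."""
        self.rows += rows
        self.nbytes += nbytes
        if self.rows >= self.min_rows or self.nbytes >= self.min_bytes:
            return self.flush()
        return None

    def flush(self):
        """Emit whatever is buffered (the analogue of writeSuffix())."""
        flushed, self.rows, self.nbytes = self.rows, 0, 0
        return flushed
```

A single small block, for example `Squasher().write(65536, 1 << 20)`, returns `None` because nothing is pushed yet. Time measured around that call is close to zero, and the real work happens in the final `flush()`.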
Aleksei Semiglazov
d05c6446b9
Send cancel packet and cancel read_context before retrying the query 2021-02-03 00:07:00 +00:00
Aleksei Semiglazov
921518db0a CLICKHOUSE-606: query deduplication based on parts' UUID
* add the query data deduplication excluding duplicated parts in MergeTree family engines.

query deduplication is based on parts' UUID which should be enabled first with merge_tree setting
assign_part_uuids=1

The allow_experimental_query_deduplication setting enables part deduplication; it defaults to false.

A data part UUID is a mechanism for giving a data part a unique identifier.
Having UUIDs and a deduplication mechanism makes it possible to move parts
between shards while preserving data consistency on the read path:
duplicated UUIDs will cause the root executor to retry the query against one of the replicas, explicitly
asking it to exclude the duplicated fingerprints encountered during distributed query execution.

NOTE: this implementation doesn't provide any knobs to lock a part and hence its UUID. Any mutation or merge will
update the part's UUID.

* add _part_uuid virtual column, allowing to use UUIDs in predicates.

Signed-off-by: Aleksei Semiglazov <asemiglazov@cloudflare.com>

address comments
2021-02-02 16:53:39 +00:00
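The retry-with-exclusion idea from the commit above can be sketched on plain dictionaries. This is an illustration only: the function names are hypothetical, and the policy of letting the first shard that reported a UUID keep it is an assumption, since the commit message says only that duplicated UUIDs are excluded on the retry.

```python
from collections import Counter


def duplicated_uuids(parts_by_shard):
    """Return the part UUIDs reported by more than one shard."""
    counts = Counter(u for parts in parts_by_shard.values() for u in set(parts))
    return {u for u, n in counts.items() if n > 1}


def plan_reads(parts_by_shard):
    """Decide which parts each shard reads once duplicates are excluded.

    Assumption: the first shard that reported a duplicated UUID keeps it,
    and every later shard is asked to skip it.
    """
    dups = duplicated_uuids(parts_by_shard)
    owner, plan = {}, {}
    for shard, parts in parts_by_shard.items():
        kept = []
        for u in parts:
            if u in dups and owner.setdefault(u, shard) != shard:
                continue  # excluded on the retry
            kept.append(u)
        plan[shard] = kept
    return plan


parts = {"shard1": ["a", "b"], "shard2": ["b", "c"]}
print(plan_reads(parts))
```

Each part is then read exactly once across shards, which is the consistency property the commit is after when a part exists on two shards during a move.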
Pavel Kruglov
60a92e9a99 Fix build, add comments, update tests 2021-02-02 15:14:31 +03:00
Pavel Kruglov
5b16a54233 Fix synchronization 2021-02-01 20:23:46 +03:00
Pavel Kruglov
25e85d71ee Merge branch 'master' of github.com:ClickHouse/ClickHouse into hedged-requests 2021-01-29 21:08:47 +03:00
Pavel Kruglov
01a0cb649a Fix build, style, tests 2021-01-29 18:46:28 +03:00
Anton Popov
e5125b8c73 add comments and test for compatibility 2021-01-28 02:39:15 +03:00
Anton Popov
a8f3078ce9 Merge remote-tracking branch 'upstream/master' into HEAD 2021-01-27 19:48:55 +03:00
Pavel Kruglov
b3b832cde7 Work with any number of replicas simultaneously, support max_parallel_replicas 2021-01-27 12:33:11 +03:00
Azat Khuzhin
2c42600cf9 Add log message with elapsed time while pushing to view
Right now these queries are not logged and there is no way to
determine the time they take.

A proper fix would be to account for them in system.processes (and all other
places), but this is not that easy.
2021-01-21 21:09:23 +03:00
Pavel Kruglov
97b5179e55 Implement HedgedRequests 2021-01-19 22:41:05 +03:00
alexey-milovidov
15f4ae26c2
Merge pull request #17310 from CurtizJ/multiple-nested
Allow nested with multiple nesting and subcolumns of complex types
2021-01-17 15:00:26 +03:00
alexey-milovidov
7de745ce77
Merge pull request #18554 from kssenii/pg2ch
Add PostgreSQL table function, dictionary source, database engine
2021-01-16 23:55:05 +03:00
Alexey Milovidov
24c8e53440 Merge branch 'master' into multiple-nested 2021-01-16 16:28:40 +03:00
Azat Khuzhin
ee45c122ea Fix leaking of pipe fd for async_socket_for_remote 2021-01-16 01:57:36 +03:00
Azat Khuzhin
cf085b0687 Split RemoteQueryExecutorReadContext into module part 2021-01-16 01:57:36 +03:00
Alexey Milovidov
a19e7edd14 Merge branch 'master' into kssenii-pg2ch 2021-01-15 17:33:19 +03:00
alexey-milovidov
78fff6bc39
Merge branch 'master' into multiple-nested 2021-01-15 14:54:27 +03:00
Anton Popov
0e903552a0 fix TTLs with WHERE 2021-01-13 17:04:27 +03:00
Anton Popov
15ead18673 Merge remote-tracking branch 'upstream/master' into HEAD 2021-01-12 19:46:10 +03:00
Anton Popov
60b88986bf minor changes near TTL computation 2021-01-12 19:42:49 +03:00
Anton Popov
5822ee1f01 allow multiple rows TTL with WHERE expression 2021-01-12 02:07:21 +03:00
alesapin
c5df8f324c More checks in writer wide 2021-01-11 15:03:00 +03:00
kssenii
38a9cba850 Fix 2021-01-11 10:55:38 +00:00
Anton Popov
36ae0e4d35 Merge remote-tracking branch 'upstream/master' into HEAD 2021-01-11 13:51:12 +03:00
kssenii
6ec59f1304 Update libpq, tiny fix 2021-01-10 15:38:46 +00:00
kssenii
d952b0897e Minor adjustments 2021-01-10 12:06:18 +00:00
Alexey Milovidov
8af19c3251 Fix Arcadia 2021-01-07 15:29:02 +03:00
Alexey Milovidov
60d4db421c Fix Arcadia 2021-01-07 06:45:12 +03:00
alexey-milovidov
72b142a00a
Merge branch 'master' into pg2ch 2021-01-06 23:18:59 +03:00
Nikolai Kochetov
a20c4cce76
Merge pull request #18715 from ClickHouse/try-fix-tsan-forremote-query-executor
Use relaxed for flag in RemoteQueryExecutorReadContext.
2021-01-06 18:04:30 +03:00
alexey-milovidov
70d899340e
Update PushingToViewsBlockOutputStream.h 2021-01-05 20:10:12 +03:00
Alexey Milovidov
dab4719aac Remove some headers 2021-01-05 06:22:06 +03:00
Nikolai Kochetov
e48ea5c5b3 Use relaxed for flag in RemoteQueryExecutorReadContext. 2021-01-04 10:59:01 +03:00
Alexey Milovidov
4b3ae495d6 Merge branch 'master' into CurtizJ-multiple-nested 2021-01-02 00:25:16 +03:00
kssenii
8efd85bef2 Fix build 2020-12-29 16:56:50 +00:00
Anton Popov
a8f1786d95 fix TTL with GROUP BY 2020-12-29 18:19:11 +03:00
kssenii
9b25890674 Generate ya.make, fix fast test 2020-12-28 12:54:52 +00:00
Nikita Mikhailov
8667842390 ya make update 2020-12-28 15:54:46 +03:00
Nikita Mikhailov
964e12d8b7 Fix 2020-12-28 15:53:58 +03:00
kssenii
8f8920a7ee Add table cache, better drop table 2020-12-27 15:52:15 +00:00
kssenii
375e8e9736 Add postgres dictionary source 2020-12-27 12:18:09 +00:00
kssenii
aa3484515d Better 2020-12-27 12:17:23 +00:00
kssenii
69f6714461 Add table function 2020-12-27 12:16:46 +00:00
kssenii
42023f4b95 Support insert into table 2020-12-27 12:16:27 +00:00
Anton Popov
11283e3d81 Merge remote-tracking branch 'upstream/master' into HEAD 2020-12-25 21:25:59 +03:00
Anton Popov
b60c00ba74 refactoring of TTL stream 2020-12-25 18:46:13 +03:00
Anton Popov
42725b27c0 Merge remote-tracking branch 'upstream/master' into HEAD 2020-12-24 15:18:15 +03:00
Nikolai Kochetov
97ffc40321
Merge pull request #18409 from ClickHouse/try-to-fix-ya-check
Try fix ya.make
2020-12-23 20:02:53 +03:00
Nikita Mikhaylov
c005dcdd26
Merge pull request #17641 from nikitamikhaylov/parallel_row_numbers
Added an offset to exception message for parallel parsing
2020-12-23 17:24:35 +03:00
Nikolai Kochetov
7a9282ee7b Try fix ya.make 2020-12-23 15:31:16 +03:00
Anton Popov
57857dda63 Merge remote-tracking branch 'upstream/master' into HEAD 2020-12-23 15:16:43 +03:00
Nikolai Kochetov
af7f5c9518
Merge pull request #17868 from ClickHouse/async-read-from-socket
Async read from socket
2020-12-23 12:20:42 +03:00
Nikita Mikhaylov
e1fc9122cc
Update ParallelParsingBlockInputStream.cpp 2020-12-23 05:28:53 +03:00
Nikita Mikhaylov
54d2fe847f
Update src/DataStreams/ParallelParsingBlockInputStream.cpp
Co-authored-by: Alexander Kuzmenkov <36882414+akuzm@users.noreply.github.com>
2020-12-23 05:02:54 +03:00