514 Commits

Author SHA1 Message Date
Anton Ivashkin
265d293934 Use 'merge on single replica' option instead of zookeeper lock 2021-03-09 20:39:55 +03:00
alesapin
d80c2cef06 Slightly better 2021-03-09 11:45:41 +03:00
Amos Bird
2ec20c5d23 update and add tests 2021-03-08 17:38:07 +08:00
Amos Bird
e6522e1ebe JBOD data balancer 2021-03-08 11:10:35 +08:00
alesapin
5b3161e0b5 Get rid of const_cast 2021-03-05 20:24:06 +03:00
Anton Ivashkin
d08b481660 Fixes by review responses 2021-03-05 19:20:38 +03:00
Nikolai Kochetov
a669f7d641 Merge branch 'master' into refactor-actions-dag 2021-03-05 18:21:14 +03:00
Nikolai Kochetov
9a39459888 Refactor ActionsDAG 2021-03-04 20:38:12 +03:00
Amos Bird
93b661ad5a partition id pruning 2021-03-04 19:43:03 +08:00
Anton Ivashkin
e69124a0a6 Merge master 2021-03-04 13:26:40 +03:00
Alexey Milovidov
4e8239e098 Merge branch 'master' into DateTime64_extended_range 2021-03-03 23:43:20 +03:00
Mike Kot
6ea574525c Small fixes regarding the review 2021-03-03 16:51:41 +03:00
alesapin
06678d650d Merge branch 'master' into fix_alter_partition_key 2021-03-02 13:43:41 +03:00
Mike Kot
4e1ac185b5 Merge remote-tracking branch 'upstream/master' into feature/attach-partition-local 2021-03-01 20:58:07 +03:00
Anton Ivashkin
98065ec56e Merge branch 'master' of https://github.com/ClickHouse/ClickHouse into s3_zero_copy_replication 2021-03-01 13:33:40 +03:00
Anton Ivashkin
3c11d44494 Add description for getUniqueId method, fix typos 2021-03-01 13:31:36 +03:00
alesapin
9c8afbeb53 Fix alter modify query for partition key and other metadata fields 2021-03-01 12:59:19 +03:00
Nikolai Kochetov
976dbe8077 Merge pull request #20341 from ClickHouse/filter-push-down
Filter push down
2021-03-01 12:35:06 +03:00
alexey-milovidov
b8fba768e5 Merge pull request #21264 from ClickHouse/fix_zookeeper_update
Fix several bugs with ZooKeeper client
2021-02-28 01:57:04 +03:00
alesapin
9e93d7f507 Fix tidy and add comments 2021-02-27 11:07:14 +03:00
Nikolai Kochetov
d91b8a3acb Merge branch 'master' into filter-push-down 2021-02-26 19:33:27 +03:00
Nikolai Kochetov
d328bfa41f Review fixes. Add setting max_optimizations_to_apply. 2021-02-26 19:29:56 +03:00
Anton Ivashkin
5b267b7eec Merge branch 'master' of https://github.com/ClickHouse/ClickHouse into s3_zero_copy_replication 2021-02-26 13:15:18 +03:00
Anton Ivashkin
c891cf4557 Fixes by review response 2021-02-26 12:48:57 +03:00
Mike Kot
2b3b335eda Exit fixes 2021-02-25 21:41:09 +03:00
Mike Kot
b2c898f58c Adding the part when found 2021-02-25 21:25:55 +03:00
fastio
fea2836673 get zookeeper from global context 2021-02-25 14:19:58 +08:00
fastio
3ddb729e4a fix metadata leak when drop Replicated*MergeTree 2021-02-25 14:19:58 +08:00
Vasily Nemkov
4fcc23ec9a Fixed build for GCC-10 2021-02-24 17:08:43 +02:00
Vasily Nemkov
2d03d330bc Extended range of DateTime64 to years 1925 - 2238
The year 1925 is the starting point because most time zones
switched to saner (mostly 15-minute-based) offsets sometime
during 1924 or before, which significantly simplifies the implementation.

2238 simplifies the arithmetic for sanitizing LUT index access:
there are fewer than 0x1ffff days from 1925 to 2238.

* Extended DateLUTImpl internal LUT to 0x1ffff items, some of which
  represent negative (pre-1970) time values.
  As a collateral benefit, Date now correctly supports dates up to 2149
  (instead of 2106).
* Added a new strong typedef ExtendedDayNum, which represents dates
  pre-1970 and post 2149.
* Functions that used to return DayNum now return ExtendedDayNum.
* Refactored DateLUTImpl to untie DayNum from the dual role of being
  a value and an index (due to negative time). Index is now a different
  type LUTIndex with explicit conversion functions from DayNum, time_t,
  and ExtendedDayNum.
* Updated DateLUTImpl to properly support values close to epoch start
  (1970-01-01 00:00), including negative ones.
* Reduced resolution of DateLUTImpl::Values::time_at_offset_change
  to a multiple of 15 minutes to allow storing 64 bits of time_t in
  DateLUTImpl::Value while keeping the same size.
* Minor performance updates to DateLUTImpl when building month LUT
  by skipping non-start-of-month days.
* Fixed extractTimeZoneFromFunctionArguments to work correctly
  with DateTime64.
* New unit-tests and stateless integration tests for both DateTime
  and DateTime64.
2021-02-24 17:08:35 +02:00
Alexander Tokmakov
2a36d6cb55 review suggestions 2021-02-20 02:41:58 +03:00
Mike Kot
eb1826a5e3 Trying to figure out why iteration doesn't work 2021-02-19 17:28:29 +03:00
Mike Kot
f1ef382cf9 Added part_checksum to Replicated...Entry serialization. 2021-02-19 16:04:12 +03:00
Alexander Tokmakov
1aac7b3471 Merge branch 'master' into database_replicated 2021-02-17 00:39:56 +03:00
Mike Kot
de88a7ca94 Fixed the UB with maybe-nullptr member access 2021-02-16 18:36:30 +03:00
Mike Kot
ca83775711 Multiple small hotfixes
Small fixes

Some fix for old bug

Another old code fix
2021-02-16 16:39:18 +03:00
Mike Kot
fefc7234df Replaced the part lookup algo to "by hash only", comments on test stub 2021-02-16 16:00:26 +03:00
Alexander Tokmakov
bf6f64a3fb Merge branch 'master' into database_replicated 2021-02-16 01:28:19 +03:00
Mike Kot
8182482cbd Add test stub 2021-02-15 21:06:20 +03:00
Mike Kot
0a02fe913a Add some code for the checksum pre-calculation in ...BlockOutputStream
Added the comment explaining the double-get for the zookeeper header for
the part.
2021-02-15 20:31:58 +03:00
Mike Kot
73f4740be5 Trying to re-write the solution by adding new command type
ATTACH_PART into the replicated log.

The LogEntry now also has the pre-calculated part checksum for this
entry type, which is later used while searching in the detached/ folder
2021-02-15 18:06:48 +03:00
Mike Kot
cd32803709 Merge remote-tracking branch 'upstream/master' into feature/attach-partition-local 2021-02-15 15:42:17 +03:00
tavplubix
3f86ce4c67 Update StorageReplicatedMergeTree.cpp 2021-02-15 15:04:30 +03:00
Mike Kot
feff4c6a22 Started adding the new "ATTACH_PART" command into the replicated log
The original idea in the ticket was to search the detached/ folders for
possibly available data when handling the GET_PART command, but
@tavplubix pointed out that this would be quite expensive to do on every
fetch.

So a new command, ATTACH_PART, is going to be introduced. It will
cover ALTER TABLE ATTACH PART, and the search will run only for it.
2021-02-15 01:59:13 +03:00
alesapin
3253638969 Fix backoff for failed background tasks in replicated merge tree 2021-02-11 14:46:18 +03:00
Nicolae Vartolomei
b153e8c190 Add support for custom fetchPart timeouts 2021-02-08 19:44:02 +00:00
Alexander Tokmakov
78c1d69b8c better code 2021-02-08 22:36:17 +03:00
tavplubix
ac477d9850 Merge pull request #19771 from ClickHouse/thread_state_improvements
Minor code improvements around ThreadStatus
2021-02-08 22:34:55 +03:00
Anton Ivashkin
4d44d75bc7 Fix build after merge one more time 2021-02-08 14:45:10 +03:00
Anton Ivashkin
e64c63c611 Merge master 2021-02-05 20:10:06 +03:00
Anton Ivashkin
df6c882aab Fix build after merge 2021-02-05 18:52:40 +03:00
Alexander Tokmakov
87502d0220 Merge branch 'thread_state_improvements' into database_replicated 2021-02-03 20:19:35 +03:00
Alexander Tokmakov
d010f97db0 Merge branch 'master' into database_replicated 2021-02-03 20:13:25 +03:00
Maksim Kita
d0151de4bb Merge pull request #19608 from kreuzerkrieg/Add_IStoragePolicy_interface
Add IStoragePolicy interface
2021-02-02 11:03:20 +03:00
Kruglov Pavel
caef103837 Merge branch 'master' into Add_IStoragePolicy_interface 2021-01-29 14:00:12 +03:00
alesapin
2881c830e3 Merge branch 'master' into fix_rare_bug_after_part_corruption 2021-01-28 23:16:52 +03:00
tavplubix
10460313ad Update StorageReplicatedMergeTree.cpp 2021-01-28 17:48:09 +03:00
Alexander Tokmakov
ffaa8e34a6 minor code improvements around ThreadStatus 2021-01-28 16:57:36 +03:00
Alexander Tokmakov
52e5c0aad7 fix thread status 2021-01-28 16:48:17 +03:00
alesapin
5622e6daa6 Fix rare max_number_of_merges_with_ttl_in_pool limit overrun for non-replicated MergeTree 2021-01-27 14:56:12 +03:00
alesapin
01c8b9e1b1 Fix rare bug when some replicated operations (like mutation) cannot process some parts after data corruption 2021-01-27 13:07:18 +03:00
kreuzerkrieg
29a2ef3089 Add IStoragePolicy interface 2021-01-26 10:55:28 +02:00
Alexander Tokmakov
3bd4d97353 Merge branch 'master' into database_replicated 2021-01-25 14:19:04 +03:00
Anton Ivashkin
357d98eb36 Merge master 2021-01-20 12:23:03 +03:00
Anton Ivashkin
eba98b04b0 Zero copy replication over S3: Hybrid storage support 2021-01-18 19:16:45 +03:00
Alexander Tokmakov
7f97a11c84 Merge branch 'master' into database_replicated 2021-01-18 17:09:39 +03:00
alesapin
4ee96869a2 Don't wait forever for log update after table was dropped 2021-01-18 15:15:07 +03:00
alesapin
bfc27254b2 Avoid redundant exception while dropping part 2021-01-14 11:07:13 +03:00
Mike Kot
6b949109f1 marked places to edit 2021-01-12 21:46:03 +03:00
alesapin
dead1016d6 Fix deduplication block names parsing 2021-01-12 13:55:02 +03:00
fastio
a1d0c04e68 fix build 2021-01-08 13:10:00 +08:00
fastio
ea047b951f Expand macros for fetchPartition 2021-01-07 22:13:17 +08:00
alexey-milovidov
a08db94343 Revert "Add metrics for part number in MergeTree in ClickHouse" 2021-01-07 16:40:52 +03:00
alexey-milovidov
f91626e7ff Merge pull request #17838 from weeds085490/dev/add_metrics_for_parts
Add metrics for part number in MergeTree in ClickHouse
2021-01-07 15:27:04 +03:00
weeds085490
5f5b86b485 Merge remote-tracking branch 'origin' into dev/add_metrics_for_parts 2021-01-06 17:32:45 +08:00
sundy-li
6cc0668af4 Add one more argument 2021-01-04 16:21:04 +08:00
sundy-li
8d7fe410cd Fix Logger with unmatched arg size 2021-01-04 16:15:13 +08:00
alexey-milovidov
1bcdf37c36 Merge pull request #18614 from CurtizJ/fix-empty-parts
Fix removing of empty parts in tables with old syntax
2020-12-30 17:20:31 +03:00
Anton Popov
6336fbf1df fix removing of empty parts in tables with old syntax 2020-12-29 20:16:57 +03:00
alesapin
a50615a22b More correct error code 2020-12-25 16:38:04 +03:00
roverxu
b339b9dfd0 fix consistency 2020-12-22 17:38:15 +08:00
Alexey Milovidov
9be5fa9ef2 Merge branch 'master' into Enmk-Optimize_deduplicate 2020-12-20 09:57:10 +03:00
alesapin
f45993cd1e More logs during quorum insert 2020-12-17 19:13:01 +03:00
Anton Ivashkin
0f0500ca0c Merge master 2020-12-16 18:31:13 +03:00
Nikolai Kochetov
6defcbb662 Merge branch 'master' into optimize-data-on-insert 2020-12-15 16:50:42 +03:00
Azat Khuzhin
5b3ab48861 More forward declarations for generic headers
The following headers are pretty generic, so use forward declaration as
much as possible:
- Context.h
- Settings.h
- ConnectionTimeouts.h
(This also revealed some missing includes, which have been fixed.)

Also split ConnectionTimeouts.h into ConnectionTimeoutsContext.h (since
a module part cannot be added for it, due to the recursive build dependencies
that this would introduce).

Also remove Settings from the RemoteBlockInputStream/RemoteQueryExecutor
and just pass the context, since settings were passed only in specific
places that can allow making a copy of Context (i.e. Copier).

Approx results (How many units will be recompiled after changing file X?):

- ConnectionTimeouts.h
  - mainline: 100

- Context.h:
  - mainline: ~800
  - patched:  415

- Settings.h:
  - mainline: 900-1K
  - patched:  440 (most of them because of the Context.h)
2020-12-12 17:43:10 +03:00
Kruglov Pavel
e19eb6f17a Merge branch 'master' into optimize-data-on-insert 2020-12-08 15:57:46 +03:00
Vasily Nemkov
70ea507dae OPTIMIZE DEDUPLICATE BY columns
Extended the OPTIMIZE ... DEDUPLICATE syntax to accept an explicit (or implicit, via asterisk/column transformers) list of columns to check for duplicates.

The following syntax variants are now supported:

OPTIMIZE TABLE table DEDUPLICATE; -- the old one
OPTIMIZE TABLE table DEDUPLICATE BY *;
OPTIMIZE TABLE table DEDUPLICATE BY * EXCEPT colX;
OPTIMIZE TABLE table DEDUPLICATE BY * EXCEPT (colX, colY);
OPTIMIZE TABLE table DEDUPLICATE BY col1,col2,col3;
OPTIMIZE TABLE table DEDUPLICATE BY COLUMNS('column-matched-by-regex');
OPTIMIZE TABLE table DEDUPLICATE BY COLUMNS('column-matched-by-regex') EXCEPT colX;
OPTIMIZE TABLE table DEDUPLICATE BY COLUMNS('column-matched-by-regex') EXCEPT (colX, colY);

Note that * behaves just like in SELECT: MATERIALIZED and ALIAS columns are not used for expansion.
Also, it is an error to specify an empty list of columns, to write an expression that results in an empty list of columns, or to deduplicate by an ALIAS column.
Column transformers other than EXCEPT are not supported.
2020-12-07 09:44:07 +03:00
alesapin
27c3301083 Merge pull request #17800 from nvartolomei/nv/waitForAllReplicasToProcessLogEntry-foreign-shard
Update StorageReplicatedMergeTree::waitForAllReplicasToProcessLogEntry to support waiting on foreign shards / tables
2020-12-05 16:15:46 +03:00
Pavel Kruglov
5ae6c6dab9 Fix build error 2020-12-04 20:40:28 +03:00
Pavel Kruglov
905ba78adc Merge branch 'master' of github.com:ClickHouse/ClickHouse into optimize-data-on-insert 2020-12-04 18:56:46 +03:00
Pavel Kruglov
9dbced0474 Pass setting instead of context 2020-12-04 17:01:59 +03:00
Nicolae Vartolomei
796aee032d Update StorageReplicatedMergeTree::waitForAllReplicasToProcessLogEntry to support waiting on foreign shards / tables
This is not used anywhere yet but needed for an upcoming PR for part movement between shards.
2020-12-04 13:01:12 +00:00
Anton Popov
cab9855dd1 Update StorageReplicatedMergeTree.cpp 2020-12-03 16:54:05 +03:00
Anton Popov
cd1917c7a6 Merge branch 'master' into optimize_final_optimization 2020-12-03 16:52:51 +03:00
Alexander Tokmakov
19c8399eb0 Merge branch 'master' into database_replicated 2020-12-01 17:33:07 +03:00
alexey-milovidov
f4a61ac3c3 Merge pull request #17527 from ucasFL/spelling
fix spelling errors
2020-11-29 13:45:42 +03:00
feng lv
7e3524caa1 fix spelling errors 2020-11-28 08:17:20 +00:00
Alexander Tokmakov
9e3fd3c170 Merge branch 'master' into database_replicated 2020-11-27 17:08:34 +03:00
alexey-milovidov
dfae1efbbd Merge pull request #17070 from fastio/master
Support multiple ZooKeeper clusters
2020-11-27 10:38:01 +03:00