402 Commits

Author SHA1 Message Date
Mike Kot
fefc7234df Replaced the part lookup algo to "by hash only", comments on test stub 2021-02-16 16:00:26 +03:00
Mike Kot
8182482cbd Add test stub 2021-02-15 21:06:20 +03:00
Mike Kot
0a02fe913a Add some code for the checksum pre-calculation in ...BlockOutputStream
Added the comment explaining the double-get for the zookeeper header for
the part.
2021-02-15 20:31:58 +03:00
Mike Kot
73f4740be5 Trying to rewrite the solution by adding a new command type,
ATTACH_PART, to the replicated log.

For this entry type, the LogEntry now also carries the pre-calculated part
checksum, which is later used when searching the detached/ folder.
2021-02-15 18:06:48 +03:00
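
The checksum-based lookup described in the commit above can be illustrated with a small model. This is a hypothetical Python sketch, not ClickHouse code: the names `part_checksum` and `find_detached_part`, the in-memory dictionary standing in for the detached/ folder, and the use of SHA-256 are all assumptions made for illustration. The log does not say how the real part checksum is computed.

```python
import hashlib
from typing import Dict, Optional


def part_checksum(data: bytes) -> str:
    # Stand-in hash (assumption): the real checksum format is not given in the log.
    return hashlib.sha256(data).hexdigest()


def find_detached_part(detached: Dict[str, bytes],
                       expected_checksum: str) -> Optional[str]:
    """Return the name of a detached part whose checksum matches, else None."""
    for name, data in detached.items():
        # Match by hash only: the part name is deliberately not compared.
        if part_checksum(data) == expected_checksum:
            return name
    return None


# The log entry carries the checksum computed when the entry was created.
payload = b"part-bytes"
entry = {"type": "ATTACH_PART", "part_checksum": part_checksum(payload)}

# Hypothetical contents of a replica's detached/ folder.
detached = {"all_1_1_0": b"other-bytes", "renamed_part": payload}
match = find_detached_part(detached, entry["part_checksum"])
```

Because only the hash is compared, a part that was renamed while detached is still found. What a replica does when nothing matches is outside this sketch.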
Mike Kot
cd32803709 Merge remote-tracking branch 'upstream/master' into feature/attach-partition-local 2021-02-15 15:42:17 +03:00
Mike Kot
feff4c6a22 Started adding the new "ATTACH_PART" command to the replicated log
The original ticket idea was to search the /detached folders for possibly
available data on each GET_PART command, but @tavplubix pointed out that
this would be quite expensive to do on every fetch.

So a new command, ATTACH_PART, is going to be introduced. It will cover
ALTER TABLE ATTACH PART, and the search will run only for this command.
2021-02-15 01:59:13 +03:00
alesapin
3253638969 Fix backoff for failed background tasks in replicated merge tree 2021-02-11 14:46:18 +03:00
tavplubix
ac477d9850
Merge pull request #19771 from ClickHouse/thread_state_improvements
Minor code improvements around ThreadStatus
2021-02-08 22:34:55 +03:00
Maksim Kita
d0151de4bb
Merge pull request #19608 from kreuzerkrieg/Add_IStoragePolicy_interface
Add IStoragePolicy interface
2021-02-02 11:03:20 +03:00
Kruglov Pavel
caef103837
Merge branch 'master' into Add_IStoragePolicy_interface 2021-01-29 14:00:12 +03:00
alesapin
2881c830e3 Merge branch 'master' into fix_rare_bug_after_part_corruption 2021-01-28 23:16:52 +03:00
tavplubix
10460313ad
Update StorageReplicatedMergeTree.cpp 2021-01-28 17:48:09 +03:00
Alexander Tokmakov
ffaa8e34a6 minor code improvements around ThreadStatus 2021-01-28 16:57:36 +03:00
alesapin
5622e6daa6 Fix rare max_number_of_merges_with_ttl_in_pool limit overrun for non-replicated MergeTree 2021-01-27 14:56:12 +03:00
alesapin
01c8b9e1b1 Fix rare bug when some replicated operations (like mutation) cannot process some parts after data corruption 2021-01-27 13:07:18 +03:00
kreuzerkrieg
29a2ef3089 Add IStoragePolicy interface 2021-01-26 10:55:28 +02:00
alesapin
4ee96869a2 Don't wait forever for log update after table was dropped 2021-01-18 15:15:07 +03:00
alesapin
bfc27254b2 Avoid redundant exception while dropping part 2021-01-14 11:07:13 +03:00
Mike Kot
6b949109f1 marked places to edit 2021-01-12 21:46:03 +03:00
alesapin
dead1016d6 Fix deduplication block names parsing 2021-01-12 13:55:02 +03:00
fastio
a1d0c04e68 fix build 2021-01-08 13:10:00 +08:00
fastio
ea047b951f Expand macros for fetchPartition 2021-01-07 22:13:17 +08:00
alexey-milovidov
a08db94343
Revert "Add metrics for part number in MergeTree in ClickHouse" 2021-01-07 16:40:52 +03:00
alexey-milovidov
f91626e7ff
Merge pull request #17838 from weeds085490/dev/add_metrics_for_parts
Add metrics for part number in MergeTree in ClickHouse
2021-01-07 15:27:04 +03:00
weeds085490
5f5b86b485 Merge remote-tracking branch 'origin' into dev/add_metrics_for_parts 2021-01-06 17:32:45 +08:00
sundy-li
6cc0668af4 Add one more argument 2021-01-04 16:21:04 +08:00
sundy-li
8d7fe410cd Fix Logger with unmatched arg size 2021-01-04 16:15:13 +08:00
alexey-milovidov
1bcdf37c36
Merge pull request #18614 from CurtizJ/fix-empty-parts
Fix removing of empty parts in tables with old syntax
2020-12-30 17:20:31 +03:00
Anton Popov
6336fbf1df fix removing of empty parts in tables with old syntax 2020-12-29 20:16:57 +03:00
alesapin
a50615a22b More correct error code 2020-12-25 16:38:04 +03:00
roverxu
b339b9dfd0 fix consistence 2020-12-22 17:38:15 +08:00
Alexey Milovidov
9be5fa9ef2 Merge branch 'master' into Enmk-Optimize_deduplicate 2020-12-20 09:57:10 +03:00
alesapin
f45993cd1e More logs during quorum insert 2020-12-17 19:13:01 +03:00
Nikolai Kochetov
6defcbb662
Merge branch 'master' into optimize-data-on-insert 2020-12-15 16:50:42 +03:00
Azat Khuzhin
5b3ab48861 More forward declaration for generic headers
The following headers are pretty generic, so use forward declaration as
much as possible:
- Context.h
- Settings.h
- ConnectionTimeouts.h
(This also revealed some missing includes, which have been fixed)

And split ConnectionTimeouts.h into ConnectionTimeoutsContext.h (since
module part cannot be added for it, due to recursive build dependencies
that will be introduced)

Also remove Settings from the RemoteBlockInputStream/RemoteQueryExecutor
and just pass the context, since settings was passed only in specific
places, that can allow making a copy of Context (i.e. Copier).

Approx results (How much units will be recompiled after changing file X?):

- ConnectionTimeouts.h
  - mainline: 100

- Context.h:
  - mainline: ~800
  - patched:  415

- Settings.h:
  - mainline: 900-1K
  - patched:  440 (most of them because of the Context.h)
2020-12-12 17:43:10 +03:00
Kruglov Pavel
e19eb6f17a
Merge branch 'master' into optimize-data-on-insert 2020-12-08 15:57:46 +03:00
Vasily Nemkov
70ea507dae OPTIMIZE DEDUPLICATE BY columns
Extended the OPTIMIZE ... DEDUPLICATE syntax to allow an explicit (or implicit, with asterisk/column transformers) list of columns to check for duplicates on.

The following syntax variants are now supported:

OPTIMIZE TABLE table DEDUPLICATE; -- the old one
OPTIMIZE TABLE table DEDUPLICATE BY *;
OPTIMIZE TABLE table DEDUPLICATE BY * EXCEPT colX;
OPTIMIZE TABLE table DEDUPLICATE BY * EXCEPT (colX, colY);
OPTIMIZE TABLE table DEDUPLICATE BY col1,col2,col3;
OPTIMIZE TABLE table DEDUPLICATE BY COLUMNS('column-matched-by-regex');
OPTIMIZE TABLE table DEDUPLICATE BY COLUMNS('column-matched-by-regex') EXCEPT colX;
OPTIMIZE TABLE table DEDUPLICATE BY COLUMNS('column-matched-by-regex') EXCEPT (colX, colY);

Note that * behaves just like in SELECT: MATERIALIZED and ALIAS columns are not used for expansion.
Also, it is an error to specify an empty list of columns, to write an expression that results in an empty list of columns, or to deduplicate by an ALIAS column.
Column transformers other than EXCEPT are not supported.
2020-12-07 09:44:07 +03:00
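
The column-selection rules stated in the commit above can be modelled in a few lines. This is a hypothetical Python sketch, not the ClickHouse implementation: `Column`, its `kind` labels, and `resolve_dedup_columns` are names invented for illustration. It covers only the `*`, explicit-list, and `EXCEPT` variants. Rejecting unknown column names is an assumption that the commit message does not state.

```python
from dataclasses import dataclass
from typing import List, Sequence, Union


@dataclass(frozen=True)
class Column:
    name: str
    kind: str = "ordinary"  # illustrative labels: "ordinary", "materialized", "alias"


def resolve_dedup_columns(table: Sequence[Column],
                          by: Union[str, Sequence[str]],
                          except_: Sequence[str] = ()) -> List[str]:
    """Resolve a DEDUPLICATE BY clause to a concrete list of column names."""
    by_name = {c.name: c for c in table}
    if by == "*":
        # Like SELECT *: MATERIALIZED and ALIAS columns are not expanded.
        selected = [c.name for c in table if c.kind == "ordinary"]
    else:
        selected = list(by)
        for name in selected:
            if name not in by_name:
                # Assumption: unknown names are rejected.
                raise ValueError(f"unknown column: {name}")
            if by_name[name].kind == "alias":
                raise ValueError(f"cannot deduplicate by ALIAS column: {name}")
    excluded = set(except_)
    selected = [n for n in selected if n not in excluded]
    if not selected:
        # Covers both an empty list and an expression that resolves to nothing.
        raise ValueError("empty list of columns to deduplicate by")
    return selected


table = [Column("id"), Column("ts"), Column("m", "materialized"), Column("a", "alias")]
star = resolve_dedup_columns(table, "*")                         # BY *
star_except = resolve_dedup_columns(table, "*", except_=["ts"])  # BY * EXCEPT ts
```

In this model, `BY * EXCEPT (id, ts)` raises an error because the resulting column list is empty, which matches the rule in the commit message.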
alesapin
27c3301083
Merge pull request #17800 from nvartolomei/nv/waitForAllReplicasToProcessLogEntry-foreign-shard
Update StorageReplicatedMergeTree::waitForAllReplicasToProcessLogEntry to support waiting on foreign shards / tables
2020-12-05 16:15:46 +03:00
Pavel Kruglov
5ae6c6dab9 Fix build error 2020-12-04 20:40:28 +03:00
Pavel Kruglov
905ba78adc Merge branch 'master' of github.com:ClickHouse/ClickHouse into optimize-data-on-insert 2020-12-04 18:56:46 +03:00
Pavel Kruglov
9dbced0474 Pass setting instead of context 2020-12-04 17:01:59 +03:00
Nicolae Vartolomei
796aee032d Update StorageReplicatedMergeTree::waitForAllReplicasToProcessLogEntry to support waiting on foreign shards / tables
This is not used anywhere yet but needed for an upcoming PR for part movement between shards.
2020-12-04 13:01:12 +00:00
Anton Popov
cab9855dd1
Update StorageReplicatedMergeTree.cpp 2020-12-03 16:54:05 +03:00
Anton Popov
cd1917c7a6
Merge branch 'master' into optimize_final_optimization 2020-12-03 16:52:51 +03:00
alexey-milovidov
f4a61ac3c3
Merge pull request #17527 from ucasFL/spelling
fix spelling errors
2020-11-29 13:45:42 +03:00
feng lv
7e3524caa1 fix spelling errors 2020-11-28 08:17:20 +00:00
alexey-milovidov
dfae1efbbd
Merge pull request #17070 from fastio/master
Support multiple ZooKeeper clusters
2020-11-27 10:38:01 +03:00
nikitamikhaylov
72c7cd6693 replace Context& to Settings& 2020-11-25 16:47:32 +03:00
nikitamikhaylov
68bef22fda Merge branch 'master' of github.com:ClickHouse/ClickHouse into merging-sequential-consistency 2020-11-23 16:28:35 +03:00
Pavel Kruglov
ca3fe49a2a Make setting global 2020-11-20 17:29:13 +03:00