Commit Graph

246 Commits

Author SHA1 Message Date
ianton-ru
e6fd4bfb50
Merge branch 'master' into MDB-15474 2021-12-21 17:38:36 +03:00
Anton Ivashkin
0c0bf66334 Merge master 2021-12-21 17:27:54 +03:00
Alexander Tokmakov
9cd49bc0ec Merge branch 'master' into mvcc_prototype 2021-12-20 22:06:22 +03:00
Igor Nikonov
100ee92c64 insert_deduplication_token setting for INSERT statement
The setting allows a user to provide own deduplication semantic in Replicated*MergeTree
If provided, it's used instead of data digest to generate block ID
So, for example, by providing a unique value for the setting in each INSERT statement,
user can avoid the same inserted data being deduplicated

Inserting data within the same INSERT statement are split into blocks
according to the *insert_block_size* settings
(max_insert_block_size, min_insert_block_size_rows, min_insert_block_size_bytes).
Each block with the same INSERT statement will get an ordinal number.
The ordinal number is added to insert_deduplication_token to get block dedup token
i.e. <token>_0, <token>_1, ... Deduplication is done per block
So, to guarantee deduplication for two same INSERT queries,
dedup token and number of blocks to have to be the same

Issue: #7461
2021-12-19 13:15:45 +00:00
liyang
37ba8004ff Speep up mergetree starting up process 2021-12-18 16:39:59 +08:00
Alexander Tokmakov
d7ad72838c Merge branch 'master' into mvcc_prototype 2021-12-14 23:07:52 +03:00
Anton Popov
16312e7e4a Merge remote-tracking branch 'upstream/master' into HEAD 2021-12-14 18:58:17 +03:00
tavplubix
25427719d4
Try fix 'Directory tmp_merge_<part_name>' already exists (#32201)
* try fix 'directory tmp_merge_<part_name>' already exists

* fix

* fix

* fix

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-12-10 16:29:51 +03:00
Nikita Mikhaylov
dbf5091016
Parallel reading from replicas (#29279) 2021-12-09 13:39:28 +03:00
Anton Popov
ea67abf3f0 fix custom serializations in native protocol 2021-12-08 21:59:36 +03:00
Anton Popov
61a5f8a61a add comments 2021-12-08 18:56:30 +03:00
Anton Popov
d8367334a3 Merge remote-tracking branch 'upstream/master' into HEAD 2021-12-08 18:26:19 +03:00
Alexander Tokmakov
7fcb79ae72 Merge branch 'master' into mvcc_prototype 2021-12-07 14:39:29 +03:00
Alexander Tokmakov
57e4f3698c fix 'directory exists' error when detaching part 2021-12-01 17:24:26 +03:00
Anton Ivashkin
80ab73c691 Fix Zero-Copy replication lost locks, fix remove used remote data in DROP DETACHED PART 2021-12-01 16:11:26 +03:00
Anton Ivashkin
0f9038ebed Zero-copy: move shared mark outside table node in ZooKeeper 2021-11-29 19:05:31 +03:00
Alexander Tokmakov
0a4647f927 support alter partition 2021-11-17 21:14:14 +03:00
Alexander Tokmakov
5656203bc6 minor fixes 2021-11-11 21:23:42 +03:00
Alexander Tokmakov
92eec74ad7 Merge branch 'master' into mvcc_prototype 2021-11-06 21:08:36 +03:00
Anton Popov
84e914e05a minor fixes near serializations 2021-11-05 01:46:00 +03:00
Anton Popov
f37d12d16e keep serialization infos after drops and renames 2021-11-03 23:29:48 +03:00
Anton Popov
9823f28855 fix nested 2021-11-02 06:03:52 +03:00
Anton Popov
d50137013c Merge remote-tracking branch 'upstream/master' into HEAD 2021-11-01 16:55:53 +03:00
Anton Popov
8fe3de16c6 fix nested 2021-11-01 05:40:43 +03:00
Anton Popov
0099dfd523 refactoring of SerializationInfo 2021-10-29 20:21:02 +03:00
kssenii
881ae8617e Merge branch 'master' of github.com:ClickHouse/ClickHouse into disk-async-read 2021-10-15 15:09:56 +03:00
Anton Popov
7aa6068fb2 Merge remote-tracking branch 'upstream/master' into HEAD 2021-10-14 19:44:08 +03:00
Nikolai Kochetov
077aba4a97
Merge pull request #29520 from amosbird/projection-improve4
Get rid of naming limitation of projections.
2021-10-12 14:25:29 +03:00
Amos Bird
72bccaa501
Fix path name 2021-10-11 19:12:08 +08:00
Maksim Kita
2d069acc22 System table data skipping indices added size 2021-10-11 11:39:50 +03:00
kssenii
aa4b797808 Turn off tasks stealing for remote disk 2021-10-10 23:22:58 +03:00
Alexey Milovidov
fe6b7c77c7 Rename "common" to "base" 2021-10-02 10:13:14 +03:00
Alexander Tokmakov
72b1b2e360 Merge branch 'master' into mvcc_prototype 2021-09-23 22:53:27 +03:00
Anton Popov
6f9e53197c Merge remote-tracking branch 'upstream/master' into HEAD 2021-09-20 17:17:05 +03:00
Nikita Mikhaylov
c52b8ec083
Introduced MergeTask and MutateTask (#25165)
Introduced MergeTask and MutateTask
2021-09-17 00:19:58 +03:00
Anton Popov
eef436fe22 Merge remote-tracking branch 'upstream/master' into HEAD 2021-09-16 18:07:42 +03:00
Anton Popov
8203bd1ac6 Merge remote-tracking branch 'upstream/master' into HEAD 2021-09-09 14:04:37 +03:00
Mike Kot
8e9aacadd1 Initial: replacing hardcoded toString for enums with magic_enum 2021-09-06 16:24:03 +02:00
Amos Bird
b68857d086
Simplify projection, add minmax_count projection. 2021-08-28 11:25:37 +08:00
Alexander Tokmakov
c74bfbf991 Merge branch 'master' into mvcc_prototype 2021-07-28 22:21:48 +03:00
Anton Popov
c4b454494f Merge remote-tracking branch 'upstream/master' into HEAD 2021-07-20 15:41:01 +03:00
Alexey Milovidov
261a220227 Remove some code 2021-07-17 21:06:46 +03:00
xiedeyantu
df438f4288 Modify code comments 2021-07-13 17:52:24 +08:00
Anton Popov
567043113c Merge remote-tracking branch 'upstream/master' into HEAD 2021-06-21 01:36:06 +03:00
Mike Kot
4c391f8e99
SYSTEM RESTORE REPLICA replica [ON CLUSTER cluster] (#13652)
* initial commit: add setting and stub

* typo

* added test stub

* fix

* wip merging new integration test and code proto

* adding steps interpreters

* adding firstly proposed solution (moving parts etc)

* added checking zookeeper path existence

* fixing the include

* fixing and sorting includes

* fixing outdated struct

* fix the name

* added ast ptr as level of indirection

* fix ref

* updating the changes

* working on test stub

* fix iterator -> reference

* revert rocksdb submodule update

* fixed show privileges test

* updated the test stub

* replaced rand() with thread_local_rng(), updated the tests

updated the test

fixed test config path

test fix

removed error messages

fixed the test

updated the test

fixed string literal

fixed literal

typo: =

* fixed the empty replica error message

* updated the test and the code with logs

* updated the possible test cases, updated

* added the code/test milestone comments

* updated the test (added more testcases)

* replaced native assert with CH one

* individual replicas recursive delete fix

* updated the AS db.name AST

* two small logging fixes

* manually generated AST fixes

* Updated the test, added the possible algo change

* Some thoughts about optimizing the solution:

ALTER MOVE PARTITION .. TO TABLE -> move to detached/ + ALTER ... ATTACH

* fix

* Removed the replica sync in test as it's invalid

* Some test tweaks

* tmp

* Rewrote the algo by using the executeQuery instead of

hand-crafting the ASTPtr.

Two questions still active.

* tr: logging active parts

* Extracted the parts moving algo into a separate helper function

* Fixed the test data and the queries slightly

* Replaced query to system.parts to direct invocation,

started building the test that breaks on various parts.

* Added the case for tables when at least one replica is alive

* Updated the test to test replicas restoration by detaching/attaching

* Altered the test to check restoration without replica restart

* Added the tables swap in the start if the server failed last time

* Hotfix when only /replicas/replica... path was deleted

* Restore ZK paths while creating a replicated MergeTree table

* Updated the docs, fixed the algo for individual replicas restoration case

* Initial parts table storage fix, tests sync fix

* Reverted individual replica restoration to general algo

* Slightly optimised getDataParts

* Trying another solution with parts detaching

* Rewrote algo without any steps, added ON CLUSTER support

* Attaching parts from other replica on restoration

* Getting part checksums from ZK

* Removed ON CLUSTER, finished working solution

* Multiple small changes after review

* Fixing parallel test

* Supporting rewritten form on cluster

* Test fix

* Moar logging

* Using source replica as checksum provider

* improve test, remove some code from parser

* Trying solution with move to detached + forget

* Moving all parts (not only Committed) to detached

* Edited docs for RESTORE REPLICA

* Re-merging

* minor fixes

Co-authored-by: Alexander Tokmakov <avtokmakov@yandex-team.ru>
2021-06-20 11:24:43 +03:00
Alexander Tokmakov
73ff1728ae rename flag to more generic name 2021-06-11 15:41:48 +03:00
Alexander Tokmakov
cef22688ff make code less ugly 2021-06-09 15:36:47 +03:00
Alexander Tokmakov
3ade38df82 remove copypaste 2021-06-08 22:17:45 +03:00
Alexander Tokmakov
9915bd0a8b Merge branch 'master' into mvcc_prototype 2021-06-03 00:52:57 +03:00
Anton Popov
e69cbab3fb Merge remote-tracking branch 'upstream/master' into HEAD 2021-06-02 20:18:19 +03:00
Anton Popov
018a303387 Merge remote-tracking branch 'upstream/master' into HEAD 2021-05-31 23:08:04 +03:00
Alexander Tokmakov
5692db736c Merge branch 'master' into mvcc_prototype 2021-05-31 21:11:22 +03:00
kssenii
2a631aaf08 Final fixes 2021-05-29 00:34:44 +03:00
Anton Popov
3e92c7f61a Merge remote-tracking branch 'upstream/master' into HEAD 2021-05-25 21:45:19 +03:00
mergify[bot]
3c6e6464a7
Merge branch 'master' into s3_zero_copy_replication 2021-05-23 07:09:17 +00:00
Anton Ivashkin
29336a4a34 Use keep_s3_on_delet flag instead of DeleteOnDestroyKeepS3 state 2021-05-21 15:29:10 +03:00
Alexey Milovidov
1006a970f7 Remove AutoArray 2021-05-20 09:30:13 +03:00
Anton Popov
a06f2fed9a serializations: fix mutations 2021-05-19 19:09:04 +03:00
Anton Popov
76613a5dd1 serialization: better interfaces 2021-05-19 04:48:46 +03:00
feng lv
403d0a305b remove useless code 2021-05-18 12:19:04 +00:00
Anton Ivashkin
8ed4a5de62 Fix Zero Copy after merge master 2021-05-17 16:01:08 +03:00
Anton Ivashkin
39a30b77fe Merge master 2021-05-17 11:47:48 +03:00
Anton Popov
d8df0903b9 Merge remote-tracking branch 'upstream/master' into HEAD 2021-05-14 23:38:16 +03:00
Alexander Tokmakov
df2849be90 Merge branch 'master' into mvcc_prototype 2021-05-13 21:48:36 +03:00
Amos Bird
264cff6415
Projections
TODO (suggested by Nikolai)

1. Build query plan fro current query (inside storage::read) up to WithMergableState
2. Check, that plan is simple enough: Aggregating - Expression - Filter - ReadFromStorage (or simplier)
3. Check, that filter is the same as filter in projection, and also expression calculates the same aggregation keys as in projection
4. Return WithMergableState if projection applies

3 will be easier to do with ActionsDAG, cause it sees all functions, and dependencies are direct (but it is possible with ExpressionActions also)

Also need to figure out how prewhere works for projections, and
row_filter_policies.

wip
2021-05-11 18:12:23 +08:00
Anton Ivashkin
09379b2b8a Fix Zero-Copy replication with several S3 volumes (issue 22679) 2021-04-16 12:34:48 +03:00
Anton Popov
6ce875175b Merge remote-tracking branch 'upstream/master' into HEAD 2021-04-16 02:08:20 +03:00
Anton Popov
aa617c6b3c ColumnSparse: fix vertical merge 2021-04-16 00:47:11 +03:00
Alexander Tokmakov
3422bd1742 check parts visibility for select 2021-04-08 20:20:45 +03:00
Anton Popov
d46958a8d2 Merge remote-tracking branch 'upstream/master' into HEAD 2021-04-06 00:54:49 +03:00
Alexander Tokmakov
f596906f47 add transaction counters 2021-04-05 14:41:34 +03:00
alesapin
0204f5dd35 Merge branch 'master' into merge_tree_deduplication 2021-04-03 15:24:26 +03:00
Mike Kot
c947280dfc Merge remote-tracking branch 'upstream/master' into feature/attach-partition-local 2021-04-01 21:38:51 +03:00
alesapin
7c8f54e694 More changes 2021-04-01 11:07:56 +03:00
Anton Popov
372a1b1fe7 Merge remote-tracking branch 'upstream/master' into HEAD 2021-03-29 19:57:49 +03:00
Anton Popov
577d571300 ColumnSparse: initial implementation 2021-03-29 19:54:24 +03:00
Anton Popov
ea82e7725f
Merge pull request #21562 from CurtizJ/serialization-refactoring-4
Refactoring of data types serialization
2021-03-29 16:36:44 +03:00
Alexey Milovidov
8d0210b510 Expose DateTime64 minmax part index in system.parts and system.parts_columns #18244 2021-03-23 01:16:41 +03:00
Anton Popov
173d2ea1f4 Merge remote-tracking branch 'upstream/master' into HEAD 2021-03-16 02:50:14 +03:00
Mike Kot
406d037ebb Merge remote-tracking branch 'upstream/master' into feature/attach-partition-local 2021-03-15 18:41:47 +03:00
Anton Popov
f7c7c5a9c7 Revert "refactoring of serializations"
This reverts commit df6663dcb6.
2021-03-09 20:25:23 +03:00
Anton Popov
df6663dcb6 refactoring of serializations 2021-03-09 20:02:26 +03:00
Anton Popov
bc417cf54a refactoring of serializations 2021-03-09 17:46:52 +03:00
Anton Ivashkin
3c11d44494 Add description for getUniqueId method, fix typos 2021-03-01 13:31:36 +03:00
Anton Ivashkin
c891cf4557 Fixes by review response 2021-02-26 12:48:57 +03:00
Mike Kot
b2c898f58c Adding the part when found 2021-02-25 21:25:55 +03:00
Anton Ivashkin
357d98eb36 Merge master 2021-01-20 12:23:03 +03:00
Anton Ivashkin
eba98b04b0 Zero copy replication over S3: Hybrid storage support 2021-01-18 19:16:45 +03:00
Alexey Milovidov
24c8e53440 Merge branch 'master' into multiple-nested 2021-01-16 16:28:40 +03:00
Alexey Milovidov
6a2a5e53ed Slightly better code of IMergeTreeDataPart #18955 2021-01-15 15:15:13 +03:00
Anton Popov
d7200ee2ed minor changes 2021-01-13 02:20:32 +03:00
Anton Ivashkin
0f0500ca0c Merge master 2020-12-16 18:31:13 +03:00
Anton Popov
b384beb564 Merge remote-tracking branch 'upstream/master' into HEAD 2020-11-23 17:46:51 +03:00
Nicolae Vartolomei
94293ca3ce Assign UUIDs to parts only when configured to do so
Avoid breaking backwards compatibility by default for now.
2020-11-20 13:49:17 +00:00
Nicolae Vartolomei
425dc4b11b Add unique identifiers IMergeTreeDataPart structure
For now uuids are not generated at all, they are present only if the
part is updated manually (as you can see in the integration test).

The only place where they can be seen today by an end user is in
`system.parts` table. I was looking for hiding this column behind an
option but couldn't find an easy way to do that.

Likely this is also required for WAL, but need to think how not to break
compatibility.

Relates to #13574, https://github.com/ClickHouse/ClickHouse/issues/13574

Next 1: In the upcoming PR the plan is to integrate de-duplication based on
these fingerprints in the query pipeline.

Next 2: We'll enable automatic generation of uuids and come up with a
way for conditionally sending uuids when processing distributed queries
only when part movement is in progress.
2020-11-19 13:14:25 +00:00
Anton Popov
245c395a68 Merge remote-tracking branch 'upstream/master' into HEAD 2020-11-06 22:00:32 +03:00
Anton Ivashkin
1742fb3256 Merge master 2020-11-03 12:27:16 +03:00
Anton Ivashkin
78021714f1 S3 zero copy replication: more simple s3 check 2020-11-03 12:20:26 +03:00
Mikhail Filimonov
41971e073a
Fix typos reported by codespell 2020-10-27 12:04:03 +01:00
Anton Popov
a249f0c95e Merge remote-tracking branch 'upstream/master' into HEAD 2020-10-23 22:05:00 +03:00
Anton Ivashkin
652c56e74e Fix style, fix build 2020-10-22 16:07:20 +03:00
Anton Ivashkin
f833501c77 Merge branch 'master' into s3_zero_copy_replication 2020-10-22 11:17:21 +03:00
Vladimir Chebotarev
aa5f207fd4
Added disable_merges option for volumes in multi-disk configuration (#13956)
Co-authored-by: Alexander Kazakov <Akazz@users.noreply.github.com>
2020-10-20 18:10:24 +03:00
alexey-milovidov
124379cccc
Update IMergeTreeDataPart.h 2020-10-20 04:24:30 +03:00
alexey-milovidov
26517ff08d
Update IMergeTreeDataPart.h 2020-10-20 04:23:23 +03:00
Pavel Kovalenko
ed61c5681b Use 'moving' directory instead of 'detached' when move part to another disk/volume. 2020-10-15 16:55:13 +03:00
Anton Popov
cbe12a532e allow to extract subcolumns from column 2020-10-13 22:39:22 +03:00
Anton Ivashkin
9272ed06b4 Move Zookeeper lock for S3 shared part in IMergeTreeDataPart 2020-10-09 17:24:10 +03:00
Anton Ivashkin
acf86568a7 S3 zero copy replication proof of concept 2020-10-08 18:45:10 +03:00
Anton Popov
edc2fe2226 Merge remote-tracking branch 'upstream/master' into HEAD 2020-09-18 17:51:54 +03:00
Anton Popov
cb4801e3be allow to read subcolumns of complex types 2020-09-18 02:12:43 +03:00
Artem Zuikov
51ba12c2c3
Try speedup build (#14809) 2020-09-15 12:55:57 +03:00
alesapin
acc0ee0657 Apply TTL if it's not calculated for part 2020-09-03 11:59:41 +03:00
alesapin
77faf9587f Better interface 2020-08-28 12:07:20 +03:00
alesapin
2fc80189af Add default compression codec to merge tree data part 2020-08-26 18:29:46 +03:00
alexey-milovidov
7b53a0ef33 Revert "Added allow_merges option for volumes in multi-disk configuration (#13402)"
This reverts commit 1e2616542a.
2020-08-21 18:44:29 +03:00
Vladimir Chebotarev
1e2616542a
Added allow_merges option for volumes in multi-disk configuration (#13402) 2020-08-21 12:04:13 +03:00
Alexey Milovidov
edd89a8610 Fix half of typos 2020-08-08 03:47:03 +03:00
Alexey Milovidov
ea970fd57c Remove bad ugliness 2020-07-09 04:00:16 +03:00
Anton Popov
4422df2e37 Merge remote-tracking branch 'upstream/master' into HEAD 2020-07-02 20:18:21 +03:00
Anton Popov
53e955c6dd several fixes 2020-06-29 23:36:18 +03:00
Alexey Milovidov
97ad23b905 Allow to ALTER partition key in some cases 2020-06-28 22:39:31 +03:00
alesapin
6f1824f0ea Correct merge with master 2020-06-26 14:30:23 +03:00
alesapin
e9c47dc89c Merge branch 'master' into CurtizJ-polymorphic-parts 2020-06-26 14:27:19 +03:00
alesapin
dffdece350 getColumns in StorageInMemoryMetadta (only compilable) 2020-06-17 19:39:58 +03:00
alesapin
1afdebeebd Primary key in storage metadata 2020-06-17 15:39:20 +03:00
Anton Popov
b19d48a11c Merge remote-tracking branch 'upstream/master' into HEAD 2020-06-16 06:37:55 +03:00
Anton Popov
66e31d4311 in-memory parts: several fixes 2020-06-05 23:49:24 +03:00
Anton Popov
4a3d3c6e54
Merge pull request #11419 from CurtizJ/polymorphic-parts-2
Return lost comments and default values
2020-06-04 13:32:38 +03:00
Anton Popov
df3dfd5b81 fix clang-tidy build 2020-06-04 01:00:02 +03:00
Anton Popov
1980d58ba6 add lost comments and default values 2020-06-04 00:30:45 +03:00
Anton Popov
11c4e9dde3 in-memory parts: fix 'check' query 2020-06-03 22:19:49 +03:00
Anton Popov
1ce09e1faa Merge remote-tracking branch 'upstream/master' into polymorphic-parts 2020-06-03 16:27:54 +03:00
Anton Popov
caf1e4e8cc in-memory-parts: fixes 2020-06-03 12:51:23 +03:00
Anton Popov
c919840722 in-memory parts: partition commands 2020-05-29 18:02:12 +03:00
Alexander Kuzmenkov
f98ffdbc4c
Merge pull request #11087 from azat/context-fwd-decl
[RFC] Forward declaration for Context as much as possible.
2020-05-21 19:43:29 +03:00
BohuTANG
c9d315654c Fix Storages/MergeTree typo 2020-05-21 17:00:44 +08:00
Azat Khuzhin
d93b9a57f6 Forward declaration for Context as much as possible.
Now after changing Context.h 488 modules will be recompiled instead of 582.
2020-05-21 01:53:18 +03:00
Gleb Novikov
1a25ac6e1f Merge branch 'master' into refactor-reservations 2020-05-16 23:34:45 +03:00
Gleb Novikov
45b84d491e small fixes of review remarks 2020-05-16 23:31:17 +03:00
alesapin
26606dd640 Fix polymorphic parts 2020-05-15 13:36:35 +03:00
Gleb Novikov
390b39b272 VolumePtr instead of DiskPtr in MergeTreeData* 2020-05-10 00:24:15 +03:00
Anton Popov
4878c91d07 in-memory parts: better restore from wal 2020-05-06 14:57:38 +03:00
Anton Popov
4069dbcc58 in-memory parts: add waiting for insert 2020-04-20 04:38:38 +03:00
Anton Popov
391f7c34be in memory parts: basic read/write 2020-04-17 20:30:46 +03:00
Ivan Lezhankin
06446b4f08 dbms/ → src/ 2020-04-03 18:14:31 +03:00