Commit Graph

326 Commits

Author SHA1 Message Date
mergify[bot]
1e0642065b
Merge branch 'master' into deduplication_token_7461 2021-12-22 15:27:28 +00:00
Anton Ivashkin
e88b97dafb Fix typos 2021-12-21 19:56:29 +03:00
ianton-ru
e6fd4bfb50
Merge branch 'master' into MDB-15474 2021-12-21 17:38:36 +03:00
Anton Ivashkin
0c0bf66334 Merge master 2021-12-21 17:27:54 +03:00
Maksim Kita
51477adf1b Updated additional cases 2021-12-20 15:55:07 +03:00
Igor Nikonov
100ee92c64 insert_deduplication_token setting for INSERT statement
The setting allows a user to provide their own deduplication semantics in Replicated*MergeTree.
If provided, it is used instead of the data digest to generate the block ID.
So, for example, by providing a unique value for the setting in each INSERT statement,
the user can prevent identical inserted data from being deduplicated.

Data inserted within a single INSERT statement is split into blocks
according to the *insert_block_size* settings
(max_insert_block_size, min_insert_block_size_rows, min_insert_block_size_bytes).
Each block within the same INSERT statement gets an ordinal number.
The ordinal number is appended to insert_deduplication_token to form the block dedup token,
i.e. <token>_0, <token>_1, ... Deduplication is done per block.
So, to guarantee deduplication for two identical INSERT queries,
the dedup token and the number of blocks have to be the same.

Issue: #7461
2021-12-19 13:15:45 +00:00
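The token scheme described in commit 100ee92c64 can be sketched roughly as follows. This is an illustration only, not ClickHouse code: the helper names `split_into_blocks` and `block_dedup_ids`, the row-count-only splitting rule, and the use of SHA-1 as the fallback data digest are all assumptions made for the example.

```python
import hashlib


def split_into_blocks(rows, max_block_rows):
    """Split inserted rows into consecutive blocks.

    Hypothetical stand-in for the *insert_block_size* settings;
    only a simple row limit is modelled here.
    """
    return [rows[i:i + max_block_rows] for i in range(0, len(rows), max_block_rows)]


def block_dedup_ids(rows, max_block_rows, token=""):
    """Return one deduplication ID per block of a single INSERT.

    With a token, each ID is "<token>_<ordinal>". Without a token, the ID
    falls back to a digest of the block data (SHA-1 is an assumption).
    """
    ids = []
    for ordinal, block in enumerate(split_into_blocks(rows, max_block_rows)):
        if token:
            ids.append(f"{token}_{ordinal}")
        else:
            ids.append(hashlib.sha1(repr(block).encode()).hexdigest())
    return ids


rows = [1, 2, 3, 4, 5]
first_insert = block_dedup_ids(rows, max_block_rows=2, token="batch-42")
retry_insert = block_dedup_ids(rows, max_block_rows=2, token="batch-42")
other_split = block_dedup_ids(rows, max_block_rows=3, token="batch-42")
```

In this sketch `first_insert` and `retry_insert` produce identical ID lists, so every block of the retry would be treated as a duplicate. `other_split` uses the same token but yields fewer blocks, which is why the commit message requires both the token and the number of blocks to match.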
liyang
37ba8004ff Speed up MergeTree startup process 2021-12-18 16:39:59 +08:00
Anton Popov
16312e7e4a Merge remote-tracking branch 'upstream/master' into HEAD 2021-12-14 18:58:17 +03:00
Nikolai Kochetov
22e6fc1685
Merge pull request #32067 from amosbird/projection-fix23
Fix detaching parts with projections
2021-12-13 12:00:17 +03:00
tavplubix
25427719d4
Try fix 'Directory tmp_merge_<part_name>' already exists (#32201)
* try fix 'directory tmp_merge_<part_name>' already exists

* fix

* fix

* fix

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2021-12-10 16:29:51 +03:00
Nikita Mikhaylov
dbf5091016
Parallel reading from replicas (#29279) 2021-12-09 13:39:28 +03:00
Anton Popov
61a5f8a61a add comments 2021-12-08 18:56:30 +03:00
Anton Popov
d8367334a3 Merge remote-tracking branch 'upstream/master' into HEAD 2021-12-08 18:26:19 +03:00
Anton Popov
f6be3d16fd
Merge pull request #24820 from kssenii/versioning
Versioning of aggregate function states
2021-12-03 01:41:44 +03:00
Amos Bird
8dbc7a8dae
Fix detaching parts with projections 2021-12-01 23:21:17 +08:00
tavplubix
7ae45b9d52
Update IMergeTreeDataPart.cpp 2021-12-01 18:00:40 +03:00
Alexander Tokmakov
57e4f3698c fix 'directory exists' error when detaching part 2021-12-01 17:24:26 +03:00
Anton Ivashkin
80ab73c691 Fix Zero-Copy replication lost locks, fix remove used remote data in DROP DETACHED PART 2021-12-01 16:11:26 +03:00
Anton Ivashkin
0f9038ebed Zero-copy: move shared mark outside table node in ZooKeeper 2021-11-29 19:05:31 +03:00
kssenii
515261f5dd Better 2021-11-27 09:40:46 +00:00
kssenii
37f482d478 Merge branch 'master' of github.com:ClickHouse/ClickHouse into versioning 2021-11-15 07:31:11 +00:00
Anton Popov
46fa062a81 fix tests 2021-11-02 23:30:28 +03:00
Anton Popov
9823f28855 fix nested 2021-11-02 06:03:52 +03:00
Anton Popov
8fe3de16c6 fix nested 2021-11-01 05:40:43 +03:00
Anton Popov
0099dfd523 refactoring of SerializationInfo 2021-10-29 20:21:02 +03:00
Anton Popov
7aa6068fb2 Merge remote-tracking branch 'upstream/master' into HEAD 2021-10-14 19:44:08 +03:00
Nikolai Kochetov
077aba4a97
Merge pull request #29520 from amosbird/projection-improve4
Get rid of naming limitation of projections.
2021-10-12 14:25:29 +03:00
Maksim Kita
b0d887a0fe Added tests 2021-10-11 14:00:10 +03:00
Amos Bird
83717b7c3b
Get rid of naming limitation of projections. 2021-10-11 18:32:17 +08:00
Maksim Kita
2d069acc22 System table data skipping indices added size 2021-10-11 11:39:50 +03:00
Alexey Milovidov
fe6b7c77c7 Rename "common" to "base" 2021-10-02 10:13:14 +03:00
Anton Popov
6f9e53197c Merge remote-tracking branch 'upstream/master' into HEAD 2021-09-20 17:17:05 +03:00
Nikita Mikhaylov
c52b8ec083
Introduced MergeTask and MutateTask (#25165)
2021-09-17 00:19:58 +03:00
Anton Popov
eef436fe22 Merge remote-tracking branch 'upstream/master' into HEAD 2021-09-16 18:07:42 +03:00
Anton Popov
8203bd1ac6 Merge remote-tracking branch 'upstream/master' into HEAD 2021-09-09 14:04:37 +03:00
Mike Kot
8e9aacadd1 Initial: replacing hardcoded toString for enums with magic_enum 2021-09-06 16:24:03 +02:00
alexey-milovidov
4cc0b0298c
Merge pull request #28269 from amosbird/fixweirdcode
Better nullable primary key implementation
2021-09-01 00:48:45 +03:00
Amos Bird
f2374a6916
Better nullable primary key implementation. 2021-08-28 17:48:28 +08:00
Alexey Milovidov
8f57216180 Progress on development 2021-08-25 00:45:58 +03:00
Anton Popov
c3c3a06078 Merge remote-tracking branch 'upstream/master' into HEAD 2021-08-20 01:45:38 +03:00
Alexey Milovidov
d184b79bba Progress on async reads. 2021-08-16 03:00:32 +03:00
Alexander Tokmakov
23f8b3d07d fix part name parsing in system.detached_parts 2021-08-04 17:42:48 +03:00
kssenii
58b3a3f3fc Merge branch 'master' of https://github.com/ClickHouse/ClickHouse into versioning 2021-07-29 19:56:27 +00:00
Anton Popov
c4b454494f Merge remote-tracking branch 'upstream/master' into HEAD 2021-07-20 15:41:01 +03:00
alexey-milovidov
b52411a715
Merge pull request #12455 from amosbird/npc
Nullable primary key with correct KeyCondition
2021-07-18 17:52:20 +03:00
Zhichang Yu
b4e6689bf9 fix test_hdfs_zero_copy_replication_move[tiered_copy-2] 2021-07-13 07:20:23 +00:00
Zhichang Yu
5047c758f4 fix per review 2021-07-13 07:20:20 +00:00
Zhichang Yu
fbd5eee8a1 hdfs zero copy 2021-07-13 07:19:12 +00:00
Anton Popov
567043113c Merge remote-tracking branch 'upstream/master' into HEAD 2021-06-21 01:36:06 +03:00
Amos Bird
f2ed5ef42b
Nullable primary key with correct KeyCondition 2021-06-18 23:04:24 +08:00
Alexander Tokmakov
73ff1728ae rename flag to more generic name 2021-06-11 15:41:48 +03:00
Alexander Tokmakov
cef22688ff make code less ugly 2021-06-09 15:36:47 +03:00
Alexander Tokmakov
3ade38df82 remove copypaste 2021-06-08 22:17:45 +03:00
Anton Popov
2a2074fae7 fix debug asserts 2021-06-04 16:06:57 +03:00
Anton Popov
e69cbab3fb Merge remote-tracking branch 'upstream/master' into HEAD 2021-06-02 20:18:19 +03:00
kssenii
32095a2b74 Merge branch 'master' of https://github.com/ClickHouse/ClickHouse into versioning 2021-06-01 08:01:06 +00:00
kssenii
e510c3839e More correct 2021-05-31 22:09:54 +00:00
Anton Popov
018a303387 Merge remote-tracking branch 'upstream/master' into HEAD 2021-05-31 23:08:04 +03:00
kssenii
c11ad44aad More correct version 2021-05-30 22:54:42 +00:00
kssenii
d18609467b First version 2021-05-30 13:57:30 +00:00
kssenii
73f16ee9ee Merge branch 'master' of github.com:ClickHouse/ClickHouse into poco-file-to-std-fs 2021-05-26 23:08:08 +03:00
Anton Ivashkin
29336a4a34 Use keep_s3_on_delet flag instead of DeleteOnDestroyKeepS3 state 2021-05-21 15:29:10 +03:00
Anton Popov
a06f2fed9a serializations: fix mutations 2021-05-19 19:09:04 +03:00
Anton Popov
76613a5dd1 serialization: better interfaces 2021-05-19 04:48:46 +03:00
Anton Ivashkin
d256cf4edf Fix PVS check (Two or more case-branches perform the same actions) 2021-05-18 17:41:09 +03:00
Anton Ivashkin
8ed4a5de62 Fix Zero Copy after merge master 2021-05-17 16:01:08 +03:00
Anton Ivashkin
39a30b77fe Merge master 2021-05-17 11:47:48 +03:00
Anton Popov
78dc7bf8fe fix build 2021-05-15 00:45:13 +03:00
Anton Popov
d8df0903b9 Merge remote-tracking branch 'upstream/master' into HEAD 2021-05-14 23:38:16 +03:00
Anton Popov
8ae1533f8f better serialization in native format 2021-05-14 23:29:48 +03:00
kssenii
0527f0ea33 Merge branch 'master' of github.com:ClickHouse/ClickHouse into poco-file-to-std-fs 2021-05-12 16:54:18 +03:00
Amos Bird
264cff6415
Projections
TODO (suggested by Nikolai)

1. Build the query plan for the current query (inside storage::read) up to WithMergableState
2. Check that the plan is simple enough: Aggregating - Expression - Filter - ReadFromStorage (or simpler)
3. Check that the filter is the same as the filter in the projection, and that the expression calculates the same aggregation keys as in the projection
4. Return WithMergableState if the projection applies

Step 3 will be easier to do with ActionsDAG, because it sees all functions and dependencies are direct (but it is also possible with ExpressionActions)

Also need to figure out how prewhere works for projections, and
row_filter_policies.

wip
2021-05-11 18:12:23 +08:00
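Step 2 of the TODO in commit 264cff6415 (checking that the plan is "simple enough") could look something like the sketch below. It is not how ClickHouse implements the check: representing plan steps as plain strings, the name `is_simple_enough`, and reading "or simpler" as "the same chain with some steps omitted" are all assumptions made for illustration.

```python
# Allowed plan shape, listed from the top of the plan down to the storage read.
ALLOWED_CHAIN = ["Aggregating", "Expression", "Filter", "ReadFromStorage"]


def is_simple_enough(plan_steps):
    """Return True if plan_steps is the allowed chain or a simpler variant.

    "Simpler" is interpreted (as an assumption) as the allowed chain with
    some steps left out, keeping the order and ending in ReadFromStorage.
    """
    if not plan_steps or plan_steps[-1] != "ReadFromStorage":
        return False
    # Subsequence check: each step must appear in the remaining allowed chain.
    remaining = iter(ALLOWED_CHAIN)
    return all(step in remaining for step in plan_steps)


full_plan = ["Aggregating", "Expression", "Filter", "ReadFromStorage"]
no_filter = ["Aggregating", "Expression", "ReadFromStorage"]
with_sort = ["Aggregating", "Sorting", "ReadFromStorage"]
```

Here `full_plan` and `no_filter` are accepted, while `with_sort` is rejected because it contains a step outside the allowed chain.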
kssenii
35f999bf04 Poco::createFile to fs::createFile 2021-05-08 02:41:47 +03:00
kssenii
9ec92ec514 Fix tests, less manual concatenation of paths 2021-05-05 18:39:30 +03:00
Anton Popov
aea93d9ae5 Merge remote-tracking branch 'upstream/master' into HEAD 2021-04-20 15:16:12 +03:00
Azat Khuzhin
2561a67fd8 Replace !__clang__ with !defined(__clang__) to fix gcc builds
$ gg 'if !__clang__' | cut -d: -f1 | sort -u | xargs sed -i 's/#if !__clang__/#if !defined(__clang__)/g'
2021-04-18 23:37:50 +03:00
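The substitution performed by the sed expression in commit 2561a67fd8 can be reproduced on a single string as shown below. This is a minimal sketch: the function name `fix_clang_guard` is hypothetical, and the file discovery part of the one-liner (`gg`, presumably a grep alias) is not modelled.

```python
def fix_clang_guard(source: str) -> str:
    """Rewrite every '#if !__clang__' guard to '#if !defined(__clang__)'.

    The sed pattern contains no regex metacharacters, so a plain string
    replacement matches the same text as 's/#if !__clang__/.../g'.
    """
    return source.replace("#if !__clang__", "#if !defined(__clang__)")


before = "#if !__clang__\n#pragma GCC diagnostic push\n#endif\n"
after = fix_clang_guard(before)
```

The rewrite is idempotent: the replacement text no longer contains the original pattern, so running it over an already fixed file changes nothing.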
Anton Popov
2afa1590e0 ColumnSparse: fix MergeTree in old syntax 2021-04-17 04:06:59 +03:00
Anton Ivashkin
09379b2b8a Fix Zero-Copy replication with several S3 volumes (issue 22679) 2021-04-16 12:34:48 +03:00
Anton Popov
6ce875175b Merge remote-tracking branch 'upstream/master' into HEAD 2021-04-16 02:08:20 +03:00
Anton Popov
298251e55d fix merges with sparse columns and disable sparse for some data types 2021-04-12 02:33:53 +03:00
Anton Popov
d46958a8d2 Merge remote-tracking branch 'upstream/master' into HEAD 2021-04-06 00:54:49 +03:00
alesapin
8d5a787f6b Merge branch 'master' into merge_tree_deduplication 2021-04-05 10:40:03 +03:00
Alexey Milovidov
8b5d0a598f Minor improvement in index deserialization 2021-04-04 12:17:54 +03:00
alesapin
0204f5dd35 Merge branch 'master' into merge_tree_deduplication 2021-04-03 15:24:26 +03:00
alesapin
a555d078a2 Add exception handling 2021-04-02 19:56:02 +03:00
Mike Kot
c947280dfc Merge remote-tracking branch 'upstream/master' into feature/attach-partition-local 2021-04-01 21:38:51 +03:00
alesapin
c15d7e009d Some initial code 2021-03-31 18:20:30 +03:00
Anton Popov
372a1b1fe7 Merge remote-tracking branch 'upstream/master' into HEAD 2021-03-29 19:57:49 +03:00
Anton Popov
577d571300 ColumnSparse: initial implementation 2021-03-29 19:54:24 +03:00
Anton Popov
ea82e7725f
Merge pull request #21562 from CurtizJ/serialization-refactoring-4
Refactoring of data types serialization
2021-03-29 16:36:44 +03:00
alexey-milovidov
f895bc895c
Merge pull request #22011 from ClickHouse/min_max_time_system_parts_datetime64
Expose DateTime64 minmax part index in system.parts and system.parts_columns
2021-03-25 16:02:33 +03:00
Anton Popov
6a15431be7 Merge remote-tracking branch 'upstream/master' into HEAD 2021-03-25 15:57:35 +03:00
Mike Kot
285af08949 Merge remote-tracking branch 'upstream/master' into feature/attach-partition-local 2021-03-24 22:34:20 +03:00
alexey-milovidov
612d4fb073
Update IMergeTreeDataPart.cpp 2021-03-24 02:03:14 +03:00
Pavel Kovalenko
a92cf30b67 Code review fixes. 2021-03-23 13:33:07 +03:00
Alexey Milovidov
8d0210b510 Expose DateTime64 minmax part index in system.parts and system.parts_columns #18244 2021-03-23 01:16:41 +03:00
Mike Kot
c55a73b752 Added the solution to handle the corruption case
When the part data (e.g. data.bin) is corrupted, but the checksums.txt
is present -- explicitly deleting the checksums.txt.

Removed the extra logging, changed some exception messages.
2021-03-22 17:23:43 +03:00
Anton Popov
173d2ea1f4 Merge remote-tracking branch 'upstream/master' into HEAD 2021-03-16 02:50:14 +03:00
Anton Popov
1b07d28043 fix unwanted changes 2021-03-13 02:59:42 +03:00
Anton Popov
bc417cf54a refactoring of serializations 2021-03-09 17:46:52 +03:00
alesapin
5b3161e0b5 Get rid of const_cast 2021-03-05 20:24:06 +03:00
Anton Ivashkin
d08b481660 Fixes by review responses 2021-03-05 19:20:38 +03:00
Anton Ivashkin
e69124a0a6 Merge master 2021-03-04 13:26:40 +03:00
alesapin
9ebf1b4fad Get rid of separate minmax index fields 2021-03-02 13:33:54 +03:00
Anton Ivashkin
c891cf4557 Fixes by review response 2021-02-26 12:48:57 +03:00
Anton Ivashkin
4d44d75bc7 Fix build after merge one more time 2021-02-08 14:45:10 +03:00
Anton Ivashkin
e64c63c611 Merge master 2021-02-05 20:10:06 +03:00
Anton Ivashkin
df6c882aab Fix build after merge 2021-02-05 18:52:40 +03:00
Anton Popov
a8f3078ce9 Merge remote-tracking branch 'upstream/master' into HEAD 2021-01-27 19:48:55 +03:00
Anton Popov
c7070da85a better abstractions in disk interface 2021-01-26 17:49:35 +03:00
Anton Popov
658f24dcff
Merge pull request #19358 from CurtizJ/fix-subcolumns
Fix several cases, while reading subcolumns
2021-01-25 20:26:07 +03:00
Azat Khuzhin
cb951c2116 Add metrics for MergeTree parts types
- PartsWide
- PartsCompact
- PartsInMemory
2021-01-21 21:17:00 +03:00
Anton Popov
ac3de63a71 fix several cases, while reading subcolumns 2021-01-21 15:34:11 +03:00
Anton Ivashkin
357d98eb36 Merge master 2021-01-20 12:23:03 +03:00
Anton Ivashkin
eba98b04b0 Zero copy replication over S3: Hybrid storage support 2021-01-18 19:16:45 +03:00
Alexey Milovidov
029302d766 Merge with master 2021-01-16 17:09:44 +03:00
Alexey Milovidov
24c8e53440 Merge branch 'master' into multiple-nested 2021-01-16 16:28:40 +03:00
alexey-milovidov
2e2988e5d8
Merge pull request #19146 from azat/server-memory-limit-blocking
MemoryTracker: Do not ignore server memory limits during blocking by default
2021-01-16 11:09:19 +03:00
alexey-milovidov
5f189c5756
Merge pull request #19122 from ClickHouse/data-part-better-code
Add metrics for part number in MergeTree in ClickHouse
2021-01-16 00:20:15 +03:00
Azat Khuzhin
61b2d0ce42 MemoryTracker: Do not ignore server memory limits during blocking by default 2021-01-15 22:46:58 +03:00
alexey-milovidov
b97beea22a
Merge pull request #19101 from ClickHouse/check_compression_codec_read
Fix compression codec read for empty files
2021-01-15 20:55:58 +03:00
Alexey Milovidov
e238fd64ac Add part metrics 2021-01-15 15:28:53 +03:00
Alexey Milovidov
6a2a5e53ed Slightly better code of IMergeTreeDataPart #18955 2021-01-15 15:15:13 +03:00
alesapin
e106df2ad0 Fix comment 2021-01-15 12:10:03 +03:00
alesapin
0662d6bd7d Fix compression codec read for empty files 2021-01-15 12:04:23 +03:00
Alexey Milovidov
8276a1c8d2 Faster parts removal, more safe and efficient interface of IDisk 2021-01-14 19:24:13 +03:00
Anton Popov
0e903552a0 fix TTLs with WHERE 2021-01-13 17:04:27 +03:00
Anton Popov
d7200ee2ed minor changes 2021-01-13 02:20:32 +03:00
Anton Popov
15ead18673 Merge remote-tracking branch 'upstream/master' into HEAD 2021-01-12 19:46:10 +03:00
Anton Popov
5822ee1f01 allow multiple rows TTL with WHERE expression 2021-01-12 02:07:21 +03:00
Anton Popov
36ae0e4d35 Merge remote-tracking branch 'upstream/master' into HEAD 2021-01-11 13:51:12 +03:00
Azat Khuzhin
b1f08f5c27 Rename FileSyncGuard to DirectorySyncGuard 2021-01-07 20:26:18 +03:00
Azat Khuzhin
513a824f30 Fix fsync_part_directory for parts renames 2021-01-07 19:30:25 +03:00
Alexey Milovidov
36e1361cf8 Miscellaneous 2021-01-07 15:29:34 +03:00
Anton Popov
11283e3d81 Merge remote-tracking branch 'upstream/master' into HEAD 2020-12-25 21:25:59 +03:00
Anton Popov
b60c00ba74 refactoring of TTL stream 2020-12-25 18:46:13 +03:00
alesapin
54455b4740 Add test for already working code 2020-12-23 14:53:49 +03:00
Anton Popov
a42b00c9aa Merge remote-tracking branch 'upstream/master' into HEAD 2020-12-17 20:43:23 +03:00
Anton Ivashkin
0f0500ca0c Merge master 2020-12-16 18:31:13 +03:00
alesapin
b307e545a9 Fix check 2020-12-09 14:46:04 +03:00
alesapin
0f4056fd95 Add additional size check in debug mode 2020-12-09 14:23:37 +03:00
Anton Popov
66e0add2ba fix nested 2020-12-07 16:35:12 +03:00
Anton Popov
b384beb564 Merge remote-tracking branch 'upstream/master' into HEAD 2020-11-23 17:46:51 +03:00
Nicolae Vartolomei
040aba9f85 Add uuid.txt to checksums for parts stored on disk
We are breaking backwards compatibility anyway (but gated by a setting)
2020-11-20 13:49:17 +00:00
Nicolae Vartolomei
425dc4b11b Add unique identifiers to IMergeTreeDataPart structure
For now uuids are not generated at all, they are present only if the
part is updated manually (as you can see in the integration test).

The only place where they can be seen today by an end user is in
`system.parts` table. I looked for a way to hide this column behind an
option but couldn't find an easy one.

Likely this is also required for WAL, but need to think how not to break
compatibility.

Relates to #13574, https://github.com/ClickHouse/ClickHouse/issues/13574

Next 1: In the upcoming PR the plan is to integrate de-duplication based on
these fingerprints in the query pipeline.

Next 2: We'll enable automatic generation of uuids and come up with a
way for conditionally sending uuids when processing distributed queries
only when part movement is in progress.
2020-11-19 13:14:25 +00:00
Anton Popov
245c395a68 Merge remote-tracking branch 'upstream/master' into HEAD 2020-11-06 22:00:32 +03:00
Anton Ivashkin
1742fb3256 Merge master 2020-11-03 12:27:16 +03:00
Anton Ivashkin
78021714f1 S3 zero copy replication: simpler S3 check 2020-11-03 12:20:26 +03:00
Mikhail Filimonov
41971e073a
Fix typos reported by codespell 2020-10-27 12:04:03 +01:00
Anton Popov
a249f0c95e Merge remote-tracking branch 'upstream/master' into HEAD 2020-10-23 22:05:00 +03:00