Commit Graph

1434 Commits

Author SHA1 Message Date
Aleksei Semiglazov
921518db0a CLICKHOUSE-606: query deduplication based on parts' UUID
* add the query data deduplication excluding duplicated parts in MergeTree family engines.

query deduplication is based on parts' UUID which should be enabled first with merge_tree setting
assign_part_uuids=1

The allow_experimental_query_deduplication setting enables part deduplication; it defaults to false.

data part UUID is a mechanism of giving a data part a unique identifier.
Having UUID and deduplication mechanism provides a potential of moving parts
between shards preserving data consistency on a read path:
duplicated UUIDs will cause the root executor to retry the query against one of the replicas, explicitly
asking it to exclude the duplicated fingerprints encountered during distributed query execution.

NOTE: this implementation doesn't provide any knobs to lock a part and hence its UUID. Any mutation/merge will
update the part's UUID.

* add _part_uuid virtual column, allowing to use UUIDs in predicates.

Signed-off-by: Aleksei Semiglazov <asemiglazov@cloudflare.com>

address comments
2021-02-02 16:53:39 +00:00
Nikolai Kochetov
b2d7790794 Fix test. 2021-02-02 17:30:54 +03:00
Maksim Kita
d0151de4bb
Merge pull request #19608 from kreuzerkrieg/Add_IStoragePolicy_interface
Add IStoragePolicy interface
2021-02-02 11:03:20 +03:00
Pavel Kruglov
a3f1b825cc Fix build 2021-02-01 21:17:12 +03:00
alexey-milovidov
6b2f0435c8
Merge pull request #19375 from Avogar/select-final
Improve do_not_merge_across_partitions_select_final optimization.
2021-02-01 20:31:08 +03:00
alesapin
2aa8a6304b
Merge pull request #15450 from CurtizJ/fix-ttl-group-by
Fix some cases of TTL expressions
2021-02-01 16:48:07 +03:00
Nikolai Kochetov
19e4a33f9d
Merge pull request #19544 from amosbird/limitconcurrency
Per MergeTree table query limit
2021-02-01 16:09:12 +03:00
alesapin
d9598c47b4
Merge pull request #19702 from ClickHouse/fix_rare_bug_after_part_corruption
Fix incorrect virtual_parts after part corruption
2021-02-01 10:09:48 +03:00
alexey-milovidov
d19feb724b
Merge pull request #19799 from CurtizJ/fix-uint8-filtering
Fix filtering by Uint8 greater than 127
2021-01-29 21:34:42 +03:00
Pavel Kruglov
78371e15dc Update test, reduce num_threads_for_lonely_parts if data is small 2021-01-29 21:00:08 +03:00
alesapin
f4236fd765 Fix style 2021-01-29 20:12:53 +03:00
alesapin
c373d92a80 Less strict check 2021-01-29 18:50:08 +03:00
Pavel Kruglov
71f4acd48b Use one pool for lonely parts, update tests 2021-01-29 17:30:14 +03:00
Pavel Kruglov
a437ee4e31 Merge branch 'master' of github.com:ClickHouse/ClickHouse into select-final 2021-01-29 14:25:47 +03:00
Kruglov Pavel
caef103837
Merge branch 'master' into Add_IStoragePolicy_interface 2021-01-29 14:00:12 +03:00
Anton Popov
031132038b fix filtering by uint8 greater than 127 2021-01-29 10:39:18 +03:00
alexey-milovidov
fa7a795c3a
Merge pull request #19588 from ClickHouse/disable-checksums-on-read
Allow to disable checksums on read
2021-01-29 02:37:26 +03:00
alesapin
2881c830e3 Merge branch 'master' into fix_rare_bug_after_part_corruption 2021-01-28 23:16:52 +03:00
alesapin
be1104d6c1 Fix typos 2021-01-28 19:07:47 +03:00
alesapin
c1d18fc8a2 Fix very rare case 2021-01-28 18:10:53 +03:00
alesapin
a7ca26fec4
Merge pull request #19708 from ClickHouse/fix_flaky_test_multiple_ttl
Fix flaky test test_concurrent_ttl_merges
2021-01-28 15:26:56 +03:00
alesapin
d14f4a525b Merge branch 'master' into fix_rare_bug_after_part_corruption 2021-01-28 11:09:41 +03:00
alesapin
402e031a22 Throw exception only in debug mode 2021-01-28 11:07:18 +03:00
Alexey Milovidov
065d22cf2b Merge branch 'master' into disable-checksums-on-read 2021-01-28 03:07:31 +03:00
Anton Popov
e5125b8c73 add comments and test for compatibility 2021-01-28 02:39:15 +03:00
alesapin
c74631c650 Fix test and add logical error 2021-01-27 21:54:05 +03:00
Anton Popov
a8f3078ce9 Merge remote-tracking branch 'upstream/master' into HEAD 2021-01-27 19:48:55 +03:00
alesapin
5622e6daa6 Fix rare max_number_of_merges_with_ttl_in_pool limit overrun for non-replicated MergeTree 2021-01-27 14:56:12 +03:00
alesapin
6fc39b10d3 Spelling 2021-01-27 13:11:48 +03:00
alesapin
01c8b9e1b1 Fix rare bug when some replicated operations (like mutation) cannot process some parts after data corruption 2021-01-27 13:07:18 +03:00
Alexey Milovidov
48b4d98b21 Amend 2021-01-27 04:48:41 +03:00
Alexey Milovidov
a6f5f40a65 Merge branch 'master' into disable-checksums-on-read 2021-01-27 04:09:00 +03:00
Anton Popov
c7070da85a better abstractions in disk interface 2021-01-26 17:49:35 +03:00
kreuzerkrieg
29a2ef3089 Add IStoragePolicy interface 2021-01-26 10:55:28 +02:00
tavplubix
2d6a71fced
Merge pull request #19443 from ClickHouse/fix_virtual_parts_in_parts_to_do
Addition to #15537
2021-01-26 11:27:14 +03:00
Amos Bird
66fe97d8bd
Per MergeTree table query limit 2021-01-26 14:03:31 +08:00
Alexey Milovidov
8dfa933028 Amend 2021-01-25 23:48:10 +03:00
Alexey Milovidov
9ee5c1535e Allow to disable checksums on read 2021-01-25 23:29:04 +03:00
Anton Popov
658f24dcff
Merge pull request #19358 from CurtizJ/fix-subcolumns
Fix several cases, while reading subcolumns
2021-01-25 20:26:07 +03:00
tavplubix
5f07bfb9f8
Update ReplicatedMergeTreeQueue.cpp 2021-01-25 16:15:47 +03:00
Alexander Tokmakov
3bd4d97353 Merge branch 'master' into database_replicated 2021-01-25 14:19:04 +03:00
tavplubix
a88a564aae
Update ReplicatedMergeTreeQueue.cpp 2021-01-25 12:51:06 +03:00
alexey-milovidov
73501102f3
Merge pull request #19528 from azat/merge_tree_min_for_concurrent_read-SIGSEGV
Fix SIGSEGV with merge_tree_min_rows_for_concurrent_read/merge_tree_min_bytes_for_concurrent_read=0/UINT64_MAX
2021-01-25 00:18:42 +03:00
Azat Khuzhin
1c364b6ee3 Fix SIGSEGV with merge_tree_min_rows_for_concurrent_read/merge_tree_min_bytes_for_concurrent_read=0/UINT64_MAX
With a value of 0 or a too-large value, it would try to read non-existent marks
and fail with:

    Logical error: 'Trying to get non existing mark 11936128518282651045, while size is 2'.
2021-01-24 14:39:57 +03:00
Alexey Milovidov
7d2108d4e9 Fix double whitespace 2021-01-23 02:57:35 +03:00
Alexander Tokmakov
963dadae54 fix wrong parts in parts_to_do 2021-01-22 19:07:19 +03:00
alexey-milovidov
9818bd319a
Merge pull request #19381 from azat/parts-types-metrics
Add metrics for MergeTree parts (Wide/Compact/InMemory) types
2021-01-22 17:24:50 +03:00
alexey-milovidov
4e11e7cfa9
Merge pull request #18772 from azat/optimize-memory-tracking-fix
[RFC] Fix memory tracking for OPTIMIZE TABLE/merges
2021-01-22 03:48:06 +03:00
Azat Khuzhin
cb951c2116 Add metrics for MergeTree parts types
- PartsWide
- PartsCompact
- PartsInMemory
2021-01-21 21:17:00 +03:00
Pavel Kruglov
900580af02 Add parallel select when there is one part with level>0 in select final 2021-01-21 20:34:50 +03:00
Anton Popov
ac3de63a71 fix several cases, while reading subcolumns 2021-01-21 15:34:11 +03:00
Alexander Tokmakov
7f97a11c84 Merge branch 'master' into database_replicated 2021-01-18 17:09:39 +03:00
Pavel Kovalenko
1e3a059f64 Merge remote-tracking branch 'origin/master' into disk-s3-backup-restore-metadata
# Conflicts:
#	src/Disks/DiskCacheWrapper.cpp
#	src/Disks/S3/DiskS3.cpp
2021-01-18 13:39:49 +03:00
Alexey Milovidov
029302d766 Merge with master 2021-01-16 17:09:44 +03:00
Alexey Milovidov
24c8e53440 Merge branch 'master' into multiple-nested 2021-01-16 16:28:40 +03:00
alexey-milovidov
4efc7a7dc3
Update MergeTreeIOSettings.h 2021-01-16 13:07:58 +03:00
alexey-milovidov
2e2988e5d8
Merge pull request #19146 from azat/server-memory-limit-blocking
MemoryTracker: Do not ignore server memory limits during blocking by default
2021-01-16 11:09:19 +03:00
alesapin
67fd381034
Merge pull request #19123 from ClickHouse/additional_check_in_writer
Fix max granules size in MergeTreeDataWriter
2021-01-16 10:17:44 +03:00
alexey-milovidov
5f189c5756
Merge pull request #19122 from ClickHouse/data-part-better-code
Add metrics for part number in MergeTree in ClickHouse
2021-01-16 00:20:15 +03:00
alexey-milovidov
e09ec9fed5
Merge pull request #18373 from amosbird/fix-18364
Fix 2-arg functions with constant in PK analysis
2021-01-16 00:20:03 +03:00
Azat Khuzhin
61b2d0ce42 MemoryTracker: Do not ignore server memory limits during blocking by default 2021-01-15 22:46:58 +03:00
alesapin
184dbedb06 Fix stupid error 2021-01-15 21:50:30 +03:00
alexey-milovidov
b97beea22a
Merge pull request #19101 from ClickHouse/check_compression_codec_read
Fix compression codec read for empty files
2021-01-15 20:55:58 +03:00
alexey-milovidov
971ff2ee0a
Merge pull request #19086 from ClickHouse/faster-parts-removal
Faster parts removal, more safe and efficient interface of IDisk
2021-01-15 20:37:35 +03:00
alesapin
8ccaa6ede9 Additional check for huge granules in MergeTreeDataWriter 2021-01-15 15:40:37 +03:00
Alexey Milovidov
e238fd64ac Add part metrics 2021-01-15 15:28:53 +03:00
Alexey Milovidov
6a2a5e53ed Slightly better code of IMergeTreeDataPart #18955 2021-01-15 15:15:13 +03:00
alexey-milovidov
78fff6bc39
Merge branch 'master' into multiple-nested 2021-01-15 14:54:27 +03:00
Alexey Milovidov
4bae04d500 Merge branch 'master' into amosbird/fix-18364 2021-01-15 14:37:35 +03:00
alexey-milovidov
4a71971b43
Update KeyCondition.cpp 2021-01-15 14:36:07 +03:00
alexey-milovidov
8d58ce532a
Merge pull request #19064 from CurtizJ/restrict-modify-ttl
Restrict MODIFY TTL for tables created in old syntax
2021-01-15 14:09:47 +03:00
Alexey Milovidov
d553e46a06 Merge branch 'master' into faster-parts-removal 2021-01-15 13:25:20 +03:00
alesapin
d601faa669
Merge pull request #18935 from fastio/bugfix_attach_partition_does_not_reset_mutation
Bugfix: attach partition should reset the mutation
2021-01-15 12:21:16 +03:00
alesapin
e106df2ad0 Fix comment 2021-01-15 12:10:03 +03:00
alesapin
0662d6bd7d Fix compression codec read for empty files 2021-01-15 12:04:23 +03:00
Alexey Milovidov
8276a1c8d2 Faster parts removal, more safe and efficient interface of IDisk 2021-01-14 19:24:13 +03:00
Anton Popov
ac426c3da6 restrict MODIFY TTL for tables created in old syntax 2021-01-14 15:32:20 +03:00
Anton Popov
0e903552a0 fix TTLs with WHERE 2021-01-13 17:04:27 +03:00
ygrek
8f2a830d83
add zstd long range option (#17184)
* add zstd long compression option

* tests: add zstd long read-write test

Co-authored-by: Joris Giovannangeli <joris.giovannangeli@ahrefs.com>
Co-authored-by: ip <igor@ahrefs.com>
2021-01-13 16:22:59 +03:00
alesapin
73e536a074
Merge pull request #18928 from ClickHouse/more_checks_in_writer_wide
More checks in merge tree writer wide
2021-01-13 09:59:45 +03:00
Anton Popov
d7200ee2ed minor changes 2021-01-13 02:20:32 +03:00
Pavel Kovalenko
b09862b7b9 Ability to backup-restore metadata files for DiskS3 (fixes and tests) 2021-01-12 20:18:40 +03:00
Anton Popov
15ead18673 Merge remote-tracking branch 'upstream/master' into HEAD 2021-01-12 19:46:10 +03:00
Anton Popov
60b88986bf minor changes near TTL computation 2021-01-12 19:42:49 +03:00
Anton Popov
58b9ef5a10 fix TTL info serialization 2021-01-12 17:14:47 +03:00
alexey-milovidov
2c71b997de
Merge pull request #18464 from hexiaoting/dev_fp_as_pk
Disallow floating point as partition key
2021-01-12 13:08:41 +03:00
alesapin
f3e55183ad Better test and check 2021-01-12 11:46:31 +03:00
alesapin
5ae31b8068 Slightly relax check 2021-01-12 10:59:14 +03:00
Anton Popov
5822ee1f01 allow multiple rows TTL with WHERE expression 2021-01-12 02:07:21 +03:00
Azat Khuzhin
05608687d6 Get back memory tracking blocker in calculateAndSerializePrimaryIndex()
But reduce scope, to avoid leaking too much memory, since there are old
values in last_block_index_columns.

The scope of the MemoryTracker::BlockerInThread has been increased in #8290
2021-01-12 01:00:37 +03:00
Azat Khuzhin
82edbfb581 Account query memory limits and sampling for OPTIMIZE TABLE/merges 2021-01-12 00:49:41 +03:00
Azat Khuzhin
bd05d9db2f Fix memory tracking for OPTIMIZE TABLE queries
Because of BlockerInThread in
MergeTreeDataPartWriterOnDisk::calculateAndSerializePrimaryIndex memory
was accounted incorrectly and grew constantly.

And IIUC there is no need for that blocker, since INSERT SELECT shares
the same thread group.
2021-01-12 00:49:40 +03:00
Pavel Kovalenko
0856b2c514 Ability to backup-restore metadata files for DiskS3 (fixes and tests) 2021-01-11 20:37:08 +03:00
fastio
8dde70b937 Bugfix: attach partition should reset the mutation 2021-01-11 21:26:43 +08:00
alesapin
c5df8f324c More checks in writer wide 2021-01-11 15:03:00 +03:00
Anton Popov
36ae0e4d35 Merge remote-tracking branch 'upstream/master' into HEAD 2021-01-11 13:51:12 +03:00
Nikolai Kochetov
ee094ed7fd
Merge pull request #18896 from amosbird/keyconditionalias
correct index analysis of WITH aliases
2021-01-11 11:13:28 +03:00
alexey-milovidov
571e37188a
Merge pull request #18886 from ClickHouse/remove-useless-code-4
Remove useless code
2021-01-10 13:13:17 +03:00
Amos Bird
44758935df
correct index analysis of WITH aliases 2021-01-10 17:40:47 +08:00
Alexey Milovidov
c38dca155c Fix clang-tidy 2021-01-10 05:51:54 +03:00