Commit Graph

216 Commits

Author SHA1 Message Date
Amos Bird
264cff6415
Projections
TODO (suggested by Nikolai)

1. Build query plan fro current query (inside storage::read) up to WithMergableState
2. Check, that plan is simple enough: Aggregating - Expression - Filter - ReadFromStorage (or simplier)
3. Check, that filter is the same as filter in projection, and also expression calculates the same aggregation keys as in projection
4. Return WithMergableState if projection applies

3 will be easier to do with ActionsDAG, cause it sees all functions, and dependencies are direct (but it is possible with ExpressionActions also)

Also need to figure out how prewhere works for projections, and
row_filter_policies.

wip
2021-05-11 18:12:23 +08:00
Ivan
495c6e03aa
Replace all Context references with std::weak_ptr (#22297)
* Replace all Context references with std::weak_ptr

* Fix shared context captured by value

* Fix build

* Fix Context with named sessions

* Fix copy context

* Fix gcc build

* Merge with master and fix build

* Fix gcc-9 build
2021-04-11 02:33:54 +03:00
Anton Popov
173d2ea1f4 Merge remote-tracking branch 'upstream/master' into HEAD 2021-03-16 02:50:14 +03:00
alesapin
06eb2d8dfd
Merge branch 'master' into s3_zero_copy_replication 2021-03-13 22:32:54 +03:00
Azat Khuzhin
69b2b2a159 Fix fsync_part_directory for horizontal merge 2021-03-11 21:41:27 +03:00
Anton Popov
df6663dcb6 refactoring of serializations 2021-03-09 20:02:26 +03:00
Anton Popov
bc417cf54a refactoring of serializations 2021-03-09 17:46:52 +03:00
Anton Ivashkin
e69124a0a6 Merge master 2021-03-04 13:26:40 +03:00
alesapin
9ebf1b4fad Get rid of separate minmax index fields 2021-03-02 13:33:54 +03:00
Anton Ivashkin
c891cf4557 Fixes by review response 2021-02-26 12:48:57 +03:00
Anton Ivashkin
e64c63c611 Merge master 2021-02-05 20:10:06 +03:00
Pavel Kovalenko
b7151aa754 Merge remote-tracking branch 'origin/master' into disk-s3-backup-restore-metadata
# Conflicts:
#	src/Disks/DiskDecorator.h
#	src/Disks/IDisk.h
#	src/Disks/S3/DiskS3.cpp
2021-02-03 14:22:18 +03:00
Kruglov Pavel
caef103837
Merge branch 'master' into Add_IStoragePolicy_interface 2021-01-29 14:00:12 +03:00
Anton Popov
c7070da85a better abstractions in disk interface 2021-01-26 17:49:35 +03:00
kreuzerkrieg
29a2ef3089 Add IStoragePolicy interface 2021-01-26 10:55:28 +02:00
Anton Ivashkin
357d98eb36 Merge master 2021-01-20 12:23:03 +03:00
Anton Ivashkin
eba98b04b0 Zero copy replication over S3: Hybrid storage support 2021-01-18 19:16:45 +03:00
Pavel Kovalenko
1e3a059f64 Merge remote-tracking branch 'origin/master' into disk-s3-backup-restore-metadata
# Conflicts:
#	src/Disks/DiskCacheWrapper.cpp
#	src/Disks/S3/DiskS3.cpp
2021-01-18 13:39:49 +03:00
Alexey Milovidov
24c8e53440 Merge branch 'master' into multiple-nested 2021-01-16 16:28:40 +03:00
Alexey Milovidov
8276a1c8d2 Faster parts removal, more safe and efficient interface of IDisk 2021-01-14 19:24:13 +03:00
Pavel Kovalenko
b09862b7b9 Ability to backup-restore metadata files for DiskS3 (fixes and tests) 2021-01-12 20:18:40 +03:00
Anton Popov
36ae0e4d35 Merge remote-tracking branch 'upstream/master' into HEAD 2021-01-11 13:51:12 +03:00
Alexey Milovidov
6eb5a5f4d9 Remove useless code 2021-01-10 03:28:59 +03:00
Azat Khuzhin
b1f08f5c27 Rename FileSyncGuard to DirectorySyncGuard 2021-01-07 20:26:18 +03:00
Anton Popov
11283e3d81 Merge remote-tracking branch 'upstream/master' into HEAD 2020-12-25 21:25:59 +03:00
alexey-milovidov
491f481713
Merge pull request #18481 from ClickHouse/disable_write_with_aio
Disable write with AIO even for big merges
2020-12-25 04:24:43 +03:00
alesapin
5d94b7eec0 Remove non working code 2020-12-24 19:30:53 +03:00
Anton Popov
518cac6150 fix for review 2020-12-23 13:11:24 +03:00
Anton Popov
78c4b9f2bc do not allow merge from wide to compact parts 2020-12-23 03:16:15 +03:00
Anton Popov
b6ff6300b2 Merge remote-tracking branch 'upstream/master' into HEAD 2020-12-22 18:06:21 +03:00
Alexey Milovidov
9be5fa9ef2 Merge branch 'master' into Enmk-Optimize_deduplicate 2020-12-20 09:57:10 +03:00
Anton Popov
a42b00c9aa Merge remote-tracking branch 'upstream/master' into HEAD 2020-12-17 20:43:23 +03:00
alesapin
849db47f8a Better exception messages 2020-12-16 17:31:17 +03:00
Vasily Nemkov
59fc301344 Fixed test to be less flaky
Also logging expanded list of columns passed from `DEDUPLICATE BY` to actual deduplication routines.
2020-12-08 19:44:34 +03:00
Anton Popov
d7aad3bf79 Merge remote-tracking branch 'upstream/master' into HEAD 2020-12-07 19:13:01 +03:00
Vasily Nemkov
70ea507dae OPTIMIZE DEDUPLICATE BY columns
Extended OPTIMIZE ... DEDUPLICATE syntax to allow explicit (or implicit with asterisk/column transformers) list of columns to check for duplicates on.

Following syntax variants are now supported:

OPTIMIZE TABLE table DEDUPLICATE; -- the old one
OPTIMIZE TABLE table DEDUPLICATE BY *;
OPTIMIZE TABLE table DEDUPLICATE BY * EXCEPT colX;
OPTIMIZE TABLE table DEDUPLICATE BY * EXCEPT (colX, colY);
OPTIMIZE TABLE table DEDUPLICATE BY col1,col2,col3;
OPTIMIZE TABLE table DEDUPLICATE BY COLUMNS('column-matched-by-regex');
OPTIMIZE TABLE table DEDUPLICATE BY COLUMNS('column-matched-by-regex') EXCEPT colX;
OPTIMIZE TABLE table DEDUPLICATE BY COLUMNS('column-matched-by-regex') EXCEPT (colX, colY);

Note that * behaves just like in SELECT: MATERIALIZED, and ALIAS columns are not used for expansion.
Also, it is an error to specify empty list of columns, or write an expression that results in an empty list of columns, or deduplicate by an ALIAS column.
Column transformers other than EXCEPT are not supported.
2020-12-07 09:44:07 +03:00
Pavel Kruglov
9dbced0474 Pass setting instead of context 2020-12-04 17:01:59 +03:00
Anton Popov
cd1917c7a6
Merge branch 'master' into optimize_final_optimization 2020-12-03 16:52:51 +03:00
Anton Popov
b384beb564 Merge remote-tracking branch 'upstream/master' into HEAD 2020-11-23 17:46:51 +03:00
Pavel Kruglov
ca3fe49a2a Make setting global 2020-11-20 17:29:13 +03:00
Nicolae Vartolomei
040aba9f85 Add uuid.txt to checksums for parts stored on disk
We are breaking backwards compatibility anyway (but agted by a setting)
2020-11-20 13:49:17 +00:00
Nicolae Vartolomei
94293ca3ce Assign UUIDs to parts only when configured to do so
Avoid breaking backwards compatibility by default for now.
2020-11-20 13:49:17 +00:00
Pavel Kruglov
4c30857759 Minor change 2020-11-20 01:22:40 +03:00
Nicolae Vartolomei
746f8e45f5 All new parts must have uuids 2020-11-19 13:18:03 +00:00
Nicolae Vartolomei
425dc4b11b Add unique identifiers IMergeTreeDataPart structure
For now uuids are not generated at all, they are present only if the
part is updated manually (as you can see in the integration test).

The only place where they can be seen today by an end user is in
`system.parts` table. I was looking for hiding this column behind an
option but couldn't find an easy way to do that.

Likely this is also required for WAL, but need to think how not to break
compatibility.

Relates to #13574, https://github.com/ClickHouse/ClickHouse/issues/13574

Next 1: In the upcoming PR the plan is to integrate de-duplication based on
these fingerprints in the query pipeline.

Next 2: We'll enable automatic generation of uuids and come up with a
way for conditionally sending uuids when processing distributed queries
only when part movement is in progress.
2020-11-19 13:14:25 +00:00
Pavel Kruglov
c648f62629 Merge branch 'master' of github.com:ClickHouse/ClickHouse into optimize_final_optimization 2020-11-10 22:58:21 +03:00
Pavel Kruglov
9120189d8a Add SelectPartsDecision enum class 2020-11-10 17:42:56 +03:00
Anton Popov
245c395a68 Merge remote-tracking branch 'upstream/master' into HEAD 2020-11-06 22:00:32 +03:00
alesapin
40fc512e79 Merge branch 'master' into no_background_pool_no_more 2020-10-29 12:53:34 +03:00
alesapin
8e8bdeb5d7
Merge pull request #16434 from ClickHouse/fix_fake_race_on_merges_list
Fix fake race condition on system.merges merge_algorithm
2020-10-28 15:51:53 +03:00