Commit Graph

215 Commits

Author SHA1 Message Date
Alexander Gololobov
f324ca9921 Cleanups 2022-07-18 20:07:22 +02:00
Alexander Gololobov
9de72d995a POC lightweight delete using __row_exists virtual column and prewhere-like filtering 2022-07-18 20:06:42 +02:00
jianmei zhang
197d3796ec Killed mutation has empty commands may cause SegV 2022-07-15 15:37:36 +08:00
jianmei zhang
ca42f649da Rewrite logic for loading deleted mask related to getDeletedMask() 2022-07-15 15:31:10 +08:00
jianmei zhang
d37152a5d6 Remove loadDeletedMask() and get deleted mask when needed 2022-07-15 12:32:42 +08:00
jianmei zhang
4268ac8b0e Add supportLightweightDeleteMutate() in IMergeTreeDataPart to disable LWD for part with projections 2022-07-15 12:32:42 +08:00
jianmei zhang
780cdfb8f0 Code changes based on review opinions. Make ordinary single alter delete lightwight by default 2022-07-15 12:32:42 +08:00
jianmei zhang
6e6f77ef8a Fix compile error in clang tidy build 2022-07-15 12:32:41 +08:00
jianmei zhang
7e433859ea Change deleted rows mask from String to Native UInt8 format 2022-07-15 12:32:41 +08:00
jianmei zhang
dcc7367ac4 Fix code style error 2022-07-15 12:32:41 +08:00
jianmei zhang
6248486790 Fix lightweight delete bugs: add skip files and use source part's columns to avoid metadata updated cases 2022-07-15 12:32:41 +08:00
jianmei zhang
8ad2bb7c33 Code changes due to master new fixes, and update reference for mutations table 2022-07-15 12:32:41 +08:00
jianmei zhang
9d27af7ee2 For some columns mutations, skip to apply deleted mask when read some columns. Also add unit test case 2022-07-15 12:32:41 +08:00
jianmei zhang
b4a37e1e22 Disable optimizations for count() when lightweight delete exists, add hasLightweightDelete() function in IMergeTreeDataPart 2022-07-15 12:32:41 +08:00
jianmei zhang
8696319d62 Support lightweight delete execution using string as deleted rows mask,also part of select can handle LWD 2022-07-15 12:32:41 +08:00
kssenii
b52084265c Merge master 2022-07-04 21:37:43 +02:00
alesapin
6429b72371 Fixes 2022-06-28 14:41:22 +02:00
alesapin
0a3fab1cb6 Some sad changes 2022-06-28 12:51:49 +02:00
alesapin
d88cdcd5c1 AAA 2022-06-27 21:41:29 +02:00
kssenii
e40d9bcf55 Merge master 2022-06-22 23:28:52 +02:00
Nikolai Kochetov
854d148d73 Add some comments. Remove some commented code. 2022-06-21 12:31:02 +00:00
Nikolai Kochetov
dccf90b1ea Cleanup. 2022-06-20 18:18:17 +00:00
Nikolai Kochetov
ef8b967c3f Fix more tests. 2022-06-20 13:07:14 +00:00
kssenii
aaf9a46f07 Merge master 2022-06-11 15:56:05 +02:00
Nikolai Kochetov
678d978acf Merge branch 'master' into refactor-something-in-part-volumes 2022-06-07 15:23:00 +00:00
Anton Popov
5cf2c67410
Merge branch 'master' into fix-mutations-again 2022-06-03 14:17:05 +02:00
Nikolai Kochetov
89c5855d20 Merge branch 'master' into refactor-something-in-part-volumes 2022-06-02 12:19:07 +02:00
Anton Popov
cae2478b3f Revert "Merge pull request #37355 from ClickHouse/revert-37266-fix-mutations-with-object"
This reverts commit a53cfa9fca, reversing
changes made to 9acb42fcdb.
2022-06-01 10:57:20 +00:00
Nikolai Kochetov
6d4a26afac Update ReadProgressCallback. 2022-05-25 19:45:48 +00:00
Nikolai Kochetov
1b85f2c1d6 Merge branch 'master' into refactor-read-metrics-and-callbacks 2022-05-25 16:27:40 +02:00
Nikolai Kochetov
3d84aae0ab Better. 2022-05-24 20:06:08 +00:00
Nikolai Kochetov
56feef01e7 Move some resources 2022-05-20 19:49:31 +00:00
Alexander Tokmakov
f787dc7097
Revert "Fix mutations in tables with columns of type Object" 2022-05-19 13:24:48 +03:00
kssenii
5e3410f60a Better 2022-05-16 22:09:11 +02:00
Anton Popov
b6c5ab4fcf fix mutations in tables with columns of type Object 2022-05-16 18:26:53 +00:00
Alexey Milovidov
fdbb5b75b2
Merge branch 'master' into minor-renames 2022-05-07 14:18:50 +03:00
Nikolai Kochetov
d6735f06c7 Fixing spaces. 2022-05-05 09:24:34 +00:00
Nikolai Kochetov
35095191eb Merge branch 'master' into refactor-something-in-part-volumes 2022-05-03 17:51:47 +02:00
Nikolai Kochetov
4a20e4f37e Fixing build. 2022-05-03 15:48:05 +00:00
Robert Schulze
330212e0f4
Remove inherited create() method + disallow copying
The original motivation for this commit was that shared_ptr_helper used
std::shared_ptr<>() which does two heap allocations instead of
make_shared<>() which does a single allocation. Turned out that
1. the affected code (--> Storages/) is not on a hot path (rendering the
performance argument moot ...)
2. yet copying Storage objects is potentially dangerous and was
   previously allowed.

Hence, this change

- removes shared_ptr_helper and as a result all inherited create() methods,

- instead, Storage objects are now created using make_shared<>() by the
  caller (for that to work, many constructors had to be made public), and

- all Storage classes were marked as noncopyable using boost::noncopyable.

In sum, we are (likely) not making things faster but the code becomes
cleaner and harder to misuse.
2022-05-02 08:46:52 +02:00
Alexey Milovidov
1ddb04b992
Merge pull request #36715 from amosbird/refactorbase
Reorganize source files so that base won't depend on Common
2022-04-30 09:40:58 +03:00
Azat Khuzhin
bf312c2a5b Fix ALTER DROP COLUMN of nested column with compact parts
ALTER DROP COLUMN of nested column did not requires mutation before, and
so it leaves nested column as-is, and in case of compact parts
subsequent alter, that requires mutation, will trigger READ_COLUMN of
that nested column (because it exists in part), but it will fail because
there is no such column in the table already.

Here is example of such a failure on CI - [1].

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/35459/52099b23a1cb9a7ff036c5c60aa037c999b333ef/stateless_tests__thread__actions__[1/3].html

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-04-29 07:12:34 +03:00
Amos Bird
4a5e4274f0
base should not depend on Common 2022-04-29 10:26:35 +08:00
Nikolai Kochetov
e44af67fee Merge branch 'master' into refactor-something-in-part-volumes 2022-04-26 21:08:00 +02:00
Azat Khuzhin
8b544e26d3 Move some functions from MergeTreeDataMergerMutator to MutateTask
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-04-25 18:16:38 +03:00
Azat Khuzhin
8440fec3db Fix coding alignment in MutateTask
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-04-25 18:16:38 +03:00
Nikolai Kochetov
8c00692844 Part 8 2022-04-22 16:58:09 +00:00
Nikolai Kochetov
9133e398b8 Part 7 2022-04-21 19:19:13 +00:00
alesapin
5465415751 Fix replace/move partition with zero copy replication 2022-04-21 14:39:12 +02:00
alesapin
c14e2e0b96 Fix more 2022-04-20 21:08:26 +02:00
alesapin
40c15222f8 Merge branch 'master' into fix_trash 2022-04-20 12:45:49 +02:00
alesapin
aea7c7755a Fix style 2022-04-19 17:25:08 +02:00
alesapin
7cb7c120cc Less ugly 2022-04-19 15:53:10 +02:00
alesapin
cc06bc3d99 Add some clarifications 2022-04-19 14:01:30 +02:00
alesapin
bd7b3847c1 Some code 2022-04-19 01:09:09 +02:00
Robert Schulze
118e94523c
Activate clang-tidy warning "readability-container-contains"
This check suggests replacing <Container>.count() by
<Container>.contains() which is more speaking and in case of
multimaps/multisets also faster.
2022-04-18 23:53:11 +02:00
alesapin
5a8419a48e Remove more trash 2022-04-15 17:05:17 +02:00
alesapin
eb7593f786 Remove more trash 2022-04-15 16:24:38 +02:00
Alexander Tokmakov
49c35f3261 Merge branch 'master' into mvcc_prototype 2022-04-08 13:34:40 +02:00
kssenii
d2a3cfe5dc Cache on all write operations 2022-03-23 19:14:33 +01:00
Alexander Tokmakov
9e05b12d2c Merge branch 'master' into mvcc_prototype 2022-03-20 22:42:26 +01:00
Alexander Tokmakov
c2ac8d4a5c review fixes 2022-03-16 21:05:34 +01:00
Anton Popov
ccbddd53a3 fix mutations in tables with enabled sparse columns 2022-03-15 01:48:21 +00:00
Anton Popov
630182b2b1 minor renames 2022-03-14 14:42:09 +00:00
Alexander Tokmakov
711aad6953 fix 2022-02-24 01:31:21 +03:00
Alexander Tokmakov
aa6b9a2abc Merge branch 'master' into mvcc_prototype 2022-02-23 23:22:03 +03:00
Azat Khuzhin
fef5f146e7 Fix ENOENT with fsync_part_directory and Vertical merge
fsync of the temporary part directory is superfluous anyway, and besides
that directory is not exists at that time, that will lead to ENOENT
error:

    2022.02.18 17:02:51.634565 [ 35639 ] {} <Error> void DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::routine(DB::TaskRuntimeDataPtr) [Queue = DB::MergeMutateRuntimeQueue]: Code: 107. DB::ErrnoException: Cannot open file /var/lib/clickhouse/data/system/text_log/tmp_merge_202202_1864_3192_14/, errno: 2, strerror: No such file or directory. (FILE_DOESNT_EXIST), Stack trace (when copying this message, always include the lines below):

    0. DB::Exception::Exception() @ 0xb26ecfa in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
    1. DB::throwFromErrnoWithPath() @ 0xb2700ea in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
    2. DB::LocalDirectorySyncGuard::LocalDirectorySyncGuard() @ 0x14905531 in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
    3. DB::DiskLocal::getDirectorySyncGuard() const @ 0x148af3e3 in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
    4. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::prepare() @ 0x157bef13 in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug

Note, that IMergeTreeDataPart::renameTo() anyway will have fsync for the
directory.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-19 07:50:59 +03:00
Azat Khuzhin
65e9b4879d Fix possible memory_tracker use-after-free for merges/mutations
There are two possible cases for execution merges/mutations:
1) from background thread
2) from OPTIMIZE TABLE query

1) is pretty simple, it's memory tracking structure is as follow:

    current_thread::memory_tracker = level=Thread / description="(for thread)" ==
      background_thread_memory_tracker = level=Thread / description="(for thread)"
    current_thread::memory_tracker.parent = level=Global / description="(total)"

  So as you can see it is pretty simple and MemoryTrackerThreadSwitcher
  does not do anything icky for this case.

2) is complex, it's memory tracking structure is as follow:

    current_thread::memory_tracker = level=Thread / description="(for thread)"
    current_thread::memory_tracker.parent = level=Process / description="(for query)" ==
      background_thread_memory_tracker = level=Process / description="(for query)"

  Before this patch to track memory (and related things, like sampling,
  profiling and so on) for OPTIMIZE TABLE query dirty hacks was done to
  do this, since current_thread memory_tracker was of Thread scope, that
  does not have any limits.

  And so if will change parent for it to Merge/Mutate memory tracker
  (which also does not have some of settings) it will not be correctly
  tracked.

  To address this Merge/Mutate was set as parent not to the
  current_thread memory_tracker but to it's parent, since it's scope is
  Process with all settings.

  But that parent's memory_tracker is the memory_tracker of the
  thread_group, and so if you will have nested ThreadPool inside
  merge/mutate (this is the case for s3 async writes, which has been
  added in #33291) you may get use-after-free of memory_tracker.

  Consider the following example:

    MemoryTrackerThreadSwitcher()
      thread_group.memory_tracker.parent = merge_list_entry->memory_tracker
      (see also background_thread_memory_tracker above)

    CurrentThread::attachTo()
      current_thread.memory_tracker.parent = thread_group.memory_tracker

    CurrentThread::detachQuery()
      current_thread.memory_tracker.parent = thread_group.memory_tracker.parent
      # and this is equal to merge_list_entry->memory_tracker

    ~MemoryTrackerThreadSwitcher()
      thread_group.memory_tracker = thread_group.memory_tracker.parent

  So after the following we will get incorrect memory_tracker (from the
  mege_list_entry) when the next job in that ThreadPool will not have
  thread_group, since in this case it will not try to update the
  current_thread.memory_tracker.parent and use-after-free will happens.

So to address the (2) issue, settings from the parent memory_tracker
should be copied to the merge_list_entry->memory_tracker, to avoid
playing with parent memory tracker.

Note, that settings from the query (OPTIMIZE TABLE) is not available at
that time, so it cannot be used (instead of parent's memory tracker
settings).

v2: remove memory_tracker.setOrRaiseHardLimit() from settings

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-18 16:23:54 +03:00
Alexander Tokmakov
e37ef4560c fix 2022-02-15 18:00:45 +03:00
Alexander Tokmakov
ae5aa8c12d write part version before other files 2022-02-15 02:24:51 +03:00
Nikolai Kochetov
2a6eb593be
Revert "Revert "Add pool to WriteBufferFromS3"" 2022-02-01 13:36:51 +03:00
alexey-milovidov
095d9bfa43
Revert "Add pool to WriteBufferFromS3" 2022-02-01 05:49:40 +03:00
Nikolai Kochetov
9b2998c639 Review fixes. 2022-01-26 18:08:01 +00:00
Nikolai Kochetov
506ee8c024 Refactor some code. 2022-01-24 15:55:29 +00:00
Nikolai Kochetov
f74cf1e152 Merge branch 'master' into add-pool-to-s3-write-buffer 2022-01-21 21:55:42 +03:00
Nikolai Kochetov
b3cbb63487 Merge branch 'master' into add-pool-to-s3-write-buffer 2022-01-21 21:41:54 +03:00
Anton Popov
c2206c0b96 better storing serialization infos 2022-01-21 03:29:30 +03:00
Amos Bird
62441f0a0f
Fix mutation when table contains projections (#33679) 2022-01-19 15:27:11 +03:00
Nikolai Kochetov
6d49a62666 Some more async writes. 2022-01-14 19:53:55 +00:00
Nikolai Kochetov
28f2012b06 Add some more async writing. 2022-01-11 19:02:48 +00:00
Anton Popov
54f51444c0 Merge remote-tracking branch 'upstream/master' into HEAD 2021-12-01 15:49:02 +03:00
Vitaly Baranov
6634fcbac7 Rename Quota::ResourceType -> QuotaType and move it to Access/Common. 2021-11-19 00:14:23 +03:00
Anton Popov
f37d12d16e keep serialization infos after drops and renames 2021-11-03 23:29:48 +03:00
Anton Popov
9823f28855 fix nested 2021-11-02 06:03:52 +03:00
Anton Popov
d50137013c Merge remote-tracking branch 'upstream/master' into HEAD 2021-11-01 16:55:53 +03:00
Anton Popov
0099dfd523 refactoring of SerializationInfo 2021-10-29 20:21:02 +03:00
Nikita Mikhaylov
d004fd96ed
Merge pull request #30318 from azat/fix-shared-build
Move SquashingTransform to Interpreters (to fix split build)
2021-10-18 22:41:01 +03:00
Anton Popov
d71ffc355a Merge remote-tracking branch 'upstream/master' into HEAD 2021-10-18 15:18:22 +03:00
Anton Popov
71b474793c
Merge pull request #30174 from CurtizJ/better-interfaces
Better interfaces for `IDataType` and `ISerialization`
2021-10-18 12:07:04 +03:00
Azat Khuzhin
6d4af3bac1 Move SquashingTransform to Interpreters (to fix split build)
clickhouse_common_io requires clickhouse_core:

    ld.lld: error: undefined symbol: DB::blocksHaveEqualStructure(DB::Block const&, DB::Block const&)
    >>> referenced by SquashingTransform.cpp:92 (/src/ch/clickhouse/src/Common/SquashingTransform.cpp:92)
    >>>               src/CMakeFiles/clickhouse_common_io.dir/Common/SquashingTransform.cpp.o:(void DB::SquashingTransform::append<DB::Block&&>(DB::Block&&))
    >>> referenced by SquashingTransform.cpp:92 (/src/ch/clickhouse/src/Common/SquashingTransform.cpp:92)
    >>>               src/CMakeFiles/clickhouse_common_io.dir/Common/SquashingTransform.cpp.o:(void DB::SquashingTransform::append<DB::Block const&>(DB::Block const&))

while clickhouse_core requires clickhouse_common_io:

    "clickhouse_core" of type SHARED_LIBRARY
      depends on "roaring" (weak)
      depends on "clickhouse_common_io" (weak)
      depends on "clickhouse_common_config" (weak)
      depends on "clickhouse_common_zookeeper" (weak)
      depends on "clickhouse_dictionaries_embedded" (weak)
      depends on "clickhouse_parsers" (weak)

Follow-up for: #30247 (cc @KochetovNicolai)
2021-10-18 10:28:36 +03:00
Nikolai Kochetov
067eaadadd Merge branch 'master' into removing-data-streams-folder 2021-10-16 09:46:05 +03:00
Nikolai Kochetov
c668696047
Merge pull request #30171 from ClickHouse/remove-stream-interfaces
Remove stream interfaces
2021-10-16 09:34:01 +03:00
Nikolai Kochetov
fd14faeae2 Remove DataStreams folder. 2021-10-15 23:18:20 +03:00
Nikolai Kochetov
ad8a344b46 Move TTL streams and algo 2021-10-15 13:11:57 +03:00
Azat Khuzhin
fd38cbb0df Fix memory tracking for merges and mutations (by destroying earlier)
It fixes only some tiny allocations, and so it should not affect any
huge mutations/merges.

And plus, this should not be a real fix, since peak_memory_usage is
obtained before even destrying this objects, and destroing objects will
unlikely update peak memory usage (although it is possible).

v0: do this in dtors
v2: do this earlier
2021-10-15 01:43:27 +03:00
Azat Khuzhin
e4b39ca99c Remove unused Block member from MutateTask
v0: Fix excessive memory usage in MutateTask
    This is relevant only for 01200_mutations_memory_consumption test, since
    this is just a block.
v2: just remove unused Block member, since other part had been fixed in #29768
2021-10-15 01:43:27 +03:00
Anton Popov
7aa6068fb2 Merge remote-tracking branch 'upstream/master' into HEAD 2021-10-14 19:44:08 +03:00
Anton Popov
c2d376d0d6
Merge branch 'master' into better-interfaces 2021-10-14 14:33:36 +03:00
Anton Popov
92413aed68 better interfaces for IDataType and ISerialization 2021-10-14 05:36:49 +03:00
Nikolai Kochetov
2957971ee3 Remove some last streams. 2021-10-13 21:22:02 +03:00
Nikolai Kochetov
077aba4a97
Merge pull request #29520 from amosbird/projection-improve4
Get rid of naming limitation of projections.
2021-10-12 14:25:29 +03:00
Amos Bird
83717b7c3b
Get rid of naming limitation of projections. 2021-10-11 18:32:17 +08:00
Maksim Kita
2d069acc22 System table data skipping indices added size 2021-10-11 11:39:50 +03:00
Nikolai Kochetov
340b53ef85 Remove some more streams. 2021-10-08 17:03:54 +03:00
Nikolai Kochetov
d0c6f11fcb More. 2021-10-06 20:59:27 +03:00
mergify[bot]
ab55225c8f
Merge branch 'master' into remove-merging-streams 2021-10-05 14:16:34 +00:00
Nikolai Kochetov
96d070a5ba Fix tests. 2021-10-05 16:58:24 +03:00
Nikolai Kochetov
2001ebbf9d Fix build. 2021-10-04 21:52:31 +03:00
Azat Khuzhin
117e9e77c8 Apply max_untracked_memory/memory_profiler_{step,sample_probability} during mutate/merge 2021-10-03 17:39:07 +03:00
Alexey Milovidov
fe6b7c77c7 Rename "common" to "base" 2021-10-02 10:13:14 +03:00
Nikita Mikhaylov
9756b8dc33
Added an ability to execute more merges and mutations than threads, added new settings (#29140) 2021-10-01 00:26:24 +03:00
Anton Popov
914781052e Merge remote-tracking branch 'upstream/master' into HEAD 2021-09-29 17:37:07 +03:00
Amos Bird
23d3d894e6
Fix projection merges and mutations. 2021-09-24 22:45:50 +08:00
Anton Popov
6f9e53197c Merge remote-tracking branch 'upstream/master' into HEAD 2021-09-20 17:17:05 +03:00
Nikita Mikhaylov
c52b8ec083
Introduced MergeTask and MutateTask (#25165)
Introduced MergeTask and MutateTask
2021-09-17 00:19:58 +03:00