Commit Graph

3369 Commits

Author SHA1 Message Date
Amos Bird
10996b1434
Fix mixed constant type during partition pruning 2022-03-18 17:47:03 +08:00
Alexander Tokmakov
07d952b728 use snapshots for semistructured data, durability fixes 2022-03-17 18:26:18 +01:00
Alexander Tokmakov
d04dc03fa4 Merge branch 'master' into mvcc_prototype 2022-03-17 15:24:32 +01:00
Nikolai Kochetov
ee9c2ec735
Merge pull request #34780 from azat/mt-delayed-part-flush
Do not delay final part writing by default (fixes possible Memory limit exceeded during INSERT)
2022-03-17 12:30:51 +01:00
alesapin
bb251938dc
Merge pull request #35344 from ClickHouse/changelog-22.3
Changelog 22.3
2022-03-17 11:25:36 +01:00
Alexey Milovidov
68ef49ea51 Fix something stupid 2022-03-17 05:57:13 +01:00
Anton Popov
de2cc23e15 fix race 2022-03-16 20:16:59 +00:00
Alexander Tokmakov
c2ac8d4a5c review fixes 2022-03-16 21:05:34 +01:00
Anton Popov
0ba78c3c3a Merge remote-tracking branch 'upstream/master' into HEAD 2022-03-16 15:28:09 +00:00
Alexander Tokmakov
1f571b7734 Merge branch 'master' into mvcc_prototype 2022-03-15 14:45:06 +01:00
Anton Popov
ccbddd53a3 fix mutations in tables with enabled sparse columns 2022-03-15 01:48:21 +00:00
alesapin
fbb1ebd9b8
Merge pull request #35274 from CurtizJ/fix-check-table-sparse-columns
Fix check table in case when there exist sparse columns
2022-03-14 21:56:04 +01:00
Alexander Tokmakov
9702b5177d Merge branch 'master' into mvcc_prototype 2022-03-14 21:45:38 +01:00
Alexander Tokmakov
278d779a01 log cleanup, more comments 2022-03-14 21:43:34 +01:00
Maksim Kita
2fdcf53a76 Fix clang-tidy warnings in Server, Storages folders 2022-03-14 18:17:35 +00:00
Anton Popov
063917786e minor fixes 2022-03-14 17:29:18 +00:00
Anton Popov
36ec379aeb Merge remote-tracking branch 'upstream/master' into HEAD 2022-03-14 16:28:35 +00:00
Anton Popov
428bbd6377 fix check table in case when there exist sparse columns 2022-03-14 15:22:23 +00:00
Kseniia Sumarokova
818459b9f0
Merge pull request #33717 from kssenii/local-cache-for-remote-fs
Local cache for remote filesystem
2022-03-11 07:23:10 +01:00
alesapin
c0d8ccc91b
Merge pull request #35178 from Varinara/master
Added disk_name to system.part_log
2022-03-10 22:22:37 +01:00
Varinara
f5523f7ff0 added disk_name to system.part_log 2022-03-10 18:44:19 +03:00
Alexander Tokmakov
061fa6a6f2 Merge branch 'master' into mvcc_prototype 2022-03-10 13:13:04 +01:00
kssenii
787a0805a5 Merge master 2022-03-10 11:42:19 +01:00
zhangyifan27
e6fa9f699a fix typo 2022-03-10 18:29:42 +08:00
Alexander Tokmakov
0906b59fba fixes 2022-03-09 21:38:18 +01:00
Vladimir C
ce266b5a3e
Merge pull request #35146 from amosbird/fixpartitionprunerin 2022-03-09 13:23:45 +01:00
Amos Bird
a19224bc9b
Fix partition pruner: non-monotonic function IN 2022-03-09 15:48:42 +08:00
Azat Khuzhin
3a5a39a9df Do not delay final part writing by default
For async S3 writes, final part flushing was deferred until the whole INSERT
block was processed; however, with too many partitions/columns you
may exceed the max_memory_usage limit (since each stream has overhead).

Introduce max_insert_delayed_streams_for_parallel_writes (defaulting
to 1000 for S3, 0 otherwise) to avoid this.

This should fix "Memory limit exceeded" errors in performance tests.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-08 22:17:36 +03:00
Alexander Tokmakov
d2f838bd91 fix another race condition 2022-03-08 20:11:47 +01:00
kssenii
5260822964 Merge master 2022-03-08 18:21:28 +01:00
kssenii
e231c3a3e0 Fix split build 2022-03-08 18:05:55 +01:00
Azat Khuzhin
caffc144b5 Fix possible "Part directory doesn't exist" during INSERT
In #33291 the final part commit was deferred, and now it can take
significantly more time, which may lead to a "Part directory doesn't exist"
error during INSERT:

    2022.02.21 18:18:06.979881 [ 11329 ] {insert} <Debug> executeQuery: (from 127.1:24572, user: default) INSERT INTO db.table (...) VALUES
    2022.02.21 20:58:03.933593 [ 11329 ] {insert} <Trace> db.table: Renaming temporary part tmp_insert_20220214_18044_18044_0 to 20220214_270654_270654_0.
    2022.02.21 21:16:50.961917 [ 11329 ] {insert} <Trace> db.table: Renaming temporary part tmp_insert_20220214_18197_18197_0 to 20220214_270689_270689_0.
    ...
    2022.02.22 21:16:57.632221 [ 64878 ] {} <Warning> db.table: Removing temporary directory /clickhouse/data/db/table/tmp_insert_20220214_18232_18232_0/
    ...
    2022.02.23 12:23:56.277480 [ 11329 ] {insert} <Trace> db.table: Renaming temporary part tmp_insert_20220214_18232_18232_0 to 20220214_273459_273459_0.
    2022.02.23 12:23:56.299218 [ 11329 ] {insert} <Error> executeQuery: Code: 107. DB::Exception: Part directory /clickhouse/data/db/table/tmp_insert_20220214_18232_18232_0/ doesn't exist. Most likely it is a logical error. (FILE_DOESNT_EXIST) (version 22.2.1.1) (from 127.1:24572) (in query: INSERT INTO db.table (...) VALUES), Stack trace (when copying this message, always include the lines below):

Follow-up for: #28760
Refs: #33291

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-08 07:44:11 +03:00
taiyang-li
b4174b0bef merge master and fix conflicts 2022-03-08 11:39:25 +08:00
Alexander Tokmakov
8acfb8d27f Merge branch 'master' into mvcc_prototype 2022-03-07 17:40:15 +01:00
Alexander Tokmakov
ea2f65fef6 fix tests with DiskS3 2022-03-07 17:35:47 +01:00
Anton Popov
0bc57da238 Merge remote-tracking branch 'upstream/master' into HEAD 2022-03-07 14:46:08 +00:00
Azat Khuzhin
bc224dee36 Do not hide exceptions during mutations
system.mutations includes only the message, not the stack trace, and the
culprit is not always obvious without the stack trace.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-06 13:39:49 +03:00
Maksim Kita
7ae1f0fa3b
Merge pull request #34911 from larspars/master
Allow LowCardinality strings for ngrambf_v1/tokenbf_v1 indexes. Fixes #21865
2022-03-04 19:17:48 +01:00
Anton Popov
df3b07fe7c Merge remote-tracking branch 'upstream/master' into HEAD 2022-03-03 22:25:28 +00:00
Maksim Kita
b1a956c5f1 clang-tidy check performance-move-const-arg fix 2022-03-02 18:15:27 +00:00
mreddy017
f893002b69 Fix vulnerable code related to std::move and noexcept
This commit fixes the vulnerable code related to std::move and noexcept identified by the clang-tidy tool.
2022-03-02 18:15:27 +00:00
Maksim Kita
53116faeeb
Update MergeTreeIndexFullText.cpp 2022-03-02 11:08:35 +01:00
Filatenkov Artur
f48f35cad0
Merge pull request #34975 from Vector-Similarity-Search-for-ClickHouse/fix-typo
Fix typo
2022-03-02 09:59:06 +03:00
Anton Popov
d7cd9aa69b fix reading of missed subcolumns 2022-03-02 03:31:40 +03:00
NikitaEvs
06f47673f4 Fix typo 2022-03-01 21:42:27 +00:00
Anton Popov
04a3a10148 minor fixes 2022-03-01 20:20:53 +03:00
Anton Popov
2758db5341 add more comments 2022-03-01 19:32:55 +03:00
Lars Eidnes
2629614dfe Allow LowCardinality strings for ngrambf_v1/tokenbf_v1 indexes. Fixes #21865 2022-02-25 15:36:36 +01:00
Anton Popov
fcdebea925 Merge remote-tracking branch 'upstream/master' into HEAD 2022-02-25 13:41:30 +03:00
Alexander Tokmakov
11ae0d144b fix 2022-02-25 00:51:21 +03:00
Alexander Tokmakov
711aad6953 fix 2022-02-24 01:31:21 +03:00
Alexander Tokmakov
aa6b9a2abc Merge branch 'master' into mvcc_prototype 2022-02-23 23:22:03 +03:00
Dmitry Novik
2fd4baaa64
Merge pull request #34387 from nvartolomei/nv/move-part-settings-cleanup
Remove useless setting experimental_query_deduplication_send_all_part_uuids
2022-02-22 06:11:00 -08:00
Kseniia Sumarokova
eeea322556
Merge pull request #34629 from amosbird/remotefsimprove
Some refactoring and improvement over async and remote buffer related stuff
2022-02-22 11:36:40 +01:00
mergify[bot]
314ab73b11
Merge branch 'master' into nv/move-part-settings-cleanup 2022-02-21 10:18:44 +00:00
Dmitry Novik
4428e7aa1b
Merge branch 'master' into nv/move-part-count 2022-02-21 02:14:23 -08:00
Azat Khuzhin
fef5f146e7 Fix ENOENT with fsync_part_directory and Vertical merge
fsync of the temporary part directory is superfluous anyway, and besides,
that directory does not exist at that time, which leads to an ENOENT
error:

    2022.02.18 17:02:51.634565 [ 35639 ] {} <Error> void DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::routine(DB::TaskRuntimeDataPtr) [Queue = DB::MergeMutateRuntimeQueue]: Code: 107. DB::ErrnoException: Cannot open file /var/lib/clickhouse/data/system/text_log/tmp_merge_202202_1864_3192_14/, errno: 2, strerror: No such file or directory. (FILE_DOESNT_EXIST), Stack trace (when copying this message, always include the lines below):

    0. DB::Exception::Exception() @ 0xb26ecfa in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
    1. DB::throwFromErrnoWithPath() @ 0xb2700ea in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
    2. DB::LocalDirectorySyncGuard::LocalDirectorySyncGuard() @ 0x14905531 in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
    3. DB::DiskLocal::getDirectorySyncGuard() const @ 0x148af3e3 in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
    4. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::prepare() @ 0x157bef13 in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug

Note that IMergeTreeDataPart::renameTo() will fsync the directory
anyway.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-19 07:50:59 +03:00
Azat Khuzhin
65e9b4879d Fix possible memory_tracker use-after-free for merges/mutations
There are two possible cases for executing merges/mutations:
1) from background thread
2) from OPTIMIZE TABLE query

1) is pretty simple, its memory tracking structure is as follows:

    current_thread::memory_tracker = level=Thread / description="(for thread)" ==
      background_thread_memory_tracker = level=Thread / description="(for thread)"
    current_thread::memory_tracker.parent = level=Global / description="(total)"

  So as you can see it is pretty simple and MemoryTrackerThreadSwitcher
  does not do anything icky for this case.

2) is complex, its memory tracking structure is as follows:

    current_thread::memory_tracker = level=Thread / description="(for thread)"
    current_thread::memory_tracker.parent = level=Process / description="(for query)" ==
      background_thread_memory_tracker = level=Process / description="(for query)"

  Before this patch, dirty hacks were used to track memory (and related
  things, like sampling, profiling and so on) for the OPTIMIZE TABLE
  query, since the current_thread memory_tracker was of Thread scope,
  which does not have any limits.

  And so if its parent is changed to the Merge/Mutate memory tracker
  (which also lacks some of the settings), memory will not be tracked
  correctly.

  To address this, Merge/Mutate was set as the parent not of the
  current_thread memory_tracker but of its parent, since its scope is
  Process with all settings.

  But that parent's memory_tracker is the memory_tracker of the
  thread_group, and so if you have a nested ThreadPool inside
  merge/mutate (this is the case for s3 async writes, which were
  added in #33291) you may get a use-after-free of memory_tracker.

  Consider the following example:

    MemoryTrackerThreadSwitcher()
      thread_group.memory_tracker.parent = merge_list_entry->memory_tracker
      (see also background_thread_memory_tracker above)

    CurrentThread::attachTo()
      current_thread.memory_tracker.parent = thread_group.memory_tracker

    CurrentThread::detachQuery()
      current_thread.memory_tracker.parent = thread_group.memory_tracker.parent
      # and this is equal to merge_list_entry->memory_tracker

    ~MemoryTrackerThreadSwitcher()
      thread_group.memory_tracker = thread_group.memory_tracker.parent

  So after this we will get an incorrect memory_tracker (from the
  merge_list_entry) when the next job in that ThreadPool does not have a
  thread_group, since in this case it will not try to update the
  current_thread.memory_tracker.parent, and a use-after-free will happen.

So to address the (2) issue, settings from the parent memory_tracker
should be copied to the merge_list_entry->memory_tracker, to avoid
playing with parent memory tracker.

Note that settings from the query (OPTIMIZE TABLE) are not available at
that time, so they cannot be used (instead of the parent's memory tracker
settings).

v2: remove memory_tracker.setOrRaiseHardLimit() from settings

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-18 16:23:54 +03:00
Amos Bird
f459e8fc95
Less getMark calls 2022-02-18 19:55:19 +08:00
Alexander Tokmakov
f4a46a13fb fixes 2022-02-18 00:26:37 +03:00
Alexander Tokmakov
dae044f86b Merge branch 'master' into mvcc_prototype 2022-02-17 13:49:37 +03:00
Amos Bird
d3bd8b5f93
Cosmetic fix 2022-02-17 14:31:22 +08:00
Amos Bird
ba19c7cf44
Slightly better interface of compressed buffer 2022-02-17 14:31:22 +08:00
Azat Khuzhin
774744a86d Fix allow_experimental_projection_optimization with enable_global_with_statement
allow_experimental_projection_optimization requires one more
InterpreterSelectQuery, which with enable_global_with_statement will
apply ApplyWithAliasVisitor if the query is not a subquery.

But this should not be done for queries from
MergeTreeData::getQueryProcessingStage()/getQueryProcessingStageWithAggregateProjections()
since this will duplicate WITH statements over and over.

This will also fix the scalar.xml perf tests, which currently fail with
the following error:

    scalar.query0.prewarm0: DB::Exception: Stack size too large.

And since it has a very long query in the log, this leads to the following
perf test error:

    _csv.Error: field larger than field limit (131072)

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-16 19:14:47 +03:00
Anton Popov
a661eaf39f better performance of getting storage snapshot 2022-02-16 02:17:22 +03:00
alesapin
bc2d0ee7c7
Merge pull request #34215 from ClickHouse/revert-34211-revert-34153-add_func_tests_over_s3
Add func tests run with s3 and fix several bugs
2022-02-15 19:07:11 +03:00
Alexander Tokmakov
e37ef4560c fix 2022-02-15 18:00:45 +03:00
Nikolai Kochetov
d6cbac1ed3
Merge pull request #34577 from ClickHouse/alwasy-remove-unused-actions-for-add-missing-defaults
Always remove unused actions from addMissingDefaults
2022-02-15 11:01:29 +01:00
alesapin
447cd56cb9 Fix comments 2022-02-15 12:11:50 +03:00
李扬
f52b67b939
Merge branch 'master' into rocksdb_metacache 2022-02-15 02:16:29 -06:00
Alexander Tokmakov
1e4e569151 Merge branch 'master' into mvcc_prototype 2022-02-15 02:26:47 +03:00
Alexander Tokmakov
ae5aa8c12d write part version before other files 2022-02-15 02:24:51 +03:00
Alexander Tokmakov
cbd3b45646 add EXPLAIN CURRENT TRANSACTION 2022-02-14 22:47:17 +03:00
alesapin
e15396d90c Fix race condition: 2022-02-14 22:19:49 +03:00
Nikolai Kochetov
b3ea360cd2 Fix a little bit more 2022-02-14 19:05:30 +00:00
alesapin
bb69455395
Merge pull request #34504 from CurtizJ/ttl-move-if-exists
Support `TTL TO [DISK|VOLUME] [IF EXISTS]`
2022-02-14 14:56:18 +03:00
alesapin
b2886a429b Fix lock during fetch 2022-02-14 12:20:27 +03:00
alesapin
89373155fc Merge branch 'master' into revert-34211-revert-34153-add_func_tests_over_s3 2022-02-13 21:07:54 +03:00
李扬
daa27d0bda
Merge branch 'master' into rocksdb_metacache 2022-02-12 07:50:12 -06:00
Anton Popov
6a8e35930f fix comparison with integers and floats in index analysis 2022-02-11 18:20:37 +03:00
Anton Popov
2fcd69baf7 fix comparison with integers and floats in index analysis 2022-02-11 17:15:27 +03:00
Alexander Tokmakov
07e66e690d Merge branch 'master' into mvcc_prototype 2022-02-11 15:53:32 +03:00
Anton Popov
f012871a7c better caching of common types of object columns 2022-02-11 01:20:30 +03:00
alesapin
705529ca03 Followup 2022-02-10 22:50:15 +03:00
alesapin
ef61c9b47c fix 2022-02-10 22:49:33 +03:00
alesapin
3af06b23f8 POC 2022-02-10 22:45:52 +03:00
Anton Popov
70986a70a1 support TTL TO [DISK|VOLUME] [IF EXISTS] 2022-02-10 19:26:23 +03:00
alesapin
f764da35ca Also zero copy mutations 2022-02-10 14:15:08 +03:00
alesapin
70221b272b Better solution 2022-02-10 12:57:11 +03:00
Anton Popov
dcd7312d75 cache common type on objects in MergeTree 2022-02-09 23:47:53 +03:00
Anton Popov
18940b8637 Merge remote-tracking branch 'upstream/master' into HEAD 2022-02-09 23:38:38 +03:00
alesapin
c587160308 Bad fix which will affect zk and should be changed 2022-02-09 23:06:44 +03:00
alesapin
57037465f5 Trying to fix tests blindly 2022-02-09 22:56:22 +03:00
alesapin
10c3e6e546 Merge branch 'master' into revert-34211-revert-34153-add_func_tests_over_s3 2022-02-09 14:14:58 +03:00
alesapin
72863cc4c3 Fix error 2022-02-09 13:57:10 +03:00
taiyang-li
d04ccc0489 Merge branch 'master' into rocksdb_metacache 2022-02-09 11:54:10 +08:00
Anton Popov
587d7399ba support dynamic subcolumns for Memory engine 2022-02-09 03:18:53 +03:00
alesapin
36909a986f Fix bug with files remove 2022-02-08 22:21:16 +03:00
alesapin
02a93cb852 Merge branch 'master' into revert-34211-revert-34153-add_func_tests_over_s3 2022-02-08 19:42:27 +03:00
Nikolai Kochetov
0f7c0c72bd
Merge pull request #34305 from amosbird/projection-fix27
Fix various issues when projection is enabled by default
2022-02-08 17:19:56 +03:00
Nicolae Vartolomei
50ee264223 Disable projections when allow_experimental_query_deduplication is in use 2022-02-08 12:16:10 +00:00
alesapin
8e9ccbd077
Merge pull request #33933 from sunny19930321/fix-substr-zk-metadata
Better local metadata comparison with ZooKeeper metadata
2022-02-08 14:58:46 +03:00
alesapin
b47b0eb1dc Revert accident change 2022-02-08 14:05:01 +03:00
alesapin
3af6012cb4 Revert "Revert "Revert "Revert "Merge pull request #34219 from ClickHouse/revert-34212-revert-33291-add-pool-to-s3-write-buffer""""
This reverts commit 2bc2ea485e.
2022-02-08 11:01:26 +03:00
alesapin
2bc2ea485e Revert "Revert "Revert "Merge pull request #34219 from ClickHouse/revert-34212-revert-33291-add-pool-to-s3-write-buffer"""
This reverts commit fb77d7a7d5.
2022-02-08 10:56:29 +03:00
taiyang-li
b6132d490f merge master and solve conflict 2022-02-08 15:24:59 +08:00
Nicolae Vartolomei
7d77678a9f Remove useless setting experimental_query_deduplication_send_all_part_uuids
This setting made sense for testing deduplication before part movement was actually implemented.

allow_experimental_query_deduplication setting is enough and code is covered by test_part_moves_between_shards
2022-02-07 19:03:20 +00:00
alesapin
ba28c2c013 Merge branch 'master' into revert-34211-revert-34153-add_func_tests_over_s3 2022-02-07 12:44:56 +03:00
alesapin
fb77d7a7d5 Revert "Revert "Merge pull request #34219 from ClickHouse/revert-34212-revert-33291-add-pool-to-s3-write-buffer""
This reverts commit 875e5413ad.
2022-02-07 12:36:54 +03:00
Alexander Tokmakov
45be75b4db Merge branch 'master' into mvcc_prototype 2022-02-06 23:36:08 +03:00
Amos Bird
52aabf98fe
Revise and add more comments 2022-02-06 16:53:54 +08:00
Amos Bird
1ab773cc90
Fix aggregation_in_order with normal projection 2022-02-06 16:46:12 +08:00
Amos Bird
3fab7af541
Bug fix and improvement of minmax_count_projection 2022-02-06 16:46:11 +08:00
Amos Bird
27fcefd315
Disable projection when doing parallel replica reading 2022-02-06 16:46:10 +08:00
Amos Bird
7674bc986e
Disable projection when there is JOIN or SAMPLE 2022-02-06 16:46:09 +08:00
Alexander Tokmakov
3956941aaf fixes 2022-02-04 21:18:20 +03:00
Nikolai Kochetov
daeeb6f3a2
Merge pull request #34316 from ClickHouse/probably-fix-data-race-in-WriteBufferFromS3
Probably fix data race in WriteBufferFromS3 destructor.
2022-02-04 21:04:46 +03:00
Nikolai Kochetov
6436024e08 Fix test with ttl. 2022-02-04 16:05:02 +00:00
alesapin
c028269e6f Merge branch 'master' into revert-34211-revert-34153-add_func_tests_over_s3 2022-02-04 18:02:25 +03:00
alesapin
d4d147abab Remove redundant 2022-02-04 18:01:49 +03:00
Azat Khuzhin
63e674280b Decrease severity for "Reading ... ranges ..." log message to Trace
That way, with send_logs_level='debug', you will not get verbose
information that you already have, since there is a summary row:

    Selected ... parts by partition key, ... parts by primary key, ... marks by primary key, ... marks to read from ... ranges

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-04 18:00:59 +03:00
tavplubix
22de534fdc
Merge pull request #34297 from ClickHouse/fix_restarting_thread
Try to fix race between pullLogsToQueue and RestartingThread
2022-02-04 17:30:17 +03:00
alesapin
1582c4bf24 Fix mutation status for retryable errors 2022-02-04 16:27:59 +03:00
Maksim Kita
ab696e6b59
Merge pull request #34310 from ClickHouse/fix-parallel-loading-parts
Fix parallel loading of data parts
2022-02-04 13:10:24 +01:00
alesapin
875e5413ad Revert "Merge pull request #34219 from ClickHouse/revert-34212-revert-33291-add-pool-to-s3-write-buffer"
This reverts commit b92efed350, reversing
changes made to ecce006cb2.
2022-02-04 14:30:33 +03:00
alesapin
af18905a33 Fixup 2022-02-04 13:03:13 +03:00
alesapin
2ed45b2a98 Merge branch 'master' into revert-34211-revert-34153-add_func_tests_over_s3 2022-02-04 11:23:46 +03:00
Nikolai Kochetov
b92efed350
Merge pull request #34219 from ClickHouse/revert-34212-revert-33291-add-pool-to-s3-write-buffer
Revert "Revert "Add pool to WriteBufferFromS3""
2022-02-04 11:00:29 +03:00
Alexey Milovidov
78eebd5c7c Fix parallel loading of data parts 2022-02-04 02:29:46 +03:00
Alexander Tokmakov
897e94c16c make restarting thread less bad 2022-02-03 23:29:24 +03:00
Alexander Tokmakov
fe30e0f162 fixes 2022-02-03 21:57:09 +03:00
Alexander Tokmakov
ca5f951558 Merge branch 'master' into mvcc_prototype 2022-02-03 18:56:44 +03:00
alesapin
2a9bc7cba8 Merge branch 'master' into revert-34211-revert-34153-add_func_tests_over_s3 2022-02-03 16:20:06 +03:00
Nikolai Kochetov
a2a9c10617
Merge pull request #34221 from ClickHouse/revert-34201-revert-34189-less-logging-for-remote_fs_execute_merges_on_single_replica_time_threshold
Revert "Revert "Additionally check remote_fs_execute_merges_on_single_replica_time_threshold inside ReplicatedMergeTreeQueue""
2022-02-03 13:09:44 +03:00
alesapin
763dfd7895 Fix write ahead log 2022-02-03 12:14:00 +03:00
alesapin
25375bc76b Remove unused param 2022-02-03 11:21:19 +03:00
alesapin
e0f640dd9f Fix 2022-02-02 19:44:29 +03:00
alesapin
80800e051e Merge branch 'master' into revert-34211-revert-34153-add_func_tests_over_s3 2022-02-02 19:42:04 +03:00
Nikolai Kochetov
ea044fc6b2 Merge branch 'master' into revert-34212-revert-33291-add-pool-to-s3-write-buffer 2022-02-02 19:40:41 +03:00
alesapin
b9c118524f Fix race condition on hardlink/erase/read metadata 2022-02-02 19:40:21 +03:00
Nikolai Kochetov
1117e6095d
Update src/Storages/MergeTree/ReplicatedMergeTreeQueue.cpp
Co-authored-by: Azat Khuzhin <a3at.mail@gmail.com>
2022-02-02 18:34:05 +03:00
Azat Khuzhin
15993cb13b Add missing fmt::runtime() in MergeTreeBackgroundExecutor
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-02 11:37:53 +03:00
alesapin
47d538b52f
Merge pull request #34232 from ClickHouse/cancelled-merging-parts-error-message
Change severity of the "Cancelled merging parts" message in logs
2022-02-02 10:29:07 +03:00
Sergei Trifonov
68bc456830
Merge pull request #34223 from azat/bump-fmt
Bump fmtlib from 7.0.0 to 8.1.1
2022-02-02 00:03:25 +03:00
Nikolai Kochetov
dd81eff301 Fix tests. 2022-02-01 15:28:22 +00:00
mergify[bot]
d8bea598b2
Merge branch 'master' into fix-substr-zk-metadata 2022-02-01 14:10:12 +00:00
Anton Popov
836a348a9c Merge remote-tracking branch 'upstream/master' into HEAD 2022-02-01 15:23:07 +03:00
mergify[bot]
a177ec91e5
Merge branch 'master' into read_in_order_max_rows_to_read 2022-02-01 12:12:20 +00:00
Alexander Tokmakov
d93c9f0be4 fix build after merge 2022-02-01 15:05:05 +03:00
Azat Khuzhin
5dfafd68a7 ReplicatedMergeTreeQueue: Fix fmt:: and reduce copy-paste of logging and out reason
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-01 14:30:04 +03:00
Azat Khuzhin
5be76bc969 Use proper fmt-like logging
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-01 14:30:04 +03:00
Azat Khuzhin
3b3635c6d5 Fix formatting error in logging messages
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-01 14:30:04 +03:00
Azat Khuzhin
743096a883 Use proper fmt:: like Exception ctor in DataPartsExchange
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-01 14:30:03 +03:00
Azat Khuzhin
bedf208cbd Use fmt::runtime() for LOG_* for non constexpr
Here is the one-liner:

    $ gg 'LOG_\(DEBUG\|TRACE\|INFO\|TEST\|WARNING\|ERROR\|FATAL\)([^,]*, [a-zA-Z]' -- :*.cpp :*.h | cut -d: -f1 | sort -u | xargs -r sed -E -i 's#(LOG_[A-Z]*)\(([^,]*), ([A-Za-z][^,)]*)#\1(\2, fmt::runtime(\3)#'

Note that I tried to do this with coccinelle (a tool for semantic
patching), but it cannot parse C++:

    $ cat fmt.cocci
    @@
    expression log;
    expression var;
    @@

    -LOG_DEBUG(log, var)
    +LOG_DEBUG(log, fmt::runtime(var))

I've also tried to use some macros/templates magic to do this implicitly
in logger_useful.h, but I failed to do so, and apparently it is not
possible for now.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>

v2: manual fixes
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-01 14:30:03 +03:00
Nikolai Kochetov
a6cc61bd14
Revert "Revert "Additionally check remote_fs_execute_merges_on_single_replica_time_threshold inside ReplicatedMergeTreeQueue"" 2022-02-01 14:17:46 +03:00
alesapin
0aebab50f5 Fix exception in destructor 2022-02-01 14:07:58 +03:00
Nikolai Kochetov
2a6eb593be
Revert "Revert "Add pool to WriteBufferFromS3"" 2022-02-01 13:36:51 +03:00
Alexander Tokmakov
2e4ae37d98 Merge branch 'master' into mvcc_prototype 2022-02-01 13:20:03 +03:00
alexey-milovidov
06477c2a7e Update ReplicatedMergeTreeSink.cpp 2022-02-01 09:22:40 +01:00
Igor Nikonov
f4c0b64420 Clean up: insert_deduplication_token setting for INSERT statement
+ reduce number of allocations on replication merge tree path
+ bash test: move insert block settings into variable

Issue: ClickHouse#7461
2022-02-01 09:22:33 +01:00
alexey-milovidov
99392b5ca7
Merge pull request #13544 from amosbird/mdha
Multi-Disk auto-recovery.
2022-02-01 06:13:26 +03:00
Alexey Milovidov
7dbf0dede5 Change severity of the "Cancelled merging parts" message in logs 2022-02-01 05:55:07 +03:00
alexey-milovidov
095d9bfa43
Revert "Add pool to WriteBufferFromS3" 2022-02-01 05:49:40 +03:00
mergify[bot]
e229487817
Merge branch 'master' into mdha 2022-02-01 01:22:16 +00:00
alexey-milovidov
15e4fe5c78
Revert "Additionally check remote_fs_execute_merges_on_single_replica_time_threshold inside ReplicatedMergeTreeQueue" 2022-02-01 01:51:39 +03:00
Alexander Tokmakov
5fad3fdffc throw exception on non-transactional queries 2022-02-01 01:27:55 +03:00
Amos Bird
ec7d367814
DiskLocal checker
Add DiskLocal checker so that ReplicatedMergeTree can recover data when some of its disks are broken.
2022-02-01 05:55:27 +08:00
alesapin
75d73d2785
Merge pull request #34139 from ClickHouse/fix_buf_s3_low_cardinality
Fix bug with bounded S3 reads and LowCardinality
2022-01-31 22:41:14 +03:00
Nikolai Kochetov
348d72266a
Merge pull request #34189 from ClickHouse/less-logging-for-remote_fs_execute_merges_on_single_replica_time_threshold
Additionally check remote_fs_execute_merges_on_single_replica_time_threshold inside ReplicatedMergeTreeQueue
2022-01-31 21:39:04 +03:00
Nikolai Kochetov
a207cdf28f Additionally check remote_fs_execute_merges_on_single_replica_time_threshold inside ReplicatedMergeTreeQueue. 2022-01-31 17:53:28 +00:00
Nikolai Kochetov
321fa4a9e8
Merge pull request #33291 from ClickHouse/add-pool-to-s3-write-buffer
Add pool to WriteBufferFromS3
2022-01-31 19:37:40 +03:00
Maksim Kita
8513f20cfd
Merge pull request #34145 from kitaisreal/bitset-sort-performance-check
pdqsort performance check
2022-01-31 12:35:13 +01:00
tavplubix
d19e24f530
Merge pull request #34096 from ClickHouse/fix_race_merge_selecting_task
Fix race between mergeSelectingTask and queue reinitialization
2022-01-31 12:16:29 +03:00
alesapin
55c7936257 Fix incorrect range for index 2022-01-31 11:11:32 +03:00
Maksim Kita
5ef83deaa6 Update sort to pdqsort 2022-01-30 19:49:48 +00:00
alesapin
4f1b902342 Fix compact parts as well 2022-01-30 22:36:19 +03:00
alesapin
4bedcc19b5 Better invariants 2022-01-30 20:40:09 +03:00
alesapin
c237c03c50 Fix 2022-01-30 18:39:26 +03:00
alesapin
bf918892ac More clear code with less getMark calls 2022-01-30 18:21:05 +03:00
alesapin
cb45a348f1 Merge branch 'master' into fix_buf_s3_low_cardinality 2022-01-30 17:30:55 +03:00
Anton Popov
78b9f15abb Merge remote-tracking branch 'upstream/master' into HEAD 2022-01-30 03:24:37 +03:00
alesapin
3f3e90c8ba Remove redundant code 2022-01-29 23:55:45 +03:00
alexey-milovidov
2b43bad923
Update MergeTreeReaderStream.cpp 2022-01-29 19:28:39 +03:00
alesapin
7ada8227cf Fix bug with bounded S3 reads and LowCardinality 2022-01-29 18:28:40 +03:00
李扬
6d50d36405
Merge branch 'master' into rocksdb_metacache 2022-01-28 22:00:31 -06:00
alexey-milovidov
6535b75322
Merge pull request #34001 from azat/memory-tracker-fix
Fix memory accounting for queries that uses < max_untracker_memory
2022-01-29 00:59:53 +03:00
Alexander Tokmakov
fb9b2d5326 Merge branch 'master' into mvcc_prototype 2022-01-28 21:18:36 +03:00
Alexander Tokmakov
e0304c2a58 review fixes, write tid into mutation entry 2022-01-28 20:47:37 +03:00
Azat Khuzhin
1519985c98 Fix possible "Can't attach query to the thread, it is already attached"
After detachQueryIfNotDetached() was removed, it is not enough to
use attachTo() for ThreadPool (scheduleOrThrowOnError()), since the query
may already be attached if the thread is doing multiple jobs, so
CurrentThread::attachToIfDetached() should be used instead.

This should fix all the places from the failures on CI [1]:

    $ fgrep DB::CurrentThread::attachTo -A1 ~/Downloads/47.txt  | fgrep -v attachTo | cut -d' ' -f5,6 | sort | uniq -c
         92 --
          2 /fasttest-workspace/build/../../ClickHouse/contrib/libcxx/include/deque:1393: DB::ParallelParsingInputFormat::parserThreadFunction(std::__1::shared_ptr<DB::ThreadGroupStatus>,
          4 /fasttest-workspace/build/../../ClickHouse/src/Storages/MergeTree/MergeTreeData.cpp:1595: void
         87 /fasttest-workspace/build/../../ClickHouse/src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp:993: void

  [1]: https://github.com/ClickHouse/ClickHouse/runs/4954466034?check_suite_focus=true

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-01-28 16:25:33 +03:00
Azat Khuzhin
b0c862c297 Fix memory accounting for queries that uses < max_untracker_memory
MemoryTracker starts accounting memory directly only after per-thread
allocation has exceeded max_untracker_memory (or memory_profiler_step).

But memory under this limit should be accounted too, and there is
code to do this in the ThreadStatus dtor; however, because
PullingAsyncPipelineExecutor detached the query from the thread group,
that memory was not accounted.

So remove CurrentThread::detachQueryIfNotDetached() from threads that
use ThreadFromGlobalPool, since it has a ThreadStatus, and the query will
be detached using CurrentThread::defaultThreadDeleter.

Note that before this patch memory accounting worked for HTTP queries,
because it was accounted from ParallelFormattingOutputFormat, but
not for TCP.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-01-28 16:25:33 +03:00
Nikolai Kochetov
1c9f026178 Merge branch 'master' into add-pool-to-s3-write-buffer 2022-01-28 16:01:42 +03:00
Alexander Tokmakov
b3ddc601a5 fix race between mergeSelectingTask and queue reinitialization 2022-01-28 15:50:58 +03:00
alexey-milovidov
f6684dbc62
Merge pull request #32304 from devcrafter/deduplication_token_7461
insert_deduplication_token setting for INSERT statement
2022-01-28 13:03:55 +03:00
taiyang-li
3de8bde7ce Merge remote-tracking branch 'origin/master' into rocksdb_metacache 2022-01-28 09:58:52 +08:00
liyang830
eca0453564 fix local metadata differ zk metadata 2022-01-27 16:33:40 +08:00
Nikolai Kochetov
9b2998c639 Review fixes. 2022-01-26 18:08:01 +00:00
Nikolai Kochetov
a8171269a1 Review fixes. 2022-01-26 17:55:24 +00:00
Raúl Marín
5a59d976dd CurrentlyExecuting: Require mutex usage explicitly 2022-01-26 18:44:35 +01:00
Nikolai Kochetov
fcc29dbd15 Try to fix integration tests. 2022-01-25 15:26:36 +00:00
taiyang-li
5398af7c05 update again 2022-01-25 18:59:29 +08:00