6980 Commits

Author SHA1 Message Date
Lars Eidnes
2629614dfe Allow LowCardinality strings for ngrambf_v1/tokenbf_v1 indexes. Fixes #21865 2022-02-25 15:36:36 +01:00
tavplubix
43626b3ffd
Update src/Storages/FileLog/StorageFileLog.cpp
Co-authored-by: Kseniia Sumarokova <54203879+kssenii@users.noreply.github.com>
2022-02-23 21:07:37 +03:00
Alexander Tokmakov
5a26f856d9 remove trash that shouldn't have been merged 2022-02-22 23:41:33 +03:00
Dmitry Novik
2fd4baaa64
Merge pull request #34387 from nvartolomei/nv/move-part-settings-cleanup
Remove useless setting experimental_query_deduplication_send_all_part_uuids
2022-02-22 06:11:00 -08:00
Kseniia Sumarokova
eeea322556
Merge pull request #34629 from amosbird/remotefsimprove
Some refactoring and improvement over async and remote buffer related stuff
2022-02-22 11:36:40 +01:00
Dmitry Novik
1df43a7f57
Merge pull request #34385 from nvartolomei/nv/move-part-count
Disable optimize_trivial_count when deduplication for part movement feature is enabled
2022-02-21 08:53:09 -08:00
Anton Popov
065305ab65
Merge pull request #34764 from ucasfl/hints-index
Add name hints for data skipping indices
2022-02-21 16:50:59 +03:00
Mikhail f. Shiryaev
5ac8cdbc69
Merge pull request #34786 from ClickHouse/make_drop_column_metadata_only
Make drop of alias column metadata only
2022-02-21 14:11:55 +01:00
mergify[bot]
314ab73b11
Merge branch 'master' into nv/move-part-settings-cleanup 2022-02-21 10:18:44 +00:00
Dmitry Novik
4428e7aa1b
Merge branch 'master' into nv/move-part-count 2022-02-21 02:14:23 -08:00
alesapin
d7cae5ffb4 Fix build 2022-02-21 11:54:52 +03:00
alesapin
852757219f Make drop of alias column metadata only 2022-02-21 11:46:16 +03:00
Vitaly Baranov
aee67a6693
Merge pull request #31484 from eungenue/Implement-SSL-X509-certificate-authentication
Implement ssl x509 certificate authentication
2022-02-21 11:30:52 +03:00
Vitaly Baranov
0d377de5f0 Support syntax CREATE USER IDENTIFIED WITH ssl_certificate CN ... 2022-02-21 07:01:00 +03:00
Vitaly Baranov
7b97c986cb
Revert "Allow restrictive row policies without permissive" 2022-02-21 06:54:28 +03:00
feng lv
07280e0ab1 Add name hints for data skipping indices
fix test
2022-02-20 11:48:22 +00:00
Vitaly Baranov
874b2c8dcb
Merge pull request #34596 from vitlibar/allow-restrictive-without-permissive
Allow restrictive row policies without permissive
2022-02-19 21:45:28 +07:00
Azat Khuzhin
fef5f146e7 Fix ENOENT with fsync_part_directory and Vertical merge
fsync of the temporary part directory is superfluous anyway, and besides,
that directory does not exist at that time, which leads to an ENOENT
error:

    2022.02.18 17:02:51.634565 [ 35639 ] {} <Error> void DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::routine(DB::TaskRuntimeDataPtr) [Queue = DB::MergeMutateRuntimeQueue]: Code: 107. DB::ErrnoException: Cannot open file /var/lib/clickhouse/data/system/text_log/tmp_merge_202202_1864_3192_14/, errno: 2, strerror: No such file or directory. (FILE_DOESNT_EXIST), Stack trace (when copying this message, always include the lines below):

    0. DB::Exception::Exception() @ 0xb26ecfa in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
    1. DB::throwFromErrnoWithPath() @ 0xb2700ea in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
    2. DB::LocalDirectorySyncGuard::LocalDirectorySyncGuard() @ 0x14905531 in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
    3. DB::DiskLocal::getDirectorySyncGuard() const @ 0x148af3e3 in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug
    4. DB::MergeTask::ExecuteAndFinalizeHorizontalPart::prepare() @ 0x157bef13 in /usr/lib/debug/.build-id/01/8c328bd4858d67.debug

Note that IMergeTreeDataPart::renameTo() will fsync the directory
anyway.
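
The failure mode described above can be reproduced outside ClickHouse. The following is a minimal sketch in Python, assuming a POSIX system; it is not the ClickHouse code path. The helper names fsync_directory and demo and the directory name tmp_merge_example are hypothetical and chosen only for illustration.

```python
import errno
import os
import tempfile


def fsync_directory(path):
    """Open a directory and fsync it; raises OSError if the path is missing."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def demo():
    """Return the errno seen before the directory exists (0 if none)."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical stand-in for a temporary part directory.
        part_dir = os.path.join(root, "tmp_merge_example")
        try:
            fsync_directory(part_dir)
            before = 0
        except OSError as exc:
            before = exc.errno
        os.mkdir(part_dir)
        fsync_directory(part_dir)  # succeeds once the directory exists
        return before
```

Calling demo() reports the errno raised by the first attempt, which is made before the directory has been created; the second attempt, made after os.mkdir(), goes through.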

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-19 07:50:59 +03:00
Nikolai Kochetov
e4d5db6161
Merge pull request #34717 from azat/merge-mutate-memory-tracker
Fix possible memory_tracker use-after-free (for async s3 writes) for merges/mutations
2022-02-18 19:28:43 +01:00
Vladimir C
9b7d011ee7
Merge pull request #34529 from vdimir/join-nullable-on-pipeline
Apply join_use_nulls on types before join
2022-02-18 18:34:44 +01:00
Azat Khuzhin
65e9b4879d Fix possible memory_tracker use-after-free for merges/mutations
There are two possible cases for executing merges/mutations:
1) from a background thread
2) from an OPTIMIZE TABLE query

1) is pretty simple; its memory tracking structure is as follows:

    current_thread::memory_tracker = level=Thread / description="(for thread)" ==
      background_thread_memory_tracker = level=Thread / description="(for thread)"
    current_thread::memory_tracker.parent = level=Global / description="(total)"

  So as you can see it is pretty simple and MemoryTrackerThreadSwitcher
  does not do anything icky for this case.

2) is complex; its memory tracking structure is as follows:

    current_thread::memory_tracker = level=Thread / description="(for thread)"
    current_thread::memory_tracker.parent = level=Process / description="(for query)" ==
      background_thread_memory_tracker = level=Process / description="(for query)"

  Before this patch, dirty hacks were used to track memory (and related
  things, like sampling, profiling and so on) for the OPTIMIZE TABLE
  query, since the current_thread memory_tracker was of Thread scope,
  which does not have any limits.

  And so if its parent is changed to the Merge/Mutate memory tracker
  (which also lacks some of the settings), memory will not be tracked
  correctly.

  To address this, Merge/Mutate was set as the parent not of the
  current_thread memory_tracker but of its parent, since its scope is
  Process, with all settings.

  But that parent's memory_tracker is the memory_tracker of the
  thread_group, and so if you have a nested ThreadPool inside
  merge/mutate (this is the case for s3 async writes, which were
  added in #33291), you may get a use-after-free of the memory_tracker.

  Consider the following example:

    MemoryTrackerThreadSwitcher()
      thread_group.memory_tracker.parent = merge_list_entry->memory_tracker
      (see also background_thread_memory_tracker above)

    CurrentThread::attachTo()
      current_thread.memory_tracker.parent = thread_group.memory_tracker

    CurrentThread::detachQuery()
      current_thread.memory_tracker.parent = thread_group.memory_tracker.parent
      # and this is equal to merge_list_entry->memory_tracker

    ~MemoryTrackerThreadSwitcher()
      thread_group.memory_tracker = thread_group.memory_tracker.parent

  So after this sequence we will get an incorrect memory_tracker (from
  the merge_list_entry) when the next job in that ThreadPool does not
  have a thread_group, since in this case it will not try to update
  current_thread.memory_tracker.parent, and a use-after-free will happen.

So to address issue (2), settings from the parent memory_tracker
should be copied to the merge_list_entry->memory_tracker, to avoid
playing with the parent memory tracker.

Note that settings from the query (OPTIMIZE TABLE) are not available at
that time, so they cannot be used (instead of the parent's memory
tracker settings).
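
The difference between the two schemes can be modelled with a toy parent chain. This is a rough sketch in Python under stated assumptions, not the ClickHouse implementation: the Tracker class, the alive flag (which stands in for object destruction), the sample_probability field, and the functions old_scheme and new_scheme are all hypothetical, and the step order follows one reading of the example above.

```python
class Tracker:
    """Toy stand-in for a memory tracker: a name, a parent link, one setting."""

    def __init__(self, name, parent=None, sample_probability=0.0):
        self.name = name
        self.parent = parent
        self.sample_probability = sample_probability  # hypothetical setting
        self.alive = True  # set to False to model destruction


def chain_is_valid(tracker):
    """Return False if any tracker on the parent chain has been destroyed."""
    while tracker is not None:
        if not tracker.alive:
            return False
        tracker = tracker.parent
    return True


def old_scheme():
    """Reparent the thread group's tracker, as in the example above."""
    total = Tracker("total")
    query = Tracker("for query", parent=total, sample_probability=0.1)
    pool_thread = Tracker("pool thread", parent=total)  # nested ThreadPool worker
    merge = Tracker("merge_list_entry")

    saved_parent = query.parent
    query.parent = merge                 # MemoryTrackerThreadSwitcher()
    pool_thread.parent = query           # CurrentThread::attachTo()
    pool_thread.parent = query.parent    # CurrentThread::detachQuery()
    query.parent = saved_parent          # ~MemoryTrackerThreadSwitcher()
    merge.alive = False                  # the merge entry goes away
    return pool_thread                   # still points at the dead tracker


def new_scheme():
    """Copy settings to the merge tracker and leave the parent chain alone."""
    total = Tracker("total")
    query = Tracker("for query", parent=total, sample_probability=0.1)
    pool_thread = Tracker("pool thread", parent=total)
    merge = Tracker("merge_list_entry", parent=total)
    merge.sample_probability = query.sample_probability  # copied, not shared

    pool_thread.parent = query           # attach
    pool_thread.parent = query.parent    # detach: the long-lived "total"
    merge.alive = False
    return pool_thread, merge
```

In this model chain_is_valid(old_scheme()) is False, because the pool thread keeps a parent link to the destroyed merge tracker, while the pool thread returned by new_scheme() still has a valid chain.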

v2: remove memory_tracker.setOrRaiseHardLimit() from settings

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-18 16:23:54 +03:00
Amos Bird
f459e8fc95
Less getMark calls 2022-02-18 19:55:19 +08:00
tavplubix
0f5ee19d0b
Merge pull request #34633 from zhangjmruc/master
For ReplicatedMergeTree, early break for multiple leaders case when log has been updated by the other leader
2022-02-17 14:01:50 +03:00
Kruglov Pavel
6dcb766879
Merge pull request #34465 from Avogar/fix-url-globs
Improve schema inference with globs in File/S3/HDFS/URL engines
2022-02-17 13:33:27 +03:00
Vitaly Baranov
2de6e8e575 Change type of RowPolicyKind: bool -> enum. 2022-02-17 14:18:10 +07:00
Amos Bird
d3bd8b5f93
Cosmetic fix 2022-02-17 14:31:22 +08:00
Amos Bird
ba19c7cf44
Slightly better interface of compressed buffer 2022-02-17 14:31:22 +08:00
Jianmei Zhang
ef0c3b99ff Merge remote-tracking branch 'upstream/master' 2022-02-17 14:02:27 +08:00
Azat Khuzhin
774744a86d Fix allow_experimental_projection_optimization with enable_global_with_statement
allow_experimental_projection_optimization requires one more
InterpreterSelectQuery, which with enable_global_with_statement will
apply ApplyWithAliasVisitor if the query is not a subquery.

But this should not be done for queries from
MergeTreeData::getQueryProcessingStage()/getQueryProcessingStageWithAggregateProjections()
since this will duplicate WITH statements over and over.

This will also fix the scalar.xml perf tests, which currently lead to
the following error:

    scalar.query0.prewarm0: DB::Exception: Stack size too large.

And since it has a very long query in the log, this leads to the
following perf test error:

    _csv.Error: field larger than field limit (131072)
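
The second error comes from Python's csv module, which rejects any field longer than its module-wide limit (131072 by default). How the perf test scripts read their logs is not shown here, so the following is only a sketch of how such an error arises when a very long query ends up in one field. The helper names parse_row and demo are hypothetical, and the sketch assumes the limit is still at its default.

```python
import csv
import io


def parse_row(line, limit=None):
    """Parse one CSV line, optionally raising the field size limit first."""
    old_limit = csv.field_size_limit()
    if limit is not None:
        csv.field_size_limit(limit)
    try:
        return next(csv.reader(io.StringIO(line)))
    finally:
        csv.field_size_limit(old_limit)  # restore the module-wide setting


def demo():
    """Return (whether the default limit rejected the row, parsed field count)."""
    # Stand-in for a query that keeps growing as WITH statements are duplicated.
    long_query = "x" * (csv.field_size_limit() + 10)
    line = "scalar.query0," + long_query + "\n"
    try:
        parse_row(line)
        failed = False
    except csv.Error:
        failed = True
    row = parse_row(line, limit=len(long_query) + 1)
    return failed, len(row)
```

Raising the limit only hides the symptom; the fix described in this commit keeps the query from growing in the first place.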

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-02-16 19:14:47 +03:00
Mikhail f. Shiryaev
4f84406136
Merge pull request #34641 from ClickHouse/version-and-release
refactor version_helper, create release script
2022-02-16 14:00:55 +01:00
Maksim Kita
d6e88f56cd
Merge pull request #34623 from CurtizJ/minor-subcolumns-fix
Fix quadratic complexity while adding subcolumns
2022-02-16 12:38:00 +01:00
Mikhail f. Shiryaev
c5db40f679
Deprecate sh script for StorageSystemContributors, update generated file 2022-02-16 12:16:43 +01:00
Nikolai Kochetov
f9d2dae88e
Merge pull request #34424 from yakov-olkhovskiy/ephemeral-column
Ephemeral column issue #9436
2022-02-16 12:04:57 +01:00
Kruglov Pavel
dd863ca2a0
Merge branch 'master' into fix-url-globs 2022-02-16 12:45:31 +03:00
Jianmei Zhang
25c761b3b6 Early break for multiple leaders case when log updated by other leader 2022-02-16 16:06:41 +08:00
Anton Popov
e4fddaa03a fix quadratic complexity while adding subcolumns 2022-02-16 02:42:50 +03:00
alesapin
bc2d0ee7c7
Merge pull request #34215 from ClickHouse/revert-34211-revert-34153-add_func_tests_over_s3
Add func tests run with s3 and fix several bugs
2022-02-15 19:07:11 +03:00
Nikolai Kochetov
ab288642f6 Merge branch 'master' into ephemeral-column 2022-02-15 10:03:34 +00:00
Nikolai Kochetov
d6cbac1ed3
Merge pull request #34577 from ClickHouse/alwasy-remove-unused-actions-for-add-missing-defaults
Always remove unused actions from addMissingDefaults
2022-02-15 11:01:29 +01:00
alesapin
447cd56cb9 Fix comments 2022-02-15 12:11:50 +03:00
alesapin
e15396d90c Fix race condition: 2022-02-14 22:19:49 +03:00
Nikolai Kochetov
b3ea360cd2 Fix a little bit more 2022-02-14 19:05:30 +00:00
Kseniia Sumarokova
382b8e0388
Merge pull request #34432 from ClickHouse/static-files-disk-uploader-create-symlinks
`static-files-disk-uploader`: add a mode to create symlinks
2022-02-14 18:10:53 +01:00
vdimir
99ca89c0ca
Fix StorageJoin and Asof or join_use_nulls in pipeline 2022-02-14 14:14:27 +00:00
alesapin
bb69455395
Merge pull request #34504 from CurtizJ/ttl-move-if-exists
Support `TTL TO [DISK|VOLUME] [IF EXISTS]`
2022-02-14 14:56:18 +03:00
alesapin
b75d551281 Fix clang tidy and add check for master 2022-02-14 14:37:41 +03:00
alesapin
b2886a429b Fix lock during fetch 2022-02-14 12:20:27 +03:00
alesapin
beb4400978 Fix 'same local part' check 2022-02-13 23:08:29 +03:00
alesapin
89373155fc Merge branch 'master' into revert-34211-revert-34153-add_func_tests_over_s3 2022-02-13 21:07:54 +03:00
Yakov Olkhovskiy
579fe6c97a major rework, transform added to the insert pipe 2022-02-13 17:42:59 +00:00