106215 Commits

Author SHA1 Message Date
Alexander Tokmakov
ec5d7d0a3a
Update src/Functions/FunctionsConversion.h
Co-authored-by: Alexander Gololobov <440544+davenger@users.noreply.github.com>
2023-01-20 17:33:01 +03:00
Kruglov Pavel
28ddcc2432
Merge branch 'master' into tsv-csv-detect-header 2023-01-20 15:08:38 +01:00
Robert Schulze
b55c8ddd65
Merge pull request #45472 from ClickHouse/rs-duplicate-writing-guide
Don't duplicate writing guide, instead point to existing writing guide
2023-01-20 15:05:13 +01:00
Sema Checherinda
b76b612d23
fix typo 2023-01-20 14:55:58 +01:00
Nikolai Kochetov
039901b395 Fixing build 2023-01-20 13:49:50 +00:00
Han Fei
5fc4998f10 update docs for async insert deduplication 2023-01-20 14:42:11 +01:00
Robert Schulze
4ac17d71fa
Merge pull request #45470 from ClickHouse/rs-doc-typos
Fix typos
2023-01-20 14:39:27 +01:00
Robert Schulze
3b182e0fec
Don't duplicate writing guide, instead point to existing writing guide 2023-01-20 13:36:31 +00:00
Robert Schulze
3f2e4c8217
Fix typos 2023-01-20 13:20:25 +00:00
Robert Schulze
687f9c35a7
Merge pull request #45469 from ClickHouse/inv-idx-docs
Docs for inverted index
2023-01-20 14:17:58 +01:00
Robert Schulze
1a966a9590
Fix bad comparison 2023-01-20 13:05:06 +00:00
Mikhail f. Shiryaev
d4f60bc9da
Fix the case when merge-base does not show the oldest commit 2023-01-20 13:55:21 +01:00
Sema Checherinda
02f22f04e8
fix typos 2023-01-20 13:35:23 +01:00
kssenii
8d20af8127 Fix 2023-01-20 13:34:23 +01:00
Robert Schulze
7e6d3163b1
Initial inverted index docs 2023-01-20 12:12:20 +00:00
Azat Khuzhin
bdeb5514c5 Fix ASan builds for glibc 2.36+ (use RTLD_NEXT for ThreadFuzzer interceptors)
Recently I noticed that ClickHouse compiled with ASan does not work with
glibc 2.36+. At first I thought this only happened when compiling against
an older glibc and running with a newer one, but that was not correct:
ASan simply does not work with glibc 2.36+.

Here is a simple reproducer [1]:

    $ cat > test-asan.cpp <<EOL
    #include <pthread.h>
    int main()
    {
        // something broken in ASan in interceptor for __pthread_mutex_lock
        // and only since glibc 2.36, and for pthread_mutex_lock everything is OK
        pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
        return __pthread_mutex_lock(&mutex);
    }
    EOL
    $ clang -g3 -o test-asan test-asan.cpp -fsanitize=address
    $ ./test-asan
    AddressSanitizer:DEADLYSIGNAL
    =================================================================
    ==15659==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000000000 bp 0x7fffffffccb0 sp 0x7fffffffcb98 T0)
    ==15659==Hint: pc points to the zero page.
    ==15659==The signal is caused by a READ memory access.
    ==15659==Hint: address points to the zero page.
        #0 0x0  (<unknown module>)
        #1 0x7ffff7cda28f  (/usr/lib/libc.so.6+0x2328f) (BuildId: 1e94beb079e278ac4f2c8bce1f53091548ea1584)

    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV (<unknown module>)
    ==15659==ABORTING

  [1]: https://gist.github.com/azat/af073e57a248e04488b21068643f079e

I started looking at the glibc code. There were some changes in glibc
that moved the pthread functions out of libpthread.so.0 into libc.so.6
(somewhere between 2.31 and 2.35), but the problem pops up only with
2.36; 2.35 works fine.

After this I looked into the changes between 2.35 and 2.36 and found
this patch [2], "dlsym: Make RTLD_NEXT prefer default version
definition [BZ #14932]", which fixes bug [3].

  [2]: https://sourceware.org/git/?p=glibc.git;a=commit;h=efa7936e4c91b1c260d03614bb26858fbb8a0204
  [3]: https://sourceware.org/bugzilla/show_bug.cgi?id=14932

The problem with using the DL_LOOKUP_RETURN_NEWEST flag for RTLD_NEXT is
that it does not resolve hidden symbols (and __pthread_mutex_lock is
indeed hidden).

Here is a sample that will show the difference [4]:

    $ cat > test-dlsym.c <<EOL
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>

    int main()
    {
        void *p = dlsym(RTLD_NEXT, "__pthread_mutex_lock");
        printf("__pthread_mutex_lock: %p (via RTLD_NEXT)\n", p);
        return 0;
    }
    EOL

    # glibc 2.35: __pthread_mutex_lock: 0x7ffff7e27f70 (via RTLD_NEXT)
    # glibc 2.36: __pthread_mutex_lock: (nil) (via RTLD_NEXT)

  [4]: https://gist.github.com/azat/3b5f2ae6011bef2ae86392cea7789eb7

ThreadFuzzer, however, uses internal symbols to wrap
pthread_mutex_lock/pthread_mutex_unlock. These are intercepted by ASan,
and with glibc 2.36 this leads to a NULL dereference.

The fix was obvious: just use dlsym(RTLD_NEXT). However, on older
glibc versions this leads to endless recursion (see the comments in the
code), but only with jemalloc [5]. Even though sanitizers do not use
jemalloc, the code of ThreadFuzzer is generic and I don't want to guard
it with more preprocessor macros.

  [5]: https://gist.github.com/azat/588d9c72c1e70fc13ebe113197883aa2

So we have to use RTLD_NEXT only for ASan.
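
As a rough illustration of that decision (a minimal sketch, not the actual
ThreadFuzzer code), the lookup strategy can be chosen at compile time. The
names `SKETCH_WITH_ASAN`, `resolveNext` and `sketchLock` are hypothetical,
and the non-ASan branch simply calls the public function as a stand-in for
the existing internal-symbol path:

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // RTLD_NEXT is a GNU extension
#endif
#include <dlfcn.h>
#include <pthread.h>
#include <cerrno>

// Hypothetical macro for this sketch: 1 when built with -fsanitize=address.
#if defined(__SANITIZE_ADDRESS__)
#  define SKETCH_WITH_ASAN 1
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define SKETCH_WITH_ASAN 1
#  endif
#endif
#ifndef SKETCH_WITH_ASAN
#  define SKETCH_WITH_ASAN 0
#endif

using MutexFn = int (*)(pthread_mutex_t *);

// Ask the dynamic linker for the next definition of a *public* symbol.
// May return nullptr; the caller must check before calling through it.
inline MutexFn resolveNext(const char * name)
{
    return reinterpret_cast<MutexFn>(dlsym(RTLD_NEXT, name));
}

inline int sketchLock(pthread_mutex_t * mutex)
{
#if SKETCH_WITH_ASAN
    // ASan build: resolve the public symbol via RTLD_NEXT once and reuse it.
    static const MutexFn real = resolveNext("pthread_mutex_lock");
    if (real == nullptr)
        return ENOSYS; // refuse instead of jumping to address 0
    return real(mutex);
#else
    // Non-ASan build: stand-in for the unchanged internal-symbol path.
    return pthread_mutex_lock(mutex);
#endif
}
```

The null check is the important part of the sketch: a failed lookup is
reported as an error code rather than called through.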

There is one more interesting issue: if you compile with a clang
that was itself compiled against a newer libc (i.e. 2.36), you will get
the following error:

    $ podman run --privileged -v $PWD/.cmake-asan/programs:/root/bin -e PATH=/bin:/root/bin -e --rm -it ubuntu-dev-v3 clickhouse
    ==1==ERROR: AddressSanitizer failed to allocate 0x0 (0) bytes of SetAlternateSignalStack (error code: 22)
    ...
    ==1==End of process memory map.
    AddressSanitizer: CHECK failed: sanitizer_common.cpp:53 "((0 && "unable to mmap")) != (0)" (0x0, 0x0) (tid=1)
        <empty stack>

The problem is that since glibc 2.34, `SIGSTKSZ` expands to a call to
`sysconf(_SC_SIGSTKSZ)`, but older glibc does not know this parameter,
so `-1` will be returned and used as `SIGSTKSZ` instead.

The workaround is to disable the alternate signal stack:

    $ podman run --privileged -v $PWD/.cmake-asan/programs:/root/bin -e PATH=/bin:/root/bin -e ASAN_OPTIONS=use_sigaltstack=0 --rm -it ubuntu-dev-v3 clickhouse client --version
    ClickHouse client version 22.13.1.1.
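
For code that sizes its own alternate signal stack, the `-1` case described
above can be guarded against explicitly. This is only a sketch under stated
assumptions: it is not what ASan or ClickHouse does, the helper names
`stackSizeOrFallback` and `altStackSize` are hypothetical, the size is
assumed to come from `sysconf(_SC_SIGSTKSZ)` where the headers provide it,
and the 64 KiB fallback is an arbitrary value chosen for illustration:

```cpp
#include <unistd.h>
#include <cstddef>

// Hypothetical helper: turn a sysconf()-style result into a usable size.
// sysconf() returns -1 when the running libc does not know the parameter.
constexpr std::size_t stackSizeOrFallback(long reported, std::size_t fallback)
{
    return reported > 0 ? static_cast<std::size_t>(reported) : fallback;
}

inline std::size_t altStackSize()
{
    constexpr std::size_t fallback = 64 * 1024; // arbitrary value for this sketch
#ifdef _SC_SIGSTKSZ
    // Headers know the parameter, but the libc used at run time may not.
    return stackSizeOrFallback(sysconf(_SC_SIGSTKSZ), fallback);
#else
    // Headers predate the dynamic stack size: use the fallback directly.
    return fallback;
#endif
}
```

With this guard, an unknown parameter yields the fallback size instead of a
negative value being reinterpreted as a stack size.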

Fixes: #43426
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-20 13:09:13 +01:00
Robert Schulze
bfc3b4f5ca
Suffix "GinFilter" --> "Inverted" 2023-01-20 12:02:35 +00:00
Nikolai Kochetov
1e29993aef Fixing build 2023-01-20 11:55:20 +00:00
Robert Schulze
0738b2499c
Use GinFilters typedef where possible 2023-01-20 11:52:04 +00:00
Maksim Kita
3e08a98f16
Merge pull request #45388 from azat/dict/remove-preallocate
Remove PREALLOCATE for HASHED/SPARSE_HASHED dictionaries
2023-01-20 14:51:25 +03:00
Robert Schulze
0b77f07f67
Remove superfluous check (the same is checked in MergeTreeIndices.cpp) 2023-01-20 11:50:35 +00:00
Robert Schulze
d2c830ec39
Cosmetics 2023-01-20 11:49:08 +00:00
Robert Schulze
72973076c9
Rename MergeTreeIndexGin.h/cpp to MergeTreeIndexInverted.h/cpp 2023-01-20 11:42:36 +00:00
Robert Schulze
1ef2704539
Cosmetics 2023-01-20 11:39:23 +00:00
Anton Popov
9c0ba7c7ca
Merge pull request #45432 from CurtizJ/allow-json-extract-int-from-float
Allow to convert float stored in string field to integer in `JSONExtract`
2023-01-20 12:35:06 +01:00
Robert Schulze
463cc843de
"segment file" --> "segment metadata file" 2023-01-20 11:26:22 +00:00
Robert Schulze
58df3953bb
Move some code around (no other changes) 2023-01-20 11:24:23 +00:00
Kseniia Sumarokova
c066b9bddd
Update SwapHelper.h 2023-01-20 12:19:19 +01:00
Maksim Kita
e067a55b78 Fixed tests 2023-01-20 12:19:16 +01:00
Robert Schulze
3267ac2787
Prefix more typedefs in DB namespace with "Gin" 2023-01-20 11:19:07 +00:00
Robert Schulze
919b67f117
Cosmetics 2023-01-20 11:15:28 +00:00
Sema Checherinda
09f3a5c599 add a comment, add a check, fix test 2023-01-20 12:10:31 +01:00
Robert Schulze
98e117dca6
SegmentDictionary --> GinSegmentDictionary, also move typedef 2023-01-20 11:09:49 +00:00
Robert Schulze
908fa83f72
Move some typedefs around 2023-01-20 11:08:19 +00:00
Robert Schulze
44618927f9
Inline two short methods + uppercase 2023-01-20 11:04:35 +00:00
Robert Schulze
f8b446f517
Move method implementations (no other changes) 2023-01-20 10:57:16 +00:00
Robert Schulze
5c3cc5283f
"term dictionary" --> "dictionary" 2023-01-20 10:53:41 +00:00
Robert Schulze
be936b257c
Make version enum private 2023-01-20 10:48:43 +00:00
Robert Schulze
0653f86de9
Various cosmetic cleanups 2023-01-20 10:45:35 +00:00
Maksim Kita
23e26032ca
Merge pull request #45399 from aalexfvk/alexfvk/mdb-21326_fix_system_dictionaries_when_dictionary_with_bad_structure
Fix select from system.dictionaries when there is dictionary with bad structure
2023-01-20 13:36:32 +03:00
Maksim Kita
758c8f2776
Merge branch 'master' into dict/remove-preallocate 2023-01-20 13:15:37 +03:00
Maksim Kita
e6ee5554d1 Fixed tests 2023-01-20 11:15:13 +01:00
Azat Khuzhin
1f9a65b875 Modernize InternalTextLogsQueue::getPriorityName()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-20 11:09:35 +01:00
Azat Khuzhin
fc276abadd Fix log level "Test" for send_logs_level in client
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-20 11:09:35 +01:00
Antonio Andelic
0ad37ad286
Merge pull request #45320 from stigsb/system_tables_volume_config
Add <storage_policy> config parameter for system logs
2023-01-20 10:27:57 +01:00
Robert Schulze
5ec6d89d43
Merge pull request #38667 from ClibMouse/ftsearch
Inverted Indices Implementation
2023-01-20 10:18:05 +01:00
SmitaRKulkarni
6aa63414db
Merge pull request #45072 from ClickHouse/43891_Disallow_concurrent_backups_and_restores
Added settings to disallow concurrent backups and restores
2023-01-20 09:17:20 +01:00
Aleksei Filatov
42549e89f2 [rev 2] Fix review notes 2023-01-20 09:37:49 +03:00
Nikolai Kochetov
3e00d18498 Merge branch 'master' into fix-disabled-two-level-agg 2023-01-19 20:54:04 +00:00
Nikolay Degterinsky
dd7fef11a2 Add default granularity 2023-01-19 20:52:38 +00:00