MADV_DONTNEED check in runtime for qemu (via patching jemalloc)

qemu does not support MADV_DONTNEED, and by not support it simply ignore
it (i.e. return 0 -- no error).

This issue has been "fixed" in #15590, however it just
terminates the process, and completely breaks clickhouse under qemu
(see also #15174).

But there is no need in such strong protection, we can stop using
madvise in jemalloc if MADV_DONTNEED does not work properly.
And this is what #18169 was tried to do (by override madvise), however
this will break sanitizers, at least TSAN and UBSAN.
The problem there is that sanitizers initialization code uses madvise
(and there is no way to turn this off with TSAN_OPTIONS) and overwritten
madvise function will have sanitizers traits (__tsan_func_entry), while
TSAN is not ready for this, and eventually it SIGSEGV.
Interesting thing is that in the recent clang-12, madvise was replaced
with direct syscall [1].

  [1]: 9f8c4039f2

But it is better to make clickhouse compatible with clang < 12 too, so
instead of override madvise completely, the runtime check was moved into
the jemalloc code [2].

  [2]: https://github.com/ClickHouse-Extras/jemalloc/pull/1
This commit is contained in:
Azat Khuzhin 2020-12-19 00:49:42 +03:00
parent 753aa36baf
commit e379b80d45
3 changed files with 4 additions and 44 deletions

2
contrib/jemalloc vendored

@ -1 +1 @@
Subproject commit 93e27e435cac846028da20cd9b0841fbc9110bd2
Subproject commit e6891d9746143bf2cf617493d880ba5a0b9a3efd

View File

@ -308,53 +308,11 @@ void checkRequiredInstructions()
}
}
#ifdef __linux__
/// clickhouse uses jemalloc as a production allocator
/// and jemalloc relies on working MADV_DONTNEED,
/// which doesn't work under qemu
///
/// but do this only under for linux, since only it return zeroed pages after MADV_DONTNEED
/// (and jemalloc assumes this too, see contrib/jemalloc-cmake/include_linux_x86_64/jemalloc/internal/jemalloc_internal_defs.h.in)
void checkRequiredMadviseFlags()
{
size_t size = 1 << 16;
void * addr = mmap(nullptr, size, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
if (addr == MAP_FAILED)
{
writeError("Can not mmap pages for MADV_DONTNEED check\n");
_Exit(1);
}
memset(addr, 'A', size);
if (!madvise(addr, size, MADV_DONTNEED))
{
/// Suboptimal, but should be simple.
for (size_t i = 0; i < size; ++i)
{
if (reinterpret_cast<unsigned char *>(addr)[i] != 0)
{
writeError("MADV_DONTNEED does not zeroed page. jemalloc will be broken\n");
_Exit(1);
}
}
}
if (munmap(addr, size))
{
writeError("Can not munmap pages for MADV_DONTNEED check\n");
_Exit(1);
}
}
#endif
struct Checker
{
Checker()
{
checkRequiredInstructions();
#ifdef __linux__
checkRequiredMadviseFlags();
#endif
}
} checker;

View File

@ -2,4 +2,6 @@ Instruction check fail. The CPU does not support SSSE3 instruction set.
Instruction check fail. The CPU does not support SSE4.1 instruction set.
Instruction check fail. The CPU does not support SSE4.2 instruction set.
Instruction check fail. The CPU does not support POPCNT instruction set.
MADV_DONTNEED does not zeroed page. jemalloc will be broken
<jemalloc>: MADV_DONTNEED does not work (memset will be used instead)
<jemalloc>: (This is the expected behaviour if you are running under QEMU)
1