In glibc 2.32 new version of some symbols had been added [1]:
$ nm -D clickhouse | fgrep -e @GLIBC_2.32
U pthread_getattr_np@GLIBC_2.32
U pthread_sigmask@GLIBC_2.32
[1]: https://www.spinics.net/lists/fedora-devel/msg273044.html
Right now ubuntu 20.04 is used as official image for building
ClickHouse, however once it will be switched someone may not be happy
with that fact that he/she cannot use official binaries anymore because
they have glibc < 2.32.
To avoid this dependency, let's force previous version of those
symbols from glibc.
Note, that I've tested this by compiling with glibc 2.32 and verifying
that output ELF does not have @GLIBC_2.32 symbols and also running that
binary inside ubuntu:20.04 image (that has glibc 2.31).
v1: -Wl,--wrap
v2: -Wl,--defsym
v3: -include
v4: fix versioning for aarch64
This will allow to avoid superfluous sleep during query execution, since
this not only not desired behavoiur, but also may hang the server, since
if you will execute enough queries that will use MySQL database but will
not allow enough connections (or your MySQL server is too slow) then you
may run out of threads in the global thread pool.
Also note that right now it is possible to get deadlock when the mysql
pool is full, consider the following scenario:
- you have m1 and m2 mysql tables
- you have q1 and q2 queries, bot queries join m1 and m2
- q1 allocated connection for m1 but cannot allocate connection for m2
- q2 allocated connection for m2 but cannot allocate connection for m1
- but to resolve the lock one should give up on the locking while it is not possible right now...
And then you got no free threads and this:
# grep -h ^202 /proc/$(pgrep clickhouse-serv)/task/*/syscall | cut -d' ' -f2 | sort | uniq -c | sort -nr | head
1554 0x7ffb60b92fe8 # mutex in mysqlxx::PoolWithFailover::get
1375 0x7ffb9f1c4748 # mutex in ::PoolEntryHelper::~PoolEntryHelper from DB::MultiplexedConnections::invalidateReplica
1160 0x7ffb612918b8 # mutex in mysqlxx::PoolWithFailover::get
42 0x7ffb9f057984 # mutex in ThreadPoolImpl<std::__1::thread>::worker
*NOTE: 202 is a `futex` with WAIT*
(Went with `syscall` because debugging 10k+ threads is not easy, and
eventually it may TRAP)