Previously server_pid was the pid of the subshell (40), while it should
be the pid of clickhouse-server (39).
Here we see that the server pid is 39:
2021-09-28 11:02:34 + pgrep -f clickhouse-server
2021-09-28 11:02:34 39
Here we see that 40 is the pid of the subshell:
2021-09-28 11:02:45 ch/docker/test/fuzzer/run-fuzzer.sh: line 90: 39 Killed clickhouse-server --config-file db/config.xml -- --path db 2>&1
2021-09-28 11:02:45 40 Done | tail -100000 > server.log
And here we see that the server_pid variable is 40:
2021-09-28 11:02:45 + server_exit_code=0
2021-09-28 11:02:45 + wait 40
v2: wait in background in order to call wait in foreground and to ensure
that the process is alive, since without job control this is the only
way to obtain the exit code
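The pid mix-up shown in the log can be reproduced with a short sketch. It assumes a POSIX shell and uses a hypothetical stand-in for the server (an `sh -c` wrapper around `sleep` that records its own pid in a temporary file). The names `pidfile`, `pipeline_pid`, `real_pid` and `wait_status` are illustrative and are not taken from run-fuzzer.sh.

```shell
#!/bin/sh
# Stand-in for the server: writes its own pid to a file, then execs sleep
# (exec keeps the pid). Its output is piped through tail, as in the log.
pidfile=$(mktemp)
sh -c 'echo $$ > "$0"; exec sleep 30' "$pidfile" 2>&1 | tail -n 100000 > /dev/null &

# For a background pipeline, $! is the pid of the LAST command (tail here).
pipeline_pid=$!

# Wait until the stand-in has recorded its pid.
while [ ! -s "$pidfile" ]; do sleep 0.1; done
real_pid=$(cat "$pidfile")

echo "pid from \$!:      $pipeline_pid"
echo "pid of the server: $real_pid"

# Kill the stand-in with SIGKILL, as happened to the server in the log.
kill -KILL "$real_pid"

# wait on the pipeline pid reports the status of tail, not of the killed process.
wait "$pipeline_pid"
wait_status=$?
echo "wait status: $wait_status"

rm -f "$pidfile"
```

The two pids differ, and the wait status is 0 even though the stand-in was killed, which matches the `server_exit_code=0` seen in the log.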
The default listen backlog in ClickHouse has been 64 for a long time,
and even the Linux kernel never had such a small limit: its default was
128 (net.core.somaxconn sysctl).
Recently, in 5.4, the kernel default was increased as well, to
4096 [1].
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=19f92a030ca6d772ab44b22ee6a01378a8cb32d4
Let's increase it in ClickHouse too.
Also note that I've looked through some instances running only
clickhouse-server, and TcpExtListenOverflows was non-zero there
(`nstat -za`); in other words, the backlog of the listen socket did
overflow.
So it is better to increase the default and move the problem into
clickhouse-server itself (yes, you are unlikely to get 4K new incoming
connections at once while the accept thread fails to accept them all,
but it still seems possible, maybe due to some locks or something
else).
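Raising the application default only helps up to the kernel limit, because Linux silently truncates the backlog passed to listen() to net.core.somaxconn. A minimal sketch of that arithmetic, followed by the two checks mentioned in the text. `effective_backlog` is a hypothetical helper name; the sketch assumes a Linux host, and the `nstat` call is skipped when the tool is not installed.

```shell
#!/bin/sh
# Effective accept-queue limit: the smaller of the requested backlog and
# net.core.somaxconn (Linux silently truncates the listen() backlog to it).
effective_backlog() {
    requested=$1
    somaxconn=$2
    if [ "$requested" -lt "$somaxconn" ]; then
        echo "$requested"
    else
        echo "$somaxconn"
    fi
}

# Values from the text: old default 64, new default 4096,
# kernel default 128 before 5.4 and 4096 since.
effective_backlog 64 128      # prints 64
effective_backlog 4096 128    # prints 128
effective_backlog 4096 4096   # prints 4096

# On a live Linux host, compare against the real sysctl value.
if [ -r /proc/sys/net/core/somaxconn ]; then
    current=$(cat /proc/sys/net/core/somaxconn)
    echo "somaxconn=$current effective=$(effective_backlog 4096 "$current")"
fi

# A non-zero TcpExtListenOverflows means a listen queue has overflowed.
if command -v nstat >/dev/null 2>&1; then
    nstat -za TcpExtListenOverflows || true
fi
```

On kernels older than 5.4 with the default sysctl, a backlog of 4096 is therefore still capped at 128 unless net.core.somaxconn is raised as well.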
glibc 2.32 added new versions of some symbols [1]:
$ nm -D clickhouse | fgrep -e @GLIBC_2.32
U pthread_getattr_np@GLIBC_2.32
U pthread_sigmask@GLIBC_2.32
[1]: https://www.spinics.net/lists/fedora-devel/msg273044.html
Right now ubuntu 20.04 is used as the official image for building
ClickHouse; however, once it is switched to a newer image, some users
may be unhappy that they can no longer use the official binaries
because they have glibc < 2.32.
To avoid this dependency, let's force the previous versions of those
symbols from glibc.
Note that I've tested this by compiling with glibc 2.32, verifying that
the output ELF does not have @GLIBC_2.32 symbols, and running that
binary inside the ubuntu:20.04 image (which has glibc 2.31).
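The nm check can be turned into a small helper that reports the highest GLIBC symbol version a binary references. This is only a sketch: `max_glibc_version` is a hypothetical name, it reads `nm -D` output from stdin, and the sample input reuses the two symbols listed above plus one illustrative line for contrast.

```shell
#!/bin/sh
# Print the highest GLIBC_x.y[.z] version referenced in `nm -D` output (stdin).
max_glibc_version() {
    grep -o '@GLIBC_[0-9.]*' |
        sed 's/^@GLIBC_//' |
        sort -t. -k1,1n -k2,2n -k3,3n |
        tail -n 1
}

# Sample input: the first two lines come from the text, the third is illustrative.
sample='U pthread_getattr_np@GLIBC_2.32
U pthread_sigmask@GLIBC_2.32
U memcpy@GLIBC_2.14'

printf '%s\n' "$sample" | max_glibc_version   # prints 2.32

# Against a real binary (the path is an assumption):
#   nm -D ./clickhouse | max_glibc_version
```

A result of 2.31 or lower would mean the binary can still be loaded on the ubuntu:20.04 image as far as glibc symbol versions are concerned.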
v1: -Wl,--wrap
v2: -Wl,--defsym
v3: -include
v4: fix versioning for aarch64