tests/stress: improve OOM detection (add separate check by dmesg)

Right now if you will look at the OOM errors:
- OOM killer (or signal 9) in clickhouse-server.log
- Backward compatibility check: OOM messages in clickhouse-server.log

Most of them are not real, but just clickhouse server got KILLed by
clickhouse stop, #40678 may imporove the situation, but to definitely
sure that there was OOM let's look at dmesg.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
This commit is contained in:
Azat Khuzhin 2022-08-27 12:46:16 +02:00
parent 7e183675e5
commit ebc61a36e0

View File

@ -445,6 +445,13 @@ else
[ -s /test_output/bc_check_fatal_messages.txt ] || rm /test_output/bc_check_fatal_messages.txt
fi
dmesg -T > /test_output/dmesg.log
# OOM in dmesg -- those are real
grep -q -F -e 'Out of memory: Killed process' -e 'oom_reaper: reaped process' -e 'oom-kill:constraint=CONSTRAINT_NONE' /test_output/dmesg.log \
&& echo -e 'OOM in dmesg\tFAIL' >> /test_output/test_results.tsv \
|| echo -e 'No OOM in dmesg\tOK' >> /test_output/test_results.tsv
tar -chf /test_output/coordination.tar /var/lib/clickhouse/coordination ||:
mv /var/log/clickhouse-server/stderr.log /test_output/
@ -466,5 +473,3 @@ for core in core.*; do
pigz $core
mv $core.gz /test_output/
done
dmesg -T > /test_output/dmesg.log