329 Commits

Author SHA1 Message Date
alesapin
69f3a66538 Keep the most important log in stress tests 2022-09-27 11:16:10 +02:00
Antonio Andelic
97cf045203
Merge pull request #41721 from ClickHouse/collect-correctly-logs-in-stress-test
Collect logs in Stress test using clickhouse-local
2022-09-27 08:43:44 +02:00
Antonio Andelic
eb78761a7e Collect necessary 2022-09-26 16:30:01 +00:00
Antonio Andelic
6f4a636e8f Remove wildcard 2022-09-26 11:21:53 +00:00
Antonio Andelic
8fde8b2c56 Try with multiple calls 2022-09-26 11:03:24 +00:00
Antonio Andelic
c60d9db687
Merge branch 'master' into ignore-attach-thread-keeper-errors 2022-09-26 08:38:48 +02:00
Antonio Andelic
5ff1bcd553
Merge branch 'master' into collect-correctly-logs-in-stress-test 2022-09-26 08:38:38 +02:00
alesapin
06e0f554d8 Fix fetch to local disk 2022-09-23 16:46:53 +02:00
Antonio Andelic
1d93c56d1a Collect logs using clickhouse-local 2022-09-23 10:54:16 +00:00
Antonio Andelic
a17a3e1de1 Ignore Keeper hardware errors 2022-09-23 08:23:57 +00:00
kssenii
46f74aaba9 Update stress/run.sh 2022-09-12 20:10:35 +02:00
Alexander Tokmakov
e77b9e4d0c
Merge pull request #40775 from azat/ci/core-dumps-rework
Rework core collecting on CI (eliminate gcore usage)
2022-09-09 20:20:10 +03:00
Alexey Milovidov
7776512b04
Merge pull request #41002 from azat/ci/fix-oom-check
ci/stress: clear dmesg before run to fix "OOM in dmesg" check
2022-09-06 06:41:36 +03:00
Alexander Tokmakov
b264be3c63
Merge branch 'master' into zookeeper_client_fault_injection 2022-09-05 22:13:09 +03:00
Azat Khuzhin
2724b67537 ci/stress: clear dmesg before run to fix "OOM in dmesg" check
CI: https://s3.amazonaws.com/clickhouse-test-reports/40772/afa137ae2b6108e72c2d6e43556a04548afa2ea9/stress_test__ubsan_.html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-09-05 15:51:36 +02:00
Azat Khuzhin
25e3bebd9d Rework core collecting on CI (eliminate gcore usage)
gcore is a gdb command, that internally uses gdb to dump the core.

However, with properly configured limits (core_dump.size_limit) it
should not be required, although some issues are possible:
- non standard kernel.core_pattern
- sanitizers

So yes, gcore is more "universal" (you don't need to configure any
`kernel.core_pattern`), but it is ad hoc, and it has a drawback:
**it does not work when gdb fails**. For example, gdb may fail with
`Dwarf Error: DW_FORM_strx1 found in non-DWO CU` in case of DWARF-5 [1].

  [1]: https://github.com/ClickHouse/ClickHouse/pull/40772#issuecomment-1236331323

Let's try switching to a more native way.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-09-04 22:07:16 +02:00
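The trade-off described in the commit above depends on two kernel-side settings: the `kernel.core_pattern` sysctl and the core size limit of the process. The following is a minimal sketch, not the logic used by the CI scripts; the function names `classify_core_pattern` and `core_dumps_enabled` are hypothetical. It only illustrates how one might inspect both settings before relying on native core dumps instead of gcore.

```python
import resource


def classify_core_pattern(pattern: str) -> str:
    """Classify a kernel.core_pattern value by its first character.

    A leading '|' means the kernel pipes the core to a user-space helper;
    a leading '/' is an absolute file template; anything else is a file
    template relative to the crashing process's working directory.
    """
    pattern = pattern.strip()
    if pattern.startswith("|"):
        return "pipe"
    if pattern.startswith("/"):
        return "absolute"
    return "relative"


def core_dumps_enabled() -> bool:
    """Return True if the soft RLIMIT_CORE of this process is non-zero."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    return soft != 0


if __name__ == "__main__":
    # On Linux the current pattern is exposed under /proc.
    with open("/proc/sys/kernel/core_pattern") as f:
        print(classify_core_pattern(f.read()), core_dumps_enabled())
```

A "pipe" result is one example of the non-standard `kernel.core_pattern` case the commit message warns about, since the core then goes to a helper rather than to a file the test harness can pick up.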
Alexander Tokmakov
8bdb589c2b Merge branch 'master' into zookeeper_client_fault_injection 2022-08-29 13:34:57 +02:00
alesapin
133ca01447 Merge branch 'master' into stress_s3 2022-08-29 11:25:28 +02:00
Azat Khuzhin
ebc61a36e0 tests/stress: improve OOM detection (add separate check by dmesg)
Right now, if you look at the OOM errors:
- OOM killer (or signal 9) in clickhouse-server.log
- Backward compatibility check: OOM messages in clickhouse-server.log

Most of them are not real: the clickhouse server was simply KILLed by
clickhouse stop. #40678 may improve the situation, but to be definitely
sure that there was an OOM, let's look at dmesg.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-08-27 12:46:58 +02:00
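The idea in the commit above (trust the kernel log rather than signal 9 in the server log) can be sketched as a filter over dmesg output. This is an illustration under assumptions, not the check that the stress script runs: the function name `find_oom_lines` is hypothetical, and the regular expression covers only two message shapes the Linux OOM killer is known to print.

```python
import re

# Assumed patterns: "<task> invoked oom-killer" and
# "Out of memory: Kill process" / "Out of memory: Killed process".
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory: Kill(?:ed)? process",
    re.IGNORECASE,
)


def find_oom_lines(dmesg_text: str) -> list[str]:
    """Return the dmesg lines that look like OOM-killer activity."""
    return [line for line in dmesg_text.splitlines() if OOM_PATTERN.search(line)]


if __name__ == "__main__":
    # Made-up sample input, for illustration only.
    sample = (
        "[100.000000] eth0: link up\n"
        "[200.000000] clickhouse-serv invoked oom-killer: order=0\n"
        "[200.100000] Out of memory: Killed process 4242 (clickhouse-serv)\n"
    )
    for line in find_oom_lines(sample):
        print(line)
```

An empty result means the kernel log shows no OOM kill, so a KILL seen in clickhouse-server.log more likely came from `clickhouse stop`.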
Azat Khuzhin
3b519c5d44 tests/stress: capture stacktrace of server hangs if the pid was already removed
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-08-26 22:06:32 +03:00
alesapin
704c4b2c5b Stop thread fuzzer on shutdown 2022-08-26 11:54:54 +02:00
alesapin
3ff6489fae Merge branch 'master' into stress_s3 2022-08-25 13:14:58 +02:00
alesapin
ad692f732a Merge branch 'master' into stress_s3 2022-08-25 13:13:30 +02:00
alesapin
35f9815b8e Fix backward comp check 2022-08-24 14:43:02 +02:00
Azat Khuzhin
50bddc43dc tests/stress: ignore NETLINK_ERROR from checkPermissionsImpl
With --privileged the process now has CAP_SYS_ADMIN and tries to
communicate via netlink.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-08-19 14:07:06 +02:00
Alexander Tokmakov
3d253ec51b
Merge branch 'master' into zookeeper_client_fault_injection 2022-08-18 21:23:50 +03:00
alesapin
922818bbd9 Merge branch 'stress_s3' of github.com:ClickHouse/ClickHouse into stress_s3 2022-08-18 14:46:50 +02:00
alesapin
932ea146f5 Merge branch 'master' into stress_s3 2022-08-18 13:14:47 +02:00
alesapin
86b1e33eed Disable cache on writes 2022-08-17 19:00:53 +02:00
alesapin
600d22851f Grep dangerous S3 errors 2022-08-17 12:43:11 +02:00
alesapin
0433b801d2 Configure properly 2022-08-17 12:27:15 +02:00
Alexander Tokmakov
ae000e9125
Merge branch 'master' into zookeeper_client_fault_injection 2022-08-17 12:48:54 +03:00
alesapin
1ec6627a70 Fix tables creation 2022-08-16 18:28:17 +02:00
kssenii
eb26b219b9 Merge master 2022-08-16 00:56:27 +02:00
Alexander Tokmakov
589c3408d2
Merge pull request #40234 from ClickHouse/better_message_on_restore_covered
Better error message when restoring covered parts
2022-08-15 22:01:48 +03:00
alesapin
243bd492fa Trying to fix it 2022-08-15 20:55:11 +02:00
alesapin
96722a13bb Merge branch 'master' into stress_s3 2022-08-15 20:20:31 +02:00
Alexander Tokmakov
edaff70010 better error message when restoring covered parts 2022-08-15 13:53:14 +02:00
Alexander Tokmakov
467ef7bbc2
Update run.sh 2022-08-12 14:30:18 +03:00
Alexander Tokmakov
b9d18182f2 fix 2022-08-11 15:27:26 +02:00
kssenii
5c3227ba56 Merge master 2022-08-10 12:00:34 +02:00
kssenii
0dda03c94b Fix checks 2022-08-10 00:06:58 +02:00
Azat Khuzhin
3772415588 tests/stress: add dmesg output (to see OOM details)
max_server_memory_usage is already set to 75%, so OOM should not happen.
The reason it does is that RSS does not match the memory tracker
statistics:

    2022.08.05 12:36:57.869896 [ 82524 ] {} <Trace> AsynchronousMetrics: MemoryTracking: was 64.69 GiB, peak 65.26 GiB, will set to 62.80 GiB (RSS), difference: -1.89 GiB
    ...
    2022.08.05 12:37:00.213440 [ 82334 ] {} <Error> void DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::routine(DB::TaskRuntimeDataPtr) [Queue = DB::MergeMutateRuntimeQueue]: Code: 241. DB::Exception: Memory limit (total) exceeded: would use 64.68 GiB (attempt to allocate chunk of 1298794 bytes), maximum: 51.44 GiB. OvercommitTracker decision: Memory overcommit isn't used. Waiting time or orvercommit denominator are set to zero.. (MEMORY_LIMIT_EXCEEDED), Stack trace (when copying this message, always include the lines below):

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-08-06 12:34:38 +03:00
kssenii
7a9b0bc47f Merge master 2022-08-05 01:48:52 +02:00
Kruglov Pavel
235649cb98
Merge pull request #39458 from Avogar/fix-cancel-insert-into-function
Fix WriteBuffer finalize when cancel insert into function
2022-08-04 13:02:08 +02:00
kssenii
d462782d1a Fix checks 2022-08-02 14:27:45 +02:00
Alexander Tokmakov
e5c47cb26f
Update run.sh 2022-08-02 12:10:53 +03:00
Alexander Tokmakov
ecf7ce1f74 Merge branch 'master' into zookeeper_client_fault_injection 2022-08-01 20:49:01 +02:00
Alexander Tokmakov
3cc20f05ba
Update run.sh 2022-08-01 20:47:14 +03:00
kssenii
e5f4a619ed Merge master 2022-07-31 20:24:40 +03:00