Previously it was possible to get the original pgid from the spawned
threads, so killing the process group could kill the caller script,
which in CI could be the init process [1].
[1]: https://s3.amazonaws.com/clickhouse-test-reports/67737/e68c9c8d16f37f6c25739076c9b071ed97952269/stress_test__asan_/stress_test_run_21.txt
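A minimal sketch of one way to guard against this kind of self-kill, not necessarily the approach taken in this change: refuse to kill a process group when it is the runner's own group. The helper names `should_kill_group` and `kill_group_safely` are hypothetical and do not come from clickhouse-test.

```python
import os
import signal


def should_kill_group(target_pgid: int, own_pgid: int) -> bool:
    """Return True only if the group is not the one the runner itself lives in."""
    # pgid 1 is excluded as well, since that is the init process group.
    return target_pgid > 1 and target_pgid != own_pgid


def kill_group_safely(pid: int) -> bool:
    """Kill the process group of `pid` unless it is our own group (hypothetical helper)."""
    try:
        target_pgid = os.getpgid(pid)
    except ProcessLookupError:
        return False  # the process has already exited
    if not should_kill_group(target_pgid, os.getpgid(0)):
        return False  # would kill the caller script as well
    os.killpg(target_pgid, signal.SIGKILL)
    return True
```

With such a check, a child that still shares the pgid of the caller script is skipped instead of taking the whole group down.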
Repro:
$ echo "SELECT '1" > tests/queries/0_stateless/00001_select_1.sql # break the test
$ cat /tmp/test.sh
./tests/clickhouse-test 0001_select --test-runs 3 --max-failures-chain 1 --no-random-settings --no-random-merge-tree-settings
Before this change:
$ /tmp/test.sh
Using queries from '/src/ch/worktrees/clickhouse-upstream/tests/queries' directory
Connecting to ClickHouse server... OK
Connected to server 24.8.1.1 @ bef896ce143ea4e0464c9829de6277ba06cc1a53 mt/rename-without-lock-v2
Running 3 stateless tests (MainProcess).
00001_select_1: [ FAIL ]
Reason: return code: 62
Code: 62. DB::Exception: Syntax error: failed at position 8 (''1;
'): '1;
. Single quoted string is not closed: ''1;
'. (SYNTAX_ERROR)
, result:
stdout:
Database: test_hz2zwymr
Child processes of 13041:
13042 python3 ./tests/clickhouse-test 0001_select --test-runs 3 --max-failures-chain 1 --no-random-settings --no-random-merge-tree-settings
Killing process group 13040
Processes in process group 13040:
13040 -bash
13042 python3 ./tests/clickhouse-test 0001_select --test-runs 3 --max-failures-chain 1 --no-random-settings --no-random-merge-tree-settings
[2]+ Stopped /tmp/test.sh
[1]$ Process group 13040 should be killed
Max failures chain
[2]+ Killed /tmp/test.sh
After:
$ /tmp/test.sh
Using queries from '/src/ch/worktrees/clickhouse-upstream/tests/queries' directory
Connecting to ClickHouse server... OK
Connected to server 24.8.1.1 @ bef896ce143ea4e0464c9829de6277ba06cc1a53 mt/rename-without-lock-v2
Running 3 stateless tests (MainProcess).
00001_select_1: [ FAIL ]
Reason: return code: 62
Code: 62. DB::Exception: Syntax error: failed at position 8 (''1;
'): '1;
. Single quoted string is not closed: ''1;
'. (SYNTAX_ERROR)
, result:
stdout:
Database: test_urz6rk5z
Child processes of 9782:
9785 python3 ./tests/clickhouse-test 0001_select --test-runs 3 --max-failures-chain 1 --no-random-settings --no-random-merge-tree-settings
Max failures chain
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
It was simply wrong before, but now, with capturing the stack trace,
which can take some time, it is a must.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Previously, process cleanup on e.g. SIGINT simply did not work, because
the launcher killed only the processes in its own process group, while
tests are launched with start_new_session=True for Popen(), which
creates a separate process group.
start_new_session=True is needed to kill the process group in case of a
test timeout.
So instead, look at the parent pid and kill the child process groups.
Also add some logging to make it more explicit which processes will be
killed.
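The idea of looking at the parent pid and killing the child process groups can be sketched roughly as follows. This is an illustration under assumptions, not the code from clickhouse-test: it is Linux-only (it parses /proc), and the function names `child_process_groups` and `kill_child_process_groups` are hypothetical.

```python
import os
import signal


def child_process_groups(parent_pid: int) -> set:
    """Collect the process group IDs of direct children of parent_pid (Linux /proc)."""
    pgids = set()
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/stat", "rb") as f:
                stat = f.read().decode("utf-8", "replace")
        except OSError:
            continue  # the process exited while we were scanning
        if ")" not in stat:
            continue
        # Fields after the "(comm)" part are: state, ppid, pgrp, ...
        fields = stat.rsplit(")", 1)[1].split()
        ppid, pgid = int(fields[1]), int(fields[2])
        if ppid == parent_pid:
            pgids.add(pgid)
    return pgids


def kill_child_process_groups(parent_pid: int, sig=signal.SIGKILL) -> set:
    """Kill every child process group except our own, logging each one."""
    own_pgid = os.getpgid(0)
    killed = set()
    for pgid in child_process_groups(parent_pid):
        if pgid == own_pgid:
            continue  # never kill the launcher's own group
        print(f"Killing process group {pgid}")
        try:
            os.killpg(pgid, sig)
            killed.add(pgid)
        except ProcessLookupError:
            pass  # the group is already gone
    return killed
```

Because each test started with start_new_session=True becomes the leader of its own process group, killing by child pgid reaches the test and everything it spawned.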
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Python warning:
/src/ch/clickhouse/.cmake/../tests/clickhouse-test:2570: SyntaxWarning: invalid escape sequence '\/'
Also remove the unnecessary escaping of `/`.
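For illustration, a small sketch of the pattern style that avoids this warning. `\/` is not a recognised escape sequence in a regular string literal (a DeprecationWarning since Python 3.6, a SyntaxWarning since 3.12), and `/` has no special meaning in Python regular expressions, so it needs no backslash. The pattern and the function name `project_from_path` below are made up for the example and are not the ones on line 2570.

```python
import re

# A raw string keeps backslashes literal, and "/" is left unescaped
# because it is an ordinary character in a Python regex.
PATH_RE = re.compile(r"^/src/([^/]+)/tests$")


def project_from_path(path: str):
    """Return the directory name between /src/ and /tests, or None."""
    match = PATH_RE.match(path)
    return match.group(1) if match else None
```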
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Previously this raised an internal Python error:
$ ../tests/clickhouse-test 03033
Using queries from '/src/ch/clickhouse/tests/queries' directory
Connecting to ClickHouse server... OK
Connected to server 24.3.1.1 @ 3fa6d23730 master
Running 1 stateless tests (MainProcess).
03033_dist: [ UNKNOWN ] - Test internal error:
TypeError
expected str, bytes or os.PathLike object, not NoneType
File "/src/ch/clickhouse/.cmake/../tests/clickhouse-test", line 1644, in run
if not is_valid_utf_8(self.case_file) or not is_valid_utf_8(
^^^^^^^^^^^^^^^
File "/src/ch/clickhouse/.cmake/../tests/clickhouse-test", line 237, in is_valid_utf_8
with open(fname, "rb") as f:
^^^^^^^^^^^^^^^^^
0 tests passed. 0 tests skipped. 0.01 s elapsed (MainProcess).
Won't run stateful tests because test data wasn't loaded.
All tests have finished.
Now:
$ ../tests/clickhouse-test 03033
Using queries from '/src/ch/clickhouse/tests/queries' directory
Connecting to ClickHouse server... OK
Connected to server 24.3.1.1 @ 3fa6d23730 master
Running 1 stateless tests (MainProcess).
03033_dist: [ UNKNOWN ] - no reference file
0 tests passed. 0 tests skipped. 0.11 s elapsed (MainProcess).
Won't run stateful tests because test data wasn't loaded.
All tests have finished.
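A rough sketch of the kind of guard that produces the "no reference file" result instead of the TypeError. Only the name `is_valid_utf_8` and its `open(fname, "rb")` call appear in the traceback above; the rest of its body, the wrapper `check_test_files` and the "not valid UTF-8" message are assumptions made for this example.

```python
from typing import Optional


def is_valid_utf_8(fname: str) -> bool:
    """Return True if the file decodes as UTF-8 (body is an assumption)."""
    try:
        with open(fname, "rb") as f:
            f.read().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def check_test_files(case_file: str, reference_file: Optional[str]) -> Optional[str]:
    """Return a failure reason, or None if both files look usable (hypothetical helper)."""
    # Check for a missing reference first, so None never reaches open().
    if reference_file is None:
        return "no reference file"
    if not is_valid_utf_8(case_file) or not is_valid_utf_8(reference_file):
        return "file is not valid UTF-8"
    return None
```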
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>