Commit Graph

751 Commits

Author SHA1 Message Date
Vladimir Cherkasov
6e3068fbf0
Merge pull request #70474 from azat/clickhouse-tests-fixes-24.10
Fixes for killing leftovers in clickhouse-test
2024-10-17 13:06:32 +00:00
Raúl Marín
ddf3259d27 Fix style 2024-10-14 14:02:15 +02:00
Raúl Marín
d9bcc6639e Increase max_rows_to_read in test reading from text_log 2024-10-14 13:09:12 +02:00
Azat Khuzhin
5e33f2d714 tests/clickhouse-test: fix pylint warning about unused vars in signal handler
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-10-08 12:57:29 +02:00
Azat Khuzhin
9fd9ed40bb Ignore ESRCH while obtaining process group
Should fix the following [1]:

    2024-10-07 21:13:38 02784_connection_string:                                                [ OK ] 9.69 sec.
    2024-10-07 21:13:38 Process Process-5:
    2024-10-07 21:13:38 Traceback (most recent call last):
    2024-10-07 21:13:38   File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    2024-10-07 21:13:38     self.run()
    2024-10-07 21:13:38   File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    2024-10-07 21:13:38     self._target(*self._args, **self._kwargs)
    2024-10-07 21:13:38   File "/usr/bin/clickhouse-test", line 2609, in run_tests_process
    2024-10-07 21:13:38     return run_tests_array(*args, **kwargs)
    2024-10-07 21:13:38   File "/usr/bin/clickhouse-test", line 2327, in run_tests_array
    2024-10-07 21:13:38     stop_tests()
    2024-10-07 21:13:38   File "/usr/bin/clickhouse-test", line 445, in stop_tests
    2024-10-07 21:13:38     cleanup_child_processes(os.getpid())
    2024-10-07 21:13:38   File "/usr/bin/clickhouse-test", line 433, in cleanup_child_processes
    2024-10-07 21:13:38     child_pgid = os.getpgid(child)
    2024-10-07 21:13:38 ProcessLookupError: [Errno 3] No such process

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/70448/cd826389e90065466ddfef140fc344b30e8c6de0/stateless_tests__aarch64_.html

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-10-08 12:40:50 +02:00
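The race behind the traceback above is that a child can exit between being listed and being queried, so os.getpgid() raises ProcessLookupError (errno ESRCH). A minimal sketch of tolerating that follows; the helper name get_pgid_or_none is hypothetical and is not taken from clickhouse-test.

```python
import os
import subprocess
import sys
from typing import Optional


def get_pgid_or_none(pid: int) -> Optional[int]:
    """Return the process group ID of pid, or None if the process is gone."""
    try:
        return os.getpgid(pid)
    except ProcessLookupError:
        # errno ESRCH: the process exited (and was reaped) before we asked.
        return None


if __name__ == "__main__":
    # A child that has already been waited for no longer exists.
    child = subprocess.Popen([sys.executable, "-c", "pass"])
    child.wait()
    print("reaped child:", get_pgid_or_none(child.pid))
    print("ourselves:", get_pgid_or_none(os.getpid()))
```

Returning None lets the cleanup loop skip the vanished process instead of aborting the whole test run.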
Azat Khuzhin
95f3d2e4e9 Kill tests leftovers in case of timeout
Though now there are oddities with multiprocessing_manager.list():

    Having 1 errors! 0 tests passed. 0 tests skipped. 2.20 s elapsed (MainProcess).
    Won't run stateful tests because test data wasn't loaded.
    Traceback (most recent call last):
      File "/usr/lib/python3.12/multiprocessing/managers.py", line 813, in _callmethod
        conn = self._tls.connection
               ^^^^^^^^^^^^^^^^^^^^
    AttributeError: 'ForkAwareLocal' object has no attribute 'connection'

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/src/ch/clickhouse/.cmake/../tests/clickhouse-test", line 3707, in <module>
        main(args)
      File "/src/ch/clickhouse/.cmake/../tests/clickhouse-test", line 3126, in main
        if len(restarted_tests) > 0:
           ^^^^^^^^^^^^^^^^^^^^
      File "<string>", line 2, in __len__
      File "/usr/lib/python3.12/multiprocessing/managers.py", line 817, in _callmethod
        self._connect()
      File "/usr/lib/python3.12/multiprocessing/managers.py", line 804, in _connect
        conn = self._Client(self._token.address, authkey=self._authkey)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/usr/lib/python3.12/multiprocessing/connection.py", line 519, in Client
        c = SocketClient(address)
            ^^^^^^^^^^^^^^^^^^^^^
      File "/usr/lib/python3.12/multiprocessing/connection.py", line 647, in SocketClient
        s.connect(address)
    ConnectionRefusedError: [Errno 111] Connection refused

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-10-08 12:35:07 +02:00
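Killing leftovers on timeout can be sketched roughly as below. It assumes the test is started with start_new_session=True (as another commit in this log describes), so the child leads its own process group and the whole group can be signalled at once. The function name run_with_timeout is hypothetical, and the real clickhouse-test logic may differ.

```python
import os
import signal
import subprocess
import sys
from typing import List


def run_with_timeout(cmd: List[str], timeout: float) -> int:
    """Run cmd in its own session; on timeout kill its whole process group."""
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        try:
            # The child is a session (and group) leader, so its pgid == its pid.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group finished right after the timeout fired
        return proc.wait()


if __name__ == "__main__":
    sleeper = [sys.executable, "-c", "import time; time.sleep(60)"]
    print("return code:", run_with_timeout(sleeper, timeout=0.5))
```

Signalling the group rather than the single pid also reaches any grandchildren the test spawned, as long as they stayed in that group.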
Azat Khuzhin
94df71fdc2 Add exception message into "Hung check failed" message
Here [1] the hung query failed:

    2024.10.07 21:13:29.044675 [ 16750 ] {484a1200-d576-4c03-a82b-2d389b8e773f} <Debug> executeQuery: (from [::1]:43374) SELECT 1 /*hung check*/
      (stage: Complete)
    2024.10.07 21:13:29.047252 [ 16750 ] {484a1200-d576-4c03-a82b-2d389b8e773f} <Error> executeQuery: Code: 210. DB::Exception: I/O error: Broken pipe, while writing to socket ([::1]:8123 -> [::1]:43374): While executing TabSeparatedRowOutputFormat. (NETWORK_ERROR) (version 24.10.1.1368) (from [::1]:43374) (in query: SELECT 1 /*hung check*/

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/70448/cd826389e90065466ddfef140fc344b30e8c6de0/stateless_tests__aarch64_.html

I don't see any possible reason for this other than the client closing
the connection. I bet the query had been sent a long time ago, but due
to a VM stall (#70473) it was not accepted in time.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-10-08 12:22:08 +02:00
avogar
7f32eb6b17 Fix tests, add docs and randomize new setting 2024-10-03 20:12:46 +00:00
Mikhail f. Shiryaev
42097b0abc
Remove if-return-elif-return-else statements 2024-09-26 16:09:52 +02:00
robot-clickhouse
5ce8604869 Automatic style fix 2024-09-17 12:37:31 +00:00
kssenii
3a05282bce Update assert 2024-09-17 14:26:31 +02:00
Igor Nikonov
3898f52868 Merge remote-tracking branch 'origin/master' into pr-local-plan 2024-09-04 11:15:54 +00:00
robot-clickhouse
8967e6f9b8 Automatic style fix 2024-09-03 16:11:28 +00:00
vdimir
6f1511c9a2
Collect sanitizer report from client to client_log 2024-09-03 14:17:41 +00:00
Igor Nikonov
5b4b08b711 Merge remote-tracking branch 'origin/master' into pr-local-plan 2024-08-30 20:06:03 +00:00
alesapin
3d40f700cf
Merge branch 'master' into tests/capture-kill-output 2024-08-27 18:40:22 +02:00
Igor Nikonov
17c1e82bc0 Merge remote-tracking branch 'origin/master' into pr-local-plan 2024-08-23 18:27:19 +00:00
Alexander Tokmakov
d3f3bc3565
Merge pull request #68629 from ClickHouse/revert-68515-fix-01079_bad_alters_zookeeper_long
Fix test `01079_bad_alters_zookeeper_long`
2024-08-23 18:05:03 +00:00
Max Kainov
4200b3d5cb CI: Stress test fix 2024-08-22 21:19:56 +02:00
Kruglov Pavel
f1b1f8afcf
Merge pull request #67875 from Avogar/limits-for-random-settings
Allow to specify min and max for random settings in the test
2024-08-21 15:57:22 +00:00
Alexander Tokmakov
fe637452ec
Revert "Fix test 01079_bad_alters_zookeeper_long" 2024-08-20 19:54:12 +02:00
Nikita Fomichev
ecd60eab5f Stateless tests: increase hung check timeout 2024-08-19 17:03:53 +02:00
Alexey Milovidov
8f2c20806a Fix test 01079_bad_alters_zookeeper_long 2024-08-18 22:45:13 +02:00
Azat Khuzhin
a66db7abc2 Fix output of clickhouse-test in case of tests timeouts
After https://github.com/ClickHouse/ClickHouse/pull/67737 the output
will be broken, since in case of timeout it will print to stdout.

Let's just capture it and add it to stderr.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-08-17 00:20:08 +02:00
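One way to "capture it and add it to stderr" is to redirect stdout while the cleanup runs and append the captured text to the test's stderr. This is only a sketch: noisy_cleanup and capture_cleanup_output are hypothetical names, and the actual change may capture the output differently.

```python
import io
from contextlib import redirect_stdout
from typing import Callable


def noisy_cleanup() -> None:
    """Stand-in for a cleanup routine that logs to stdout."""
    print("Killing process group 12345")


def capture_cleanup_output(cleanup: Callable[[], None], stderr_text: str) -> str:
    """Run cleanup with stdout captured and append the capture to stderr_text."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        cleanup()
    return stderr_text + buf.getvalue()


if __name__ == "__main__":
    print(capture_cleanup_output(noisy_cleanup, "Timeout!\n"), end="")
```

Note that redirect_stdout only affects Python-level writes to sys.stdout; output written by child processes directly to the file descriptor would need a pipe instead.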
Kruglov Pavel
28b0aad3f9
Fix python style 2024-08-14 15:16:34 +02:00
avogar
6dfed409f4 Fix searching for query params 2024-08-13 16:09:45 +00:00
Kruglov Pavel
0414cdbbbf
Fix unpack error 2024-08-13 15:58:49 +02:00
Igor Nikonov
d04db7e26d Merge remote-tracking branch 'origin/master' into pr-local-plan 2024-08-11 20:11:32 +00:00
Azat Khuzhin
bc2740aa70 tests/clickhouse-test: s/RELEASE_BUILD/RELEASE_NON_SANITIZED/g
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-08-08 13:00:37 +02:00
Azat Khuzhin
979f93df12 tests/clickhouse-test: better english in comment
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-08-08 13:00:07 +02:00
Azat Khuzhin
420f97c850 tests/clickhouse-test: update return type hint in run_single_test()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-08-08 12:58:40 +02:00
Azat Khuzhin
e90487fd54 tests/clickhouse-test: remove superfluous global
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-08-08 12:57:50 +02:00
Kruglov Pavel
cfeb20681d
Fix style check 2024-08-07 14:42:42 +02:00
avogar
d124de847b Fix style 2024-08-06 16:06:59 +00:00
Azat Khuzhin
72bd43a309 tests: do not capture client stacktraces in stress tests
They are too uncontrollable and will likely leave some clients behind [1].

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/67737/9658be5eea8351655dd3ea77b8c1d4717bac7999/stress_test__ubsan_.html

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-08-06 16:42:19 +02:00
Azat Khuzhin
8ce23ff113 tests: increase delay to capture client stacktraces for sanitizers build
5 seconds is too short to print even a few frames [1].

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/67737/9658be5eea8351655dd3ea77b8c1d4717bac7999/stress_test__ubsan_.html

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-08-06 16:39:16 +02:00
Azat Khuzhin
ef7d12db66 tests: change the process group earlier to avoid killing self
Previously the spawned threads could still see the original pgid, which
could lead to killing the caller script, and in the case of CI that
could be the init process [1].

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/67737/e68c9c8d16f37f6c25739076c9b071ed97952269/stress_test__asan_/stress_test_run_21.txt

Repro:

    $ echo "SELECT '1" > tests/queries/0_stateless/00001_select_1.sql # break the test
    $ cat /tmp/test.sh
    ./tests/clickhouse-test 0001_select --test-runs 3 --max-failures-chain 1 --no-random-settings --no-random-merge-tree-settings

Before this change:

    $ /tmp/test.sh
    Using queries from '/src/ch/worktrees/clickhouse-upstream/tests/queries' directory
    Connecting to ClickHouse server... OK
    Connected to server 24.8.1.1 @ bef896ce143ea4e0464c9829de6277ba06cc1a53 mt/rename-without-lock-v2
    Running 3 stateless tests (MainProcess).
    00001_select_1:                                                         [ FAIL ]
    Reason: return code:  62
    Code: 62. DB::Exception: Syntax error: failed at position 8 (''1;
    '): '1;
    . Single quoted string is not closed: ''1;
    '. (SYNTAX_ERROR)

    , result:

    stdout:

    Database: test_hz2zwymr
    Child processes of 13041:
    13042 python3 ./tests/clickhouse-test 0001_select --test-runs 3 --max-failures-chain 1 --no-random-settings --no-random-merge-tree-settings
    Killing process group 13040
    Processes in process group 13040:
    13040 -bash
    13042 python3 ./tests/clickhouse-test 0001_select --test-runs 3 --max-failures-chain 1 --no-random-settings --no-random-merge-tree-settings

    [2]+  Stopped                 /tmp/test.sh
    [1]$ Process group 13040 should be killed
    Max failures chain

    [2]+  Killed                  /tmp/test.sh

After:

    $ /tmp/test.sh
    Using queries from '/src/ch/worktrees/clickhouse-upstream/tests/queries' directory
    Connecting to ClickHouse server... OK
    Connected to server 24.8.1.1 @ bef896ce143ea4e0464c9829de6277ba06cc1a53 mt/rename-without-lock-v2
    Running 3 stateless tests (MainProcess).
    00001_select_1:                                                         [ FAIL ]
    Reason: return code:  62
    Code: 62. DB::Exception: Syntax error: failed at position 8 (''1;
    '): '1;
    . Single quoted string is not closed: ''1;
    '. (SYNTAX_ERROR)

    , result:

    stdout:

    Database: test_urz6rk5z
    Child processes of 9782:
    9785 python3 ./tests/clickhouse-test 0001_select --test-runs 3 --max-failures-chain 1 --no-random-settings --no-random-merge-tree-settings
    Max failures chain

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-08-06 16:39:16 +02:00
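The idea of changing the process group early can be illustrated with a child that calls os.setpgid(0, 0) as its very first action, so that a later "kill my process group" can never reach the shell or CI process that launched it. This is a sketch under that assumption; CHILD and spawn_and_report are hypothetical names, and where exactly clickhouse-test makes the call is not shown in the commit message.

```python
import os
import subprocess
import sys
from typing import Tuple

# Child program: become a process group leader before doing anything else,
# then report its own pid and pgid.
CHILD = "import os; os.setpgid(0, 0); print(os.getpid(), os.getpgid(0))"


def spawn_and_report() -> Tuple[int, int]:
    """Start the child and return the (pid, pgid) it reports."""
    out = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    pid, pgid = map(int, out.split())
    return pid, pgid


if __name__ == "__main__":
    pid, pgid = spawn_and_report()
    print("child leads its own group:", pgid == pid)
    print("group differs from caller's:", pgid != os.getpgrp())
```

In the "Before" log above the group being killed (13040) contains the calling bash; once the script owns its group, that can no longer happen.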
Azat Khuzhin
b76fb165d1 tests: fix pylint issue in clickhouse_execute_http()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-08-06 16:39:16 +02:00
Azat Khuzhin
a6ccf19869 tests: capture stderr/stdout/debuglog after terminating test
It was simply wrong before, but now, with stacktrace capturing that can
take some time, it is a must.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-08-06 16:39:16 +02:00
Azat Khuzhin
a478ad24a9 tests: try to catch stacktraces from client in case of test timeouts
This is to catch issues like [1].

  [1]: https://github.com/ClickHouse/ClickHouse/issues/67736

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-08-06 16:39:16 +02:00
Azat Khuzhin
f9dcce6da3 tests: omit python stacktrace in case of signals/server died
It is simply useless and only creates output that distracts.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-08-06 16:39:16 +02:00
Azat Khuzhin
ea1575f60a tests: avoid leaving process leftovers
Previously process cleanup on, e.g., SIGINT simply did not work, because
the launcher kills only processes in its process group, while tests are
launched with start_new_session=True for Popen(), which creates their own
process group.

This is needed for killing process group in case of test timeout.

So instead, look at the parent pid, and kill the child process groups.

Also add some logging to make it more explicit which processes will be
killed.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2024-08-06 16:39:16 +02:00
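The approach described in this message (find children by parent pid, then kill each child's process group) might look roughly like the following. It is a Linux-only sketch that scans /proc; the helper names are hypothetical, and skipping the caller's own group is an extra safeguard assumed here rather than something stated in the commit.

```python
import os
import signal
import subprocess
import sys
from pathlib import Path
from typing import List, Set


def child_pids(parent: int) -> List[int]:
    """List direct children of parent by reading PPid from /proc/<pid>/status."""
    children = []
    for entry in Path("/proc").iterdir():
        if not entry.name.isdigit():
            continue
        try:
            status = (entry / "status").read_text()
        except OSError:
            continue  # the process exited while we were scanning
        for line in status.splitlines():
            if line.startswith("PPid:"):
                if int(line.split()[1]) == parent:
                    children.append(int(entry.name))
                break
    return children


def kill_child_process_groups(parent: int) -> Set[int]:
    """Kill the process group of every child of parent; return killed pgids."""
    own_pgid = os.getpgrp()
    killed: Set[int] = set()
    for pid in child_pids(parent):
        try:
            pgid = os.getpgid(pid)
        except ProcessLookupError:
            continue  # child vanished between listing and lookup
        if pgid == own_pgid or pgid in killed:
            continue  # assumed safeguard: never signal our own group
        print(f"Killing process group {pgid}")
        try:
            os.killpg(pgid, signal.SIGKILL)
        except ProcessLookupError:
            continue
        killed.add(pgid)
    return killed


if __name__ == "__main__":
    sleeper = [sys.executable, "-c", "import time; time.sleep(60)"]
    child = subprocess.Popen(sleeper, start_new_session=True)
    kill_child_process_groups(os.getpid())
    print("child exit code:", child.wait())
```

Because each test runs in its own session, signalling only the launcher's group would miss it; walking the children by parent pid is what makes the cleanup reach them.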
Kruglov Pavel
56415028d6
Fix pylint 2024-08-06 15:01:10 +02:00
avogar
71c06b40cb Avoid regexp 2024-08-06 09:07:21 +00:00
avogar
bb33dca384 Fix unrelated changes 2024-08-06 08:49:08 +00:00
avogar
5226792b1d Fix bad merge with master 2024-08-06 08:48:06 +00:00
avogar
74a2976810 Fix pylint 2024-08-06 08:13:03 +00:00
avogar
18a7a82458 Better formatting 2024-08-05 21:18:37 +00:00
avogar
d3dc174533 Remove log 2024-08-05 21:15:11 +00:00
avogar
695cbe9f85 Merge branch 'master' of github.com:ClickHouse/ClickHouse into limits-for-random-settings 2024-08-05 21:12:33 +00:00