Set --server-check-retries to 90 (and this is ~45 seconds), since right
now sometimes it is not enough [1].
[1]: https://clickhouse-test-reports.s3.yandex.net/30191/0e34a9d550cfe6924fe575871f36c44dd44acdaa/functional_stateless_tests_(memory).html#fail1
And the reason I guess is clickhouse-test had been rewritten to
http.client in #30065, and since now it does not need to execute
clickhouse-client binary, which in debug/sanitizers builds can take also
sometime.
That said that with clickhouse-client for hung check it was not 15
seconds, but more (each clickhouse-client requires 0.6sec with
sanitizers for simple SELECT 1, while w/o 0.1second, also too much
should be optimized)
Cons of clickhouse-driver:
- it is one more extra dependency
- it does not have correct timeouts (only for socket operations, and
this is not the same, so we need to set timeout by ourself)
- it is one more thing which can break (@alesapin)
Pros:
- Using native protocol over executing binaries is always better
- `clickhouse-client` in debug build takes almost a second to execute simple `SELECT 1`
and `clickhouse-test` requires ~5 queries at start (determine some
flags, zk, alive, create database)
Notes:
- `FORMAT Vertical` had been replaced with printing of `pandas.DataFrame`
And after this patch tiny tests work with the speed of the test, and
does not requires +-5 seconds of bootstrapping.
Right now it is possible to get the following error:
Having 20 errors! 0 tests passed. 0 tests skipped. 57.37 s elapsed (MainProcess).
Won't run stateful tests because test data wasn't loaded.
Traceback (most recent call last):
File "/usr/lib/python3.9/multiprocessing/managers.py", line 802, in _callmethod
conn = self._tls.connection
AttributeError: 'ForkAwareLocal' object has no attribute 'connection'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/src/ch/clickhouse/.cmake/../tests/clickhouse-test", line 1462, in <module>
main(args)
File "/src/ch/clickhouse/.cmake/../tests/clickhouse-test", line 1261, in main
if len(restarted_tests) > 0:
File "<string>", line 2, in __len__
File "/usr/lib/python3.9/multiprocessing/managers.py", line 806, in _callmethod
self._connect()
File "/usr/lib/python3.9/multiprocessing/managers.py", line 793, in _connect
conn = self._Client(self._token.address, authkey=self._authkey)
File "/usr/lib/python3.9/multiprocessing/connection.py", line 507, in Client
c = SocketClient(address)
File "/usr/lib/python3.9/multiprocessing/connection.py", line 635, in SocketClient
s.connect(address)
ConnectionRefusedError: [Errno 111] Connection refused
The reason behind this is that manager's thread got terminated:
ipdb> p restarted_tests._manager._process
<ForkProcess name='SyncManager-1' pid=25125 parent=24939 stopped exitcode=-SIGTERM>
Refs: #29259 (cc: @vdimir)
Follow-up for: #29197 (cc: @tavplubix)