From 9c7db6711be82853ea4971043677c17401513ce8 Mon Sep 17 00:00:00 2001 From: Azat Khuzhin Date: Mon, 1 Jul 2024 17:05:34 +0200 Subject: [PATCH] Fix possible issues with MySQL client protocol TLS connections Occasionally, 02479_mysql_connect_to_self fails on CI [1]. [1]: https://github.com/ClickHouse/ClickHouse/issues/50911 The problem was indeed query profiler and EINTR, but not in a way you may think. For such failures you may see the following trace in trace_log: contrib/openssl/crypto/bio/bss_sock.c:127::sock_read contrib/openssl/crypto/bio/bio_meth.c:121::bread_conv contrib/openssl/crypto/bio/bio_lib.c:285::bio_read_intern contrib/openssl/crypto/bio/bio_lib.c:311::BIO_read contrib/openssl/ssl/record/methods/tls_common.c:398::tls_default_read_n contrib/openssl/ssl/record/methods/tls_common.c:575::tls_get_more_records contrib/openssl/ssl/record/methods/tls_common.c:1122::tls_read_record contrib/openssl/ssl/record/rec_layer_s3.c:645::ssl3_read_bytes contrib/openssl/ssl/s3_lib.c:4527::ssl3_read_internal contrib/openssl/ssl/s3_lib.c:4551::ssl3_read contrib/openssl/ssl/ssl_lib.c:2343::ssl_read_internal contrib/openssl/ssl/ssl_lib.c:2357::SSL_read contrib/mariadb-connector-c/libmariadb/secure/openssl.c:729::ma_tls_read contrib/mariadb-connector-c/libmariadb/ma_tls.c:90::ma_pvio_tls_read contrib/mariadb-connector-c/libmariadb/ma_pvio.c:250::ma_pvio_read contrib/mariadb-connector-c/libmariadb/ma_pvio.c:297::ma_pvio_cache_read contrib/mariadb-connector-c/libmariadb/ma_net.c:373::ma_real_read contrib/mariadb-connector-c/libmariadb/ma_net.c:427::ma_net_read contrib/mariadb-connector-c/libmariadb/mariadb_lib.c:192::ma_net_safe_read contrib/mariadb-connector-c/libmariadb/mariadb_lib.c:2138::mthd_my_read_query_result contrib/mariadb-connector-c/libmariadb/mariadb_lib.c:2212::mysql_real_query src/Common/mysqlxx/Query.cpp:56::mysqlxx::Query::executeImpl() src/Common/mysqlxx/Query.cpp:73::mysqlxx::Query::use() src/Processors/Sources/MySQLSource.cpp:50::DB::MySQLSource::Connection::Connection() After which the connection will fail with: Code: 1000. DB::Exception: Received from localhost:9000. DB::Exception: mysqlxx::ConnectionLost: Lost connection to MySQL server during query (127.0.0.1:9004). (POCO_EXCEPTION) But, if you will take a look at ma_tls_read() you will see that it has proper retries for SSL_ERROR_WANT_READ (and EINTR is just a special case of it), but still, for some reason it fails. And the reason is the units of the read/write timeout, ma_tls_read() calls poll(read_timeout) in case of SSL_ERROR_WANT_READ, but, it incorrectly assume that the timeout is in milliseconds, but that timeout was in seconds, this bug had been fixed in [2], and now it works like a charm! [2]: https://github.com/ClickHouse/mariadb-connector-c/pull/17 I've verified it with patching openssl library: diff --git a/crypto/bio/bss_sock.c b/crypto/bio/bss_sock.c index 82f7be85ae..3d2f3926a0 100644 --- a/crypto/bio/bss_sock.c +++ b/crypto/bio/bss_sock.c @@ -124,7 +124,24 @@ static int sock_read(BIO *b, char *out, int outl) ret = ktls_read_record(b->num, out, outl); else # endif - ret = readsocket(b->num, out, outl); + { + static int i = 0; + static int j = 0; + if (!(++j % 5)) + { + fprintf(stderr, "sock_read: inject EAGAIN with ret=0\n"); + ret = 0; + errno = EAGAIN; + } + else if (!(++i % 3)) + { + fprintf(stderr, "sock_read: inject EAGAIN with ret=-1\n"); + ret = -1; + errno = EAGAIN; + } + else + ret = readsocket(b->num, out, outl); + } BIO_clear_retry_flags(b); if (ret <= 0) { if (BIO_sock_should_retry(ret)) And before this patch (well, not the patch itself, but the referenced patch in mariadb-connector-c) if you will pass read_write_timeout=1 it will fail: SELECT * FROM mysql('127.0.0.1:9004', system, one, 'default', '', SETTINGS connect_timeout = 100, connection_wait_timeout = 100, read_write_timeout=1) Code: 1000. DB::Exception: Received from localhost:9000. DB::Exception: mysqlxx::ConnectionLost: Lost connection to MySQL server during query (127.0.0.1:9004). (POCO_EXCEPTION) But after, it always works: $ ch benchmark -c10 -q "SELECT * FROM mysql('127.0.0.1:9004', system, one, 'default', '', SETTINGS connection_pool_size=1, connect_timeout = 100, connection_wait_timeout = 100, read_write_timeout=1)" ^CStopping launch of queries. SIGINT received. Queries executed: 478. localhost:9000, queries: 478, QPS: 120.171, RPS: 120.171, MiB/s: 0.001, result RPS: 120.171, result MiB/s: 0.001. 0.000% 0.014 sec. 10.000% 0.058 sec. 20.000% 0.065 sec. 30.000% 0.073 sec. 40.000% 0.079 sec. 50.000% 0.087 sec. 60.000% 0.089 sec. 70.000% 0.091 sec. 80.000% 0.095 sec. 90.000% 0.100 sec. 95.000% 0.102 sec. 99.000% 0.105 sec. 99.900% 0.140 sec. 99.990% 0.140 sec. Signed-off-by: Azat Khuzhin --- contrib/mariadb-connector-c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/contrib/mariadb-connector-c b/contrib/mariadb-connector-c index e39608998f5..d0a788c5b9f 160000 --- a/contrib/mariadb-connector-c +++ b/contrib/mariadb-connector-c @@ -1 +1 @@ -Subproject commit e39608998f5f6944ece9ec61f48e9172ec1de660 +Subproject commit d0a788c5b9fcaca2368d9233770d3ca91ea79f88