Optimize SELECT queries from the cluster table function

Use the local node as first priority when getting the structure of a remote table.
We run many distributed queries (like select xx from cluster('xx', view(xxxx))) on a ClickHouse cluster, and we found that the first node (shard_num=1) receives twice as many queries as the other shards.
The reason is that getStructureOfRemoteTableInShard is always called on the first shard, which executes the "DESC TABLE xx" query.
A better approach is to try the local node first, which saves a network round trip and reduces the load on the first shard.
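The local-first lookup described above can be sketched in isolation as follows. This is a minimal model under stated assumptions, not the ClickHouse implementation: `Shard`, `Structure`, `getStructureLocalFirst`, and the `fetch` callback are hypothetical stand-ins, with `fetch` playing the role of the per-shard "DESC TABLE" call. Unlike the actual change, this sketch does not retry local shards in the second pass.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a shard entry; not ClickHouse's real shard type.
struct Shard
{
    int shard_num;
    bool is_local;
};

// Hypothetical stand-in for a table structure; empty means "nothing found".
using Structure = std::vector<std::string>;

// Ask local shards first (no network hop), then the remaining shards in order.
// Returns the first non-empty structure, or an empty one if every shard fails.
Structure getStructureLocalFirst(
    const std::vector<Shard> & shards,
    const std::function<Structure(const Shard &)> & fetch)
{
    assert(fetch && "fetch callback must be set");

    for (bool want_local : {true, false})
    {
        for (const auto & shard : shards)
        {
            if (shard.is_local != want_local)
                continue;

            Structure res = fetch(shard);
            if (!res.empty())
                return res;
        }
    }
    return {};
}
```

With shards 1, 2, 3 where only shard 2 is local, `fetch` is invoked on shard 2 first, so shard 1 is contacted only when the local lookup returns nothing.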
Mingliang Pan 2022-07-21 11:56:35 +08:00 committed by GitHub
parent fd691000b7
commit 3f76c8d7fd


@@ -122,6 +122,18 @@ ColumnsDescription getStructureOfRemoteTable(
    const auto & shards_info = cluster.getShardsInfo();
    std::string fail_messages;
    // use local shard as first priority, as it needs no network communication
    for (const auto & shard_info : shards_info)
    {
        if(shard_info.isLocal()){
            const auto & res = getStructureOfRemoteTableInShard(cluster, shard_info, table_id, context, table_func_ptr);
            if (res.empty())
                continue;
            return res;
        }
    }
    for (const auto & shard_info : shards_info)
    {