add comment to check_nested_column_not_in_using

This commit is contained in:
vdimir 2024-01-10 13:29:28 +00:00
parent 25268fb863
commit 2c68bdecd8
No known key found for this signature in database
GPG Key ID: 6EE4CE2BEDC51862

View File

@ -3141,6 +3141,25 @@ QueryTreeNodePtr QueryAnalyzer::tryResolveIdentifierFromJoin(const IdentifierLoo
auto check_nested_column_not_in_using = [&join_using_column_name_to_column_node, &identifier_lookup](const QueryTreeNodePtr & node)
{
/** tldr: When an identifier is resolved into the function `nested` or `getSubcolumn`, and
* some column in its argument is in the USING list and its type has to be updated, we throw an error to avoid overcomplication.
*
* Identifiers can be resolved into functions in case of nested or subcolumns.
* For example `t.t.t` can be resolved into `getSubcolumn(t, 't.t')` function in case of `t` is `Tuple`.
* So, `t` in USING list is resolved from JOIN itself and has supertype of columns from left and right table.
* But `t` in `getSubcolumn` argument is still resolved from table and we need to update its type.
*
* It can be more complicated in case of Nested subcolumns, in that case in query:
* SELECT t FROM ... JOIN ... USING (t.t)
* Here, `t` is resolved into function `nested(['t', 's'], t.t, t.s) so, `t.t` should be from JOIN and `t.s` should be from table.
*
* Updating type accordingly is pretty complicated, so just forbid such cases.
*
* While it still may work for storages that support selecting subcolumns directly without `getSubcolumn` function:
* SELECT t, t.t, toTypeName(t), toTypeName(t.t) FROM t1 AS a FULL JOIN t2 AS b USING t.t;
* We just support it as a best-effort: `t` will have original type from table, but `t.t` will have super-type from JOIN.
* Probably it's good to prohibit such cases as well, but it's not clear how to check it in general case.
*/
if (node->getNodeType() != QueryTreeNodeType::FUNCTION)
throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected node type {}, expected function node", node->getNodeType());
@ -3152,7 +3171,7 @@ QueryTreeNodePtr QueryAnalyzer::tryResolveIdentifierFromJoin(const IdentifierLoo
const auto & column_name = argument_node->as<ColumnNode &>().getColumnName();
if (join_using_column_name_to_column_node.contains(column_name))
throw Exception(ErrorCodes::AMBIGUOUS_IDENTIFIER,
"Cannot select identifier '{}' while joining using nested column '{}'",
"Cannot select subcolumn for identifier '{}' while joining using column '{}'",
identifier_lookup.identifier, column_name);
}
else if (argument_node->getNodeType() == QueryTreeNodeType::CONSTANT)