#pragma once #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include namespace DB { namespace ErrorCodes { extern const int ATTEMPT_TO_READ_AFTER_EOF; extern const int CANNOT_PARSE_NUMBER; extern const int CANNOT_READ_ARRAY_FROM_TEXT; extern const int CANNOT_PARSE_INPUT_ASSERTION_FAILED; extern const int CANNOT_PARSE_QUOTED_STRING; extern const int CANNOT_PARSE_ESCAPE_SEQUENCE; extern const int CANNOT_PARSE_DATE; extern const int CANNOT_PARSE_DATETIME; extern const int CANNOT_PARSE_TEXT; extern const int CANNOT_PARSE_UUID; extern const int CANNOT_PARSE_IPV4; extern const int CANNOT_PARSE_IPV6; extern const int TOO_FEW_ARGUMENTS_FOR_FUNCTION; extern const int LOGICAL_ERROR; extern const int TYPE_MISMATCH; extern const int CANNOT_CONVERT_TYPE; extern const int ILLEGAL_COLUMN; extern const int NUMBER_OF_ARGUMENTS_DOESNT_MATCH; extern const int ILLEGAL_TYPE_OF_ARGUMENT; extern const int NOT_IMPLEMENTED; extern const int CANNOT_INSERT_NULL_IN_ORDINARY_COLUMN; } /** Type conversion functions. 
  * toType - conversion in "natural way";
  */

inline UInt32 extractToDecimalScale(const ColumnWithTypeAndName & named_column)
{
    const auto * arg_type = named_column.type.get();
    bool ok = checkAndGetDataType(arg_type)
        || checkAndGetDataType(arg_type)
        || checkAndGetDataType(arg_type)
        || checkAndGetDataType(arg_type);
    if (!ok)
        throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type of toDecimal() scale {}", named_column.type->getName());

    Field field;
    named_column.column->get(0, field);
    return static_cast(field.get());
}

/// Function toUnixTimestamp has exactly the same implementation as toDateTime of String type.
struct NameToUnixTimestamp { static constexpr auto name = "toUnixTimestamp"; };

struct AccurateConvertStrategyAdditions
{
    UInt32 scale { 0 };
};

struct AccurateOrNullConvertStrategyAdditions
{
    UInt32 scale { 0 };
};

struct ConvertDefaultBehaviorTag {};
struct ConvertReturnNullOnErrorTag {};
struct ConvertReturnZeroOnErrorTag {};

/** Conversion of number types to each other, enums to numbers, dates and datetimes to numbers and back: done by straight assignment.
* (Date is represented internally as number of days from some day; DateTime - as unix timestamp) */ template struct ConvertImpl { using FromFieldType = typename FromDataType::FieldType; using ToFieldType = typename ToDataType::FieldType; template static ColumnPtr NO_SANITIZE_UNDEFINED execute( const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type [[maybe_unused]], size_t input_rows_count, Additions additions [[maybe_unused]] = Additions()) { const ColumnWithTypeAndName & named_from = arguments[0]; using ColVecFrom = typename FromDataType::ColumnType; using ColVecTo = typename ToDataType::ColumnType; if constexpr ((IsDataTypeDecimal || IsDataTypeDecimal) && !(std::is_same_v || std::is_same_v)) { if constexpr (!IsDataTypeDecimalOrNumber || !IsDataTypeDecimalOrNumber) { throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Illegal column {} of first argument of function {}", named_from.column->getName(), Name::name); } } if (const ColVecFrom * col_from = checkAndGetColumn(named_from.column.get())) { typename ColVecTo::MutablePtr col_to = nullptr; if constexpr (IsDataTypeDecimal) { UInt32 scale; if constexpr (std::is_same_v || std::is_same_v) { scale = additions.scale; } else { scale = additions; } col_to = ColVecTo::create(0, scale); } else col_to = ColVecTo::create(); const auto & vec_from = col_from->getData(); auto & vec_to = col_to->getData(); vec_to.resize(input_rows_count); ColumnUInt8::MutablePtr col_null_map_to; ColumnUInt8::Container * vec_null_map_to [[maybe_unused]] = nullptr; if constexpr (std::is_same_v) { col_null_map_to = ColumnUInt8::create(input_rows_count, false); vec_null_map_to = &col_null_map_to->getData(); } bool result_is_bool = isBool(result_type); for (size_t i = 0; i < input_rows_count; ++i) { if constexpr (std::is_same_v) { if (result_is_bool) { vec_to[i] = vec_from[i] != FromFieldType(0); continue; } } if constexpr (std::is_same_v && std::is_same_v) { static_assert(std::is_same_v, "UInt128 and UUID types must be same"); if 
constexpr (std::endian::native == std::endian::little) { vec_to[i].items[1] = vec_from[i].toUnderType().items[0]; vec_to[i].items[0] = vec_from[i].toUnderType().items[1]; } else { vec_to[i] = vec_from[i].toUnderType(); } continue; } if constexpr (std::is_same_v != std::is_same_v) { throw Exception(ErrorCodes::NOT_IMPLEMENTED, "Conversion between numeric types and UUID is not supported. " "Probably the passed UUID is unquoted"); } else if constexpr ( (std::is_same_v != std::is_same_v) && !(is_any_of || is_any_of) ) { throw Exception(ErrorCodes::NOT_IMPLEMENTED, "Conversion from {} to {} is not supported", TypeName, TypeName); } else if constexpr (std::is_same_v != std::is_same_v && !(std::is_same_v || std::is_same_v)) { throw Exception(ErrorCodes::NOT_IMPLEMENTED, "Conversion between numeric types and IPv6 is not supported. " "Probably the passed IPv6 is unquoted"); } else { if constexpr (IsDataTypeDecimal || IsDataTypeDecimal) { if constexpr (std::is_same_v) { ToFieldType result; bool convert_result = false; if constexpr (IsDataTypeDecimal && IsDataTypeDecimal) convert_result = tryConvertDecimals(vec_from[i], col_from->getScale(), col_to->getScale(), result); else if constexpr (IsDataTypeDecimal && IsDataTypeNumber) convert_result = tryConvertFromDecimal(vec_from[i], col_from->getScale(), result); else if constexpr (IsDataTypeNumber && IsDataTypeDecimal) convert_result = tryConvertToDecimal(vec_from[i], col_to->getScale(), result); if (convert_result) vec_to[i] = result; else { vec_to[i] = static_cast(0); (*vec_null_map_to)[i] = true; } } else { if constexpr (IsDataTypeDecimal && IsDataTypeDecimal) vec_to[i] = convertDecimals(vec_from[i], col_from->getScale(), col_to->getScale()); else if constexpr (IsDataTypeDecimal && IsDataTypeNumber) vec_to[i] = convertFromDecimal(vec_from[i], col_from->getScale()); else if constexpr (IsDataTypeNumber && IsDataTypeDecimal) vec_to[i] = convertToDecimal(vec_from[i], col_to->getScale()); else throw 
Exception(ErrorCodes::CANNOT_CONVERT_TYPE, "Unsupported data type in conversion function"); } } else { /// If From Data is Nan or Inf and we convert to integer type, throw exception if constexpr (std::is_floating_point_v && !std::is_floating_point_v) { if (!isFinite(vec_from[i])) { if constexpr (std::is_same_v) { vec_to[i] = 0; (*vec_null_map_to)[i] = true; continue; } else throw Exception(ErrorCodes::CANNOT_CONVERT_TYPE, "Unexpected inf or nan to integer conversion"); } } if constexpr (std::is_same_v || std::is_same_v) { bool convert_result = accurate::convertNumeric(vec_from[i], vec_to[i]); if (!convert_result) { if (std::is_same_v) { vec_to[i] = 0; (*vec_null_map_to)[i] = true; } else { throw Exception(ErrorCodes::CANNOT_CONVERT_TYPE, "Value in column {} cannot be safely converted into type {}", named_from.column->getName(), result_type->getName()); } } } else { if constexpr (std::is_same_v && std::is_same_v) { const uint8_t ip4_cidr[] {0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0x00, 0x00, 0x00, 0x00}; const uint8_t * src = reinterpret_cast(&vec_from[i].toUnderType()); if (!matchIPv6Subnet(src, ip4_cidr, 96)) { char addr[IPV6_MAX_TEXT_LENGTH + 1] {}; char * paddr = addr; formatIPv6(src, paddr); throw Exception(ErrorCodes::CANNOT_CONVERT_TYPE, "IPv6 {} in column {} is not in IPv4 mapping block", addr, named_from.column->getName()); } uint8_t * dst = reinterpret_cast(&vec_to[i].toUnderType()); if constexpr (std::endian::native == std::endian::little) { dst[0] = src[15]; dst[1] = src[14]; dst[2] = src[13]; dst[3] = src[12]; } else { dst[0] = src[12]; dst[1] = src[13]; dst[2] = src[14]; dst[3] = src[15]; } } else if constexpr (std::is_same_v && std::is_same_v) { const uint8_t * src = reinterpret_cast(&vec_from[i].toUnderType()); uint8_t * dst = reinterpret_cast(&vec_to[i].toUnderType()); std::memset(dst, '\0', IPV6_BINARY_LENGTH); dst[10] = dst[11] = 0xff; if constexpr (std::endian::native == std::endian::little) { dst[12] = src[3]; 
dst[13] = src[2]; dst[14] = src[1]; dst[15] = src[0]; } else { dst[12] = src[0]; dst[13] = src[1]; dst[14] = src[2]; dst[15] = src[3]; } } else if constexpr (std::is_same_v && std::is_same_v) vec_to[i] = static_cast(static_cast(vec_from[i])); else if constexpr (std::is_same_v && (std::is_same_v || std::is_same_v)) vec_to[i] = static_cast(vec_from[i] * DATE_SECONDS_PER_DAY); else vec_to[i] = static_cast(vec_from[i]); } } } } if constexpr (std::is_same_v) return ColumnNullable::create(std::move(col_to), std::move(col_null_map_to)); else return col_to; } else throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Illegal column {} of first argument of function {}", named_from.column->getName(), Name::name); } }; /** Conversion of DateTime to Date: throw off time component. */ template struct ConvertImpl : DateTimeTransformImpl {}; /** Conversion of DateTime to Date32: throw off time component. */ template struct ConvertImpl : DateTimeTransformImpl {}; /** Conversion of Date to DateTime: adding 00:00:00 time component. */ struct ToDateTimeImpl { static constexpr auto name = "toDateTime"; static UInt32 execute(UInt16 d, const DateLUTImpl & time_zone) { return static_cast(time_zone.fromDayNum(DayNum(d))); } static Int64 execute(Int32 d, const DateLUTImpl & time_zone) { return time_zone.fromDayNum(ExtendedDayNum(d)); } static UInt32 execute(UInt32 dt, const DateLUTImpl & /*time_zone*/) { return dt; } // TODO: return UInt32 ??? static Int64 execute(Int64 dt64, const DateLUTImpl & /*time_zone*/) { return dt64; } }; template struct ConvertImpl : DateTimeTransformImpl {}; template struct ConvertImpl : DateTimeTransformImpl {}; /// Implementation of toDate function. template struct ToDateTransform32Or64 { static constexpr auto name = "toDate"; static NO_SANITIZE_UNDEFINED ToType execute(const FromType & from, const DateLUTImpl & time_zone) { // since converting to Date, no need in values outside of default LUT range. return (from <= DATE_LUT_MAX_DAY_NUM) ? 
            from
            : time_zone.toDayNum(std::min(time_t(from), time_t(0xFFFFFFFF)));
    }
};

template
struct ToDateTransform32Or64Signed
{
    static constexpr auto name = "toDate";

    static NO_SANITIZE_UNDEFINED ToType execute(const FromType & from, const DateLUTImpl & time_zone)
    {
        // TODO: decide narrow or extended range based on FromType
        /// The function should be monotonic (better for query optimizations), so we saturate instead of overflow.
        if (from < 0)
            return 0;
        return (from <= DATE_LUT_MAX_DAY_NUM)
            ? static_cast(from)
            : time_zone.toDayNum(std::min(time_t(from), time_t(0xFFFFFFFF)));
    }
};

template
struct ToDateTransform8Or16Signed
{
    static constexpr auto name = "toDate";

    static NO_SANITIZE_UNDEFINED ToType execute(const FromType & from, const DateLUTImpl &)
    {
        if (from < 0)
            return 0;
        return from;
    }
};

template struct ConvertImpl : DateTimeTransformImpl> {};

/// Implementation of toDate32 function.

template
struct ToDate32Transform32Or64
{
    static constexpr auto name = "toDate32";

    static NO_SANITIZE_UNDEFINED ToType execute(const FromType & from, const DateLUTImpl & time_zone)
    {
        return (from < DATE_LUT_MAX_EXTEND_DAY_NUM)
            ? static_cast(from)
            : time_zone.toDayNum(std::min(time_t(from), time_t(0xFFFFFFFF)));
    }
};

template
struct ToDate32Transform32Or64Signed
{
    static constexpr auto name = "toDate32";

    static NO_SANITIZE_UNDEFINED ToType execute(const FromType & from, const DateLUTImpl & time_zone)
    {
        static const Int32 daynum_min_offset = -static_cast(time_zone.getDayNumOffsetEpoch());
        if (from < daynum_min_offset)
            return daynum_min_offset;
        return (from < DATE_LUT_MAX_EXTEND_DAY_NUM) ?
            static_cast(from)
            : time_zone.toDayNum(std::min(time_t(Int64(from)), time_t(0xFFFFFFFF)));
    }
};

template
struct ToDate32Transform8Or16Signed
{
    static constexpr auto name = "toDate32";

    static NO_SANITIZE_UNDEFINED ToType execute(const FromType & from, const DateLUTImpl &)
    {
        return from;
    }
};

/** Special case of converting Int8, Int16, (U)Int32 or (U)Int64 (and also, for convenience,
  * Float32, Float64) to Date. If the number is negative, it is saturated to the unix epoch. If the
  * number is less than 65536, it is treated as a DayNum; if it is greater than or equal to 65536,
  * it is treated as a unix timestamp. If the number exceeds UInt32, it is saturated to MAX_UINT32
  * and then converted to a DayNum.
  * This is a bit illogical, as we actually have two functions in one.
  * But it supports the frequent case where a user writes toDate(UInt32),
  * expecting conversion of a unix timestamp to Date
  * (otherwise such usage would be a frequent mistake).
  */
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};

template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};

template
struct ToDateTimeTransform64
{
    static constexpr auto name = "toDateTime";

    static NO_SANITIZE_UNDEFINED ToType execute(const FromType & from, const
        DateLUTImpl &)
    {
        return static_cast(std::min(time_t(from), time_t(0xFFFFFFFF)));
    }
};

template
struct ToDateTimeTransformSigned
{
    static constexpr auto name = "toDateTime";

    static NO_SANITIZE_UNDEFINED ToType execute(const FromType & from, const DateLUTImpl &)
    {
        if (from < 0)
            return 0;
        return from;
    }
};

template
struct ToDateTimeTransform64Signed
{
    static constexpr auto name = "toDateTime";

    static NO_SANITIZE_UNDEFINED ToType execute(const FromType & from, const DateLUTImpl &)
    {
        if (from < 0)
            return 0;
        return static_cast(std::min(time_t(from), time_t(0xFFFFFFFF)));
    }
};

/** Special case of converting Int8, Int16, Int32 or (U)Int64 (and also, for convenience, Float32,
  * Float64) to DateTime. If the number is negative, saturate it to unix epoch time. If the number
  * exceeds UInt32, saturate to MAX_UINT32.
  */
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};

constexpr time_t LUT_MIN_TIME = -2208988800l;   // 1900-01-01 UTC
constexpr time_t LUT_MAX_TIME = 10413791999l;   // 2299-12-31 UTC

/** Conversion of numeric to DateTime64
  */

template
struct ToDateTime64TransformUnsigned
{
    static constexpr auto name = "toDateTime64";

    const DateTime64::NativeType scale_multiplier = 1;

    ToDateTime64TransformUnsigned(UInt32 scale = 0) /// NOLINT
        : scale_multiplier(DecimalUtils::scaleMultiplier(scale))
    {}

    NO_SANITIZE_UNDEFINED DateTime64::NativeType execute(FromType from, const DateLUTImpl &) const
    {
        from = std::min(from, LUT_MAX_TIME);
        return DecimalUtils::decimalFromComponentsWithMultiplier(from, 0, scale_multiplier);
    }
};

template
struct ToDateTime64TransformSigned
{
    static constexpr auto name = "toDateTime64";

    const DateTime64::NativeType
        scale_multiplier = 1;

    ToDateTime64TransformSigned(UInt32 scale = 0) /// NOLINT
        : scale_multiplier(DecimalUtils::scaleMultiplier(scale))
    {}

    NO_SANITIZE_UNDEFINED DateTime64::NativeType execute(FromType from, const DateLUTImpl &) const
    {
        from = static_cast(std::max(from, LUT_MIN_TIME));
        from = static_cast(std::min(from, LUT_MAX_TIME));
        return DecimalUtils::decimalFromComponentsWithMultiplier(from, 0, scale_multiplier);
    }
};

template
struct ToDateTime64TransformFloat
{
    static constexpr auto name = "toDateTime64";

    const UInt32 scale = 1;

    ToDateTime64TransformFloat(UInt32 scale_ = 0) /// NOLINT
        : scale(scale_)
    {}

    NO_SANITIZE_UNDEFINED DateTime64::NativeType execute(FromType from, const DateLUTImpl &) const
    {
        from = std::max(from, static_cast(LUT_MIN_TIME));
        from = std::min(from, static_cast(LUT_MAX_TIME));
        return convertToDecimal(from, scale);
    }
};

template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};

/** Conversion of DateTime64 to Date or DateTime: discards fractional part.
  */
template
struct FromDateTime64Transform
{
    static constexpr auto name = Transform::name;

    const DateTime64::NativeType scale_multiplier = 1;

    FromDateTime64Transform(UInt32 scale) /// NOLINT
        : scale_multiplier(DecimalUtils::scaleMultiplier(scale))
    {}

    auto execute(DateTime64::NativeType dt, const DateLUTImpl & time_zone) const
    {
        const auto c = DecimalUtils::splitWithScaleMultiplier(DateTime64(dt), scale_multiplier);
        return Transform::execute(static_cast(c.whole), time_zone);
    }
};

/** Conversion of DateTime64 to Date or DateTime: discards fractional part.
  */
template struct ConvertImpl : DateTimeTransformImpl> {};
template struct ConvertImpl : DateTimeTransformImpl> {};

struct ToDateTime64Transform
{
    static constexpr auto name = "toDateTime64";

    const DateTime64::NativeType scale_multiplier = 1;

    ToDateTime64Transform(UInt32 scale = 0) /// NOLINT
        : scale_multiplier(DecimalUtils::scaleMultiplier(scale))
    {}

    DateTime64::NativeType execute(UInt16 d, const DateLUTImpl & time_zone) const
    {
        const auto dt = ToDateTimeImpl::execute(d, time_zone);
        return execute(dt, time_zone);
    }

    DateTime64::NativeType execute(Int32 d, const DateLUTImpl & time_zone) const
    {
        const auto dt = ToDateTimeImpl::execute(d, time_zone);
        return DecimalUtils::decimalFromComponentsWithMultiplier(dt, 0, scale_multiplier);
    }

    DateTime64::NativeType execute(UInt32 dt, const DateLUTImpl & /*time_zone*/) const
    {
        return DecimalUtils::decimalFromComponentsWithMultiplier(dt, 0, scale_multiplier);
    }
};

/** Conversion of Date or DateTime to DateTime64: add zero sub-second part.
  */
template struct ConvertImpl : DateTimeTransformImpl {};
template struct ConvertImpl : DateTimeTransformImpl {};
template struct ConvertImpl : DateTimeTransformImpl {};

/** Transformation of numbers, dates, datetimes to strings: through formatting.
  */
template
struct FormatImpl
{
    template
    static ReturnType execute(const typename DataType::FieldType x, WriteBuffer & wb, const DataType *, const DateLUTImpl *)
    {
        writeText(x, wb);
        return ReturnType(true);
    }
};

template <>
struct FormatImpl
{
    template
    static ReturnType execute(const DataTypeDate::FieldType x, WriteBuffer & wb, const DataTypeDate *, const DateLUTImpl * time_zone)
    {
        writeDateText(DayNum(x), wb, *time_zone);
        return ReturnType(true);
    }
};

template <>
struct FormatImpl
{
    template
    static ReturnType execute(const DataTypeDate32::FieldType x, WriteBuffer & wb, const DataTypeDate32 *, const DateLUTImpl * time_zone)
    {
        writeDateText(ExtendedDayNum(x), wb, *time_zone);
        return ReturnType(true);
    }
};

template <>
struct FormatImpl
{
    template
    static ReturnType execute(const DataTypeDateTime::FieldType x, WriteBuffer & wb, const DataTypeDateTime *, const DateLUTImpl * time_zone)
    {
        writeDateTimeText(x, wb, *time_zone);
        return ReturnType(true);
    }
};

template <>
struct FormatImpl
{
    template
    static ReturnType execute(const DataTypeDateTime64::FieldType x, WriteBuffer & wb, const DataTypeDateTime64 * type, const DateLUTImpl * time_zone)
    {
        writeDateTimeText(DateTime64(x), type->getScale(), wb, *time_zone);
        return ReturnType(true);
    }
};

template
struct FormatImpl>
{
    template
    static ReturnType execute(const FieldType x, WriteBuffer & wb, const DataTypeEnum * type, const DateLUTImpl *)
    {
        static constexpr bool throw_exception = std::is_same_v;

        if constexpr (throw_exception)
        {
            writeString(type->getNameForValue(x), wb);
        }
        else
        {
            StringRef res;
            bool is_ok = type->getNameForValue(x, res);
            if (is_ok)
                writeString(res, wb);
            return ReturnType(is_ok);
        }
    }
};

template
struct FormatImpl>
{
    template
    static ReturnType execute(const FieldType x, WriteBuffer & wb, const DataTypeDecimal * type, const DateLUTImpl *)
    {
        writeText(x, type->getScale(), wb, false);
        return ReturnType(true);
    }
};

/// DataTypeEnum to DataType free conversion
template
struct ConvertImpl, DataTypeNumber, Name,
ConvertDefaultBehaviorTag> { static ColumnPtr execute(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t /*input_rows_count*/) { return arguments[0].column; } }; static inline ColumnUInt8::MutablePtr copyNullMap(ColumnPtr col) { ColumnUInt8::MutablePtr null_map = nullptr; if (const auto * col_null = checkAndGetColumn(col.get())) { null_map = ColumnUInt8::create(); null_map->insertRangeFrom(col_null->getNullMapColumn(), 0, col_null->size()); } return null_map; } template requires (!std::is_same_v) struct ConvertImpl { using FromFieldType = typename FromDataType::FieldType; using ColVecType = ColumnVectorOrDecimal; static ColumnPtr execute(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t /*input_rows_count*/) { auto datetime_arg = arguments[0]; datetime_arg.column = datetime_arg.column->convertToFullColumnIfConst(); const auto & col_with_type_and_name = columnGetNested(datetime_arg); ColumnUInt8::MutablePtr null_map = copyNullMap(datetime_arg.column); const auto & type = static_cast(*col_with_type_and_name.type); const DateLUTImpl * time_zone = nullptr; const ColumnConst * time_zone_column = nullptr; if (arguments.size() > 1) time_zone_column = checkAndGetColumnConst(arguments[1].column.get()); else if (arguments.size() == 1) time_zone = &DateLUT::instance(); if constexpr (std::is_same_v || std::is_same_v) time_zone = &DateLUT::instance(); /// For argument of Date or DateTime type, second argument with time zone could be specified. 
if constexpr (std::is_same_v || std::is_same_v) { if (time_zone_column) { auto non_null_args = createBlockWithNestedColumns(arguments); time_zone = &extractTimeZoneFromFunctionArguments(non_null_args, 1, 0); } } if (const auto col_from = checkAndGetColumn(col_with_type_and_name.column.get())) { auto col_to = ColumnString::create(); const typename ColVecType::Container & vec_from = col_from->getData(); ColumnString::Chars & data_to = col_to->getChars(); ColumnString::Offsets & offsets_to = col_to->getOffsets(); size_t size = vec_from.size(); if constexpr (std::is_same_v) data_to.resize(size * (strlen("YYYY-MM-DD") + 1)); else if constexpr (std::is_same_v) data_to.resize(size * (strlen("YYYY-MM-DD") + 1)); else if constexpr (std::is_same_v) data_to.resize(size * (strlen("YYYY-MM-DD hh:mm:ss") + 1)); else if constexpr (std::is_same_v) data_to.resize(size * (strlen("YYYY-MM-DD hh:mm:ss.") + col_from->getScale() + 1)); else data_to.resize(size * 3); /// Arbitrary offsets_to.resize(size); WriteBufferFromVector write_buffer(data_to); if (null_map) { for (size_t i = 0; i < size; ++i) { if (!time_zone_column && arguments.size() > 1) { if (!arguments[1].column.get()->getDataAt(i).toString().empty()) time_zone = &DateLUT::instance(arguments[1].column.get()->getDataAt(i).toString()); else throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Provided time zone must be non-empty and be a valid time zone"); } bool is_ok = FormatImpl::template execute(vec_from[i], write_buffer, &type, time_zone); null_map->getData()[i] |= !is_ok; writeChar(0, write_buffer); offsets_to[i] = write_buffer.count(); } } else { for (size_t i = 0; i < size; ++i) { if (!time_zone_column && arguments.size() > 1) { if (!arguments[1].column.get()->getDataAt(i).toString().empty()) time_zone = &DateLUT::instance(arguments[1].column.get()->getDataAt(i).toString()); else throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Provided time zone must be non-empty and be a valid time zone"); } 
FormatImpl::template execute(vec_from[i], write_buffer, &type, time_zone); writeChar(0, write_buffer); offsets_to[i] = write_buffer.count(); } } write_buffer.finalize(); if (null_map) return ColumnNullable::create(std::move(col_to), std::move(null_map)); return col_to; } else throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Illegal column {} of first argument of function {}", arguments[0].column->getName(), Name::name); } }; /// Generic conversion of any type to String or FixedString via serialization to text. template struct ConvertImplGenericToString { static ColumnPtr execute(const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, size_t /*input_rows_count*/) { static_assert(std::is_same_v || std::is_same_v, "Can be used only to serialize to ColumnString or ColumnFixedString"); ColumnUInt8::MutablePtr null_map = copyNullMap(arguments[0].column); const auto & col_with_type_and_name = columnGetNested(arguments[0]); const IDataType & type = *col_with_type_and_name.type; const IColumn & col_from = *col_with_type_and_name.column; size_t size = col_from.size(); auto col_to = removeNullable(result_type)->createColumn(); { ColumnStringHelpers::WriteHelper write_helper( assert_cast(*col_to), size); auto & write_buffer = write_helper.getWriteBuffer(); FormatSettings format_settings; auto serialization = type.getDefaultSerialization(); for (size_t i = 0; i < size; ++i) { serialization->serializeText(col_from, i, write_buffer, format_settings); write_helper.rowWritten(); } write_helper.finalize(); } if (result_type->isNullable() && null_map) return ColumnNullable::create(std::move(col_to), std::move(null_map)); return col_to; } }; /** Conversion of time_t to UInt16, Int32, UInt32 */ template void convertFromTime(typename DataType::FieldType & x, time_t & time) { x = time; } template <> inline void convertFromTime(DataTypeDate::FieldType & x, time_t & time) { if (unlikely(time < 0)) x = 0; else if (unlikely(time > 0xFFFF)) x = 0xFFFF; else x = time; } template 
<>
inline void convertFromTime(DataTypeDate32::FieldType & x, time_t & time)
{
    x = static_cast(time);
}

template <>
inline void convertFromTime(DataTypeDateTime::FieldType & x, time_t & time)
{
    if (unlikely(time < 0))
        x = 0;
    else if (unlikely(time > 0xFFFFFFFF))
        x = 0xFFFFFFFF;
    else
        x = static_cast(time);
}

/** Conversion of strings to numbers, dates, datetimes: through parsing.
  */
template
void parseImpl(typename DataType::FieldType & x, ReadBuffer & rb, const DateLUTImpl *, bool precise_float_parsing)
{
    if constexpr (std::is_floating_point_v)
    {
        if (precise_float_parsing)
            readFloatTextPrecise(x, rb);
        else
            readFloatTextFast(x, rb);
    }
    else
        readText(x, rb);
}

template <>
inline void parseImpl(DataTypeDate::FieldType & x, ReadBuffer & rb, const DateLUTImpl * time_zone, bool)
{
    DayNum tmp(0);
    readDateText(tmp, rb, *time_zone);
    x = tmp;
}

template <>
inline void parseImpl(DataTypeDate32::FieldType & x, ReadBuffer & rb, const DateLUTImpl * time_zone, bool)
{
    ExtendedDayNum tmp(0);
    readDateText(tmp, rb, *time_zone);
    x = tmp;
}

// NOTE: no need of extra overload of DateTime64, since readDateTimeText64 has different signature and that case is explicitly handled in the calling code.
template <>
inline void parseImpl(DataTypeDateTime::FieldType & x, ReadBuffer & rb, const DateLUTImpl * time_zone, bool)
{
    time_t time = 0;
    readDateTimeText(time, rb, *time_zone);
    convertFromTime(x, time);
}

template <>
inline void parseImpl(DataTypeUUID::FieldType & x, ReadBuffer & rb, const DateLUTImpl *, bool)
{
    UUID tmp;
    readUUIDText(tmp, rb);
    x = tmp.toUnderType();
}

template <>
inline void parseImpl(DataTypeIPv4::FieldType & x, ReadBuffer & rb, const DateLUTImpl *, bool)
{
    IPv4 tmp;
    readIPv4Text(tmp, rb);
    x = tmp.toUnderType();
}

template <>
inline void parseImpl(DataTypeIPv6::FieldType & x, ReadBuffer & rb, const DateLUTImpl *, bool)
{
    IPv6 tmp;
    readIPv6Text(tmp, rb);
    x = tmp;
}

template
bool tryParseImpl(typename DataType::FieldType & x, ReadBuffer & rb, const DateLUTImpl *, bool precise_float_parsing)
{
    if constexpr (std::is_floating_point_v)
    {
        if (precise_float_parsing)
            return tryReadFloatTextPrecise(x, rb);
        else
            return tryReadFloatTextFast(x, rb);
    }
    else /*if constexpr (is_integer_v)*/
        return tryReadIntText(x, rb);
}

template <>
inline bool tryParseImpl(DataTypeDate::FieldType & x, ReadBuffer & rb, const DateLUTImpl * time_zone, bool)
{
    DayNum tmp(0);
    if (!tryReadDateText(tmp, rb, *time_zone))
        return false;
    x = tmp;
    return true;
}

template <>
inline bool tryParseImpl(DataTypeDate32::FieldType & x, ReadBuffer & rb, const DateLUTImpl * time_zone, bool)
{
    ExtendedDayNum tmp(0);
    if (!tryReadDateText(tmp, rb, *time_zone))
        return false;
    x = tmp;
    return true;
}

template <>
inline bool tryParseImpl(DataTypeDateTime::FieldType & x, ReadBuffer & rb, const DateLUTImpl * time_zone, bool)
{
    time_t tmp = 0;
    if (!tryReadDateTimeText(tmp, rb, *time_zone))
        return false;
    x = static_cast(tmp);
    return true;
}

template <>
inline bool tryParseImpl(DataTypeUUID::FieldType & x, ReadBuffer & rb, const DateLUTImpl *, bool)
{
    UUID tmp;
    if (!tryReadUUIDText(tmp, rb))
        return false;

    x = tmp.toUnderType();
    return true;
}

template <>
inline bool tryParseImpl(DataTypeIPv4::FieldType &
    x, ReadBuffer & rb, const DateLUTImpl *, bool)
{
    IPv4 tmp;
    if (!tryReadIPv4Text(tmp, rb))
        return false;

    x = tmp.toUnderType();
    return true;
}

template <>
inline bool tryParseImpl(DataTypeIPv6::FieldType & x, ReadBuffer & rb, const DateLUTImpl *, bool)
{
    IPv6 tmp;
    if (!tryReadIPv6Text(tmp, rb))
        return false;

    x = tmp;
    return true;
}

/** Throw exception with verbose message when string value is not parsed completely.
  */
[[noreturn]] inline void throwExceptionForIncompletelyParsedValue(ReadBuffer & read_buffer, const IDataType & result_type)
{
    WriteBufferFromOwnString message_buf;
    message_buf << "Cannot parse string " << quote << String(read_buffer.buffer().begin(), read_buffer.buffer().size())
                << " as " << result_type.getName()
                << ": syntax error";

    if (read_buffer.offset())
        message_buf << " at position " << read_buffer.offset()
                    << " (parsed just " << quote << String(read_buffer.buffer().begin(), read_buffer.offset()) << ")";
    else
        message_buf << " at begin of string";

    // Currently there are no functions toIPv{4,6}Or{Null,Zero}
    if (isNativeNumber(result_type) && !(result_type.getName() == "IPv4" || result_type.getName() == "IPv6"))
        message_buf << ". Note: there are to" << result_type.getName() << "OrZero and to" << result_type.getName() << "OrNull functions, which returns zero/NULL instead of throwing exception.";

    throw Exception(PreformattedMessage{message_buf.str(), "Cannot parse string {} as {}: syntax error {}"}, ErrorCodes::CANNOT_PARSE_TEXT);
}

enum class ConvertFromStringExceptionMode
{
    Throw,  /// Throw exception if value cannot be parsed.
    Zero,   /// Fill with zero or default if value cannot be parsed.
    Null    /// Return ColumnNullable with NULLs when value cannot be parsed.
};

enum class ConvertFromStringParsingMode
{
    Normal,
    BestEffort,  /// Only applicable for DateTime. Will use sophisticated method, that is slower.
BestEffortUS }; template struct ConvertThroughParsing { static_assert(std::is_same_v || std::is_same_v, "ConvertThroughParsing is only applicable for String or FixedString data types"); static constexpr bool to_datetime64 = std::is_same_v; static bool isAllRead(ReadBuffer & in) { /// In case of FixedString, skip zero bytes at end. if constexpr (std::is_same_v) while (!in.eof() && *in.position() == 0) ++in.position(); if (in.eof()) return true; /// Special case, that allows to parse string with DateTime or DateTime64 as Date or Date32. if constexpr (std::is_same_v || std::is_same_v) { if (!in.eof() && (*in.position() == ' ' || *in.position() == 'T')) { if (in.buffer().size() == strlen("YYYY-MM-DD hh:mm:ss")) return true; if (in.buffer().size() >= strlen("YYYY-MM-DD hh:mm:ss.x") && in.buffer().begin()[19] == '.') { in.position() = in.buffer().begin() + 20; while (!in.eof() && isNumericASCII(*in.position())) ++in.position(); if (in.eof()) return true; } } } return false; } template static ColumnPtr execute(const ColumnsWithTypeAndName & arguments, const DataTypePtr & res_type, size_t input_rows_count, Additions additions [[maybe_unused]] = Additions()) { using ColVecTo = typename ToDataType::ColumnType; const DateLUTImpl * local_time_zone [[maybe_unused]] = nullptr; const DateLUTImpl * utc_time_zone [[maybe_unused]] = nullptr; /// For conversion to Date or DateTime type, second argument with time zone could be specified. if constexpr (std::is_same_v || to_datetime64) { const auto result_type = removeNullable(res_type); // Time zone is already figured out during result type resolution, no need to do it here. 
if (const auto dt_col = checkAndGetDataType(result_type.get())) local_time_zone = &dt_col->getTimeZone(); else local_time_zone = &extractTimeZoneFromFunctionArguments(arguments, 1, 0); if constexpr (parsing_mode == ConvertFromStringParsingMode::BestEffort || parsing_mode == ConvertFromStringParsingMode::BestEffortUS) utc_time_zone = &DateLUT::instance("UTC"); } else if constexpr (std::is_same_v || std::is_same_v) { // Timezone is more or less dummy when parsing Date/Date32 from string. local_time_zone = &DateLUT::instance(); utc_time_zone = &DateLUT::instance("UTC"); } const IColumn * col_from = arguments[0].column.get(); const ColumnString * col_from_string = checkAndGetColumn(col_from); const ColumnFixedString * col_from_fixed_string = checkAndGetColumn(col_from); if (std::is_same_v && !col_from_string) throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Illegal column {} of first argument of function {}", col_from->getName(), Name::name); if (std::is_same_v && !col_from_fixed_string) throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Illegal column {} of first argument of function {}", col_from->getName(), Name::name); size_t size = input_rows_count; typename ColVecTo::MutablePtr col_to = nullptr; if constexpr (IsDataTypeDecimal) { UInt32 scale = additions; if constexpr (to_datetime64) { ToDataType check_bounds_in_ctor(scale, local_time_zone ? 
local_time_zone->getTimeZone() : String{}); } else { ToDataType check_bounds_in_ctor(ToDataType::maxPrecision(), scale); } col_to = ColVecTo::create(size, scale); } else col_to = ColVecTo::create(size); typename ColVecTo::Container & vec_to = col_to->getData(); ColumnUInt8::MutablePtr col_null_map_to; ColumnUInt8::Container * vec_null_map_to [[maybe_unused]] = nullptr; if constexpr (exception_mode == ConvertFromStringExceptionMode::Null) { col_null_map_to = ColumnUInt8::create(size); vec_null_map_to = &col_null_map_to->getData(); } const ColumnString::Chars * chars = nullptr; const IColumn::Offsets * offsets = nullptr; size_t fixed_string_size = 0; if constexpr (std::is_same_v) { chars = &col_from_string->getChars(); offsets = &col_from_string->getOffsets(); } else { chars = &col_from_fixed_string->getChars(); fixed_string_size = col_from_fixed_string->getN(); } size_t current_offset = 0; bool precise_float_parsing = false; if (DB::CurrentThread::isInitialized()) { const DB::ContextPtr query_context = DB::CurrentThread::get().getQueryContext(); if (query_context) precise_float_parsing = query_context->getSettingsRef().precise_float_parsing; } for (size_t i = 0; i < size; ++i) { size_t next_offset = std::is_same_v ? (*offsets)[i] : (current_offset + fixed_string_size); size_t string_size = std::is_same_v ? 
next_offset - current_offset - 1 : fixed_string_size; ReadBufferFromMemory read_buffer(&(*chars)[current_offset], string_size); if constexpr (exception_mode == ConvertFromStringExceptionMode::Throw) { if constexpr (parsing_mode == ConvertFromStringParsingMode::BestEffort) { if constexpr (to_datetime64) { DateTime64 res = 0; parseDateTime64BestEffort(res, col_to->getScale(), read_buffer, *local_time_zone, *utc_time_zone); vec_to[i] = res; } else { time_t res; parseDateTimeBestEffort(res, read_buffer, *local_time_zone, *utc_time_zone); convertFromTime(vec_to[i], res); } } else if constexpr (parsing_mode == ConvertFromStringParsingMode::BestEffortUS) { if constexpr (to_datetime64) { DateTime64 res = 0; parseDateTime64BestEffortUS(res, col_to->getScale(), read_buffer, *local_time_zone, *utc_time_zone); vec_to[i] = res; } else { time_t res; parseDateTimeBestEffortUS(res, read_buffer, *local_time_zone, *utc_time_zone); convertFromTime(vec_to[i], res); } } else { if constexpr (to_datetime64) { DateTime64 value = 0; readDateTime64Text(value, col_to->getScale(), read_buffer, *local_time_zone); vec_to[i] = value; } else if constexpr (IsDataTypeDecimal) { SerializationDecimal::readText( vec_to[i], read_buffer, ToDataType::maxPrecision(), col_to->getScale()); } else { /// we want to utilize constexpr condition here, which is not mixable with value comparison do { if constexpr (std::is_same_v && std::is_same_v) { if (fixed_string_size == IPV6_BINARY_LENGTH) { readBinary(vec_to[i], read_buffer); break; } } parseImpl(vec_to[i], read_buffer, local_time_zone, precise_float_parsing); } while (false); } } if (!isAllRead(read_buffer)) throwExceptionForIncompletelyParsedValue(read_buffer, *res_type); } else { bool parsed; if constexpr (parsing_mode == ConvertFromStringParsingMode::BestEffort) { if constexpr (to_datetime64) { DateTime64 res = 0; parsed = tryParseDateTime64BestEffort(res, col_to->getScale(), read_buffer, *local_time_zone, *utc_time_zone); vec_to[i] = res; } else { time_t 
res; parsed = tryParseDateTimeBestEffort(res, read_buffer, *local_time_zone, *utc_time_zone); convertFromTime(vec_to[i],res); } } else if constexpr (parsing_mode == ConvertFromStringParsingMode::BestEffortUS) { if constexpr (to_datetime64) { DateTime64 res = 0; parsed = tryParseDateTime64BestEffortUS(res, col_to->getScale(), read_buffer, *local_time_zone, *utc_time_zone); vec_to[i] = res; } else { time_t res; parsed = tryParseDateTimeBestEffortUS(res, read_buffer, *local_time_zone, *utc_time_zone); convertFromTime(vec_to[i],res); } } else { if constexpr (to_datetime64) { DateTime64 value = 0; parsed = tryReadDateTime64Text(value, col_to->getScale(), read_buffer, *local_time_zone); vec_to[i] = value; } else if constexpr (IsDataTypeDecimal) { parsed = SerializationDecimal::tryReadText( vec_to[i], read_buffer, ToDataType::maxPrecision(), col_to->getScale()); } else { /// we want to utilize constexpr condition here, which is not mixable with value comparison do { if constexpr (std::is_same_v && std::is_same_v) { if (fixed_string_size == IPV6_BINARY_LENGTH) { readBinary(vec_to[i], read_buffer); parsed = true; break; } } parsed = tryParseImpl(vec_to[i], read_buffer, local_time_zone, precise_float_parsing); } while (false); } } if (!isAllRead(read_buffer)) parsed = false; if (!parsed) { if constexpr (std::is_same_v) { vec_to[i] = -static_cast(DateLUT::instance().getDayNumOffsetEpoch()); } else { vec_to[i] = static_cast(0); } } if constexpr (exception_mode == ConvertFromStringExceptionMode::Null) (*vec_null_map_to)[i] = !parsed; } current_offset = next_offset; } if constexpr (exception_mode == ConvertFromStringExceptionMode::Null) return ColumnNullable::create(std::move(col_to), std::move(col_null_map_to)); else return col_to; } }; template requires (!std::is_same_v) struct ConvertImpl : ConvertThroughParsing {}; template requires (!std::is_same_v) struct ConvertImpl : ConvertThroughParsing {}; template requires (!std::is_same_v) struct ConvertImpl : ConvertThroughParsing 
{}; template requires (!std::is_same_v) struct ConvertImpl : ConvertThroughParsing {}; template requires (is_any_of && is_any_of) struct ConvertImpl : ConvertThroughParsing {}; /// Generic conversion of any type from String. Used for complex types: Array and Tuple or types with custom serialization. template struct ConvertImplGenericFromString { static ColumnPtr execute(ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, const ColumnNullable * column_nullable, size_t input_rows_count) { static_assert(std::is_same_v || std::is_same_v, "Can be used only to parse from ColumnString or ColumnFixedString"); const IColumn & column_from = *arguments[0].column; const IDataType & data_type_to = *result_type; auto res = data_type_to.createColumn(); auto serialization = data_type_to.getDefaultSerialization(); const auto * null_map = column_nullable ? &column_nullable->getNullMapData() : nullptr; executeImpl(column_from, *res, *serialization, input_rows_count, null_map, result_type.get()); return res; } static void executeImpl( const IColumn & column_from, IColumn & column_to, const ISerialization & serialization_from, size_t input_rows_count, const PaddedPODArray * null_map = nullptr, const IDataType * result_type = nullptr) { static_assert(std::is_same_v || std::is_same_v, "Can be used only to parse from ColumnString or ColumnFixedString"); if (const StringColumnType * col_from_string = checkAndGetColumn(&column_from)) { column_to.reserve(input_rows_count); FormatSettings format_settings; for (size_t i = 0; i < input_rows_count; ++i) { if (null_map && (*null_map)[i]) { column_to.insertDefault(); continue; } const auto & val = col_from_string->getDataAt(i); ReadBufferFromMemory read_buffer(val.data, val.size); serialization_from.deserializeWholeText(column_to, read_buffer, format_settings); if (!read_buffer.eof()) { if (result_type) throwExceptionForIncompletelyParsedValue(read_buffer, *result_type); else throw Exception(ErrorCodes::CANNOT_PARSE_TEXT, "Cannot 
parse string to column {}. Expected eof", column_to.getName()); } } } else throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Illegal column {} of first argument of conversion function from string", column_from.getName()); } }; template <> struct ConvertImpl : ConvertImpl {}; template <> struct ConvertImpl : ConvertImpl {}; /** If types are identical, just take reference to column. */ template requires (!T::is_parametric) struct ConvertImpl { template static ColumnPtr execute(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t /*input_rows_count*/, Additions additions [[maybe_unused]] = Additions()) { return arguments[0].column; } }; template struct ConvertImpl { template static ColumnPtr execute(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t /*input_rows_count*/, Additions additions [[maybe_unused]] = Additions()) { return arguments[0].column; } }; /** Conversion from FixedString to String. * Cutting sequences of zero bytes from end of strings. */ template struct ConvertImpl { static ColumnPtr execute(const ColumnsWithTypeAndName & arguments, const DataTypePtr & return_type, size_t /*input_rows_count*/) { ColumnUInt8::MutablePtr null_map = copyNullMap(arguments[0].column); const auto & nested = columnGetNested(arguments[0]); if (const ColumnFixedString * col_from = checkAndGetColumn(nested.column.get())) { auto col_to = ColumnString::create(); const ColumnFixedString::Chars & data_from = col_from->getChars(); ColumnString::Chars & data_to = col_to->getChars(); ColumnString::Offsets & offsets_to = col_to->getOffsets(); size_t size = col_from->size(); size_t n = col_from->getN(); data_to.resize(size * (n + 1)); /// + 1 - zero terminator offsets_to.resize(size); size_t offset_from = 0; size_t offset_to = 0; for (size_t i = 0; i < size; ++i) { if (!null_map || !null_map->getData()[i]) { size_t bytes_to_copy = n; while (bytes_to_copy > 0 && data_from[offset_from + bytes_to_copy - 1] == 0) --bytes_to_copy; memcpy(&data_to[offset_to], 
&data_from[offset_from], bytes_to_copy); offset_to += bytes_to_copy; } data_to[offset_to] = 0; ++offset_to; offsets_to[i] = offset_to; offset_from += n; } data_to.resize(offset_to); if (return_type->isNullable() && null_map) return ColumnNullable::create(std::move(col_to), std::move(null_map)); return col_to; } else throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Illegal column {} of first argument of function {}", arguments[0].column->getName(), Name::name); } }; /// Declared early because used below. struct NameToDate { static constexpr auto name = "toDate"; }; struct NameToDate32 { static constexpr auto name = "toDate32"; }; struct NameToDateTime { static constexpr auto name = "toDateTime"; }; struct NameToDateTime32 { static constexpr auto name = "toDateTime32"; }; struct NameToDateTime64 { static constexpr auto name = "toDateTime64"; }; struct NameToString { static constexpr auto name = "toString"; }; struct NameToDecimal32 { static constexpr auto name = "toDecimal32"; }; struct NameToDecimal64 { static constexpr auto name = "toDecimal64"; }; struct NameToDecimal128 { static constexpr auto name = "toDecimal128"; }; struct NameToDecimal256 { static constexpr auto name = "toDecimal256"; }; #define DEFINE_NAME_TO_INTERVAL(INTERVAL_KIND) \ struct NameToInterval ## INTERVAL_KIND \ { \ static constexpr auto name = "toInterval" #INTERVAL_KIND; \ static constexpr auto kind = IntervalKind::INTERVAL_KIND; \ }; DEFINE_NAME_TO_INTERVAL(Nanosecond) DEFINE_NAME_TO_INTERVAL(Microsecond) DEFINE_NAME_TO_INTERVAL(Millisecond) DEFINE_NAME_TO_INTERVAL(Second) DEFINE_NAME_TO_INTERVAL(Minute) DEFINE_NAME_TO_INTERVAL(Hour) DEFINE_NAME_TO_INTERVAL(Day) DEFINE_NAME_TO_INTERVAL(Week) DEFINE_NAME_TO_INTERVAL(Month) DEFINE_NAME_TO_INTERVAL(Quarter) DEFINE_NAME_TO_INTERVAL(Year) #undef DEFINE_NAME_TO_INTERVAL struct NameParseDateTimeBestEffort; struct NameParseDateTimeBestEffortOrZero; struct NameParseDateTimeBestEffortOrNull; template static inline bool isDateTime64(const 
ColumnsWithTypeAndName & arguments) { if constexpr (std::is_same_v) return true; else if constexpr (std::is_same_v || std::is_same_v || std::is_same_v || std::is_same_v) { return (arguments.size() == 2 && isUnsignedInteger(arguments[1].type)) || arguments.size() == 3; } return false; } template class FunctionConvert : public IFunction { public: using Monotonic = MonotonicityImpl; static constexpr auto name = Name::name; static constexpr bool to_decimal = std::is_same_v || std::is_same_v || std::is_same_v || std::is_same_v; static constexpr bool to_datetime64 = std::is_same_v; static constexpr bool to_string_or_fixed_string = std::is_same_v || std::is_same_v; static constexpr bool to_date_or_datetime = std::is_same_v || std::is_same_v || std::is_same_v; static FunctionPtr create(ContextPtr context) { return std::make_shared(context); } static FunctionPtr create() { return std::make_shared(); } FunctionConvert() = default; explicit FunctionConvert(ContextPtr context_) : context(context_) {} String getName() const override { return name; } bool isVariadic() const override { return true; } size_t getNumberOfArguments() const override { return 0; } bool isInjective(const ColumnsWithTypeAndName &) const override { return std::is_same_v; } bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & arguments) const override { /// TODO: We can make more optimizations here. 
return !(to_date_or_datetime && isNumber(*arguments[0].type)); } using DefaultReturnTypeGetter = std::function; static DataTypePtr getReturnTypeDefaultImplementationForNulls(const ColumnsWithTypeAndName & arguments, const DefaultReturnTypeGetter & getter) { NullPresence null_presence = getNullPresense(arguments); if (null_presence.has_null_constant) { return makeNullable(std::make_shared()); } if (null_presence.has_nullable) { auto nested_columns = Block(createBlockWithNestedColumns(arguments)); auto return_type = getter(ColumnsWithTypeAndName(nested_columns.begin(), nested_columns.end())); return makeNullable(return_type); } return getter(arguments); } DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override { auto getter = [&] (const auto & args) { return getReturnTypeImplRemovedNullable(args); }; auto res = getReturnTypeDefaultImplementationForNulls(arguments, getter); to_nullable = res->isNullable(); checked_return_type = true; return res; } DataTypePtr getReturnTypeImplRemovedNullable(const ColumnsWithTypeAndName & arguments) const { FunctionArgumentDescriptors mandatory_args = {{"Value", nullptr, nullptr, nullptr}}; FunctionArgumentDescriptors optional_args; if constexpr (to_decimal) { mandatory_args.push_back({"scale", &isNativeInteger, &isColumnConst, "const Integer"}); } if (!to_decimal && isDateTime64(arguments)) { mandatory_args.push_back({"scale", &isNativeInteger, &isColumnConst, "const Integer"}); } // toString(DateTime or DateTime64, [timezone: String]) if ((std::is_same_v && !arguments.empty() && (isDateTime64(arguments[0].type) || isDateTime(arguments[0].type))) // toUnixTimestamp(value[, timezone : String]) || std::is_same_v // toDate(value[, timezone : String]) || std::is_same_v // TODO: shall we allow timestamp argument for toDate? DateTime knows nothing about timezones and this argument is ignored below. 
// toDate32(value[, timezone : String]) || std::is_same_v // toDateTime(value[, timezone: String]) || std::is_same_v // toDateTime64(value, scale : Integer[, timezone: String]) || std::is_same_v) { optional_args.push_back({"timezone", &isString, nullptr, "String"}); } validateFunctionArgumentTypes(*this, arguments, mandatory_args, optional_args); if constexpr (std::is_same_v) { return std::make_shared(Name::kind); } else if constexpr (to_decimal) { UInt64 scale = extractToDecimalScale(arguments[1]); if constexpr (std::is_same_v) return createDecimalMaxPrecision(scale); else if constexpr (std::is_same_v) return createDecimalMaxPrecision(scale); else if constexpr (std::is_same_v) return createDecimalMaxPrecision(scale); else if constexpr (std::is_same_v) return createDecimalMaxPrecision(scale); throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected branch in code of conversion function: it is a bug."); } else { // Optional second argument with time zone for DateTime. UInt8 timezone_arg_position = 1; UInt32 scale [[maybe_unused]] = DataTypeDateTime64::default_scale; // DateTime64 requires more arguments: scale and timezone. Since timezone is optional, scale should be first. 
if (isDateTime64(arguments)) { timezone_arg_position += 1; scale = static_cast(arguments[1].column->get64(0)); if (to_datetime64 || scale != 0) /// toDateTime('xxxx-xx-xx xx:xx:xx', 0) return DateTime return std::make_shared(scale, extractTimeZoneNameFromFunctionArguments(arguments, timezone_arg_position, 0, false)); return std::make_shared(extractTimeZoneNameFromFunctionArguments(arguments, timezone_arg_position, 0, false)); } if constexpr (std::is_same_v) return std::make_shared(extractTimeZoneNameFromFunctionArguments(arguments, timezone_arg_position, 0, false)); else if constexpr (std::is_same_v) throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected branch in code of conversion function: it is a bug."); else return std::make_shared(); } } /// Function actually uses default implementation for nulls, /// but we need to know if return type is Nullable or not, /// so we use checked_return_type only to intercept the first call to getReturnTypeImpl(...). bool useDefaultImplementationForNulls() const override { bool to_nullable_string = to_nullable && std::is_same_v; return checked_return_type && !to_nullable_string; } bool useDefaultImplementationForConstants() const override { return true; } ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { if constexpr (std::is_same_v) return {}; else if constexpr (std::is_same_v) return {2}; return {1}; } bool canBeExecutedOnDefaultArguments() const override { return false; } ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, size_t input_rows_count) const override { try { return executeInternal(arguments, result_type, input_rows_count); } catch (Exception & e) { /// More convenient error message. 
if (e.code() == ErrorCodes::ATTEMPT_TO_READ_AFTER_EOF) { e.addMessage("Cannot parse " + result_type->getName() + " from " + arguments[0].type->getName() + ", because value is too short"); } else if (e.code() == ErrorCodes::CANNOT_PARSE_NUMBER || e.code() == ErrorCodes::CANNOT_READ_ARRAY_FROM_TEXT || e.code() == ErrorCodes::CANNOT_PARSE_INPUT_ASSERTION_FAILED || e.code() == ErrorCodes::CANNOT_PARSE_QUOTED_STRING || e.code() == ErrorCodes::CANNOT_PARSE_ESCAPE_SEQUENCE || e.code() == ErrorCodes::CANNOT_PARSE_DATE || e.code() == ErrorCodes::CANNOT_PARSE_DATETIME || e.code() == ErrorCodes::CANNOT_PARSE_UUID || e.code() == ErrorCodes::CANNOT_PARSE_IPV4 || e.code() == ErrorCodes::CANNOT_PARSE_IPV6) { e.addMessage("Cannot parse " + result_type->getName() + " from " + arguments[0].type->getName()); } throw; } } bool hasInformationAboutMonotonicity() const override { return Monotonic::has(); } Monotonicity getMonotonicityForRange(const IDataType & type, const Field & left, const Field & right) const override { return Monotonic::get(type, left, right); } private: ContextPtr context; mutable bool checked_return_type = false; mutable bool to_nullable = false; ColumnPtr executeInternal(const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, size_t input_rows_count) const { if (arguments.empty()) throw Exception(ErrorCodes::TOO_FEW_ARGUMENTS_FOR_FUNCTION, "Function {} expects at least 1 argument", getName()); if (result_type->onlyNull()) return result_type->createColumnConstWithDefaultValue(input_rows_count); const DataTypePtr from_type = removeNullable(arguments[0].type); ColumnPtr result_column; auto call = [&](const auto & types, const auto & tag) -> bool { using Types = std::decay_t; using LeftDataType = typename Types::LeftType; using RightDataType = typename Types::RightType; using SpecialTag = std::decay_t; if constexpr (IsDataTypeDecimal) { if constexpr (std::is_same_v) { /// Account for optional timezone argument. 
if (arguments.size() != 2 && arguments.size() != 3) throw Exception(ErrorCodes::TOO_FEW_ARGUMENTS_FOR_FUNCTION, "Function {} expects 2 or 3 arguments for DataTypeDateTime64.", getName()); } else if (arguments.size() != 2) { throw Exception(ErrorCodes::TOO_FEW_ARGUMENTS_FOR_FUNCTION, "Function {} expects 2 arguments for Decimal.", getName()); } const ColumnWithTypeAndName & scale_column = arguments[1]; UInt32 scale = extractToDecimalScale(scale_column); result_column = ConvertImpl::execute(arguments, result_type, input_rows_count, scale); } else if constexpr (IsDataTypeDateOrDateTime && std::is_same_v) { const auto * dt64 = assert_cast(arguments[0].type.get()); result_column = ConvertImpl::execute(arguments, result_type, input_rows_count, dt64->getScale()); } else if constexpr (IsDataTypeDecimalOrNumber && IsDataTypeDecimalOrNumber) { using LeftT = typename LeftDataType::FieldType; using RightT = typename RightDataType::FieldType; static constexpr bool bad_left = is_decimal || std::is_floating_point_v || is_big_int_v || is_signed_v; static constexpr bool bad_right = is_decimal || std::is_floating_point_v || is_big_int_v || is_signed_v; /// Disallow int vs UUID conversion (but support int vs UInt128 conversion) if constexpr ((bad_left && std::is_same_v) || (bad_right && std::is_same_v)) { throw Exception(ErrorCodes::CANNOT_CONVERT_TYPE, "Wrong UUID conversion"); } else { result_column = ConvertImpl::execute(arguments, result_type, input_rows_count); } } else { result_column = ConvertImpl::execute(arguments, result_type, input_rows_count); } return true; }; if (isDateTime64(arguments)) { /// For toDateTime('xxxx-xx-xx xx:xx:xx.00', 2[, 'timezone']) we need to convert it to DateTime64 const ColumnWithTypeAndName & scale_column = arguments[1]; UInt32 scale = extractToDecimalScale(scale_column); if (to_datetime64 || scale != 0) /// When scale = 0, the data type is DateTime; otherwise the data type is DateTime64 { if (!callOnIndexAndDataType(from_type->getTypeId(), call, 
ConvertDefaultBehaviorTag{})) throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type {} of argument of function {}", arguments[0].type->getName(), getName()); return result_column; } } if constexpr (std::is_same_v) { if (from_type->getCustomSerialization()) return ConvertImplGenericToString::execute(arguments, result_type, input_rows_count); } bool done = false; if constexpr (to_string_or_fixed_string) { done = callOnIndexAndDataType(from_type->getTypeId(), call, ConvertDefaultBehaviorTag{}); } else { bool cast_ipv4_ipv6_default_on_conversion_error = false; if constexpr (is_any_of) if (context && (cast_ipv4_ipv6_default_on_conversion_error = context->getSettingsRef().cast_ipv4_ipv6_default_on_conversion_error)) done = callOnIndexAndDataType(from_type->getTypeId(), call, ConvertReturnZeroOnErrorTag{}); if (!cast_ipv4_ipv6_default_on_conversion_error) { /// We should use ConvertFromStringExceptionMode::Null mode when converting from String (or FixedString) /// to Nullable type, to avoid 'value is too short' error on attempt to parse empty string from NULL values. if (to_nullable && WhichDataType(from_type).isStringOrFixedString()) done = callOnIndexAndDataType(from_type->getTypeId(), call, ConvertReturnNullOnErrorTag{}); else done = callOnIndexAndDataType(from_type->getTypeId(), call, ConvertDefaultBehaviorTag{}); } } if (!done) { /// Generic conversion of any type to String. if (std::is_same_v) { return ConvertImplGenericToString::execute(arguments, result_type, input_rows_count); } else throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type {} of argument of function {}", arguments[0].type->getName(), getName()); } return result_column; } }; /** Function toTOrZero (where T is a number, date or datetime type): * try to convert from String to type T through parsing, * if the value cannot be parsed, return the default value instead of throwing an exception. * Function toTOrNull will return a Nullable type with NULL when the value cannot be parsed. 
* NOTE Also need to implement tryToUnixTimestamp with timezone. */ template class FunctionConvertFromString : public IFunction { public: static constexpr auto name = Name::name; static constexpr bool to_decimal = std::is_same_v> || std::is_same_v> || std::is_same_v> || std::is_same_v>; static constexpr bool to_datetime64 = std::is_same_v; static FunctionPtr create(ContextPtr) { return std::make_shared(); } static FunctionPtr create() { return std::make_shared(); } String getName() const override { return name; } bool isVariadic() const override { return true; } bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; } size_t getNumberOfArguments() const override { return 0; } bool useDefaultImplementationForConstants() const override { return true; } bool canBeExecutedOnDefaultArguments() const override { return false; } ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {1}; } DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override { DataTypePtr res; if (isDateTime64(arguments)) { validateFunctionArgumentTypes(*this, arguments, FunctionArgumentDescriptors{{"string", &isStringOrFixedString, nullptr, "String or FixedString"}}, // optional FunctionArgumentDescriptors{ {"precision", &isUInt8, isColumnConst, "const UInt8"}, {"timezone", &isStringOrFixedString, isColumnConst, "const String or FixedString"}, }); UInt64 scale = to_datetime64 ? DataTypeDateTime64::default_scale : 0; if (arguments.size() > 1) scale = extractToDecimalScale(arguments[1]); const auto timezone = extractTimeZoneNameFromFunctionArguments(arguments, 2, 0, false); res = scale == 0 ? 
res = std::make_shared(timezone) : std::make_shared(scale, timezone); } else { if ((arguments.size() != 1 && arguments.size() != 2) || (to_decimal && arguments.size() != 2)) throw Exception(ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH, "Number of arguments for function {} doesn't match: passed {}, should be 1 or 2. " "Second argument only make sense for DateTime (time zone, optional) and Decimal (scale).", getName(), arguments.size()); if (!isStringOrFixedString(arguments[0].type)) { if (this->getName().find("OrZero") != std::string::npos || this->getName().find("OrNull") != std::string::npos) throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type {} of first argument of function {}. " "Conversion functions with postfix 'OrZero' or 'OrNull' should take String argument", arguments[0].type->getName(), getName()); else throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type {} of first argument of function {}", arguments[0].type->getName(), getName()); } if (arguments.size() == 2) { if constexpr (std::is_same_v) { if (!isString(arguments[1].type)) throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type {} of 2nd argument of function {}", arguments[1].type->getName(), getName()); } else if constexpr (to_decimal) { if (!isInteger(arguments[1].type)) throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type {} of 2nd argument of function {}", arguments[1].type->getName(), getName()); if (!arguments[1].column) throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Second argument for function {} must be constant", getName()); } else { throw Exception(ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH, "Number of arguments for function {} doesn't match: passed {}, should be 1. 
" "Second argument makes sense only for DateTime and Decimal.", getName(), arguments.size()); } } if constexpr (std::is_same_v) res = std::make_shared(extractTimeZoneNameFromFunctionArguments(arguments, 1, 0, false)); else if constexpr (std::is_same_v) throw Exception(ErrorCodes::LOGICAL_ERROR, "LOGICAL ERROR: It is a bug."); else if constexpr (to_decimal) { UInt64 scale = extractToDecimalScale(arguments[1]); res = createDecimalMaxPrecision(scale); if (!res) throw Exception(ErrorCodes::LOGICAL_ERROR, "Something wrong with toDecimalNNOrZero() or toDecimalNNOrNull()"); } else res = std::make_shared(); } if constexpr (exception_mode == ConvertFromStringExceptionMode::Null) res = std::make_shared(res); return res; } template ColumnPtr executeInternal(const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, size_t input_rows_count, UInt32 scale = 0) const { const IDataType * from_type = arguments[0].type.get(); if (checkAndGetDataType(from_type)) { return ConvertThroughParsing::execute( arguments, result_type, input_rows_count, scale); } else if (checkAndGetDataType(from_type)) { return ConvertThroughParsing::execute( arguments, result_type, input_rows_count, scale); } return nullptr; } ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, size_t input_rows_count) const override { ColumnPtr result_column; if constexpr (to_decimal) result_column = executeInternal(arguments, result_type, input_rows_count, assert_cast(*removeNullable(result_type)).getScale()); else { if (isDateTime64(arguments)) { UInt64 scale = to_datetime64 ? 
DataTypeDateTime64::default_scale : 0; if (arguments.size() > 1) scale = extractToDecimalScale(arguments[1]); if (scale == 0) result_column = executeInternal(arguments, result_type, input_rows_count); else { result_column = executeInternal(arguments, result_type, input_rows_count, static_cast(scale)); } } else { result_column = executeInternal(arguments, result_type, input_rows_count); } } if (!result_column) throw Exception(ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT, "Illegal type {} of argument of function {}. " "Only String or FixedString argument is accepted for try-conversion function. For other arguments, " "use function without 'orZero' or 'orNull'.", arguments[0].type->getName(), getName()); return result_column; } }; /// Monotonicity. struct PositiveMonotonicity { static bool has() { return true; } static IFunction::Monotonicity get(const IDataType &, const Field &, const Field &) { return { .is_monotonic = true }; } }; struct UnknownMonotonicity { static bool has() { return false; } static IFunction::Monotonicity get(const IDataType &, const Field &, const Field &) { return { }; } }; template struct ToNumberMonotonicity { static bool has() { return true; } static UInt64 divideByRangeOfType(UInt64 x) { if constexpr (sizeof(T) < sizeof(UInt64)) return x >> (sizeof(T) * 8); else return 0; } static IFunction::Monotonicity get(const IDataType & type, const Field & left, const Field & right) { if (!type.isValueRepresentedByNumber()) return {}; /// If type is same, the conversion is always monotonic. /// (Enum has separate case, because it is different data type) if (checkAndGetDataType>(&type) || checkAndGetDataType>(&type)) return { .is_monotonic = true, .is_always_monotonic = true }; /// Float cases. /// When converting to Float, the conversion is always monotonic. 
if constexpr (std::is_floating_point_v) return { .is_monotonic = true, .is_always_monotonic = true }; const auto * low_cardinality = typeid_cast(&type); const IDataType * low_cardinality_dictionary_type = nullptr; if (low_cardinality) low_cardinality_dictionary_type = low_cardinality->getDictionaryType().get(); WhichDataType which_type(type); WhichDataType which_inner_type = low_cardinality ? WhichDataType(low_cardinality_dictionary_type) : WhichDataType(type); /// If converting from Float, for monotonicity, arguments must fit in the range of the result type. if (which_inner_type.isFloat()) { if (left.isNull() || right.isNull()) return {}; Float64 left_float = left.get(); Float64 right_float = right.get(); if (left_float >= static_cast(std::numeric_limits::min()) && left_float <= static_cast(std::numeric_limits::max()) && right_float >= static_cast(std::numeric_limits::min()) && right_float <= static_cast(std::numeric_limits::max())) return { .is_monotonic = true }; return {}; } /// Integer cases. /// Only support types represented by native integers. /// It can be extended to big integers, decimals and DateTime64 later. /// Note that NULLs represent unbounded ranges. if (!((left.isNull() || left.getType() == Field::Types::UInt64 || left.getType() == Field::Types::Int64) && (right.isNull() || right.getType() == Field::Types::UInt64 || right.getType() == Field::Types::Int64))) return {}; const bool from_is_unsigned = type.isValueRepresentedByUnsignedInteger(); const bool to_is_unsigned = is_unsigned_v; const size_t size_of_from = type.getSizeOfValueInMemory(); const size_t size_of_to = sizeof(T); const bool left_in_first_half = left.isNull() ? from_is_unsigned : (left.get() >= 0); const bool right_in_first_half = right.isNull() ? !from_is_unsigned : (right.get() >= 0); /// Size of type is the same. 
        if (size_of_from == size_of_to)
        {
            if (from_is_unsigned == to_is_unsigned)
                return { .is_monotonic = true, .is_always_monotonic = true };

            if (left_in_first_half == right_in_first_half)
                return { .is_monotonic = true };

            return {};
        }

        /// Size of type is expanded.
        if (size_of_from < size_of_to)
        {
            if (from_is_unsigned == to_is_unsigned)
                return { .is_monotonic = true, .is_always_monotonic = true };

            if (!to_is_unsigned)
                return { .is_monotonic = true, .is_always_monotonic = true };

            /// signed -> unsigned. If both arguments are from the same half, the function is monotonic.
            if (left_in_first_half == right_in_first_half)
                return { .is_monotonic = true };

            return {};
        }

        /// Size of type is shrunk.
        if (size_of_from > size_of_to)
        {
            /// Function cannot be monotonic on unbounded ranges.
            if (left.isNull() || right.isNull())
                return {};

            /// Function cannot be monotonic when left and right are not in the same range.
            if (divideByRangeOfType(left.get<UInt64>()) != divideByRangeOfType(right.get<UInt64>()))
                return {};

            if (to_is_unsigned)
                return { .is_monotonic = true };
            else
            {
                // If To is signed, it's possible that the signedness is different after conversion. So we check it explicitly.
                const bool is_monotonic = (T(left.get<UInt64>()) >= 0) == (T(right.get<UInt64>()) >= 0);

                return { .is_monotonic = is_monotonic };
            }
        }

        UNREACHABLE();
    }
};

struct ToDateMonotonicity
{
    static bool has() { return true; }

    static IFunction::Monotonicity get(const IDataType & type, const Field & left, const Field & right)
    {
        auto which = WhichDataType(type);
        if (which.isDateOrDate32() || which.isDateTime() || which.isDateTime64() || which.isInt8() || which.isInt16() || which.isUInt8()
            || which.isUInt16())
        {
            return { .is_monotonic = true, .is_always_monotonic = true };
        }
        else if (
            ((left.getType() == Field::Types::UInt64 || left.isNull()) && (right.getType() == Field::Types::UInt64 || right.isNull())
            && ((left.isNull() || left.get<UInt64>() < 0xFFFF) && (right.isNull() || right.get<UInt64>() >= 0xFFFF)))
            || ((left.getType() == Field::Types::Int64 || left.isNull()) && (right.getType() == Field::Types::Int64 || right.isNull())
            && ((left.isNull() || left.get<Int64>() < 0xFFFF) && (right.isNull() || right.get<Int64>() >= 0xFFFF)))
            || (((left.getType() == Field::Types::Float64 || left.isNull()) && (right.getType() == Field::Types::Float64 || right.isNull())
            && ((left.isNull() || left.get<Float64>() < 0xFFFF) && (right.isNull() || right.get<Float64>() >= 0xFFFF))))
            || !isNativeNumber(type))
        {
            return {};
        }
        else
        {
            return { .is_monotonic = true, .is_always_monotonic = true };
        }
    }
};

struct ToDateTimeMonotonicity
{
    static bool has() { return true; }

    static IFunction::Monotonicity get(const IDataType & type, const Field &, const Field &)
    {
        if (type.isValueRepresentedByNumber())
            return { .is_monotonic = true, .is_always_monotonic = true };
        else
            return {};
    }
};

/** The monotonicity for the `toString` function is mainly determined for test purposes.
  * It is doubtful that anyone is looking to optimize queries with conditions `toString(CounterID) = 34`.
  */
struct ToStringMonotonicity
{
    static bool has() { return true; }

    static IFunction::Monotonicity get(const IDataType & type, const Field & left, const Field & right)
    {
        IFunction::Monotonicity positive{ .is_monotonic = true };
        IFunction::Monotonicity not_monotonic;

        const auto * type_ptr = &type;
        if (const auto * low_cardinality_type = checkAndGetDataType<DataTypeLowCardinality>(type_ptr))
            type_ptr = low_cardinality_type->getDictionaryType().get();

        /// Order on enum values (which is the order on integers) is completely arbitrary with respect to the order on strings.
        if (WhichDataType(type).isEnum())
            return not_monotonic;

        /// The `toString` function is monotonic if the argument is Date, Date32, DateTime or String,
        /// or if both bounds are non-negative numbers with the same number of digits.
        if (checkDataTypes<DataTypeDate, DataTypeDate32, DataTypeDateTime, DataTypeString>(type_ptr))
            return positive;

        if (left.isNull() || right.isNull())
            return {};

        if (left.getType() == Field::Types::UInt64 && right.getType() == Field::Types::UInt64)
        {
            return (left.get<UInt64>() == 0 && right.get<UInt64>() == 0)
                || (floor(log10(left.get<UInt64>())) == floor(log10(right.get<UInt64>())))
                ? positive
                : not_monotonic;
        }

        if (left.getType() == Field::Types::Int64 && right.getType() == Field::Types::Int64)
        {
            return (left.get<Int64>() == 0 && right.get<Int64>() == 0)
                || (left.get<Int64>() > 0 && right.get<Int64>() > 0 && floor(log10(left.get<Int64>())) == floor(log10(right.get<Int64>())))
                ?
positive : not_monotonic; } return not_monotonic; } }; struct NameToUInt8 { static constexpr auto name = "toUInt8"; }; struct NameToUInt16 { static constexpr auto name = "toUInt16"; }; struct NameToUInt32 { static constexpr auto name = "toUInt32"; }; struct NameToUInt64 { static constexpr auto name = "toUInt64"; }; struct NameToUInt128 { static constexpr auto name = "toUInt128"; }; struct NameToUInt256 { static constexpr auto name = "toUInt256"; }; struct NameToInt8 { static constexpr auto name = "toInt8"; }; struct NameToInt16 { static constexpr auto name = "toInt16"; }; struct NameToInt32 { static constexpr auto name = "toInt32"; }; struct NameToInt64 { static constexpr auto name = "toInt64"; }; struct NameToInt128 { static constexpr auto name = "toInt128"; }; struct NameToInt256 { static constexpr auto name = "toInt256"; }; struct NameToFloat32 { static constexpr auto name = "toFloat32"; }; struct NameToFloat64 { static constexpr auto name = "toFloat64"; }; struct NameToUUID { static constexpr auto name = "toUUID"; }; struct NameToIPv4 { static constexpr auto name = "toIPv4"; }; struct NameToIPv6 { static constexpr auto name = "toIPv6"; }; using FunctionToUInt8 = FunctionConvert>; using FunctionToUInt16 = FunctionConvert>; using FunctionToUInt32 = FunctionConvert>; using FunctionToUInt64 = FunctionConvert>; using FunctionToUInt128 = FunctionConvert>; using FunctionToUInt256 = FunctionConvert>; using FunctionToInt8 = FunctionConvert>; using FunctionToInt16 = FunctionConvert>; using FunctionToInt32 = FunctionConvert>; using FunctionToInt64 = FunctionConvert>; using FunctionToInt128 = FunctionConvert>; using FunctionToInt256 = FunctionConvert>; using FunctionToFloat32 = FunctionConvert>; using FunctionToFloat64 = FunctionConvert>; using FunctionToDate = FunctionConvert; using FunctionToDate32 = FunctionConvert; using FunctionToDateTime = FunctionConvert; using FunctionToDateTime32 = FunctionConvert; using FunctionToDateTime64 = FunctionConvert; using FunctionToUUID 
= FunctionConvert>; using FunctionToIPv4 = FunctionConvert>; using FunctionToIPv6 = FunctionConvert>; using FunctionToString = FunctionConvert; using FunctionToUnixTimestamp = FunctionConvert>; using FunctionToDecimal32 = FunctionConvert, NameToDecimal32, UnknownMonotonicity>; using FunctionToDecimal64 = FunctionConvert, NameToDecimal64, UnknownMonotonicity>; using FunctionToDecimal128 = FunctionConvert, NameToDecimal128, UnknownMonotonicity>; using FunctionToDecimal256 = FunctionConvert, NameToDecimal256, UnknownMonotonicity>; template struct FunctionTo; template <> struct FunctionTo { using Type = FunctionToUInt8; }; template <> struct FunctionTo { using Type = FunctionToUInt16; }; template <> struct FunctionTo { using Type = FunctionToUInt32; }; template <> struct FunctionTo { using Type = FunctionToUInt64; }; template <> struct FunctionTo { using Type = FunctionToUInt128; }; template <> struct FunctionTo { using Type = FunctionToUInt256; }; template <> struct FunctionTo { using Type = FunctionToInt8; }; template <> struct FunctionTo { using Type = FunctionToInt16; }; template <> struct FunctionTo { using Type = FunctionToInt32; }; template <> struct FunctionTo { using Type = FunctionToInt64; }; template <> struct FunctionTo { using Type = FunctionToInt128; }; template <> struct FunctionTo { using Type = FunctionToInt256; }; template <> struct FunctionTo { using Type = FunctionToFloat32; }; template <> struct FunctionTo { using Type = FunctionToFloat64; }; template <> struct FunctionTo { using Type = FunctionToDate; }; template <> struct FunctionTo { using Type = FunctionToDate32; }; template <> struct FunctionTo { using Type = FunctionToDateTime; }; template <> struct FunctionTo { using Type = FunctionToDateTime64; }; template <> struct FunctionTo { using Type = FunctionToUUID; }; template <> struct FunctionTo { using Type = FunctionToIPv4; }; template <> struct FunctionTo { using Type = FunctionToIPv6; }; template <> struct FunctionTo { using Type = 
FunctionToString; }; template <> struct FunctionTo { using Type = FunctionToFixedString; }; template <> struct FunctionTo> { using Type = FunctionToDecimal32; }; template <> struct FunctionTo> { using Type = FunctionToDecimal64; }; template <> struct FunctionTo> { using Type = FunctionToDecimal128; }; template <> struct FunctionTo> { using Type = FunctionToDecimal256; }; template struct FunctionTo> : FunctionTo> { }; struct NameToUInt8OrZero { static constexpr auto name = "toUInt8OrZero"; }; struct NameToUInt16OrZero { static constexpr auto name = "toUInt16OrZero"; }; struct NameToUInt32OrZero { static constexpr auto name = "toUInt32OrZero"; }; struct NameToUInt64OrZero { static constexpr auto name = "toUInt64OrZero"; }; struct NameToUInt128OrZero { static constexpr auto name = "toUInt128OrZero"; }; struct NameToUInt256OrZero { static constexpr auto name = "toUInt256OrZero"; }; struct NameToInt8OrZero { static constexpr auto name = "toInt8OrZero"; }; struct NameToInt16OrZero { static constexpr auto name = "toInt16OrZero"; }; struct NameToInt32OrZero { static constexpr auto name = "toInt32OrZero"; }; struct NameToInt64OrZero { static constexpr auto name = "toInt64OrZero"; }; struct NameToInt128OrZero { static constexpr auto name = "toInt128OrZero"; }; struct NameToInt256OrZero { static constexpr auto name = "toInt256OrZero"; }; struct NameToFloat32OrZero { static constexpr auto name = "toFloat32OrZero"; }; struct NameToFloat64OrZero { static constexpr auto name = "toFloat64OrZero"; }; struct NameToDateOrZero { static constexpr auto name = "toDateOrZero"; }; struct NameToDate32OrZero { static constexpr auto name = "toDate32OrZero"; }; struct NameToDateTimeOrZero { static constexpr auto name = "toDateTimeOrZero"; }; struct NameToDateTime64OrZero { static constexpr auto name = "toDateTime64OrZero"; }; struct NameToDecimal32OrZero { static constexpr auto name = "toDecimal32OrZero"; }; struct NameToDecimal64OrZero { static constexpr auto name = "toDecimal64OrZero"; }; 
struct NameToDecimal128OrZero { static constexpr auto name = "toDecimal128OrZero"; }; struct NameToDecimal256OrZero { static constexpr auto name = "toDecimal256OrZero"; }; struct NameToUUIDOrZero { static constexpr auto name = "toUUIDOrZero"; }; struct NameToIPv4OrZero { static constexpr auto name = "toIPv4OrZero"; }; struct NameToIPv6OrZero { static constexpr auto name = "toIPv6OrZero"; }; using FunctionToUInt8OrZero = FunctionConvertFromString; using FunctionToUInt16OrZero = FunctionConvertFromString; using FunctionToUInt32OrZero = FunctionConvertFromString; using FunctionToUInt64OrZero = FunctionConvertFromString; using FunctionToUInt128OrZero = FunctionConvertFromString; using FunctionToUInt256OrZero = FunctionConvertFromString; using FunctionToInt8OrZero = FunctionConvertFromString; using FunctionToInt16OrZero = FunctionConvertFromString; using FunctionToInt32OrZero = FunctionConvertFromString; using FunctionToInt64OrZero = FunctionConvertFromString; using FunctionToInt128OrZero = FunctionConvertFromString; using FunctionToInt256OrZero = FunctionConvertFromString; using FunctionToFloat32OrZero = FunctionConvertFromString; using FunctionToFloat64OrZero = FunctionConvertFromString; using FunctionToDateOrZero = FunctionConvertFromString; using FunctionToDate32OrZero = FunctionConvertFromString; using FunctionToDateTimeOrZero = FunctionConvertFromString; using FunctionToDateTime64OrZero = FunctionConvertFromString; using FunctionToDecimal32OrZero = FunctionConvertFromString, NameToDecimal32OrZero, ConvertFromStringExceptionMode::Zero>; using FunctionToDecimal64OrZero = FunctionConvertFromString, NameToDecimal64OrZero, ConvertFromStringExceptionMode::Zero>; using FunctionToDecimal128OrZero = FunctionConvertFromString, NameToDecimal128OrZero, ConvertFromStringExceptionMode::Zero>; using FunctionToDecimal256OrZero = FunctionConvertFromString, NameToDecimal256OrZero, ConvertFromStringExceptionMode::Zero>; using FunctionToUUIDOrZero = FunctionConvertFromString; using 
FunctionToIPv4OrZero = FunctionConvertFromString; using FunctionToIPv6OrZero = FunctionConvertFromString; struct NameToUInt8OrNull { static constexpr auto name = "toUInt8OrNull"; }; struct NameToUInt16OrNull { static constexpr auto name = "toUInt16OrNull"; }; struct NameToUInt32OrNull { static constexpr auto name = "toUInt32OrNull"; }; struct NameToUInt64OrNull { static constexpr auto name = "toUInt64OrNull"; }; struct NameToUInt128OrNull { static constexpr auto name = "toUInt128OrNull"; }; struct NameToUInt256OrNull { static constexpr auto name = "toUInt256OrNull"; }; struct NameToInt8OrNull { static constexpr auto name = "toInt8OrNull"; }; struct NameToInt16OrNull { static constexpr auto name = "toInt16OrNull"; }; struct NameToInt32OrNull { static constexpr auto name = "toInt32OrNull"; }; struct NameToInt64OrNull { static constexpr auto name = "toInt64OrNull"; }; struct NameToInt128OrNull { static constexpr auto name = "toInt128OrNull"; }; struct NameToInt256OrNull { static constexpr auto name = "toInt256OrNull"; }; struct NameToFloat32OrNull { static constexpr auto name = "toFloat32OrNull"; }; struct NameToFloat64OrNull { static constexpr auto name = "toFloat64OrNull"; }; struct NameToDateOrNull { static constexpr auto name = "toDateOrNull"; }; struct NameToDate32OrNull { static constexpr auto name = "toDate32OrNull"; }; struct NameToDateTimeOrNull { static constexpr auto name = "toDateTimeOrNull"; }; struct NameToDateTime64OrNull { static constexpr auto name = "toDateTime64OrNull"; }; struct NameToDecimal32OrNull { static constexpr auto name = "toDecimal32OrNull"; }; struct NameToDecimal64OrNull { static constexpr auto name = "toDecimal64OrNull"; }; struct NameToDecimal128OrNull { static constexpr auto name = "toDecimal128OrNull"; }; struct NameToDecimal256OrNull { static constexpr auto name = "toDecimal256OrNull"; }; struct NameToUUIDOrNull { static constexpr auto name = "toUUIDOrNull"; }; struct NameToIPv4OrNull { static constexpr auto name = "toIPv4OrNull"; 
}; struct NameToIPv6OrNull { static constexpr auto name = "toIPv6OrNull"; }; using FunctionToUInt8OrNull = FunctionConvertFromString; using FunctionToUInt16OrNull = FunctionConvertFromString; using FunctionToUInt32OrNull = FunctionConvertFromString; using FunctionToUInt64OrNull = FunctionConvertFromString; using FunctionToUInt128OrNull = FunctionConvertFromString; using FunctionToUInt256OrNull = FunctionConvertFromString; using FunctionToInt8OrNull = FunctionConvertFromString; using FunctionToInt16OrNull = FunctionConvertFromString; using FunctionToInt32OrNull = FunctionConvertFromString; using FunctionToInt64OrNull = FunctionConvertFromString; using FunctionToInt128OrNull = FunctionConvertFromString; using FunctionToInt256OrNull = FunctionConvertFromString; using FunctionToFloat32OrNull = FunctionConvertFromString; using FunctionToFloat64OrNull = FunctionConvertFromString; using FunctionToDateOrNull = FunctionConvertFromString; using FunctionToDate32OrNull = FunctionConvertFromString; using FunctionToDateTimeOrNull = FunctionConvertFromString; using FunctionToDateTime64OrNull = FunctionConvertFromString; using FunctionToDecimal32OrNull = FunctionConvertFromString, NameToDecimal32OrNull, ConvertFromStringExceptionMode::Null>; using FunctionToDecimal64OrNull = FunctionConvertFromString, NameToDecimal64OrNull, ConvertFromStringExceptionMode::Null>; using FunctionToDecimal128OrNull = FunctionConvertFromString, NameToDecimal128OrNull, ConvertFromStringExceptionMode::Null>; using FunctionToDecimal256OrNull = FunctionConvertFromString, NameToDecimal256OrNull, ConvertFromStringExceptionMode::Null>; using FunctionToUUIDOrNull = FunctionConvertFromString; using FunctionToIPv4OrNull = FunctionConvertFromString; using FunctionToIPv6OrNull = FunctionConvertFromString; struct NameParseDateTimeBestEffort { static constexpr auto name = "parseDateTimeBestEffort"; }; struct NameParseDateTimeBestEffortOrZero { static constexpr auto name = "parseDateTimeBestEffortOrZero"; }; struct 
NameParseDateTimeBestEffortOrNull { static constexpr auto name = "parseDateTimeBestEffortOrNull"; }; struct NameParseDateTimeBestEffortUS { static constexpr auto name = "parseDateTimeBestEffortUS"; }; struct NameParseDateTimeBestEffortUSOrZero { static constexpr auto name = "parseDateTimeBestEffortUSOrZero"; }; struct NameParseDateTimeBestEffortUSOrNull { static constexpr auto name = "parseDateTimeBestEffortUSOrNull"; }; struct NameParseDateTime32BestEffort { static constexpr auto name = "parseDateTime32BestEffort"; }; struct NameParseDateTime32BestEffortOrZero { static constexpr auto name = "parseDateTime32BestEffortOrZero"; }; struct NameParseDateTime32BestEffortOrNull { static constexpr auto name = "parseDateTime32BestEffortOrNull"; }; struct NameParseDateTime64BestEffort { static constexpr auto name = "parseDateTime64BestEffort"; }; struct NameParseDateTime64BestEffortOrZero { static constexpr auto name = "parseDateTime64BestEffortOrZero"; }; struct NameParseDateTime64BestEffortOrNull { static constexpr auto name = "parseDateTime64BestEffortOrNull"; }; struct NameParseDateTime64BestEffortUS { static constexpr auto name = "parseDateTime64BestEffortUS"; }; struct NameParseDateTime64BestEffortUSOrZero { static constexpr auto name = "parseDateTime64BestEffortUSOrZero"; }; struct NameParseDateTime64BestEffortUSOrNull { static constexpr auto name = "parseDateTime64BestEffortUSOrNull"; }; using FunctionParseDateTimeBestEffort = FunctionConvertFromString< DataTypeDateTime, NameParseDateTimeBestEffort, ConvertFromStringExceptionMode::Throw, ConvertFromStringParsingMode::BestEffort>; using FunctionParseDateTimeBestEffortOrZero = FunctionConvertFromString< DataTypeDateTime, NameParseDateTimeBestEffortOrZero, ConvertFromStringExceptionMode::Zero, ConvertFromStringParsingMode::BestEffort>; using FunctionParseDateTimeBestEffortOrNull = FunctionConvertFromString< DataTypeDateTime, NameParseDateTimeBestEffortOrNull, ConvertFromStringExceptionMode::Null, 
ConvertFromStringParsingMode::BestEffort>; using FunctionParseDateTimeBestEffortUS = FunctionConvertFromString< DataTypeDateTime, NameParseDateTimeBestEffortUS, ConvertFromStringExceptionMode::Throw, ConvertFromStringParsingMode::BestEffortUS>; using FunctionParseDateTimeBestEffortUSOrZero = FunctionConvertFromString< DataTypeDateTime, NameParseDateTimeBestEffortUSOrZero, ConvertFromStringExceptionMode::Zero, ConvertFromStringParsingMode::BestEffortUS>; using FunctionParseDateTimeBestEffortUSOrNull = FunctionConvertFromString< DataTypeDateTime, NameParseDateTimeBestEffortUSOrNull, ConvertFromStringExceptionMode::Null, ConvertFromStringParsingMode::BestEffortUS>; using FunctionParseDateTime32BestEffort = FunctionConvertFromString< DataTypeDateTime, NameParseDateTime32BestEffort, ConvertFromStringExceptionMode::Throw, ConvertFromStringParsingMode::BestEffort>; using FunctionParseDateTime32BestEffortOrZero = FunctionConvertFromString< DataTypeDateTime, NameParseDateTime32BestEffortOrZero, ConvertFromStringExceptionMode::Zero, ConvertFromStringParsingMode::BestEffort>; using FunctionParseDateTime32BestEffortOrNull = FunctionConvertFromString< DataTypeDateTime, NameParseDateTime32BestEffortOrNull, ConvertFromStringExceptionMode::Null, ConvertFromStringParsingMode::BestEffort>; using FunctionParseDateTime64BestEffort = FunctionConvertFromString< DataTypeDateTime64, NameParseDateTime64BestEffort, ConvertFromStringExceptionMode::Throw, ConvertFromStringParsingMode::BestEffort>; using FunctionParseDateTime64BestEffortOrZero = FunctionConvertFromString< DataTypeDateTime64, NameParseDateTime64BestEffortOrZero, ConvertFromStringExceptionMode::Zero, ConvertFromStringParsingMode::BestEffort>; using FunctionParseDateTime64BestEffortOrNull = FunctionConvertFromString< DataTypeDateTime64, NameParseDateTime64BestEffortOrNull, ConvertFromStringExceptionMode::Null, ConvertFromStringParsingMode::BestEffort>; using FunctionParseDateTime64BestEffortUS = FunctionConvertFromString< 
DataTypeDateTime64, NameParseDateTime64BestEffortUS, ConvertFromStringExceptionMode::Throw, ConvertFromStringParsingMode::BestEffortUS>; using FunctionParseDateTime64BestEffortUSOrZero = FunctionConvertFromString< DataTypeDateTime64, NameParseDateTime64BestEffortUSOrZero, ConvertFromStringExceptionMode::Zero, ConvertFromStringParsingMode::BestEffortUS>; using FunctionParseDateTime64BestEffortUSOrNull = FunctionConvertFromString< DataTypeDateTime64, NameParseDateTime64BestEffortUSOrNull, ConvertFromStringExceptionMode::Null, ConvertFromStringParsingMode::BestEffortUS>; class ExecutableFunctionCast : public IExecutableFunction { public: using WrapperType = std::function; struct Diagnostic { std::string column_from; std::string column_to; }; explicit ExecutableFunctionCast( WrapperType && wrapper_function_, const char * name_, std::optional diagnostic_) : wrapper_function(std::move(wrapper_function_)), name(name_), diagnostic(std::move(diagnostic_)) {} String getName() const override { return name; } protected: ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, size_t input_rows_count) const override { /// drop second argument, pass others ColumnsWithTypeAndName new_arguments{arguments.front()}; if (arguments.size() > 2) new_arguments.insert(std::end(new_arguments), std::next(std::begin(arguments), 2), std::end(arguments)); try { return wrapper_function(new_arguments, result_type, nullptr, input_rows_count); } catch (Exception & e) { if (diagnostic) e.addMessage("while converting source column " + backQuoteIfNeed(diagnostic->column_from) + " to destination column " + backQuoteIfNeed(diagnostic->column_to)); throw; } } bool useDefaultImplementationForNulls() const override { return false; } /// CAST(Nothing, T) -> T bool useDefaultImplementationForNothing() const override { return false; } bool useDefaultImplementationForConstants() const override { return true; } bool useDefaultImplementationForLowCardinalityColumns() 
const override { return false; } ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {1}; } private: WrapperType wrapper_function; const char * name; std::optional diagnostic; }; struct CastName { static constexpr auto name = "CAST"; }; struct CastInternalName { static constexpr auto name = "_CAST"; }; enum class CastType { nonAccurate, accurate, accurateOrNull }; class FunctionCastBase : public IFunctionBase { public: using MonotonicityForRange = std::function; using Diagnostic = ExecutableFunctionCast::Diagnostic; }; template class FunctionCast final : public FunctionCastBase { public: using WrapperType = std::function; FunctionCast(ContextPtr context_ , const char * cast_name_ , MonotonicityForRange && monotonicity_for_range_ , const DataTypes & argument_types_ , const DataTypePtr & return_type_ , std::optional diagnostic_ , CastType cast_type_) : cast_name(cast_name_), monotonicity_for_range(std::move(monotonicity_for_range_)) , argument_types(argument_types_), return_type(return_type_), diagnostic(std::move(diagnostic_)) , cast_type(cast_type_) , context(context_) { } const DataTypes & getArgumentTypes() const override { return argument_types; } const DataTypePtr & getResultType() const override { return return_type; } ExecutableFunctionPtr prepare(const ColumnsWithTypeAndName & /*sample_columns*/) const override { try { return std::make_unique( prepareUnpackDictionaries(getArgumentTypes()[0], getResultType()), cast_name, diagnostic); } catch (Exception & e) { if (diagnostic) e.addMessage("while converting source column " + backQuoteIfNeed(diagnostic->column_from) + " to destination column " + backQuoteIfNeed(diagnostic->column_to)); throw; } } String getName() const override { return cast_name; } bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; } bool hasInformationAboutMonotonicity() const override { return static_cast(monotonicity_for_range); } Monotonicity 
getMonotonicityForRange(const IDataType & type, const Field & left, const Field & right) const override { return monotonicity_for_range(type, left, right); } private: const char * cast_name; MonotonicityForRange monotonicity_for_range; DataTypes argument_types; DataTypePtr return_type; std::optional diagnostic; CastType cast_type; ContextPtr context; static WrapperType createFunctionAdaptor(FunctionPtr function, const DataTypePtr & from_type) { auto function_adaptor = std::make_unique(function)->build({ColumnWithTypeAndName{nullptr, from_type, ""}}); return [function_adaptor] (ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, const ColumnNullable *, size_t input_rows_count) { return function_adaptor->execute(arguments, result_type, input_rows_count); }; } static WrapperType createToNullableColumnWrapper() { return [] (ColumnsWithTypeAndName &, const DataTypePtr & result_type, const ColumnNullable *, size_t input_rows_count) { ColumnPtr res = result_type->createColumn(); ColumnUInt8::Ptr col_null_map_to = ColumnUInt8::create(input_rows_count, true); return ColumnNullable::create(res->cloneResized(input_rows_count), std::move(col_null_map_to)); }; } template WrapperType createWrapper(const DataTypePtr & from_type, const ToDataType * const to_type, bool requested_result_is_nullable) const { TypeIndex from_type_index = from_type->getTypeId(); WhichDataType which(from_type_index); bool can_apply_accurate_cast = (cast_type == CastType::accurate || cast_type == CastType::accurateOrNull) && (which.isInt() || which.isUInt() || which.isFloat()); if (requested_result_is_nullable && checkAndGetDataType(from_type.get())) { /// In case when converting to Nullable type, we apply different parsing rule, /// that will not throw an exception but return NULL in case of malformed input. 
FunctionPtr function = FunctionConvertFromString::create(); return createFunctionAdaptor(function, from_type); } else if (!can_apply_accurate_cast) { FunctionPtr function = FunctionTo::Type::create(context); return createFunctionAdaptor(function, from_type); } auto wrapper_cast_type = cast_type; return [wrapper_cast_type, from_type_index, to_type] (ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, const ColumnNullable *column_nullable, size_t input_rows_count) { ColumnPtr result_column; auto res = callOnIndexAndDataType(from_type_index, [&](const auto & types) -> bool { using Types = std::decay_t; using LeftDataType = typename Types::LeftType; using RightDataType = typename Types::RightType; if constexpr (IsDataTypeNumber) { if constexpr (IsDataTypeNumber) { if (wrapper_cast_type == CastType::accurate) { result_column = ConvertImpl::execute( arguments, result_type, input_rows_count, AccurateConvertStrategyAdditions()); } else { result_column = ConvertImpl::execute( arguments, result_type, input_rows_count, AccurateOrNullConvertStrategyAdditions()); } return true; } if constexpr (std::is_same_v || std::is_same_v) { if (wrapper_cast_type == CastType::accurate) { result_column = ConvertImpl::template execute( arguments, result_type, input_rows_count); } else { result_column = ConvertImpl::template execute( arguments, result_type, input_rows_count); } return true; } } return false; }); /// Additionally check if callOnIndexAndDataType wasn't called at all. 
if (!res) { if (wrapper_cast_type == CastType::accurateOrNull) { auto nullable_column_wrapper = FunctionCast::createToNullableColumnWrapper(); return nullable_column_wrapper(arguments, result_type, column_nullable, input_rows_count); } else { throw Exception(ErrorCodes::CANNOT_CONVERT_TYPE, "Conversion from {} to {} is not supported", from_type_index, to_type->getName()); } } return result_column; }; } template WrapperType createBoolWrapper(const DataTypePtr & from_type, const ToDataType * const to_type, bool requested_result_is_nullable) const { if (checkAndGetDataType(from_type.get())) { return &ConvertImplGenericFromString::execute; } return createWrapper(from_type, to_type, requested_result_is_nullable); } WrapperType createUInt8ToBoolWrapper(const DataTypePtr from_type, const DataTypePtr to_type) const { return [from_type, to_type] (ColumnsWithTypeAndName & arguments, const DataTypePtr &, const ColumnNullable *, size_t /*input_rows_count*/) -> ColumnPtr { /// Special case when we convert UInt8 column to Bool column. /// both columns have type UInt8, but we shouldn't use identity wrapper, /// because Bool column can contain only 0 and 1. 
auto res_column = to_type->createColumn(); const auto & data_from = checkAndGetColumn(arguments[0].column.get())->getData(); auto & data_to = assert_cast(res_column.get())->getData(); data_to.resize(data_from.size()); for (size_t i = 0; i != data_from.size(); ++i) data_to[i] = static_cast(data_from[i]); return res_column; }; } static WrapperType createStringWrapper(const DataTypePtr & from_type) { FunctionPtr function = FunctionToString::create(); return createFunctionAdaptor(function, from_type); } WrapperType createFixedStringWrapper(const DataTypePtr & from_type, const size_t N) const { if (!isStringOrFixedString(from_type)) throw Exception(ErrorCodes::NOT_IMPLEMENTED, "CAST AS FixedString is only implemented for types String and FixedString"); bool exception_mode_null = cast_type == CastType::accurateOrNull; return [exception_mode_null, N] (ColumnsWithTypeAndName & arguments, const DataTypePtr &, const ColumnNullable *, size_t /*input_rows_count*/) { if (exception_mode_null) return FunctionToFixedString::executeForN(arguments, N); else return FunctionToFixedString::executeForN(arguments, N); }; } #define GENERATE_INTERVAL_CASE(INTERVAL_KIND) \ case IntervalKind::INTERVAL_KIND: \ return createFunctionAdaptor(FunctionConvert::create(), from_type); static WrapperType createIntervalWrapper(const DataTypePtr & from_type, IntervalKind kind) { switch (kind) { GENERATE_INTERVAL_CASE(Nanosecond) GENERATE_INTERVAL_CASE(Microsecond) GENERATE_INTERVAL_CASE(Millisecond) GENERATE_INTERVAL_CASE(Second) GENERATE_INTERVAL_CASE(Minute) GENERATE_INTERVAL_CASE(Hour) GENERATE_INTERVAL_CASE(Day) GENERATE_INTERVAL_CASE(Week) GENERATE_INTERVAL_CASE(Month) GENERATE_INTERVAL_CASE(Quarter) GENERATE_INTERVAL_CASE(Year) } throw Exception{ErrorCodes::CANNOT_CONVERT_TYPE, "Conversion to unexpected IntervalKind: {}", kind.toString()}; } #undef GENERATE_INTERVAL_CASE template requires IsDataTypeDecimal WrapperType createDecimalWrapper(const DataTypePtr & from_type, const ToDataType * to_type, 
bool requested_result_is_nullable) const { TypeIndex type_index = from_type->getTypeId(); UInt32 scale = to_type->getScale(); WhichDataType which(type_index); bool ok = which.isNativeInt() || which.isNativeUInt() || which.isDecimal() || which.isFloat() || which.isDateOrDate32() || which.isDateTime() || which.isDateTime64() || which.isStringOrFixedString(); if (!ok) { if (cast_type == CastType::accurateOrNull) return createToNullableColumnWrapper(); else throw Exception(ErrorCodes::CANNOT_CONVERT_TYPE, "Conversion from {} to {} is not supported", from_type->getName(), to_type->getName()); } auto wrapper_cast_type = cast_type; return [wrapper_cast_type, type_index, scale, to_type, requested_result_is_nullable] (ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, const ColumnNullable *column_nullable, size_t input_rows_count) { ColumnPtr result_column; auto res = callOnIndexAndDataType(type_index, [&](const auto & types) -> bool { using Types = std::decay_t; using LeftDataType = typename Types::LeftType; using RightDataType = typename Types::RightType; if constexpr (IsDataTypeDecimalOrNumber && IsDataTypeDecimalOrNumber && !std::is_same_v) { if (wrapper_cast_type == CastType::accurate) { AccurateConvertStrategyAdditions additions; additions.scale = scale; result_column = ConvertImpl::execute( arguments, result_type, input_rows_count, additions); return true; } else if (wrapper_cast_type == CastType::accurateOrNull) { AccurateOrNullConvertStrategyAdditions additions; additions.scale = scale; result_column = ConvertImpl::execute( arguments, result_type, input_rows_count, additions); return true; } } else if constexpr (std::is_same_v) { if (requested_result_is_nullable) { /// Consistent with CAST(Nullable(String) AS Nullable(Numbers)) /// In case when converting to Nullable type, we apply different parsing rule, /// that will not throw an exception but return NULL in case of malformed input. 
                        result_column = ConvertImpl::execute(
                            arguments, result_type, input_rows_count, scale);

                        return true;
                    }
                }

                result_column = ConvertImpl::execute(arguments, result_type, input_rows_count, scale);

                return true;
            });

            /// Additionally check if callOnIndexAndDataType wasn't called at all.
            if (!res)
            {
                if (wrapper_cast_type == CastType::accurateOrNull)
                {
                    auto nullable_column_wrapper = FunctionCast::createToNullableColumnWrapper();
                    return nullable_column_wrapper(arguments, result_type, column_nullable, input_rows_count);
                }
                else
                    throw Exception(ErrorCodes::CANNOT_CONVERT_TYPE,
                        "Conversion from {} to {} is not supported",
                        type_index, to_type->getName());
            }

            return result_column;
        };
    }

    WrapperType createAggregateFunctionWrapper(const DataTypePtr & from_type_untyped, const DataTypeAggregateFunction * to_type) const
    {
        /// Conversion from String through parsing.
        if (checkAndGetDataType(from_type_untyped.get()))
        {
            return &ConvertImplGenericFromString::execute;
        }
        else
        {
            if (cast_type == CastType::accurateOrNull)
                return createToNullableColumnWrapper();
            else
                throw Exception(ErrorCodes::CANNOT_CONVERT_TYPE, "Conversion from {} to {} is not supported",
                    from_type_untyped->getName(), to_type->getName());
        }
    }

    WrapperType createArrayWrapper(const DataTypePtr & from_type_untyped, const DataTypeArray & to_type) const
    {
        /// Conversion from String through parsing.
        if (checkAndGetDataType(from_type_untyped.get()))
        {
            return &ConvertImplGenericFromString::execute;
        }

        DataTypePtr from_type_holder;
        const auto * from_type = checkAndGetDataType(from_type_untyped.get());
        const auto * from_type_map = checkAndGetDataType(from_type_untyped.get());

        /// Convert from Map
        if (from_type_map)
        {
            /// Recreate array of unnamed tuples because otherwise it may work
            /// unexpectedly while converting to array of named tuples.
            from_type_holder = from_type_map->getNestedTypeWithUnnamedTuple();
            from_type = assert_cast(from_type_holder.get());
        }

        if (!from_type)
        {
            throw Exception(ErrorCodes::TYPE_MISMATCH,
                "CAST AS Array can only be performed between same-dimensional Array, Map or String types");
        }

        DataTypePtr from_nested_type = from_type->getNestedType();

        /// In query SELECT CAST([] AS Array(Array(String))) from type is Array(Nothing)
        bool from_empty_array = isNothing(from_nested_type);

        if (from_type->getNumberOfDimensions() != to_type.getNumberOfDimensions() && !from_empty_array)
            throw Exception(ErrorCodes::TYPE_MISMATCH,
                "CAST AS Array can only be performed between same-dimensional array types");

        const DataTypePtr & to_nested_type = to_type.getNestedType();

        /// Prepare nested type conversion
        const auto nested_function = prepareUnpackDictionaries(from_nested_type, to_nested_type);

        return [nested_function, from_nested_type, to_nested_type](
                ColumnsWithTypeAndName & arguments, const DataTypePtr &, const ColumnNullable * nullable_source, size_t /*input_rows_count*/) -> ColumnPtr
        {
            const auto & argument_column = arguments.front();

            const ColumnArray * col_array = nullptr;

            if (const ColumnMap * col_map = checkAndGetColumn(argument_column.column.get()))
                col_array = &col_map->getNestedColumn();
            else
                col_array = checkAndGetColumn(argument_column.column.get());

            if (col_array)
            {
                /// create columns for converting nested column containing original and result columns
                ColumnsWithTypeAndName nested_columns{{ col_array->getDataPtr(), from_nested_type, "" }};

                /// convert nested column
                auto result_column = nested_function(nested_columns, to_nested_type, nullable_source, nested_columns.front().column->size());

                /// set converted nested column to result
                return ColumnArray::create(result_column, col_array->getOffsetsPtr());
            }
            else
            {
                throw Exception(ErrorCodes::LOGICAL_ERROR,
                    "Illegal column {} for function CAST AS Array",
                    argument_column.column->getName());
            }
        };
    }

    using ElementWrappers = std::vector;

    ElementWrappers getElementWrappers(const DataTypes & from_element_types, const DataTypes & to_element_types) const
    {
        ElementWrappers element_wrappers;
        element_wrappers.reserve(from_element_types.size());

        /// Create conversion wrapper for each element in tuple
        for (size_t i = 0; i < from_element_types.size(); ++i)
        {
            const DataTypePtr & from_element_type = from_element_types[i];
            const DataTypePtr & to_element_type = to_element_types[i];
            element_wrappers.push_back(prepareUnpackDictionaries(from_element_type, to_element_type));
        }

        return element_wrappers;
    }

    WrapperType createTupleWrapper(const DataTypePtr & from_type_untyped, const DataTypeTuple * to_type) const
    {
        /// Conversion from String through parsing.
        if (checkAndGetDataType(from_type_untyped.get()))
        {
            return &ConvertImplGenericFromString::execute;
        }

        const auto * from_type = checkAndGetDataType(from_type_untyped.get());
        if (!from_type)
            throw Exception(ErrorCodes::TYPE_MISMATCH, "CAST AS Tuple can only be performed between tuple types or from String.\n"
                "Left type: {}, right type: {}", from_type_untyped->getName(), to_type->getName());

        const auto & from_element_types = from_type->getElements();
        const auto & to_element_types = to_type->getElements();

        std::vector element_wrappers;
        std::vector> to_reverse_index;

        /// For named tuples allow conversions for tuples with
        /// different sets of elements. If element exists in @to_type
        /// and doesn't exist in @from_type it will be filled by default values.
        if (from_type->haveExplicitNames() && to_type->haveExplicitNames())
        {
            const auto & from_names = from_type->getElementNames();
            std::unordered_map from_positions;
            from_positions.reserve(from_names.size());
            for (size_t i = 0; i < from_names.size(); ++i)
                from_positions[from_names[i]] = i;

            const auto & to_names = to_type->getElementNames();
            element_wrappers.reserve(to_names.size());
            to_reverse_index.reserve(from_names.size());

            for (size_t i = 0; i < to_names.size(); ++i)
            {
                auto it = from_positions.find(to_names[i]);
                if (it != from_positions.end())
                {
                    element_wrappers.emplace_back(prepareUnpackDictionaries(from_element_types[it->second], to_element_types[i]));
                    to_reverse_index.emplace_back(it->second);
                }
                else
                {
                    element_wrappers.emplace_back();
                    to_reverse_index.emplace_back();
                }
            }
        }
        else
        {
            if (from_element_types.size() != to_element_types.size())
                throw Exception(ErrorCodes::TYPE_MISMATCH, "CAST AS Tuple can only be performed between tuple types "
                    "with the same number of elements or from String.\nLeft type: {}, right type: {}",
                    from_type->getName(), to_type->getName());

            element_wrappers = getElementWrappers(from_element_types, to_element_types);
            to_reverse_index.reserve(to_element_types.size());
            for (size_t i = 0; i < to_element_types.size(); ++i)
                to_reverse_index.emplace_back(i);
        }

        return [element_wrappers, from_element_types, to_element_types, to_reverse_index]
            (ColumnsWithTypeAndName & arguments, const DataTypePtr &, const ColumnNullable * nullable_source, size_t input_rows_count) -> ColumnPtr
        {
            const auto * col = arguments.front().column.get();

            size_t tuple_size = to_element_types.size();
            const ColumnTuple & column_tuple = typeid_cast(*col);

            Columns converted_columns(tuple_size);

            /// invoke conversion for each element
            for (size_t i = 0; i < tuple_size; ++i)
            {
                if (to_reverse_index[i])
                {
                    size_t from_idx = *to_reverse_index[i];
                    ColumnsWithTypeAndName element = {{column_tuple.getColumns()[from_idx], from_element_types[from_idx], "" }};
                    converted_columns[i] =
                        element_wrappers[i](element, to_element_types[i], nullable_source, input_rows_count);
                }
                else
                {
                    converted_columns[i] = to_element_types[i]->createColumn()->cloneResized(input_rows_count);
                }
            }

            return ColumnTuple::create(converted_columns);
        };
    }

    /// The case of: tuple([key1, key2, ..., key_n], [value1, value2, ..., value_n])
    WrapperType createTupleToMapWrapper(const DataTypes & from_kv_types, const DataTypes & to_kv_types) const
    {
        return [element_wrappers = getElementWrappers(from_kv_types, to_kv_types), from_kv_types, to_kv_types]
            (ColumnsWithTypeAndName & arguments, const DataTypePtr &, const ColumnNullable * nullable_source, size_t /*input_rows_count*/) -> ColumnPtr
        {
            const auto * col = arguments.front().column.get();
            const auto & column_tuple = assert_cast(*col);

            Columns offsets(2);
            Columns converted_columns(2);
            for (size_t i = 0; i < 2; ++i)
            {
                const auto & column_array = assert_cast(column_tuple.getColumn(i));
                ColumnsWithTypeAndName element = {{column_array.getDataPtr(), from_kv_types[i], ""}};
                converted_columns[i] = element_wrappers[i](element, to_kv_types[i], nullable_source, (element[0].column)->size());
                offsets[i] = column_array.getOffsetsPtr();
            }

            const auto & keys_offsets = assert_cast(*offsets[0]).getData();
            const auto & values_offsets = assert_cast(*offsets[1]).getData();
            if (keys_offsets != values_offsets)
                throw Exception(ErrorCodes::TYPE_MISMATCH,
                    "CAST AS Map can only be performed from tuple of arrays with equal sizes.");

            return ColumnMap::create(converted_columns[0], converted_columns[1], offsets[0]);
        };
    }

    WrapperType createMapToMapWrapper(const DataTypes & from_kv_types, const DataTypes & to_kv_types) const
    {
        return [element_wrappers = getElementWrappers(from_kv_types, to_kv_types), from_kv_types, to_kv_types]
            (ColumnsWithTypeAndName & arguments, const DataTypePtr &, const ColumnNullable * nullable_source, size_t /*input_rows_count*/) -> ColumnPtr
        {
            const auto * col = arguments.front().column.get();
            const auto & column_map = typeid_cast(*col);
            const auto & nested_data = column_map.getNestedData();

            Columns converted_columns(2);
            for (size_t i = 0; i < 2; ++i)
            {
                ColumnsWithTypeAndName element = {{nested_data.getColumnPtr(i), from_kv_types[i], ""}};
                converted_columns[i] = element_wrappers[i](element, to_kv_types[i], nullable_source, (element[0].column)->size());
            }

            return ColumnMap::create(converted_columns[0], converted_columns[1], column_map.getNestedColumn().getOffsetsPtr());
        };
    }

    /// The case of: [(key1, value1), (key2, value2), ...]
    WrapperType createArrayToMapWrapper(const DataTypes & from_kv_types, const DataTypes & to_kv_types) const
    {
        return [element_wrappers = getElementWrappers(from_kv_types, to_kv_types), from_kv_types, to_kv_types]
            (ColumnsWithTypeAndName & arguments, const DataTypePtr &, const ColumnNullable * nullable_source, size_t /*input_rows_count*/) -> ColumnPtr
        {
            const auto * col = arguments.front().column.get();
            const auto & column_array = typeid_cast(*col);
            const auto & nested_data = typeid_cast(column_array.getData());

            Columns converted_columns(2);
            for (size_t i = 0; i < 2; ++i)
            {
                ColumnsWithTypeAndName element = {{nested_data.getColumnPtr(i), from_kv_types[i], ""}};
                converted_columns[i] = element_wrappers[i](element, to_kv_types[i], nullable_source, (element[0].column)->size());
            }

            return ColumnMap::create(converted_columns[0], converted_columns[1], column_array.getOffsetsPtr());
        };
    }

    WrapperType createMapWrapper(const DataTypePtr & from_type_untyped, const DataTypeMap * to_type) const
    {
        if (const auto * from_tuple = checkAndGetDataType(from_type_untyped.get()))
        {
            if (from_tuple->getElements().size() != 2)
                throw Exception(
                    ErrorCodes::TYPE_MISMATCH,
                    "CAST AS Map from tuple requires 2 elements. "
                    "Left type: {}, right type: {}",
                    from_tuple->getName(),
                    to_type->getName());

            DataTypes from_kv_types;
            const auto & to_kv_types = to_type->getKeyValueTypes();

            for (const auto & elem : from_tuple->getElements())
            {
                const auto * type_array = checkAndGetDataType(elem.get());
                if (!type_array)
                    throw Exception(ErrorCodes::TYPE_MISMATCH,
                        "CAST AS Map can only be performed from tuples of array. Got: {}", from_tuple->getName());

                from_kv_types.push_back(type_array->getNestedType());
            }

            return createTupleToMapWrapper(from_kv_types, to_kv_types);
        }
        else if (const auto * from_array = typeid_cast(from_type_untyped.get()))
        {
            const auto * nested_tuple = typeid_cast(from_array->getNestedType().get());
            if (!nested_tuple || nested_tuple->getElements().size() != 2)
                throw Exception(
                    ErrorCodes::TYPE_MISMATCH,
                    "CAST AS Map from array requires nested tuple of 2 elements. "
                    "Left type: {}, right type: {}",
                    from_array->getName(),
                    to_type->getName());

            return createArrayToMapWrapper(nested_tuple->getElements(), to_type->getKeyValueTypes());
        }
        else if (const auto * from_type = checkAndGetDataType(from_type_untyped.get()))
        {
            return createMapToMapWrapper(from_type->getKeyValueTypes(), to_type->getKeyValueTypes());
        }
        else
        {
            throw Exception(ErrorCodes::TYPE_MISMATCH, "Unsupported types to CAST AS Map. "
                "Left type: {}, right type: {}", from_type_untyped->getName(), to_type->getName());
        }
    }

    WrapperType createTupleToObjectWrapper(const DataTypeTuple & from_tuple, bool has_nullable_subcolumns) const
    {
        if (!from_tuple.haveExplicitNames())
            throw Exception(ErrorCodes::TYPE_MISMATCH,
                "Cast to Object can be performed only from flatten Named Tuple. Got: {}", from_tuple.getName());

        PathsInData paths;
        DataTypes from_types;

        std::tie(paths, from_types) = flattenTuple(from_tuple.getPtr());
        auto to_types = from_types;

        for (auto & type : to_types)
        {
            if (isTuple(type) || isNested(type))
                throw Exception(ErrorCodes::TYPE_MISMATCH,
                    "Cast to Object can be performed only from flatten Named Tuple. Got: {}",
                    from_tuple.getName());

            type = recursiveRemoveLowCardinality(type);
        }

        return [element_wrappers = getElementWrappers(from_types, to_types),
            has_nullable_subcolumns, from_types, to_types, paths]
            (ColumnsWithTypeAndName & arguments, const DataTypePtr &, const ColumnNullable * nullable_source, size_t input_rows_count)
        {
            size_t tuple_size = to_types.size();
            auto flattened_column = flattenTuple(arguments.front().column);
            const auto & column_tuple = assert_cast(*flattened_column);

            if (tuple_size != column_tuple.getColumns().size())
                throw Exception(ErrorCodes::TYPE_MISMATCH,
                    "Expected tuple with {} subcolumn, but got {} subcolumns",
                    tuple_size, column_tuple.getColumns().size());

            auto res = ColumnObject::create(has_nullable_subcolumns);
            for (size_t i = 0; i < tuple_size; ++i)
            {
                ColumnsWithTypeAndName element = {{column_tuple.getColumns()[i], from_types[i], "" }};
                auto converted_column = element_wrappers[i](element, to_types[i], nullable_source, input_rows_count);
                res->addSubcolumn(paths[i], converted_column->assumeMutable());
            }

            return res;
        };
    }

    WrapperType createMapToObjectWrapper(const DataTypeMap & from_map, bool has_nullable_subcolumns) const
    {
        auto key_value_types = from_map.getKeyValueTypes();

        if (!isStringOrFixedString(key_value_types[0]))
            throw Exception(ErrorCodes::TYPE_MISMATCH,
                "Cast to Object from Map can be performed only from Map "
                "with String or FixedString key. Got: {}", from_map.getName());

        const auto & value_type = key_value_types[1];
        auto to_value_type = value_type;

        if (!has_nullable_subcolumns && value_type->isNullable())
            to_value_type = removeNullable(value_type);

        if (has_nullable_subcolumns && !value_type->isNullable())
            to_value_type = makeNullable(value_type);

        DataTypes to_key_value_types{std::make_shared(), std::move(to_value_type)};
        auto element_wrappers = getElementWrappers(key_value_types, to_key_value_types);

        return [has_nullable_subcolumns, element_wrappers, key_value_types, to_key_value_types]
            (ColumnsWithTypeAndName & arguments, const DataTypePtr &, const ColumnNullable * nullable_source, size_t) -> ColumnPtr
        {
            const auto & column_map = assert_cast(*arguments.front().column);
            const auto & offsets = column_map.getNestedColumn().getOffsets();
            auto key_value_columns = column_map.getNestedData().getColumnsCopy();

            for (size_t i = 0; i < 2; ++i)
            {
                ColumnsWithTypeAndName element{{key_value_columns[i], key_value_types[i], ""}};
                key_value_columns[i] = element_wrappers[i](element, to_key_value_types[i], nullable_source, key_value_columns[i]->size());
            }

            const auto & key_column_str = assert_cast(*key_value_columns[0]);
            const auto & value_column = *key_value_columns[1];

            using SubcolumnsMap = HashMap;
            SubcolumnsMap subcolumns;

            for (size_t row = 0; row < offsets.size(); ++row)
            {
                for (size_t i = offsets[static_cast(row) - 1]; i < offsets[row]; ++i)
                {
                    auto ref = key_column_str.getDataAt(i);

                    bool inserted;
                    SubcolumnsMap::LookupResult it;
                    subcolumns.emplace(ref, it, inserted);
                    auto & subcolumn = it->getMapped();

                    if (inserted)
                        subcolumn = value_column.cloneEmpty()->cloneResized(row);

                    /// Map can have duplicated keys. We insert only first one.
                    if (subcolumn->size() == row)
                        subcolumn->insertFrom(value_column, i);
                }

                /// Insert default values for keys missed in current row.
                for (const auto & [_, subcolumn] : subcolumns)
                    if (subcolumn->size() == row)
                        subcolumn->insertDefault();
            }

            auto column_object = ColumnObject::create(has_nullable_subcolumns);
            for (auto && [key, subcolumn] : subcolumns)
            {
                PathInData path(key.toView());
                column_object->addSubcolumn(path, std::move(subcolumn));
            }

            return column_object;
        };
    }

    WrapperType createObjectWrapper(const DataTypePtr & from_type, const DataTypeObject * to_type) const
    {
        if (const auto * from_tuple = checkAndGetDataType(from_type.get()))
        {
            return createTupleToObjectWrapper(*from_tuple, to_type->hasNullableSubcolumns());
        }
        else if (const auto * from_map = checkAndGetDataType(from_type.get()))
        {
            return createMapToObjectWrapper(*from_map, to_type->hasNullableSubcolumns());
        }
        else if (checkAndGetDataType(from_type.get()))
        {
            return [] (ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, const ColumnNullable * nullable_source, size_t input_rows_count)
            {
                auto res = ConvertImplGenericFromString::execute(arguments, result_type, nullable_source, input_rows_count)->assumeMutable();
                res->finalize();
                return res;
            };
        }
        else if (checkAndGetDataType(from_type.get()))
        {
            return [is_nullable = to_type->hasNullableSubcolumns()] (ColumnsWithTypeAndName & arguments, const DataTypePtr & , const ColumnNullable * , size_t) -> ColumnPtr
            {
                auto & column_object = assert_cast(*arguments.front().column);
                auto res = ColumnObject::create(is_nullable);
                for (size_t i = 0; i < column_object.size(); i++)
                    res->insert(column_object[i]);

                res->finalize();
                return res;
            };
        }

        throw Exception(ErrorCodes::TYPE_MISMATCH,
            "Cast to Object can be performed only from flatten named Tuple, Map or String. Got: {}", from_type->getName());
    }

    template
    WrapperType createEnumWrapper(const DataTypePtr & from_type, const DataTypeEnum * to_type) const
    {
        using EnumType = DataTypeEnum;
        using Function = typename FunctionTo::Type;

        if (const auto * from_enum8 = checkAndGetDataType(from_type.get()))
            checkEnumToEnumConversion(from_enum8, to_type);
        else if (const auto * from_enum16 = checkAndGetDataType(from_type.get()))
            checkEnumToEnumConversion(from_enum16, to_type);

        if (checkAndGetDataType(from_type.get()))
            return createStringToEnumWrapper();
        else if (checkAndGetDataType(from_type.get()))
            return createStringToEnumWrapper();
        else if (isNativeNumber(from_type) || isEnum(from_type))
        {
            auto function = Function::create();
            return createFunctionAdaptor(function, from_type);
        }
        else
        {
            if (cast_type == CastType::accurateOrNull)
                return createToNullableColumnWrapper();
            else
                throw Exception(ErrorCodes::CANNOT_CONVERT_TYPE, "Conversion from {} to {} is not supported",
                    from_type->getName(), to_type->getName());
        }
    }

    template
    void checkEnumToEnumConversion(const EnumTypeFrom * from_type, const EnumTypeTo * to_type) const
    {
        const auto & from_values = from_type->getValues();
        const auto & to_values = to_type->getValues();

        using ValueType = std::common_type_t;
        using NameValuePair = std::pair;
        using EnumValues = std::vector;

        EnumValues name_intersection;
        std::set_intersection(std::begin(from_values), std::end(from_values),
            std::begin(to_values), std::end(to_values), std::back_inserter(name_intersection),
            [] (auto && from, auto && to) { return from.first < to.first; });

        for (const auto & name_value : name_intersection)
        {
            const auto & old_value = name_value.second;
            const auto & new_value = to_type->getValue(name_value.first);
            if (old_value != new_value)
                throw Exception(ErrorCodes::CANNOT_CONVERT_TYPE, "Enum conversion changes value for element '{}' from {} to {}",
                    name_value.first, toString(old_value), toString(new_value));
        }
    }

    template
    WrapperType createStringToEnumWrapper() const
    {
        const char * function_name = cast_name;
        return [function_name] (
            ColumnsWithTypeAndName & arguments, const DataTypePtr & res_type, const ColumnNullable * nullable_col, size_t /*input_rows_count*/)
        {
            const auto & first_col = arguments.front().column.get();
            const auto & result_type = typeid_cast(*res_type);

            const ColumnStringType * col = typeid_cast(first_col);

            if (col && nullable_col && nullable_col->size() != col->size())
                throw Exception(ErrorCodes::LOGICAL_ERROR, "ColumnNullable is not compatible with original");

            if (col)
            {
                const auto size = col->size();

                auto res = result_type.createColumn();
                auto & out_data = static_cast(*res).getData();
                out_data.resize(size);

                auto default_enum_value = result_type.getValues().front().second;

                if (nullable_col)
                {
                    for (size_t i = 0; i < size; ++i)
                    {
                        if (!nullable_col->isNullAt(i))
                            out_data[i] = result_type.getValue(col->getDataAt(i));
                        else
                            out_data[i] = default_enum_value;
                    }
                }
                else
                {
                    for (size_t i = 0; i < size; ++i)
                        out_data[i] = result_type.getValue(col->getDataAt(i));
                }

                return res;
            }
            else
                throw Exception(ErrorCodes::LOGICAL_ERROR, "Unexpected column {} as first argument of function {}",
                    first_col->getName(), function_name);
        };
    }

    static WrapperType createIdentityWrapper(const DataTypePtr &)
    {
        return [] (ColumnsWithTypeAndName & arguments, const DataTypePtr &, const ColumnNullable *, size_t /*input_rows_count*/)
        {
            return arguments.front().column;
        };
    }

    static WrapperType createNothingWrapper(const IDataType * to_type)
    {
        ColumnPtr res = to_type->createColumnConstWithDefaultValue(1);
        return [res] (ColumnsWithTypeAndName &, const DataTypePtr &, const ColumnNullable *, size_t input_rows_count)
        {
            /// Column of Nothing type is trivially convertible to any other column
            return res->cloneResized(input_rows_count)->convertToFullColumnIfConst();
        };
    }

    WrapperType prepareUnpackDictionaries(const DataTypePtr & from_type, const DataTypePtr & to_type) const
    {
        const auto * from_low_cardinality = typeid_cast(from_type.get());
        const auto * to_low_cardinality =
            typeid_cast(to_type.get());
        const auto & from_nested = from_low_cardinality ? from_low_cardinality->getDictionaryType() : from_type;
        const auto & to_nested = to_low_cardinality ? to_low_cardinality->getDictionaryType() : to_type;

        if (from_type->onlyNull())
        {
            if (!to_nested->isNullable())
            {
                if (cast_type == CastType::accurateOrNull)
                {
                    return createToNullableColumnWrapper();
                }
                else
                {
                    throw Exception(ErrorCodes::CANNOT_CONVERT_TYPE, "Cannot convert NULL to a non-nullable type");
                }
            }

            return [](ColumnsWithTypeAndName &, const DataTypePtr & result_type, const ColumnNullable *, size_t input_rows_count)
            {
                return result_type->createColumnConstWithDefaultValue(input_rows_count)->convertToFullColumnIfConst();
            };
        }

        bool skip_not_null_check = false;

        if (from_low_cardinality && from_nested->isNullable() && !to_nested->isNullable())
            /// Disable check for dictionary. Will check that column doesn't contain NULL in wrapper below.
            skip_not_null_check = true;

        auto wrapper = prepareRemoveNullable(from_nested, to_nested, skip_not_null_check);
        if (!from_low_cardinality && !to_low_cardinality)
            return wrapper;

        return [wrapper, from_low_cardinality, to_low_cardinality, skip_not_null_check]
            (ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, const ColumnNullable * nullable_source, size_t input_rows_count) -> ColumnPtr
        {
            ColumnsWithTypeAndName args = {arguments[0]};
            auto & arg = args.front();
            auto res_type = result_type;

            ColumnPtr converted_column;

            ColumnPtr res_indexes;
            /// For some types default can't be cast (for example, String to Int). In that case convert column to full.
            bool src_converted_to_full_column = false;

            {
                auto tmp_rows_count = input_rows_count;

                if (to_low_cardinality)
                    res_type = to_low_cardinality->getDictionaryType();

                if (from_low_cardinality)
                {
                    const auto * col_low_cardinality = typeid_cast(arguments[0].column.get());

                    if (skip_not_null_check && col_low_cardinality->containsNull())
                        throw Exception(ErrorCodes::CANNOT_INSERT_NULL_IN_ORDINARY_COLUMN, "Cannot convert NULL value to non-Nullable type");

                    arg.column = col_low_cardinality->getDictionary().getNestedColumn();
                    arg.type = from_low_cardinality->getDictionaryType();

                    /// TODO: Make map with defaults conversion.
                    src_converted_to_full_column = !removeNullable(arg.type)->equals(*removeNullable(res_type));
                    if (src_converted_to_full_column)
                        arg.column = arg.column->index(col_low_cardinality->getIndexes(), 0);
                    else
                        res_indexes = col_low_cardinality->getIndexesPtr();

                    tmp_rows_count = arg.column->size();
                }

                /// Perform the requested conversion.
                converted_column = wrapper(args, res_type, nullable_source, tmp_rows_count);
            }

            if (to_low_cardinality)
            {
                auto res_column = to_low_cardinality->createColumn();
                auto * col_low_cardinality = typeid_cast(res_column.get());

                if (from_low_cardinality && !src_converted_to_full_column)
                {
                    col_low_cardinality->insertRangeFromDictionaryEncodedColumn(*converted_column, *res_indexes);
                }
                else
                    col_low_cardinality->insertRangeFromFullColumn(*converted_column, 0, converted_column->size());

                return res_column;
            }
            else if (!src_converted_to_full_column)
                return converted_column->index(*res_indexes, 0);
            else
                return converted_column;
        };
    }

    WrapperType prepareRemoveNullable(const DataTypePtr & from_type, const DataTypePtr & to_type, bool skip_not_null_check) const
    {
        /// Determine whether pre-processing and/or post-processing must take place during conversion.
        bool source_is_nullable = from_type->isNullable();
        bool result_is_nullable = to_type->isNullable();

        auto wrapper = prepareImpl(removeNullable(from_type), removeNullable(to_type), result_is_nullable);

        if (result_is_nullable)
        {
            return [wrapper, source_is_nullable]
                (ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, const ColumnNullable *, size_t input_rows_count) -> ColumnPtr
            {
                /// Create temporary columns on which to perform the operation.
                const auto & nullable_type = static_cast(*result_type);
                const auto & nested_type = nullable_type.getNestedType();

                ColumnsWithTypeAndName tmp_args;
                if (source_is_nullable)
                    tmp_args = createBlockWithNestedColumns(arguments);
                else
                    tmp_args = arguments;

                const ColumnNullable * nullable_source = nullptr;

                /// Add original ColumnNullable for createStringToEnumWrapper()
                if (source_is_nullable)
                {
                    if (arguments.size() != 1)
                        throw Exception(ErrorCodes::LOGICAL_ERROR, "Invalid number of arguments");
                    nullable_source = typeid_cast(arguments.front().column.get());
                }

                /// Perform the requested conversion.
                auto tmp_res = wrapper(tmp_args, nested_type, nullable_source, input_rows_count);

                /// May happen in fuzzy tests. For debug purpose.
                if (!tmp_res)
                    throw Exception(ErrorCodes::LOGICAL_ERROR, "Couldn't convert {} to {} in prepareRemoveNullable wrapper.",
                        arguments[0].type->getName(), nested_type->getName());

                return wrapInNullable(tmp_res, arguments, nested_type, input_rows_count);
            };
        }
        else if (source_is_nullable)
        {
            /// Conversion from Nullable to non-Nullable.

            return [wrapper, skip_not_null_check]
                (ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, const ColumnNullable *, size_t input_rows_count) -> ColumnPtr
            {
                auto tmp_args = createBlockWithNestedColumns(arguments);
                auto nested_type = removeNullable(result_type);

                /// Check that all values are not-NULL.
                /// Check can be skipped in case if LowCardinality dictionary is transformed.
                /// In that case, correctness will be checked beforehand.
                if (!skip_not_null_check)
                {
                    const auto & col = arguments[0].column;
                    const auto & nullable_col = assert_cast(*col);
                    const auto & null_map = nullable_col.getNullMapData();

                    if (!memoryIsZero(null_map.data(), 0, null_map.size()))
                        throw Exception(ErrorCodes::CANNOT_INSERT_NULL_IN_ORDINARY_COLUMN, "Cannot convert NULL value to non-Nullable type");
                }
                const ColumnNullable * nullable_source = typeid_cast(arguments.front().column.get());
                return wrapper(tmp_args, nested_type, nullable_source, input_rows_count);
            };
        }
        else
            return wrapper;
    }

    /// 'from_type' and 'to_type' are nested types in case of Nullable.
    /// 'requested_result_is_nullable' is true if CAST to Nullable type is requested.
    WrapperType prepareImpl(const DataTypePtr & from_type, const DataTypePtr & to_type, bool requested_result_is_nullable) const
    {
        if (isUInt8(from_type) && isBool(to_type))
            return createUInt8ToBoolWrapper(from_type, to_type);

        /// We can cast IPv6 into IPv6, IPv4 into IPv4, but we should not allow to cast FixedString(16) into IPv6 as part of identity cast
        bool safe_convert_custom_types = true;

        if (const auto * to_type_custom_name = to_type->getCustomName())
            safe_convert_custom_types = from_type->getCustomName() && from_type->getCustomName()->getName() == to_type_custom_name->getName();
        else if (const auto * from_type_custom_name = from_type->getCustomName())
            safe_convert_custom_types = to_type->getCustomName() && from_type_custom_name->getName() == to_type->getCustomName()->getName();

        if (from_type->equals(*to_type) && safe_convert_custom_types)
            return createIdentityWrapper(from_type);
        else if (WhichDataType(from_type).isNothing())
            return createNothingWrapper(to_type.get());

        WrapperType ret;

        auto make_default_wrapper = [&](const auto & types) -> bool
        {
            using Types = std::decay_t;
            using ToDataType = typename Types::LeftType;

            if constexpr (
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v ||
                std::is_same_v)
            {
                ret = createWrapper(from_type, checkAndGetDataType(to_type.get()), requested_result_is_nullable);
                return true;
            }
            if constexpr (std::is_same_v)
            {
                if (isBool(to_type))
                    ret = createBoolWrapper(from_type, checkAndGetDataType(to_type.get()), requested_result_is_nullable);
                else
                    ret = createWrapper(from_type, checkAndGetDataType(to_type.get()), requested_result_is_nullable);
                return true;
            }
            if constexpr (
                std::is_same_v ||
                std::is_same_v)
            {
                ret = createEnumWrapper(from_type, checkAndGetDataType(to_type.get()));
                return true;
            }
            if constexpr (
                std::is_same_v> ||
                std::is_same_v> ||
                std::is_same_v> ||
                std::is_same_v> ||
                std::is_same_v)
            {
                ret = createDecimalWrapper(from_type, checkAndGetDataType(to_type.get()), requested_result_is_nullable);
                return true;
            }

            return false;
        };

        bool cast_ipv4_ipv6_default_on_conversion_error_value = context && context->getSettingsRef().cast_ipv4_ipv6_default_on_conversion_error;
        bool input_format_ipv4_default_on_conversion_error_value = context && context->getSettingsRef().input_format_ipv4_default_on_conversion_error;
        bool input_format_ipv6_default_on_conversion_error_value = context && context->getSettingsRef().input_format_ipv6_default_on_conversion_error;

        auto make_custom_serialization_wrapper = [&, cast_ipv4_ipv6_default_on_conversion_error_value, input_format_ipv4_default_on_conversion_error_value, input_format_ipv6_default_on_conversion_error_value](const auto & types) -> bool
        {
            using Types = std::decay_t;
            using ToDataType = typename Types::RightType;
            using FromDataType = typename Types::LeftType;

            if constexpr (WhichDataType(FromDataType::type_id).isStringOrFixedString())
            {
                if constexpr (std::is_same_v)
                {
                    ret = [cast_ipv4_ipv6_default_on_conversion_error_value,
                           input_format_ipv4_default_on_conversion_error_value,
                           requested_result_is_nullable](
                              ColumnsWithTypeAndName & arguments,
                              const DataTypePtr & result_type,
                              const ColumnNullable * column_nullable,
                              size_t) -> ColumnPtr
                    {
                        if (!WhichDataType(result_type).isIPv4())
                            throw Exception(ErrorCodes::TYPE_MISMATCH, "Wrong result type {}. Expected IPv4", result_type->getName());

                        const auto * null_map = column_nullable ? &column_nullable->getNullMapData() : nullptr;
                        if (cast_ipv4_ipv6_default_on_conversion_error_value || input_format_ipv4_default_on_conversion_error_value || requested_result_is_nullable)
                            return convertToIPv4(arguments[0].column, null_map);
                        else
                            return convertToIPv4(arguments[0].column, null_map);
                    };

                    return true;
                }

                if constexpr (std::is_same_v)
                {
                    ret = [cast_ipv4_ipv6_default_on_conversion_error_value,
                           input_format_ipv6_default_on_conversion_error_value,
                           requested_result_is_nullable](
                              ColumnsWithTypeAndName & arguments,
                              const DataTypePtr & result_type,
                              const ColumnNullable * column_nullable,
                              size_t) -> ColumnPtr
                    {
                        if (!WhichDataType(result_type).isIPv6())
                            throw Exception(
                                ErrorCodes::TYPE_MISMATCH, "Wrong result type {}. Expected IPv6", result_type->getName());

                        const auto * null_map = column_nullable ?
&column_nullable->getNullMapData() : nullptr; if (cast_ipv4_ipv6_default_on_conversion_error_value || input_format_ipv6_default_on_conversion_error_value || requested_result_is_nullable) return convertToIPv6(arguments[0].column, null_map); else return convertToIPv6(arguments[0].column, null_map); }; return true; } if (to_type->getCustomSerialization() && to_type->getCustomName()) { ret = &ConvertImplGenericFromString::execute; return true; } } else if constexpr (WhichDataType(FromDataType::type_id).isIPv6() && WhichDataType(ToDataType::type_id).isIPv4()) { ret = [cast_ipv4_ipv6_default_on_conversion_error_value, requested_result_is_nullable]( ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, const ColumnNullable * column_nullable, size_t) -> ColumnPtr { if (!WhichDataType(result_type).isIPv4()) throw Exception( ErrorCodes::TYPE_MISMATCH, "Wrong result type {}. Expected IPv4", result_type->getName()); const auto * null_map = column_nullable ? &column_nullable->getNullMapData() : nullptr; if (cast_ipv4_ipv6_default_on_conversion_error_value || requested_result_is_nullable) return convertIPv6ToIPv4(arguments[0].column, null_map); else return convertIPv6ToIPv4(arguments[0].column, null_map); }; return true; } if constexpr (WhichDataType(ToDataType::type_id).isStringOrFixedString()) { if (from_type->getCustomSerialization()) { ret = [](ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, const ColumnNullable *, size_t input_rows_count) -> ColumnPtr { return ConvertImplGenericToString::execute(arguments, result_type, input_rows_count); }; return true; } } return false; }; if (callOnTwoTypeIndexes(from_type->getTypeId(), to_type->getTypeId(), make_custom_serialization_wrapper)) return ret; if (callOnIndexAndDataType(to_type->getTypeId(), make_default_wrapper)) return ret; switch (to_type->getTypeId()) { case TypeIndex::String: return createStringWrapper(from_type); case TypeIndex::FixedString: return createFixedStringWrapper(from_type, 
                    checkAndGetDataType(to_type.get())->getN());
            case TypeIndex::Array:
                return createArrayWrapper(from_type, static_cast(*to_type));
            case TypeIndex::Tuple:
                return createTupleWrapper(from_type, checkAndGetDataType(to_type.get()));
            case TypeIndex::Map:
                return createMapWrapper(from_type, checkAndGetDataType(to_type.get()));
            case TypeIndex::Object:
                return createObjectWrapper(from_type, checkAndGetDataType(to_type.get()));
            case TypeIndex::AggregateFunction:
                return createAggregateFunctionWrapper(from_type, checkAndGetDataType(to_type.get()));
            case TypeIndex::Interval:
                return createIntervalWrapper(from_type, checkAndGetDataType(to_type.get())->getKind());
            default:
                break;
        }

        if (cast_type == CastType::accurateOrNull)
            return createToNullableColumnWrapper();
        else
            throw Exception(ErrorCodes::CANNOT_CONVERT_TYPE, "Conversion from {} to {} is not supported",
                from_type->getName(), to_type->getName());
    }
};

class MonotonicityHelper
{
public:
    using MonotonicityForRange = FunctionCastBase::MonotonicityForRange;

    template
    static auto monotonicityForType(const DataType * const)
    {
        return FunctionTo::Type::Monotonic::get;
    }

    static MonotonicityForRange getMonotonicityInformation(const DataTypePtr & from_type, const IDataType * to_type)
    {
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (const auto * type = checkAndGetDataType(to_type))
            return monotonicityForType(type);
        if (isEnum(from_type))
        {
            if (const auto * type = checkAndGetDataType(to_type))
                return monotonicityForType(type);
            if (const auto * type = checkAndGetDataType(to_type))
                return monotonicityForType(type);
        }
        /// other types like Null, FixedString, Array and Tuple have no monotonicity defined
        return {};
    }
};

}