The Year 1925 is a starting point because most of the timezones
switched to saner (mostly 15-minutes based) offsets somewhere
during 1924 or before. And that significantly simplifies implementation.
2238 is to simplify arithmetics for sanitizing LUT index access;
there are less than 0x1ffff days from 1925.
* Extended DateLUTImpl internal LUT to 0x1ffff items, some of which
represent negative (pre-1970) time values.
As a collateral benefit, Date now correctly supports dates up to 2149
(instead of 2106).
* Added a new strong typedef ExtendedDayNum, which represents dates
pre-1970 and post 2149.
* Functions that used to return DayNum now return ExtendedDayNum.
* Refactored DateLUTImpl to untie DayNum from the dual role of being
a value and an index (due to negative time). Index is now a different
type LUTIndex with explicit conversion functions from DatNum, time_t,
and ExtendedDayNum.
* Updated DateLUTImpl to properly support values close to epoch start
(1970-01-01 00:00), including negative ones.
* Reduced resolution of DateLUTImpl::Values::time_at_offset_change
to multiple of 15-minutes to allow storing 64-bits of time_t in
DateLUTImpl::Value while keeping same size.
* Minor performance updates to DateLUTImpl when building month LUT
by skipping non-start-of-month days.
* Fixed extractTimeZoneFromFunctionArguments to work correctly
with DateTime64.
* New unit-tests and stateless integration tests for both DateTime
and DateTime64.
There was no specializations for toDateTime64(<numeric>), and because of
this default decimal conversion was used, however this is not enough for
DateTime/DateTime64 types, since the date may overflow and the proper
check is required (like DateTime has), and this what UBsan found [1]:
../src/IO/WriteHelpers.h:812:33: runtime error: index 508 out of bounds for type 'const char [201]' Received signal -3 Received signal Unknown signal (-3)
Backtrace:
(gdb) bt
0 LocalDateTime::LocalDateTime (this=0x7fffffff8418, year_=1970, month_=1 '\001', day_=1 '\001', hour_=2 '\002', minute_=0 '\000', second_=254 '\376') at LocalDateTime.h:83
1 0x00000000138a5edb in DB::writeDateTimeText<(char)45, (char)58, (char)32, (char)46> (datetime64=..., scale=7, buf=..., date_lut=...) at WriteHelpers.h:852
2 0x0000000019c379b4 in DB::DataTypeDateTime64::serializeText (this=0x7ffff5c4b0d8, column=..., row_num=0, ostr=..., settings=...) at DataTypeDateTime64.cpp:66
3 0x0000000019d297e4 in DB::IDataType::serializeAsText (this=0x7ffff5c4b0d8, column=..., row_num=0, ostr=..., settings=...) at IDataType.cpp:387
[1]: https://clickhouse-test-reports.s3.yandex.net/19527/cea8ae162ffbf92e5ed29304ab010704c5d611c8/fuzzer_ubsan/report.html#fail1
Also fix CAST for DateTime64
It should be marked with always const, otherwise it will bail:
Code: 44, e.displayText() = DB::Exception: Illegal column String of time zone argument of function, must be constant string: While processing toDateTime(-1, 1, 'GMT'), Stack trace (when copying this message, always include the lines below):
Was:
Code: 44. DB::Exception: Received from localhost:9000. DB::Exception: Illegal column UInt16 of first argument of function toUnixTimestamp: While processing toUnixTimestamp(today()).
Now:
Code: 44. DB::Exception: Received from localhost:9000. DB::Exception: Illegal type Date of first argument of function toUnixTimestamp: While processing toUnixTimestamp(today()).
Making it implicitly cast to Date() does not looks correct, since before
it returns somewhat unexpected result:
SELECT toUnixTimestamp(today())
┌─toUnixTimestamp(today())─┐
│ 18591 │
└──────────────────────────┘
The wrapper that is returned from createStringToEnumWrapper() does not
have access to the ColumnNullable (i.e. original column), because it is
converted to nested type in the prepareRemoveNullable().
So add original ColumnNullable into the block in prepareRemoveNullable()
if source type is Nullable and pass this flag to the
createStringToEnumWrapper() to make it know about that fact that last
column in the block is the original ColumnNullable in this case.
And this one looks most sane.