Robert Schulze 2023-09-18 20:08:37 +00:00
parent b583b80733
commit 774c4b52da
GPG Key ID: 26703B55FB13728A
13 changed files with 446 additions and 659 deletions

@ -4067,17 +4067,16 @@ Result:
└─────┴─────┴───────┘
```
## splitby_max_substring_behavior {#splitby-max-substring-behavior}
## splitby_max_substrings_includes_remaining_string {#splitby_max_substrings_includes_remaining_string}
Controls how functions [splitBy*()](../../sql-reference/functions/splitting-merging-functions.md) with given `max_substring` argument behave.
Controls whether functions [splitBy*()](../../sql-reference/functions/splitting-merging-functions.md) called with argument `max_substrings` > 0 include the remaining string in the last element of the result array.
Possible values:
- `''` - If `max_substring` >=1, return the first `max_substring`-many splits.
- `'python'` - If `max_substring` >= 0, split `max_substring`-many times, and return `max_substring + 1` elements where the last element contains the remaining string.
- `'spark'` - If `max_substring` >= 1, split `max_substring`-many times, and return `max_substring + 1` elements where the last element contains the remaining string.
- `0` - The remaining string will not be included in the last element of the result array.
- `1` - The remaining string will be included in the last element of the result array. This is the behavior of Spark's [`split()`](https://spark.apache.org/docs/3.1.2/api/python/reference/api/pyspark.sql.functions.split.html) function and Python's [`str.split()`](https://docs.python.org/3/library/stdtypes.html#str.split) method.
Default value: ``.
Default value: `0`.
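To make the two values concrete, the following C++ sketch models the documented behavior for a single-character separator. It is an illustration only and not the ClickHouse implementation: `splitByCharSketch` and its parameter names are hypothetical, and the function works on a plain `std::string_view` rather than on columns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper (not ClickHouse code): models splitByChar() with the
// optional max_substrings argument and the boolean setting described above.
std::vector<std::string> splitByCharSketch(
    char separator, std::string_view s, int64_t max_substrings, bool includes_remaining_string)
{
    std::vector<std::string> result;
    size_t pos = 0;
    while (true)
    {
        // max_substrings <= 0 means "no limit", so the checks are skipped.
        if (max_substrings > 0)
        {
            const size_t limit = static_cast<size_t>(max_substrings);
            if (includes_remaining_string && result.size() == limit - 1)
            {
                // Last allowed element: take everything that is left, unsplit.
                result.emplace_back(s.substr(pos));
                return result;
            }
            if (!includes_remaining_string && result.size() == limit)
                return result; // Limit reached: the remainder is dropped.
        }

        const size_t next = s.find(separator, pos);
        if (next == std::string_view::npos)
        {
            // No separator left: the tail is the final element.
            result.emplace_back(s.substr(pos));
            return result;
        }
        result.emplace_back(s.substr(pos, next - pos));
        pos = next + 1;
    }
}
```

For the input `a==b=c=d` with `max_substrings = 2`, this sketch returns `['a','']` when the flag is false and `['a','=b=c=d']` when it is true, which agrees with the test reference output in this commit.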
## enable_extended_results_for_datetime_functions {#enable-extended-results-for-datetime-functions}

@ -21,7 +21,7 @@ splitByChar(separator, s[, max_substrings])
- `separator` — The separator which should contain exactly one character. [String](../../sql-reference/data-types/string.md).
- `s` — The string to split. [String](../../sql-reference/data-types/string.md).
- `max_substrings` — An optional `Int64` defaulting to 0. When `max_substrings` > 0, the returned substrings will be no more than `max_substrings`, otherwise the function will return as many substrings as possible.
- `max_substrings` — An optional `Int64` defaulting to 0. If `max_substrings` > 0, the returned array will contain at most `max_substrings` substrings, otherwise the function will return as many substrings as possible.
**Returned value(s)**
@ -39,7 +39,9 @@ For example,
- in v22.10: `SELECT splitByChar('=', 'a=b=c=d', 2); -- ['a','b','c=d']`
- in v22.11: `SELECT splitByChar('=', 'a=b=c=d', 2); -- ['a','b']`
The previous behavior can be restored by setting [splitby_max_substring_behavior](../../operations/settings/settings.md#splitby-max-substring-behavior) = 'python'.
A behavior similar to ClickHouse pre-v22.11 can be achieved by setting
[splitby_max_substrings_includes_remaining_string](../../operations/settings/settings.md#splitby_max_substrings_includes_remaining_string) to `1`:
`SELECT splitByChar('=', 'a=b=c=d', 2) SETTINGS splitby_max_substrings_includes_remaining_string = 1 -- ['a', 'b=c=d']`
:::
**Example**
@ -82,7 +84,7 @@ Type: [Array](../../sql-reference/data-types/array.md)([String](../../sql-refere
- There are multiple consecutive non-empty separators;
- The original string `s` is empty while the separator is not empty.
Setting [splitby_max_substring_behavior](../../operations/settings/settings.md#splitby-max-substring-behavior) (default: '') controls the behavior with `max_substrings` > 0.
Setting [splitby_max_substrings_includes_remaining_string](../../operations/settings/settings.md#splitby_max_substrings_includes_remaining_string) (default: 0) controls whether the remaining string is included in the last element of the result array when argument `max_substrings` > 0.
**Example**
@ -137,7 +139,7 @@ Returns an array of selected substrings. Empty substrings may be selected when:
Type: [Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md)).
Setting [splitby_max_substring_behavior](../../operations/settings/settings.md#splitby-max-substring-behavior) (default: '') controls the behavior with `max_substrings` > 0.
Setting [splitby_max_substrings_includes_remaining_string](../../operations/settings/settings.md#splitby_max_substrings_includes_remaining_string) (default: 0) controls whether the remaining string is included in the last element of the result array when argument `max_substrings` > 0.
**Example**
@ -188,7 +190,7 @@ Returns an array of selected substrings.
Type: [Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md)).
Setting [splitby_max_substring_behavior](../../operations/settings/settings.md#splitby-max-substring-behavior) (default: '') controls the behavior with `max_substrings` > 0.
Setting [splitby_max_substrings_includes_remaining_string](../../operations/settings/settings.md#splitby_max_substrings_includes_remaining_string) (default: 0) controls whether the remaining string is included in the last element of the result array when argument `max_substrings` > 0.
**Example**
@ -227,7 +229,7 @@ Returns an array of selected substrings.
Type: [Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md)).
Setting [splitby_max_substring_behavior](../../operations/settings/settings.md#splitby-max-substring-behavior) (default: '') controls the behavior with `max_substrings` > 0.
Setting [splitby_max_substrings_includes_remaining_string](../../operations/settings/settings.md#splitby_max_substrings_includes_remaining_string) (default: 0) controls whether the remaining string is included in the last element of the result array when argument `max_substrings` > 0.
**Example**
@ -289,7 +291,7 @@ Returns an array of selected substrings.
Type: [Array](../../sql-reference/data-types/array.md)([String](../../sql-reference/data-types/string.md)).
Setting [splitby_max_substring_behavior](../../operations/settings/settings.md#splitby-max-substring-behavior) (default: '') controls the behavior with `max_substrings` > 0.
Setting [splitby_max_substrings_includes_remaining_string](../../operations/settings/settings.md#splitby_max_substrings_includes_remaining_string) (default: 0) controls whether the remaining string is included in the last element of the result array when argument `max_substrings` > 0.
**Example**

@ -502,7 +502,7 @@ class IColumn;
M(Bool, reject_expensive_hyperscan_regexps, true, "Reject patterns which will likely be expensive to evaluate with hyperscan (due to NFA state explosion)", 0) \
M(Bool, allow_simdjson, true, "Allow using simdjson library in 'JSON*' functions if AVX2 instructions are available. If disabled rapidjson will be used.", 0) \
M(Bool, allow_introspection_functions, false, "Allow functions for introspection of ELF and DWARF for query profiling. These functions are slow and may impose security considerations.", 0) \
M(String, splitby_max_substring_behavior, "", "Control the behavior of the 'max_substring' argument in functions splitBy*(): '' (default), 'python' or 'spark'", 0) \
M(Bool, splitby_max_substrings_includes_remaining_string, false, "Functions 'splitBy*()' with 'max_substrings' argument > 0 include the remaining string as last element in the result", 0) \
\
M(Bool, allow_execute_multiif_columnar, true, "Allow execute multiIf function columnar", 0) \
M(Bool, formatdatetime_f_prints_single_zero, false, "Formatter '%f' in function 'formatDateTime()' produces a single zero instead of six zeros if the formatted value has no fractional seconds.", 0) \

@ -19,7 +19,7 @@ std::optional<Int64> extractMaxSplitsImpl(const ColumnWithTypeAndName & argument
return static_cast<Int64>(value);
}
std::optional<size_t> extractMaxSplits(const ColumnsWithTypeAndName & arguments, size_t max_substrings_argument_position, MaxSubstringBehavior max_substring_behavior)
std::optional<size_t> extractMaxSplits(const ColumnsWithTypeAndName & arguments, size_t max_substrings_argument_position)
{
if (max_substrings_argument_position >= arguments.size())
return std::nullopt;
@ -35,24 +35,8 @@ std::optional<size_t> extractMaxSplits(const ColumnsWithTypeAndName & arguments,
arguments[max_substrings_argument_position].column->getName(),
max_substrings_argument_position + 1);
if (max_splits)
switch (max_substring_behavior)
{
case MaxSubstringBehavior::LikeClickHouse:
case MaxSubstringBehavior::LikeSpark:
{
if (*max_splits <= 0)
return std::nullopt;
break;
}
case MaxSubstringBehavior::LikePython:
{
if (*max_splits < 0)
return std::nullopt;
break;
}
}
if (*max_splits <= 0)
return std::nullopt;
return max_splits;
}
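With the enum removed, the normalization of the argument reduces to a single rule: a missing or non-positive `max_substrings` means "no limit". A minimal standalone sketch of that rule follows. `normalizeMaxSplits` is a hypothetical name; the function in the diff additionally reads the value out of the argument column, which is omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical helper (not ClickHouse code): maps the raw max_substrings
// value to an optional limit. std::nullopt stands for "unlimited".
std::optional<size_t> normalizeMaxSplits(std::optional<int64_t> max_splits)
{
    // Argument absent, zero, or negative: no limit is applied.
    if (!max_splits || *max_splits <= 0)
        return std::nullopt;
    return static_cast<size_t>(*max_splits);
}
```

Under this rule, `-1` and `0` both behave like an omitted argument, which matches the reference output where those calls return every substring regardless of the setting.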

@ -54,14 +54,7 @@ namespace ErrorCodes
using Pos = const char *;
enum class MaxSubstringBehavior
{
LikeClickHouse,
LikeSpark,
LikePython
};
std::optional<size_t> extractMaxSplits(const ColumnsWithTypeAndName & arguments, size_t max_substrings_argument_position, MaxSubstringBehavior max_substring_behavior);
std::optional<size_t> extractMaxSplits(const ColumnsWithTypeAndName & arguments, size_t max_substrings_argument_position);
/// Substring generators. All of them have a common interface.
@ -72,7 +65,7 @@ private:
Pos end;
std::optional<size_t> max_splits;
size_t splits;
MaxSubstringBehavior max_substring_behavior;
bool max_substrings_includes_remaining_string;
public:
static constexpr auto name = "alphaTokens";
@ -97,10 +90,10 @@ public:
static constexpr auto strings_argument_position = 0uz;
void init(const ColumnsWithTypeAndName & arguments, MaxSubstringBehavior max_substring_behavior_)
void init(const ColumnsWithTypeAndName & arguments, bool max_substrings_includes_remaining_string_)
{
max_substring_behavior = max_substring_behavior_;
max_splits = extractMaxSplits(arguments, 1, max_substring_behavior);
max_substrings_includes_remaining_string = max_substrings_includes_remaining_string_;
max_splits = extractMaxSplits(arguments, 1);
}
/// Called for each next string.
@ -125,35 +118,18 @@ public:
if (max_splits)
{
switch (max_substring_behavior)
if (max_substrings_includes_remaining_string)
{
case MaxSubstringBehavior::LikeClickHouse:
if (splits == *max_splits - 1)
{
if (splits == *max_splits)
return false;
break;
}
case MaxSubstringBehavior::LikeSpark:
{
if (splits == *max_splits - 1)
{
token_end = end;
pos = end;
return true;
}
break;
}
case MaxSubstringBehavior::LikePython:
{
if (splits == *max_splits)
{
token_end = end;
pos = end;
return true;
}
break;
token_end = end;
pos = end;
return true;
}
}
else
if (splits == *max_splits)
return false;
}
while (pos < end && isAlphaASCII(*pos))
@ -173,7 +149,7 @@ private:
Pos end;
std::optional<size_t> max_splits;
size_t splits;
MaxSubstringBehavior max_substring_behavior;
bool max_substrings_includes_remaining_string;
public:
/// Get the name of the function.
@ -190,10 +166,10 @@ public:
static constexpr auto strings_argument_position = 0uz;
void init(const ColumnsWithTypeAndName & arguments, MaxSubstringBehavior max_substring_behavior_)
void init(const ColumnsWithTypeAndName & arguments, bool max_substrings_includes_remaining_string_)
{
max_substring_behavior = max_substring_behavior_;
max_splits = extractMaxSplits(arguments, 1, max_substring_behavior);
max_substrings_includes_remaining_string = max_substrings_includes_remaining_string_;
max_splits = extractMaxSplits(arguments, 1);
}
/// Called for each next string.
@ -218,35 +194,18 @@ public:
if (max_splits)
{
switch (max_substring_behavior)
if (max_substrings_includes_remaining_string)
{
case MaxSubstringBehavior::LikeClickHouse:
if (splits == *max_splits - 1)
{
if (splits == *max_splits)
return false;
break;
}
case MaxSubstringBehavior::LikeSpark:
{
if (splits == *max_splits - 1)
{
token_end = end;
pos = end;
return true;
}
break;
}
case MaxSubstringBehavior::LikePython:
{
if (splits == *max_splits)
{
token_end = end;
pos = end;
return true;
}
break;
token_end = end;
pos = end;
return true;
}
}
else
if (splits == *max_splits)
return false;
}
while (pos < end && !(isWhitespaceASCII(*pos) || isPunctuationASCII(*pos)))
@ -266,7 +225,7 @@ private:
Pos end;
std::optional<size_t> max_splits;
size_t splits;
MaxSubstringBehavior max_substring_behavior;
bool max_substrings_includes_remaining_string;
public:
static constexpr auto name = "splitByWhitespace";
@ -282,10 +241,10 @@ public:
static constexpr auto strings_argument_position = 0uz;
void init(const ColumnsWithTypeAndName & arguments, MaxSubstringBehavior max_substring_behavior_)
void init(const ColumnsWithTypeAndName & arguments, bool max_substrings_includes_remaining_string_)
{
max_substring_behavior = max_substring_behavior_;
max_splits = extractMaxSplits(arguments, 1, max_substring_behavior);
max_substrings_includes_remaining_string = max_substrings_includes_remaining_string_;
max_splits = extractMaxSplits(arguments, 1);
}
/// Called for each next string.
@ -310,35 +269,18 @@ public:
if (max_splits)
{
switch (max_substring_behavior)
if (max_substrings_includes_remaining_string)
{
case MaxSubstringBehavior::LikeClickHouse:
if (splits == *max_splits - 1)
{
if (splits == *max_splits)
return false;
break;
}
case MaxSubstringBehavior::LikeSpark:
{
if (splits == *max_splits - 1)
{
token_end = end;
pos = end;
return true;
}
break;
}
case MaxSubstringBehavior::LikePython:
{
if (splits == *max_splits)
{
token_end = end;
pos = end;
return true;
}
break;
token_end = end;
pos = end;
return true;
}
}
else
if (splits == *max_splits)
return false;
}
while (pos < end && !isWhitespaceASCII(*pos))
@ -359,7 +301,7 @@ private:
char separator;
std::optional<size_t> max_splits;
size_t splits;
MaxSubstringBehavior max_substring_behavior;
bool max_substrings_includes_remaining_string;
public:
static constexpr auto name = "splitByChar";
@ -383,7 +325,7 @@ public:
static constexpr auto strings_argument_position = 1uz;
void init(const ColumnsWithTypeAndName & arguments, MaxSubstringBehavior max_substring_behavior_)
void init(const ColumnsWithTypeAndName & arguments, bool max_substrings_includes_remaining_string_)
{
const ColumnConst * col = checkAndGetColumnConstStringOrFixedString(arguments[0].column.get());
@ -398,8 +340,8 @@ public:
separator = sep_str[0];
max_substring_behavior = max_substring_behavior_;
max_splits = extractMaxSplits(arguments, 2, max_substring_behavior);
max_substrings_includes_remaining_string = max_substrings_includes_remaining_string_;
max_splits = extractMaxSplits(arguments, 2);
}
void set(Pos pos_, Pos end_)
@ -418,35 +360,18 @@ public:
if (max_splits)
{
switch (max_substring_behavior)
if (max_substrings_includes_remaining_string)
{
case MaxSubstringBehavior::LikeClickHouse:
if (splits == *max_splits - 1)
{
if (splits == *max_splits)
return false;
break;
}
case MaxSubstringBehavior::LikeSpark:
{
if (splits == *max_splits - 1)
{
token_end = end;
pos = nullptr;
return true;
}
break;
}
case MaxSubstringBehavior::LikePython:
{
if (splits == *max_splits)
{
token_end = end;
pos = nullptr;
return true;
}
break;
token_end = end;
pos = nullptr;
return true;
}
}
else
if (splits == *max_splits)
return false;
}
pos = reinterpret_cast<Pos>(memchr(pos, separator, end - pos));
@ -472,7 +397,7 @@ private:
String separator;
std::optional<size_t> max_splits;
size_t splits;
MaxSubstringBehavior max_substring_behavior;
bool max_substrings_includes_remaining_string;
public:
static constexpr auto name = "splitByString";
@ -487,7 +412,7 @@ public:
static constexpr auto strings_argument_position = 1uz;
void init(const ColumnsWithTypeAndName & arguments, MaxSubstringBehavior max_substring_behavior_)
void init(const ColumnsWithTypeAndName & arguments, bool max_substrings_includes_remaining_string_)
{
const ColumnConst * col = checkAndGetColumnConstStringOrFixedString(arguments[0].column.get());
@ -497,8 +422,8 @@ public:
separator = col->getValue<String>();
max_substring_behavior = max_substring_behavior_;
max_splits = extractMaxSplits(arguments, 2, max_substring_behavior);
max_substrings_includes_remaining_string = max_substrings_includes_remaining_string_;
max_splits = extractMaxSplits(arguments, 2);
}
/// Called for each next string.
@ -521,35 +446,18 @@ public:
if (max_splits)
{
switch (max_substring_behavior)
if (max_substrings_includes_remaining_string)
{
case MaxSubstringBehavior::LikeClickHouse:
if (splits == *max_splits - 1)
{
if (splits == *max_splits)
return false;
break;
}
case MaxSubstringBehavior::LikeSpark:
{
if (splits == *max_splits - 1)
{
token_end = end;
pos = end;
return true;
}
break;
}
case MaxSubstringBehavior::LikePython:
{
if (splits == *max_splits)
{
token_end = end;
pos = end;
return true;
}
break;
token_end = end;
pos = end;
return true;
}
}
else
if (splits == *max_splits)
return false;
}
pos += 1;
@ -565,35 +473,18 @@ public:
if (max_splits)
{
switch (max_substring_behavior)
if (max_substrings_includes_remaining_string)
{
case MaxSubstringBehavior::LikeClickHouse:
if (splits == *max_splits - 1)
{
if (splits == *max_splits)
return false;
break;
}
case MaxSubstringBehavior::LikeSpark:
{
if (splits == *max_splits - 1)
{
token_end = end;
pos = nullptr;
return true;
}
break;
}
case MaxSubstringBehavior::LikePython:
{
if (splits == *max_splits)
{
token_end = end;
pos = nullptr;
return true;
}
break;
token_end = end;
pos = nullptr;
return true;
}
}
else
if (splits == *max_splits)
return false;
}
pos = reinterpret_cast<Pos>(memmem(pos, end - pos, separator.data(), separator.size()));
@ -622,7 +513,7 @@ private:
std::optional<size_t> max_splits;
size_t splits;
MaxSubstringBehavior max_substring_behavior;
bool max_substrings_includes_remaining_string;
public:
static constexpr auto name = "splitByRegexp";
@ -638,7 +529,7 @@ public:
static constexpr auto strings_argument_position = 1uz;
void init(const ColumnsWithTypeAndName & arguments, MaxSubstringBehavior max_substring_behavior_)
void init(const ColumnsWithTypeAndName & arguments, bool max_substrings_includes_remaining_string_)
{
const ColumnConst * col = checkAndGetColumnConstStringOrFixedString(arguments[0].column.get());
@ -649,8 +540,8 @@ public:
if (!col->getValue<String>().empty())
re = std::make_shared<OptimizedRegularExpression>(Regexps::createRegexp<false, false, false>(col->getValue<String>()));
max_substring_behavior = max_substring_behavior_;
max_splits = extractMaxSplits(arguments, 2, max_substring_behavior);
max_substrings_includes_remaining_string = max_substrings_includes_remaining_string_;
max_splits = extractMaxSplits(arguments, 2);
}
/// Called for each next string.
@ -673,35 +564,18 @@ public:
if (max_splits)
{
switch (max_substring_behavior)
if (max_substrings_includes_remaining_string)
{
case MaxSubstringBehavior::LikeClickHouse:
if (splits == *max_splits - 1)
{
if (splits == *max_splits)
return false;
break;
}
case MaxSubstringBehavior::LikeSpark:
{
if (splits == *max_splits - 1)
{
token_end = end;
pos = end;
return true;
}
break;
}
case MaxSubstringBehavior::LikePython:
{
if (splits == *max_splits)
{
token_end = end;
pos = end;
return true;
}
break;
token_end = end;
pos = end;
return true;
}
}
else
if (splits == *max_splits)
return false;
}
pos += 1;
@ -717,35 +591,18 @@ public:
if (max_splits)
{
switch (max_substring_behavior)
if (max_substrings_includes_remaining_string)
{
case MaxSubstringBehavior::LikeClickHouse:
if (splits == *max_splits - 1)
{
if (splits == *max_splits)
return false;
break;
}
case MaxSubstringBehavior::LikeSpark:
{
if (splits == *max_splits - 1)
{
token_end = end;
pos = nullptr;
return true;
}
break;
}
case MaxSubstringBehavior::LikePython:
{
if (splits == *max_splits)
{
token_end = end;
pos = nullptr;
return true;
}
break;
token_end = end;
pos = nullptr;
return true;
}
}
else
if (splits == *max_splits)
return false;
}
if (!re->match(pos, end - pos, matches) || !matches[0].length)
@ -792,7 +649,7 @@ public:
static constexpr auto strings_argument_position = 0uz;
void init(const ColumnsWithTypeAndName & arguments, MaxSubstringBehavior /*max_substring_behavior*/)
void init(const ColumnsWithTypeAndName & arguments, bool /*max_substrings_includes_remaining_string*/)
{
const ColumnConst * col = checkAndGetColumnConstStringOrFixedString(arguments[1].column.get());
@ -845,7 +702,7 @@ template <typename Generator>
class FunctionTokens : public IFunction
{
private:
MaxSubstringBehavior max_substring_behavior;
bool max_substrings_includes_remaining_string;
public:
static constexpr auto name = Generator::name;
@ -854,17 +711,7 @@ public:
explicit FunctionTokens<Generator>(ContextPtr context)
{
const Settings & settings = context->getSettingsRef();
if (settings.splitby_max_substring_behavior.value == "")
max_substring_behavior = MaxSubstringBehavior::LikeClickHouse;
else if (settings.splitby_max_substring_behavior.value == "python")
max_substring_behavior = MaxSubstringBehavior::LikePython;
else if (settings.splitby_max_substring_behavior.value == "spark")
max_substring_behavior = MaxSubstringBehavior::LikeSpark;
else
throw Exception(
ErrorCodes::ILLEGAL_COLUMN,
"Illegal value {} for setting splitby_max_substring_behavior in function {}, must be '', 'python' or 'spark'",
settings.splitby_max_substring_behavior.value, getName());
max_substrings_includes_remaining_string = settings.splitby_max_substrings_includes_remaining_string;
}
String getName() const override { return name; }
@ -885,7 +732,7 @@ public:
ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, size_t /*input_rows_count*/) const override
{
Generator generator;
generator.init(arguments, max_substring_behavior);
generator.init(arguments, max_substrings_includes_remaining_string);
const auto & array_argument = arguments[generator.strings_argument_position];
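The check added to each generator's `get()` method follows the same pattern throughout this file. The sketch below reproduces it for an `alphaTokens`-style generator so the control flow can be read in one piece. Only the `if (max_splits)` block mirrors the diff; the class name, the `set()` method, the skipping of non-alphabetic characters, the `isAlpha` helper, and the driver function are assumptions made to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical generator (not ClickHouse code) that yields runs of ASCII letters.
class AlphaTokensSketch
{
    const char * pos = nullptr;
    const char * end = nullptr;
    std::optional<size_t> max_splits; // assumed already normalized: nullopt or > 0
    size_t splits = 0;
    bool max_substrings_includes_remaining_string;

    static bool isAlpha(char c) { return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); }

public:
    AlphaTokensSketch(std::optional<size_t> max_splits_, bool includes_remaining_string_)
        : max_splits(max_splits_), max_substrings_includes_remaining_string(includes_remaining_string_)
    {
    }

    // Called once per input string.
    void set(const char * pos_, const char * end_)
    {
        pos = pos_;
        end = end_;
        splits = 0;
    }

    // Produces the next token; returns false when there are no more tokens.
    bool get(const char *& token_begin, const char *& token_end)
    {
        // Skip characters that cannot start a token (assumed behavior).
        while (pos < end && !isAlpha(*pos))
            ++pos;
        if (pos == end)
            return false;
        token_begin = pos;

        if (max_splits)
        {
            if (max_substrings_includes_remaining_string)
            {
                // Last allowed token: extend it to the end of the string.
                if (splits == *max_splits - 1)
                {
                    token_end = end;
                    pos = end;
                    return true;
                }
            }
            else if (splits == *max_splits)
                return false; // Limit reached: drop the remainder.
        }

        while (pos < end && isAlpha(*pos))
            ++pos;
        token_end = pos;
        ++splits;
        return true;
    }
};

// Hypothetical driver that collects all tokens of one string.
std::vector<std::string> alphaTokensSketch(
    std::string_view s, std::optional<size_t> max_splits, bool includes_remaining_string)
{
    AlphaTokensSketch generator(max_splits, includes_remaining_string);
    generator.set(s.data(), s.data() + s.size());
    std::vector<std::string> result;
    const char * token_begin = nullptr;
    const char * token_end = nullptr;
    while (generator.get(token_begin, token_end))
        result.emplace_back(token_begin, token_end);
    return result;
}
```

Assuming the input `ab.cd.ef.gh` (inferred from the reference output, since the corresponding query is not visible here), a limit of 2 yields `['ab','cd']` without the remainder and `['ab','cd.ef.gh']` with it.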

@ -30,7 +30,7 @@ public:
static constexpr auto strings_argument_position = 0uz;
void init(const ColumnsWithTypeAndName & /*arguments*/, MaxSubstringBehavior /*max_substring_behavior*/) {}
void init(const ColumnsWithTypeAndName & /*arguments*/, bool /*max_substrings_includes_remaining_string*/) {}
/// Called for each next string.
void set(Pos pos_, Pos end_)

@ -29,7 +29,7 @@ public:
static constexpr auto strings_argument_position = 0uz;
void init(const ColumnsWithTypeAndName & /*arguments*/, MaxSubstringBehavior /*max_substring_behavior*/) {}
void init(const ColumnsWithTypeAndName & /*arguments*/, bool /*max_substrings_includes_remaining_string*/) {}
/// Called for each next string.
void set(Pos pos_, Pos end_)

@ -29,7 +29,7 @@ public:
static constexpr auto strings_argument_position = 0uz;
void init(const ColumnsWithTypeAndName & /*arguments*/, MaxSubstringBehavior /*max_substring_behavior*/) {}
void init(const ColumnsWithTypeAndName & /*arguments*/, bool /*max_substrings_includes_remaining_string*/) {}
/// Called for each next string.
void set(Pos pos_, Pos end_)

@ -27,7 +27,7 @@ public:
validateFunctionArgumentTypes(func, arguments, mandatory_args);
}
void init(const ColumnsWithTypeAndName & /*arguments*/, MaxSubstringBehavior /*max_substring_behavior*/) {}
void init(const ColumnsWithTypeAndName & /*arguments*/, bool /*max_substrings_includes_remaining_string*/) {}
static constexpr auto strings_argument_position = 0uz;

@ -1,44 +1,160 @@
['1','2','3']
['1','2','3']
['1','2','3']
['1']
['1','2']
['1','2','3']
['1','2','3']
['one','two','three','']
['one','two','three','']
['one','two','three','']
['one']
['one','two']
['one','two','three']
['one','two','three','']
['one','two','three','']
['abca','abc']
['abca','abc']
['abca','abc']
['abca']
['abca','abc']
['abca','abc']
['abca','abc']
['1','a','b']
['1','a','b']
['1','a','b']
['1']
['1','a']
['1','a','b']
['1','a','b']
['1!','a,','b.']
['1!','a,','b.']
['1!','a,','b.']
['1!']
['1!','a,']
['1!','a,','b.']
['1!','a,','b.']
['1','2 3','4,5','abcde']
['1','2 3','4,5','abcde']
['1','2 3','4,5','abcde']
['1']
['1','2 3']
['1','2 3','4,5']
['1','2 3','4,5','abcde']
['1','2 3','4,5','abcde']
-- negative tests
-- splitByChar
-- (default)
['a','','b','c','d']
['a','','b','c','d']
['a','','b','c','d']
['a']
['a','']
['a','','b']
['a','','b','c']
['a','','b','c','d']
['a','','b','c','d']
-- (include remainder)
['a','','b','c','d']
['a','','b','c','d']
['a','','b','c','d']
['a==b=c=d']
['a','=b=c=d']
['a','','b=c=d']
['a','','b','c=d']
['a','','b','c','d']
['a','','b','c','d']
-- splitByString
-- (default)
['a','=','=','b','=','c','=','d']
['a','=','=','b','=','c','=','d']
['a','=','=','b','=','c','=','d']
['a']
['a','=']
['a','=','=']
['a','=','=','b']
['a','=','=','b','=']
['a','=','=','b','=','c']
['a','=','=','b','=','c','=']
['a','=','=','b','=','c','=']
['a','=','=','b','=','c','=','d']
['a','=','=','b','=','c','=','d']
['a','','b','c','d']
['a','','b','c','d']
['a','','b','c','d']
['a']
['a','']
['a','','b']
['a','','b','c']
['a','','b','c','d']
['a','','b','c','d']
-- (include remainder)
['a','=','=','b','=','c','=','d']
['a','=','=','b','=','c','=','d']
['a','=','=','b','=','c','=','d']
['a==b=c=d']
['a','==b=c=d']
['a','=','=b=c=d']
['a','=','=','b=c=d']
['a','=','=','b','=c=d']
['a','=','=','b','=','c=d']
['a','=','=','b','=','c','=d']
['a','=','=','b','=','c','=','d']
['a','=','=','b','=','c','=','d']
['a','','b','c','d']
['a','','b','c','d']
['a','','b','c','d']
['a==b=c=d']
['a','=b=c=d']
['a','','b=c=d']
['a','','b','c=d']
['a','','b','c','d']
['a','','b','c','d']
-- splitByRegexp
-- (default)
['a','bc','de','f']
['a','bc','de','f']
['a','bc','de','f']
['a']
['a','bc']
['a','bc','de']
['a','bc','de','f']
['a','bc','de','f']
['a','1','2','b','c','2','3','d','e','3','4','5','f']
['a','1','2','b','c','2','3','d','e','3','4','5','f']
['a','1','2','b','c','2','3','d','e','3','4','5','f']
['a']
['a','1']
['a','1','2']
['a','1','2','b']
['a','1','2','b','c']
-- (include remainder)
['a','1','2','b','c','2','3','d','e','3','4','5','f']
['a','1','2','b','c','2','3','d','e','3','4','5','f']
['a','1','2','b','c','2','3','d','e','3','4','5','f']
['a12bc23de345f']
['a','12bc23de345f']
['a','1','2bc23de345f']
['a','1','2','bc23de345f']
['a','1','2','b','c23de345f']
['a','bc','de','f']
['a','bc','de','f']
['a','bc','de','f']
['a12bc23de345f']
['a','bc23de345f']
['a','bc','de345f']
['a','bc','de','f']
['a','bc','de','f']
-- splitByAlpha
-- (default)
['ab','cd','ef','gh']
['ab','cd','ef','gh']
['ab','cd','ef','gh']
['ab']
['ab','cd']
['ab','cd','ef']
['ab','cd','ef','gh']
['ab','cd','ef','gh']
-- (include remainder)
['ab','cd','ef','gh']
['ab','cd','ef','gh']
['ab','cd','ef','gh']
['ab.cd.ef.gh']
['ab','cd.ef.gh']
['ab','cd','ef.gh']
['ab','cd','ef','gh']
['ab','cd','ef','gh']
-- splitByNonAlpha
-- (default)
['128','0','0','1']
['128','0','0','1']
['128','0','0','1']
['128']
['128','0']
['128','0','0']
['128','0','0','1']
['128','0','0','1']
-- (include remainder)
['128','0','0','1']
['128','0','0','1']
['128','0','0','1']
['128.0.0.1']
['128','0.0.1']
['128','0','0.1']
['128','0','0','1']
['128','0','0','1']
-- splitByWhitespace
-- (default)
['Nein,','nein,','nein!','Doch!']
['Nein,','nein,','nein!','Doch!']
['Nein,','nein,','nein!','Doch!']
['Nein,']
['Nein,','nein,']
['Nein,','nein,','nein!']
['Nein,','nein,','nein!','Doch!']
['Nein,','nein,','nein!','Doch!']
-- (include remainder)
['Nein,','nein,','nein!','Doch!']
['Nein,','nein,','nein!','Doch!']
['Nein,','nein,','nein!','Doch!']
['Nein, nein, nein! Doch!']
['Nein,','nein, nein! Doch!']
['Nein,','nein,','nein! Doch!']
['Nein,','nein,','nein!','Doch!']
['Nein,','nein,','nein!','Doch!']

@ -1,59 +1,175 @@
select splitByChar(',', '1,2,3');
select splitByChar(',', '1,2,3', -1);
select splitByChar(',', '1,2,3', 0);
select splitByChar(',', '1,2,3', 1);
select splitByChar(',', '1,2,3', 2);
select splitByChar(',', '1,2,3', 3);
select splitByChar(',', '1,2,3', 4);
select splitByRegexp('[ABC]', 'oneAtwoBthreeC');
select splitByRegexp('[ABC]', 'oneAtwoBthreeC', -1);
select splitByRegexp('[ABC]', 'oneAtwoBthreeC', 0);
select splitByRegexp('[ABC]', 'oneAtwoBthreeC', 1);
select splitByRegexp('[ABC]', 'oneAtwoBthreeC', 2);
select splitByRegexp('[ABC]', 'oneAtwoBthreeC', 3);
select splitByRegexp('[ABC]', 'oneAtwoBthreeC', 4);
select splitByRegexp('[ABC]', 'oneAtwoBthreeC', 5);
SELECT alphaTokens('abca1abc');
SELECT alphaTokens('abca1abc', -1);
SELECT alphaTokens('abca1abc', 0);
SELECT alphaTokens('abca1abc', 1);
SELECT alphaTokens('abca1abc', 2);
SELECT alphaTokens('abca1abc', 3);
SELECT splitByAlpha('abca1abc');
SELECT splitByNonAlpha(' 1! a, b. ');
SELECT splitByNonAlpha(' 1! a, b. ', -1);
SELECT splitByNonAlpha(' 1! a, b. ', 0);
SELECT splitByNonAlpha(' 1! a, b. ', 1);
SELECT splitByNonAlpha(' 1! a, b. ', 2);
SELECT splitByNonAlpha(' 1! a, b. ', 3);
SELECT splitByNonAlpha(' 1! a, b. ', 4);
SELECT splitByWhitespace(' 1! a, b. ');
SELECT splitByWhitespace(' 1! a, b. ', -1);
SELECT splitByWhitespace(' 1! a, b. ', 0);
SELECT splitByWhitespace(' 1! a, b. ', 1);
SELECT splitByWhitespace(' 1! a, b. ', 2);
SELECT splitByWhitespace(' 1! a, b. ', 3);
SELECT splitByWhitespace(' 1! a, b. ', 4);
SELECT splitByString(', ', '1, 2 3, 4,5, abcde');
SELECT splitByString(', ', '1, 2 3, 4,5, abcde', -1);
SELECT splitByString(', ', '1, 2 3, 4,5, abcde', 0);
SELECT splitByString(', ', '1, 2 3, 4,5, abcde', 1);
SELECT splitByString(', ', '1, 2 3, 4,5, abcde', 2);
SELECT splitByString(', ', '1, 2 3, 4,5, abcde', 3);
SELECT splitByString(', ', '1, 2 3, 4,5, abcde', 4);
SELECT splitByString(', ', '1, 2 3, 4,5, abcde', 5);
select splitByChar(',', '1,2,3', ''); -- { serverError 43 }
select splitByRegexp('[ABC]', 'oneAtwoBthreeC', ''); -- { serverError 43 }
SELECT '-- negative tests';
SELECT splitByChar(',', '1,2,3', ''); -- { serverError 43 }
SELECT splitByRegexp('[ABC]', 'oneAtwoBthreeC', ''); -- { serverError 43 }
SELECT alphaTokens('abca1abc', ''); -- { serverError 43 }
SELECT splitByAlpha('abca1abc', ''); -- { serverError 43 }
SELECT splitByNonAlpha(' 1! a, b. ', ''); -- { serverError 43 }
SELECT splitByWhitespace(' 1! a, b. ', ''); -- { serverError 43 }
SELECT splitByString(', ', '1, 2 3, 4,5, abcde', ''); -- { serverError 43 }
SELECT splitByString(', ', '1, 2 3, 4,5, abcde', ''); -- { serverError 43 }
SELECT '-- splitByChar';
SELECT '-- (default)';
SELECT splitByChar('=', 'a==b=c=d');
SELECT splitByChar('=', 'a==b=c=d', -1);
SELECT splitByChar('=', 'a==b=c=d', 0);
SELECT splitByChar('=', 'a==b=c=d', 1);
SELECT splitByChar('=', 'a==b=c=d', 2);
SELECT splitByChar('=', 'a==b=c=d', 3);
SELECT splitByChar('=', 'a==b=c=d', 4);
SELECT splitByChar('=', 'a==b=c=d', 5);
SELECT splitByChar('=', 'a==b=c=d', 6);
SELECT '-- (include remainder)';
SELECT splitByChar('=', 'a==b=c=d') SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByChar('=', 'a==b=c=d', -1) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByChar('=', 'a==b=c=d', 0) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByChar('=', 'a==b=c=d', 1) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByChar('=', 'a==b=c=d', 2) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByChar('=', 'a==b=c=d', 3) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByChar('=', 'a==b=c=d', 4) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByChar('=', 'a==b=c=d', 5) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByChar('=', 'a==b=c=d', 6) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT '-- splitByString';
SELECT '-- (default)';
SELECT splitByString('', 'a==b=c=d') SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('', 'a==b=c=d', -1) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('', 'a==b=c=d', 0) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('', 'a==b=c=d', 1) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('', 'a==b=c=d', 2) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('', 'a==b=c=d', 3) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('', 'a==b=c=d', 4) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('', 'a==b=c=d', 5) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('', 'a==b=c=d', 6) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('', 'a==b=c=d', 7) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('', 'a==b=c=d', 8) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('', 'a==b=c=d', 9) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('=', 'a==b=c=d') SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('=', 'a==b=c=d', -1) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('=', 'a==b=c=d', 0) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('=', 'a==b=c=d', 1) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('=', 'a==b=c=d', 2) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('=', 'a==b=c=d', 3) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('=', 'a==b=c=d', 4) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('=', 'a==b=c=d', 5) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT splitByString('=', 'a==b=c=d', 6) SETTINGS splitby_max_substrings_includes_remaining_string = 0;
SELECT '-- (include remainder)';
SELECT splitByString('', 'a==b=c=d') SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('', 'a==b=c=d', -1) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('', 'a==b=c=d', 0) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('', 'a==b=c=d', 1) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('', 'a==b=c=d', 2) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('', 'a==b=c=d', 3) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('', 'a==b=c=d', 4) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('', 'a==b=c=d', 5) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('', 'a==b=c=d', 6) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('', 'a==b=c=d', 7) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('', 'a==b=c=d', 8) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('', 'a==b=c=d', 9) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('=', 'a==b=c=d') SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('=', 'a==b=c=d', -1) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('=', 'a==b=c=d', 0) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('=', 'a==b=c=d', 1) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('=', 'a==b=c=d', 2) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('=', 'a==b=c=d', 3) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('=', 'a==b=c=d', 4) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('=', 'a==b=c=d', 5) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByString('=', 'a==b=c=d', 6) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT '-- splitByRegexp';
SELECT '-- (default)';
SELECT splitByRegexp('\\d+', 'a12bc23de345f');
SELECT splitByRegexp('\\d+', 'a12bc23de345f', -1);
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 0);
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 1);
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 2);
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 3);
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 4);
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 5);
SELECT splitByRegexp('', 'a12bc23de345f');
SELECT splitByRegexp('', 'a12bc23de345f', -1);
SELECT splitByRegexp('', 'a12bc23de345f', 0);
SELECT splitByRegexp('', 'a12bc23de345f', 1);
SELECT splitByRegexp('', 'a12bc23de345f', 2);
SELECT splitByRegexp('', 'a12bc23de345f', 3);
SELECT splitByRegexp('', 'a12bc23de345f', 4);
SELECT splitByRegexp('', 'a12bc23de345f', 5);
SELECT '-- (include remainder)';
SELECT splitByRegexp('', 'a12bc23de345f') SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByRegexp('', 'a12bc23de345f', -1) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByRegexp('', 'a12bc23de345f', 0) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByRegexp('', 'a12bc23de345f', 1) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByRegexp('', 'a12bc23de345f', 2) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByRegexp('', 'a12bc23de345f', 3) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByRegexp('', 'a12bc23de345f', 4) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByRegexp('', 'a12bc23de345f', 5) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByRegexp('\\d+', 'a12bc23de345f') SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByRegexp('\\d+', 'a12bc23de345f', -1) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 0) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 1) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 2) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 3) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 4) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 5) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT '-- splitByAlpha';
SELECT '-- (default)';
SELECT splitByAlpha('ab.cd.ef.gh');
SELECT splitByAlpha('ab.cd.ef.gh', -1);
SELECT splitByAlpha('ab.cd.ef.gh', 0);
SELECT splitByAlpha('ab.cd.ef.gh', 1);
SELECT splitByAlpha('ab.cd.ef.gh', 2);
SELECT splitByAlpha('ab.cd.ef.gh', 3);
SELECT splitByAlpha('ab.cd.ef.gh', 4);
SELECT splitByAlpha('ab.cd.ef.gh', 5);
SELECT '-- (include remainder)';
SELECT splitByAlpha('ab.cd.ef.gh') SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByAlpha('ab.cd.ef.gh', -1) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByAlpha('ab.cd.ef.gh', 0) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByAlpha('ab.cd.ef.gh', 1) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByAlpha('ab.cd.ef.gh', 2) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByAlpha('ab.cd.ef.gh', 3) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByAlpha('ab.cd.ef.gh', 4) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByAlpha('ab.cd.ef.gh', 5) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT '-- splitByNonAlpha';
SELECT '-- (default)';
SELECT splitByNonAlpha('128.0.0.1');
SELECT splitByNonAlpha('128.0.0.1', -1);
SELECT splitByNonAlpha('128.0.0.1', 0);
SELECT splitByNonAlpha('128.0.0.1', 1);
SELECT splitByNonAlpha('128.0.0.1', 2);
SELECT splitByNonAlpha('128.0.0.1', 3);
SELECT splitByNonAlpha('128.0.0.1', 4);
SELECT splitByNonAlpha('128.0.0.1', 5);
SELECT '-- (include remainder)';
SELECT splitByNonAlpha('128.0.0.1') SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByNonAlpha('128.0.0.1', -1) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByNonAlpha('128.0.0.1', 0) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByNonAlpha('128.0.0.1', 1) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByNonAlpha('128.0.0.1', 2) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByNonAlpha('128.0.0.1', 3) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByNonAlpha('128.0.0.1', 4) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByNonAlpha('128.0.0.1', 5) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT '-- splitByWhitespace';
SELECT '-- (default)';
SELECT splitByWhitespace('Nein, nein, nein! Doch!');
SELECT splitByWhitespace('Nein, nein, nein! Doch!', -1);
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 0);
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 1);
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 2);
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 3);
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 4);
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 5);
SELECT '-- (include remainder)';
SELECT splitByWhitespace('Nein, nein, nein! Doch!') SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByWhitespace('Nein, nein, nein! Doch!', -1) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 0) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 1) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 2) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 3) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 4) SETTINGS splitby_max_substrings_includes_remaining_string = 1;
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 5) SETTINGS splitby_max_substrings_includes_remaining_string = 1;

View File

@ -1,126 +0,0 @@
-- splitByAlpha
['ab','cd','ef','gh']
['ab','cd','ef','gh']
['ab','cd','ef','gh']
['ab']
['ab','cd']
['ab','cd','ef','gh']
['ab','cd','ef','gh']
['ab.cd.ef.gh']
['ab','cd.ef.gh']
['ab','cd','ef.gh']
['ab','cd','ef','gh']
['ab','cd','ef','gh']
['ab','cd','ef','gh']
['ab.cd.ef.gh']
['ab','cd.ef.gh']
-- splitByNonAlpha
['128','0','0','1']
['128','0','0','1']
['128','0','0','1']
['128']
['128','0']
['128','0','0','1']
['128','0','0','1']
['128.0.0.1']
['128','0.0.1']
['128','0','0.1']
['128','0','0','1']
['128','0','0','1']
['128','0','0','1']
['128.0.0.1']
['128','0.0.1']
-- splitByWhitespace
['Nein,','nein,','nein!','Doch!']
['Nein,','nein,','nein!','Doch!']
['Nein,','nein,','nein!','Doch!']
['Nein,']
['Nein,','nein,']
['Nein,','nein,','nein!','Doch!']
['Nein,','nein,','nein!','Doch!']
['Nein, nein, nein! Doch!']
['Nein,','nein, nein! Doch!']
['Nein,','nein,','nein! Doch!']
['Nein,','nein,','nein!','Doch!']
['Nein,','nein,','nein!','Doch!']
['Nein,','nein,','nein!','Doch!']
['Nein, nein, nein! Doch!']
['Nein,','nein, nein! Doch!']
-- splitByChar
['a','','b','c','d']
['a','','b','c','d']
['a','','b','c','d']
['a']
['a','']
['a','','b','c','d']
['a','','b','c','d']
['a==b=c=d']
['a','=b=c=d']
['a','','b=c=d']
['a','','b','c','d']
['a','','b','c','d']
['a','','b','c','d']
['a==b=c=d']
['a','=b=c=d']
-- splitByString
['a','b=c=d']
['a','b=c=d']
['a','b=c=d']
['a']
['a','b=c=d']
['a','b=c=d']
['a','b=c=d']
['a==b=c=d']
['a','b=c=d']
['a','b=c=d']
['a','b=c=d']
['a','b=c=d']
['a','b=c=d']
['a==b=c=d']
['a','b=c=d']
['a','=','=','b','=','c','=','d']
['a','=','=','b','=','c','=','d']
['a','=','=','b','=','c','=','d']
['a']
['a','=']
['a','=','=','b','=','c','=','d']
['a','=','=','b','=','c','=','d']
['a==b=c=d']
['a','==b=c=d']
['a','=','=b=c=d']
['a','=','=','b','=','c','=','d']
['a','=','=','b','=','c','=','d']
['a','=','=','b','=','c','=','d']
['a==b=c=d']
['a','==b=c=d']
-- splitByRegexp
['a','bc','de','f']
['a','bc','de','f']
['a','bc','de','f']
['a']
['a','bc']
['a','bc','de','f']
['a','bc','de','f']
['a12bc23de345f']
['a','bc23de345f']
['a','bc','de345f']
['a','bc','de','f']
['a','bc','de','f']
['a','bc','de','f']
['a12bc23de345f']
['a','bc23de345f']
['a','1','2','b','c','2','3','d','e','3','4','5','f']
['a','1','2','b','c','2','3','d','e','3','4','5','f']
['a','1','2','b','c','2','3','d','e','3','4','5','f']
['a']
['a','1']
['a','1','2','b','c','2','3','d','e','3','4','5','f']
['a','1','2','b','c','2','3','d','e','3','4','5','f']
['a12bc23de345f']
['a','12bc23de345f']
['a','1','2bc23de345f']
['a','1','2','b','c','2','3','d','e','3','4','5','f']
['a','1','2','b','c','2','3','d','e','3','4','5','f']
['a','1','2','b','c','2','3','d','e','3','4','5','f']
['a12bc23de345f']
['a','12bc23de345f']

View File

@ -1,151 +0,0 @@
SELECT '-- splitByAlpha';
SELECT splitByAlpha('ab.cd.ef.gh') SETTINGS splitby_max_substring_behavior = '';
SELECT splitByAlpha('ab.cd.ef.gh', -1) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByAlpha('ab.cd.ef.gh', 0) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByAlpha('ab.cd.ef.gh', 1) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByAlpha('ab.cd.ef.gh', 2) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByAlpha('ab.cd.ef.gh') SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByAlpha('ab.cd.ef.gh', -1) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByAlpha('ab.cd.ef.gh', 0) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByAlpha('ab.cd.ef.gh', 1) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByAlpha('ab.cd.ef.gh', 2) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByAlpha('ab.cd.ef.gh') SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByAlpha('ab.cd.ef.gh', -1) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByAlpha('ab.cd.ef.gh', 0) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByAlpha('ab.cd.ef.gh', 1) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByAlpha('ab.cd.ef.gh', 2) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT '-- splitByNonAlpha';
SELECT splitByNonAlpha('128.0.0.1') SETTINGS splitby_max_substring_behavior = '';
SELECT splitByNonAlpha('128.0.0.1', -1) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByNonAlpha('128.0.0.1', 0) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByNonAlpha('128.0.0.1', 1) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByNonAlpha('128.0.0.1', 2) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByNonAlpha('128.0.0.1') SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByNonAlpha('128.0.0.1', -1) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByNonAlpha('128.0.0.1', 0) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByNonAlpha('128.0.0.1', 1) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByNonAlpha('128.0.0.1', 2) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByNonAlpha('128.0.0.1') SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByNonAlpha('128.0.0.1', -1) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByNonAlpha('128.0.0.1', 0) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByNonAlpha('128.0.0.1', 1) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByNonAlpha('128.0.0.1', 2) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT '-- splitByWhitespace';
SELECT splitByWhitespace('Nein, nein, nein! Doch!') SETTINGS splitby_max_substring_behavior = '';
SELECT splitByWhitespace('Nein, nein, nein! Doch!', -1) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 0) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 1) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 2) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByWhitespace('Nein, nein, nein! Doch!') SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByWhitespace('Nein, nein, nein! Doch!', -1) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 0) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 1) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 2) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByWhitespace('Nein, nein, nein! Doch!') SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByWhitespace('Nein, nein, nein! Doch!', -1) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 0) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 1) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByWhitespace('Nein, nein, nein! Doch!', 2) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT '-- splitByChar';
SELECT splitByChar('=', 'a==b=c=d') SETTINGS splitby_max_substring_behavior = '';
SELECT splitByChar('=', 'a==b=c=d', -1) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByChar('=', 'a==b=c=d', 0) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByChar('=', 'a==b=c=d', 1) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByChar('=', 'a==b=c=d', 2) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByChar('=', 'a==b=c=d') SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByChar('=', 'a==b=c=d', -1) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByChar('=', 'a==b=c=d', 0) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByChar('=', 'a==b=c=d', 1) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByChar('=', 'a==b=c=d', 2) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByChar('=', 'a==b=c=d') SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByChar('=', 'a==b=c=d', -1) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByChar('=', 'a==b=c=d', 0) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByChar('=', 'a==b=c=d', 1) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByChar('=', 'a==b=c=d', 2) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT '-- splitByString';
SELECT splitByString('==', 'a==b=c=d') SETTINGS splitby_max_substring_behavior = '';
SELECT splitByString('==', 'a==b=c=d', -1) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByString('==', 'a==b=c=d', 0) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByString('==', 'a==b=c=d', 1) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByString('==', 'a==b=c=d', 2) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByString('==', 'a==b=c=d') SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByString('==', 'a==b=c=d', -1) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByString('==', 'a==b=c=d', 0) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByString('==', 'a==b=c=d', 1) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByString('==', 'a==b=c=d', 2) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByString('==', 'a==b=c=d') SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByString('==', 'a==b=c=d', -1) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByString('==', 'a==b=c=d', 0) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByString('==', 'a==b=c=d', 1) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByString('==', 'a==b=c=d', 2) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByString('', 'a==b=c=d') SETTINGS splitby_max_substring_behavior = '';
SELECT splitByString('', 'a==b=c=d', -1) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByString('', 'a==b=c=d', 0) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByString('', 'a==b=c=d', 1) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByString('', 'a==b=c=d', 2) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByString('', 'a==b=c=d') SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByString('', 'a==b=c=d', -1) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByString('', 'a==b=c=d', 0) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByString('', 'a==b=c=d', 1) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByString('', 'a==b=c=d', 2) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByString('', 'a==b=c=d') SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByString('', 'a==b=c=d', -1) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByString('', 'a==b=c=d', 0) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByString('', 'a==b=c=d', 1) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByString('', 'a==b=c=d', 2) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT '-- splitByRegexp';
SELECT splitByRegexp('\\d+', 'a12bc23de345f') SETTINGS splitby_max_substring_behavior = '';
SELECT splitByRegexp('\\d+', 'a12bc23de345f', -1) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 0) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 1) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 2) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByRegexp('\\d+', 'a12bc23de345f') SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByRegexp('\\d+', 'a12bc23de345f', -1) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 0) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 1) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 2) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByRegexp('\\d+', 'a12bc23de345f') SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByRegexp('\\d+', 'a12bc23de345f', -1) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 0) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 1) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByRegexp('\\d+', 'a12bc23de345f', 2) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByRegexp('', 'a12bc23de345f') SETTINGS splitby_max_substring_behavior = '';
SELECT splitByRegexp('', 'a12bc23de345f', -1) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByRegexp('', 'a12bc23de345f', 0) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByRegexp('', 'a12bc23de345f', 1) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByRegexp('', 'a12bc23de345f', 2) SETTINGS splitby_max_substring_behavior = '';
SELECT splitByRegexp('', 'a12bc23de345f') SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByRegexp('', 'a12bc23de345f', -1) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByRegexp('', 'a12bc23de345f', 0) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByRegexp('', 'a12bc23de345f', 1) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByRegexp('', 'a12bc23de345f', 2) SETTINGS splitby_max_substring_behavior = 'python';
SELECT splitByRegexp('', 'a12bc23de345f') SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByRegexp('', 'a12bc23de345f', -1) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByRegexp('', 'a12bc23de345f', 0) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByRegexp('', 'a12bc23de345f', 1) SETTINGS splitby_max_substring_behavior = 'spark';
SELECT splitByRegexp('', 'a12bc23de345f', 2) SETTINGS splitby_max_substring_behavior = 'spark';