Commit Graph

16 Commits

Author SHA1 Message Date
Robert Schulze
74937cf27b
Reject DoS-prone hyperscan regexes 2023-02-09 17:17:35 +00:00
Robert Schulze
6ec4f3cf3d
Implement non-const needle arguments in multiSearchAllPositions 2022-07-14 06:24:28 +00:00
Robert Schulze
49348b833a
Simplify 2022-07-07 20:25:26 +00:00
Robert Schulze
1de5e9a7da
Avoid copying array elements 2022-07-07 12:33:34 +00:00
Robert Schulze
d0b2f13f9d
Fix style check 2022-07-05 13:41:52 +02:00
Robert Schulze
1eed72b525
Make more multi-search methods work with non-const needles
After making the functions multi[Fuzzy]Match(Any|AnyIndex|AllIndices)() work
with non-const needles, 12 more functions started to fail in test
"00233_position_function_family":

multiSearchAny()
multiSearchAnyCaseInsensitive()
multiSearchAnyUTF8()
multiSearchAnyCaseInsensitiveUTF8()

multiSearchFirstPosition()
multiSearchFirstPositionCaseInsensitive()
multiSearchFirstPositionUTF8()
multiSearchFirstPositionCaseInsensitiveUTF8()

multiSearchFirstIndex()
multiSearchFirstIndexCaseInsensitive()
multiSearchFirstIndexUTF8()
multiSearchFirstIndexCaseInsensitiveUTF8()

Failing queries take the form
  select 0 = multiSearchAny('\0', CAST([], 'Array(String)'));
2022-07-04 14:00:21 +00:00
Robert Schulze
d547aa7849
Allow non-const pattern array argument in multi[Fuzzy]Match*()
Resolves #38046
2022-07-04 10:43:16 +00:00
Robert Schulze
959cbaab02
Move loop over patterns into implementations
- This is preparation for non-const regexp arguments, where this loop
  will run for each row.
2022-06-26 16:26:13 +00:00
Robert Schulze
e2b11899a1
Move check if cfg allows hyperscan into implementations
- This is not needed for non-const regexp array arguments but cleans up
  the code and runs the check only in functions which actually use
  hyperscan.
2022-06-26 16:25:49 +00:00
Robert Schulze
3478db9fb6
Move check for regexp array size into implementations
- This is not needed for non-const regexp array arguments (the
  cardinality of arrays is fixed per column) but it cleans up the code
  and runs the check only in functions which have restrictions on the
  number of patterns.

- For functions using hyperscan, it was checked that the number of
  regexes is < 2^32. Removed the check because I don't think anyone will
  ever specify 4 billion patterns.
2022-06-26 15:38:12 +00:00
Robert Schulze
7913edc172
Move check for hyperscan regexp constraints into implementations
- This is preparation for non-const regexp arguments, where this check
  will run for each row.
2022-06-26 15:38:05 +00:00
Robert Schulze
bb7c627964
Cosmetics: Pass patterns around as std::string_view instead of StringRef
- The patterns are not used in hashing, so there should be no performance
  impact when we use types from the standard library instead.

- Added forgotten .reserve() in FunctionsMultiStringPosition.h
2022-06-26 15:32:19 +00:00
Vasily Nemkov
2c6b9aa174
Better exception messages for some String-related functions 2021-09-26 08:18:37 +03:00
vdimir
fbcefaee5d
Fill result vector only for empty input in multiSearch functions 2021-08-06 11:49:51 +03:00
vdimir
b8558a1716
Fix uninitialized memory in functions multiSearch* with empty array 2021-08-04 16:44:39 +03:00
Ivan Lezhankin
06446b4f08
dbms/ → src/ 2020-04-03 18:14:31 +03:00