ClickHouse/docs/en
johanngan bcb058f999 Add case insensitive and dot-all modes to RegExpTree dictionary
The new per-dictionary settings control regex match semantics around
case sensitivity and the '.' wildcard with newlines. They must be set at
the dictionary level since they're applied to regex engines at
pattern-compile-time.

- regexp_dict_flag_case_insensitive: case-insensitive matching
- regexp_dict_flag_dotall: '.' matches all characters, including newlines

They correspond to HS_FLAG_CASELESS and HS_FLAG_DOTALL in Vectorscan,
and to case_sensitive (set to false) and dot_nl in RE2. These are the
most useful options compatible with the way RegExpTreeDictionary
internally splits simple and complex patterns between Vectorscan and RE2.
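
The effect of the two flags can be sketched with Python's re module, used
here only as a stand-in for Vectorscan and RE2. The helper names
compile_patterns and first_match are hypothetical and are not part of
ClickHouse; the point is only that the flags are fixed once, when the
patterns are compiled, and then apply to every pattern in the set.

```python
import re


def compile_patterns(patterns, case_insensitive=False, dotall=False):
    """Compile every pattern with the same set-wide flags.

    Hypothetical analogue of dictionary-level settings: the flags are
    chosen once at compile time and shared by all patterns.
    """
    flags = 0
    if case_insensitive:
        flags |= re.IGNORECASE  # analogous to HS_FLAG_CASELESS
    if dotall:
        flags |= re.DOTALL      # analogous to HS_FLAG_DOTALL / dot_nl
    return [re.compile(p, flags) for p in patterns]


def first_match(compiled, text):
    """Return the index of the first pattern that matches text, or None."""
    for index, regex in enumerate(compiled):
        if regex.search(text):
            return index
    return None
```

With the defaults, "clickhouse" does not match "ClickHouse" and "a.b" does
not match "a\nb"; each match succeeds once the corresponding flag is passed
to compile_patterns.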

The alternative is to prefix every pattern with (?i) and/or (?s). However,
(?s) isn't handled properly by OptimizedRegularExpression::analyze().
And while (?i) is handled, it still causes the dictionary to treat the
pattern as "complex" and scan it sequentially with RE2 rather than
multi-match it with Vectorscan, even though Vectorscan supports
case-insensitive literal matching. Setting dictionary-wide flags is more
convenient and circumvents these problems.
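
For comparison, the per-pattern alternative amounts to rewriting every
pattern with an inline prefix. The sketch below uses Python's re module as
a stand-in, and with_inline_flags is a hypothetical helper. It shows only
that the two spellings match the same inputs; it does not reproduce how
ClickHouse classifies patterns as simple or complex.

```python
import re


def with_inline_flags(pattern, case_insensitive=False, dotall=False):
    """Prefix a pattern with (?i), (?s), or (?is) as requested."""
    letters = ("i" if case_insensitive else "") + ("s" if dotall else "")
    return f"(?{letters}){pattern}" if letters else pattern


# Inline prefix on the pattern itself ...
inline = re.compile(with_inline_flags("a.b", case_insensitive=True, dotall=True))
# ... versus the same pattern left untouched, with flags given at compile time.
flagged = re.compile("a.b", re.IGNORECASE | re.DOTALL)
```

Both inline and flagged match "A\nB", but only the second leaves the
pattern text unchanged.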
2023-09-06 11:28:53 -05:00
development Merge pull request #53701 from ClibMouse/feature/qemu-s390x-docs 2023-08-23 18:29:45 +03:00
engines Update generate.md 2023-09-05 15:55:58 -04:00
getting-started Various fixups 2023-08-31 19:18:42 +00:00
interfaces Clarify that the cloud MySQL interface is under private preview 2023-08-30 11:22:47 -07:00
operations Merge pull request #53638 from arenadata/ADQM-987 2023-09-05 17:03:41 +02:00
sql-reference Add case insensitive and dot-all modes to RegExpTree dictionary 2023-09-06 11:28:53 -05:00