Commit Graph

108 Commits

Author SHA1 Message Date
Alexander Gololobov
a02a631d51 Cleanups based on code review 2022-12-29 15:00:42 +01:00
Alexander Gololobov
7df137e460 Replaced asserts with logical errors 2022-12-29 14:33:11 +01:00
Alexander Gololobov
059ec6f747 Cleanups 2022-12-29 01:22:47 +01:00
Alexander Gololobov
10a058d138 More cleanups in the logic of applying current step filter and final filter 2022-12-28 18:07:36 +01:00
Alexander Gololobov
fd5d328fae Test accumulating filters ignoring prewhere_info->need_filter flag 2022-12-28 18:07:36 +01:00
Alexander Gololobov
a7adc0a91b Cleanups 2022-12-28 18:07:36 +01:00
Alexander Gololobov
b22711baa3 Reset need_filter flag when filter is applied 2022-12-28 18:07:36 +01:00
Alexander Gololobov
ada6422985 Restored old logic for filling _part_offset 2022-12-28 18:07:36 +01:00
Alexander Gololobov
4cebc6f3a4 Cleanups 2022-12-28 18:07:36 +01:00
Alexander Gololobov
13e457c754 Cleanups 2022-12-28 18:07:36 +01:00
Alexander Gololobov
ac1549f6b3 Skip filtering if there are no rows after optimize() 2022-12-28 18:07:35 +01:00
Alexander Gololobov
f273f8712d Avoid filtering same column in block_before_prewhere if it is present in the result 2022-12-28 18:07:35 +01:00
Alexander Gololobov
f3646248c5 Avoid unneeded work if all rows were filtered 2022-12-28 18:07:35 +01:00
Alexander Gololobov
75152ddabb Apply filter only if needed 2022-12-28 18:07:35 +01:00
Alexander Gololobov
a18850458c Test applying current filter at each step 2022-12-28 18:07:35 +01:00
Alexander Gololobov
29b5c4af07 Test dirty intermediate changes 2022-12-28 18:07:35 +01:00
Alexander Gololobov
c561acb774 Properly handle low cardinality column as prewhere filter 2022-12-28 18:07:35 +01:00
Alexander Gololobov
aa276b230b Don't need to save filter and rows_per_granule from previous step 2022-12-28 18:07:35 +01:00
Alexander Gololobov
c4a01cbd5b Fix for properly cleaning rows_per_granule_original between prewhere steps 2022-12-28 18:07:35 +01:00
Alexander Gololobov
abbb58107c Fix for "out of bound" in ColumnVector::insertRangeFrom called from shrink() 2022-12-28 18:07:35 +01:00
Alexander Gololobov
79874e8733 Fix for "Invalid number of rows in Chunk" 2022-12-28 18:07:35 +01:00
Alexander Gololobov
bdf51545f7 Added FilterWithCachedCount class instead of caching counts in filter_bytes_map 2022-12-28 18:07:35 +01:00
Azat Khuzhin
31a88d4eae Fix PREWHERE with row-level filters (when row filter is always true/false)
In case of row-level filters optimized out, i.e. converted to
always true/false, it is possible for MergeTreeRangeReader to reuse
incorrect statistics for the filter (countBytesInResultFilter()), and
because of this it simply does not apply other filters, since it assumes
that this filter does not need to filter anything.

Fixes: #40956
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-12-28 18:07:35 +01:00
Alexander Gololobov
d44392b366 Checking the fix for "Invalid number of rows in Chunk" 2022-12-28 18:07:35 +01:00
kssenii
83514fa2ef Refactor 2022-09-05 20:08:22 +02:00
Robert Schulze
a7734672b9 Use std::popcount, ::countl_zero, ::countr_zero functions
- Introduced with the C++20 <bit> header

- The problem with __builtin_c(l|t)z() is that 0 as input has an
  undefined result (*) and the code did not always check. The std::
  versions do not have this issue.

- In some cases, we continue to use __builtin_c(l|t)z() (e.g. in
  src/Common/BitHelpers.h) because the std:: versions only accept
  unsigned inputs (and they also check that) and the casting would be
  ugly.

(*) https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
2022-07-31 15:16:51 +00:00
Alexander Gololobov
48de02a7b8 Capitalized const name 2022-07-25 16:32:16 +02:00
Alexander Gololobov
594195451e Cleanups 2022-07-24 12:21:18 +02:00
Alexander Gololobov
1ea9f143ff Leave only _row_exists-based implementation of lightweight delete 2022-07-21 11:26:13 +02:00
Alexander Gololobov
ae0d00083c Renamed __row_exists to _row_exists 2022-07-18 20:07:36 +02:00
Alexander Gololobov
f324ca9921 Cleanups 2022-07-18 20:07:22 +02:00
Alexander Gololobov
9de72d995a POC lightweight delete using __row_exists virtual column and prewhere-like filtering 2022-07-18 20:06:42 +02:00
jianmei zhang
ca42f649da Rewrite logic for loading deleted mask related to getDeletedMask() 2022-07-15 15:31:10 +08:00
jianmei zhang
d37152a5d6 Remove loadDeletedMask() and get deleted mask when needed 2022-07-15 12:32:42 +08:00
jianmei zhang
7e433859ea Change deleted rows mask from String to Native UInt8 format 2022-07-15 12:32:41 +08:00
jianmei zhang
8ad2bb7c33 Code changes due to master new fixes, and update reference for mutations table 2022-07-15 12:32:41 +08:00
jianmei zhang
11fdea6e4b Add missing code for deleted_mask_filter_holder 2022-07-15 12:32:41 +08:00
jianmei zhang
9d27af7ee2 For mutations of some columns, skip applying the deleted mask when reading those columns. Also add unit test case 2022-07-15 12:32:41 +08:00
jianmei zhang
b4a37e1e22 Disable optimizations for count() when lightweight delete exists, add hasLightweightDelete() function in IMergeTreeDataPart 2022-07-15 12:32:41 +08:00
jianmei zhang
8696319d62 Support lightweight delete execution using string as deleted rows mask, also part of select can handle LWD 2022-07-15 12:32:41 +08:00
Alexander Gololobov
0ee47363d4 Fixed includes 2022-06-22 19:08:18 +02:00
Alexander Gololobov
e5b55b965b Removed incorrect check 2022-06-22 17:23:09 +02:00
Alexander Gololobov
dbc6d1a159 Cleanups 2022-06-22 17:23:09 +02:00
Alexander Gololobov
b3922461b3 Properly handle empty actions 2022-06-22 17:23:09 +02:00
Alexander Gololobov
ba89c3954c Do not add the same virtual if it has been added by prev_reader 2022-06-22 17:23:09 +02:00
Alexander Gololobov
a9e3b8d29e Don't read the same columns again 2022-06-22 17:23:09 +02:00
Alexander Gololobov
4e426c63cc Debugging test failures 2022-06-22 17:23:09 +02:00
Alexander Gololobov
6a26325fab Test dirty hacks for multiple PREWHERE steps 2022-06-22 17:23:05 +02:00
Alexander Gololobov
87b669f439 Intermediate changes 2022-06-22 17:17:42 +02:00
Alexander Gololobov
64a2f3734b Protect ReadResult internals from MergeTreeRangeReader clients 2022-06-22 17:17:42 +02:00