Commit Graph

156 Commits

Author SHA1 Message Date
Pavel Kruglov
0662df8b76 Fix performance with JIT, add arguments to function isSuitableForShortCircuitArgumentsExecution 2021-08-09 17:54:14 +03:00
Pavel Kruglov
e792fa588f Mark all functions as suitable or not for executing as short circuit arguments 2021-08-09 17:50:09 +03:00
kssenii
9ca422f0c5 Introduce CAST for internal usage 2021-08-07 09:03:10 +00:00
Nikolai Kochetov
7a24e72e76 Merge branch 'master' into fix-header-for-scalar-query-with-empty-result 2021-07-19 15:48:44 +03:00
Nikolai Kochetov
96e20e2641 Fix some tests. 2021-07-19 15:35:55 +03:00
alexey-milovidov
b52411a715
Merge pull request #12455 from amosbird/npc
Nullable primary key with correct KeyCondition
2021-07-18 17:52:20 +03:00
Nikolai Kochetov
c22f856d36 Fix indexHint 2021-06-23 15:19:22 +03:00
Nikolai Kochetov
a45290bfb3 Fix some tests. 2021-06-22 18:52:14 +03:00
Nikolai Kochetov
f5f57781b7 Fix build after merge. 2021-06-22 17:45:22 +03:00
Nikolai Kochetov
3dc0b9c096 Merge branch 'master' into use-dag-in-key-condition 2021-06-22 17:02:01 +03:00
Nikolai Kochetov
47f130d39c Try use expression for KeyCondition from query plan. 2021-06-22 16:54:00 +03:00
Nikolai Kochetov
21e39e10ea Update KeyCondition constructor 2021-06-22 13:28:56 +03:00
Nikolai Kochetov
d15d16fee0 Fix build. 2021-06-22 10:26:45 +03:00
Nikolai Kochetov
68176f064b Fix some tests. 2021-06-21 20:28:15 +03:00
Nikolai Kochetov
eef6c73030 Use DAG in KeyCondition 2021-06-21 19:17:05 +03:00
Anton Popov
ffa56bde24 fix usage of index with array columns and ARRAY JOIN 2021-06-21 15:34:05 +03:00
Nikolai Kochetov
47387d4faa Merge branch 'master' into use-dag-in-key-condition 2021-06-21 14:10:09 +03:00
Amos Bird
f2ed5ef42b
Nullable primary key with correct KeyCondition 2021-06-18 23:04:24 +08:00
Kseniia Sumarokova
e08c05cdf5
Merge pull request #25295 from ClickHouse/remove-some-code-from-key-condition
Remove some code from KeyCondition.
2021-06-16 10:12:12 +03:00
Nikolai Kochetov
96d98ff020 Add comment 2021-06-15 21:42:26 +03:00
mergify[bot]
7959d92029
Merge branch 'master' into minor-changes-3 2021-06-15 18:07:24 +00:00
Nikolai Kochetov
80a13c489b Revert back moduloLegacy check for canConstantBeWrappedByFunctions. 2021-06-15 18:21:31 +03:00
Nikolai Kochetov
481b87b37a Remove some code from keyCondition. 2021-06-15 16:47:37 +03:00
Nikita Mikhaylov
a52bba91b7
Merge pull request #16401 from abyss7/ast-table-identifier-2
ASTTableIdentifier part #2: Introduce ASTTableIdentifier
2021-06-15 13:51:30 +03:00
Alexey Milovidov
447d7bb8cd Minor changes 2021-06-14 07:13:35 +03:00
Nikolai Kochetov
6197d20c18
Update KeyCondition.cpp 2021-06-08 15:48:14 +03:00
Nikolai Kochetov
c4832fd3c0 Added test. 2021-06-07 21:24:32 +03:00
Nikolai Kochetov
0d2a839ca4 Fix tests. 2021-06-07 16:41:40 +03:00
Nikolai Kochetov
397f6133e0 Refactor canConstantBeWrappedByMonotonicFunctions function. 2021-06-04 20:56:56 +03:00
Nikolai Kochetov
3e5a1cda60 Revert 2021-06-03 17:44:59 +03:00
Nikolai Kochetov
726e22ea1d Always return false for canConstantBeWrappedByMonotonicFunctions. 2021-06-03 16:26:04 +03:00
Nikolai Kochetov
dee032c899 Part 2. 2021-06-03 15:27:38 +03:00
Nikolai Kochetov
38966e3e6b Part 2. 2021-06-03 15:26:02 +03:00
Nikolai Kochetov
c855cf7057 Part 1. 2021-06-02 19:56:24 +03:00
Ivan Lezhankin
365e52817b More fixes due to "in" function arguments being incorrectly checked as ASTIdentifier 2021-06-01 14:20:03 +03:00
kssenii
70469429c1 Fixes 2021-05-24 23:39:56 +00:00
kssenii
1ffaf1d793 Better check 2021-05-23 21:53:32 +00:00
kssenii
fcfec83875 Fix comparisons with modulo key (version 2) 2021-05-21 16:40:47 +00:00
kssenii
30845a383f Fix comparisons with modulo key 2021-05-21 15:01:41 +00:00
Maksim Kita
947f28d430 IFunction refactoring 2021-05-15 20:33:15 +03:00
Alexey Milovidov
a5a4e64ba7 Fix a few PVS-Studio warnings 2021-04-27 07:22:32 +03:00
alexey-milovidov
186b1128d0
Merge pull request #23310 from amosbird/fixbugindex
Don't relax NOT conditions during partition pruning.
2021-04-27 07:13:18 +03:00
alexey-milovidov
cae5260817
Update KeyCondition.cpp 2021-04-24 05:34:35 +03:00
Amos Bird
32c84f77c3
Resurrect indexHint function. 2021-04-20 19:27:23 +08:00
Amos Bird
d5f606c544
Another fix 2021-04-20 14:15:28 +08:00
Amos Bird
aeff06d67d
Don't relax NOT conditions during partition pruning. 2021-04-19 22:15:53 +08:00
Nikolai Kochetov
4d86f51eff Merge branch 'master' into add-read-from-mt-step 2021-04-19 10:17:21 +03:00
Nikolai Kochetov
8d8e57615c A little bit better index description. 2021-04-16 12:42:23 +03:00
Nikolai Kochetov
be52b2889a Better description for key condition. 2021-04-15 20:30:04 +03:00
Alexey Milovidov
97611faad0 Whitespace 2021-04-13 22:06:24 +03:00
alexey-milovidov
46e1da03fb
Update KeyCondition.cpp 2021-04-13 18:47:11 +03:00
Ivan
495c6e03aa
Replace all Context references with std::weak_ptr (#22297)
* Replace all Context references with std::weak_ptr

* Fix shared context captured by value

* Fix build

* Fix Context with named sessions

* Fix copy context

* Fix gcc build

* Merge with master and fix build

* Fix gcc-9 build
2021-04-11 02:33:54 +03:00
Amos Bird
f00e108410
Fix scalar subquery index analysis 2021-03-16 14:07:30 +08:00
Alexey Milovidov
4bae04d500 Merge branch 'master' into amosbird/fix-18364 2021-01-15 14:37:35 +03:00
alexey-milovidov
4a71971b43
Update KeyCondition.cpp 2021-01-15 14:36:07 +03:00
Amos Bird
44758935df
correct index analysis of WITH aliases 2021-01-10 17:40:47 +08:00
Amos Bird
f93e30bed6
Fix warning 2020-12-31 11:06:15 +08:00
alexey-milovidov
9bc571eacc
Update KeyCondition.cpp 2020-12-30 17:58:43 +03:00
Amos Bird
8b5714b2ac
Fix 2-arg functions with constant in PK analysis 2020-12-23 12:29:29 +08:00
Maksim Kita
18dc118298 Fixed compile issues 2020-12-14 22:12:15 +03:00
Maksim Kita
0464859cfe Updated usage of different types during IN query
1. Added accurateCast function.
2. Use accurateCast in Set during execute.
3. Added accurateCast tests.
4. Updated select_in_different_types tests.
2020-12-14 22:12:15 +03:00
Maksim Kita
f4b8e8ef99 Allow different types inside IN subquery 2020-12-14 22:12:15 +03:00
Ivan
315ff4f0d9
ANTLR4 Grammar for ClickHouse and new parser (#11298) 2020-12-04 05:15:44 +03:00
alexey-milovidov
fabceebbce
Merge pull request #17145 from amosbird/cddt
Fix unmatched type comparison in KeyCondition
2020-11-29 14:29:35 +03:00
Amos Bird
022ba2b0a9
Fix unmatched type comparison in KeyCondition 2020-11-26 16:15:50 +08:00
Azat Khuzhin
0b47f4a9e9 Fix optimize_trivial_count_query with partition predicate
Consider the following example:

    CREATE TABLE test(p DateTime, k int) ENGINE MergeTree PARTITION BY toDate(p) ORDER BY k;
    INSERT INTO test VALUES ('2020-09-01 00:01:02', 1), ('2020-09-01 20:01:03', 2), ('2020-09-02 00:01:03', 3);

- SELECT count() FROM test WHERE toDate(p) >= '2020-09-01' AND p <= '2020-09-01 00:00:00'
  In this case rpn will be (FUNCTION_IN_RANGE, FUNCTION_UNKNOWN (due to strict), FUNCTION_AND)
  and for optimize_trivial_count_query we cannot use the index if there is at least one FUNCTION_UNKNOWN,
  since there is no post-processing and returning count() based on only the first predicate is wrong.

  Before this patch FUNCTION_UNKNOWN was allowed for optimize_trivial_count_query, and the result was wrong.

And two examples below, just to show the difference; their behaviour is not changed by this patch:

- SELECT * FROM test WHERE toDate(p) >= '2020-09-01' AND p <= '2020-09-01 00:00:00'
  In this case rpn will be (FUNCTION_IN_RANGE, FUNCTION_IN_RANGE (due to non-strict), FUNCTION_AND)
  so it will prune everything out and nothing will be read.

- SELECT * FROM test WHERE toDate(p) >= '2020-09-01' AND toUnixTimestamp(p)%5==0
  In this case rpn will be (FUNCTION_IN_RANGE, FUNCTION_UNKNOWN, FUNCTION_AND)
  and both partitions will be scanned, but due to later filtering no rows will be matched.
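
The distinction between pruning and counting described above can be sketched in a few lines. This is only an illustration, not the code from the patch: the `Op` enum reuses the element names from the message, while `may_be_true`, `usable_for_trivial_count` and the `range_results` argument are hypothetical names made up for this example.

```python
from enum import Enum, auto


class Op(Enum):
    FUNCTION_IN_RANGE = auto()
    FUNCTION_UNKNOWN = auto()
    FUNCTION_AND = auto()


def may_be_true(rpn, range_results):
    """Evaluate an RPN condition for pruning.

    `range_results` supplies, in order, whether each FUNCTION_IN_RANGE atom
    can be true for the partition. Unknown atoms are treated as "may be true",
    which is safe for pruning because rows are filtered again later.
    """
    results = iter(range_results)
    stack = []
    for op in rpn:
        if op is Op.FUNCTION_IN_RANGE:
            stack.append(next(results))
        elif op is Op.FUNCTION_UNKNOWN:
            stack.append(True)
        elif op is Op.FUNCTION_AND:
            right, left = stack.pop(), stack.pop()
            stack.append(left and right)
    return stack.pop()


def usable_for_trivial_count(rpn):
    """Counting has no later filtering step, so any unknown atom disqualifies the index."""
    return Op.FUNCTION_UNKNOWN not in rpn


count_rpn = [Op.FUNCTION_IN_RANGE, Op.FUNCTION_UNKNOWN, Op.FUNCTION_AND]
print(may_be_true(count_rpn, [True]))       # True: the partition must still be read
print(usable_for_trivial_count(count_rpn))  # False: count() cannot come from the index alone
```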
2020-11-25 23:09:17 +03:00
Amos Bird
172b7e9ed1
global in set index. 2020-11-23 22:05:08 +08:00
Nikolai Kochetov
46f70dd0de Merge branch 'master' into actions-dag-f14 2020-11-12 11:54:44 +03:00
Nikolai Kochetov
1db8e77371 Add comments. Update ActionsDAG::Index 2020-11-10 17:54:59 +03:00
Alexander Tokmakov
5cdfcfb307 remove other stringstreams 2020-11-09 22:12:44 +03:00
Nikolai Kochetov
e41b1ae52b Empty commit. 2020-11-09 19:35:43 +03:00
Nikolai Kochetov
8c4db34f9d Update after merge. 2020-11-09 14:58:11 +03:00
Nikolai Kochetov
6717c7a0af Merge branch 'master' into actions-dag-f14 2020-11-09 14:57:48 +03:00
alexey-milovidov
0e6ae4aff7
Merge pull request #16253 from amosbird/pf
Prune partition in verbatim way.
2020-11-08 18:58:02 +03:00
Alexey Milovidov
fd84d16387 Fix "server failed to start" error 2020-11-07 03:14:53 +03:00
Amos Bird
2b0085c106
Pruning is different from counting 2020-11-06 19:58:03 +08:00
Amos Bird
aa436a3cb1
Transform single point 2020-11-06 14:59:55 +08:00
Nikolai Kochetov
07a7c46b89 Refactor ExpressionActions [Part 3] 2020-11-03 14:28:28 +03:00
alexey-milovidov
adeba6bdd8
Merge pull request #15074 from amosbird/btc
Extend trivial count optimization.
2020-10-22 02:50:57 +03:00
Nikolai Kochetov
bc58637ec2 Fixing build. 2020-10-19 21:37:44 +03:00
Nikolai Kochetov
a7fb2e38a5 Use ColumnWithTypeAndName as function argument instead of Block. 2020-10-09 10:41:28 +03:00
Amos Bird
867216103f
Extend trivial count optimization. 2020-10-08 18:08:17 +08:00
Amos Bird
5cc8fd395c
Fix empty key segfault 2020-09-13 21:55:16 +08:00
Amos Bird
34b9547ce1
Binary operator monotonicity 2020-09-13 21:55:12 +08:00
Alexey Milovidov
4ed0bf3af1 Better code 2020-08-03 00:01:39 +03:00
Alexey Milovidov
3c489ce159 Fix assertion in KeyCondition 2020-08-02 23:55:20 +03:00
Alexey Milovidov
5f808aa503 Fix bad code 2020-08-02 23:41:52 +03:00
Anton Popov
4c266d1e5d fix wrong index analysis with functions 2020-07-29 19:09:38 +03:00
Nikolai Kochetov
dad9d369a1 Merge branch 'master' into bobrik-parallel-randes 2020-07-23 16:21:32 +03:00
Artem Zuikov
2afd123eda
Refactoring: extract TreeOptimizer from SyntaxAnalyzer (#12645) 2020-07-22 20:13:05 +03:00
Nikolai Kochetov
12c5e376c6 Remove mutable from RPNElement. 2020-07-21 14:02:58 +03:00
Ivan Babrou
8784994d65 Allow conditions outside of PK with exact range
Conditions that are outside of PK are marked as `unknown` in `KeyCondition`,
so it's safe to allow them, as long as they are always combined by `AND`.
2020-07-11 18:59:26 -07:00
Ivan Babrou
d9d8d0242e Optimize PK lookup for queries that match exact PK range
Existing code that looks up marks that match the query has a pathological
case, when most of the part does in fact match the query.

The code works by recursively splitting a part into ranges and then discarding
the ranges that definitely do not match the query, based on primary key.

The problem is that it requires visiting every mark that matches the query,
making the complexity of this sort of look up O(n).

For queries that match an exact range on the primary key, we can find
both the left and right bounds of the range with O(log n) complexity.

This change implements exactly that.
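
A minimal sketch of the idea, not the actual implementation: it assumes a single-column key and a sorted list `index` holding the first key of each granule, and it uses binary search from Python's `bisect` module for both bounds. The function name `find_mark_range` is made up for this illustration.

```python
from bisect import bisect_left, bisect_right


def find_mark_range(index, lo, hi):
    """Return the half-open range [begin, end) of granules that may hold keys in [lo, hi].

    `index` is assumed to hold the first key of each granule, in sorted order.
    """
    # Rows equal to `lo` may sit at the tail of the granule just before the
    # first mark that is >= lo, so step one granule back.
    begin = max(bisect_left(index, lo) - 1, 0)
    # Granules whose first key is greater than `hi` cannot match.
    end = bisect_right(index, hi)
    return begin, end


marks = [10, 20, 30, 40, 50]
print(find_mark_range(marks, 25, 35))  # (1, 3)
```

Both bounds are found with a logarithmic number of comparisons, regardless of how many marks fall inside the range.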

To engage this optimization, the query must:

* Have a prefix list of the primary key.
* Have only range or single set element constraints for columns.
* Have only AND as a boolean operator.

Consider a table with `(service, timestamp)` as the primary key.

The following conditions will be optimized:

* `service = 'foo'`
* `service = 'foo' and timestamp >= now() - 3600`
* `service in ('foo')`
* `service in ('foo') and timestamp >= now() - 3600 and timestamp <= now()`

The following will fall back to previous lookup algorithm:

* `timestamp >= now() - 3600`
* `service in ('foo', 'bar') and timestamp >= now() - 3600`
* `service = 'foo'`

Note that the optimization won't engage when PK has a range expression
followed by a point expression, since in that case the range is not continuous.

Trace query logging provides the following types of messages,
each representing a different kind of PK usage for a part:

```
Used optimized inclusion search over index for part 20200711_5710108_5710108_0 with 9 steps
Used generic exclusion search over index for part 20200711_5710118_5710228_5 with 1495 steps
Not using index on part 20200710_5710473_5710473_0
```

Number of steps translates to computational complexity.

Here's a comparison for before and after for a query over 24h of data:

```
Read 4562944 rows, 148.05 MiB in 45.19249672 sec.,   100966 rows/sec.,   3.28 MiB/sec.
Read 4183040 rows, 135.78 MiB in 0.196279627 sec., 21311636 rows/sec., 691.75 MiB/sec.
```

This is especially useful for queries that read data in order
and terminate early to return "last X things" matching a query.

See #11564 for more thoughts on this.
2020-07-11 12:26:54 -07:00
Alexey Milovidov
276b3a0215 Avoid exception when negative or floating point constant is used in WHERE condition for indexed tables #11905 2020-07-10 09:30:49 +03:00
myrrc
8c3417fbf7
ILIKE operator (#12125)
* Integrated CachingAllocator into MarkCache

* fixed build errors

* reset func hotfix

* upd: Fixing build

* updated submodules links

* fix 2

* updating grabber allocator proto

* updating lost work

* updating CMake to use concepts

* some other changes to get it building (integration into MarkCache)

* further integration into caches

* updated Async metrics, fixed some build errors

* and some other errors revealing

* added perfect forwarding to some functions

* fix: forward template

* fix: constexpr modifier

* fix: FakePODAllocator missing member func

* updated PODArray constructor taking alloc params

* fix: PODArray overload with n restored

* fix: FakePODAlloc duplicating alloc() func

* added constexpr variable for alloc_tag_t

* split cache values by allocators, provided updates

* fix: memcpy

* fix: constexpr modifier

* fix: noexcept modifier

* fix: alloc_tag_t for PODArray constructor

* fix: PODArray copy ctor with different alloc

* fix: resize() signature

* updating to latest working master

* syncing with 273267

* first draft version

* fix: update Searcher to case-insensitive

* added ILIKE test

* fixed style errors, updated test, split like and ilike,  added notILike

* replaced inconsistent comments

* fixed show tables ilike

* updated missing test cases

* regenerated ya.make

* Update 01355_ilike.sql

Co-authored-by: myrrc <me-clickhouse@myrrec.space>
Co-authored-by: alexey-milovidov <milovidov@yandex-team.ru>
2020-07-05 18:57:59 +03:00
Nicolae Vartolomei
3854ce6d84 Rewrite Set lookup to make it more readable 2020-07-01 15:05:54 +01:00
Nicolae Vartolomei
8f1845185e Try fix pk in tuple performance
Possible approach for fixing #10574

Prepared sets are built correctly: they are stored in a hash map of key -> set,
where the key is a hash of the AST and a list of data types (when we have a list of
tuples of literals).

However, when the key is built from the index to check whether a matching
prepared set exists, it uses the data types of the primary key (see how
data_types is populated). Because the primary key has only one field
(v in my example), it cannot find the prepared set.

The patch looks for any prepared indexes where data types match for the
subset of fields found in primary key, we are not interested in other
fields anyway for the purpose of primary key pruning.
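
A rough sketch of the lookup change, under stated assumptions: prepared sets are modelled as a plain dict keyed by `(ast_hash, data_types)`, and `find_prepared_set` together with its `key_columns` argument are hypothetical names used only for illustration, not the real interface.

```python
def find_prepared_set(prepared_sets, ast_hash, key_columns):
    """Find a prepared set whose types match on the primary key columns only.

    `prepared_sets` maps (ast_hash, tuple_of_types) -> set.
    `key_columns` is a list of (position_in_tuple, type) pairs for the
    tuple elements that belong to the primary key.
    """
    for (set_hash, set_types), prepared in prepared_sets.items():
        if set_hash != ast_hash:
            continue
        # Only compare the types at positions covered by the primary key.
        if all(pos < len(set_types) and set_types[pos] == col_type
               for pos, col_type in key_columns):
            return prepared
    return None


prepared_sets = {("h1", ("UInt64", "String")): "set_a"}

# An exact-key lookup built from a one-column primary key misses the set.
print(prepared_sets.get(("h1", ("UInt64",))))                      # None
# Matching only on the primary key subset finds it.
print(find_prepared_set(prepared_sets, "h1", [(0, "UInt64")]))     # set_a
```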
2020-06-30 16:33:38 +01:00
Alexey Milovidov
8dac30ae95 Split file for better build times 2020-06-14 21:42:10 +03:00
Alexey Milovidov
f6c52fe1c2 Allow comparison with String in index analysis; simplify code #11630 2020-06-14 21:31:42 +03:00
Alexander Kuzmenkov
1ab3201454 Merge remote-tracking branch 'origin/master' into HEAD 2020-06-03 16:36:22 +03:00