ClickHouse

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-12-16 11:22:12 +00:00

Author	SHA1	Message	Date
Amos Bird	aeff06d67d	Don't relax NOT conditions during partition pruning.	2021-04-19 22:15:53 +08:00
Alexey Milovidov	97611faad0	Whitespace	2021-04-13 22:06:24 +03:00
alexey-milovidov	46e1da03fb	Update KeyCondition.cpp	2021-04-13 18:47:11 +03:00
Ivan	495c6e03aa	Replace all Context references with std::weak_ptr (#22297) * Replace all Context references with std::weak_ptr * Fix shared context captured by value * Fix build * Fix Context with named sessions * Fix copy context * Fix gcc build * Merge with master and fix build * Fix gcc-9 build	2021-04-11 02:33:54 +03:00
Amos Bird	f00e108410	Fix scalar subquery index analysis	2021-03-16 14:07:30 +08:00
Alexey Milovidov	4bae04d500	Merge branch 'master' into amosbird/fix-18364	2021-01-15 14:37:35 +03:00
alexey-milovidov	4a71971b43	Update KeyCondition.cpp	2021-01-15 14:36:07 +03:00
Amos Bird	44758935df	correct index analysis of WITH aliases	2021-01-10 17:40:47 +08:00
Amos Bird	f93e30bed6	Fix warning	2020-12-31 11:06:15 +08:00
alexey-milovidov	9bc571eacc	Update KeyCondition.cpp	2020-12-30 17:58:43 +03:00
Amos Bird	8b5714b2ac	Fix 2-arg functions with constant in PK analysis	2020-12-23 12:29:29 +08:00
Maksim Kita	18dc118298	Fixed compile issues	2020-12-14 22:12:15 +03:00
Maksim Kita	0464859cfe	Updated usage of different types during IN query 1. Added accurateCast function. 2. Use accurateCast in Set during execute. 3. Added accurateCast tests. 4. Updated select_in_different_types tests.	2020-12-14 22:12:15 +03:00
Maksim Kita	f4b8e8ef99	Allow different types inside IN subquery	2020-12-14 22:12:15 +03:00
Ivan	315ff4f0d9	ANTLR4 Grammar for ClickHouse and new parser (#11298)	2020-12-04 05:15:44 +03:00
alexey-milovidov	fabceebbce	Merge pull request #17145 from amosbird/cddt Fix unmatched type comparison in KeyCondition	2020-11-29 14:29:35 +03:00
Amos Bird	022ba2b0a9	Fix unmatched type comparison in KeyCondition	2020-11-26 16:15:50 +08:00
Azat Khuzhin	0b47f4a9e9	Fix optimize_trivial_count_query with partition predicate Consider the following example: CREATE TABLE test(p DateTime, k int) ENGINE MergeTree PARTITION BY toDate(p) ORDER BY k; INSERT INTO test VALUES ('2020-09-01 00:01:02', 1), ('2020-09-01 20:01:03', 2), ('2020-09-02 00:01:03', 3); - SELECT count() FROM test WHERE toDate(p) >= '2020-09-01' AND p <= '2020-09-01 00:00:00' In this case the RPN will be (FUNCTION_IN_RANGE, FUNCTION_UNKNOWN (due to strict), FUNCTION_AND), and optimize_trivial_count_query cannot use the index if there is at least one FUNCTION_UNKNOWN, since there is no post-processing and returning count() based on only the first predicate is wrong. Before this patch FUNCTION_UNKNOWN was allowed for optimize_trivial_count_query, and the result was wrong. The two examples below just show the difference; their behaviour is not changed by this patch: - SELECT * FROM test WHERE toDate(p) >= '2020-09-01' AND p <= '2020-09-01 00:00:00' In this case the RPN will be (FUNCTION_IN_RANGE, FUNCTION_IN_RANGE (due to non-strict), FUNCTION_AND), so it will prune everything out and nothing will be read. - SELECT * FROM test WHERE toDate(p) >= '2020-09-01' AND toUnixTimestamp(p)%5==0 In this case the RPN will be (FUNCTION_IN_RANGE, FUNCTION_UNKNOWN, FUNCTION_AND) and both partitions will be scanned, but due to later filtering no rows will match.	2020-11-25 23:09:17 +03:00
Amos Bird	172b7e9ed1	global in set index.	2020-11-23 22:05:08 +08:00
Nikolai Kochetov	46f70dd0de	Merge branch 'master' into actions-dag-f14	2020-11-12 11:54:44 +03:00
Nikolai Kochetov	1db8e77371	Add comments. Update ActionsDAG::Index	2020-11-10 17:54:59 +03:00
Alexander Tokmakov	5cdfcfb307	remove other stringstreams	2020-11-09 22:12:44 +03:00
Nikolai Kochetov	e41b1ae52b	Empty commit.	2020-11-09 19:35:43 +03:00
Nikolai Kochetov	8c4db34f9d	Update after merge.	2020-11-09 14:58:11 +03:00
Nikolai Kochetov	6717c7a0af	Merge branch 'master' into actions-dag-f14	2020-11-09 14:57:48 +03:00
alexey-milovidov	0e6ae4aff7	Merge pull request #16253 from amosbird/pf Prune partition in verbatim way.	2020-11-08 18:58:02 +03:00
Alexey Milovidov	fd84d16387	Fix "server failed to start" error	2020-11-07 03:14:53 +03:00
Amos Bird	2b0085c106	Pruning is different from counting	2020-11-06 19:58:03 +08:00
Amos Bird	aa436a3cb1	Transform single point	2020-11-06 14:59:55 +08:00
Nikolai Kochetov	07a7c46b89	Refactor ExpressionActions [Part 3]	2020-11-03 14:28:28 +03:00
alexey-milovidov	adeba6bdd8	Merge pull request #15074 from amosbird/btc Extend trivial count optimization.	2020-10-22 02:50:57 +03:00
Nikolai Kochetov	bc58637ec2	Fixing build.	2020-10-19 21:37:44 +03:00
Nikolai Kochetov	a7fb2e38a5	Use ColumnWithTypeAndName as function argument instead of Block.	2020-10-09 10:41:28 +03:00
Amos Bird	867216103f	Extend trivial count optimization.	2020-10-08 18:08:17 +08:00
Amos Bird	5cc8fd395c	Fix empty key segfault	2020-09-13 21:55:16 +08:00
Amos Bird	34b9547ce1	Binary operator monotonicity	2020-09-13 21:55:12 +08:00
Alexey Milovidov	4ed0bf3af1	Better code	2020-08-03 00:01:39 +03:00
Alexey Milovidov	3c489ce159	Fix assertion in KeyCondition	2020-08-02 23:55:20 +03:00
Alexey Milovidov	5f808aa503	Fix bad code	2020-08-02 23:41:52 +03:00
Anton Popov	4c266d1e5d	fix wrong index analysis with functions	2020-07-29 19:09:38 +03:00
Nikolai Kochetov	dad9d369a1	Merge branch 'master' into bobrik-parallel-randes	2020-07-23 16:21:32 +03:00
Artem Zuikov	2afd123eda	Refactoring: extract TreeOptimizer from SyntaxAnalyzer (#12645)	2020-07-22 20:13:05 +03:00
Nikolai Kochetov	12c5e376c6	Remove mutable from RPNElement.	2020-07-21 14:02:58 +03:00
Ivan Babrou	8784994d65	Allow conditions outside of PK with exact range Conditions that are outside of PK are marked as `unknown` in `KeyCondition`, so it's safe to allow them, as long as they are always combined by `AND`.	2020-07-11 18:59:26 -07:00
Ivan Babrou	d9d8d0242e	Optimize PK lookup for queries that match exact PK range Existing code that looks up marks that match the query has a pathological case, when most of the part does in fact match the query. The code works by recursively splitting a part into ranges and then discarding the ranges that definitely do not match the query, based on primary key. The problem is that it requires visiting every mark that matches the query, making the complexity of this sort of lookup O(n). For queries that match an exact range on the primary key, we can find both the left and right ends of the range with O(log n) complexity. This change implements exactly that. To engage this optimization, the query must: * Have a prefix list of the primary key. * Have only range or single set element constraints for columns. * Have only AND as a boolean operator. Consider a table with `(service, timestamp)` as the primary key. The following conditions will be optimized: * `service = 'foo'` * `service = 'foo' and timestamp >= now() - 3600` * `service in ('foo')` * `service in ('foo') and timestamp >= now() - 3600 and timestamp <= now` The following will fall back to the previous lookup algorithm: * `timestamp >= now() - 3600` * `service in ('foo', 'bar') and timestamp >= now() - 3600` * `service = 'foo'` Note that the optimization won't engage when the PK has a range expression followed by a point expression, since in that case the range is not continuous. Trace query logging provides the following types of messages, each representing a different kind of PK usage for a part: ``` Used optimized inclusion search over index for part 20200711_5710108_5710108_0 with 9 steps Used generic exclusion search over index for part 20200711_5710118_5710228_5 with 1495 steps Not using index on part 20200710_5710473_5710473_0 ``` Number of steps translates to computational complexity. 
Here's a comparison for before and after for a query over 24h of data: ``` Read 4562944 rows, 148.05 MiB in 45.19249672 sec., 100966 rows/sec., 3.28 MiB/sec. Read 4183040 rows, 135.78 MiB in 0.196279627 sec., 21311636 rows/sec., 691.75 MiB/sec. ``` This is especially useful for queries that read data in order and terminate early to return "last X things" matching a query. See #11564 for more thoughts on this.	2020-07-11 12:26:54 -07:00
Alexey Milovidov	276b3a0215	Avoid exception when negative or floating point constant is used in WHERE condition for indexed tables #11905	2020-07-10 09:30:49 +03:00
myrrc	8c3417fbf7	ILIKE operator (#12125) * Integrated CachingAllocator into MarkCache * fixed build errors * reset func hotfix * upd: Fixing build * updated submodules links * fix 2 * updating grabber allocator proto * updating lost work * updating CMake to use concepts * some other changes to get it building (integration into MarkCache) * further integration into caches * updated Async metrics, fixed some build errors * and some other errors revealing * added perfect forwarding to some functions * fix: forward template * fix: constexpr modifier * fix: FakePODAllocator missing member func * updated PODArray constructor taking alloc params * fix: PODArray overload with n restored * fix: FakePODAlloc duplicating alloc() func * added constexpr variable for alloc_tag_t * split cache values by allocators, provided updates * fix: memcpy * fix: constexpr modifier * fix: noexcept modifier * fix: alloc_tag_t for PODArray constructor * fix: PODArray copy ctor with different alloc * fix: resize() signature * updating to latest working master * syncing with 273267 * first draft version * fix: update Searcher to case-insensitive * added ILIKE test * fixed style errors, updated test, split like and ilike, added notILike * replaced inconsistent comments * fixed show tables ilike * updated missing test cases * regenerated ya.make * Update 01355_ilike.sql Co-authored-by: myrrc <me-clickhouse@myrrec.space> Co-authored-by: alexey-milovidov <milovidov@yandex-team.ru>	2020-07-05 18:57:59 +03:00
Nicolae Vartolomei	3854ce6d84	Rewrite Set lookup to make it more readable	2020-07-01 15:05:54 +01:00
Nicolae Vartolomei	8f1845185e	Try fix pk in tuple performance Possible approach for fixing #10574 Prepared sets are built correctly: they are stored in a hash map of key -> set, where the key is a hash of the AST and the list of data types (when we have a list of tuples of literals). However, when the key is built from the index to find a matching prepared set, it uses the data types of the primary key (see how data_types is populated), and because the primary key has only one field (v in my example) it cannot find the prepared set. The patch looks for any prepared indexes where the data types match for the subset of fields found in the primary key; we are not interested in other fields anyway for the purpose of primary key pruning.	2020-06-30 16:33:38 +01:00
Alexey Milovidov	8dac30ae95	Split file for better build times	2020-06-14 21:42:10 +03:00
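
The "optimized inclusion search" described in commit d9d8d0242e can be illustrated with a small sketch. This is not ClickHouse code: it is a Python stand-in that assumes the index is a sorted list holding the first `(service, timestamp)` key of each granule, and that the query reduces to one continuous key range `[lo, hi]`. The names `inclusion_search`, `linear_reference`, and `MARKS` are hypothetical. Two binary searches find both ends of the matching mark range, while the linear scan is only a reference check that visits every mark (it is not the recursive exclusion search the commit message refers to).

```python
from bisect import bisect_left, bisect_right

# Hypothetical index: first (service, timestamp) key of each granule, sorted.
MARKS = [("bar", 0), ("bar", 50), ("foo", 0), ("foo", 100), ("foo", 200), ("zed", 0)]


def inclusion_search(marks, lo, hi):
    """Return the half-open range (begin, end) of granules that may hold keys in [lo, hi].

    Granule i covers keys from marks[i] up to marks[i + 1], so the granule just
    before the first mark >= lo may still contain matching rows.
    """
    begin = max(bisect_left(marks, lo) - 1, 0)
    end = bisect_right(marks, hi)  # number of granules starting at or before hi
    return begin, end


def linear_reference(marks, lo, hi):
    """Reference check that visits every mark: keep granules not provably outside [lo, hi]."""
    keep = []
    for i, first_key in enumerate(marks):
        next_key = marks[i + 1] if i + 1 < len(marks) else None
        starts_after = first_key > hi
        ends_before = next_key is not None and next_key < lo
        if not (starts_after or ends_before):
            keep.append(i)
    return keep


# service = 'foo' AND timestamp >= 150, expressed as a continuous key range.
lo, hi = ("foo", 150), ("foo", float("inf"))
begin, end = inclusion_search(MARKS, lo, hi)
print(begin, end)
```

The sketch only works because the condition maps to a single continuous range over a prefix of the key, which mirrors the preconditions listed in the commit message (key prefix, range or single-element constraints, AND only).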


58 Commits
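
Commits 0b47f4a9e9 and 2b0085c106 distinguish pruning a partition from answering `count()` out of partition metadata. The sketch below illustrates that distinction under stated assumptions: each predicate is evaluated per partition into a pair of flags ("can be true", "can be false"), an unknown predicate sets both flags, and the elements are combined in reverse Polish notation. The class and function names are hypothetical and the code is not taken from KeyCondition.cpp.

```python
from typing import NamedTuple


class BoolMask(NamedTuple):
    can_be_true: bool   # some row in the partition may satisfy the predicate
    can_be_false: bool  # some row in the partition may fail the predicate


UNKNOWN = BoolMask(True, True)        # predicate the index cannot analyse
ALWAYS_TRUE = BoolMask(True, False)   # every row in the partition matches
ALWAYS_FALSE = BoolMask(False, True)  # no row in the partition matches


def mask_and(a, b):
    return BoolMask(a.can_be_true and b.can_be_true, a.can_be_false or b.can_be_false)


def evaluate(rpn):
    """Evaluate a list of BoolMask operands and "AND" operators in RPN order."""
    stack = []
    for element in rpn:
        if element == "AND":
            right = stack.pop()
            left = stack.pop()
            stack.append(mask_and(left, right))
        else:
            stack.append(element)
    return stack.pop()


def can_prune(mask):
    # Safe to skip the partition: no row can match.
    return not mask.can_be_true


def can_count_from_metadata(mask):
    # Safe to use the partition's row count: every row must match.
    return mask.can_be_true and not mask.can_be_false


# Partition 2020-09-01 with (toDate(p) >= '2020-09-01', <unknown predicate>, AND).
mask = evaluate([ALWAYS_TRUE, UNKNOWN, "AND"])
print(can_prune(mask), can_count_from_metadata(mask))
```

With an unknown element in the expression the partition can be neither pruned nor counted without reading it, which matches the reasoning in the commit message that a trivial count must not rely on the first predicate alone.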