ClickHouse

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-12-15 02:41:59 +00:00

Author	SHA1	Message	Date
Alexander Tokmakov	b8d9066004	Revert "Resurrect parallel distributed insert select with s3Cluster (#41535)" This reverts commit `860e34e760`.	2022-10-07 15:53:30 +02:00
Nikita Mikhaylov	860e34e760	Resurrect parallel distributed insert select with s3Cluster (#41535)	2022-10-06 13:47:32 +02:00
Anton Popov	6e61cf92f5	Merge remote-tracking branch 'upstream/master' into HEAD	2022-10-03 13:16:57 +00:00
Alexey Milovidov	ab4db2d0c4	Fix 5/6 of trash	2022-09-19 08:50:53 +02:00
Anton Popov	f0a404e2c8	Merge remote-tracking branch 'upstream/master' into HEAD	2022-09-06 15:51:16 +00:00
Alexander Tokmakov	f9f85a0e8b	Revert "Parallel distributed insert select from *Cluster table functions (#39107)" This reverts commit `d3cc234986`.	2022-08-24 15:17:15 +03:00
Nikita Mikhaylov	d3cc234986	Parallel distributed insert select from *Cluster table functions (#39107)	2022-08-15 12:41:17 +02:00
Alexander Gololobov	ae0d00083c	Renamed __row_exists to _row_exists	2022-07-18 20:07:36 +02:00
Alexander Gololobov	9de72d995a	POC lightweight delete using __row_exists virtual column and prewhere-like filtering	2022-07-18 20:06:42 +02:00
avogar	59c1c472cb	Better exception messages on wrong table engines/functions argument types	2022-06-23 20:04:06 +00:00
Nikolai Kochetov	8991f39412	Merge branch 'master' into refactor-read-metrics-and-callbacks	2022-06-02 17:00:08 +00:00
Nikita Mikhaylov	d34e051c69	Support for simultaneous read from local and remote parallel replica (#37204)	2022-06-02 11:46:33 +02:00
Nikolai Kochetov	c71256ea38	Remove some commented code.	2022-05-30 13:18:20 +00:00
Nikolai Kochetov	1b85f2c1d6	Merge branch 'master' into refactor-read-metrics-and-callbacks	2022-05-25 16:27:40 +02:00
Nikolai Kochetov	fd97a9d885	Move some resources	2022-05-23 19:47:32 +00:00
Nikolai Kochetov	56feef01e7	Move some resources	2022-05-20 19:49:31 +00:00
Anton Popov	e911900054	remove last mentions of data streams	2022-05-09 19:15:24 +00:00
Anton Popov	515f68eead	Merge remote-tracking branch 'upstream/master' into dynamic-columns-14	2022-05-06 16:10:51 +00:00
Anton Popov	566c08086a	support Object type inside other types	2022-05-06 14:44:00 +00:00
mergify[bot]	64084b5e32	Merge branch 'master' into shared_ptr_helper3	2022-05-03 20:46:16 +00:00
Robert Schulze	330212e0f4	Remove inherited create() method + disallow copying The original motivation for this commit was that shared_ptr_helper used std::shared_ptr<>() which does two heap allocations instead of make_shared<>() which does a single allocation. Turned out that 1. the affected code (--> Storages/) is not on a hot path (rendering the performance argument moot ...) 2. yet copying Storage objects is potentially dangerous and was previously allowed. Hence, this change - removes shared_ptr_helper and as a result all inherited create() methods, - instead, Storage objects are now created using make_shared<>() by the caller (for that to work, many constructors had to be made public), and - all Storage classes were marked as noncopyable using boost::noncopyable. In sum, we are (likely) not making things faster but the code becomes cleaner and harder to misuse.	2022-05-02 08:46:52 +02:00
mergify[bot]	265398d1b6	Merge branch 'master' into feat/add_part_offset	2022-04-25 15:58:16 +00:00
Robert Schulze	b24ca8de52	Fix various clang-tidy warnings When I tried to add cool new clang-tidy 14 warnings, I noticed that the current clang-tidy settings already produce a ton of warnings. This commit addresses many of these. Almost all of them were non-critical, i.e. C vs. C++ style casts.	2022-04-20 10:29:05 +02:00
Alexey Milovidov	242919eddd	Remove abbreviation	2022-04-18 01:02:49 +02:00
Alexander Tokmakov	07d952b728	use snapshots for semistructured data, durability fixes	2022-03-17 18:26:18 +01:00
roverxu	29a842bf22	feat(...): [LWD] support getting _part_offset of a row	2022-03-15 15:40:10 +08:00
Anton Popov	36ec379aeb	Merge remote-tracking branch 'upstream/master' into HEAD	2022-03-14 16:28:35 +00:00
Anton Popov	37efe2ddb5	Apply suggestions from code review Co-authored-by: Kseniia Sumarokova <54203879+kssenii@users.noreply.github.com>	2022-03-10 22:24:19 +01:00
Azat Khuzhin	4843e210c3	Support view() for parallel_distributed_insert_select Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-03-08 22:05:57 +03:00
Azat Khuzhin	c4b6342853	Improvements for `parallel_distributed_insert_select` (and related) (#34728) * Add a warning if parallel_distributed_insert_select was ignored Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Respect max_distributed_depth for parallel_distributed_insert_select Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Print warning for non applied parallel_distributed_insert_select only for initial query Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Remove Cluster::getHashOfAddresses() Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Forbid parallel_distributed_insert_select for remote()/cluster() with different addresses Before, it used the empty cluster name (getClusterName()), which is not correct; compare all addresses instead. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Fix max_distributed_depth check max_distributed_depth=1 must mean not more than one distributed query, not two, since max_distributed_depth=0 means no limit, and distribute_depth is 0 for the first query. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Fix INSERT INTO remote()/cluster() with parallel_distributed_insert_select Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Add a test for parallel_distributed_insert_select with cluster()/remote() Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Return <remote> instead of empty cluster name in Distributed engine Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Make user with sharding_key and w/o in remote()/cluster() identical Before, with sharding_key the user was "default", while w/o it the user was empty. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-03-08 15:24:39 +01:00
Anton Popov	04a3a10148	minor fixes	2022-03-01 20:20:53 +03:00
Anton Popov	2758db5341	add more comments	2022-03-01 19:32:55 +03:00
Anton Popov	a661eaf39f	better performance of getting storage snapshot	2022-02-16 02:17:22 +03:00
Anton Popov	dcd7312d75	cache common type on objects in MergeTree	2022-02-09 23:47:53 +03:00
Anton Popov	18940b8637	Merge remote-tracking branch 'upstream/master' into HEAD	2022-02-09 23:38:38 +03:00
feng lv	6325d4d9b0	continue of #34317 fix fix	2022-02-06 08:59:17 +00:00
Anton Popov	78b9f15abb	Merge remote-tracking branch 'upstream/master' into HEAD	2022-01-30 03:24:37 +03:00
Anton Popov	e8ce091e68	Merge remote-tracking branch 'upstream/master' into HEAD	2022-01-21 20:11:18 +03:00
Kruglov Pavel	2295a07066	Merge pull request #33534 from azat/fwd-decl RFC: Split headers, move SystemLog into module, more forward declarations	2022-01-18 17:22:49 +03:00
Azat Khuzhin	c341b3b237	Add current database to table names in JOIN section for distributed queries This should fix JOIN w/o explicit database. v2: rewrite only JOIN section, since there is old behavior that relies on default_database for IN section, see [1]: - 01487_distributed_in_not_default_db - 01152_cross_replication [1]: https://s3.amazonaws.com/clickhouse-test-reports/33611/d0ea3c76fa51131171b1825939680867eb1c04da/fast_test__actions_.html Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-01-14 11:23:38 +03:00
Azat Khuzhin	0a9b1ee803	Remove RestoreQualifiedNamesMatcher::Data::rename (always true) Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-01-14 11:18:52 +03:00
Azat Khuzhin	aee034a597	Use explicit template instantiation for SystemLog - Move some code into module part to avoid dependency on IStorage in SystemLog - Remove extra headers from SystemLog.h - Rewrite some code that was relying on headers that were included by SystemLog.h v2: rebase v3: squash move into module part with explicit template instantiation (to make each commit self compilable after rebase)	2022-01-10 22:01:41 +03:00
Azat Khuzhin	1637c41d42	Remove leftovers of old _shard_num via identifier implementation	2022-01-10 21:21:24 +03:00
avogar	8112a71233	Implement schema inference for most input formats	2021-12-29 12:18:56 +03:00
Anton Popov	99ebabd822	Merge remote-tracking branch 'upstream/master' into HEAD	2021-12-17 19:02:29 +03:00
Alexey Milovidov	5c90ed2ed9	Unambiguous formatting of distributed queries	2021-12-10 00:55:14 +03:00
Nikita Mikhaylov	dbf5091016	Parallel reading from replicas (#29279)	2021-12-09 13:39:28 +03:00
Anton Popov	6f4d9a53b2	Merge remote-tracking branch 'origin/sparse-serialization' into HEAD	2021-12-01 15:54:33 +03:00
Raúl Marín	7781fc12ed	Reduce dependencies on ASTSelectWithUnionQuery.h 521 -> 77 files requiring changes	2021-11-26 19:27:16 +01:00
Raúl Marín	b2cfa70541	Reduce dependencies on ASTFunction.h 481 -> 230	2021-11-26 18:21:54 +01:00
Anton Popov	a20922b2d3	Merge remote-tracking branch 'origin/sparse-serialization' into HEAD	2021-11-09 15:36:25 +03:00
feng lv	6f12348282	enable modify table comment of some table	2021-10-29 12:31:18 +00:00
Alexander Tokmakov	2e7e195e77	change alter_lock to std::timed_mutex	2021-10-26 13:37:00 +03:00
Nikolai Kochetov	fd14faeae2	Remove DataStreams folder.	2021-10-15 23:18:20 +03:00
Nikolai Kochetov	2957971ee3	Remove some last streams.	2021-10-13 21:22:02 +03:00
Vitaly Baranov	1636ee24bb	Fix using materialized column as sharding key.	2021-10-04 10:56:42 +03:00
Nikolai Kochetov	341553febd	Fix build.	2021-09-16 20:40:42 +03:00
Nikolai Kochetov	b997214620	Rename QueryPipeline to QueryPipelineBuilder.	2021-09-14 20:48:18 +03:00
Nikolai Kochetov	0e267c50b4	Merge branch 'master' into rewrite-pushing-to-views	2021-09-14 16:13:54 +03:00
alexey-milovidov	ea13a8b562	Merge pull request #28659 from myrrc/improvement/tostring_to_magic_enum Improving CH type system with concepts	2021-09-12 15:26:29 +03:00
Nikolai Kochetov	f569a3e3f7	Merge branch 'master' into rewrite-pushing-to-views	2021-09-09 20:30:23 +03:00
Anton Popov	4c388e3d84	Merge remote-tracking branch 'origin/sparse-serialization' into HEAD	2021-09-09 14:10:16 +03:00
Nikolai Kochetov	999a4fe831	Fix other tests.	2021-09-08 21:29:38 +03:00
ZhiYong Wang	978dd19fa2	Fix coredump in creating distributed table	2021-09-07 19:05:26 +08:00
Mike Kot	8e9aacadd1	Initial: replacing hardcoded toString for enums with magic_enum	2021-09-06 16:24:03 +02:00
Alexander Tokmakov	42378b5913	fix	2021-08-20 17:05:53 +03:00
Anton Popov	61239343e3	Merge remote-tracking branch 'origin/sparse-serialization' into HEAD	2021-08-20 16:33:30 +03:00
Alexander Tokmakov	8c6dd18917	check cluster name before creating Distributed	2021-08-20 14:55:04 +03:00
Azat Khuzhin	702d9955c0	Fix distributed queries with zero shards and aggregation	2021-08-08 19:22:49 +03:00
Azat Khuzhin	3be3c503aa	Fix some comments	2021-08-08 09:58:07 +03:00
alexey-milovidov	c5207fc237	Merge pull request #26466 from azat/optimize-dist-select Rework SELECT from Distributed optimizations	2021-08-08 03:59:32 +03:00
mergify[bot]	dc57254982	Merge branch 'master' into improve_create_or_replace	2021-08-03 11:39:07 +00:00
Azat Khuzhin	97851bde08	Fix Distributed over Distributed for WithMergeableStateAfterAggregation* stages In case if one Distributed has multiple shards, and underlying Distributed has only one, there can be the case when the query will be tried to process from Complete to WithMergeableStateAfterAggregation, which is obviously wrong.	2021-08-03 10:10:08 +03:00
Anton Popov	e36736b50c	Merge remote-tracking branch 'origin/sparse-serialization' into HEAD	2021-08-02 22:52:02 +03:00
Azat Khuzhin	ff12f5102a	Avoid running LIMIT BY/DISTINCT step on the initiator for optimize_distributed_group_by_sharding_key Before, the following query ran the LimitBy/Distinct step on the initiator: select distinct sharding_key from dist order by k This step can be omitted.	2021-08-02 21:04:30 +03:00
Azat Khuzhin	2fb95d9ee0	Rework SELECT from Distributed query stages optimization Before this patch it wasn't possible to optimize simple SELECT * FROM dist ORDER BY (w/o GROUP BY and DISTINCT) to more optimal stage (QueryProcessingStage::WithMergeableStateAfterAggregationAndLimit), since that code was under allow_nondeterministic_optimize_skip_unused_shards, rework it and make it possible. Also now distributed_push_down_limit is respected for optimize_distributed_group_by_sharding_key. Next step will be to enable distributed_push_down_limit by default. v2: fix detection of aggregates	2021-08-02 21:04:29 +03:00
Azat Khuzhin	bb6d030fb8	Optimize distributed SELECT w/o GROUP BY	2021-08-02 21:04:29 +03:00
Nikolai Kochetov	61d8f880cd	Rename some files.	2021-07-26 19:48:25 +03:00
mergify[bot]	044be267d6	Merge branch 'master' into improve_create_or_replace	2021-07-26 08:38:48 +00:00
Anton Popov	90b6a591e5	fix reading from distributed	2021-07-24 03:55:50 +03:00
Anton Popov	f867995b94	remove excessive creation of storage snapshot	2021-07-23 19:47:43 +03:00
Anton Popov	2b58f39c10	dynamic columns: support missed columns in distributed	2021-07-23 19:34:23 +03:00
Nikolai Kochetov	2dc5c89b66	Update Storage::write	2021-07-23 17:25:35 +03:00
Anton Popov	f99374cca6	Merge remote-tracking branch 'origin/sparse-serialization' into HEAD	2021-07-20 18:20:21 +03:00
Amos Bird	dbfb699690	Asynchronously drain connections.	2021-07-19 21:53:29 +08:00
alexey-milovidov	bc907bd27c	Merge pull request #26336 from azat/dist-per-table-monitor-settings Add ability to set Distributed directory monitor settings via CREATE TABLE	2021-07-17 01:49:40 +03:00
alexey-milovidov	1701cc429d	Merge pull request #26353 from azat/optimize_distributed_group_by_sharding_key-fix Fix optimize_distributed_group_by_sharding_key for multiple columns	2021-07-17 01:45:10 +03:00
Azat Khuzhin	f3d3ec44a6	Add ability to set Distributed directory monitor settings via CREATE TABLE	2021-07-16 04:10:47 +03:00
Nikolai Kochetov	f36d14f68f	Add separate step to read from remote.	2021-07-15 19:15:16 +03:00
Azat Khuzhin	7b209694d5	Fix optimize_distributed_group_by_sharding_key for multiple columns Before, we incorrectly checked that the columns from GROUP BY were a subset of the columns from the sharding key, but this is not right; consider the following example: select k1, any(k2), sum(v) from remote('127.{1,2}', view(select 1 k1, 2 k2, 3 v), cityHash64(k1, k2)) group by k1 Here the columns from GROUP BY are a subset of the columns from the sharding key, but the optimization cannot be applied, since there is no guarantee that a particular shard contains distinct values of k1. So instead we should check that GROUP BY contains all columns that are required for calculating the sharding key expression, i.e.: select k1, k2, sum(v) from remote('127.{1,2}', view(select 1 k1, 2 k2, 3 v), cityHash64(k1, k2)) group by k1, k2	2021-07-15 09:09:58 +03:00
Anton Popov	5d175bf557	dynamic columns: support distributed tables	2021-07-12 17:54:02 +03:00
Anton Popov	3ed7f5a6cc	dynamic subcolumns: add snapshot for storage	2021-07-09 06:15:41 +03:00
Azat Khuzhin	533df9507f	Fix log message for optimize_skip_unused_shards_limit	2021-07-07 00:17:39 +03:00
Alexander Tokmakov	1b2416007e	fix	2021-07-01 19:43:59 +03:00
Alexander Tokmakov	d9a77e3a1a	improve CREATE OR REPLACE query	2021-07-01 16:21:38 +03:00
Raúl Marín	bfc122df64	Fix some typos in Storage classes	2021-06-28 19:03:56 +02:00
alexey-milovidov	1b644b9a31	Merge pull request #25663 from azat/dist-startup Improve startup time of Distributed engine.	2021-06-27 18:22:45 +03:00
alexey-milovidov	f6e67d3dc1	Update StorageDistributed.cpp	2021-06-27 18:22:34 +03:00
Alexander Tokmakov	3a25b05765	fix rename Distributed table	2021-06-24 13:00:33 +03:00
Azat Khuzhin	a616ae8861	Improve startup time of Distributed engine. - create directory monitors in parallel (this also includes rmdir in case of directory is empty, since even if the directory is empty it may take some time to remove it, due to waiting for journal or if the directory is large, i.e. it had lots of files before, since remember ext4 does not truncate the directory size on each unlink [1]) - initialize increment in parallel too (since it does readdir()) [1]: https://lore.kernel.org/linux-ext4/930A5754-5CE6-4567-8CF0-62447C97825C@dilger.ca/	2021-06-24 10:27:51 +03:00
Anton Popov	d8b6f15ef4	Merge pull request #23027 from azat/distributed-push-down-limit Add ability to push down LIMIT for distributed queries	2021-06-20 23:08:50 +03:00
Maksim Kita	67e9b85951	Merge ext into common	2021-06-16 23:28:41 +03:00
alexey-milovidov	34d12063f8	Merge pull request #23349 from azat/dist-respect-insert_allow_materialized_columns Respect insert_allow_materialized_columns for INSERT into Distributed()	2021-06-14 07:23:00 +03:00
Nikita Mikhaylov	82b8d45cd7	Merge pull request #23518 from nikitamikhaylov/copier-stuck Bugfixes and improvements of `clickhouse-copier`	2021-06-09 11:36:42 +03:00
Azat Khuzhin	18e8f0eb5e	Add ability to push down LIMIT for distributed queries This way the remote nodes will not need to send all the rows, so this will decrease network I/O, and it will also make queries w/ optimize_aggregation_in_order=1/LIMIT X and w/o ORDER BY faster, since the initiator will not need to read all the rows, only the first X (but note that for this you need your data to be sharded correctly or you may get inaccurate results). Note that having lots of processing stages will increase the complexity of the interpreter (it is already not that clean and simple right now), although using a separate QueryProcessingStage looks pretty natural. Another option is to always use WithMergeableStateAfterAggregation, but in this case you will not be able to disable only this optimization, i.e. if there is some issue with it. v2: fix OFFSET v3: convert 01814_distributed_push_down_limit test to .sh and add retries v4: add test with OFFSET v5: add new query stage into the bash completion v6/tests: use LIMIT O,L syntax over LIMIT L OFFSET O since it is broken in ANTLR parser https://clickhouse-test-reports.s3.yandex.net/23027/a18a06399b7aeacba7c50b5d1e981ada5df19745/functional_stateless_tests_(antlr_debug).html#fail1 v7/tests: set use_hedged_requests to 0, to avoid excessive log entries on retries https://clickhouse-test-reports.s3.yandex.net/23027/a18a06399b7aeacba7c50b5d1e981ada5df19745/functional_stateless_tests_flaky_check_(address).html#fail1	2021-06-09 02:29:50 +03:00
Amos Bird	78fca8f8fa	Fix possible race condition when getting cluster	2021-06-04 21:09:59 +08:00
Nikita Mikhaylov	312bb96eeb	Merge branch 'master' of github.com:ClickHouse/ClickHouse into copier-stuck	2021-06-02 01:04:47 +03:00
Nikita Mikhaylov	6d19dea761	better	2021-05-31 17:38:20 +03:00
Nikita Mikhaylov	90ab394769	better	2021-05-31 17:37:10 +03:00
kssenii	3dee003f9b	Merge branch 'master' of github.com:ClickHouse/ClickHouse into poco-file-to-std-fs	2021-05-20 19:20:09 +03:00
Azat Khuzhin	4d737a5481	Respect insert_allow_materialized_columns for INSERT into Distributed()	2021-05-20 07:40:46 +03:00
Alexander Kuzmenkov	e9b69bbd70	Merge pull request #23906 from azat/fix-distributed_group_by_no_merge distributed_group_by_no_merge fixes	2021-05-19 16:16:08 +03:00
Alexander Kuzmenkov	09cb467812	Update StorageDistributed.cpp	2021-05-19 16:14:33 +03:00
kssenii	9b8df78fdd	Merge branch 'master' of github.com:ClickHouse/ClickHouse into poco-file-to-std-fs	2021-05-17 17:42:05 +03:00
feng lv	c6f8ab9826	fix	2021-05-13 02:05:53 +00:00
kssenii	0527f0ea33	Merge branch 'master' of github.com:ClickHouse/ClickHouse into poco-file-to-std-fs	2021-05-12 16:54:18 +03:00
Amos Bird	cd6414639e	add metadata_snapshot to getQueryProcessingStage	2021-05-11 18:12:26 +08:00
Azat Khuzhin	eefd67fce5	Disable optimize_distributed_group_by_sharding_key with window functions	2021-05-06 00:44:22 +03:00
feng lv	39f68bf5ff	fix conflict	2021-05-02 16:33:45 +00:00
kssenii	ee06936596	Merge branch 'master' of github.com:ClickHouse/ClickHouse into poco-file-to-std-fs	2021-05-01 17:24:31 +03:00
feng lv	aed2f337e9	Fix CLEAR COLUMN does not work after #21303	2021-04-30 05:02:32 +00:00
kssenii	deb4903af8	Merge branch 'master' of github.com:ClickHouse/ClickHouse into poco-file-to-std-fs	2021-04-28 20:57:13 +03:00
kssenii	eeb71672a0	Change in Storages/*	2021-04-27 16:49:37 +03:00
feng lv	4ffe199d39	Implement table comments	2021-04-23 12:18:23 +00:00
Amos Bird	096d76627e	Skip unavailable shards when writing to distributed tables	2021-04-21 10:30:40 +08:00
Maksim Kita	e361f5943f	Merge pull request #22999 from azat/no-optimize_skip_unused_shards-single-node Do not perform optimize_skip_unused_shards for cluster with one node	2021-04-15 14:36:56 +03:00
Nikita Mikhaylov	7a68820342	style	2021-04-13 22:39:42 +03:00
Nikita Mikhaylov	081ea84a41	save	2021-04-13 22:39:41 +03:00
tavplubix	1525e38a3c	Merge pull request #22990 from ClickHouse/tavplubix-patch-1 Fix excessive warning in StorageDistributed with cross-replication	2021-04-13 18:58:12 +03:00
Azat Khuzhin	a497d4d462	Do not perform optimize_skip_unused_shards for cluster with one node	2021-04-12 22:18:31 +03:00
tavplubix	a995962e6a	Update StorageDistributed.cpp	2021-04-12 14:58:24 +03:00
Azat Khuzhin	79bd8d4d3f	Respect optimize_skip_unused_shards_rewrite_in with optimize_skip_unused_shards_limit	2021-04-12 10:37:28 +03:00
Azat Khuzhin	e439914d38	Fix optimized cluster logic for optimize_skip_unused_shards	2021-04-12 10:37:28 +03:00
Azat Khuzhin	fbb386dca5	Rewrite IN in query for remote shards to exclude values that do not belong to the shard v2: fix optimize_skip_unused_shards_rewrite_in for sharding_key wrapped into function v3: fix column name for optimize_skip_unused_shards_rewrite_in v4: fix optimize_skip_unused_shards_rewrite_in with Null v5: - squash with Remove query argument for IStreamFactory::createForShard() - use proper column after function execution (using sharding_key_column_name) - update the test reference since (X) now is tuple(X)	2021-04-12 10:37:28 +03:00
Ivan	495c6e03aa	Replace all Context references with std::weak_ptr (#22297) * Replace all Context references with std::weak_ptr * Fix shared context captured by value * Fix build * Fix Context with named sessions * Fix copy context * Fix gcc build * Merge with master and fix build * Fix gcc-9 build	2021-04-11 02:33:54 +03:00
Nikolai Kochetov	6102652c99	Merge branch 'master' into better-filter-push-down	2021-04-06 13:38:03 +03:00
Maxim Akhmedov	725fa17961	Introduce IStorage::distributedWrite method for distributed INSERT SELECT.	2021-04-05 02:14:27 +03:00
Nikolai Kochetov	c3c393a7aa	Merge branch 'master' into refactor-actions-dag	2021-03-18 14:33:07 +03:00
Nikolai Kochetov	e8d7349c79	Merge branch 'master' into dist-query-zero-shards-fix	2021-03-16 12:00:08 +03:00
Azat Khuzhin	61d40c3600	Fix optimize_skip_unused_shards for zero shards case v2: move check to the beginning of the StorageDistributed::read()	2021-03-10 09:05:14 +03:00
Azat Khuzhin	3474ea044e	Avoid processing optimize_skip_unused_shards twice	2021-03-09 10:05:56 +03:00
Azat Khuzhin	ed09897eb1	Pass optimize_skip_unused_shards_limit to the bottom layer And now optimize_skip_unused_shards_limit=0 is not a special case anymore.	2021-03-08 10:05:56 +03:00
Azat Khuzhin	16f4c02d42	Add optimize_skip_unused_shards_limit Limit for number of sharding key values, turns off optimize_skip_unused_shards if the limit is reached	2021-03-26 06:09:00 +03:00
Nikolai Kochetov	a669f7d641	Merge branch 'master' into refactor-actions-dag	2021-03-05 18:21:14 +03:00
Nikolai Kochetov	9a39459888	Refactor ActionsDAG	2021-03-04 20:38:12 +03:00
Azat Khuzhin	6965ac26c3	Distributed: Add ability to delay/throttle INSERT until pending data is reduced Add two new settings for the Distributed engine: - bytes_to_delay_insert - max_delay_to_insert If at the beginning of an INSERT there is too much pending data (more than bytes_to_delay_insert), the INSERT will wait until it shrinks, but not more than max_delay_to_insert seconds. If after this there is still too much pending data, it will throw an exception. Also new profile events were added (by analogy to the MergeTree): - DistributedDelayedInserts (although you can use system.errors instead of this, but still) - DistributedRejectedInserts - DistributedDelayedInsertsMilliseconds	2021-03-03 23:30:23 +03:00
Azat Khuzhin	b43046ba06	Distributed: More accurate distribution_queue counters So now system.distribution_queue will show accurate statistics, so tests do not require sleep anymore. But note that with too much distributed pending data this will iterate over all directories.	2021-03-03 23:30:03 +03:00
Azat Khuzhin	b5a5778589	Distributed: Add ability to limit amount of pending bytes for async INSERT Right now with distributed_directory_monitor_batch_inserts=1 and insert_distributed_sync=0, INSERT into a Distributed table will store blocks that should be sent to the remote (and in case of prefer_localhost_replica=0 to the localhost too) on the local filesystem, and send them in the background. However, there is no limit for this storage, and if the remote is unavailable (or there is some other error), these pending blocks may take significant space, and this is not always the desired behaviour. Add a new Distributed setting, bytes_to_throw_insert, that sets the limit on how many pending bytes are allowed; if the limit is reached, an exception is thrown. The default is 0, to avoid surprises.	2021-03-03 23:30:00 +03:00
Azat Khuzhin	ce09b7ff89	Distributed: Implement totalBytes() (system.tables.total_bytes)	2021-03-03 23:29:11 +03:00
Anton Popov	a4c00ab5dc	Merge pull request #21303 from ucasFL/forbid Forbid to drop a column if it's referenced by materialized view	2021-03-03 02:55:06 +03:00
feng lv	a26c9e64a9	fix fix	2021-03-02 03:20:03 +00:00
feng lv	51021c1164	forbid to drop a column if it's referenced by materialized view	2021-02-28 05:24:39 +00:00
Nikolai Kochetov	d328bfa41f	Review fixes. Add setting max_optimizations_to_apply.	2021-02-26 19:29:56 +03:00
Azat Khuzhin	809fa7e4cc	Sync SYSTEM FLUSH DISTRIBUTED with TRUNCATE	2021-02-10 23:10:37 +03:00
Azat Khuzhin	ce91c257b2	Lockless SYSTEM FLUSH DISTRIBUTED Right now SYSTEM FLUSH DISTRIBUTED will block: - INSERT into this Distributed table (requireDirectoryMonitor()) - SELECT * FROM system.distribution_queue	2021-02-08 22:07:30 +03:00
Kruglov Pavel	d94e8624d7	Merge branch 'master' into shard-id	2021-02-06 16:48:17 +03:00
Aleksei Semiglazov	921518db0a	CLICKHOUSE-606: query deduplication based on parts' UUID * add the query data deduplication excluding duplicated parts in MergeTree family engines. Query deduplication is based on parts' UUID, which should be enabled first with the merge_tree setting assign_part_uuids=1. The allow_experimental_query_deduplication setting enables part deduplication and defaults to false. Data part UUID is a mechanism of giving a data part a unique identifier. Having UUIDs and a deduplication mechanism provides the potential of moving parts between shards while preserving data consistency on the read path: duplicated UUIDs will cause the root executor to retry the query against one of the replicas, explicitly asking to exclude the encountered duplicated fingerprints during distributed query execution. NOTE: this implementation doesn't provide any knobs to lock a part and hence its UUID. Any mutation/merge will update the part's UUID. * add _part_uuid virtual column, allowing to use UUIDs in predicates. Signed-off-by: Aleksei Semiglazov <asemiglazov@cloudflare.com> address comments	2021-02-02 16:53:39 +00:00
feng lv	4279c7da41	add setting insert_shard_id add test fix style fix	2021-02-02 04:26:59 +00:00
kreuzerkrieg	29a2ef3089	Add IStoragePolicy interface	2021-01-26 10:55:28 +02:00
Azat Khuzhin	2e55bd2285	Accept IDisk in DirectoryMonitor (for further fsync)	2021-01-09 16:31:42 +03:00
Azat Khuzhin	b5ace27014	Add fsync support for Distributed engine. Two new settings (by analogy with MergeTree family) have been added: - `fsync_after_insert` - Do fsync for every inserted. Will decrease performance of inserts. - `fsync_tmp_directory` - Do fsync for temporary directory (that is used for async INSERT only) after all part operations (writes, renames, etc.). Refs: #17380 (p1)	2021-01-09 11:31:32 +03:00
Azat Khuzhin	714d5a067a	Expose supports_parallel_insert via system.table_engines	2021-01-08 14:57:24 +03:00
Alexey Milovidov	190402b7d5	Do not insert empty blocks on sync Distributed INSERT	2021-01-06 02:54:22 +03:00
Amos Bird	6fc225e676	Distributed insertion to one random shard (#18294) * Distributed insertion to one random shard * add some tests * add some documentation * Respect shards' weights * fine locking Co-authored-by: Ivan Lezhankin <ilezhankin@yandex-team.ru>	2020-12-23 19:04:05 +03:00
Azat Khuzhin	5365718f01	Fix optimize_distributed_group_by_sharding_key for query with OFFSET only (#16996) * Fix optimize_distributed_group_by_sharding_key for query with OFFSET only * Fix 01244_optimize_distributed_group_by_sharding_key flakiness	2020-12-02 20:11:39 +03:00
tavplubix	085359c110	Merge pull request #17274 from ClickHouse/fix_ast_formatting_in_logs Fix AST formatting in log messages	2020-11-24 19:00:56 +03:00
Alexander Tokmakov	60a5782c75	fix AST formatting in log messages	2020-11-22 20:23:12 +03:00
Amos Bird	1d9d586e20	Make global_context consistent.	2020-11-20 18:23:14 +08:00
Nikolai Kochetov	46f70dd0de	Merge branch 'master' into actions-dag-f14	2020-11-12 11:54:44 +03:00
tavplubix	058aa8f85e	Merge pull request #16824 from ClickHouse/replace_stringstreams_with_buffers Replace std::stringstreams with DB::Buffers	2020-11-12 01:11:44 +03:00
Nikolai Kochetov	1846bb3cac	Merge branch 'master' into actions-dag-f14	2020-11-11 13:08:57 +03:00
Nikolai Kochetov	1db8e77371	Add comments. Update ActionsDAG::Index	2020-11-10 17:54:59 +03:00
Nikolai Kochetov	195c941c4e	Merge branch 'master' into storage-read-query-plan	2020-11-10 15:02:22 +03:00
Alexander Tokmakov	5cdfcfb307	remove other stringstreams	2020-11-09 22:12:44 +03:00
Nikolai Kochetov	6717c7a0af	Merge branch 'master' into actions-dag-f14	2020-11-09 14:57:48 +03:00
alexey-milovidov	f4ba5f1f9a	Merge pull request #16772 from ClickHouse/fix-stringstream Fix "server failed to start" error	2020-11-08 14:27:08 +03:00
Alexey Milovidov	ba4ae00121	Whitespace	2020-11-08 00:30:40 +03:00
Alexey Milovidov	1ea3afadbc	Merge with master	2020-11-08 00:28:39 +03:00
Alexey Milovidov	5314185e25	Merge branch 'master' into azat-optimize_skip_unused_shards-optimization	2020-11-08 00:17:59 +03:00
Alexey Milovidov	fd84d16387	Fix "server failed to start" error	2020-11-07 03:14:53 +03:00
Nikolai Kochetov	c10f733587	Merge branch 'master' into storage-read-query-plan	2020-11-06 15:43:46 +03:00
Nikolai Kochetov	9aeb757da4	Merge branch 'master' into actions-dag-f14	2020-11-06 15:04:20 +03:00
Azat Khuzhin	f23995d290	Remove empty directories for async INSERT at start of Distributed engine Will be created by DistributedBlockOutputStream on demand.	2020-11-05 23:50:30 +03:00
Nikolai Kochetov	6767a226fc	Merge branch 'master' into actions-dag-f14	2020-11-03 15:21:06 +03:00
Nikolai Kochetov	07a7c46b89	Refactor ExpressionActions [Part 3]	2020-11-03 14:28:28 +03:00
Azat Khuzhin	fc14fde24a	Fix DROP TABLE for Distributed (racy with INSERT)
<details>

```
drop() on T1275:
0 DB::StorageDistributed::drop (this=0x7f9ed34f0000) at ../contrib/libcxx/include/__hash_table:966
1 0x000000000d557242 in DB::DatabaseOnDisk::dropTable (this=0x7f9fc22706d8, context=..., table_name=...) at ../contrib/libcxx/include/new:340
2 0x000000000d6fcf7c in DB::InterpreterDropQuery::executeToTable (this=this@entry=0x7f9e42560dc0, query=...) at ../contrib/libcxx/include/memory:3826
3 0x000000000d6ff5ee in DB::InterpreterDropQuery::execute (this=0x7f9e42560dc0) at ../src/Interpreters/InterpreterDropQuery.cpp:50
4 0x000000000daa40c0 in DB::executeQueryImpl (begin=<optimized out>, end=<optimized out>, context=..., internal=<optimized out>, stage=DB::QueryProcessingStage::Complete, has_query_tail=false, istr=0x0) at ../src/Interpreters/executeQuery.cpp:420
5 0x000000000daa59df in DB::executeQuery (query=..., context=..., internal=internal@entry=false, stage=<optimized out>, may_have_embedded_data=<optimized out>) at ../contrib/libcxx/include/string:1487
6 0x000000000e1369e6 in DB::TCPHandler::runImpl (this=this@entry=0x7f9ddf3a9000) at ../src/Server/TCPHandler.cpp:254
7 0x000000000e1379c9 in DB::TCPHandler::run (this=0x7f9ddf3a9000) at ../src/Server/TCPHandler.cpp:1326
8 0x000000001086fac7 in Poco::Net::TCPServerConnection::start (this=this@entry=0x7f9ddf3a9000) at ../contrib/poco/Net/src/TCPServerConnection.cpp:43
9 0x000000001086ff2b in Poco::Net::TCPServerDispatcher::run (this=0x7f9e4eba5c00) at ../contrib/poco/Net/src/TCPServerDispatcher.cpp:114
10 0x00000000109dbe8e in Poco::PooledThread::run (this=0x7f9e4a2d2f80) at ../contrib/poco/Foundation/src/ThreadPool.cpp:199
11 0x00000000109d78f9 in Poco::ThreadImpl::runnableEntry (pThread=<optimized out>) at ../contrib/poco/Foundation/include/Poco/SharedPtr.h:401
12 0x00007f9fc3cccea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
13 0x00007f9fc3bebeaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

StorageDistributedDirectoryMonitor on T166:
0 DB::StorageDistributedDirectoryMonitor::StorageDistributedDirectoryMonitor (this=0x7f9ea7ab1400, storage_=..., path_=..., pool_=..., monitor_blocker_=..., bg_pool_=...) at ../src/Storages/Distributed/DirectoryMonitor.cpp:81
1 0x000000000dbf684e in std::__1::make_unique<> () at ../contrib/libcxx/include/memory:3474
2 DB::StorageDistributed::requireDirectoryMonitor (this=0x7f9ed34f0000, disk=..., name=...) at ../src/Storages/StorageDistributed.cpp:682
3 0x000000000de3d5fa in DB::DistributedBlockOutputStream::writeToShard (this=this@entry=0x7f9ed39c7418, block=..., dir_names=...) at ../src/Storages/Distributed/DistributedBlockOutputStream.cpp:634
4 0x000000000de3e214 in DB::DistributedBlockOutputStream::writeAsyncImpl (this=this@entry=0x7f9ed39c7418, block=..., shard_id=shard_id@entry=79) at ../src/Storages/Distributed/DistributedBlockOutputStream.cpp:539
5 0x000000000de3e47b in DB::DistributedBlockOutputStream::writeSplitAsync (this=this@entry=0x7f9ed39c7418, block=...) at ../contrib/libcxx/include/vector:1546
6 0x000000000de3eab0 in DB::DistributedBlockOutputStream::writeAsync (block=..., this=0x7f9ed39c7418) at ../src/Storages/Distributed/DistributedBlockOutputStream.cpp:141
7 DB::DistributedBlockOutputStream::write (this=0x7f9ed39c7418, block=...) at ../src/Storages/Distributed/DistributedBlockOutputStream.cpp:135
8 0x000000000d73b376 in DB::PushingToViewsBlockOutputStream::write (this=this@entry=0x7f9ea7a8cf58, block=...) at ../src/DataStreams/PushingToViewsBlockOutputStream.cpp:157
9 0x000000000d7853eb in DB::AddingDefaultBlockOutputStream::write (this=0x7f9ed383d118, block=...) at ../contrib/libcxx/include/memory:3826
10 0x000000000d740790 in DB::SquashingBlockOutputStream::write (this=0x7f9ed383de18, block=...) at ../contrib/libcxx/include/memory:3826
11 0x000000000d68c308 in DB::CountingBlockOutputStream::write (this=0x7f9ea7ac6d60, block=...) at ../contrib/libcxx/include/memory:3826
12 0x000000000ddab449 in DB::StorageBuffer::writeBlockToDestination (this=this@entry=0x7f9fbd56a000, block=..., table=...) at ../src/Storages/StorageBuffer.cpp:747
13 0x000000000ddabfa6 in DB::StorageBuffer::flushBuffer (this=this@entry=0x7f9fbd56a000, buffer=..., check_thresholds=check_thresholds@entry=true, locked=locked@entry=false, reset_block_structure=reset_block_structure@entry=false) at ../src/Storages/StorageBuffer.cpp:661
14 0x000000000ddac415 in DB::StorageBuffer::flushAllBuffers (reset_blocks_structure=false, check_thresholds=true, this=0x7f9fbd56a000) at ../src/Storages/StorageBuffer.cpp:605

shutdown() on T1275:
0 DB::StorageDistributed::shutdown (this=0x7f9ed34f0000) at ../contrib/libcxx/include/atomic:1612
1 0x000000000d6fd938 in DB::InterpreterDropQuery::executeToTable (this=this@entry=0x7f98530c79a0, query=...) at ../src/Storages/TableLockHolder.h:12
2 0x000000000d6ff5ee in DB::InterpreterDropQuery::execute (this=0x7f98530c79a0) at ../src/Interpreters/InterpreterDropQuery.cpp:50
3 0x000000000daa40c0 in DB::executeQueryImpl (begin=<optimized out>, end=<optimized out>, context=..., internal=<optimized out>, stage=DB::QueryProcessingStage::Complete, has_query_tail=false, istr=0x0) at ../src/Interpreters/executeQuery.cpp:420
4 0x000000000daa59df in DB::executeQuery (query=..., context=..., internal=internal@entry=false, stage=<optimized out>, may_have_embedded_data=<optimized out>) at ../contrib/libcxx/include/string:1487
5 0x000000000e1369e6 in DB::TCPHandler::runImpl (this=this@entry=0x7f9ddf3a9000) at ../src/Server/TCPHandler.cpp:254
6 0x000000000e1379c9 in DB::TCPHandler::run (this=0x7f9ddf3a9000) at ../src/Server/TCPHandler.cpp:1326
7 0x000000001086fac7 in Poco::Net::TCPServerConnection::start (this=this@entry=0x7f9ddf3a9000) at ../contrib/poco/Net/src/TCPServerConnection.cpp:43
8 0x000000001086ff2b in Poco::Net::TCPServerDispatcher::run (this=0x7f9e4eba5c00) at
../contrib/poco/Net/src/TCPServerDispatcher.cpp:114 9 0x00000000109dbe8e in Poco::PooledThread::run (this=0x7f9e4a2d2f80) at ../contrib/poco/Foundation/src/ThreadPool.cpp:199 10 0x00000000109d78f9 in Poco::ThreadImpl::runnableEntry (pThread=<optimized out>) at ../contrib/poco/Foundation/include/Poco/SharedPtr.h:401 11 0x00007f9fc3cccea7 in start_thread (arg=<optimized out>) at pthread_create.c:477 12 0x00007f9fc3bebeaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95 ``` </details>	2020-10-27 21:19:36 +03:00
Ivan	1d170f5745	ASTTableIdentifier Part #1 : improve internal representation of ASTIdentifier name (#16149 ) * Use only |name_parts| as primary name source * Restore legacy logic for table restoration * Fix build * Fix tests * Add pytest server config * Fix tests * Fixes due to review	2020-10-24 21:46:10 +03:00
Nikolai Kochetov	7fa045cff8	Merge branch 'master' into storage-read-query-plan	2020-10-22 13:31:10 +03:00
Azat Khuzhin	9b8abd44ab	Add allow_nondeterministic_optimize_skip_unused_shards	2020-10-17 01:07:02 +03:00
Alexander Tokmakov	72b1339656	Revert "Revert "Write structure of table functions to metadata"" This reverts commit `c65d1e5c70`.	2020-10-14 15:19:29 +03:00
tavplubix	c65d1e5c70	Revert "Write structure of table functions to metadata"	2020-10-14 13:59:29 +03:00
alexey-milovidov	f60ccb4edf	Merge pull request #14295 from ClickHouse/write_structure_of_table_functions Write structure of table functions to metadata	2020-10-13 23:56:09 +03:00
Nikolai Kochetov	7e58f99f64	Merge branch 'master' into storage-read-query-plan	2020-10-12 13:12:39 +03:00
Alexey Milovidov	5b482f4191	Cleanups	2020-10-10 19:31:10 +03:00
Nikolai Kochetov	c5cb05f5f3	Try fix tests.	2020-10-07 14:26:29 +03:00
Azat Khuzhin	b838214a35	Pass non-const SelectQueryInfo (and drop mutable qualifiers)	2020-10-02 22:42:35 +03:00
Azat Khuzhin	587cde853e	Avoid skipping unused shards twice (for query processing stage and read itself)	2020-10-02 22:42:09 +03:00
Nikolai Kochetov	576ffadb17	Fix explain for ISourceStep.	2020-09-30 15:22:30 +03:00
Alexander Tokmakov	b0d99217fb	Merge branch 'master' into write_structure_of_table_functions	2020-09-27 14:26:47 +03:00
Nikolai Kochetov	dea90009e3	Fix build	2020-09-25 16:03:12 +03:00

456 Commits