ClickHouse

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-12-15 10:52:30 +00:00

Author	SHA1	Message	Date
Alexander Tokmakov	07d952b728	use snapshots for semistructured data, durability fixes	2022-03-17 18:26:18 +01:00
Anton Popov	063917786e	minor fixes	2022-03-14 17:29:18 +00:00
Anton Popov	36ec379aeb	Merge remote-tracking branch 'upstream/master' into HEAD	2022-03-14 16:28:35 +00:00
Rajkumar	3d3b6d1956	clang-tidy report issues with Medium priority	2022-03-10 07:23:49 -08:00
Azat Khuzhin	c4b6342853	Improvements for `parallel_distributed_insert_select` (and related) (#34728) * Add a warning if parallel_distributed_insert_select was ignored Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Respect max_distributed_depth for parallel_distributed_insert_select Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Print warning for non applied parallel_distributed_insert_select only for initial query Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Remove Cluster::getHashOfAddresses() Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Forbid parallel_distributed_insert_select for remote()/cluster() with different addresses Before it uses empty cluster name (getClusterName()) which is not correct, compare all addresses instead. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Fix max_distributed_depth check max_distributed_depth=1 must mean not more then one distributed query, not two, since max_distributed_depth=0 means no limit, and distribute_depth is 0 for the first query. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Fix INSERT INTO remote()/cluster() with parallel_distributed_insert_select Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Add a test for parallel_distributed_insert_select with cluster()/remote() Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Return <remote> instead of empty cluster name in Distributed engine Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Make user with sharding_key and w/o in remote()/cluster() identical Before with sharding_key the user was "default", while w/o it it was empty. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-03-08 15:24:39 +01:00
Anton Popov	2758db5341	add more comments	2022-03-01 19:32:55 +03:00
Anton Popov	a661eaf39f	better performance of getting storage snapshot	2022-02-16 02:17:22 +03:00
Anton Popov	e8ce091e68	Merge remote-tracking branch 'upstream/master' into HEAD	2022-01-21 20:11:18 +03:00
Anton Popov	7c6f7f6732	support 'optimize_move_to_prewhere' with storage 'Merge'	2021-12-29 20:49:10 +03:00
Anton Popov	6f4d9a53b2	Merge remote-tracking branch 'origin/sparse-serialization' into HEAD	2021-12-01 15:54:33 +03:00
Raúl Marín	b2cfa70541	Reduce dependencies on ASTFunction.h 481 -> 230	2021-11-26 18:21:54 +01:00
Anton Popov	a20922b2d3	Merge remote-tracking branch 'origin/sparse-serialization' into HEAD	2021-11-09 15:36:25 +03:00
Alexander Tokmakov	2e7e195e77	change alter_lock to std::timed_mutex	2021-10-26 13:37:00 +03:00
Alexey Milovidov	fe6b7c77c7	Rename "common" to "base"	2021-10-02 10:13:14 +03:00
Nikolai Kochetov	b997214620	Rename QueryPipeline to QueryPipelineBuilder.	2021-09-14 20:48:18 +03:00
Anton Popov	4c388e3d84	Merge remote-tracking branch 'origin/sparse-serialization' into HEAD	2021-09-09 14:10:16 +03:00
Alexander Tokmakov	13466a7cc3	minor fix	2021-09-03 20:06:38 +03:00
Alexander Tokmakov	42378b5913	fix	2021-08-20 17:05:53 +03:00
Anton Popov	61239343e3	Merge remote-tracking branch 'origin/sparse-serialization' into HEAD	2021-08-20 16:33:30 +03:00
Alexey Milovidov	24cef99065	Merge branch 'master' into fix-bad-cast	2021-08-08 04:00:29 +03:00
alexey-milovidov	c5207fc237	Merge pull request #26466 from azat/optimize-dist-select Rework SELECT from Distributed optimizations	2021-08-08 03:59:32 +03:00
mergify[bot]	dc57254982	Merge branch 'master' into improve_create_or_replace	2021-08-03 11:39:07 +00:00
Anton Popov	e36736b50c	Merge remote-tracking branch 'origin/sparse-serialization' into HEAD	2021-08-02 22:52:02 +03:00
Azat Khuzhin	2fb95d9ee0	Rework SELECT from Distributed query stages optimization Before this patch it wasn't possible to optimize simple SELECT * FROM dist ORDER BY (w/o GROUP BY and DISTINCT) to more optimal stage (QueryProcessingStage::WithMergeableStateAfterAggregationAndLimit), since that code was under allow_nondeterministic_optimize_skip_unused_shards, rework it and make it possible. Also now distributed_push_down_limit is respected for optimize_distributed_group_by_sharding_key. Next step will be to enable distributed_push_down_limit by default. v2: fix detection of aggregates	2021-08-02 21:04:29 +03:00
Alexey Milovidov	edfeb0957f	Fix strange code	2021-07-24 04:52:18 +03:00
Anton Popov	f867995b94	remove excessive creation of storage snapshot	2021-07-23 19:47:43 +03:00
Anton Popov	2b58f39c10	dynamic columns: support missed columns in distributed	2021-07-23 19:34:23 +03:00
Nikolai Kochetov	2dc5c89b66	Update Storage::write	2021-07-23 17:25:35 +03:00
Anton Popov	5d175bf557	dynamic columns: support distributed tables	2021-07-12 17:54:02 +03:00
Anton Popov	3ed7f5a6cc	dynamic subcolumns: add snapshot for storage	2021-07-09 06:15:41 +03:00
Alexander Tokmakov	d9a77e3a1a	improve CREATE OR REPLACE query	2021-07-01 16:21:38 +03:00
Azat Khuzhin	a616ae8861	Improve startup time of Distributed engine. - create directory monitors in parallel (this also includes rmdir in case of directory is empty, since even if the directory is empty it may take some time to remove it, due to waiting for journal or if the directory is large, i.e. it had lots of files before, since remember ext4 does not truncate the directory size on each unlink [1]) - initialize increment in parallel too (since it does readdir()) [1]: https://lore.kernel.org/linux-ext4/930A5754-5CE6-4567-8CF0-62447C97825C@dilger.ca/	2021-06-24 10:27:51 +03:00
Maksim Kita	67e9b85951	Merge ext into common	2021-06-16 23:28:41 +03:00
Anton Popov	3acbd12c54	enable reading of subcolumn for distributed tables	2021-05-25 03:49:24 +03:00
feng lv	c6f8ab9826	fix	2021-05-13 02:05:53 +00:00
Amos Bird	cd6414639e	add metadata_snapshot to getQueryProcessingStage	2021-05-11 18:12:26 +08:00
feng lv	4ffe199d39	Implement table comments	2021-04-23 12:18:23 +00:00
Ivan	495c6e03aa	Replace all Context references with std::weak_ptr (#22297) * Replace all Context references with std::weak_ptr * Fix shared context captured by value * Fix build * Fix Context with named sessions * Fix copy context * Fix gcc build * Merge with master and fix build * Fix gcc-9 build	2021-04-11 02:33:54 +03:00
Maxim Akhmedov	725fa17961	Introduce IStorage::distributedWrite method for distributed INSERT SELECT.	2021-04-05 02:14:27 +03:00
Azat Khuzhin	6965ac26c3	Distributed: Add ability to delay/throttle INSERT until pending data will be reduced Add two new settings for the Distributed engine: - bytes_to_delay_insert - max_delay_to_insert If at the beginning of INSERT there will be too much pending data, more then bytes_to_delay_insert, then the INSERT will wait until it will be shrinked, and not more then max_delay_to_insert seconds. If after this there will be still too much pending, it will throw an exception. Also new profile events were added (by analogy to the MergeTree): - DistributedDelayedInserts (although you can use system.errors instead of this, but still) - DistributedRejectedInserts - DistributedDelayedInsertsMilliseconds	2021-03-03 23:30:23 +03:00
Azat Khuzhin	b5a5778589	Distributed: Add ability to limit amount of pending bytes for async INSERT Right now with distributed_directory_monitor_batch_inserts=1 and insert_distributed_sync=0 INSERT into Distributed table will store blocks that should be sent to remote (and in case of prefer_localhost_replica=0 to the localhost too) on the local filesystem, and sent it in background. However there is no limit for this storage, and if the remote is unavailable (or some other error), these pending blocks may take significant space, and this is not always desired behaviour. Add new Distributed setting - bytes_to_throw_insert, that will set the limit for how much pending bytes is allowed, if the limit will be reached an exception will be throw. By default was set to 0, to avoid surprises.	2021-03-03 23:30:00 +03:00
Azat Khuzhin	ce09b7ff89	Distributed: Implement totalBytes() (system.tables.total_bytes)	2021-03-03 23:29:11 +03:00
Azat Khuzhin	456cbaf747	Distributed: Hide private part of the interface	2021-03-03 23:29:11 +03:00
feng lv	51021c1164	forbid to drop a column if it's referenced by materialized view	2021-02-28 05:24:39 +00:00
Azat Khuzhin	809fa7e4cc	Sync SYSTEM FLUSH DISTRIBUTED with TRUNCATE	2021-02-10 23:10:37 +03:00
Azat Khuzhin	ce91c257b2	Lockless SYSTEM FLUSH DISTRIBUTED Right now SYSTEM FLUSH DISTRIBUTED will block: - INSERT into this Distributed table (requireDirectoryMonitor()) - SELECT * FROM system.distribution_queue	2021-02-08 22:07:30 +03:00
Azat Khuzhin	2e55bd2285	Accept IDisk in DirectoryMonitor (for further fsync)	2021-01-09 16:31:42 +03:00
Azat Khuzhin	b5ace27014	Add fsync support for Distributed engine. Two new settings (by analogy with MergeTree family) has been added: - `fsync_after_insert` - Do fsync for every inserted. Will decreases performance of inserts. - `fsync_tmp_directory` - Do fsync for temporary directory (that is used for async INSERT only) after all part operations (writes, renames, etc.). Refs: #17380 (p1)	2021-01-09 11:31:32 +03:00
Amos Bird	6fc225e676	Distributed insertion to one random shard (#18294) * Distributed insertion to one random shard * add some tests * add some documentation * Respect shards' weights * fine locking Co-authored-by: Ivan Lezhankin <ilezhankin@yandex-team.ru>	2020-12-23 19:04:05 +03:00
Amos Bird	1d9d586e20	Make global_context consistent.	2020-11-20 18:23:14 +08:00
Nikolai Kochetov	195c941c4e	Merge branch 'master' into storage-read-query-plan	2020-11-10 15:02:22 +03:00
Alexey Milovidov	5314185e25	Merge branch 'master' into azat-optimize_skip_unused_shards-optimization	2020-11-08 00:17:59 +03:00
Nikolai Kochetov	c10f733587	Merge branch 'master' into storage-read-query-plan	2020-11-06 15:43:46 +03:00
Alexander Tokmakov	ac32809b6a	fix #16482	2020-11-02 19:40:39 +03:00
Nikolai Kochetov	7fa045cff8	Merge branch 'master' into storage-read-query-plan	2020-10-22 13:31:10 +03:00
Alexander Tokmakov	72b1339656	Revert "Revert "Write structure of table functions to metadata"" This reverts commit `c65d1e5c70`.	2020-10-14 15:19:29 +03:00
tavplubix	c65d1e5c70	Revert "Write structure of table functions to metadata"	2020-10-14 13:59:29 +03:00
Azat Khuzhin	b838214a35	Pass non-const SelectQueryInfo (and drop mutable qualifiers)	2020-10-02 22:42:35 +03:00
Azat Khuzhin	587cde853e	Avoid skipping unused shards twice (for query processing stage and read itself)	2020-10-02 22:42:09 +03:00
Nikolai Kochetov	576ffadb17	Fix explain for ISourceStep.	2020-09-30 15:22:30 +03:00
Nikolai Kochetov	dea90009e3	Fix build	2020-09-25 16:03:12 +03:00
Alexander Tokmakov	1ca9a92b21	Merge branch 'master' into write_structure_of_table_functions	2020-09-18 21:09:23 +03:00
Nikolai Kochetov	b26f11c00c	Support StorageDistributed::read for QueryPlan.	2020-09-18 17:16:53 +03:00
Pavel Kovalenko	01ab28a182	Don't throw exception if Distributed storage has multi-volume storage policy configuration.	2020-09-15 12:26:56 +03:00
Alexander Tokmakov	b840d741d0	Merge branch 'master' into write_structure_of_table_functions	2020-09-04 13:00:07 +03:00
Azat Khuzhin	10b4f3b41f	Optimize queries with LIMIT/LIMIT BY/ORDER BY for distributed with GROUP BY sharding_key Previous set of QueryProcessingStage does not allow to do this. But after WithMergeableStateAfterAggregation had been introduced the following queries can be optimized too under optimize_distributed_group_by_sharding_key: - GROUP BY sharding_key LIMIT - GROUP BY sharding_key LIMIT BY - GROUP BY sharding_key ORDER BY And right now it is still not supports: - WITH TOTALS (looks like it can be supported) - WITH ROLLUP (looks like it can be supported) - WITH CUBE - SETTINGS extremes=1 (looks like it can be supported) But will be implemented separatelly. vX: fixes v2: fix WITH * v3: fix extremes v4: fix LIMIT OFFSET (and make a little bit cleaner) v5: fix HAVING v6: fix ORDER BY v7: rebase against 20.7 v8: move out WithMergeableStateAfterAggregation v9: add optimize_distributed_group_by_sharding_key into test names	2020-09-03 00:52:51 +03:00
Alexander Tokmakov	56695727b2	Merge branch 'master' into write_structure_of_table_functions	2020-09-01 20:15:13 +03:00
Alexander Tokmakov	969940b4c9	write table tructure for table function remote(...)	2020-08-26 23:55:40 +03:00
Alexey Milovidov	2a09aa53cc	Support parallel INSERT for more table engines	2020-08-26 19:41:30 +03:00
Nikolai Kochetov	2cca4d5fcf	Refactor Pipe [part 2].	2020-08-03 16:54:14 +03:00
Vladimir Chebotarev	1b3f5c99f5	Real fix of test.	2020-07-26 21:27:36 +03:00
Vladimir Chebotarev	8039d45910	Minor fix in `StorageDistributed`.	2020-07-26 21:27:36 +03:00
Gleb Novikov	7f5b6fba78	Generic volume is coming... 1. SingleDiskVolume for temporary volumes 2. Generic VolumePtr in StoragePolicies 3. Removed max_data_part_size in system.storage_policies, added volume_type	2020-07-26 21:27:36 +03:00
Azat Khuzhin	6ea1b19476	Remove data for Distributed tables (blocks from async INSERTs) on DROP TABLE	2020-07-17 08:59:57 +03:00
alexey-milovidov	18eb141ea1	Merge pull request #11715 from azat/dist-optimize_skip_unused_shards-fixes Control nesting level for shards skipping and disallow non-deterministic functions	2020-06-24 12:54:58 +03:00
Azat Khuzhin	041533eae2	Disable optimize_skip_unused_shards if sharding_key has non-deterministic func Example of such functions is rand() And this patch disables only optimize_skip_unused_shards, i.e. INSERT code path does not changed, so it will work as before.	2020-06-18 21:49:29 +03:00
alesapin	d79982f497	Better locks in Storages	2020-06-18 19:10:47 +03:00
alesapin	aab4ce6394	Truncate with metadata	2020-06-18 13:29:13 +03:00
alesapin	ebb36bec8a	Merge branch 'master' into atomic_metadata5	2020-06-18 11:57:16 +03:00
alesapin	dffdece350	getColumns in StorageInMemoryMetadta (only compilable)	2020-06-17 19:39:58 +03:00
Nikita Mikhaylov	ff0262626a	Merge pull request #11645 from azat/load-balancing-round-robin Add round_robin load_balancing	2020-06-17 14:34:59 +04:00
alesapin	36ba0192df	Metadata in read and write methods of IStorage	2020-06-15 22:08:58 +03:00
Azat Khuzhin	c139a05370	Forward declaration in StorageDistributed	2020-06-14 01:09:21 +03:00
alesapin	8be957ecb5	Better checks around metadata	2020-06-10 14:16:31 +03:00
Azat Khuzhin	86c5465bf8	Rewrite StorageSystemDistributionQueue interfaces	2020-06-04 03:04:32 +03:00
Azat Khuzhin	389f78ceee	Add system.distribution_queue system.distribution_queue contains the following columns: - database - table - data_path - is_blocked - error_count - data_files - data_compressed_bytes	2020-06-04 02:36:16 +03:00
Azat Khuzhin	60d10f1bac	Fix typo in StorageDistributed	2020-06-04 02:36:16 +03:00
Alexey Milovidov	25f941020b	Remove namespace pollution	2020-05-31 00:57:37 +03:00
Azat Khuzhin	bc4b75dead	Add table name into logs for StorageDistributed	2020-05-23 11:57:14 +03:00
Azat Khuzhin	d93b9a57f6	Forward declaration for Context as much as possible. Now after changing Context.h 488 modules will be recompiled instead of 582.	2020-05-21 01:53:18 +03:00
Gleb Novikov	c637d99e07	Volumes and storages refactoring: 1. Moved Volume to separate file 2. Created IVolume interface and implemented current behaviour in implementation of new interface — VolumeJBOD 3. Replaced all old volume usages with new VolumeJBOD. Where it is unnecessary to have JBOD — left just IVolume. 4. Removed old Volume completely 5. Moved StoragePolicy to separated files 6. Moved DiskSelector to separated files 7. Removed DiskSpaceMonitor file	2020-05-04 23:15:38 +03:00
Azat Khuzhin	63d8ab8f03	Make createSelector() static (in storage) and const (in stream)	2020-05-01 11:31:05 +03:00
Azat Khuzhin	f22ba15b4a	Reduce copy-paste of DistributedBlockOutputStream::createSelector This will make it less error prone.	2020-05-01 02:59:40 +03:00
alesapin	f981649213	Fix pushing to views stream and refactor virtuals	2020-04-28 13:38:57 +03:00
alesapin	18c550df15	Better virtuals logic	2020-04-27 16:55:30 +03:00
alesapin	2829774105	Merge branch 'master' into refactor_istorage	2020-04-27 15:34:21 +03:00
alesapin	dc2dd77d2e	Remove redundant overrides from IStorage	2020-04-24 12:20:09 +03:00
Alexander Tokmakov	04d6b59ac0	Merge branch 'master' into database_atomic	2020-04-23 17:31:37 +03:00
Alexey Milovidov	1e325a9fd9	Checkpoint	2020-04-22 09:22:14 +03:00
Alexander Tokmakov	b29bddac12	Merge branch 'master' into database_atomic	2020-04-20 14:09:09 +03:00

159 Commits