ClickHouse

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-12-15 02:41:59 +00:00

Author	SHA1	Message	Date
Azat Khuzhin	c25d6cd624	Rename directory monitor concept into background INSERT (#55978) * Limit log frequence for "Skipping send data over distributed table" message After SYSTEM STOP DISTRIBUTED SENDS it will constantly print this message. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Rename directory monitor concept into async INSERT Rename the following query settings (with preserving backward compatiblity, by keeping old name as an alias): - distributed_directory_monitor_sleep_time_ms -> distributed_async_insert_sleep_time_ms - distributed_directory_monitor_max_sleep_time_ms -> distributed_async_insert_max_sleep_time_ms - distributed_directory_monitor_batch -> distributed_async_insert_batch_inserts - distributed_directory_monitor_split_batch_on_failure -> distributed_async_insert_split_batch_on_failure Rename the following table settings (with preserving backward compatiblity, by keeping old name as an alias): - monitor_batch_inserts -> async_insert_batch - monitor_split_batch_on_failure -> async_insert_split_batch_on_failure - directory_monitor_sleep_time_ms -> async_insert_sleep_time_ms - directory_monitor_max_sleep_time_ms -> async_insert_max_sleep_time_ms And also update all the references: $ gg -e directory_monitor_ -e monitor_ tests docs | cut -d: -f1 | sort -u | xargs sed -e 's/distributed_directory_monitor_sleep_time_ms/distributed_async_insert_sleep_time_ms/g' -e 's/distributed_directory_monitor_max_sleep_time_ms/distributed_async_insert_max_sleep_time_ms/g' -e 's/distributed_directory_monitor_batch_inserts/distributed_async_insert_batch/g' -e 's/distributed_directory_monitor_split_batch_on_failure/distributed_async_insert_split_batch_on_failure/g' -e 's/monitor_batch_inserts/async_insert_batch/g' -e 's/monitor_split_batch_on_failure/async_insert_split_batch_on_failure/g' -e 's/monitor_sleep_time_ms/async_insert_sleep_time_ms/g' -e 's/monitor_max_sleep_time_ms/async_insert_max_sleep_time_ms/g' -i Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Rename async_insert for Distributed into background_insert This will avoid amigibuity between general async INSERT's and INSERT into Distributed, which are indeed background, so new term express it even better. Mostly done with: $ git di HEAD^ --name-only | xargs sed -i -e 's/distributed_async_insert/distributed_background_insert/g' -e 's/async_insert_batch/background_insert_batch/g' -e 's/async_insert_split_batch_on_failure/background_insert_split_batch_on_failure/g' -e 's/async_insert_sleep_time_ms/background_insert_sleep_time_ms/g' -e 's/async_insert_max_sleep_time_ms/background_insert_max_sleep_time_ms/g' Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Mark 02417_opentelemetry_insert_on_distributed_table as long CI: https://s3.amazonaws.com/clickhouse-test-reports/55978/7a6abb03a0b507e29e999cb7e04f246a119c6f28/stateless_tests_flaky_check__asan_.html Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> --------- Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-11-01 15:09:39 +01:00
Nikolai Kochetov	c5100cd9b1	Fixing build	2023-10-04 09:01:29 +00:00
Nikolai Kochetov	903c966cc8	Merge branch 'master' into planner-prepare-filters-for-analysis-2	2023-09-11 16:14:03 +02:00
Igor Nikonov	a12a375a1d	Style check	2023-09-09 12:51:34 +00:00
Igor Nikonov	fff2a2d81c	Fix style	2023-09-09 08:29:33 +00:00
Igor Nikonov	7a396139df	Cleanup: unnecessary SelectQueryInfo usage around distributed	2023-09-08 21:53:38 +00:00
Nikolai Kochetov	cb851fcee0	Cleanup.	2023-08-24 11:07:17 +00:00
Nikolai Kochetov	33b8b93d1b	Re-implement getOptimizedQueryProcessingStage with analyzer.	2023-08-24 11:07:17 +00:00
Nikolai Kochetov	fc90a1a0bd	Fix some skip_unused_shards tests.	2023-08-24 11:07:17 +00:00
Nikolai Kochetov	26e0ad8d72	Re-impl evaluateExpressionOverConstantCondition [part 1]	2023-08-24 11:07:17 +00:00
Alexander Tokmakov	faca49a905	Merge branch 'master' into add_delay_for_replicated	2023-07-24 16:07:38 +02:00
Azat Khuzhin	20625d75ab	Fix optimize_skip_unused_shards with JOINs In case of JOIN query may contains conditions for other tables, while optimize_skip_unused_shards was pretty dumb and failed to skip such columns. Fix this by removing JOIN before applying this optimization. v2: restriction for analyzer v3: ignore 01940_custom_tld_sharding_key under analyzer Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> Co-Authored-By: Alexey Milovidov <milovidov@clickhouse.com>	2023-07-22 07:45:33 +02:00
alesapin	baee73fd96	Make shutdown of replicated tables softer	2023-07-05 18:11:25 +02:00
Antonio Andelic	b11f744252	Correctly disable async insert with deduplication when it's not needed (#50663) * Correctly disable async insert when it's not used * Better * Add comment * Better * Fix tests --------- Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>	2023-06-07 20:33:08 +02:00
Azat Khuzhin	79b83c4fd2	Remove superfluous includes of logger_userful.h from headers Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-04-10 17:59:30 +02:00
Azat Khuzhin	e10fb142fd	Fix race for distributed sends from disk Before it was initialized from disk only on startup, but if some INSERT can create the object before, then, it will lead to the situation when it will not be initialized. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-02-28 22:33:36 +01:00
Azat Khuzhin	b5434eac3b	Rename StorageDistributedDirectoryMonitor to DistributedAsyncInsertDirectoryQueue Since #44922 it is not a directory monitor anymore. v2: Remove unused error codes v3: Contains some header fixes due to conflicts with master Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-02-28 22:33:36 +01:00
Azat Khuzhin	1c4659b8e7	Separate out Batch as DistributedAsyncInsertBatch (and also some helpers) Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-02-28 22:33:36 +01:00
Azat Khuzhin	3f892e52ab	Revert "Revert "Merge pull request #44922 from azat/dist/async-INSERT-metrics"" This is the revert of revert since there will be follow up patches to address the issues. This reverts commit `a55798626a`.	2023-02-28 22:33:36 +01:00
Azat Khuzhin	51019bc9f3	Fix a race between Distributed table creation and INSERT into it Initializing queues for pending on-disk files for async INSERT cannot be done after table had been attached and visible to user, since it initializes the per-table counter, that is used during INSERT. Now there is a window, when this counter is not initialized and it will start from the beginning, and this could lead to CANNOT_LINK error: Destination file /data/clickhouse/data/urls_v1/urls_in/shard6_replica1/13129817.bin is already exist and have different inode Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-23 09:55:43 +01:00
Azat Khuzhin	a55798626a	Revert "Merge pull request #44922 from azat/dist/async-INSERT-metrics" There are the following problems with this patch: - Looses files on exception - Existing current_batch.txt on startup leads to ENOENT error and hung of distributed sends without ATTACH/DETACH - Race between creating the queue for sending at table startup and INSERT, if it had been created from INSERT, then it will not be initialized from disk They were addressed in #45491, but it makes code more cmoplex and plus since, likely, the release is comming, it is better to revert the change. This reverts commit `94604f71b7`, reversing changes made to `80f6a45376`.	2023-01-21 22:42:00 +01:00
Nikita Mikhaylov	857799fbca	Parallel distributed insert select with s3Cluster [3] (#44955) * Revert "Revert "Resurrect parallel distributed insert select with s3Cluster (#41535)"" This reverts commit `b8d9066004`. * Fix build * Better * Fix test * Automatic style fix Co-authored-by: robot-clickhouse <robot-clickhouse@users.noreply.github.com>	2023-01-09 13:30:32 +01:00
Azat Khuzhin	4e76629aaf	Fixes for -Wshorten-64-to-32 - lots of static_cast - add safe_cast - types adjustments - config - IStorage::read/watch - ... - some TODO's (to convert types in future) P.S. That was quite a journey... v2: fixes after rebase v3: fix conflicts after #42308 merged Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-10-21 13:25:19 +02:00
Alexander Tokmakov	b8d9066004	Revert "Resurrect parallel distributed insert select with s3Cluster (#41535)" This reverts commit `860e34e760`.	2022-10-07 15:53:30 +02:00
Nikita Mikhaylov	860e34e760	Resurrect parallel distributed insert select with s3Cluster (#41535)	2022-10-06 13:47:32 +02:00
Alexander Tokmakov	f9f85a0e8b	Revert "Parallel distributed insert select from *Cluster table functions (#39107)" This reverts commit `d3cc234986`.	2022-08-24 15:17:15 +03:00
Nikita Mikhaylov	d3cc234986	Parallel distributed insert select from *Cluster table functions (#39107)	2022-08-15 12:41:17 +02:00
Nikolai Kochetov	c71256ea38	Remove some commented code.	2022-05-30 13:18:20 +00:00
Nikolai Kochetov	fd97a9d885	Move some resources	2022-05-23 19:47:32 +00:00
Nikolai Kochetov	56feef01e7	Move some resources	2022-05-20 19:49:31 +00:00
Robert Schulze	777b5bc15b	Don't let storages inherit from boost::noncopyable ... IStorage has deleted copy ctor / assignment already	2022-05-03 09:07:08 +02:00
Robert Schulze	330212e0f4	Remove inherited create() method + disallow copying The original motivation for this commit was that shared_ptr_helper used std::shared_ptr<>() which does two heap allocations instead of make_shared<>() which does a single allocation. Turned out that 1. the affected code (--> Storages/) is not on a hot path (rendering the performance argument moot ...) 2. yet copying Storage objects is potentially dangerous and was previously allowed. Hence, this change - removes shared_ptr_helper and as a result all inherited create() methods, - instead, Storage objects are now created using make_shared<>() by the caller (for that to work, many constructors had to be made public), and - all Storage classes were marked as noncopyable using boost::noncopyable. In sum, we are (likely) not making things faster but the code becomes cleaner and harder to misuse.	2022-05-02 08:46:52 +02:00
Amos Bird	4a5e4274f0	base should not depend on Common	2022-04-29 10:26:35 +08:00
Alexey Milovidov	242919eddd	Remove abbreviation	2022-04-18 01:02:49 +02:00
Alexander Tokmakov	da00beaf7f	Merge branch 'master' into mvcc_prototype	2022-04-05 11:14:42 +02:00
Alexey Milovidov	4d6c030d23	Revert "clang-tidy report issues with Medium priority"	2022-04-04 23:41:42 +03:00
Alexander Tokmakov	287d858fda	Merge branch 'master' into mvcc_prototype	2022-03-29 16:24:12 +02:00
Maksim Kita	a1a4552740	Merge pull request #35184 from DevTeamBK/clang-tidy-issues clang-tidy report issues with Medium priority	2022-03-29 13:19:54 +02:00
Alexander Tokmakov	07d952b728	use snapshots for semistructured data, durability fixes	2022-03-17 18:26:18 +01:00
Anton Popov	063917786e	minor fixes	2022-03-14 17:29:18 +00:00
Anton Popov	36ec379aeb	Merge remote-tracking branch 'upstream/master' into HEAD	2022-03-14 16:28:35 +00:00
Rajkumar	3d3b6d1956	clang-tidy report issues with Medium priority	2022-03-10 07:23:49 -08:00
Azat Khuzhin	c4b6342853	Improvements for `parallel_distributed_insert_select` (and related) (#34728) * Add a warning if parallel_distributed_insert_select was ignored Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Respect max_distributed_depth for parallel_distributed_insert_select Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Print warning for non applied parallel_distributed_insert_select only for initial query Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Remove Cluster::getHashOfAddresses() Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Forbid parallel_distributed_insert_select for remote()/cluster() with different addresses Before it uses empty cluster name (getClusterName()) which is not correct, compare all addresses instead. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Fix max_distributed_depth check max_distributed_depth=1 must mean not more then one distributed query, not two, since max_distributed_depth=0 means no limit, and distribute_depth is 0 for the first query. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Fix INSERT INTO remote()/cluster() with parallel_distributed_insert_select Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Add a test for parallel_distributed_insert_select with cluster()/remote() Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Return <remote> instead of empty cluster name in Distributed engine Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com> * Make user with sharding_key and w/o in remote()/cluster() identical Before with sharding_key the user was "default", while w/o it it was empty. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-03-08 15:24:39 +01:00
Anton Popov	2758db5341	add more comments	2022-03-01 19:32:55 +03:00
Anton Popov	a661eaf39f	better performance of getting storage snapshot	2022-02-16 02:17:22 +03:00
Anton Popov	e8ce091e68	Merge remote-tracking branch 'upstream/master' into HEAD	2022-01-21 20:11:18 +03:00
Anton Popov	7c6f7f6732	support 'optimize_move_to_prewhere' with storage 'Merge'	2021-12-29 20:49:10 +03:00
Anton Popov	6f4d9a53b2	Merge remote-tracking branch 'origin/sparse-serialization' into HEAD	2021-12-01 15:54:33 +03:00
Raúl Marín	b2cfa70541	Reduce dependencies on ASTFunction.h 481 -> 230	2021-11-26 18:21:54 +01:00
Anton Popov	a20922b2d3	Merge remote-tracking branch 'origin/sparse-serialization' into HEAD	2021-11-09 15:36:25 +03:00

147 Commits