ClickHouse

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-12-16 19:32:07 +00:00

Author	SHA1	Message	Date
Azat Khuzhin	45ee650e26	Distributed: check for bytes_to_throw/delay_insert only before INSERT. Before, it was checked for each block.	2021-03-03 23:30:24 +03:00
Azat Khuzhin	6965ac26c3	Distributed: Add ability to delay/throttle INSERT until pending data is reduced. Add two new settings for the Distributed engine: - bytes_to_delay_insert - max_delay_to_insert. If at the beginning of an INSERT there is too much pending data (more than bytes_to_delay_insert), the INSERT waits until it shrinks, but no longer than max_delay_to_insert seconds. If there is still too much pending data after that, it throws an exception. New profile events were also added (by analogy with MergeTree): - DistributedDelayedInserts (although you can use system.errors instead of this, but still) - DistributedRejectedInserts - DistributedDelayedInsertsMilliseconds	2021-03-03 23:30:23 +03:00
Azat Khuzhin	fcf49a4914	Distributed: Calculate counters for async INSERT at INSERT time. The previous patch fixes the inaccuracy, but it does so by iterating over the directory on each request (to system.distribution_queue or to check bytes_to_throw_insert), and as the previous patch already stated, this may have pretty huge overhead (especially when you have lots of distributed files pending). This patch removes that recalculation (but it will still be done, and if the result differs, there will be a log message) and replaces it with proper accounting at INSERT time (and after a file has been sent or marked as broken).	2021-03-03 23:30:03 +03:00
Azat Khuzhin	b5a5778589	Distributed: Add ability to limit amount of pending bytes for async INSERT. Right now, with distributed_directory_monitor_batch_inserts=1 and insert_distributed_sync=0, an INSERT into a Distributed table stores blocks that should be sent to the remote (and, in case of prefer_localhost_replica=0, to the localhost too) on the local filesystem and sends them in the background. However, there is no limit for this storage, and if the remote is unavailable (or some other error occurs), these pending blocks may take significant space, which is not always the desired behaviour. Add a new Distributed setting, bytes_to_throw_insert, that sets the limit for how many pending bytes are allowed; if the limit is reached, an exception is thrown. It is set to 0 by default, to avoid surprises.	2021-03-03 23:30:00 +03:00
Kruglov Pavel	d94e8624d7	Merge branch 'master' into shard-id	2021-02-06 16:48:17 +03:00
feng lv	0edf65c094	fix test fix update fix spell	2021-02-02 09:22:30 +00:00
feng lv	4279c7da41	add setting insert_shard_id add test fix style fix	2021-02-02 04:26:59 +00:00
Anton Popov	c7070da85a	better abstractions in disk interface	2021-01-26 17:49:35 +03:00
Azat Khuzhin	a6631287a7	DistributedBlockOutputStream: add more comments	2021-01-17 12:50:37 +03:00
Azat Khuzhin	b725e1d131	DistributedBlockOutputStream: Remove superfluous brackets for string construction	2021-01-17 12:48:51 +03:00
Azat Khuzhin	2955e25e83	Fix inserted blocks accounting for insert_distributed_one_random_shard=1. It is tricky due to block splitting. Refs: https://github.com/ClickHouse/ClickHouse/pull/18294	2021-01-17 12:45:42 +03:00
Azat Khuzhin	858f07c796	Update comment for query AST cloning during insert into multiple local shards. Refs: https://github.com/ClickHouse/ClickHouse/pull/18264#discussion_r558839456	2021-01-17 12:40:41 +03:00
alexey-milovidov	a5a19de878	Update DistributedBlockOutputStream.cpp	2021-01-16 13:22:25 +03:00
feng lv	9829c09720	fix	2021-01-15 15:54:35 +00:00
Azat Khuzhin	ecae6c1c60	Avoid reading the distributed batch just to read the block header. Before this patch, batched mode of the DirectoryMonitor was 2x slower than non-batched; after it, it should be more or less the same as non-batched.	2021-01-14 22:38:46 +03:00
Azat Khuzhin	819b9d7d56	Add more metadata into distributed .bin files to avoid doing the same on sending. Before this patch, StorageDistributedDirectoryMonitor read .bin files in batch mode just to calculate the number of bytes/rows. This is very inefficient; let's just store them in the header (rows/bytes).	2021-01-10 18:17:15 +03:00
Azat Khuzhin	471deab63a	Rename fsync_tmp_directory to fsync_directories for Distributed engine	2021-01-09 17:51:30 +03:00
Azat Khuzhin	2e55bd2285	Accept IDisk in DirectoryMonitor (for further fsync)	2021-01-09 16:31:42 +03:00
Azat Khuzhin	fbe5df809b	Sync other temporary directories for Distributed fsync_tmp_directories	2021-01-09 11:36:04 +03:00
Azat Khuzhin	b5ace27014	Add fsync support for Distributed engine. Two new settings (by analogy with the MergeTree family) have been added: - `fsync_after_insert` - Do fsync for every insert. Will decrease performance of inserts. - `fsync_tmp_directory` - Do fsync for temporary directory (that is used for async INSERT only) after all part operations (writes, renames, etc.). Refs: #17380 (p1)	2021-01-09 11:31:32 +03:00
alexey-milovidov	417e685830	Merge pull request #18775 from ClickHouse/dont-insert-empty-blocks-distributed-sync Do not insert empty blocks on sync Distributed INSERT	2021-01-06 20:06:35 +03:00
Alexey Milovidov	1572d2122b	Respect network_compression_method in async INSERT into Distributed table	2021-01-06 03:41:34 +03:00
Alexey Milovidov	190402b7d5	Do not insert empty blocks on sync Distributed INSERT	2021-01-06 02:54:22 +03:00
Amos Bird	6fc225e676	Distributed insertion to one random shard (#18294 ) * Distributed insertion to one random shard * add some tests * add some documentation * Respect shards' weights * fine locking Co-authored-by: Ivan Lezhankin <ilezhankin@yandex-team.ru>	2020-12-23 19:04:05 +03:00
Azat Khuzhin	5b3ab48861	More forward declaration for generic headers. The following headers are pretty generic, so use forward declaration as much as possible: - Context.h - Settings.h - ConnectionTimeouts.h (this also revealed some missing includes, which have been fixed). And split ConnectionTimeouts.h into ConnectionTimeoutsContext.h (since module part cannot be added for it, due to recursive build dependencies that will be introduced). Also remove Settings from the RemoteBlockInputStream/RemoteQueryExecutor and just pass the context, since settings was passed only in specific places, that can allow making a copy of Context (i.e. Copier). Approx results (How much units will be recompiled after changing file X?): - ConnectionTimeouts.h - mainline: 100 - Context.h: - mainline: ~800 - patched: 415 - Settings.h: - mainline: 900-1K - patched: 440 (most of them because of the Context.h)	2020-12-12 17:43:10 +03:00
Alexander Tokmakov	bfbf150c67	fix segfault when 'not enough space'	2020-12-02 17:49:43 +03:00
Alexander Tokmakov	b94cc5c4e5	remove more stringstreams	2020-11-10 21:22:26 +03:00
alexey-milovidov	f39457bc77	Merge pull request #16788 from azat/fix-use_compact_format_in_distributed_parts_names Apply use_compact_format_in_distributed_parts_names for each INSERT (with internal_replication)	2020-11-08 23:23:10 +03:00
Azat Khuzhin	04db0834bf	Apply use_compact_format_in_distributed_parts_names for each INSERT (with internal_replication). Before this patch, use_compact_format_in_distributed_parts_names was applied only from the default profile (at server start) for internal_replication=1, and was ignored on INSERT.	2020-11-08 03:05:52 +03:00
Alexey Milovidov	fd84d16387	Fix "server failed to start" error	2020-11-07 03:14:53 +03:00
Alexander Kuzmenkov	fb64cf210a	straighten the protocol version	2020-09-17 17:37:29 +03:00
Vitaly Baranov	56665a15f7	Rework and rename the template class SettingsCollection => BaseSettings.	2020-07-31 20:54:18 +03:00
Vladimir Chebotarev	faedb04722	Minor fixes.	2020-07-28 19:45:46 +03:00
Vladimir Chebotarev	1b3f5c99f5	Real fix of test.	2020-07-26 21:27:36 +03:00
Alexey Milovidov	73a5c38398	Fix potential overflow in integer division #12119	2020-07-05 03:29:03 +03:00
alesapin	1ddeb3d149	Buildable getSampleBlock in StorageInMemoryMetadata	2020-06-16 18:51:29 +03:00
Azat Khuzhin	d2383f0f5d	Fix async INSERT into Distributed for prefer_localhost_replica=0 and w/o internal_replication	2020-06-08 21:58:56 +03:00
Alexey Milovidov	25f941020b	Remove namespace pollution	2020-05-31 00:57:37 +03:00
Alexey Milovidov	7e1813825b	Return old names of macros	2020-05-24 01:24:01 +03:00
Alexey Milovidov	ce0619dabf	Progress on task	2020-05-24 00:26:45 +03:00
Alexey Milovidov	2d7d5a1547	Apply all transformations again	2020-05-24 00:16:27 +03:00
Alexey Milovidov	bab24879e9	Progress on task	2020-05-24 00:16:05 +03:00
Alexey Milovidov	eacff92d0e	Progress on task	2020-05-23 22:35:08 +03:00
Alexey Milovidov	a2ad11897f	Remove duplicate whitespaces (preparation)	2020-05-23 21:53:58 +03:00
Alexey Milovidov	1f13515a65	Make all LOG in single line (preparation)	2020-05-23 21:31:37 +03:00
Alexey Milovidov	8042e5febe	find {base,src,programs} -name '*.h' -or -name '*.cpp' | xargs grep -l -P 'LOG_\w+\([^,]+, "[^"]+" << [^<]+ << "[^"]+" << [^<]+\);' | xargs sed -i -r -e 's/(LOG_\w+)\(([^,]+), "([^"]+)" << ([^<]+) << "([^"]+)" << ([^<]+)\);/\1_FORMATTED(\2, "\3{}\5{}", \4, \6);/'	2020-05-23 19:58:15 +03:00
Azat Khuzhin	d93b9a57f6	Forward declaration for Context as much as possible. Now after changing Context.h 488 modules will be recompiled instead of 582.	2020-05-21 01:53:18 +03:00
alexey-milovidov	7cf3538840	Merge pull request #10270 from ClickHouse/quota-key-in-client Support quota_key for Native client	2020-05-17 14:09:40 +03:00
alexey-milovidov	7ee35f102d	Merge pull request #10867 from azat/dist-INSERT-load-balancing Respect prefer_localhost_replica/load_balancing on INSERT into Distributed	2020-05-17 11:11:35 +03:00
Alexey Milovidov	397859ccb8	Fix error	2020-05-17 08:45:20 +03:00
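
The checks described in the bytes_to_throw_insert and bytes_to_delay_insert commits above can be illustrated with a small sketch. This is not the ClickHouse implementation (which is C++), only a Python model of the behaviour the commit messages describe. The function name, the exception class, the polling loop, and the poll_interval, sleep, and clock parameters are hypothetical; treating 0 as "limit disabled" and using a strict "greater than" comparison are assumptions as well.

```python
import time
from typing import Callable


class TooManyPendingBytes(Exception):
    """Hypothetical exception: pending data stayed above a configured limit."""


def check_before_insert(
    pending_bytes: Callable[[], int],
    *,
    max_delay_to_insert: float,
    bytes_to_throw_insert: int = 0,
    bytes_to_delay_insert: int = 0,
    poll_interval: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> float:
    """Run once at the start of an INSERT; return the seconds spent waiting.

    Assumption: a limit of 0 means the corresponding check is disabled.
    """
    # Hard limit: reject immediately when too much data is pending.
    if bytes_to_throw_insert > 0 and pending_bytes() > bytes_to_throw_insert:
        raise TooManyPendingBytes("pending bytes exceed bytes_to_throw_insert")

    # Soft limit disabled or not exceeded: no delay.
    if bytes_to_delay_insert <= 0 or pending_bytes() <= bytes_to_delay_insert:
        return 0.0

    # Soft limit exceeded: wait for the backlog to shrink,
    # but never longer than max_delay_to_insert seconds.
    start = clock()
    while pending_bytes() > bytes_to_delay_insert:
        elapsed = clock() - start
        if elapsed >= max_delay_to_insert:
            raise TooManyPendingBytes(
                "pending bytes still exceed bytes_to_delay_insert after waiting"
            )
        sleep(min(poll_interval, max_delay_to_insert - elapsed))
    return clock() - start
```

Passing sleep and clock as parameters is only a convenience so the sketch can be exercised without real waiting; in this model the returned delay would correspond to what DistributedDelayedInsertsMilliseconds counts, and a raised exception to a rejected insert.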

68 Commits