If you want a dictionary optimized for memory, SPARSE_HASHED does not
always give you what you need.
Consider the following example: <UInt64, UInt16> as <Key, Value>. This
pair also carries 6 bytes of padding (on amd64), which is almost 40% of
wasted space.
And because of this padding, even google::sparse_hash_map does not
improve the picture; in fact, sparse_hash_map is not very friendly to
memory allocators (especially jemalloc).
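To make the padding concrete, here is a minimal standalone C++ sketch
(illustration only, not ClickHouse code; PackedPair is a made-up name):
```
#include <cstdint>
#include <iostream>
#include <utility>

// Packed layout: no padding between the 8-byte key and the 2-byte value.
struct PackedPair
{
    uint64_t key;
    uint16_t value;
} __attribute__((packed)); // GCC/Clang extension, assumed available

int main()
{
    // 16 on amd64: 10 bytes of payload padded up to the 8-byte alignment.
    std::cout << sizeof(std::pair<uint64_t, uint16_t>) << '\n';
    // 10: only the payload; 6 wasted bytes out of 16 is 37.5%, hence "almost 40%".
    std::cout << sizeof(PackedPair) << '\n';
}
```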
Here are some numbers for a dictionary with 1e9 elements, UInt64 as
key, and UInt16 as value:
settings | load (sec) | read (sec) | read (million rows/s) | bytes_allocated | RSS
HASHED upstream | - | - | - | - | 35GiB
SPARSE_HASHED upstream | - | - | - | - | 26GiB
- | - | - | - | - | -
sparse_hash_map glibc hashbench | - | - | - | - | 17.5GiB
sparse_hash_map packed allocator | 101.878 | 231.48 | 4.32 | - | 17.7GiB
PackedHashMap | 15.514 | 42.35 | 23.61 | 20GiB | 22GiB
As you can see, PackedHashMap looks much better than HASHED, and even
better than SPARSE_HASHED, but slightly worse than sparse_hash_map with
the packed allocator (the latter is done with a custom patch to google
sparse_hash_map).
v2: rebase on top of bucket_count fix
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
If you have a HashMap with <UInt64, UInt16> as <Key, Value>, the
overhead of 38% can be crucial, especially if you have tons of keys.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* set allow_experimental_query_cache as obsolete
* add tsolodov to trusted contributors
* CI linter
---------
Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>
Also update the test, since now you could have slightly fewer sleep
intervals if the query spends some time in other places.
What is important is that query_duration_ms does not exceed the
calculated delay.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
The remote throttler had, for some reason, been overwritten by the
global one during reloads; this is likely meant for graceful reload of
the option, but it breaks per-query throttling, so remove this logic.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
When some of these settings were set for the default profile (in
users.xml/users.yml), they were always used regardless of what the user
passed.
Fix this by not inheriting per-query throttlers: they should be reset
before making the query context, and they should no longer be
initialized in Context::makeQueryContext(), since makeQueryContext() is
called too early, when user settings have not been read yet.
But that place also initialized per-server throttling; move this into
ContextSharedPart::configureServerWideThrottling(), and call it once we
have ServerSettings set.
Also note that this patch turns the following settings into server
settings:
- max_replicated_fetches_network_bandwidth_for_server
- max_replicated_sends_network_bandwidth_for_server
But this change should not affect anybody, since it is done with
compatibility in mind (i.e. if the setting is set in the user's profile
it will still be read from there as a fallback).
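For illustration, here is a hedged C++ sketch of the intended flow. The
simplified types and member names (Throttler, remote_read_throttler, the
ServerSettings field) are stand-ins and do not reproduce the actual
ClickHouse code:
```
#include <cstddef>
#include <memory>

struct Throttler
{
    explicit Throttler(size_t bytes_per_second) : limit(bytes_per_second) {}
    size_t limit;
};

struct ServerSettings
{
    size_t max_replicated_fetches_network_bandwidth_for_server = 0;
};

struct ContextSharedPart
{
    std::shared_ptr<Throttler> replicated_fetches_throttler; // server-wide, created once

    // Called once ServerSettings are known, instead of lazily from makeQueryContext(),
    // which runs before the user's settings have been read.
    void configureServerWideThrottling(const ServerSettings & server_settings)
    {
        if (server_settings.max_replicated_fetches_network_bandwidth_for_server)
            replicated_fetches_throttler = std::make_shared<Throttler>(
                server_settings.max_replicated_fetches_network_bandwidth_for_server);
    }
};

struct Context
{
    std::shared_ptr<Throttler> remote_read_throttler; // per-query

    Context makeQueryContext() const
    {
        Context query_context(*this);
        // Do not inherit per-query throttlers: reset them here so they are built
        // later from the user's own settings rather than the default profile.
        query_context.remote_read_throttler.reset();
        return query_context;
    }
};

int main()
{
    ContextSharedPart shared;
    shared.configureServerWideThrottling(ServerSettings{});

    Context global_context;
    Context query_context = global_context.makeQueryContext();
    (void)query_context;
}
```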
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
After an abnormal server restart, current_batch.txt (which contains the
list of files to send to the remote shard) may not contain all files if
the server was terminated between unlinking the .bin files and
truncating current_batch.txt.
It should be fixed in a more reliable way, but to keep the patch easy to
backport I kept it simple.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
```
create table kafka
(
a UInt32,
a_str String Alias toString(a)
) engine = Kafka;
create table data
(
a UInt32,
a_str String
) engine = MergeTree
order by tuple();
create materialized view data_mv to data
(
a UInt32,
a_str String
) as
select a, a_str from kafka;
```
The ALIAS type works as expected, in comparison with
MATERIALIZED/EPHEMERAL or a column with a default expression.
Ref: https://github.com/ClickHouse/ClickHouse/pull/47138
Co-authored-by: Azat Khuzhin <a3at.mail@gmail.com>
Fix a bug with projections and the aggregate_functions_null_for_empty
setting (for query_plan_optimize_projection). This was already fixed in
PR #42198 but got forgotten after using query_plan_optimize_projection.
This happens when the remote side disconnects due to inactivity. It
seems to work on Linux, likely due to a difference in SO_LINGER, or
maybe a different default timeout on Darwin.
Verified manually on ClickHouse Cloud using the following process:
1. Connect to instance.
2. Run `show tables`.
3. Wait 6 minutes.
4. Run `show tables`.
With this fix, EINVAL is not reported, and the client will simply
reconnect.