Commit Graph

115451 Commits

Author SHA1 Message Date
Alexander Tokmakov
55fc4adf05
Update 02441_alter_delete_and_drop_column.sql 2023-05-19 16:42:15 +03:00
Antonio Andelic
7f60af11cb
Merge branch 'master' into write-buffer-from-s3 2023-05-19 15:17:05 +02:00
robot-clickhouse
f947d1cc0a Automatic style fix 2023-05-19 13:00:26 +00:00
Antonio Andelic
3107070e76 Avoid deadlock when starting table in attach thread 2023-05-19 12:48:19 +00:00
Dmitry Novik
d705e5102b
Merge pull request #49838 from ClickHouse/group-by-constant-fix
Analyzer: do not optimize GROUP BY keys with ROLLUP and CUBE
2023-05-19 14:27:34 +02:00
Sergei Trifonov
5db5f6e44b
Merge branch 'master' into fix-throttlers 2023-05-19 14:08:36 +02:00
Alexey Milovidov
ab162756ba
Merge branch 'master' into dot_product 2023-05-19 14:46:53 +03:00
Antonio Andelic
9c3b17fa18
Remove whitespace 2023-05-19 13:00:51 +02:00
alesapin
6676285f02
Merge pull request #49921 from azat/tests/fix-test_distributed_load_balancing
Fix flakiness of test_distributed_load_balancing test
2023-05-19 13:00:38 +02:00
alesapin
632ab8a3d1
Merge pull request #49996 from ClickHouse/az
Fix test_insert_same_partition_and_merge failing if one Azure request attempt fails
2023-05-19 12:58:47 +02:00
Antonio Andelic
e46476dba2
Update src/Coordination/Changelog.cpp
Co-authored-by: alesapin <alesapin@clickhouse.com>
2023-05-19 12:44:20 +02:00
Alexey Milovidov
f5506210d6 Geo types are production ready 2023-05-19 12:43:55 +02:00
alesapin
2398de9d2f
Merge pull request #49473 from ClickHouse/fix_another_zero_copy_bug
Fix another zero copy bug
2023-05-19 12:41:36 +02:00
alesapin
e741450b88
Merge branch 'master' into fix_another_zero_copy_bug 2023-05-19 12:40:48 +02:00
alesapin
c46a5c27d0
Merge pull request #49889 from ClickHouse/fix_some_tests4
Fix some tests
2023-05-19 12:40:34 +02:00
alesapin
e5b001abda
Merge branch 'master' into fix_some_tests4 2023-05-19 12:34:03 +02:00
kssenii
d6df009842 Fix 2023-05-19 12:22:46 +02:00
Antonio Andelic
6e468b29e8 Check return value of ftruncate 2023-05-19 10:15:06 +00:00
Alexey Milovidov
c1b96fd2f5
Merge branch 'master' into Fix_flaky_test_ssl_cert_authentication 2023-05-19 12:37:55 +03:00
Alexey Milovidov
70c83f5133
Merge pull request #49991 from amosbird/clickhouse_as_library
Use PROJECT_*_DIR instead of CMAKE_*_DIR.
2023-05-19 12:37:18 +03:00
Alexey Milovidov
4dbe5b8329 Support them in tests 2023-05-19 11:13:28 +02:00
Alexey Milovidov
193c82a09a
Merge pull request #49993 from den-crane/test/issue_46128
test for #46128
2023-05-19 11:43:13 +03:00
Alexey Milovidov
d234aebfc3
Merge pull request #49992 from azat/build/fix-woboq
Fix woboq codebrowser build with -Wno-poison-system-directories
2023-05-19 11:38:36 +03:00
Alexey Milovidov
f47375d16c Support Tableau 2023-05-19 10:28:13 +02:00
Victor Krasnov
4efb2037a2 Fix the premature amendment of the 00921 test 2023-05-19 06:22:21 +00:00
Azat Khuzhin
dc353faf44 Simplify obtaining query shard in test_distributed_load_balancing
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:58 +02:00
Azat Khuzhin
e37e8f83bb Fix flakiness of test_distributed_load_balancing
I saw the following in the logs for the failed test:

    2023.05.16 07:12:12.894051 [ 262 ] {74575ac0-b296-4fdc-bc8e-3476a305e6ea} <Warning> ConnectionPoolWithFailover: Connection failed at try №1, reason: Timeout exceeded while reading from socket (socket (172.16.3.2:9000), receive timeout 2000 ms)

The likely culprit is test_distributed_replica_max_ignored_errors, for which
such timeouts are normal; however, they should not affect other tests.

So fix this by calling SYSTEM RELOAD CONFIG, which should reset the error
count.

CI: https://s3.amazonaws.com/clickhouse-test-reports/49380/5abc1a1c68ee204c9024493be1d19835cf5630f7/integration_tests__release__[3_4].html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:58 +02:00
Azat Khuzhin
e1e2a83a9e Print type of the structure that will be used for HASHED/SPARSE_HASHED
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
2b240d3721 Improve documentation for HASHED/SPARSE_HASHED/COMPLEX_KEY_HASHED/COMPLEX_KEY_SPARSE_HASHED
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
f8e7d2cb1f Remove part of the HashTableGrowerWithPrecalculationAndMaxLoadFactor comment
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
c9cde110cd Add initial degree as parameter for HashTableGrowerWithPrecalculationAndMaxLoadFactor
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
01bf041cca Rewrite HashTableGrower{,WithPrecalculation}::set w/o ternary operators
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
634f168a74 Introduce max_size_degree for HashTableGrower{,WithPrecalculation}
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
42eac6bfbc Wrap implementation helpers into HashedDictionaryImpl namespace
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
6f351851ad Rename grower to HashTableGrowerWithPrecalculationAndMaxLoadFactor
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
1ab130132c Add more comments into HashedDictionaryCollectionType.h
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
7eba6def94 Add a comment for HashTableGrowerWithPrecalculation about load factor
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
422cbe08fe Do not use PackedHashMap for non-POD for the purposes of layout
In clang-16 the behaviour for POD types was changed in [1], and this no
longer allows us to use PackedHashMap for some types.

  [1]: 277123376c

Note: I tried to come up with a more generic solution than enumerating the
types, but failed. Though now I think this is fine, since it makes explicit
which types are not allowed for PackedHashMap.

Another option is to use -fclang-abi-compat=13.0, but I doubt that is a
good idea.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
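A minimal sketch of the kind of compile-time restriction this implies (illustrative only, not the actual ClickHouse code; the trait and selector names are made up):

    #include <cstdint>
    #include <type_traits>

    // Hypothetical trait: only allow the packed layout for types whose packed
    // representation is safe, i.e. trivially copyable standard-layout types.
    template <typename T>
    inline constexpr bool is_packed_map_safe_v =
        std::is_trivially_copyable_v<T> && std::is_standard_layout_v<T>;

    // Hypothetical selector: fall back to the regular (padded) map otherwise.
    template <typename Key, typename Value>
    struct LayoutSelector
    {
        static constexpr bool use_packed =
            is_packed_map_safe_v<Key> && is_packed_map_safe_v<Value>;
    };

    static_assert(LayoutSelector<std::uint64_t, std::uint16_t>::use_packed,
                  "a plain POD pair can use the packed layout");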
Azat Khuzhin
fc19e79f50 Change coding style of declaring packed attribute in PackedHashMap
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
65dd87d0da Fix "reference binding to misaligned address" in PackedHashMap
Use separate helpers that accept/return values instead of references;
PackedHashMap is intended for small structures anyway.

v0: fix for keys
v2: fix for values
v3: fix bitEquals
v4: fix for iterating over HashMap
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
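A self-contained illustration of the by-value approach (hypothetical names, not the real PackedHashMap code): UBSan complains when a reference is bound to an unaligned member of a packed struct, so the accessors copy values instead.

    #include <cstdint>
    #include <cstdio>

    // Packed cell: members are unaligned, so handing out references to them
    // would trigger "reference binding to misaligned address" under UBSan.
    struct __attribute__((packed)) PackedCell
    {
        std::uint64_t key;
        std::uint16_t value;

        // By-value accessors copy the bytes out of the packed storage.
        std::uint64_t getKey() const { return key; }
        std::uint16_t getValue() const { return value; }
        void setValue(std::uint16_t v) { value = v; }
    };

    int main()
    {
        PackedCell cell{42, 7};
        cell.setValue(8);
        std::printf("%llu -> %u, sizeof = %zu\n",
                    static_cast<unsigned long long>(cell.getKey()),
                    static_cast<unsigned>(cell.getValue()),
                    sizeof(PackedCell)); // prints: 42 -> 8, sizeof = 10
        return 0;
    }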
Azat Khuzhin
7c8d8eeb56 Use Cell::setMapped() over separate helper insertSetMapped()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
2996b38606 Add ability to configure maximum load factor for the HASHED/SPARSE_HASHED layout
As it turns out, HashMap/PackedHashMap works great even with a max load
factor of 0.99. By "great" I mean that it is at least faster than google
sparsehash, not to mention its friendliness to the memory allocator (it has
zero fragmentation since it works with a contiguous memory region, in
contrast to sparsehash, which does lots of reallocs that jemalloc does not
like because of its slabs).

Here is a table of different setups:

settings                         | load (sec) | read (sec) | read (million rows/s) | bytes_allocated | RSS
-                                | -          | -          | -                     | -               | -
HASHED upstream                  | -          | -          | -                     | -               | 35GiB
SPARSE_HASHED upstream           | -          | -          | -                     | -               | 26GiB
-                                | -          | -          | -                     | -               | -
sparse_hash_map glibc hashbench  | -          | -          | -                     | -               | 17.5GiB
sparse_hash_map packed allocator | 101.878    | 231.48     | 4.32                  | -               | 17.7GiB
PackedHashMap 0.5                | 15.514     | 42.35      | 23.61                 | 20GiB           | 22GiB
hashed 0.95                      | 34.903     | 115.615    | 8.65                  | 16GiB           | 18.7GiB
**PackedHashMap 0.95**           | **19.883** | **93.6**   | **10.68**             | **10GiB**       | **12.8GiB**
PackedHashMap 0.99               | 26.113     | 83.6       | 11.96                 | 10GiB           | 12.3GiB

As the table shows, PackedHashMap with a 0.95 max_load_factor uses 2.6x less
memory than upstream SPARSE_HASHED, and it is also 2x faster for reads!

v2: fix grower
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
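A rough sketch of what a grower with a configurable max load factor can look like (illustrative only; the real HashTableGrowerWithPrecalculationAndMaxLoadFactor in ClickHouse is more involved):

    #include <cstddef>

    // Illustrative grower: the table doubles once the fill ratio exceeds the
    // configured max load factor, e.g. 0.5 or 0.95 / 0.99.
    class GrowerWithMaxLoadFactor
    {
        std::size_t size_degree;   // table capacity is 2^size_degree
        double max_load_factor;    // in (0, 1)

    public:
        GrowerWithMaxLoadFactor(std::size_t initial_degree, double max_load_factor_)
            : size_degree(initial_degree), max_load_factor(max_load_factor_) {}

        std::size_t bufSize() const { return std::size_t{1} << size_degree; }

        // True when the element count pushes the fill ratio over the limit.
        bool overflow(std::size_t elems) const
        {
            return static_cast<double>(elems)
                > static_cast<double>(bufSize()) * max_load_factor;
        }

        void increaseSize() { ++size_degree; }
    };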
Azat Khuzhin
3698302ddb Accept float values for dictionary layouts configurations
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
8c6d691f52 Use HashTable constructor in HashSet
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
fb6f7631c2 Add ability to pass grower for HashTable during creation
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
7b5d156cc5 Optimize SPARSE_HASHED layout (by using PackedHashMap)
If you want a dictionary optimized for memory, SPARSE_HASHED does not always
give you what you need.

Consider the example of <UInt64, UInt16> as <Key, Value>: this pair also
carries 6 bytes of padding (on amd64), which is almost 40% of wasted space.

Because of this padding, even google::sparse_hash_map does not improve the
picture much; in fact, sparse_hash_map is not very friendly to memory
allocators (especially jemalloc).

Here are some numbers for a dictionary with 1e9 elements, UInt64 as key
and UInt16 as value:

settings                         | load (sec) | read (sec) | read (million rows/s) | bytes_allocated | RSS
HASHED upstream                  | -          | -          | -                     | -               | 35GiB
SPARSE_HASHED upstream           | -          | -          | -                     | -               | 26GiB
-                                | -          | -          | -                     | -               | -
sparse_hash_map glibc hashbench  | -          | -          | -                     | -               | 17.5GiB
sparse_hash_map packed allocator | 101.878    | 231.48     | 4.32                  | -               | 17.7GiB
PackedHashMap                    | 15.514     | 42.35      | 23.61                 | 20GiB           | 22GiB

As you can see, PackedHashMap looks much better than HASHED, and even
better than SPARSE_HASHED, though slightly worse than sparse_hash_map with
a packed allocator (the latter uses a custom patch to google
sparse_hash_map).

v2: rebase on top of bucket_count fix
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
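A small self-contained illustration of the padding discussed above (assuming amd64; PackedPair here is a stand-in, not the actual PackedHashMap cell type):

    #include <cstdint>
    #include <cstdio>
    #include <utility>

    // A <UInt64, UInt16> pair is padded to 16 bytes; a packed cell needs 10,
    // which is where the ~40% space saving comes from.
    struct __attribute__((packed)) PackedPair
    {
        std::uint64_t key;
        std::uint16_t value;
    };

    int main()
    {
        std::printf("std::pair<uint64_t, uint16_t>: %zu bytes\n",
                    sizeof(std::pair<std::uint64_t, std::uint16_t>)); // 16 on amd64
        std::printf("PackedPair (no padding):       %zu bytes\n",
                    sizeof(PackedPair));                              // 10
        return 0;
    }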
Azat Khuzhin
b44497fd4c Introduce PackedHashMap (HashMap with structure without padding)
If you have a HashMap with <UInt64, UInt16> as <Key, Value>, the 38%
overhead can be crucial, especially if you have tons of keys.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Azat Khuzhin
c4f23e87f1 Export grower_type in HashTable
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-05-19 06:07:21 +02:00
Michael Kolupaev
e84f0895e7 Support hardlinking parts transactionally 2023-05-18 21:05:56 -07:00
Yakov Olkhovskiy
a2c3de5082
Merge pull request #49933 from ClickHouse/fix-ipv6-proto-serialization
Fix IPv6 encoding in protobuf
2023-05-18 23:02:15 -04:00