ClickHouse

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-11-24 00:22:29 +00:00

Author	SHA1	Message	Date
Alexey Milovidov	193c82a09a	Merge pull request #49993 from den-crane/test/issue_46128 test for #46128	2023-05-19 11:43:13 +03:00
Alexey Milovidov	d234aebfc3	Merge pull request #49992 from azat/build/fix-woboq Fix woboq codebrowser build with -Wno-poison-system-directories	2023-05-19 11:38:36 +03:00
Alexey Milovidov	f47375d16c	Support Tableau	2023-05-19 10:28:13 +02:00
Azat Khuzhin	dc353faf44	Simplify obtaining query shard in test_distributed_load_balancing Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:58 +02:00
Azat Khuzhin	e37e8f83bb	Fix flakiness of test_distributed_load_balancing I saw the following in the logs for the failed test: 2023.05.16 07:12:12.894051 [ 262 ] {74575ac0-b296-4fdc-bc8e-3476a305e6ea} <Warning> ConnectionPoolWithFailover: Connection failed at try №1, reason: Timeout exceeded while reading from socket (socket (172.16.3.2:9000), receive timeout 2000 ms) And I think that the culprit is the test_distributed_replica_max_ignored_errors for which it is normal, however not for others, and this should not affect other tests. So fix this by calling SYSTEM RELOAD CONFIG, which should reset error count. CI: https://s3.amazonaws.com/clickhouse-test-reports/49380/5abc1a1c68ee204c9024493be1d19835cf5630f7/integration_tests__release__[3_4].html Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:58 +02:00
Azat Khuzhin	e1e2a83a9e	Print type of the structure that will be used for HASHED/SPARSE_HASHED Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	2b240d3721	Improve documentation for HASHED/SPARSE_HASHED/COMPLEX_KEY_HASHED/COMPLEX_KEY_SPARSE_HASHED Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	f8e7d2cb1f	Remove part of the HashTableGrowerWithPrecalculationAndMaxLoadFactor comment Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	c9cde110cd	Add initial degree as parameter for HashTableGrowerWithPrecalculationAndMaxLoadFactor Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	01bf041cca	Rewrite HashTableGrower{,WithPrecalculation}::set w/o ternary operators Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	634f168a74	Introduce max_size_degree for HashTableGrower{,WithPrecalculation} Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	42eac6bfbc	Wrap implementation helpers into HashedDictionaryImpl namespace Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	6f351851ad	Rename grower to HashTableGrowerWithPrecalculationAndMaxLoadFactor Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	1ab130132c	Add more comments into HashedDictionaryCollectionType.h Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	7eba6def94	Add a comment for HashTableGrowerWithPrecalculation about load factor Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	422cbe08fe	Do not use PackedHashMap for non-POD for the purposes of layout In clang-16 the behaviour for POD types had been changed in [1], this does not allows us to use PackedHashMap for some types. [1]: `277123376c` Note, that I tried to come up with a more generic solution then enumeratic types, but failed. Though now I think that this is good, since this shows which types are not allowed for PackedHashMap Another option is to use -fclang-abi-compat=13.0 but I doubt it is a good idea. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	fc19e79f50	Change coding style of declaring packed attribute in PackedHashMap Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	65dd87d0da	Fix "reference binding to misaligned address" in PackedHashMap Use separate helpers that accept/return values, instead of reference, anyway PackedHashMap is developed for small structure. v0: fix for keys v2: fix for values v3: fix bitEquals v4: fix for iterating over HashMap Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	7c8d8eeb56	Use Cell::setMapped() over separate helper insertSetMapped() Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	2996b38606	Add ability to configure maximum load factor for the HASHED/SPARSE_HASHED layout As it turns out, HashMap/PackedHashMap works great even with max load factor of 0.99. By "great" I mean it least it works faster then google sparsehash, and not to mention it's friendliness to the memory allocator (it has zero fragmentation since it works with a continuious memory region, in comparison to the sparsehash that doing lots of realloc, which jemalloc does not like, due to it's slabs). Here is a table of different setups: settings \| load (sec) \| read (sec) \| read (million rows/s) \| bytes_allocated \| RSS - \| - \| - \| - \| - \| - HASHED upstream \| - \| - \| - \| - \| 35GiB SPARSE_HASHED upstream \| - \| - \| - \| - \| 26GiB - \| - \| - \| - \| - \| - sparse_hash_map glibc hashbench \| - \| - \| - \| - \| 17.5GiB sparse_hash_map packed allocator \| 101.878 \| 231.48 \| 4.32 \| - \| 17.7GiB PackedHashMap 0.5 \| 15.514 \| 42.35 \| 23.61 \| 20GiB \| 22GiB hashed 0.95 \| 34.903 \| 115.615 \| 8.65 \| 16GiB \| 18.7GiB PackedHashMap 0.95 \| 19.883 \| 93.6 \| 10.68 \| 10GiB \| 12.8GiB PackedHashMap 0.99 \| 26.113 \| 83.6 \| 11.96 \| 10GiB \| 12.3GiB As it shows, PackedHashMap with 0.95 max_load_factor, eats 2.6x less memory then SPARSE_HASHED in upstream, and it also 2x faster for read! v2: fix grower Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	3698302ddb	Accept float values for dictionary layouts configurations Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	8c6d691f52	Use HashTable constructor in HashSet Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	fb6f7631c2	Add ability to pass grower for HashTable during creation Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	7b5d156cc5	Optimize SPARSE_HASHED layout (by using PackedHashMap) In case you want dictionary optimized for memory, SPARSE_HASHED is not always gives you what you need. Consider the following example <UInt64, UInt16> as <Key, Value>, but this pair will also have a 6 byte padding (on amd64), so this is almost 40% of space wastage. And because of this padding, even google::sparse_hash_map, does not make picture better, in fact, sparse_hash_map is not very friendly to memory allocators (especially jemalloc). Here are some numbers for dictionary with 1e9 elements and UInt64 as key, and UInt16 as value: settings \| load (sec) \| read (sec) \| read (million rows/s) \| bytes_allocated \| RSS HASHED upstream \| - \| - \| - \| - \| 35GiB SPARSE_HASHED upstream \| - \| - \| - \| - \| 26GiB - \| - \| - \| - \| - \| - sparse_hash_map glibc hashbench \| - \| - \| - \| - \| 17.5GiB sparse_hash_map packed allocator \| 101.878 \| 231.48 \| 4.32 \| - \| 17.7GiB PackedHashMap \| 15.514 \| 42.35 \| 23.61 \| 20GiB \| 22GiB As you can see PackedHashMap looks way more better then HASHED, and even better then SPARSE_HASHED, but slightly worse then sparse_hash_map with packed allocator (it is done with a custom patch to google sparse_hash_map). v2: rebase on top of bucket_count fix Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	b44497fd4c	Introduce PackedHashMap (HashMap with structure without padding) In case of you have HashMap with <UInt64, UInt16> as <Key, Value> the overhead of 38% can be crutial, especially if you have tons of keys. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Azat Khuzhin	c4f23e87f1	Export grower_type in HashTable Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-19 06:07:21 +02:00
Michael Kolupaev	e84f0895e7	Support hardlinking parts transactionally	2023-05-18 21:05:56 -07:00
Yakov Olkhovskiy	a2c3de5082	Merge pull request #49933 from ClickHouse/fix-ipv6-proto-serialization Fix IPv6 encoding in protobuf	2023-05-18 23:02:15 -04:00
Dmitry Novik	aea71cf1bb	Merge branch 'master' into group-by-constant-fix	2023-05-19 01:29:56 +02:00
Michael Kolupaev	8dc59c1efe	Fix test_insert_same_partition_and_merge failing if one Azure request attempt fails	2023-05-18 21:40:24 +00:00
Denny Crane	e7b6056bbb	test for #46128	2023-05-18 15:18:55 -03:00
Azat Khuzhin	0f7a310a67	Fix woboq codebrowser build with -Wno-poison-system-directories woboq codebrowser uses clang tooling, which adds clang system includes (in Linux::AddClangSystemIncludeArgs()), because none of (-nostdinc, -nobuiltininc) is set. And later it will complain with -Wpoison-system-directories for added by itself includes in InitHeaderSearch::AddUnmappedPath(), because they are starts from one of the following: - /usr/include - /usr/local/include The interesting thing here is that it got broken only after upgrading to llvm 16 (in #49678), and the reason for this is that clang 15 build has system includes that does not trigger the warning - "/usr/lib/clang/15.0.7/include", while clang 16 has "/usr/include/clang/16.0.4/include" So let's simply disable this warning, but only for woboq. Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-18 18:26:05 +02:00
Azat Khuzhin	73661c3a46	Move tunnings for woboq codebrowser to cmake out from build.sh Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-05-18 18:18:30 +02:00
Amos Bird	6b4dcbd3ed	Use PROJECT_*_DIR instead of CMAKE_*_DIR.	2023-05-18 23:23:39 +08:00
Yakov Olkhovskiy	30083351f5	test fix	2023-05-18 14:42:48 +00:00
Denny Crane	94fe224935	Update partition.md	2023-05-18 10:06:59 -03:00
Sergei Trifonov	f98c337d2f	Fix stack-use-after-scope in resource manager test (#49908) * Fix stack-use-after-scope in resource manager test * fix	2023-05-18 14:53:46 +02:00
Kseniia Sumarokova	dd5ee930eb	Merge pull request #49914 from kssenii/fix-assertion-in-do-cleanup Fix assertion in CacheMetadata::doCleanup	2023-05-18 12:22:49 +02:00
Kseniia Sumarokova	adebac1a92	Merge branch 'master' into fix-assertion-in-do-cleanup	2023-05-18 12:22:02 +02:00
robot-ch-test-poll2	a0ef0955da	Merge pull request #49983 from imbingo123/imbingo123-patch-modify_docs Update grant.md	2023-05-18 10:39:49 +02:00
libin	d294ecbc16	Update grant.md docs: Modifying grant example	2023-05-18 15:50:19 +08:00
FFFFFFFHHHHHHH	d31371adac	Merge branch 'master' into dot_product	2023-05-18 15:31:25 +08:00
Alexey Milovidov	86e14547d4	Merge pull request #49964 from ClickHouse/kssenii-patch-7 Follow up to #49429	2023-05-18 09:20:00 +03:00
Alexey Milovidov	5065049154	Merge pull request #49971 from azat/revert-48593-group_array_nullable [RFC] Revert "`groupArray` returns cannot be nullable"	2023-05-18 09:17:42 +03:00
Rich Raposa	03b5bfe218	Merge pull request #49968 from ClickHouse/reddit Add Reddit comments to datasets	2023-05-17 15:26:29 -06:00
Kseniia Sumarokova	855c95f626	Update src/Interpreters/Cache/Metadata.cpp Co-authored-by: Igor Nikonov <954088+devcrafter@users.noreply.github.com>	2023-05-17 22:46:09 +02:00
Yakov Olkhovskiy	612b79868b	test added	2023-05-17 20:40:51 +00:00
Azat Khuzhin	e2e3a03dbe	Revert "`groupArray` returns cannot be nullable"	2023-05-17 22:33:30 +02:00
rfraposa	6a136897e3	Create reddit-comments.md	2023-05-17 13:23:53 -06:00
DanRoscigno	a1fc96953f	reorder	2023-05-17 14:48:16 -04:00
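
The padding figures quoted in the PackedHashMap commits above (b44497fd4c, 7b5d156cc5) can be checked with a small sketch. This is not ClickHouse code: it uses Python's ctypes to mimic the C layout of a <UInt64, UInt16> pair, and the names PaddedCell, PackedCell and padding_overhead are hypothetical. The padded size depends on the platform; the numbers in the commit messages assume amd64, where 64-bit integers are 8-byte aligned.

```python
import ctypes


class PaddedCell(ctypes.Structure):
    # Natural C layout: the struct size is rounded up to a multiple of
    # the alignment of its widest member (platform dependent).
    _fields_ = [("key", ctypes.c_uint64), ("value", ctypes.c_uint16)]


class PackedCell(ctypes.Structure):
    # _pack_ = 1 removes the padding, similar in spirit to a packed
    # attribute on a C++ struct. Must be set before _fields_.
    _pack_ = 1
    _fields_ = [("key", ctypes.c_uint64), ("value", ctypes.c_uint16)]


def padding_overhead(padded, packed):
    """Return the fraction of the padded struct's size that is padding."""
    padded_size = ctypes.sizeof(padded)
    return (padded_size - ctypes.sizeof(packed)) / padded_size


if __name__ == "__main__":
    print("padded size:", ctypes.sizeof(PaddedCell))
    print("packed size:", ctypes.sizeof(PackedCell))
    print("overhead: {:.1%}".format(padding_overhead(PaddedCell, PackedCell)))
```

On amd64 the padded struct takes 16 bytes and the packed one 10, so 6 of every 16 bytes (37.5%) are padding. That is consistent with the "6 byte padding" and "38%" figures in the commit messages.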

115280 Commits