Use separate helpers that accept/return values, instead of reference,
anyway PackedHashMap is developed for small structure.
v0: fix for keys
v2: fix for values
v3: fix bitEquals
v4: fix for iterating over HashMap
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
As it turns out, HashMap/PackedHashMap works great even with max load
factor of 0.99. By "great" I mean it least it works faster then
google sparsehash, and not to mention it's friendliness to the memory
allocator (it has zero fragmentation since it works with a continuious
memory region, in comparison to the sparsehash that doing lots of
realloc, which jemalloc does not like, due to it's slabs).
Here is a table of different setups:
settings | load (sec) | read (sec) | read (million rows/s) | bytes_allocated | RSS
- | - | - | - | - | -
HASHED upstream | - | - | - | - | 35GiB
SPARSE_HASHED upstream | - | - | - | - | 26GiB
- | - | - | - | - | -
sparse_hash_map glibc hashbench | - | - | - | - | 17.5GiB
sparse_hash_map packed allocator | 101.878 | 231.48 | 4.32 | - | 17.7GiB
PackedHashMap 0.5 | 15.514 | 42.35 | 23.61 | 20GiB | 22GiB
hashed 0.95 | 34.903 | 115.615 | 8.65 | 16GiB | 18.7GiB
**PackedHashMap 0.95** | **93.6** | **19.883** | **10.68** | **10GiB** | **12.8GiB**
PackedHashMap 0.99 | 26.113 | 83.6 | 11.96 | 10GiB | 12.3GiB
As it shows, PackedHashMap with 0.95 max_load_factor, eats 2.6x less
memory then SPARSE_HASHED in upstream, and it also 2x faster for read!
v2: fix grower
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
In case you want dictionary optimized for memory, SPARSE_HASHED is not
always gives you what you need.
Consider the following example <UInt64, UInt16> as <Key, Value>, but
this pair will also have a 6 byte padding (on amd64), so this is almost
40% of space wastage.
And because of this padding, even google::sparse_hash_map, does not make
picture better, in fact, sparse_hash_map is not very friendly to memory
allocators (especially jemalloc).
Here are some numbers for dictionary with 1e9 elements and UInt64 as
key, and UInt16 as value:
settings | load (sec) | read (sec) | read (million rows/s) | bytes_allocated | RSS
HASHED upstream | - | - | - | - | 35GiB
SPARSE_HASHED upstream | - | - | - | - | 26GiB
- | - | - | - | - | -
sparse_hash_map glibc hashbench | - | - | - | - | 17.5GiB
sparse_hash_map packed allocator | 101.878 | 231.48 | 4.32 | - | 17.7GiB
PackedHashMap | 15.514 | 42.35 | 23.61 | 20GiB | 22GiB
As you can see PackedHashMap looks way more better then HASHED, and
even better then SPARSE_HASHED, but slightly worse then sparse_hash_map
with packed allocator (it is done with a custom patch to google
sparse_hash_map).
v2: rebase on top of bucket_count fix
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
In case of you have HashMap with <UInt64, UInt16> as <Key, Value> the
overhead of 38% can be crutial, especially if you have tons of keys.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
woboq codebrowser uses clang tooling, which adds clang system includes
(in Linux::AddClangSystemIncludeArgs()), because none of (-nostdinc,
-nobuiltininc) is set.
And later it will complain with -Wpoison-system-directories for added by
itself includes in InitHeaderSearch::AddUnmappedPath(), because they are
starts from one of the following:
- /usr/include
- /usr/local/include
The interesting thing here is that it got broken only after upgrading to
llvm 16 (in #49678), and the reason for this is that clang 15 build has
system includes that does not trigger the warning -
"/usr/lib/clang/15.0.7/include", while clang 16 has
"/usr/include/clang/16.0.4/include"
So let's simply disable this warning, but only for woboq.
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
* set allow_experimental_query_cache as obsolete
* add tsolodov to trusted contributors
* CI linter
---------
Co-authored-by: Nikita Mikhaylov <mikhaylovnikitka@gmail.com>