Since libstdc++ has some such comparisons for the three-way compare (operator<=>) rewrite:
/src/ch/clickhouse/src/Common/UInt128.h:61:71: warning: zero as null pointer constant [-Wzero-as-null-pointer-constant]
bool inline operator>= (const UInt128 rhs) const { return tuple() >= rhs.tuple(); }
                                                                  ^~
                                                                  nullptr
/usr/include/c++/10.2.0/tuple:1426:5: note: while rewriting comparison as call to 'operator<=>' declared here
operator<=>(const tuple<_Tps...>& __t, const tuple<_Ups...>& __u)
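
For reference, a minimal standalone reproducer of the same pattern (a sketch,
not the ClickHouse code; the struct name and build flags are assumptions).
With GCC 10 and -std=c++20 -Wzero-as-null-pointer-constant it may print the
same diagnostic, since the tuple comparison is rewritten through operator<=>:

#include <tuple>

struct UInt128Like /// hypothetical stand-in for ClickHouse's UInt128
{
    unsigned long long high = 0;
    unsigned long long low = 0;

    auto tuple() const { return std::tie(high, low); }

    /// In C++20 the tuple comparison below has no operator>= of its own, so it
    /// is rewritten as (tuple() <=> rhs.tuple()) >= 0 via std::tuple's
    /// operator<=>; the synthesized comparison against a literal zero is what
    /// triggers -Wzero-as-null-pointer-constant here.
    bool operator>= (const UInt128Like rhs) const { return tuple() >= rhs.tuple(); }
};

int main()
{
    UInt128Like a, b;
    return a >= b ? 0 : 1;
}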
Preallocation can be used only when we know the number of rows in
advance, and for this we need:
- the clickhouse source
- no filtering (i.e. no <where>), since filtering can filter out too
  many rows, in which case we may allocate memory that will never be
  used (see the sketch after this list).
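
A rough sketch of that check (hypothetical names, not the actual ClickHouse
dictionary code; std::unordered_map::reserve stands in for the sparse_hash
preallocation):

#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>

/// Hypothetical helper (not the real ClickHouse API): number of rows to
/// preallocate for, or nothing if preallocation cannot be used.
std::optional<std::size_t> rowsToPreallocate(bool source_is_clickhouse, bool has_where, std::optional<std::size_t> reported_rows)
{
    /// Only the clickhouse source can report the number of rows, and only
    /// without <where>, since the filter may drop most of the rows and the
    /// preallocated memory would never be used.
    if (!source_is_clickhouse || has_where)
        return std::nullopt;
    return reported_rows;
}

void loadAll(std::unordered_map<uint64_t, uint16_t> & map, bool source_is_clickhouse, bool has_where, std::optional<std::size_t> reported_rows)
{
    if (auto rows = rowsToPreallocate(source_is_clickhouse, has_where, reported_rows))
        map.reserve(*rows); /// preallocate once, before the first insert
    /// ... read blocks from the source and insert rows ...
}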
For sparse_hash the difference is quite significant: a preallocated
sparse_hash hashtable inserts ~33% faster (7.5 seconds vs 5 seconds for
insert, and the difference is more significant for a higher number of
elements):
$ ninja bench-sparse_hash-run
[1/1] cd /src/ch/hashtable-bench/.cmake && ...ch/hashtable-bench/.cmake/bench-sparse_hash
sparse_hash/insert: 7.574 <--
sparse_hash/find : 2.14426
sparse_hash/maxrss: 174MiB
sparse_hash/time: 9710.51 msec (user+sys)
$ time ninja bench-sparse_hash-preallocate-run
[1/1] cd /src/ch/hashtable-bench/.cmake && ...-bench/.cmake/bench-sparse_hash-preallocate
sparse_hash/insert: 5.0522 <--
sparse_hash/find : 2.14024
sparse_hash/maxrss: 174MiB
sparse_hash/time: 7192.06 msec (user+sys)
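
The effect is easy to reproduce outside of the bench above (a sketch only:
std::unordered_map instead of google's sparse_hash, and an arbitrary key
generation scheme):

#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <unordered_map>

/// Insert n keys with or without preallocation and return the wall time.
static double insertSeconds(std::size_t n, bool preallocate)
{
    std::unordered_map<uint64_t, uint16_t> map;
    if (preallocate)
        map.reserve(n); /// the only difference between the two runs

    auto start = std::chrono::steady_clock::now();
    for (uint64_t i = 0; i < n; ++i)
        map.emplace(i * 2654435761ULL, static_cast<uint16_t>(i)); /// odd multiplier just to spread the keys
    return std::chrono::duration<double>(std::chrono::steady_clock::now() - start).count();
}

int main()
{
    const std::size_t n = 10'000'000;
    std::printf("insert (no preallocate): %.3f sec\n", insertSeconds(n, false));
    std::printf("insert (preallocate)   : %.3f sec\n", insertSeconds(n, true));
}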
P.S. The difference for a sparse_hashed dictionary with 4e9 elements
(uint64, uint16) is ~18% (4975.905 vs 4103.569 sec).
v2: do not reallocate the dictionary from the progress callback,
since this would access the hashtable in parallel.
v3: drop PREALLOCATE() and do this only for source=clickhouse and empty
<where>