preallocation can be used only when we know number of rows, and for this
we need:
- source clickhouse
- no filtering (i.e. lack of <where>), since filtering can filter
too much rows and eventually it may allocate memory that will
never be used.
For sparse_hash the difference is quite significant, preallocated
sparse_hash hashtable allocates ~33% faster (7.5 seconds vs 5 seconds
for insert, and the difference is more significant for higher number of
elements):
$ ninja bench-sparse_hash-run
[1/1] cd /src/ch/hashtable-bench/.cmake && ...ch/hashtable-bench/.cmake/bench-sparse_hash
sparse_hash/insert: 7.574 <!--
sparse_hash/find : 2.14426
sparse_hash/maxrss: 174MiB
sparse_hash/time: 9710.51 msec (user+sys)
$ time ninja bench-sparse_hash-preallocate-run
[1/1] cd /src/ch/hashtable-bench/.cmake && ...-bench/.cmake/bench-sparse_hash-preallocate
sparse_hash/insert: 5.0522 <!--
sparse_hash/find : 2.14024
sparse_hash/maxrss: 174MiB
sparse_hash/time: 7192.06 msec (user+sys)
P.S. the difference for sparse_hashed dictionary with 4e9 elements
(uint64, uint16) is ~18% (4975.905 vs 4103.569 sec)
v2: do not reallocate the dictionary from the progress callback
Since this will access hashtable in parallel.
v3: drop PREALLOCATE() and do this only for source=clickhouse and empty
<where>