For compound hash tables such as the future StringHashMap, an
iterator-based API might be inefficient for iterating over a table or
for merging two tables, because:
1) the key has to be converted from a component-specific format, which
may differ between the components, to a general format;
2) the information about the component of the compound hash table to
which the value belongs is lost, and has to be recalculated if the
value is reinserted.
A more efficient approach is to use internal iteration, that is,
map-like functions, which avoids unnecessary conversions when iterating
and allows an efficient component-wise approach when merging.
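
A minimal sketch of the internal iteration idea described above. This is
not the actual StringHashMap: ToyCompoundMap, forEachMapped, mergeFrom and
the 8-byte short-key packing are all hypothetical names and choices made
only for illustration. The callback receives mapped values directly, so
no key is converted to a general format, and the merge works component
by component.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical two-component table: keys of up to 8 bytes are packed into a
// 64-bit integer, longer keys are stored as strings. For simplicity the
// packing does not distinguish keys that differ only in trailing NUL bytes.
class ToyCompoundMap
{
public:
    void add(const std::string & key, int delta)
    {
        if (key.size() <= 8)
            short_keys[pack(key)] += delta;
        else
            long_keys[key] += delta;
    }

    // Internal iteration: the callback sees only mapped values, so no key is
    // converted from its component-specific format.
    template <typename Func>
    void forEachMapped(Func && func) const
    {
        for (const auto & kv : short_keys)
            func(kv.second);
        for (const auto & kv : long_keys)
            func(kv.second);
    }

    // Component-wise merge: each component of `other` goes straight into the
    // matching component of `this`, so the component is never recalculated.
    void mergeFrom(const ToyCompoundMap & other)
    {
        for (const auto & kv : other.short_keys)
            short_keys[kv.first] += kv.second;
        for (const auto & kv : other.long_keys)
            long_keys[kv.first] += kv.second;
    }

    std::size_t size() const { return short_keys.size() + long_keys.size(); }

private:
    static std::uint64_t pack(const std::string & key)
    {
        assert(key.size() <= 8);
        std::uint64_t packed = 0;
        for (std::size_t i = 0; i < key.size(); ++i)
            packed |= static_cast<std::uint64_t>(static_cast<unsigned char>(key[i])) << (8 * i);
        return packed;
    }

    std::unordered_map<std::uint64_t, int> short_keys;
    std::unordered_map<std::string, int> long_keys;
};

// Usage: merge two tables, then sum all mapped values without touching keys.
inline int sumAfterMerge()
{
    ToyCompoundMap a, b;
    a.add("ab", 1);
    a.add("a much longer key", 2);
    b.add("ab", 10);
    b.add("another long key", 20);
    a.mergeFrom(b);

    int sum = 0;
    a.forEachMapped([&](int value) { sum += value; });
    return sum;
}
```

An iterator-based merge would instead hand out each key in the general
format and reinsert it, repeating the choice of component for every key.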
Use separate key and "mapped" value references instead. This is
important for hash tables that do not store the key/"mapped" pair
directly, and cannot provide this interface without some runtime
overhead.
We don't need it anymore after we changed the hash table key memory
management to use callbacks. Removing this interface is important for
hash maps that do not store the key, such as FixedHashMap or the
prospective compound StringHashMap.
Since std::list<>::emplace_back() and std::unordered_map<>::emplace()
provide strong exception safety, RWLockImpl is now changed to provide
the same level of exception safety.
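
A sketch of how the strong guarantee of these two calls can be combined
into a strong guarantee for a two-step update. ToyRegistry and its
members are hypothetical and do not reproduce RWLockImpl; the duplicate
id check is only an assumed way to trigger the rollback path.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical registry that keeps two containers in sync. It is not the
// real RWLockImpl; it only illustrates the rollback pattern.
class ToyRegistry
{
public:
    // Strong guarantee: after an exception, both containers are exactly as
    // they were before the call.
    void add(const std::string & id, int payload)
    {
        // Step 1: if this throws, nothing has been modified yet.
        queue.emplace_back(payload);
        try
        {
            // Step 2: emplace() has no effect if it throws, and inserts
            // nothing when the id is already present.
            const bool inserted = index.emplace(id, std::prev(queue.end())).second;
            if (!inserted)
                throw std::logic_error("duplicate id: " + id);
        }
        catch (...)
        {
            // Undo step 1; pop_back() does not throw.
            queue.pop_back();
            throw;
        }
        assert(queue.size() == index.size());
    }

    std::size_t queueSize() const { return queue.size(); }
    std::size_t indexSize() const { return index.size(); }

private:
    std::list<int> queue;
    std::unordered_map<std::string, std::list<int>::iterator> index;
};

// Usage: a failed second insertion leaves both containers unchanged.
inline bool rollbackLeavesStateUnchanged()
{
    ToyRegistry registry;
    registry.add("q1", 1);
    try
    {
        registry.add("q1", 2);
    }
    catch (const std::logic_error &)
    {
    }
    return registry.queueSize() == 1 && registry.indexSize() == 1;
}
```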
* Fix build
* cmake: fix cpuinfo
* Fix includes after processors merge
Conflicts:
dbms/src/Processors/Formats/Impl/CapnProtoRowInputFormat.cpp
dbms/src/Processors/Formats/Impl/ParquetBlockOutputFormat.cpp
dbms/src/Processors/Formats/Impl/ProtobufRowInputFormat.cpp
dbms/src/Processors/Formats/Impl/ProtobufRowOutputFormat.cpp
* Fix build in gcc8
* fix test link
* fix test link
* Fix test link
* link fix
* Fix includes after processors merge 2
Conflicts:
dbms/src/Processors/Formats/Impl/ParquetBlockInputFormat.cpp
* Fix includes after processors merge 3
* link fix
* Fix likely/unlikely conflict with cython
* Fix conflict with protobuf/stubs/atomicops.h
* remove unlikely.h
* Fix macos build (do not use timer_t)
* wip
* Fix build (orc, ...)
* Missing files
* Try fix
* fix hdfs
* Fix llvm 7.1 find
Among other things, it is used to filter logs, which are still being written even after the global server context is deinitialized, so we can't keep the masker there.
Some aggregation methods initially emplace a temporary StringRef key
into a hash table. Then, if the key was not seen before, they make a
persistent copy of the key and update the hash table with it. This
approach is not suitable for compound hash tables, because the logic of
when the persistent key is needed is more complex, and is contained
within the hash table itself.
In this commit, we switch to managing key memory with callbacks passed
to the hash table, that allow it to request a persistent copy of the key
if it is needed. This should be more appropriate for compound hash
tables.
This commit prepares for StringHashMap PR #5417.
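
A rough illustration of the callback approach described above, assuming
C++17 for std::string_view. ToyArena, ToyStringMap and the emplace()
signature are hypothetical and are not the interface introduced by this
commit. The table calls the callback only when the key is new, so the
caller never has to decide when a persistent copy is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical stand-in for an arena: owns persistent copies of keys.
class ToyArena
{
public:
    std::string_view persist(std::string_view key)
    {
        // std::deque keeps element addresses stable across emplace_back().
        storage.emplace_back(key);
        return storage.back();
    }

    std::size_t copies() const { return storage.size(); }

private:
    std::deque<std::string> storage;
};

// Hypothetical table that stores views of keys, not the keys themselves.
class ToyStringMap
{
public:
    // `make_persistent` is invoked only if `temporary_key` is not in the
    // table yet; the table decides when a persistent copy is required.
    template <typename MakePersistent>
    int & emplace(std::string_view temporary_key, MakePersistent && make_persistent, bool & inserted)
    {
        auto it = map.find(temporary_key);
        inserted = (it == map.end());
        if (inserted)
            it = map.emplace(make_persistent(temporary_key), 0).first;
        assert(it != map.end());
        return it->second;
    }

    std::size_t size() const { return map.size(); }

private:
    std::unordered_map<std::string_view, int> map;
};

// Usage: four lookups with two distinct keys make only two persistent copies.
inline std::size_t copiesMadeForRepeatedKeys()
{
    ToyArena arena;
    ToyStringMap counts;
    const std::string inputs[] = {"apple", "pear", "apple", "apple"};
    for (const std::string & input : inputs)
    {
        bool inserted = false;
        int & count = counts.emplace(
            input, [&](std::string_view key) { return arena.persist(key); }, inserted);
        ++count;
    }
    return arena.copies();
}
```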