Commit Graph

5 Commits

Author SHA1 Message Date
Alexey Milovidov
39ddcf1c74 Fixed build [#CLICKHOUSE-2]. 2017-06-28 15:24:49 +03:00
Alexey Milovidov
c6b83a1c60 Fixed build [#CLICKHOUSE-2]. 2017-06-28 15:22:07 +03:00
proller
4db8d09de9 Reorganize includes. part 1 (#921)
* Make libunwind optional. Allow use custom libcctz

* fix

* Fix

* fix

* Update BaseDaemon.cpp

* Update CMakeLists.txt

* Reorganize includes. part 1

* Update dbms_include.cmake

* Reorganize includes. part 2

* Reorganize includes. part 3

* dbms/src/Common/ThreadPool -> libs/libcommon

* Reorganize includes. part 4

* Fix print_include_directories

* Update thread_creation_latency.cpp

* Update StringRef.h
2017-06-23 23:22:35 +03:00
Marek Vavruša
45bd332460 AggregateFunctionTopK: fix memory usage, performance
* allow separate table key / hash key, and use
  std::string / StringRef for generic variant as
  it has built-in storage and StringRef is supported
  by the hash table, this avoids infinitely
  growing arena with serialised keys
* use power-of-2 size for alpha vector for faster
  binning without using modulo
* use custom grower and allocator for SpaceSaving
  to start with smaller tables
* store computed hash in counter for faster
  reinsertion of smallest element
2017-05-11 18:52:49 +04:00
Marek Vavruša
5f1e65b252 AggregateFunctions: implemented topK(n)
This implements a new function for approximate
computation of the most frequent entries using
Filtered Space Saving with a merge step adapted
from Parallel Space Saving paper.

It works better for cases where GROUP BY x
is impractical due to high cardinality of x,
such as top IP addresses or top search queries.
2017-05-03 23:09:52 -07:00