43 Commits

Author SHA1 Message Date
Maksim Kita
4b4468b34a Dictionaries use single arena for multiple string attributes 2022-01-08 13:26:11 +03:00
Maksim Kita
ac3cb8c12b CacheDictionary dictionary source access race fix 2021-12-15 15:55:28 +03:00
Azat Khuzhin
a7dc6f309f Remove SparseHashMap.h
It was added only for the Arcadia build and is used in only one place,
so there is no need to have a separate typedef for it.
2021-12-14 10:07:14 +03:00
Azat Khuzhin
e16891713d Fix sparse_hashed dict performance with sequential keys (wrong hash function)
In #27152 the hash function for sparse_hash_map was changed to
std::hash<>. Switch it back to DefaultHash<> (ClickHouse builtin), since
std::hash<> for numeric keys returns the key itself, and this does not
work well with sparse_hash_map.

I tried the example from #32480, and using an actual hash function fixes
the performance of the sparse_hashed layout.

Fixes: #32480

v2: Add comments for SparseHashMap
2021-12-14 10:07:14 +03:00
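The commit above turns on the difference between an identity hash and a mixing hash for sequential numeric keys. The following is a minimal sketch of that difference, not ClickHouse code: `mixHash` is a hypothetical stand-in (the MurmurHash3 64-bit finalizer) rather than DefaultHash<>, `identityHash` models what common std::hash<> implementations do for integers, and `distinctHighBits` is a helper invented for illustration. The commit does not say why an identity hash hurts sparse_hash_map, so the sketch only shows that sequential keys leave the upper hash bits untouched under an identity hash.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical stand-in for a mixing hash (MurmurHash3 64-bit finalizer).
// This is NOT ClickHouse's DefaultHash<>; it only illustrates bit mixing.
inline uint64_t mixHash(uint64_t x)
{
    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    x *= 0xc4ceb9fe1a85ec53ULL;
    x ^= x >> 33;
    return x;
}

// Models an identity hash: the key is returned unchanged.
inline uint64_t identityHash(uint64_t x)
{
    return x;
}

// Counts how many distinct values bits 32..39 of the hash take
// over the sequential keys 0, 1, ..., num_keys - 1.
template <typename Hash>
size_t distinctHighBits(Hash hash, uint64_t num_keys)
{
    std::set<uint64_t> seen;
    for (uint64_t key = 0; key < num_keys; ++key)
        seen.insert((hash(key) >> 32) & 0xFF);
    return seen.size();
}
```

With 1000 sequential keys, the identity hash produces a single value in those upper bits, while the mixing hash spreads the keys over most of the 256 possible values.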
Maksim Kita
a3a780bbf5 Dictionaries read support multiple threads 2021-10-21 17:17:53 +03:00
Maksim Kita
b4f41bd824 Dictionaries key types refactoring 2021-08-17 20:35:43 +03:00
Nikolai Kochetov
a1ec7f75c5 Merge branch 'master' into qoega-fix-access-gtest-in-arcadia 2021-08-10 11:31:47 +03:00
Nikolai Kochetov
8cc493a3cd Try fix build. 2021-08-09 18:09:29 +03:00
Nikolai Kochetov
13f95f3fdf Streams -> Processors for dicts, part 3. 2021-08-06 11:41:45 +03:00
Nikolai Kochetov
8d14f2ef8f Streams -> Processors for dicts, part 1. 2021-08-04 20:58:18 +03:00
Maksim Kita
67e9b85951 Merge ext into common 2021-06-16 23:28:41 +03:00
Maksim Kita
2a016f52e9 Added tests 2021-06-12 13:53:03 +03:00
Maksim Kita
45b8dc772b Dictionaries support array type 2021-06-10 22:32:09 +03:00
Maksim Kita
72d46beca0
Merge pull request #23979 from azat/dict-preallocate
Reimplement preallocate for hashed/sparse_hashed dictionaries
2021-05-11 20:15:46 +03:00
Azat Khuzhin
808d1a0215 Reimplement preallocate for hashed/sparse_hashed dictionaries
It was initially implemented in #15454, but was reverted in #21948 (due
to higher memory usage).

This implementation differs from the initial one: there is now a
separate attribute to enable preallocation. Before, it was done
automatically, which has problems with duplicates in the source.

In addition, this implementation does not use dynamic_cast; instead it
extends the IDictionarySource interface.
2021-05-10 07:41:48 +03:00
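The design choice described above (extend the source interface instead of using dynamic_cast, and gate preallocation behind an explicit setting) can be sketched roughly as follows. All names here (`ISourceSketch`, `knownRowCount`, `CountingSource`, `OpaqueSource`, `rowsToPreallocate`) are hypothetical and do not reflect the actual IDictionarySource methods, which this log does not show.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical source interface; not the real IDictionarySource.
struct ISourceSketch
{
    virtual ~ISourceSketch() = default;

    // Hypothetical hook: sources that know their row count override this.
    // The default means "unknown", so callers never need dynamic_cast.
    virtual std::optional<uint64_t> knownRowCount() const { return std::nullopt; }
};

// A source that can report how many rows it will produce.
struct CountingSource : ISourceSketch
{
    explicit CountingSource(uint64_t rows_) : rows(rows_) {}
    std::optional<uint64_t> knownRowCount() const override { return rows; }
    uint64_t rows;
};

// A source that cannot report its size and keeps the default behaviour.
struct OpaqueSource : ISourceSketch
{
};

// Returns how many rows to reserve, or 0 when preallocation is disabled
// or the source does not know its size.
inline uint64_t rowsToPreallocate(const ISourceSketch & source, bool preallocate_enabled)
{
    if (!preallocate_enabled)
        return 0;
    return source.knownRowCount().value_or(0);
}
```

The caller depends only on the base interface, so adding another source type does not require touching the dictionary code.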
Alexey Milovidov
9753ddc8a0 Merge branch 'master' of github.com:yandex/ClickHouse into normalize-bigint 2021-05-09 18:54:29 +03:00
Alexey Milovidov
49160ae1ba Big integers and UUID in dictionaries 2021-05-08 22:01:59 +03:00
Azat Khuzhin
e08389b2d2 Add interface for rate of found elements in the dictionaries
- IDictionary abstraction
- skeleton implementation into each dictionary
- system.dictionaries.found_rate
- documentation changes
2021-05-08 17:09:01 +03:00
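A found rate of this kind can be tracked with two counters: keys requested and keys found. The sketch below is an assumption about the general shape, not the ClickHouse implementation; `FoundRateCounter`, `record`, and `foundRate` are hypothetical names, and returning 0 when nothing has been queried yet is a choice made for this example.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper; not the actual IDictionary interface.
class FoundRateCounter
{
public:
    // Records one lookup batch: how many keys were requested and how many were found.
    void record(uint64_t keys_requested, uint64_t keys_found)
    {
        query_count.fetch_add(keys_requested, std::memory_order_relaxed);
        found_count.fetch_add(keys_found, std::memory_order_relaxed);
    }

    // Fraction of requested keys that were found; 0 if nothing was requested yet.
    double foundRate() const
    {
        const uint64_t queries = query_count.load(std::memory_order_relaxed);
        if (queries == 0)
            return 0.0;
        return static_cast<double>(found_count.load(std::memory_order_relaxed)) / queries;
    }

private:
    std::atomic<uint64_t> query_count{0};
    std::atomic<uint64_t> found_count{0};
};
```

Relaxed atomics are enough here because the two counters are statistics only and do not guard any other data.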
Maksim Kita
66903e4b0c Flat, Hashed dictionary include update field bytes into bytes_allocated 2021-05-01 01:23:22 +03:00
Maksim Kita
7df43891c1 Dictionary added Decimal256 attribute type support 2021-04-10 19:53:21 +03:00
Maksim Kita
ff86c21e65 Dictionary update field fix 2021-04-04 16:30:48 +03:00
Maksim Kita
9772d30754 Fixed performance tests 2021-03-31 13:21:30 +03:00
Maksim Kita
8a65e8b06e Fix arcadia build 2021-03-29 23:00:40 +03:00
Maksim Kita
3f273ef983 Updated hash dictionary nullable attribute implementation 2021-03-26 21:01:56 +03:00
Maksim Kita
eb0039ed03 Fixed tests 2021-03-26 18:42:32 +03:00
Maksim Kita
21d28a37aa Fixed build 2021-03-26 18:42:32 +03:00
Maksim Kita
720e2e0501 Updated dictGetDescendants, dictGetChildren implementation 2021-03-26 18:42:32 +03:00
Maksim Kita
9f2f0d1095 Refactored hierarchy dictionaries interface 2021-03-26 18:42:32 +03:00
Maksim Kita
dc0bb7485d Updated CacheDictionary 2021-03-06 14:36:37 +03:00
Maksim Kita
b7a150cc63 Updated DictionaryDefaultValueExtractor interface 2021-01-27 16:25:27 +03:00
Maksim Kita
c4ffa2160f Updated interfaces. Added documentation. 2021-01-27 16:25:27 +03:00
Maksim Kita
b0d3f32a36 Added DefaultValueExtractor 2021-01-27 16:25:27 +03:00
Maksim Kita
7cb7d4dbce Fixed dictionaries todo 2021-01-27 16:25:27 +03:00
Maksim Kita
3e2d615e62 Added Nullable support for HashedDictionary 2021-01-27 16:25:27 +03:00
Maksim Kita
cc767d4f2e Updated HashedDictionary to new interface 2021-01-27 16:25:26 +03:00
Maksim Kita
d16a572eee Updated IDictionaryBase interface 2021-01-27 16:25:26 +03:00
Maksim Kita
7a2f6cd5b9 Dictionaries refactoring to new interface 2021-01-27 16:25:26 +03:00
Azat Khuzhin
064f901ea8 Add ability to preallocate hashtables for hashed/sparsehashed dictionaries
Preallocation can be used only when the number of rows is known, and for
this we need:
- source clickhouse
- no filtering (i.e. lack of <where>), since filtering can filter out
  too many rows, and then memory would be allocated that will
  never be used.

For sparse_hash the difference is quite significant: with a preallocated
sparse_hash hashtable, insertion is ~33% faster (7.5 seconds vs 5 seconds),
and the difference is more significant for a higher number of
elements:

    $ ninja bench-sparse_hash-run
    [1/1] cd /src/ch/hashtable-bench/.cmake && ...ch/hashtable-bench/.cmake/bench-sparse_hash
    sparse_hash/insert: 7.574 <!--
    sparse_hash/find  : 2.14426
    sparse_hash/maxrss: 174MiB
    sparse_hash/time:   9710.51 msec (user+sys)

    $ time ninja bench-sparse_hash-preallocate-run
    [1/1] cd /src/ch/hashtable-bench/.cmake && ...-bench/.cmake/bench-sparse_hash-preallocate
    sparse_hash/insert: 5.0522 <!--
    sparse_hash/find  : 2.14024
    sparse_hash/maxrss: 174MiB
    sparse_hash/time:   7192.06 msec (user+sys)

P.S. the difference for sparse_hashed dictionary with 4e9 elements
(uint64, uint16) is ~18% (4975.905 vs 4103.569 sec)

v2: do not reallocate the dictionary from the progress callback,
    since this would access the hashtable in parallel.
v3: drop PREALLOCATE() and do this only for source=clickhouse and an empty
    <where>
2020-10-09 22:28:14 +03:00
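The effect of preallocation described in this commit can be illustrated with a small sketch. It uses std::unordered_map as a stand-in for the sparse_hash hashtable (an assumption for illustration only, since the benchmark in the commit uses a different container), and `countRehashes` is a hypothetical helper that counts how often the bucket array changes size while rows are inserted.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Inserts num_rows sequential keys and counts how many times the table
// changed its bucket count (i.e. rehashed) along the way.
// std::unordered_map is only a stand-in for the real hashtable.
inline size_t countRehashes(uint64_t num_rows, bool preallocate)
{
    std::unordered_map<uint64_t, uint16_t> table;

    // Preallocation is only possible when the row count is known up front.
    if (preallocate)
        table.reserve(num_rows);

    size_t rehashes = 0;
    size_t buckets = table.bucket_count();
    for (uint64_t key = 0; key < num_rows; ++key)
    {
        table.emplace(key, static_cast<uint16_t>(key));
        if (table.bucket_count() != buckets)
        {
            ++rehashes;
            buckets = table.bucket_count();
        }
    }
    return rehashes;
}
```

With reserve() called for the full row count, no rehash happens during insertion; without it, the table grows repeatedly. If the source is filtered or contains duplicates, the reserved size overestimates the final size, which matches the memory concern raised in the commit messages.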
Alexander Tokmakov
d10b4c504d rename database with dictionaries 2020-07-16 17:25:39 +03:00
Alexander Tokmakov
4de18e3d8b add StorageID to IDictionaryBase 2 2020-07-14 22:18:33 +03:00
Alexander Tokmakov
1f6ffb08e4 add StorageID to IDictionaryBase 1 2020-07-14 21:46:29 +03:00
Ivan Lezhankin
e230632645 Changes required for auto-sync with Arcadia 2020-04-16 15:31:57 +03:00
Ivan Lezhankin
06446b4f08 dbms/ → src/ 2020-04-03 18:14:31 +03:00