Update external-dicts-dict-layout.md

Denny Crane 2022-05-28 17:26:11 -03:00 committed by GitHub
parent 1d2cf73b81
commit d7e098fb17


@ -21,7 +21,7 @@ ClickHouse generates an exception for errors with dictionaries. Examples of erro
- The dictionary being accessed could not be loaded.
- Error querying a `cached` dictionary.
You can view the list of external dictionaries and their statuses in the `system.dictionaries` table.
You can view the list of external dictionaries and their statuses in the [system.dictionaries](../../../operations/system-tables/dictionaries.md) table.
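A minimal sketch of such a status check, assuming only the `name`, `status`, `type`, and `last_exception` columns of `system.dictionaries` are of interest:

```sql
-- List every dictionary with its load status and layout type.
-- last_exception holds the text of the most recent load error, if any.
SELECT
    name,
    status,
    type,
    last_exception
FROM system.dictionaries;
```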
The configuration looks like this:
@ -48,6 +48,35 @@ LAYOUT(LAYOUT_TYPE(param value)) -- layout settings
...
```
Dictionaries whose layout name does not contain `complex-key*` have a key of the [UInt64](../../../sql-reference/data-types/int-uint.md) type; `complex-key*` dictionaries have a composite key (complex, with arbitrary types).
[UInt64](../../../sql-reference/data-types/int-uint.md) keys in XML dictionaries are defined with the `<id>` tag.
Configuration example (the column `key_column` has the UInt64 type):
```xml
...
<structure>
<id>
<name>key_column</name>
</id>
...
```
Composite (`complex`) keys in XML dictionaries are defined with the `<key>` tag.
Configuration example of a composite key (the key has one element of the [String](../../../sql-reference/data-types/string.md) type):
```xml
...
<structure>
<key>
<attribute>
<name>country_code</name>
<type>String</type>
</attribute>
</key>
...
```
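For comparison, the same two key kinds can be sketched in DDL form. The dictionary names, the `value` and `country_name` attributes, and the source tables `example_source` and `countries` are hypothetical and only serve as an illustration:

```sql
-- Hypothetical dictionary with a single UInt64 key (non-complex layout).
CREATE DICTIONARY example_by_id
(
    key_column UInt64,
    value String
)
PRIMARY KEY key_column
SOURCE(CLICKHOUSE(TABLE 'example_source'))
LIFETIME(MIN 0 MAX 300)
LAYOUT(HASHED());

-- Hypothetical dictionary with a composite key of one String element.
CREATE DICTIONARY example_by_country
(
    country_code String,
    country_name String
)
PRIMARY KEY country_code
SOURCE(CLICKHOUSE(TABLE 'countries'))
LIFETIME(MIN 0 MAX 300)
LAYOUT(COMPLEX_KEY_HASHED());

-- A UInt64 key is passed as a number, a composite key as a tuple.
SELECT dictGet('example_by_id', 'value', toUInt64(1));
SELECT dictGet('example_by_country', 'country_name', tuple('US'));
```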
## Ways to Store Dictionaries in Memory {#ways-to-store-dictionaries-in-memory}
- [flat](#flat)
@ -98,6 +127,8 @@ LAYOUT(FLAT(INITIAL_ARRAY_SIZE 50000 MAX_ARRAY_SIZE 5000000))
The dictionary is completely stored in memory in the form of a hash table. The dictionary can contain any number of elements with any identifiers. In practice, the number of keys can reach tens of millions of items.
The dictionary key has the [UInt64](../../../sql-reference/data-types/int-uint.md) type.
If `preallocate` is `true` (default is `false`) the hash table will be preallocated (this will make the dictionary load faster). But note that you should use it only if:
- The source supports an approximate number of elements (for now it is supported only by the `ClickHouse` source).
@ -125,6 +156,8 @@ LAYOUT(HASHED(PREALLOCATE 0))
Similar to `hashed`, but uses less memory at the cost of more CPU usage.
The dictionary key has the [UInt64](../../../sql-reference/data-types/int-uint.md) type.
It is also preallocated in the same way as `hashed` (when `preallocate` is set to `true`), and preallocation is even more significant for `sparse_hashed`.
Configuration example:
@ -181,6 +214,8 @@ LAYOUT(COMPLEX_KEY_SPARSE_HASHED())
The dictionary is completely stored in memory. Each attribute is stored in an array. The key attribute is stored in the form of a hash table where the value is an index in the attributes array. The dictionary can contain any number of elements with any identifiers. In practice, the number of keys can reach tens of millions of items.
The dictionary key has the [UInt64](../../../sql-reference/data-types/int-uint.md) type.
All types of sources are supported. When updating, data (from a file or from a table) is read in its entirety.
Configuration example:
@ -220,6 +255,7 @@ LAYOUT(COMPLEX_KEY_HASHED_ARRAY())
The dictionary is stored in memory in the form of a hash table with an ordered array of ranges and their corresponding values.
The dictionary key has the [UInt64](../../../sql-reference/data-types/int-uint.md) type.
This storage method works the same way as `hashed` and allows using date/time (arbitrary numeric type) ranges in addition to the key.
Example: The table contains discounts for each advertiser in the format:
@ -360,6 +396,8 @@ RANGE(MIN StartDate MAX EndDate);
The dictionary is stored in a cache that has a fixed number of cells. These cells contain frequently used elements.
The dictionary key has the [UInt64](../../../sql-reference/data-types/int-uint.md) type.
When searching for a dictionary, the cache is searched first. For each block of data, all keys that are not found in the cache or are outdated are requested from the source using `SELECT attrs... FROM db.table WHERE id IN (k1, k2, ...)`. The received data is then written to the cache.
If keys are not found in the dictionary, an update cache task is created and added to the update queue. Update queue properties can be controlled with the settings `max_update_queue_size`, `update_queue_push_timeout_milliseconds`, `query_wait_timeout_milliseconds`, and `max_threads_for_updates`.
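A sketch of how these settings might be written in DDL form. It assumes the update-queue settings are accepted as upper-case layout parameters in the same way as `SIZE_IN_CELLS`; the dictionary name, attribute, source table, and all numeric values are hypothetical:

```sql
-- Hypothetical cache dictionary with explicit update-queue settings.
-- Assumption: each setting maps to an upper-case layout parameter.
CREATE DICTIONARY example_cache
(
    id UInt64,
    value String
)
PRIMARY KEY id
SOURCE(CLICKHOUSE(TABLE 'example_source'))
LIFETIME(MIN 300 MAX 600)
LAYOUT(CACHE(
    SIZE_IN_CELLS 1000000
    MAX_UPDATE_QUEUE_SIZE 100000
    UPDATE_QUEUE_PUSH_TIMEOUT_MILLISECONDS 10
    QUERY_WAIT_TIMEOUT_MILLISECONDS 60000
    MAX_THREADS_FOR_UPDATES 4
));
```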
@ -420,6 +458,8 @@ This type of storage is for use with composite [keys](../../../sql-reference/dic
Similar to `cache`, but stores data on SSD and index in RAM. All cache dictionary settings related to update queue can also be applied to SSD cache dictionaries.
The dictionary key has the [UInt64](../../../sql-reference/data-types/int-uint.md) type.
``` xml
<layout>
<ssd_cache>
@ -452,7 +492,7 @@ This type of storage is for use with composite [keys](../../../sql-reference/dic
The dictionary is not stored in memory and directly goes to the source during the processing of a request.
The dictionary key has the `UInt64` type.
The dictionary key has the [UInt64](../../../sql-reference/data-types/int-uint.md) type.
All types of [sources](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-sources.md), except local files, are supported.
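A minimal DDL sketch of a `direct` dictionary, assuming a hypothetical source table `example_source` with an `id` key column and a `value` attribute. No `LIFETIME` clause is given because nothing is cached:

```sql
-- Hypothetical direct dictionary: every lookup goes to the source table.
CREATE DICTIONARY example_direct
(
    id UInt64,
    value String
)
PRIMARY KEY id
SOURCE(CLICKHOUSE(TABLE 'example_source'))
LAYOUT(DIRECT());

-- The key is a UInt64, so the lookup passes a number rather than a tuple.
SELECT dictGet('example_direct', 'value', toUInt64(42));
```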