ClickHouse/docs/en/engines/table-engines/integrations/embedded-rocksdb.md

---
slug: /en/engines/table-engines/integrations/embedded-rocksdb
sidebar_position: 50
sidebar_label: EmbeddedRocksDB
---

import CloudNotSupportedBadge from '@theme/badges/CloudNotSupportedBadge';

# EmbeddedRocksDB Engine

<CloudNotSupportedBadge />

This engine allows integrating ClickHouse with [RocksDB](http://rocksdb.org/).

## Creating a Table {#creating-a-table}

``` sql
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
    ...
) ENGINE = EmbeddedRocksDB([ttl, rocksdb_dir, read_only]) PRIMARY KEY(primary_key_name)
[ SETTINGS name=value, ... ]
```

Engine parameters:

- `ttl` - time to live for values. TTL is accepted in seconds. If TTL is 0, regular RocksDB instance is used (without TTL).
- `rocksdb_dir` - path to the directory of an existed RocksDB or the destination path of the created RocksDB. Open the table with the specified `rocksdb_dir`.
- `read_only` - when `read_only` is set to true, read-only mode is used. For storage with TTL, compaction will not be triggered (neither manual nor automatic), so no expired entries are removed.
- `primary_key_name` – any column name in the column list.
- `primary key` must be specified, it supports only one column in the primary key. The primary key will be serialized in binary as a `rocksdb key`.
- columns other than the primary key will be serialized in binary as `rocksdb` value in corresponding order.
- queries with key `equals` or `in` filtering will be optimized to multi keys lookup from `rocksdb`.

Engine settings:

- `optimize_for_bulk_insert` – Table is optimized for bulk insertions (insert pipeline will create SST files and import to rocksdb database instead of writing to memtables); default value: `1`.
- `bulk_insert_block_size` - Minimum size of SST files (in term of rows) created by bulk insertion; default value: `1048449`.

Example:

``` sql
CREATE TABLE test
(
    `key` String,
    `v1` UInt32,
    `v2` String,
    `v3` Float32
)
ENGINE = EmbeddedRocksDB
PRIMARY KEY key
```

## Metrics

There is also `system.rocksdb` table, that expose rocksdb statistics:

```sql
SELECT
    name,
    value
FROM system.rocksdb

┌─name──────────────────────┬─value─┐
│ no.file.opens             │     1 │
│ number.block.decompressed │     1 │
└───────────────────────────┴───────┘
```

## Configuration

You can also change any [rocksdb options](https://github.com/facebook/rocksdb/wiki/Option-String-and-Option-Map) using config:

```xml
<rocksdb>
    <options>
        <max_background_jobs>8</max_background_jobs>
    </options>
    <column_family_options>
        <num_levels>2</num_levels>
    </column_family_options>
    <tables>
        <table>
            <name>TABLE</name>
            <options>
                <max_background_jobs>8</max_background_jobs>
            </options>
            <column_family_options>
                <num_levels>2</num_levels>
            </column_family_options>
        </table>
    </tables>
</rocksdb>
```

By default trivial approximate count optimization is turned off, which might affect the performance `count()` queries. To enable this
optimization set up `optimize_trivial_approximate_count_query = 1`. Also, this setting affects `system.tables` for EmbeddedRocksDB engine,
turn on the settings to see approximate values for `total_rows` and `total_bytes`.

## Supported operations {#supported-operations}

### Inserts

When new rows are inserted into `EmbeddedRocksDB`, if the key already exists, the value will be updated, otherwise a new key is created.

Example:

```sql
INSERT INTO test VALUES ('some key', 1, 'value', 3.2);
```

### Deletes

Rows can be deleted using `DELETE` query or `TRUNCATE`.

```sql
DELETE FROM test WHERE key LIKE 'some%' AND v1 > 1;
```

```sql
ALTER TABLE test DELETE WHERE key LIKE 'some%' AND v1 > 1;
```

```sql
TRUNCATE TABLE test;
```

### Updates

Values can be updated using the `ALTER TABLE` query. The primary key cannot be updated.

```sql
ALTER TABLE test UPDATE v1 = v1 * 10 + 2 WHERE key LIKE 'some%' AND v3 > 3.1;
```

### Joins

A special `direct` join with EmbeddedRocksDB tables is supported.
This direct join avoids forming a hash table in memory and accesses
the data directly from the EmbeddedRocksDB.

With large joins you may see much lower memory usage with direct joins
because the hash table is not created.

To enable direct joins:
```sql
SET join_algorithm = 'direct, hash'
```

:::tip
When the `join_algorithm` is set to `direct, hash`, direct joins will be used
when possible, and hash otherwise.
:::

#### Example

##### Create and populate an EmbeddedRocksDB table:
```sql
CREATE TABLE rdb
(
    `key` UInt32,
    `value` Array(UInt32),
    `value2` String
)
ENGINE = EmbeddedRocksDB
PRIMARY KEY key
```

```sql
INSERT INTO rdb
    SELECT
        toUInt32(sipHash64(number) % 10) as key,
        [key, key+1] as value,
        ('val2' || toString(key)) as value2
    FROM numbers_mt(10);
```

##### Create and populate a table to join with table `rdb`:

```sql
CREATE TABLE t2
(
    `k` UInt16
)
ENGINE = TinyLog
```

```sql
INSERT INTO t2 SELECT number AS k
FROM numbers_mt(10)
```

##### Set the join algorithm to `direct`:

```sql
SET join_algorithm = 'direct'
```

##### An INNER JOIN:
```sql
SELECT *
FROM
(
    SELECT k AS key
    FROM t2
) AS t2
INNER JOIN rdb ON rdb.key = t2.key
ORDER BY key ASC
```
```response
┌─key─┬─rdb.key─┬─value──┬─value2─┐
│   0 │       0 │ [0,1]  │ val20  │
│   2 │       2 │ [2,3]  │ val22  │
│   3 │       3 │ [3,4]  │ val23  │
│   6 │       6 │ [6,7]  │ val26  │
│   7 │       7 │ [7,8]  │ val27  │
│   8 │       8 │ [8,9]  │ val28  │
│   9 │       9 │ [9,10] │ val29  │
└─────┴─────────┴────────┴────────┘
```

### More information on Joins
- [`join_algorithm` setting](/docs/en/operations/settings/settings.md#join_algorithm)
- [JOIN clause](/docs/en/sql-reference/statements/select/join.md)
-												edit eng ver

											
										
										
											2021-01-25 14:13:29 +00:00
+								---
-												add slugs

											
										
										
											2022-08-28 14:53:34 +00:00
+								slug: /en/engines/table-engines/integrations/embedded-rocksdb
-												Alphabetize table functions and engines

											
										
										
											2023-06-23 13:16:22 +00:00
+								sidebar_position: 50
-												Removed /ja folder, cleaned up /ru markdown

											
										
										
											2022-04-09 13:29:05 +00:00
+								sidebar_label: EmbeddedRocksDB
-												add docs

											
										
										
											2020-10-01 10:59:51 +00:00
+								---
-												[Docs] Add cloud not supported badge for EmbeddedRocksDB engine

											
										
										
											2024-11-05 17:41:01 +00:00
+								import CloudNotSupportedBadge from '@theme/badges/CloudNotSupportedBadge';
-												Remove H1 anchor tags from docs

											
										
										
											2022-06-02 10:55:18 +00:00
+								# EmbeddedRocksDB Engine
-												add docs

											
										
										
											2020-10-01 10:59:51 +00:00
-												[Docs] Add cloud not supported badge for EmbeddedRocksDB engine

											
										
										
											2024-11-05 17:41:01 +00:00
+								<CloudNotSupportedBadge />
 								This engine allows integrating ClickHouse with [RocksDB](http://rocksdb.org/).
-												add docs

											
										
										
											2020-10-01 10:59:51 +00:00
-												Docs: remove anchor prefix

											
										
										
											2023-09-18 18:29:13 +00:00
+								## Creating a Table {#creating-a-table}
-												add docs

											
										
										
											2020-10-01 10:59:51 +00:00
 								``` sql
 								CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
 								(
 								    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
 								    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
 								    ...
-												add read_only option

Signed-off-by: Lloyd-Pottiger <yan1579196623@gamil.com>

											
										
										
											2022-08-23 12:58:57 +00:00
+								) ENGINE = EmbeddedRocksDB([ttl, rocksdb_dir, read_only]) PRIMARY KEY(primary_key_name)
-												Revert "Revert "Introduce bulk loading to StorageEmbeddedRocksDB""

											
										
										
											2024-05-03 10:29:36 +00:00
+								[ SETTINGS name=value, ... ]
-												add docs

											
										
										
											2020-10-01 10:59:51 +00:00
+								```
-												revise docs

Signed-off-by: Lloyd-Pottiger <yan1579196623@gamil.com>

											
										
										
											2022-08-09 13:48:49 +00:00
+								Engine parameters:
-												add docs

											
										
										
											2020-10-01 10:59:51 +00:00
-												add functionality test

Signed-off-by: Lloyd-Pottiger <yan1579196623@gamil.com>

											
										
										
											2022-08-10 15:10:52 +00:00
+								- `ttl` - time to live for values. TTL is accepted in seconds. If TTL is 0, regular RocksDB instance is used (without TTL).
-												fix test

Signed-off-by: Lloyd-Pottiger <yan1579196623@gamil.com>

											
										
										
											2022-08-26 12:00:17 +00:00
+								- `rocksdb_dir` - path to the directory of an existed RocksDB or the destination path of the created RocksDB. Open the table with the specified `rocksdb_dir`.
-												add read_only option

Signed-off-by: Lloyd-Pottiger <yan1579196623@gamil.com>

											
										
										
											2022-08-23 12:58:57 +00:00
+								- `read_only` - when `read_only` is set to true, read-only mode is used. For storage with TTL, compaction will not be triggered (neither manual nor automatic), so no expired entries are removed.
-												Add system.rocksdb table (with various rocksdb internals)

v2: fix USE_ROCKSDB check (there is #cmakedefine01 so it is either 0/1,
so we should not check with #ifdef)

											
										
										
											2021-07-26 18:17:39 +00:00
+								- `primary_key_name` – any column name in the column list.
-												Fixed notes.

											
										
										
											2021-01-22 16:40:58 +00:00
+								- `primary key` must be specified, it supports only one column in the primary key. The primary key will be serialized in binary as a `rocksdb key`.
 								- columns other than the primary key will be serialized in binary as `rocksdb` value in corresponding order.
 								- queries with key `equals` or `in` filtering will be optimized to multi keys lookup from `rocksdb`.
-												add docs

											
										
										
											2020-10-01 10:59:51 +00:00
-												Revert "Revert "Introduce bulk loading to StorageEmbeddedRocksDB""

											
										
										
											2024-05-03 10:29:36 +00:00
+								Engine settings:
 								- `optimize_for_bulk_insert` – Table is optimized for bulk insertions (insert pipeline will create SST files and import to rocksdb database instead of writing to memtables); default value: `1`.
 								- `bulk_insert_block_size` - Minimum size of SST files (in term of rows) created by bulk insertion; default value: `1048449`.
-												add docs

											
										
										
											2020-10-01 10:59:51 +00:00
+								Example:
 								``` sql
 								CREATE TABLE test
 								(
 								    `key` String,
 								    `v1` UInt32,
 								    `v2` String,
-												Fix syntax error: drop comma
											
										
										
											2021-12-01 20:12:10 +00:00
+								    `v3` Float32
-												add docs

											
										
										
											2020-10-01 10:59:51 +00:00
+								)
-												Rename Rocksdb to RocksDB

											
										
										
											2020-11-08 15:41:16 +00:00
+								ENGINE = EmbeddedRocksDB
-												add docs

											
										
										
											2020-10-01 10:59:51 +00:00
+								PRIMARY KEY key
-												links

											
										
										
											2021-01-26 19:24:06 +00:00
+								```
-												Add system.rocksdb table (with various rocksdb internals)

v2: fix USE_ROCKSDB check (there is #cmakedefine01 so it is either 0/1,
so we should not check with #ifdef)

											
										
										
											2021-07-26 18:17:39 +00:00
+								## Metrics
 								There is also `system.rocksdb` table, that expose rocksdb statistics:
 								```sql
 								SELECT
 								    name,
 								    value
 								FROM system.rocksdb
 								┌─name──────────────────────┬─value─┐
 								│ no.file.opens             │     1 │
 								│ number.block.decompressed │     1 │
 								└───────────────────────────┴───────┘
 								```
-												Add ability to set any rocksdb option via config.xml

v2: Cover rocksdb options in ClickHouse config
v3: add missing __init__.py
v4: Rework rocksdb options from config
v5: add column_family_options support

											
										
										
											2021-07-26 18:17:39 +00:00
+								## Configuration
 								You can also change any [rocksdb options](https://github.com/facebook/rocksdb/wiki/Option-String-and-Option-Map) using config:
 								```xml
 								<rocksdb>
 								    <options>
 								        <max_background_jobs>8</max_background_jobs>
 								    </options>
 								    <column_family_options>
 								        <num_levels>2</num_levels>
 								    </column_family_options>
 								    <tables>
 								        <table>
 								            <name>TABLE</name>
 								            <options>
 								                <max_background_jobs>8</max_background_jobs>
 								            </options>
 								            <column_family_options>
 								                <num_levels>2</num_levels>
 								            </column_family_options>
 								        </table>
 								    </tables>
 								</rocksdb>
 								```
-												Add docs for KeeperMap

											
										
										
											2023-02-15 14:29:11 +00:00
-												Added optimize_trivial_approximate_count_query documentatiosn for RocksDB

											
										
										
											2023-11-20 16:55:42 +00:00
+								By default trivial approximate count optimization is turned off, which might affect the performance `count()` queries. To enable this
 								optimization set up `optimize_trivial_approximate_count_query = 1`. Also, this setting affects `system.tables` for EmbeddedRocksDB engine,
 								turn on the settings to see approximate values for `total_rows` and `total_bytes`.
-												Docs: remove anchor prefix

											
										
										
											2023-09-18 18:29:13 +00:00
+								## Supported operations {#supported-operations}
-												Add docs for KeeperMap

											
										
										
											2023-02-15 14:29:11 +00:00
 								### Inserts
-												Update docs/en/engines/table-engines/integrations/embedded-rocksdb.md
											
										
										
											2023-02-15 18:29:26 +00:00
+								When new rows are inserted into `EmbeddedRocksDB`, if the key already exists, the value will be updated, otherwise a new key is created.
-												Add docs for KeeperMap

											
										
										
											2023-02-15 14:29:11 +00:00
 								Example:
 								```sql
 								INSERT INTO test VALUES ('some key', 1, 'value', 3.2);
 								```
 								### Deletes
-												Alphabetize table functions and engines

											
										
										
											2023-06-23 13:16:22 +00:00
+								Rows can be deleted using `DELETE` query or `TRUNCATE`.
-												Add docs for KeeperMap

											
										
										
											2023-02-15 14:29:11 +00:00
 								```sql
 								DELETE FROM test WHERE key LIKE 'some%' AND v1 > 1;
 								```
 								```sql
 								ALTER TABLE test DELETE WHERE key LIKE 'some%' AND v1 > 1;
 								```
 								```sql
 								TRUNCATE TABLE test;
 								```
 								### Updates
-												Update docs/en/engines/table-engines/integrations/embedded-rocksdb.md
											
										
										
											2023-02-15 18:29:33 +00:00
+								Values can be updated using the `ALTER TABLE` query. The primary key cannot be updated.
-												Add docs for KeeperMap

											
										
										
											2023-02-15 14:29:11 +00:00
 								```sql
 								ALTER TABLE test UPDATE v1 = v1 * 10 + 2 WHERE key LIKE 'some%' AND v3 > 3.1;
 								```
-												add direct join docs

											
										
										
											2023-06-02 14:08:48 +00:00
 								### Joins
 								A special `direct` join with EmbeddedRocksDB tables is supported.
 								This direct join avoids forming a hash table in memory and accesses
 								the data directly from the EmbeddedRocksDB.
-												address review comments

											
										
										
											2023-06-02 20:30:18 +00:00
+								With large joins you may see much lower memory usage with direct joins
 								because the hash table is not created.
-												add direct join docs

											
										
										
											2023-06-02 14:08:48 +00:00
+								To enable direct joins:
 								```sql
-												address review comments

											
										
										
											2023-06-02 20:30:18 +00:00
+								SET join_algorithm = 'direct, hash'
-												add direct join docs

											
										
										
											2023-06-02 14:08:48 +00:00
+								```
 								:::tip
-												address review comments

											
										
										
											2023-06-02 20:30:18 +00:00
+								When the `join_algorithm` is set to `direct, hash`, direct joins will be used
 								when possible, and hash otherwise.
-												add direct join docs

											
										
										
											2023-06-02 14:08:48 +00:00
+								:::
 								#### Example
 								##### Create and populate an EmbeddedRocksDB table:
 								```sql
 								CREATE TABLE rdb
 								(
 								    `key` UInt32,
 								    `value` Array(UInt32),
 								    `value2` String
 								)
 								ENGINE = EmbeddedRocksDB
 								PRIMARY KEY key
 								```
 								```sql
 								INSERT INTO rdb
 								    SELECT
 								        toUInt32(sipHash64(number) % 10) as key,
 								        [key, key+1] as value,
 								        ('val2' || toString(key)) as value2
 								    FROM numbers_mt(10);
 								```
 								##### Create and populate a table to join with table `rdb`:
 								```sql
 								CREATE TABLE t2
 								(
 								    `k` UInt16
 								)
 								ENGINE = TinyLog
 								```
 								```sql
 								INSERT INTO t2 SELECT number AS k
 								FROM numbers_mt(10)
 								```
 								##### Set the join algorithm to `direct`:
 								```sql
 								SET join_algorithm = 'direct'
 								```
 								##### An INNER JOIN:
 								```sql
 								SELECT *
 								FROM
 								(
 								    SELECT k AS key
 								    FROM t2
 								) AS t2
 								INNER JOIN rdb ON rdb.key = t2.key
 								ORDER BY key ASC
 								```
 								```response
 								┌─key─┬─rdb.key─┬─value──┬─value2─┐
 								│   0 │       0 │ [0,1]  │ val20  │
 								│   2 │       2 │ [2,3]  │ val22  │
 								│   3 │       3 │ [3,4]  │ val23  │
 								│   6 │       6 │ [6,7]  │ val26  │
 								│   7 │       7 │ [7,8]  │ val27  │
 								│   8 │       8 │ [8,9]  │ val28  │
 								│   9 │       9 │ [9,10] │ val29  │
 								└─────┴─────────┴────────┴────────┘
 								```
-												address review comments

											
										
										
											2023-06-02 20:30:18 +00:00
+								### More information on Joins
-												Fix anchors to settings.md

											
										
										
											2023-12-20 18:26:36 +00:00
+								- [`join_algorithm` setting](/docs/en/operations/settings/settings.md#join_algorithm)
-												address review comments

											
										
										
											2023-06-02 20:30:18 +00:00
+								- [JOIN clause](/docs/en/sql-reference/statements/select/join.md)