ClickHouse/docs/en/engines/table-engines/integrations/mongodb.md
2024-05-26 01:58:46 +03:00


slug: /en/engines/table-engines/integrations/mongodb
sidebar_position: 135
sidebar_label: MongoDB

MongoDB

The MongoDB engine is a read-only table engine that reads data from a remote MongoDB collection.

Only MongoDB v3.6+ servers are supported.

Type mappings

| MongoDB            | ClickHouse                                 |
|--------------------|--------------------------------------------|
| bool, int32, int64 | any numeric type, String                   |
| int32              | Int32, String                              |
| int64              | Int64, String                              |
| double             | Float64, String                            |
| date               | Date, Date32, DateTime, DateTime64, String |
| string             | String, UUID                               |
| document           | String (as JSON)                           |
| array              | Array, String (as JSON)                    |
| oid                | String                                     |
| any other          | String                                     |

If a key is not found in the MongoDB document, the default value is inserted, or NULL if the column is nullable.
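
As an illustration of the mappings above, the following sketch declares a table over a hypothetical `users` collection. The collection name, the field names, and the table name `mongo_users` are assumptions made for this example; only the type pairs are taken from the mapping table.

```sql
-- Hypothetical collection 'users'; every column name here is an illustrative assumption.
CREATE TABLE mongo_users
(
    _id        String,           -- oid      -> String
    name       String,           -- string   -> String
    age        Nullable(Int32),  -- int32    -> Int32; NULL if the key is missing (nullable column)
    score      Float64,          -- double   -> Float64; default value if the key is missing
    created_at DateTime64(3),    -- date     -> DateTime64
    tags       Array(String),    -- array    -> Array
    profile    String            -- document -> String (as JSON)
) ENGINE = MongoDB('mongo1:27017', 'test', 'users', 'testuser', 'clickhouse');
```

Declaring `age` as `Nullable(Int32)` lets a reader tell a missing key apart from a stored value, whereas a non-nullable column such as `score` receives the default value in both cases.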

Supported clauses

Hint: you can wrap a MongoDB table in a CTE to use any clause, but be aware that in some cases performance degrades significantly.

WHERE

Only constant literals are allowed.

PREWHERE and HAVING are not supported.

LIMIT and OFFSET

Only LIMIT is supported.

ORDER BY

Only simple expressions are supported, without modifiers such as COLLATE, WITH, TO, etc.

WINDOW

Not supported.

GROUP BY

Not supported.

Aggregation functions

Not supported.
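
The restrictions in this section can be sketched with two queries against the `mongo_table` table from the usage example (columns `key` and `data`). The first query stays within the supported subset. The second follows the CTE hint to apply a clause that is listed as unsupported. The specific filter values are assumptions for illustration.

```sql
-- Supported subset: WHERE with a constant literal, a simple ORDER BY, and LIMIT.
SELECT key, data
FROM mongo_table
WHERE key > 10
ORDER BY key
LIMIT 5;

-- GROUP BY and aggregate functions are listed as unsupported on the MongoDB table,
-- so, following the CTE hint, the table is read inside a CTE and grouped outside it.
-- Performance may be significantly degraded in some cases.
WITH src AS
(
    SELECT key, data
    FROM mongo_table
)
SELECT data, count() AS rows_per_value
FROM src
GROUP BY data;
```

Selecting only the needed columns inside the CTE keeps the example focused; how much it helps depends on the collection and the query.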

Notes

Situation with bool

In ClickHouse, boolean is an alias for UInt8, but in MongoDB it is a distinct type. As a result, it is not always possible to determine whether a UInt8 value is meant to be a bool, and filters may not work correctly. As a workaround, use x = toBool(true) instead of x = true.
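
A minimal sketch of the workaround, assuming a hypothetical table `mongo_flags` whose column `is_active` maps to a MongoDB bool field (both names are illustrative and do not appear elsewhere in this page):

```sql
-- May not filter correctly: the literal can be treated as a number rather than a bool.
SELECT count() FROM mongo_flags WHERE is_active = true;

-- Workaround described above: make the boolean intent explicit with toBool().
SELECT count() FROM mongo_flags WHERE is_active = toBool(true);
```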

Creating a Table

CREATE TABLE [IF NOT EXISTS] [db.]table_name
(
    name1 [type1],
    name2 [type2],
    ...
) ENGINE = MongoDB(host:port, database, collection, user, password [, options]);

Engine Parameters

  • host:port — MongoDB server address.

  • database — Remote database name.

  • collection — Remote collection name.

  • user — MongoDB user.

  • password — User password.

  • options — MongoDB connection string options (optional parameter).

:::tip If you are using the MongoDB Atlas cloud offering, please add these options:

'connectTimeoutMS=10000&ssl=true&authSource=admin'

:::

Also, you can simply pass a URI:

ENGINE = MongoDB(uri, collection);

Engine Parameters

  • uri — MongoDB server's connection URI

  • collection — Remote collection name.
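
The two forms can be compared side by side. The sketch below passes several options joined with `&`, first through the `options` parameter and then inside a URI. The table names are hypothetical, and placing options after `?` in the URI follows the standard MongoDB connection string format, which this page does not spell out, so treat the second statement as an assumption to verify against your deployment.

```sql
-- Form 1: separate parameters, with options joined by '&' as in the tip above.
CREATE TABLE mongo_table_opts
(
    key UInt64,
    data String
) ENGINE = MongoDB('mongo1:27017', 'test', 'simple_table', 'testuser', 'clickhouse',
                   'connectTimeoutMS=10000&ssl=true');

-- Form 2: a single URI plus the collection name.
-- Assumption: options are appended after '?' per the MongoDB connection string format.
CREATE TABLE mongo_table_uri
(
    key UInt64,
    data String
) ENGINE = MongoDB('mongodb://testuser:clickhouse@mongo1:27017/test?connectTimeoutMS=10000&ssl=true',
                   'simple_table');
```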

Usage Example

Create a table in ClickHouse that reads data from a MongoDB collection:

CREATE TABLE mongo_table
(
    key UInt64,
    data String
) ENGINE = MongoDB('mongo1:27017', 'test', 'simple_table', 'testuser', 'clickhouse');

or

ENGINE = MongoDB('mongodb://testuser:clickhouse@mongo1:27017/test', 'simple_table');

To read from an SSL-secured MongoDB server:

CREATE TABLE mongo_table_ssl
(
    key UInt64,
    data String
) ENGINE = MongoDB('mongo2:27017', 'test', 'simple_table', 'testuser', 'clickhouse', 'ssl=true');

Query:

SELECT COUNT() FROM mongo_table;
┌─count()─┐
│       4 │
└─────────┘

You can also adjust the connection timeout:

CREATE TABLE mongo_table
(
    key UInt64,
    data String
) ENGINE = MongoDB('mongo2:27017', 'test', 'simple_table', 'testuser', 'clickhouse', 'connectTimeoutMS=100000');

Troubleshooting

You can see the generated MongoDB query in DEBUG level logs.
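
One way to surface those logs during an interactive session is sketched below, assuming you are connected with `clickhouse-client` and querying the `mongo_table` table from the usage example. The `send_logs_level` setting forwards server log messages to the client. The exact wording of the log line that contains the MongoDB query is not shown here because it is not documented on this page.

```sql
-- Forward server log messages at DEBUG level to the client for this session.
SET send_logs_level = 'debug';

-- Run a query against the MongoDB-backed table and inspect the DEBUG output
-- for the query that was generated for MongoDB.
SELECT key, data
FROM mongo_table
WHERE key = 1
LIMIT 1;
```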

Implementation details can be found in the mongocxx and mongoc documentation.