ClickHouse/docs/en/engines/table-engines/integrations/mongodb.md
Kirill Nikiforov 9425be31b3
fix dates
2024-07-29 23:39:19 +03:00

slug sidebar_position sidebar_label
/en/engines/table-engines/integrations/mongodb 135 MongoDB

MongoDB

The MongoDB engine is a read-only table engine that allows reading data from a remote MongoDB collection.

Only MongoDB v3.6+ servers are supported.

If you run into problems, please report the issue and try the legacy implementation. Keep in mind that it is deprecated and will be removed in a future release.

Type mappings

| MongoDB            | ClickHouse                                                            |
|--------------------|-----------------------------------------------------------------------|
| bool, int32, int64 | any numeric type, String                                              |
| int32              | Int32, String                                                         |
| int64              | Int64, String                                                         |
| double             | Float64, String                                                       |
| date               | Date, Date32, DateTime, DateTime64, String                            |
| string             | String, UUID                                                          |
| document           | String (as JSON)                                                      |
| array              | Array, String (as JSON)                                               |
| oid                | String                                                                |
| binary             | String if in column, base64 encoded string if in an array or document |
| any other          | String                                                                |

If a key is not found in the MongoDB document, the default value, or NULL if the column is nullable, is inserted.
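As a rough sketch of how these mappings might be applied, the table definition below reads a hypothetical `users` collection. The collection, column names, and credentials are assumptions made for illustration and do not come from this page.

```sql
-- Hypothetical collection `users` in database `test`; all names are illustrative.
CREATE TABLE mongo_users
(
    _id        String,            -- oid is read as String
    name       String,            -- string
    age        Int32,             -- int32; gets the default value when the key is missing
    nickname   Nullable(String),  -- nullable column: NULL when the key is missing
    created_at DateTime64(3),     -- date
    tags       Array(String),     -- array
    profile    String             -- document, read as a JSON string
) ENGINE = MongoDB('mongo1:27017', 'test', 'users', 'testuser', 'password');
```

Declaring a column as `Nullable` is one way to tell a missing key apart from a stored value that happens to equal the default.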

Supported clauses

You can disable all these restrictions; see mongodb_fail_on_query_build_error.
If allow_experimental_analyzer=0, ClickHouse does not attempt to build the MongoDB query, sort, or limit.
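A minimal sketch of changing the setting, assuming it behaves like any other ClickHouse setting and can be set per session or per query. The `mongo_table` name and its `name` column are illustrative.

```sql
-- Session level: do not fail on clauses that cannot be translated to a MongoDB query.
SET mongodb_fail_on_query_build_error = 0;

-- Query level: the same setting applied to a single statement.
SELECT count(), name
FROM mongo_table
GROUP BY name
SETTINGS mongodb_fail_on_query_build_error = 0;
```

With the setting at 0, the unsupported parts of the query run on the ClickHouse side, so more data may be transferred from MongoDB than with a fully translated query.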

You can use a MongoDB table in a CTE to apply any clause, but be aware that in some cases performance will degrade significantly.

For example, suppose you want to run count() with GROUP BY (which is not supported by the MongoDB engine):

SELECT count(), name FROM mongo_table WHERE name IN ('clickhouse', 'mongodb') GROUP BY name;

You can set mongodb_fail_on_query_build_error=0, but performance will be poor because all data is read from mongo_table before it is filtered by name.
A better solution is to filter in a subquery and aggregate over its result:

SELECT count(), name
FROM (SELECT name FROM mongo_table WHERE name IN ('clickhouse', 'mongodb'))
GROUP BY name;

WHERE

Only constant literals are allowed.

PREWHERE and HAVING are not supported.

Note:

It is always better to set the type of a literal explicitly, because MongoDB requires strictly typed filters.
For example, suppose you want to filter by Date:

SELECT * FROM mongo_table WHERE date = '2024-01-01'

This will not work because MongoDB does not cast a string to Date, so you need to cast it manually:

SELECT * FROM mongo_table WHERE date = '2024-01-01'::Date OR date = toDate('2024-01-01')

This applies to Date, Date32, DateTime, Bool, and UUID.
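The same idea can be sketched for the other types listed above. The column names `created_at`, `is_active`, and `user_id` are hypothetical and only show the shape of the casts.

```sql
-- Column names below are hypothetical; only the explicit casts matter.
SELECT * FROM mongo_table WHERE created_at = toDateTime('2024-01-01 00:00:00');
SELECT * FROM mongo_table WHERE is_active = true::Bool;
SELECT * FROM mongo_table WHERE user_id = toUUID('123e4567-e89b-12d3-a456-426614174000');
```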

LIMIT and OFFSET

Only LIMIT is supported.

ORDER BY

Only simple expressions are supported, without modifiers such as COLLATE, WITH, TO, etc.
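Putting the WHERE, LIMIT, and ORDER BY rules together, a query of the following shape should stay within the supported subset. This is a sketch that assumes a `mongo_table` with `key` and `data` columns, as in the usage example on this page.

```sql
-- Constant literal in WHERE, plain column in ORDER BY, LIMIT without OFFSET.
SELECT key, data
FROM mongo_table
WHERE data = 'clickhouse'
ORDER BY key
LIMIT 10;
```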

WINDOW

Not supported.

GROUP BY

Not supported.

Aggregation functions

Not supported.

Creating a Table

CREATE TABLE [IF NOT EXISTS] [db.]table_name
(
    name1 [type1],
    name2 [type2],
    ...
) ENGINE = MongoDB(host:port, database, collection, user, password [, options]);

Engine Parameters

  • host:port — MongoDB server address.

  • database — Remote database name.

  • collection — Remote collection name.

  • user — MongoDB user.

  • password — User password.

  • options — MongoDB connection string options (optional parameter).

:::tip If you are using the MongoDB Atlas cloud offering:

- the connection URL can be obtained from the 'Atlas SQL' option
- use the options: 'connectTimeoutMS=10000&ssl=true&authSource=admin'

:::
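As an illustration of the tip above, the options string can be passed as the last engine parameter. The host, database, collection, and credentials here are placeholders, not real Atlas values.

```sql
-- Host, database, collection, and credentials below are placeholders.
CREATE TABLE mongo_atlas_table
(
    key UInt64,
    data String
) ENGINE = MongoDB('atlas-host.example.net:27017', 'test', 'simple_table', 'testuser', 'password', 'connectTimeoutMS=10000&ssl=true&authSource=admin');
```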

Also, you can simply pass a URI:

ENGINE = MongoDB(uri, collection);

Engine Parameters

  • uri — MongoDB server's connection URI

  • collection — Remote collection name.

Usage Example

Create a table in ClickHouse that reads data from a MongoDB collection:

CREATE TABLE mongo_table
(
    key UInt64,
    data String
) ENGINE = MongoDB('mongo1:27017', 'test', 'simple_table', 'testuser', 'password');

or

ENGINE = MongoDB('mongodb://testuser:password@mongo1:27017/test', 'simple_table');

To read from an SSL-secured MongoDB server:

CREATE TABLE mongo_table_ssl
(
    key UInt64,
    data String
) ENGINE = MongoDB('mongo2:27017', 'test', 'simple_table', 'testuser', 'password', 'ssl=true');

Query:

SELECT COUNT() FROM mongo_table;
┌─count()─┐
│       4 │
└─────────┘

You can also adjust the connection timeout:

CREATE TABLE mongo_table
(
    key UInt64,
    data String
) ENGINE = MongoDB('mongo2:27017', 'test', 'simple_table', 'testuser', 'password', 'connectTimeoutMS=100000');

Troubleshooting

You can see the generated MongoDB query in DEBUG level logs.
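One way to view these logs without opening the server log file is to forward them to the client session. This sketch assumes clickhouse-client and the standard `send_logs_level` setting. The exact text of the log message is not shown because it is not specified on this page.

```sql
-- In clickhouse-client: forward server log messages of DEBUG level and above to the client.
SET send_logs_level = 'debug';

-- Run the query under investigation; the generated MongoDB query should appear among the log lines.
SELECT key, data FROM mongo_table WHERE data = 'clickhouse' LIMIT 1;
```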

Implementation details can be found in the mongocxx and mongoc documentation.