slug | sidebar_position | sidebar_label |
---|---|---|
/en/engines/table-engines/integrations/mongodb | 135 | MongoDB |
MongoDB
The MongoDB engine is a read-only table engine that reads data from a remote MongoDB collection.
Only MongoDB v3.6+ servers are supported.
If you run into problems, please report the issue and try the legacy implementation. Keep in mind that the legacy implementation is deprecated and will be removed in an upcoming release.
Type mappings
MongoDB | ClickHouse |
---|---|
bool, int32, int64 | any numeric type, String |
int32 | Int32, String |
int64 | Int64, String |
double | Float64, String |
date | Date, Date32, DateTime, DateTime64, String |
string | String, UUID |
document | String (as JSON) |
array | Array, String (as JSON) |
oid | String |
binary | String if in column, base64 encoded string if in an array or document |
any other | String |
If a key is not found in the MongoDB document, the default value, or NULL if the column is nullable, is inserted.
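As a rough illustration of the mapping table, the sketch below declares a table over a hypothetical users collection. The collection name, the field names, and the chosen column types are assumptions made for this example and do not come from a real schema.

```sql
-- Hypothetical collection 'users'; all column names are illustrative.
CREATE TABLE mongo_users
(
    _id        String,            -- oid is read as String
    name       String,            -- string
    age        Nullable(Int32),   -- int32; NULL if the key is missing
    score      Float64,           -- double; default value if the key is missing
    created_at DateTime,          -- date
    tags       Array(String),     -- array
    profile    String             -- nested document, returned as JSON text
) ENGINE = MongoDB('mongo1:27017', 'test', 'users', 'testuser', 'password');
```

Declaring a column as Nullable is one way to tell a missing key apart from a stored default-like value such as 0 or an empty string.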
Supported clauses
You can disable all of these restrictions; see mongodb_fail_on_query_build_error.
If allow_experimental_analyzer=0, ClickHouse will not try to build the MongoDB query, sort, and limit.
You can use a MongoDB table in a CTE to perform any clauses, but be aware that in some cases performance will be significantly degraded.
For example, suppose you want to query count() with GROUP BY (which is not supported by the MongoDB engine):
SELECT count(), name FROM mongo_table WHERE name IN ('clickhouse', 'mongodb') GROUP BY name;
You can set mongodb_fail_on_query_build_error=0, but this will cause poor performance, because all data will be read from mongo_table before filtering by name.
A better solution is to apply the filter in a subquery and aggregate on the ClickHouse side:
SELECT count(), name
FROM (SELECT name FROM mongo_table WHERE name in ('clickhouse', 'mongodb'))
GROUP BY name;
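Since the text above mentions CTEs, the same rewrite can presumably be expressed with a WITH clause as well. This is a sketch that reuses mongo_table and the name column from the example; the CTE name filtered is made up here, and it is worth confirming in the logs that the filter still reaches MongoDB.

```sql
-- 'filtered' is a hypothetical CTE name; the inner SELECT uses only a supported WHERE filter.
WITH filtered AS
(
    SELECT name
    FROM mongo_table
    WHERE name IN ('clickhouse', 'mongodb')
)
-- Grouping and counting happen in ClickHouse, on the already filtered rows.
SELECT count(), name
FROM filtered
GROUP BY name;
```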
WHERE
Only constant literals are allowed.
PREWHERE and HAVING are not supported.
Note:
It's always better to set the type of a literal explicitly, because MongoDB requires strictly typed filters.
For example, suppose you want to filter by Date:
SELECT * FROM mongo_table WHERE date = '2024-01-01'
This will not work because MongoDB will not cast a string to Date, so you need to cast it manually:
SELECT * FROM mongo_table WHERE date = '2024-01-01'::Date OR date = toDate('2024-01-01')
This applies to Date, Date32, DateTime, Bool, and UUID.
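The same idea for some of the other listed types might look like the sketch below. The column names created_at, id, and is_active are hypothetical, and the UUID value is only a placeholder.

```sql
-- Column names are illustrative; each literal is given an explicit type.
SELECT * FROM mongo_table WHERE created_at = toDateTime('2024-01-01 00:00:00');

SELECT * FROM mongo_table WHERE id = toUUID('61f0c404-5cb3-11e7-907b-a6006ad3dba0');

SELECT * FROM mongo_table WHERE is_active = true;
```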
LIMIT and OFFSET
Only LIMIT is supported.
ORDER BY
Only simple expressions are supported, without modifiers such as COLLATE, WITH, TO, etc.
WINDOW
Not supported.
GROUP BY
Not supported.
Aggregation functions
Not supported.
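Putting the clause rules together, a query that stays within the supported subset could look like the first statement below, while the second wraps the supported part in a subquery so that the aggregation runs in ClickHouse. Both are sketches against the mongo_table example used elsewhere on this page; the filter value is an assumption.

```sql
-- Constant WHERE filter, plain ORDER BY, and LIMIT without OFFSET.
SELECT key, data
FROM mongo_table
WHERE data = 'example'
ORDER BY key
LIMIT 10;

-- GROUP BY and aggregates are not supported by the engine,
-- so only the inner SELECT is sent to the MongoDB table.
SELECT data, count()
FROM (SELECT data FROM mongo_table WHERE data = 'example')
GROUP BY data;
```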
Creating a Table
CREATE TABLE [IF NOT EXISTS] [db.]table_name
(
name1 [type1],
name2 [type2],
...
) ENGINE = MongoDB(host:port, database, collection, user, password [, options]);
Engine Parameters
- host:port — MongoDB server address.
- database — Remote database name.
- collection — Remote collection name.
- user — MongoDB user.
- password — User password.
- options — MongoDB connection string options (optional parameter).
:::tip If you are using the MongoDB Atlas cloud offering:
- The connection URL can be obtained from the 'Atlas SQL' option.
- Use options: 'connectTimeoutMS=10000&ssl=true&authSource=admin'
:::
Also, you can simply pass a URI:
ENGINE = MongoDB(uri, collection);
Engine Parameters
- uri — MongoDB server's connection URI.
- collection — Remote collection name.
Usage Example
Create a table in ClickHouse that reads data from a MongoDB collection:
CREATE TABLE mongo_table
(
key UInt64,
data String
) ENGINE = MongoDB('mongo1:27017', 'test', 'simple_table', 'testuser', 'password');
or
ENGINE = MongoDB('mongodb://testuser:password@mongo1:27017/test', 'simple_table');
To read from an SSL secured MongoDB server:
CREATE TABLE mongo_table_ssl
(
key UInt64,
data String
) ENGINE = MongoDB('mongo2:27017', 'test', 'simple_table', 'testuser', 'password', 'ssl=true');
Query:
SELECT COUNT() FROM mongo_table;
┌─count()─┐
│       4 │
└─────────┘
You can also adjust the connection timeout:
CREATE TABLE mongo_table
(
key UInt64,
data String
) ENGINE = MongoDB('mongo2:27017', 'test', 'simple_table', 'testuser', 'password', 'connectTimeoutMS=100000');
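With the URI form, the same options would presumably go into the query string of the connection URI, following standard MongoDB connection string syntax. The sketch below assumes this, and the table name mongo_table_uri is made up for the example.

```sql
-- Assumption: options are appended after '?' and joined with '&', as in a regular MongoDB URI.
CREATE TABLE mongo_table_uri
(
    key UInt64,
    data String
) ENGINE = MongoDB('mongodb://testuser:password@mongo2:27017/test?connectTimeoutMS=100000&ssl=true', 'simple_table');
```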
Troubleshooting
You can see the generated MongoDB query in DEBUG level logs.
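One way to see those logs without opening the server log file is to raise the log level for the current session. The sketch below assumes you are working in clickhouse-client, where the send_logs_level setting forwards server logs to the client; the query itself is only an example, and the exact wording of the log message is not shown here.

```sql
-- Forward server-side log messages at debug level and above to this session.
SET send_logs_level = 'debug';

-- Run the query to inspect; the generated MongoDB query should appear among the debug messages.
SELECT key, data
FROM mongo_table
WHERE data = 'example'
LIMIT 10;
```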
Implementation details can be found in the mongocxx and mongoc documentation.