ClickHouse/docs/en/sql-reference/table-functions/postgresql.md
2024-08-01 23:10:53 -07:00

6.0 KiB

slug sidebar_position sidebar_label
/en/sql-reference/table-functions/postgresql 160 postgresql

postgresql

Allows SELECT and INSERT queries to be performed on data that is stored on a remote PostgreSQL server.

Syntax

postgresql({host:port, database, table, user, password[, schema, [, on_conflict]] | named_collection[, option=value [,..]]})

Parameters

  • host:port — PostgreSQL server address.
  • database — Remote database name.
  • table — Remote table name.
  • user — PostgreSQL user.
  • password — User password.
  • schema — Non-default table schema. Optional.
  • on_conflict — Conflict resolution strategy. Example: ON CONFLICT DO NOTHING. Optional.

Arguments also can be passed using named collections. In this case host and port should be specified separately. This approach is recommended for production environment.

Returned Value

A table object with the same columns as the original PostgreSQL table.

:::note In the INSERT query to distinguish table function postgresql(...) from table name with column names list you must use keywords FUNCTION or TABLE FUNCTION. See examples below. :::

Implementation Details

SELECT queries on PostgreSQL side run as COPY (SELECT ...) TO STDOUT inside read-only PostgreSQL transaction with commit after each SELECT query.

Simple WHERE clauses such as =, !=, >, >=, <, <=, and IN are executed on the PostgreSQL server.

All joins, aggregations, sorting, IN [ array ] conditions and the LIMIT sampling constraint are executed in ClickHouse only after the query to PostgreSQL finishes.

INSERT queries on PostgreSQL side run as COPY "table_name" (field1, field2, ... fieldN) FROM STDIN inside PostgreSQL transaction with auto-commit after each INSERT statement.

PostgreSQL Array types converts into ClickHouse arrays.

:::note Be careful, in PostgreSQL an array data type column like Integer[] may contain arrays of different dimensions in different rows, but in ClickHouse it is only allowed to have multidimensional arrays of the same dimension in all rows. :::

Supports multiple replicas that must be listed by |. For example:

SELECT name FROM postgresql(`postgres{1|2|3}:5432`, 'postgres_database', 'postgres_table', 'user', 'password');

or

SELECT name FROM postgresql(`postgres1:5431|postgres2:5432`, 'postgres_database', 'postgres_table', 'user', 'password');

Supports replicas priority for PostgreSQL dictionary source. The bigger the number in map, the less the priority. The highest priority is 0.

Examples

Table in PostgreSQL:

postgres=# CREATE TABLE "public"."test" (
"int_id" SERIAL,
"int_nullable" INT NULL DEFAULT NULL,
"float" FLOAT NOT NULL,
"str" VARCHAR(100) NOT NULL DEFAULT '',
"float_nullable" FLOAT NULL DEFAULT NULL,
PRIMARY KEY (int_id));

CREATE TABLE

postgres=# INSERT INTO test (int_id, str, "float") VALUES (1,'test',2);
INSERT 0 1

postgresql> SELECT * FROM test;
  int_id | int_nullable | float | str  | float_nullable
 --------+--------------+-------+------+----------------
       1 |              |     2 | test |
(1 row)

Selecting data from ClickHouse using plain arguments:

SELECT * FROM postgresql('localhost:5432', 'test', 'test', 'postgresql_user', 'password') WHERE str IN ('test');

Or using named collections:

CREATE NAMED COLLECTION mypg AS
        host = 'localhost',
        port = 5432,
        database = 'test',
        user = 'postgresql_user',
        password = 'password';
SELECT * FROM postgresql(mypg, table='test') WHERE str IN ('test');
┌─int_id─┬─int_nullable─┬─float─┬─str──┬─float_nullable─┐
│      1 │         ᴺᵁᴸᴸ │     2 │ test │           ᴺᵁᴸᴸ │
└────────┴──────────────┴───────┴──────┴────────────────┘

Inserting:

INSERT INTO TABLE FUNCTION postgresql('localhost:5432', 'test', 'test', 'postgrsql_user', 'password') (int_id, float) VALUES (2, 3);
SELECT * FROM postgresql('localhost:5432', 'test', 'test', 'postgresql_user', 'password');
┌─int_id─┬─int_nullable─┬─float─┬─str──┬─float_nullable─┐
│      1 │         ᴺᵁᴸᴸ │     2 │ test │           ᴺᵁᴸᴸ │
│      2 │         ᴺᵁᴸᴸ │     3 │      │           ᴺᵁᴸᴸ │
└────────┴──────────────┴───────┴──────┴────────────────┘

Using Non-default Schema:

postgres=# CREATE SCHEMA "nice.schema";

postgres=# CREATE TABLE "nice.schema"."nice.table" (a integer);

postgres=# INSERT INTO "nice.schema"."nice.table" SELECT i FROM generate_series(0, 99) as t(i)
CREATE TABLE pg_table_schema_with_dots (a UInt32)
        ENGINE PostgreSQL('localhost:5432', 'clickhouse', 'nice.table', 'postgrsql_user', 'password', 'nice.schema');

See Also

Replicating or migrating Postgres data with with PeerDB

In addition to table functions, you can always use PeerDB by ClickHouse to set up a continuous data pipeline from Postgres to ClickHouse. PeerDB is a tool designed specifically to replicate data from Postgres to ClickHouse using change data capture (CDC).