84739 Commits

Author SHA1 Message Date
Robert Schulze
514d4d2187
Implement ProtobufList - fixes ClickHouse#16436
Introduce IO format "ProtobufList" with protobuf schema

    // schemafile.proto
    message Envelope {
      message MessageType {
        uint32 colA = 1;
        string colB = 2;
      }
      repeated MessageType mt = 1;
    }

where "Envelope" is a hard-coded/expected top-level message and
"MessageType" is a message with a user-provided name that contains the
table fields to export/import, e.g.

    SELECT * FROM db1.tab1 FORMAT ProtobufList SETTINGS format_schema =
    'schemafile:MessageType'

As a result, the new format wraps a list of messages (one per row) into
a single, containing message. Compare that to the schema of the existing
IO formats "Protobuf" and "ProtobufSingle":

    message MessageType {
      uint32 colA = 1;
      string colB = 2;
    }

The new format does not save space compared to the existing formats, but
it is conceptually cleaner and also more convenient.
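
To illustrate the size claim, here is a minimal hand-rolled sketch of the
two layouts. It assumes standard protobuf wire encoding and that the
existing "Protobuf" format prefixes each row message with its length as a
varint. All function names are made up for this example, both fields are
always written, and any outer length prefix on the envelope is left out.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_row(col_a: int, col_b: str) -> bytes:
    """Encode one MessageType: colA = field 1 (varint), colB = field 2 (bytes)."""
    text = col_b.encode("utf-8")
    return b"\x08" + encode_varint(col_a) + b"\x12" + encode_varint(len(text)) + text


def encode_protobuf_stream(rows) -> bytes:
    """Assumed layout of the existing format: length prefix + message, per row."""
    out = bytearray()
    for col_a, col_b in rows:
        msg = encode_row(col_a, col_b)
        out += encode_varint(len(msg)) + msg
    return bytes(out)


def encode_envelope(rows) -> bytes:
    """Envelope body: each row is an entry of repeated field 1 (tag 0x0A)."""
    out = bytearray()
    for col_a, col_b in rows:
        msg = encode_row(col_a, col_b)
        out += b"\x0a" + encode_varint(len(msg)) + msg
    return bytes(out)


rows = [(1, "a"), (300, "bc")]
print(len(encode_protobuf_stream(rows)))  # 14
print(len(encode_envelope(rows)))         # 16
```

In this sketch the envelope costs one extra tag byte per row, which is
consistent with the statement that the new format does not save space.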

Implementation details:

- Created new files ProtobufList(Input|Output)Format which use the
  existing ProtobufSerializer mechanism. The goal was to reuse as much
  code as possible and avoid copypasta.

- I was torn between inheriting from I(Input|Output)Format vs.
  IRow(Input|Output)Format for ProtobufList(Input|Output)Format. The
  former is chunk-based, which can be better for performance. Since the
  ProtobufSerializer mechanism is row-based but data is generally passed
  around in chunks, I chose the latter to leverage the existing
  chunk <--> row mapping code in IRow(Input|Output)Format.

- A new ProtobufSerializer called ProtobufSerializerEnvelope was
  introduced (--> ProtobufSerializer.cpp). It represents the top-level
  message which encloses the list of inner nested messages, i.e. the
  rows.

- With the new format, parsing the schema file and matching the fields in
  the schema file to table columns works as in the old formats. The only
  difference is that parsing starts one level below the "Envelope" (-->
  ProtobufSchema.cpp). This is more natural than forcing customers to
  have table columns start with "Envelope".

- Creation of the ProtobufSerializer tree also works like before. What
  is different is that we finally add a ProtobufSerializerEnvelope as
  the new root of the tree. Its only purpose is to write/read the
  top-level message when the first/last row is written/read.

Caveats:

- The low-level serialization code in ProtobufWriter uses an internal
  buffer which is flushed to the output file only in endMessage().
  In the existing "Protobuf" format, this happens once per row; in the
  new format, it happens only at the end of the serialization
  since row-level messages now call start/endNestedMessage(). As a
  future TODO, the buffer should also be flushed in
  start/endNestedMessage() to reduce memory consumption.
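
A toy model of this caveat, as a rough sketch only: ToyWriter and its
methods are hypothetical stand-ins, not the real ProtobufWriter API. The
only behaviour modelled is "flush on end of top-level message, never on
end of nested message", which is enough to show how the peak buffer size
grows with the number of rows.

```python
import io


class ToyWriter:
    """Hypothetical writer that flushes only when a top-level message ends."""

    def __init__(self, sink):
        self.sink = sink
        self.buffer = bytearray()
        self.peak = 0  # largest buffer size seen so far

    def write(self, payload: bytes) -> None:
        self.buffer += payload
        self.peak = max(self.peak, len(self.buffer))

    def end_message(self) -> None:
        # Top-level message finished: hand the bytes to the sink.
        self.sink.write(bytes(self.buffer))
        self.buffer.clear()

    def end_nested_message(self) -> None:
        # Nested message finished: nothing is flushed, bytes stay buffered.
        pass


def peak_per_row(rows) -> int:
    """One top-level message per row, as in the existing format."""
    writer = ToyWriter(io.BytesIO())
    for row in rows:
        writer.write(row)
        writer.end_message()
    return writer.peak


def peak_enveloped(rows) -> int:
    """All rows nested inside one top-level message, as in the new format."""
    writer = ToyWriter(io.BytesIO())
    for row in rows:
        writer.write(row)
        writer.end_nested_message()
    writer.end_message()
    return writer.peak


rows = [b"x" * 10] * 1000
print(peak_per_row(rows))    # 10
print(peak_enveloped(rows))  # 10000
```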
2022-03-14 08:04:58 +01:00
Denny Crane
39c6428636 Doc. named connections 2022-03-14 00:35:02 -03:00
Rich Raposa
928538f04b
Update quotas.md
Missed this comment from Alexey earlier
2022-03-13 21:34:43 -06:00
Rich Raposa
6fbb63b30c
Update internal-dicts.md
The mentioned functions have already been removed
2022-03-13 21:31:48 -06:00
Denny Crane
0b4c3e5be9 Doc. named connections 2022-03-14 00:31:20 -03:00
Rich Raposa
fa3c3f9179
Update rounding-functions.md
Adding a clarification about the use case of `roundDuration`
2022-03-13 21:27:02 -06:00
Rich Raposa
67587a8ed0
Update json-functions.md
Clarified the wording about the assumptions - which only apply to the `visitParam` functions
2022-03-13 21:19:29 -06:00
taiyang-li
8da041fc12 fix test 02117_show_create_table_system 2022-03-14 10:40:45 +08:00
mergify[bot]
cba9c03d18
Merge branch 'master' into change-timezone-in-stateful-tests 2022-03-14 01:28:19 +00:00
Alexey Milovidov
4712499b83
Merge pull request #35247 from ClickHouse/add-test-34682
Add a test for #34682
2022-03-14 04:26:33 +03:00
Alexey Milovidov
eb1192934c
Merge pull request #35249 from azat/fix-01506_buffer_table_alter_block_structure_2
Fix possible 01506_buffer_table_alter_block_structure_2 flakiness
2022-03-14 04:25:32 +03:00
Denny Crane
7e5589fd78 Doc. named connections 2022-03-13 21:38:00 -03:00
Maksim Kita
ded4c8430c
Merge pull request #35242 from ClickHouse/remove-bugs-2
Remove "bugs" that do not exist anymore
2022-03-14 00:59:08 +01:00
Maksim Kita
ce0c8e5597
Update JSONRowOutputFormat.cpp 2022-03-14 00:58:36 +01:00
Azat Khuzhin
19be9c8c64 Add a comment for ColumnAggregateFunction::force_data_ownership
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-13 23:28:38 +03:00
Azat Khuzhin
619bed4371 Fix possible 01506_buffer_table_alter_block_structure_2 flakiness
SELECT from a Buffer table is racy, so you can get data from the
underlying table but not from the Buffer itself, since the Buffer can
flush its data to the underlying table in parallel with the SELECT.

It is hard to avoid with the current architecture, since this would
require holding a lock until the data has been read from the Buffer, and
this is not a good alternative.

So let's fix the test instead, by not relying on background flush (TTL
increased).

Here is an example of a test failure [1]:

    2022.03.12 20:56:58.141182 [ 678 ] {011e7d25-82a9-4ab6-8cb0-dcbbc84f9581} <Debug> executeQuery: (from [::1]:33324) (comment: 01506_buffer_table_alter_block_structure_2.sql) SELECT * FROM buf ORDER BY timestamp;
    2022.03.12 20:56:58.162709 [ 678 ] {011e7d25-82a9-4ab6-8cb0-dcbbc84f9581} <Trace> MergeTreeInOrderSelectProcessor: Reading 1 ranges in order from part 20200101_1_1_0, approx. 1 rows starting from 0
    2022.03.12 20:56:59.144663 [ 615 ] {} <Trace> test_bdtzgu.buf_dest (79ba36b2-0e90-4bbb-b55f-a42b605b362b): Renaming temporary part tmp_insert_20200101_2_2_0 to 20200101_2_2_0.
    2022.03.12 20:56:59.147550 [ 615 ] {} <Debug> StorageBuffer (test_bdtzgu.buf): Flushing buffer with 1 rows, 18 bytes, age 1 seconds, took 19 ms (bg).
    2022.03.12 20:56:59.391774 [ 678 ] {011e7d25-82a9-4ab6-8cb0-dcbbc84f9581} <Information> executeQuery: Read 1 rows, 13.00 B in 1.250102785 sec., 0 rows/sec., 10.40 B/sec.

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/0/044cd6b861c1f4f00c6c24c4020799b676de6d34/stateless_tests__memory__actions__[1/3].html
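
The race can be sketched as a deterministic toy simulation. This is not
ClickHouse code: racy_select and its arguments are hypothetical, and the
read order (underlying table first, then the Buffer) is an assumption
based on the log above.

```python
def racy_select(dest, buf, flush_between_reads):
    """Read the destination table first, then the buffer (assumed order)."""
    result = list(dest)      # step 1: read the underlying table
    if flush_between_reads:
        dest.extend(buf)     # background flush moves the rows...
        buf.clear()          # ...out of the buffer
    result += buf            # step 2: read whatever is left in the buffer
    return result


print(racy_select(["row1"], ["row2"], flush_between_reads=False))  # ['row1', 'row2']
print(racy_select(["row1"], ["row2"], flush_between_reads=True))   # ['row1']
```

With the flush in between, the flushed row is seen by neither read, which
matches the single row returned in the failing run.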

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-03-13 23:13:31 +03:00
Robert Schulze
f0ba39b071
Clean up some header includes and make formatting more consistent 2022-03-13 20:24:12 +01:00
mergify[bot]
638c7f8637
Merge branch 'master' into remove-bugs-2 2022-03-13 19:14:20 +00:00
Alexey Milovidov
b958edc104 Add a test for #34682 2022-03-13 20:12:10 +01:00
Maksim Kita
0dd807d19d
Merge pull request #34750 from kitaisreal/merge-tree-improve-insert-performance
MergeTree improve insert performance
2022-03-13 13:39:18 +01:00
Kseniia Sumarokova
c04b103e6c
Merge pull request #35245 from ClickHouse/kssenii-patch-3
Update CachedReadBufferFromRemoteFS.cpp
2022-03-13 13:33:00 +01:00
Kseniia Sumarokova
35e5b4e8a5
Update CachedReadBufferFromRemoteFS.cpp 2022-03-13 12:37:00 +01:00
Alexey Milovidov
d009263e37
Merge pull request #35243 from ClickHouse/revert-35225-docker-timezone-change
Revert "Change timezone in Docker"
2022-03-13 04:03:51 +03:00
Alexey Milovidov
d0716b035f
Revert "Change timezone in Docker" 2022-03-13 04:03:06 +03:00
Alexey Milovidov
8dedea4f8f
Merge pull request #35236 from rfraposa/master
Update references in docs
2022-03-12 23:03:22 +03:00
Alexey Milovidov
044cd6b861 Remove "ru" blog 2022-03-12 21:03:03 +01:00
Alexey Milovidov
978877a9c0
Merge pull request #35212 from rschu1ze/cpp14-trait-aliases
Use C++14 aliases for some type traits
2022-03-12 22:20:14 +03:00
Alexey Milovidov
2fe54b57b9 Addition to prev. revision 2022-03-12 20:16:25 +01:00
Alexey Milovidov
f0867ed7ea Moved another test 2022-03-12 20:16:25 +01:00
Alexey Milovidov
4a92a8a732 Remove "bugs" that do not exist anymore 2022-03-12 20:16:25 +01:00
Alexey Milovidov
e1c8ffca0d
Merge pull request #35241 from ClickHouse/revert-35228-remove-bugs
Revert "Remove "bugs" that do not exist anymore"
2022-03-12 22:12:22 +03:00
Alexey Milovidov
a9f0c66475
Revert "Remove "bugs" that do not exist anymore" 2022-03-12 22:11:49 +03:00
Alexey Milovidov
7c84b33918 Update test references 2022-03-12 20:10:34 +01:00
Alexey Milovidov
3385275003
Merge pull request #35226 from ClickHouse/timezone-in-config
Change timezone example in server config
2022-03-12 21:58:29 +03:00
Maksim Kita
b67f756a43 Fixed performance tests 2022-03-12 18:04:08 +00:00
Maksim Kita
3a2b3ce503 Standardize behaviour of CAST into IPv4, IPv6, toIPv4, toIPv6 functions 2022-03-12 17:12:05 +00:00
Maksim Kita
3ccc5e2e82
Merge pull request #35228 from ClickHouse/remove-bugs
Remove "bugs" that do not exist anymore
2022-03-12 18:09:21 +01:00
rfraposa
ecbdfdea08 Incorporated feedback 2022-03-12 10:04:51 -06:00
Alexey Milovidov
1f084f8a5d
Merge pull request #35225 from ClickHouse/docker-timezone-change
Change timezone in Docker
2022-03-12 16:05:03 +03:00
Alexey Milovidov
2b5e42e997
Merge pull request #35227 from ClickHouse/change-timezone-in-performance-tests
Adjust timezone in performance tests
2022-03-12 15:48:38 +03:00
Alexey Milovidov
bbb3a895e7
Merge pull request #35232 from ClickHouse/submodules-in-main-repo
Moved submodules from ClickHouse-Extras to ClickHouse
2022-03-12 15:34:47 +03:00
Robert Schulze
6fc6d3d452
Remove runtime conditional using constexpr if 2022-03-12 10:41:15 +01:00
rfraposa
5a4466cec7 Update references in docs 2022-03-12 00:24:31 -06:00
zzsmdfj
88560c3917 to #35128_add_mysql_error__detail 2022-03-12 11:10:26 +08:00
Alexey Milovidov
261806e897
Merge pull request #35223 from ClickHouse/testflows-remove-redundant-configs
Remove redundant configs for TestFlows
2022-03-12 05:22:37 +03:00
Alexey Milovidov
451fbae076
Merge pull request #35230 from ClickHouse/change-examples-in-docs
Change examples in docs
2022-03-12 04:19:33 +03:00
Alexey Milovidov
c837057b6b Remove unused files from blog 2022-03-12 02:15:09 +01:00
Alexey Milovidov
fbb5547d0f Moved submodules from ClickHouse-Extras to ClickHouse 2022-03-12 02:11:07 +01:00
Alexey Milovidov
99f081d17e Adapted example 2022-03-12 00:42:34 +01:00
Alexey Milovidov
53d59bb88c
Update README.md 2022-03-12 02:39:24 +03:00