Commit Graph

4023 Commits

Author SHA1 Message Date
alexX512
ea07f08f75 Fix bug with wrong checking of execution finish in PullingAsyncPipeline 2023-02-05 13:23:26 +00:00
flynn
f89a6cf68d Improve format JSONColumns when result is empty 2023-02-05 13:13:21 +00:00
Igor Nikonov
f49f5d7091 Update tests
+ updadte sorting properties after applying aggregation in order
2023-02-03 17:10:31 +00:00
Anton Popov
88f2068bfb
Merge branch 'master' into fix-sparse-columns-crash 2023-02-03 16:01:11 +01:00
Nikita Mikhaylov
33877b5e00
Parallel replicas. Part [2] (#43772) 2023-02-03 14:34:18 +01:00
Igor Nikonov
ed00db7580 Update sorting properties after reading in order applied 2023-02-03 13:15:06 +00:00
Robert Schulze
6908093787
Merge pull request #45973 from ClickHouse/final-codecs
Add "final" specifier to some classes
2023-02-03 10:52:59 +01:00
Anton Popov
ca20b1e32f fix INTERSECT and EXCEPT with sparse columns 2023-02-03 00:17:55 +00:00
Robert Schulze
3472ed4f0d
Make some classes 'final' 2023-02-02 15:59:39 +00:00
Antonio Andelic
a39e4e24c6
Merge branch 'master' into optimize_parquet_reader 2023-02-02 14:18:00 +01:00
Alexey Milovidov
97b6934ed6
Update ParquetBlockInputFormat.cpp 2023-02-02 02:42:21 +03:00
liuneng
17fc22a21e add parquet max_block_size setting 2023-02-01 18:29:20 +08:00
Robert Schulze
b512316586
Merge pull request #45682 from ClickHouse/rename-qrc-to-qc
Rename "Query Result Cache" to "Query Cache"
2023-02-01 11:23:29 +01:00
liuneng
cda9b0beea optimize parquet reader 2023-02-01 15:54:10 +08:00
Alexey Milovidov
075dfe9005
Merge pull request #45750 from ucasfl/arrow-duration
Arrow support duration type
2023-01-31 20:34:38 +03:00
Robert Schulze
325c6bdf3d
Renaming: "Query Result Cache" --> "Query Cache"
Reasons:

- The cache will at some point store intermediate results as opposed to
  only query results. We should change the terminology now without
  having to worry about backward compat.

- Equivalent caches in MySQL (1) and Starrocks (2) are called "query
  cache".

- The new name is ca. 13.8% more catchy.

(1) https://dev.mysql.com/doc/refman/5.6/en/query-cache.html
(2) https://docs.starrocks.io/en-us/2.5/using_starrocks/query_cache
2023-01-31 09:54:34 +00:00
flynn
801d6db486 Arrow support duration type 2023-01-29 10:55:31 +00:00
Alexey Milovidov
9ad87f9917 Fix style and typo 2023-01-29 03:51:42 +01:00
Igor Nikonov
300f78df96
Merge pull request #45567 from ClickHouse/enable-remove-redundant-sorting
Enable query_plan_remove_redundant_sorting optimization by default
2023-01-27 19:14:36 +01:00
Igor Nikonov
211edc054e Fix check for sum*(Float) 2023-01-26 19:26:53 +00:00
Alexander Tokmakov
067b1f5f13 Merge branch 'master' into exception_message_patterns4 2023-01-26 15:20:58 +01:00
Igor Nikonov
105d72e299 Fix: sum*() with Nullable(Float) 2023-01-26 13:17:06 +00:00
Igor Nikonov
e1302e1a32 Fix: sum*() with Floats
sum(Float) depends on order, do not remove sorting in such case
2023-01-25 22:39:28 +00:00
Azat Khuzhin
71286e779e Extend assertion in buildPushingToViewsChain() to respect is_detached
Follow-up for: #45566 (@tavplubix)
Refs: #43161 (@AlfVII)
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-25 15:42:06 +01:00
Alexander Tokmakov
d1baa7300c reformat ParsingException 2023-01-24 23:21:29 +01:00
Alexander Tokmakov
dd57215934 Merge branch 'master' into exception_message_patterns4 2023-01-24 17:03:12 +01:00
Kruglov Pavel
23c12ac8ee
Merge branch 'master' into parquet-fixed-binary 2023-01-24 16:51:05 +01:00
Kruglov Pavel
689fbea759
Merge pull request #45478 from Avogar/fix-arrow-abort
Fix possible aborts in arrow lib
2023-01-24 16:37:44 +01:00
Kruglov Pavel
4bd3f0e5ef
Merge pull request #44953 from Avogar/tsv-csv-detect-header
Detect header in CSV/TSV/CustomSeparated files automatically
2023-01-24 15:13:52 +01:00
Alexander Tokmakov
6ecae8388e Merge branch 'master' into exception_message_patterns4 2023-01-24 14:42:36 +01:00
Igor Nikonov
7108189b45 Merge remote-tracking branch 'origin/master' into revert-45414-revert-43905-igor/remove_redundant_order_by 2023-01-24 11:46:24 +00:00
Alexey Milovidov
851f6bf910
Merge pull request #45448 from azat/fix-query-hang
Fix possible (likely distributed) query hung
2023-01-24 13:00:37 +03:00
Alexander Tokmakov
c6910f39b9 fix 2023-01-24 01:11:58 +01:00
Kruglov Pavel
cd1cd904a7
Merge branch 'master' into tsv-csv-detect-header 2023-01-23 23:49:56 +01:00
Alexander Tokmakov
bb4c8e169f check number of parameters in format string 2023-01-23 23:16:16 +01:00
Alexander Tokmakov
3f6594f4c6 forbid old ctor of Exception 2023-01-23 22:18:05 +01:00
Alexander Tokmakov
70d1adfe4b
Better formatting for exception messages (#45449)
* save format string for NetException

* format exceptions

* format exceptions 2

* format exceptions 3

* format exceptions 4

* format exceptions 5

* format exceptions 6

* fix

* format exceptions 7

* format exceptions 8

* Update MergeTreeIndexGin.cpp

* Update AggregateFunctionMap.cpp

* Update AggregateFunctionMap.cpp

* fix
2023-01-24 00:13:58 +03:00
Kruglov Pavel
478a552a0a
Merge branch 'master' into tsv-csv-detect-header 2023-01-23 21:47:17 +01:00
Robert Schulze
0ab7ae6c17
Merge pull request #43797 from ClickHouse/query-result-cache
Query result cache [experimental]
2023-01-23 19:54:35 +01:00
Maksim Kita
7b48c75e82
Merge pull request #45485 from kitaisreal/remove-unnecessary-get-total-row-count-function-calls
Remove unnecessary getTotalRowCount function calls
2023-01-23 21:02:51 +03:00
Kruglov Pavel
af2c1bac6a
Fix typo 2023-01-23 17:13:16 +01:00
Kruglov Pavel
01ddf326ac
Merge branch 'master' into parquet-fixed-binary 2023-01-23 15:31:45 +01:00
Kruglov Pavel
84200be7d2
Better comment 2023-01-23 15:31:07 +01:00
Anton Popov
f181254fb0 fix race in destructor of ParallelParsingInputFormat 2023-01-23 01:18:58 +00:00
Robert Schulze
340f406553
Merge branch 'master' into query-result-cache 2023-01-22 13:21:36 +01:00
Maksim Kita
47385a19e7 Remove unnecessary getTotalRowCount function calls 2023-01-21 11:27:07 +01:00
Azat Khuzhin
a64f6b5f3e Fix possible (likely distributed) query hung
Recently I saw the following, the client executed long distributed query
and terminated the connection, and in this case query cancellation will
be done from PullingAsyncPipelineExecutor dtor, but during cancellation
one of nodes sent ECONNRESET, and this leads to an exception from
PullingAsyncPipelineExecutor::cancel(), and this leads to a deadlock
when multiple threads waits each others, because cancel() for
LazyOutputFormat wasn't called.

Here is as relevant portion of logs:

    2023.01.04 08:26:09.236208 [ 37968 ] {f2ed6149-146d-4a3d-874a-b0b751c7b567} <Debug> executeQuery: (from 10.61.13.253:44266, user: default)  TooLongDistributedQueryToPost
    ...
    2023.01.04 08:26:09.262424 [ 37968 ] {f2ed6149-146d-4a3d-874a-b0b751c7b567} <Trace> MergeTreeInOrderSelectProcessor: Reading 1 ranges in order from part 9_330_538_18, approx. 61440 rows starting from 0
    2023.01.04 08:26:09.266399 [ 26788 ] {f2ed6149-146d-4a3d-874a-b0b751c7b567} <Trace> Connection (s4.ch:9000): Connecting. Database: (not specified). User: default
    2023.01.04 08:26:09.266849 [ 26788 ] {f2ed6149-146d-4a3d-874a-b0b751c7b567} <Trace> Connection (s4.ch:9000): Connected to ClickHouse server version 22.10.1.
    2023.01.04 08:26:09.267165 [ 26788 ] {f2ed6149-146d-4a3d-874a-b0b751c7b567} <Debug> Connection (s4.ch:9000): Sent data for 2 scalars, total 2 rows in 3.1587e-05 sec., 62635 rows/sec., 68.00 B (2.03 MiB/sec.), compressed 0.4594594594594595 times to 148.00 B (4.41 MiB/sec.)
    2023.01.04 08:39:13.047170 [ 37968 ] {f2ed6149-146d-4a3d-874a-b0b751c7b567} <Error> PullingAsyncPipelineExecutor: Code: 210. DB::NetException: Connection reset by peer, while writing to socket (10.7.142.115:9000). (NETWORK_ERROR), Stack trace (when copying this message, always include the lines below):

    0. ./.build/./contrib/libcxx/include/exception:133: Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x1818234c in /usr/lib/debug/usr/bin/clickhouse.debug
    1. ./.build/./src/Common/Exception.cpp:69: DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0x1004fbda in /usr/lib/debug/usr/bin/clickhouse.debug
    2. ./.build/./src/Common/NetException.h:12: DB::WriteBufferFromPocoSocket::nextImpl() @ 0x14e352f3 in /usr/lib/debug/usr/bin/clickhouse.debug
    3. ./.build/./src/IO/BufferBase.h:39: DB::Connection::sendCancel() @ 0x15c21e6b in /usr/lib/debug/usr/bin/clickhouse.debug
    4. ./.build/./src/Client/MultiplexedConnections.cpp:0: DB::MultiplexedConnections::sendCancel() @ 0x15c4d5b7 in /usr/lib/debug/usr/bin/clickhouse.debug
    5. ./.build/./src/QueryPipeline/RemoteQueryExecutor.cpp:627: DB::RemoteQueryExecutor::tryCancel(char const*, std::__1::unique_ptr<DB::RemoteQueryExecutorReadContext, std::__1::default_delete<DB::RemoteQueryExecutorReadContext> >*) @ 0x14446c09 in /usr/lib/debug/usr/bin/clickhouse.debug
    6. ./.build/./contrib/libcxx/include/__iterator/wrap_iter.h💯 DB::ExecutingGraph::cancel() @ 0x15d2c0de in /usr/lib/debug/usr/bin/clickhouse.debug
    7. ./.build/./contrib/libcxx/include/__memory/unique_ptr.h:300: DB::PullingAsyncPipelineExecutor::cancel() @ 0x15d32055 in /usr/lib/debug/usr/bin/clickhouse.debug
    8. ./.build/./contrib/libcxx/include/__memory/unique_ptr.h:312: DB::PullingAsyncPipelineExecutor::~PullingAsyncPipelineExecutor() @ 0x15d31f4f in /usr/lib/debug/usr/bin/clickhouse.debug
    9. ./.build/./src/Server/TCPHandler.cpp:0: DB::TCPHandler::processOrdinaryQueryWithProcessors() @ 0x15cde919 in /usr/lib/debug/usr/bin/clickhouse.debug
    10. ./.build/./src/Server/TCPHandler.cpp:0: DB::TCPHandler::runImpl() @ 0x15cd8554 in /usr/lib/debug/usr/bin/clickhouse.debug
    11. ./.build/./src/Server/TCPHandler.cpp:1904: DB::TCPHandler::run() @ 0x15ce6479 in /usr/lib/debug/usr/bin/clickhouse.debug
    12. ./.build/./contrib/poco/Net/src/TCPServerConnection.cpp:57: Poco::Net::TCPServerConnection::start() @ 0x18074f07 in /usr/lib/debug/usr/bin/clickhouse.debug
    13. ./.build/./contrib/libcxx/include/__memory/unique_ptr.h:54: Poco::Net::TCPServerDispatcher::run() @ 0x180753ed in /usr/lib/debug/usr/bin/clickhouse.debug
    14. ./.build/./contrib/poco/Foundation/src/ThreadPool.cpp:213: Poco::PooledThread::run() @ 0x181e3807 in /usr/lib/debug/usr/bin/clickhouse.debug
    15. ./.build/./contrib/poco/Foundation/include/Poco/SharedPtr.h:156: Poco::ThreadImpl::runnableEntry(void*) @ 0x181e1483 in /usr/lib/debug/usr/bin/clickhouse.debug
    16. ? @ 0x7ffff7e55fd4 in ?
    17. ? @ 0x7ffff7ed666c in ?
     (version 22.10.1.1)

And here is the state of the threads:

<details>

<summary>system.stack_trace</summary>

```sql
SELECT
    arrayStringConcat(arrayMap(x -> demangle(addressToSymbol(x)), trace), '\n') AS sym
FROM system.stack_trace
WHERE query_id = 'f2ed6149-146d-4a3d-874a-b0b751c7b567'
SETTINGS allow_introspection_functions=1

Row 1:
──────
sym:
pthread_cond_wait
std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&)
bool ConcurrentBoundedQueue<DB::Chunk>::emplaceImpl<DB::Chunk>(std::__1::optional<unsigned long>, DB::Chunk&&)
DB::IOutputFormat::work()
DB::ExecutionThreadContext::executeTask()
DB::PipelineExecutor::executeStepImpl(unsigned long, std::__1::atomic<bool>*)

Row 2:
──────
sym:
pthread_cond_wait
Poco::EventImpl::waitImpl()
DB::PipelineExecutor::joinThreads()
DB::PipelineExecutor::executeImpl(unsigned long)
DB::PipelineExecutor::execute(unsigned long)

Row 3:
──────
sym:
pthread_cond_wait
Poco::EventImpl::waitImpl()
DB::PullingAsyncPipelineExecutor::Data::~Data()
DB::PullingAsyncPipelineExecutor::~PullingAsyncPipelineExecutor()
DB::TCPHandler::processOrdinaryQueryWithProcessors()
DB::TCPHandler::runImpl()
DB::TCPHandler::run()
Poco::Net::TCPServerConnection::start()
Poco::Net::TCPServerDispatcher::run()
Poco::PooledThread::run()
Poco::ThreadImpl::runnableEntry(void*)
```

</details>

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-21 08:05:56 +01:00
Azat Khuzhin
e2fcf0f072 Catch exception on query cancellation
Since we still want to join the thread, yes it will be done in dtor, but
this looks better.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-21 08:05:56 +01:00
Azat Khuzhin
0566f72d36 Cleanup PullingAsyncPipelineExecutor::cancel()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-01-21 08:05:56 +01:00
avogar
86336940f8 Better comment 2023-01-20 16:41:59 +00:00
avogar
4432ee9927 Fix aborts in arrow lib 2023-01-20 16:40:33 +00:00
avogar
550a703fbc Make a bit better 2023-01-20 14:58:39 +00:00
Kruglov Pavel
28ddcc2432
Merge branch 'master' into tsv-csv-detect-header 2023-01-20 15:08:38 +01:00
avogar
c34c0aa22e Fix comments 2023-01-19 16:03:46 +00:00
Kruglov Pavel
9820beae68
Apply suggestions from code review
Co-authored-by: Antonio Andelic <antonio2368@users.noreply.github.com>
2023-01-19 16:11:13 +01:00
Igor Nikonov
d0ce804bfc Fix: dynamic_cast -> typeid_cast for SortingStep 2023-01-19 13:40:21 +00:00
Igor Nikonov
df3776d24b Make test stable
+ disable debug logging
2023-01-19 11:43:40 +00:00
Ilya Yatsishin
00962b7ad5
Merge pull request #45424 from Avogar/fix-json-import-nested 2023-01-19 10:31:40 +01:00
Igor Nikonov
57d2fd300a Fix: correct update of data stream sorting properties after removing
sorting
2023-01-19 00:11:58 +00:00
avogar
a8f20363f4 Fix JSON/BSONEachRow parsing with HTTP 2023-01-18 22:49:03 +00:00
Igor Nikonov
1866f990de
Revert "Revert "Remove redundant sorting"" 2023-01-18 20:12:34 +01:00
Igor Nikonov
7ed8fec94f
Revert "Remove redundant sorting" 2023-01-18 18:38:25 +01:00
Robert Schulze
4f90824347
Merge remote-tracking branch 'origin/master' into query-result-cache 2023-01-17 22:49:53 +00:00
Igor Nikonov
0db9bf38a2
Merge branch 'master' into igor/remove_redundant_order_by 2023-01-17 22:26:24 +01:00
Alexander Tokmakov
8b13b85ea0
Merge pull request #44543 from ClickHouse/text_log_add_pattern
Add a column with a message pattern to text_log
2023-01-17 20:19:32 +03:00
Igor Nikonov
0cfa08df7a Merge remote-tracking branch 'origin/master' into igor/remove_redundant_order_by 2023-01-17 16:28:17 +00:00
Igor Nikonov
9855504403 Rename source file according to implementation 2023-01-17 16:24:51 +00:00
Kruglov Pavel
96bb99f864
Merge branch 'master' into tsv-csv-detect-header 2023-01-17 15:33:02 +01:00
Igor Nikonov
6328e02f22 Fix: update input/output stream properties
After removing sorting step we need to update sorting properties of
input/ouput streams
2023-01-17 13:39:18 +00:00
Sema Checherinda
35431e91e3
Merge pull request #45276 from ucasfl/avro-fix
Fix some avro reading bugs
2023-01-17 12:48:44 +01:00
avogar
5bf4704e7a Support FixedSizeBinary type in Parquet/Arrow 2023-01-16 21:01:31 +00:00
avogar
1c0941d72a Add docs and examples 2023-01-16 16:46:41 +00:00
avogar
3ea80b0f54 Merge branch 'master' of github.com:ClickHouse/ClickHouse into tsv-csv-detect-header 2023-01-16 15:14:25 +00:00
Igor Nikonov
a34991cb65 Merge remote-tracking branch 'origin/master' into igor/remove_redundant_order_by 2023-01-16 12:14:02 +00:00
Robert Schulze
099e30ef2a
Merge remote-tracking branch 'origin/master' into query-result-cache 2023-01-16 08:04:49 +00:00
Robert Schulze
27fe7ebd93
Cosmetics 2023-01-15 16:12:48 +00:00
flynn
29eb30b49f Fix some reading avro format bugs
fix
2023-01-14 18:05:26 +00:00
Alexander Tokmakov
2d7773fccc Merge branch 'master' into text_log_add_pattern 2023-01-13 20:33:46 +01:00
avogar
e2470dd670 Fix tests 2023-01-13 17:03:53 +00:00
Robert Schulze
d7d3f61c73
Cleanup SourceFromChunks a bit 2023-01-13 10:57:31 +00:00
Robert Schulze
15e11741cb
Cosmetics 2023-01-13 00:00:23 +00:00
Robert Schulze
1b53307375
Refcount cache entries to improve lookup performance 2023-01-12 23:38:39 +00:00
Robert Schulze
0c528fca29
Introduce SourceFromChunks to avoid unnecessary partial result chunk concatenation 2023-01-12 23:01:00 +00:00
avogar
b461935374 Better 2023-01-12 13:11:04 +00:00
Kruglov Pavel
05a11ff4a4
Merge branch 'master' into tsv-csv-detect-header 2023-01-12 12:35:18 +01:00
Maksim Kita
a140d6c5b1 Fixed code review issues 2023-01-12 12:07:58 +01:00
Maksim Kita
47f4159909 Analyzer support distributed queries processing 2023-01-12 12:07:58 +01:00
avogar
e4d774d906 Better naming 2023-01-11 22:57:14 +00:00
avogar
26cd56d113 Fix tests, make better 2023-01-11 22:52:15 +00:00
Alexander Gololobov
6adf1c025f
Merge pull request #45165 from ClickHouse/more_logging_for_replicated_ttl
More logging to facilitate debugging of flaky test_ttl_replicated
2023-01-11 23:24:48 +01:00
avogar
3b45863d15 Make better implementation, fix tests 2023-01-11 17:12:56 +00:00
Alexander Gololobov
659fa96365 More logging to facilitate debugging 2023-01-11 13:06:38 +01:00
Nikolai Kochetov
5e7a6ac619
Merge pull request #45122 from ClickHouse/revert-45121-revert-44653-custom-reading-for-mutation
Revert "Revert "Custom reading for mutation""
2023-01-11 12:37:32 +01:00
avogar
6312b75f44 Fix style 2023-01-10 16:28:52 +00:00
avogar
615fe4cecb Fix tests 2023-01-10 16:27:23 +00:00
Maksim Kita
fbba28b31e Analyzer aggregation without column fix 2023-01-10 16:49:55 +01:00
Nikolai Kochetov
4673b3fe1d
Revert "Revert "Custom reading for mutation"" 2023-01-10 16:31:01 +01:00
Alexander Tokmakov
c8ec130be4
Revert "Custom reading for mutation" 2023-01-10 17:51:30 +03:00
Nikolai Kochetov
11418963c0
Merge pull request #44653 from ClickHouse/custom-reading-for-mutation
Custom reading for mutation
2023-01-10 12:16:24 +01:00
Robert Schulze
923fa2c15a
Fix review comments, pt. II 2023-01-10 10:21:08 +00:00