Commit Graph

160 Commits

Author SHA1 Message Date
Azat Khuzhin
b43046ba06 Distributed: More accurate distribution_queue counters
So now system.distribution_queue will show accurate statistics, so tests
does not requires sleep anymore.

But note that with too much distributed pending this will iterate over
all directories.
2021-03-03 23:30:03 +03:00
Azat Khuzhin
b5a5778589 Distributed: Add ability to limit amount of pending bytes for async INSERT
Right now with distributed_directory_monitor_batch_inserts=1 and
insert_distributed_sync=0 INSERT into Distributed table will store
blocks that should be sent to remote (and in case of
prefer_localhost_replica=0 to the localhost too) on the local
filesystem, and sent it in background.

However there is no limit for this storage, and if the remote is
unavailable (or some other error), these pending blocks may take
significant space, and this is not always desired behaviour.

Add new Distributed setting - bytes_to_throw_insert, that will set the
limit for how much pending bytes is allowed, if the limit will be
reached an exception will be throw.

By default was set to 0, to avoid surprises.
2021-03-03 23:30:00 +03:00
Azat Khuzhin
ce09b7ff89 Distributed: Implement totalBytes() (system.tables.total_bytes) 2021-03-03 23:29:11 +03:00
Anton Popov
a4c00ab5dc
Merge pull request #21303 from ucasFL/forbid
Forbid to drop a column if it's referenced by materialized view
2021-03-03 02:55:06 +03:00
feng lv
a26c9e64a9 fix
fix
2021-03-02 03:20:03 +00:00
feng lv
51021c1164 forbid to drop a column if it's referenced by materialized view 2021-02-28 05:24:39 +00:00
Nikolai Kochetov
d328bfa41f Review fixes. Add setting max_optimizations_to_apply. 2021-02-26 19:29:56 +03:00
Azat Khuzhin
809fa7e4cc Sync SYSTEM FLUSH DISTRIBUTED with TRUNCATE 2021-02-10 23:10:37 +03:00
Azat Khuzhin
ce91c257b2 Lockless SYSTEM FLUSH DISTRIBUTED
Right now SYSTEM FLUSH DISTRIBUTED will block:
- INSERT into this Distributed table (requireDirectoryMonitor())
- SELECT * FROM system.distribution_queue
2021-02-08 22:07:30 +03:00
Kruglov Pavel
d94e8624d7
Merge branch 'master' into shard-id 2021-02-06 16:48:17 +03:00
Aleksei Semiglazov
921518db0a CLICKHOUSE-606: query deduplication based on parts' UUID
* add the query data deduplication excluding duplicated parts in MergeTree family engines.

query deduplication is based on parts' UUID which should be enabled first with merge_tree setting
assign_part_uuids=1

allow_experimental_query_deduplication setting is to enable part deduplication, default ot false.

data part UUID is a mechanism of giving a data part a unique identifier.
Having UUID and deduplication mechanism provides a potential of moving parts
between shards preserving data consistency on a read path:
duplicated UUIDs will cause root executor to retry query against on of the replica explicitly
asking to exclude encountered duplicated fingerprints during a distributed query execution.

NOTE: this implementation don't provide any knobs to lock part and hence its UUID. Any mutations/merge will
update part's UUID.

* add _part_uuid virtual column, allowing to use UUIDs in predicates.

Signed-off-by: Aleksei Semiglazov <asemiglazov@cloudflare.com>

address comments
2021-02-02 16:53:39 +00:00
feng lv
4279c7da41 add setting insert_shard_id
add test

fix style

fix
2021-02-02 04:26:59 +00:00
kreuzerkrieg
29a2ef3089 Add IStoragePolicy interface 2021-01-26 10:55:28 +02:00
Azat Khuzhin
2e55bd2285 Accept IDisk in DirectoryMonitor (for further fsync) 2021-01-09 16:31:42 +03:00
Azat Khuzhin
b5ace27014 Add fsync support for Distributed engine.
Two new settings (by analogy with MergeTree family) has been added:

- `fsync_after_insert` - Do fsync for every inserted. Will decreases
  performance of inserts.

- `fsync_tmp_directory` - Do fsync for temporary directory (that is used
  for async INSERT only) after all part operations (writes, renames,
  etc.).

Refs: #17380 (p1)
2021-01-09 11:31:32 +03:00
Azat Khuzhin
714d5a067a Expose supports_parallel_insert via system.table_engines 2021-01-08 14:57:24 +03:00
Alexey Milovidov
190402b7d5 Do not insert empty blocks on sync Distributed INSERT 2021-01-06 02:54:22 +03:00
Amos Bird
6fc225e676
Distributed insertion to one random shard (#18294)
* Distributed insertion to one random shard

* add some tests

* add some documentation

* Respect shards' weights

* fine locking

Co-authored-by: Ivan Lezhankin <ilezhankin@yandex-team.ru>
2020-12-23 19:04:05 +03:00
Azat Khuzhin
5365718f01
Fix optimize_distributed_group_by_sharding_key for query with OFFSET only (#16996)
* Fix optimize_distributed_group_by_sharding_key for query with OFFSET only

* Fix 01244_optimize_distributed_group_by_sharding_key flakiness
2020-12-02 20:11:39 +03:00
tavplubix
085359c110
Merge pull request #17274 from ClickHouse/fix_ast_formatting_in_logs
Fix AST formatting in log messages
2020-11-24 19:00:56 +03:00
Alexander Tokmakov
60a5782c75 fix AST formatting in log messages 2020-11-22 20:23:12 +03:00
Amos Bird
1d9d586e20
Make global_context consistent. 2020-11-20 18:23:14 +08:00
Nikolai Kochetov
46f70dd0de Merge branch 'master' into actions-dag-f14 2020-11-12 11:54:44 +03:00
tavplubix
058aa8f85e
Merge pull request #16824 from ClickHouse/replace_stringstreams_with_buffers
Replace std::*stringstreams with DB::*Buffers
2020-11-12 01:11:44 +03:00
Nikolai Kochetov
1846bb3cac Merge branch 'master' into actions-dag-f14 2020-11-11 13:08:57 +03:00
Nikolai Kochetov
1db8e77371 Add comments. Update ActionsDAG::Index 2020-11-10 17:54:59 +03:00
Nikolai Kochetov
195c941c4e Merge branch 'master' into storage-read-query-plan 2020-11-10 15:02:22 +03:00
Alexander Tokmakov
5cdfcfb307 remove other stringstreams 2020-11-09 22:12:44 +03:00
Nikolai Kochetov
6717c7a0af Merge branch 'master' into actions-dag-f14 2020-11-09 14:57:48 +03:00
alexey-milovidov
f4ba5f1f9a
Merge pull request #16772 from ClickHouse/fix-stringstream
Fix "server failed to start" error
2020-11-08 14:27:08 +03:00
Alexey Milovidov
ba4ae00121 Whitespace 2020-11-08 00:30:40 +03:00
Alexey Milovidov
1ea3afadbc Merge with master 2020-11-08 00:28:39 +03:00
Alexey Milovidov
5314185e25 Merge branch 'master' into azat-optimize_skip_unused_shards-optimization 2020-11-08 00:17:59 +03:00
Alexey Milovidov
fd84d16387 Fix "server failed to start" error 2020-11-07 03:14:53 +03:00
Nikolai Kochetov
c10f733587 Merge branch 'master' into storage-read-query-plan 2020-11-06 15:43:46 +03:00
Nikolai Kochetov
9aeb757da4 Merge branch 'master' into actions-dag-f14 2020-11-06 15:04:20 +03:00
Azat Khuzhin
f23995d290 Remove empty directories for async INSERT at start of Distributed engine
Will be created by DistributedBlockOutputStream on demand.
2020-11-05 23:50:30 +03:00
Nikolai Kochetov
6767a226fc Merge branch 'master' into actions-dag-f14 2020-11-03 15:21:06 +03:00
Nikolai Kochetov
07a7c46b89 Refactor ExpressionActions [Part 3] 2020-11-03 14:28:28 +03:00
Azat Khuzhin
fc14fde24a Fix DROP TABLE for Distributed (racy with INSERT)
<details>

```

drop() on T1275:

    0  DB::StorageDistributed::drop (this=0x7f9ed34f0000) at ../contrib/libcxx/include/__hash_table:966
    1  0x000000000d557242 in DB::DatabaseOnDisk::dropTable (this=0x7f9fc22706d8, context=..., table_name=...)
       at ../contrib/libcxx/include/new:340
    2  0x000000000d6fcf7c in DB::InterpreterDropQuery::executeToTable (this=this@entry=0x7f9e42560dc0, query=...)
       at ../contrib/libcxx/include/memory:3826
    3  0x000000000d6ff5ee in DB::InterpreterDropQuery::execute (this=0x7f9e42560dc0) at ../src/Interpreters/InterpreterDropQuery.cpp:50
    4  0x000000000daa40c0 in DB::executeQueryImpl (begin=<optimized out>, end=<optimized out>, context=..., internal=<optimized out>,
       stage=DB::QueryProcessingStage::Complete, has_query_tail=false, istr=0x0) at ../src/Interpreters/executeQuery.cpp:420
    5  0x000000000daa59df in DB::executeQuery (query=..., context=..., internal=internal@entry=false, stage=<optimized out>,
       may_have_embedded_data=<optimized out>) at ../contrib/libcxx/include/string:1487
    6  0x000000000e1369e6 in DB::TCPHandler::runImpl (this=this@entry=0x7f9ddf3a9000) at ../src/Server/TCPHandler.cpp:254
    7  0x000000000e1379c9 in DB::TCPHandler::run (this=0x7f9ddf3a9000) at ../src/Server/TCPHandler.cpp:1326
    8  0x000000001086fac7 in Poco::Net::TCPServerConnection::start (this=this@entry=0x7f9ddf3a9000)
       at ../contrib/poco/Net/src/TCPServerConnection.cpp:43
    9  0x000000001086ff2b in Poco::Net::TCPServerDispatcher::run (this=0x7f9e4eba5c00)
       at ../contrib/poco/Net/src/TCPServerDispatcher.cpp:114
    10 0x00000000109dbe8e in Poco::PooledThread::run (this=0x7f9e4a2d2f80) at ../contrib/poco/Foundation/src/ThreadPool.cpp:199
    11 0x00000000109d78f9 in Poco::ThreadImpl::runnableEntry (pThread=<optimized out>)
       at ../contrib/poco/Foundation/include/Poco/SharedPtr.h:401
    12 0x00007f9fc3cccea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
    13 0x00007f9fc3bebeaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

StorageDistributedDirectoryMonitor on T166:

    0  DB::StorageDistributedDirectoryMonitor::StorageDistributedDirectoryMonitor (this=0x7f9ea7ab1400, storage_=..., path_=...,
       pool_=..., monitor_blocker_=..., bg_pool_=...) at ../src/Storages/Distributed/DirectoryMonitor.cpp:81
    1  0x000000000dbf684e in std::__1::make_unique<> () at ../contrib/libcxx/include/memory:3474
    2  DB::StorageDistributed::requireDirectoryMonitor (this=0x7f9ed34f0000, disk=..., name=...)
       at ../src/Storages/StorageDistributed.cpp:682
    3  0x000000000de3d5fa in DB::DistributedBlockOutputStream::writeToShard (this=this@entry=0x7f9ed39c7418, block=..., dir_names=...)
       at ../src/Storages/Distributed/DistributedBlockOutputStream.cpp:634
    4  0x000000000de3e214 in DB::DistributedBlockOutputStream::writeAsyncImpl (this=this@entry=0x7f9ed39c7418, block=...,
       shard_id=shard_id@entry=79) at ../src/Storages/Distributed/DistributedBlockOutputStream.cpp:539
    5  0x000000000de3e47b in DB::DistributedBlockOutputStream::writeSplitAsync (this=this@entry=0x7f9ed39c7418, block=...)
       at ../contrib/libcxx/include/vector:1546
    6  0x000000000de3eab0 in DB::DistributedBlockOutputStream::writeAsync (block=..., this=0x7f9ed39c7418)
       at ../src/Storages/Distributed/DistributedBlockOutputStream.cpp:141
    7  DB::DistributedBlockOutputStream::write (this=0x7f9ed39c7418, block=...)
       at ../src/Storages/Distributed/DistributedBlockOutputStream.cpp:135
    8  0x000000000d73b376 in DB::PushingToViewsBlockOutputStream::write (this=this@entry=0x7f9ea7a8cf58, block=...)
       at ../src/DataStreams/PushingToViewsBlockOutputStream.cpp:157
    9  0x000000000d7853eb in DB::AddingDefaultBlockOutputStream::write (this=0x7f9ed383d118, block=...)
       at ../contrib/libcxx/include/memory:3826
    10 0x000000000d740790 in DB::SquashingBlockOutputStream::write (this=0x7f9ed383de18, block=...)
       at ../contrib/libcxx/include/memory:3826
    11 0x000000000d68c308 in DB::CountingBlockOutputStream::write (this=0x7f9ea7ac6d60, block=...)
       at ../contrib/libcxx/include/memory:3826
    12 0x000000000ddab449 in DB::StorageBuffer::writeBlockToDestination (this=this@entry=0x7f9fbd56a000, block=..., table=...)
       at ../src/Storages/StorageBuffer.cpp:747
    13 0x000000000ddabfa6 in DB::StorageBuffer::flushBuffer (this=this@entry=0x7f9fbd56a000, buffer=...,
       check_thresholds=check_thresholds@entry=true, locked=locked@entry=false, reset_block_structure=reset_block_structure@entry=false)
       at ../src/Storages/StorageBuffer.cpp:661
    14 0x000000000ddac415 in DB::StorageBuffer::flushAllBuffers (reset_blocks_structure=false, check_thresholds=true, this=0x7f9fbd56a000)
        at ../src/Storages/StorageBuffer.cpp:605

shutdown() on T1275:

    0  DB::StorageDistributed::shutdown (this=0x7f9ed34f0000) at ../contrib/libcxx/include/atomic:1612
    1  0x000000000d6fd938 in DB::InterpreterDropQuery::executeToTable (this=this@entry=0x7f98530c79a0, query=...)
       at ../src/Storages/TableLockHolder.h:12
    2  0x000000000d6ff5ee in DB::InterpreterDropQuery::execute (this=0x7f98530c79a0) at ../src/Interpreters/InterpreterDropQuery.cpp:50
    3  0x000000000daa40c0 in DB::executeQueryImpl (begin=<optimized out>, end=<optimized out>, context=..., internal=<optimized out>,
       stage=DB::QueryProcessingStage::Complete, has_query_tail=false, istr=0x0) at ../src/Interpreters/executeQuery.cpp:420
    4  0x000000000daa59df in DB::executeQuery (query=..., context=..., internal=internal@entry=false, stage=<optimized out>,
       may_have_embedded_data=<optimized out>) at ../contrib/libcxx/include/string:1487
    5  0x000000000e1369e6 in DB::TCPHandler::runImpl (this=this@entry=0x7f9ddf3a9000) at ../src/Server/TCPHandler.cpp:254
    6  0x000000000e1379c9 in DB::TCPHandler::run (this=0x7f9ddf3a9000) at ../src/Server/TCPHandler.cpp:1326
    7  0x000000001086fac7 in Poco::Net::TCPServerConnection::start (this=this@entry=0x7f9ddf3a9000)
       at ../contrib/poco/Net/src/TCPServerConnection.cpp:43
    8  0x000000001086ff2b in Poco::Net::TCPServerDispatcher::run (this=0x7f9e4eba5c00)
       at ../contrib/poco/Net/src/TCPServerDispatcher.cpp:114
    9  0x00000000109dbe8e in Poco::PooledThread::run (this=0x7f9e4a2d2f80) at ../contrib/poco/Foundation/src/ThreadPool.cpp:199
    10 0x00000000109d78f9 in Poco::ThreadImpl::runnableEntry (pThread=<optimized out>)
       at ../contrib/poco/Foundation/include/Poco/SharedPtr.h:401
    11 0x00007f9fc3cccea7 in start_thread (arg=<optimized out>) at pthread_create.c:477
    12 0x00007f9fc3bebeaf in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

```

</details>
2020-10-27 21:19:36 +03:00
Ivan
1d170f5745
ASTTableIdentifier Part #1: improve internal representation of ASTIdentifier name (#16149)
* Use only |name_parts| as primary name source

* Restore legacy logic for table restoration

* Fix build

* Fix tests

* Add pytest server config

* Fix tests

* Fixes due to review
2020-10-24 21:46:10 +03:00
Nikolai Kochetov
7fa045cff8 Merge branch 'master' into storage-read-query-plan 2020-10-22 13:31:10 +03:00
Azat Khuzhin
9b8abd44ab Add allow_nondeterministic_optimize_skip_unused_shards 2020-10-17 01:07:02 +03:00
Alexander Tokmakov
72b1339656 Revert "Revert "Write structure of table functions to metadata""
This reverts commit c65d1e5c70.
2020-10-14 15:19:29 +03:00
tavplubix
c65d1e5c70
Revert "Write structure of table functions to metadata" 2020-10-14 13:59:29 +03:00
alexey-milovidov
f60ccb4edf
Merge pull request #14295 from ClickHouse/write_structure_of_table_functions
Write structure of table functions to metadata
2020-10-13 23:56:09 +03:00
Nikolai Kochetov
7e58f99f64 Merge branch 'master' into storage-read-query-plan 2020-10-12 13:12:39 +03:00
Alexey Milovidov
5b482f4191 Cleanups 2020-10-10 19:31:10 +03:00
Nikolai Kochetov
c5cb05f5f3 Try fix tests. 2020-10-07 14:26:29 +03:00
Azat Khuzhin
b838214a35 Pass non-const SelectQueryInfo (and drop mutable qualifiers) 2020-10-02 22:42:35 +03:00