ClickHouse

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-12-04 21:42:39 +00:00

Author	SHA1	Message	Date
alexey-milovidov	fde8c87a1f	Merge pull request #12426 from ClickHouse/log-engine-rollback-on-insert-error Rollback insertion error in Log engines	2020-07-16 22:50:48 +03:00
Anton Popov	97e8a88b30	Merge pull request #12277 from bobrik/ivan/exact-range-speedup WIP: Optimize PK lookup for queries that match exact PK range	2020-07-16 19:17:50 +03:00
Vitaly Baranov	000b197ad1	Merge pull request #11234 from traceon/ldap-per-user-authentication Add LDAP authentication support	2020-07-16 13:17:21 +03:00
Alexey Milovidov	6df282e813	Fixup	2020-07-16 11:33:51 +03:00
alexey-milovidov	8966c09ed6	Merge pull request #12519 from vzakaznikov/fix_data_duplication_and_tests_for_live_view Fixing race condition in live view tables which could cause data duplication and live view tests	2020-07-16 11:03:28 +03:00
Alexey Milovidov	68f9fd3767	Debug tests	2020-07-16 06:02:20 +03:00
Alexey Milovidov	82ea884d01	Fix incorrect unit test	2020-07-16 05:37:12 +03:00
Alexey Milovidov	3408b7e259	Merge branch 'master' into log-engine-rollback-on-insert-error	2020-07-16 05:34:02 +03:00
Denis Glazachev	59cb758cf7	Merge branch 'master' into ldap-per-user-authentication	2020-07-16 02:29:24 +04:00
Alexey Milovidov	e1e2204279	Whitespace	2020-07-15 19:37:52 +03:00
Vitaliy Zakaznikov	370dd3396b	Fixing clang build.	2020-07-15 16:18:53 +02:00
Vitaliy Zakaznikov	560151f6cd	* Fix bug in StorageLiveView.cpp * Fixing synchronization of the first insert in live view tests	2020-07-15 13:24:33 +02:00
alesapin	614540eddf	Merge pull request #12382 from ClickHouse/clear-all-columns Better errors for CLEAR/DROP columns (possibly in partitions)	2020-07-15 12:52:06 +03:00
alexey-milovidov	9c68124110	Merge pull request #12302 from azat/kafka-error-in-the-batch-SIGSEGV kafka: fix SIGSEGV if there is a message with error in the middle of the batch	2020-07-15 05:20:26 +03:00
alesapin	9e41fbca55	Remove check for drop detached partition	2020-07-14 16:56:30 +03:00
Alexander Kuzmenkov	b515dd5b83	Merge remote-tracking branch 'origin/master' into HEAD	2020-07-14 15:40:27 +03:00
Alexander Kuzmenkov	b24f727aea	typo	2020-07-14 15:40:18 +03:00
alesapin	014bb070ec	Fix tests	2020-07-14 11:19:39 +03:00
alexey-milovidov	fd4adf27d6	Merge pull request #12456 from CurtizJ/fix-12437 Fix #12437	2020-07-14 09:28:31 +03:00
alexey-milovidov	1893d89ce3	Merge pull request #12448 from ClickHouse/fix-trash-rabbitmq Fix trash from RabbitMQ	2020-07-14 01:11:37 +03:00
alesapin	1f576ee039	Some intermediate solution	2020-07-13 20:27:52 +03:00
Alexey Milovidov	cb46bca157	Merge branch 'master' into fix-trash-rabbitmq	2020-07-13 19:51:17 +03:00
alesapin	4a53264a86	Remove redundant and duplicated code	2020-07-13 19:19:08 +03:00
robot-clickhouse	0f23642a3d	Auto version update to [20.7.1.1] [54437]	2020-07-13 18:26:03 +03:00
Alexander Kuzmenkov	d6e7ab5988	Fuzzing-related fixes	2020-07-13 16:58:48 +03:00
Denis Glazachev	f787702922	Merge branch 'master' into ldap-per-user-authentication * master: (27 commits) Whitespaces Fix typo Fix UBSan report in base64 Correct default secure port for clickhouse-benchmark #11044 Remove test with bug #10697 Update in-functions.md (#12430) Allow nullable key in MergeTree Update arithmetic-functions.md [docs] add rabbitmq docs (#12326) Lower block sizes and look what will happen #9248 Fix lifetime_bytes/lifetime_rows for Buffer direct block write Retrigger CI Fix up test_mysql_protocol failed Implement lifetime_rows/lifetime_bytes for Buffer engine Add comment regarding proxy tunnel usage in PocoHTTPClient.cpp Add lifetime_rows/lifetime_bytes interface (exported via system.tables) Tiny IStorage refactoring Trigger integration-test-runner image rebuild. Delete log.txt Fix test_mysql_client/test_python_client error ...	2020-07-13 15:46:27 +04:00
Anton Popov	a9530d2883	in-memory parts: fix reading from nested	2020-07-13 12:10:55 +03:00
alexey-milovidov	ae7eff98ed	Merge pull request #12433 from amosbird/np Allow nullable key in MergeTree	2020-07-13 04:36:00 +03:00
Alexey Milovidov	8f2055b0a0	Fix trash from RabbitMQ	2020-07-13 04:11:48 +03:00
Amos Bird	cac5a89169	Allow nullable key in MergeTree	2020-07-12 22:21:51 +08:00
Alexey Milovidov	49f60ef3a4	Fix build	2020-07-12 08:26:33 +03:00
Alexey Milovidov	204a4af394	Rollback insertion error in Log engines #12402	2020-07-12 05:32:18 +03:00
Ivan Babrou	8784994d65	Allow conditions outside of PK with exact range Conditions that are outside of PK are marked as `unknown` in `KeyCondition`, so it's safe to allow them, as long as they are always combined by `AND`.	2020-07-11 18:59:26 -07:00
Azat Khuzhin	3bee98c6f0	Fix lifetime_bytes/lifetime_rows for Buffer direct block write	2020-07-12 01:16:05 +03:00
Ivan Babrou	d9d8d0242e	Optimize PK lookup for queries that match exact PK range Existing code that looks up marks that match the query has a pathological case, when most of the part does in fact match the query. The code works by recursively splitting a part into ranges and then discarding the ranges that definitely do not match the query, based on primary key. The problem is that it requires visiting every mark that matches the query, making the complexity of this sort of lookup O(n). For queries that match an exact range on the primary key, we can find both left and right parts of the range with O(log n) complexity. This change implements exactly that. To engage this optimization, the query must: * Have a prefix list of the primary key. * Have only range or single set element constraints for columns. * Have only AND as a boolean operator. Consider a table with `(service, timestamp)` as the primary key. The following conditions will be optimized: * `service = 'foo'` * `service = 'foo' and timestamp >= now() - 3600` * `service in ('foo')` * `service in ('foo') and timestamp >= now() - 3600 and timestamp <= now` The following will fall back to the previous lookup algorithm: * `timestamp >= now() - 3600` * `service in ('foo', 'bar') and timestamp >= now() - 3600` * `service = 'foo'` Note that the optimization won't engage when PK has a range expression followed by a point expression, since in that case the range is not continuous. Trace query logging provides the following types of messages, each representing a different kind of PK usage for a part: ``` Used optimized inclusion search over index for part 20200711_5710108_5710108_0 with 9 steps Used generic exclusion search over index for part 20200711_5710118_5710228_5 with 1495 steps Not using index on part 20200710_5710473_5710473_0 ``` Number of steps translates to computational complexity. 
Here's a before-and-after comparison for a query over 24h of data: ``` Read 4562944 rows, 148.05 MiB in 45.19249672 sec., 100966 rows/sec., 3.28 MiB/sec. Read 4183040 rows, 135.78 MiB in 0.196279627 sec., 21311636 rows/sec., 691.75 MiB/sec. ``` This is especially useful for queries that read data in order and terminate early to return "last X things" matching a query. See #11564 for more thoughts on this.	2020-07-11 12:26:54 -07:00
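The difference between the two lookups described in the commit above can be illustrated with a small sketch. It is not ClickHouse code: it assumes a single-column key, models the sparse index as a sorted Python list holding the first key of each granule, and uses hypothetical function names (`linear_lookup`, `binary_lookup`). The linear version checks every mark, while the binary version finds both ends of the range with two binary searches.

```python
from bisect import bisect_left, bisect_right


def granule_may_match(index, i, lo, hi):
    """Granule i holds keys from index[i] up to index[i + 1] (unbounded for the last one)."""
    starts_before_hi = index[i] <= hi
    is_last = i + 1 >= len(index)
    ends_after_lo = is_last or index[i + 1] >= lo
    return starts_before_hi and ends_after_lo


def linear_lookup(index, lo, hi):
    """Visit every mark: O(n) steps. Returns a half-open range of granule numbers."""
    matching = [i for i in range(len(index)) if granule_may_match(index, i, lo, hi)]
    if not matching:
        return (0, 0)
    return (matching[0], matching[-1] + 1)


def binary_lookup(index, lo, hi):
    """Find both ends of the range with two binary searches: O(log n) steps."""
    # The granule just before the first mark >= lo may still contain keys >= lo.
    left = max(bisect_left(index, lo) - 1, 0)
    # Every granule whose first key is <= hi may contain matching keys.
    right = bisect_right(index, hi)
    return (left, right) if left < right else (0, 0)


if __name__ == "__main__":
    marks = [10, 20, 30, 40, 50]  # hypothetical first key of each granule
    print(binary_lookup(marks, 25, 35))  # (1, 3)
    print(linear_lookup(marks, 25, 35))  # (1, 3)
```

Both functions return the same granule range; only the number of marks inspected differs, which is the quantity the "steps" counter in the trace messages refers to.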
Denis Glazachev	edb6ef8c09	Merge commit 'ceac649c01b0158090cd271776f3219f5e7ff57c' into ldap-per-user-authentication * commit 'ceac649c01b0158090cd271776f3219f5e7ff57c': (75 commits) [docs] split misc statements (#12403) Update 00405_pretty_formats.reference Update PrettyCompactBlockOutputFormat.cpp Update PrettyBlockOutputFormat.cpp Update DataTypeNullable.cpp Update 01383_remote_ambiguous_column_shard.sql add output_format_pretty_grid_charset setting in docs add setting output_format_pretty_grid_charset Added a test for #11135 Update index.md RIGHT and FULL JOIN for MergeJoin (#12118) Update MergeTreeIndexFullText.cpp restart the tests [docs] add syntax highlight (#12398) query fuzzer Fix std::bad_typeid when JSON functions called with argument of wrong type. Allow typeid_cast() to cast nullptr to nullptr. fix another context-related segfault [security docs] actually, only admins can create advisories query fuzzer ...	2020-07-11 21:32:36 +04:00
Azat Khuzhin	32a45d0dee	Implement lifetime_rows/lifetime_bytes for Buffer engine The Buffer engine is usually used for INSERTs, but right now there is no way to track the number of INSERTed rows per table, since only summary metrics exist: - StorageBufferRows - StorageBufferBytes But it can be pretty useful to track the rate of INSERTed rows (and it can be exposed via http_handlers for, e.g., Prometheus)	2020-07-11 16:06:11 +03:00
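As a rough illustration of the idea in the commit above, the sketch below keeps per-table cumulative counters next to the buffered data. It is not the ClickHouse implementation: the `BufferTable` class and its methods are hypothetical, and it assumes that the lifetime counters only ever grow, even when the buffer is flushed.

```python
class BufferTable:
    """Hypothetical in-memory buffer with cumulative per-table insert counters."""

    def __init__(self):
        self.rows = []           # rows currently held in the buffer
        self.lifetime_rows = 0   # total rows ever inserted into this table
        self.lifetime_bytes = 0  # total bytes ever inserted into this table

    def insert(self, block):
        """Buffer a block of rows (each row is a bytes object) and update the counters."""
        self.rows.extend(block)
        self.lifetime_rows += len(block)
        self.lifetime_bytes += sum(len(row) for row in block)

    def flush(self):
        """Hand the buffered rows over; the lifetime counters are left untouched."""
        flushed, self.rows = self.rows, []
        return flushed
```

Sampling `lifetime_rows` at two points in time and dividing the difference by the interval gives the insert rate the commit message mentions.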
Azat Khuzhin	433fdffc19	Add lifetime_rows/lifetime_bytes interface (exported via system.tables)	2020-07-11 15:33:11 +03:00
Azat Khuzhin	84c93a6b02	Tiny IStorage refactoring	2020-07-11 15:17:06 +03:00
alexey-milovidov	e22547c29d	Merge pull request #12388 from ClickHouse/bloom-filter-arg-check Check arguments of bloom filter index	2020-07-10 20:54:16 +03:00
alexey-milovidov	caef1d8e24	Update MergeTreeIndexFullText.cpp	2020-07-10 20:53:58 +03:00
alexey-milovidov	d819624d7c	Merge pull request #12378 from ClickHouse/allow-clear-column-with-dependencies Allow to CLEAR column even if there are depending DEFAULT expressions	2020-07-10 20:18:14 +03:00
alexey-milovidov	031c773260	Merge pull request #12384 from ClickHouse/support-negative-float-constants-in-key-condition Avoid exception when negative or floating point constant is used in WHERE condition for indexed tables	2020-07-10 20:16:35 +03:00
Azat Khuzhin	610382b693	kafka: fix SIGSEGV if there is a message with error in the middle of the batch ReadBufferFromKafkaConsumer does not handle the case when there is a message with an error at a non-first position in the current batch, since it goes through the messages in the batch after poll and stops on the first valid message. But later it can try to use a message as valid: - while storing offset - get topic name - ... And besides, the message itself is also invalid (you can find this in the gdb traces below). So just filter out messages with an error after poll. SIGSEGV was with the following stacktrace: (gdb) bt 3 0x0000000010f05b4d in rd_kafka_offset_store (app_rkt=0x0, partition=0, offset=0) at ../contrib/librdkafka/src/rdkafka_offset.c:656 4 0x0000000010e69657 in cppkafka::Consumer::store_offset (this=0x7f2015210820, msg=...) at ../contrib/cppkafka/include/cppkafka/message.h:225 5 0x000000000e68f208 in DB::ReadBufferFromKafkaConsumer::storeLastReadMessageOffset (this=0x7f206a136618) at ../contrib/libcxx/include/iterator:1508 6 0x000000000e68b207 in DB::KafkaBlockInputStream::readImpl (this=0x7f202c689020) at ../src/Storages/Kafka/KafkaBlockInputStream.cpp:150 7 0x000000000dd1178d in DB::IBlockInputStream::read (this=this@entry=0x7f202c689020) at ../src/DataStreams/IBlockInputStream.cpp:60 8 0x000000000dd34c0a in DB::copyDataImpl<> () at ../src/DataStreams/copyData.cpp:21 9 DB::copyData () at ../src/DataStreams/copyData.cpp:62 10 0x000000000e67c8f2 in DB::StorageKafka::streamToViews () at ../contrib/libcxx/include/memory:3823 11 0x000000000e67d218 in DB::StorageKafka::threadFunc () at ../src/Storages/Kafka/StorageKafka.cpp:488 And some information from it: (gdb) p this.current.__i $14 = (std::__1::__wrap_iter<cppkafka::Message const>::iterator_type) 0x7f1ca8f58660 # current-1 (gdb) p $14-1 $15 = (const cppkafka::Message ) 0x7f1ca8f58600 (gdb) p $16.handle_ $17 = {__ptr_ = {<std::__1::__compressed_pair_elem<rd_kafka_message_s, 0, false>> = { __value_ = 
0x7f203577f938}, ...} (gdb) p (rd_kafka_message_s)0x7f203577f938 $24 = {err = RD_KAFKA_RESP_ERR__TRANSPORT, rkt = 0x0, partition = 0, payload = 0x7f202f0339c0, len = 63, key = 0x0, key_len = 0, offset = 0, _private = 0x7f203577f8c0} # current (gdb) p $14-0 $28 = (const cppkafka::Message ) 0x7f1ca8f58660 (gdb) p $28.handle_.__ptr_ $29 = {<std::__1::__compressed_pair_elem<rd_kafka_message_s, 0, false>> = { __value_ = 0x7f184f129bf0}, ...} (gdb) p (rd_kafka_message_s)0x7f184f129bf0 $30 = {err = RD_KAFKA_RESP_ERR_NO_ERROR, rkt = 0x7f1ed44fe000, partition = 1, payload = 0x7f1fc9bc6036, len = 242, key = 0x0, key_len = 0, offset = 2394853582209, # current+1 (gdb) p (($14+1)).handle_.__ptr_ $44 = {<std::__1::__compressed_pair_elem<rd_kafka_message_s, 0, false>> = { __value_ = 0x7f184f129d30}, ...} (gdb) p (rd_kafka_message_s)0x7f184f129d30 $45 = {err = RD_KAFKA_RESP_ERR_NO_ERROR, rkt = 0x7f1ed44fe000, partition = 1, payload = 0x7f1fc9bc612f, len = 31, key = 0x0, key_len = 0, offset = 2394853582210, _private = 0x7f184f129cc0} # distance from the beginning (gdb) p messages.__end_-messages.__begin_ $34 = 65536 (gdb) p ($14-0)-messages.__begin_ $37 = 8965 (gdb) p ($14-1)-messages.__begin_ $38 = 8964 # parsing info (gdb) p allowed $39 = false (gdb) p new_rows $40 = 1 (gdb) p total_rows $41 = 8964 # current buffer is invalid (gdb) p buffer.__ptr_ $50 = {<DB::ReadBuffer> = {<DB::BufferBase> = {pos = 0x7f202f0339c0 "FindCoordinator response error: Local: Broker transport failure", bytes = 47904863385, working_buffer = { begin_pos = 0x7f202f0339c0 "FindCoordinator response error: Local: Broker transport failure", end_pos = 0x7f202f0339c0 "FindCoordinator response error: Local: Broker transport failure"}, internal_buffer = { v0: check message errors in ReadBufferFromKafkaConsumer::nextImpl() (but this may lead to using those messages afterwards and a SIGSEGV again, doh). v2: skip messages with an error after poll.	2020-07-10 11:41:44 +03:00
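The fix described in the commit above ("skip messages with an error after poll") can be sketched as a filtering step applied to the whole batch before any message is consumed. This is an illustration only, not the cppkafka or ClickHouse API: the `Message` type, its fields, and `drop_errored` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    """Hypothetical stand-in for a polled Kafka message."""
    payload: bytes
    offset: int
    error: Optional[str] = None  # None means the message is valid


def drop_errored(messages: List[Message]) -> List[Message]:
    """Remove every errored message from a polled batch, wherever it sits."""
    return [m for m in messages if m.error is None]


if __name__ == "__main__":
    batch = [
        Message(b"first", 1),
        Message(b"FindCoordinator response error", 0, error="transport failure"),
        Message(b"second", 2),
    ]
    for message in drop_errored(batch):
        # Only valid messages reach the consumer, so storing an offset is safe.
        print(message.offset)
```

Filtering the batch up front means later steps such as storing offsets or reading the topic name never see an errored message, regardless of its position in the batch.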
Alexey Milovidov	47eaffbe63	Additional checks	2020-07-10 11:21:40 +03:00
Alexey Milovidov	4b86f36d37	Check arguments of bloom filter index	2020-07-10 11:13:21 +03:00
alesapin	5cae87e664	Merge pull request #12335 from ClickHouse/fix_alter_exit_codes Fix alter rename error messages	2020-07-10 11:05:20 +03:00
Alexey Milovidov	276b3a0215	Avoid exception when negative or floating point constant is used in WHERE condition for indexed tables #11905	2020-07-10 09:30:49 +03:00
Alexey Milovidov	a4b35a8a6f	Allow to CLEAR column even if there are depending DEFAULT expressions #12333	2020-07-10 08:54:35 +03:00
alexey-milovidov	c16d8e094b	Merge pull request #12308 from ClickHouse/fix-codec-bad-exception-code Fix wrong exception code in codecs Delta, DoubleDelta #12110	2020-07-10 08:40:46 +03:00


1066 Commits