ClickHouse

mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-09-20 08:40:50 +00:00

Author	SHA1	Message	Date
Nikita Taranov	52fe7edbd9	better key analysis	2023-01-30 17:11:56 +00:00
Nikita Taranov	2057db68a2	cosmetics	2023-01-30 17:10:45 +00:00
Nikita Taranov	1d45cce03c	support for aggr in order	2023-01-30 17:10:45 +00:00
Nikita Taranov	a2c9aeb7c9	stash	2023-01-30 17:10:45 +00:00
Alexey Milovidov	bc2f454522	Merge branch 'master' into block-non-float-gorilla-v2	2023-01-28 03:30:12 +03:00
Igor Nikonov	300f78df96	Merge pull request #45567 from ClickHouse/enable-remove-redundant-sorting Enable query_plan_remove_redundant_sorting optimization by default	2023-01-27 19:14:36 +01:00
Igor Nikonov	41b94b4954	Enable query_plan_remove_redundant_sorting optimization by default	2023-01-24 13:38:21 +00:00
Robert Schulze	97d1bed114	Merge branch 'master' into improve_week_day	2023-01-21 20:40:33 +01:00
Robert Schulze	e6167d6b36	Deprecate Gorilla compression of non-float columns Reasons: 1. The original Gorilla paper proposed a compression schema for pairs of time stamps and double-precision FP values. ClickHouse's Gorilla codec only implements compression of the latter and it does not impose any data type restrictions. - Data types != Float* or (U)Int* (e.g. Decimal, Point etc.) are definitely not supposed to be used with Gorilla. - (U)Int* types are debatable. The paper only considers integers-stored-as-FP-values, a practical use case for which Gorilla works well. Standalone integers are not considered which makes them at least suspicious. 2. Achieve consistency with FPC, another specialized floating-point timeseries codec, which rejects non-float data. 3. On practical datasets, ZSTD is often "good enough" (*) so it should be okay to disincentivize non-ZSTD codecs a little bit. If needed, Delta and DoubleDelta codecs are viable alternatives for slowly changing (time-series-like) integer sequences. Since on-prem and hosted users may still have Gorilla-compressed non-float data, this combination is only deprecated for now. No warning or error will be emitted. Users are encouraged to migrate Gorilla-compressed non-float data to an alternative codec. It is planned to treat Gorilla-compressed non-float columns as "suspicious" six months after this commit (i.e. in v23.6). Even then, it will still be possible to set "allow_suspicious_codecs = true" and read and write Gorilla-compressed non-float data. (*) Sec. 4.1.2, "Gorilla restricts the value element in its tuple to a double floating point type.", https://doi.org/10.14778/2824032.2824078 (**) https://clickhouse.com/blog/optimize-clickhouse-codecs-compression-schema	2023-01-20 17:31:16 +00:00
Igor Nikonov	7ed8fec94f	Revert "Remove redundant sorting"	2023-01-18 18:38:25 +01:00
Igor Nikonov	72066846cf	Merge pull request #43905 from ClickHouse/igor/remove_redundant_order_by Remove redundant sorting	2023-01-18 13:25:03 +01:00
Igor Nikonov	0cfa08df7a	Merge remote-tracking branch 'origin/master' into igor/remove_redundant_order_by	2023-01-17 16:28:17 +00:00
Alexander Tokmakov	df75c24f01	Revert "Disallow Gorilla codec on non-float columns"	2023-01-16 19:14:28 +03:00
Igor Nikonov	a34991cb65	Merge remote-tracking branch 'origin/master' into igor/remove_redundant_order_by	2023-01-16 12:14:02 +00:00
Robert Schulze	bd41c74ddf	Various test, code and docs fixups	2023-01-15 13:47:34 +00:00
Robert Schulze	7023d68536	Fix codecs_int_*.xml	2023-01-15 13:31:45 +00:00
Azat Khuzhin	925fd2c33a	tests/performance: do not use scientific notation in hashed_dictionary_sharded v2: fix few mistakes Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:26 +01:00
Azat Khuzhin	345c422e28	Add ability to load hashed dictionaries using multiple threads Right now dictionaries (here I will talk about only HASHED/SPARSE_HASHED/COMPLEX_KEY_HASHED/COMPLEX_KEY_SPARSE_HASHED) can load data only in one thread, since they use one hash table that cannot be filled from multiple threads. And in case you have a very big dictionary (i.e. 10e9 elements), it can take a while to load them, especially for SPARSE_HASHED variants (and if you have such an amount of elements there, you are likely to use SPARSE_HASHED, since it requires less memory), in my env it takes ~4 hours, which is an enormous amount of time. So this patch adds support of shards for dictionaries; the number of shards determines how many hash tables the dictionary will use and, more importantly, how many threads it can use to load the data. And with 16 threads this works 2x faster, not perfect though, see the follow up patches in this series. v0: PARTITION BY v1: SHARDS 1 v2: SHARDS(1) v3: tried optimized mod - logical and, but it does not gain even 10% v4: tried squashing more (max_block_size * shards), but it does not gain even 10% either v5: move SHARDS into layout parameters (unknown simply ignored) v6: tune params for perf tests (to avoid too long queries) Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2023-01-13 13:39:25 +01:00
Nikolai Kochetov	30310df5be	Merge branch 'master' into logical-optimizer-lowcardinality	2023-01-12 18:51:05 +01:00
Nikita Taranov	006fdd32d4	Apply preallocation optimisation more carefully (#44455) * impl * add perf test * fix * review fixes	2023-01-09 13:30:48 +01:00
Igor Nikonov	2187bdd4cc	Disable diagnostics + cleanup + disable optimization in sort performance test since it removes sorting at all	2023-01-06 17:00:05 +00:00
Nikolay Degterinsky	dfe93b5d82	Merge pull request #42284 from Algunenano/perf_experiment Performance experiment	2022-12-30 03:14:22 +01:00
Alexey Milovidov	79f2e747e4	Remove QuestDB (flaky test)	2022-12-28 12:42:14 +01:00
Raúl Marín	fc1fa82a39	Merge branch 'master' into perf_experiment	2022-12-27 10:51:58 +01:00
Raúl Marín	45d27f461b	Merge branch 'master' into perf_experiment	2022-12-20 09:07:48 +00:00
Kruglov Pavel	37df9b9990	Merge branch 'master' into refactor-schema-inference	2022-12-16 19:13:15 +01:00
Azat Khuzhin	53bac4de71	tests/perf: fix dependency check during DROP CI [1]: DB::Exception: Cannot drop or rename default.hierarchical_dictionary_source_table, because some tables depend on it: default.hierarchical_hashed_array_dictionary, default.hierarchical_flat_dictionary, default.hierarchical_hashed_dictionary. Stack trace: [1]: https://s3.amazonaws.com/clickhouse-test-reports/44256/8e67a361a8f14abec6717af09ee997eb25151685/performance_comparison_[1/4]/report.html Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>	2022-12-16 15:15:15 +01:00
Nikolay Degterinsky	9b6d31b95d	Merge branch 'master' into perf_experiment	2022-12-13 17:15:07 +01:00
avogar	7375a7d429	Refactor and improve schema inference for text formats	2022-12-07 21:19:27 +00:00
Guo Wangyang	b86686b3f8	Merge branch 'master' into logical-optimizer-lowcardinality	2022-12-07 13:33:25 +08:00
Maksim Kita	1cdc7ab62a	Merge pull request #43556 from Algunenano/interpretation_benchmark Add benchmark for query interpretation with JOINs	2022-12-01 22:53:02 +03:00
Vladimir C	53dc70a2d0	Merge pull request #38191 from BigRedEye/grace_hash_join Closes https://github.com/ClickHouse/ClickHouse/issues/11596	2022-11-30 17:01:00 +01:00
Nikolai Kochetov	51439e2c19	Merge pull request #43260 from ClickHouse/read-from-mt-in-io-pool Read from MergeTree in I/O pool	2022-11-29 12:09:03 +01:00
Nikolai Kochetov	d9fc13b230	Update async_remote_read.xml	2022-11-28 14:00:49 +01:00
Nikita Taranov	8ed5cfc265	Memory bound merging for distributed aggregation in order (#40879) * impl * fix style * make executeQueryWithParallelReplicas similar to executeQuery * impl for parallel replicas * cleaner code for remote sorting properties * update test * fix * handle when nodes of old versions participate * small fixes * temporary enable for testing * fix after merge * Revert "temporary enable for testing" This reverts commit `cce7f8884c`. * review fixes * add bc test * Update src/Core/Settings.h	2022-11-28 00:41:31 +01:00
Nikita Taranov	d1c258cf20	Add `xxh3` hash function (#43411) * impl * try fix * add docs * add test * rm unused file * excellent	2022-11-26 00:14:08 +01:00
Nikolai Kochetov	4632e7c644	Add max_streams_for_merge_tree_reading setting.	2022-11-25 17:14:22 +00:00
Nikolai Kochetov	dfd3976040	Update async_remote_read.xml	2022-11-25 14:53:45 +01:00
Igor Nikonov	236e7e3989	Small fixes	2022-11-25 12:04:12 +00:00
Igor Nikonov	20e67b7140	Merge remote-tracking branch 'origin/master' into HEAD	2022-11-24 13:10:37 +00:00
Nikolai Kochetov	e79c91947a	Update async_remote_read.xml	2022-11-24 12:35:02 +01:00
Raúl Marín	e910648c5d	Add benchmark for query interpretation with JOINs	2022-11-23 13:15:35 +01:00
Raúl Marín	ed0c174c0c	Merge remote-tracking branch 'blessed/master' into perf_experiment	2022-11-21 11:02:31 +01:00
Guo Wangyang	7d6ff90e34	Merge branch 'master' into logical-optimizer-lowcardinality	2022-11-20 09:56:50 +08:00
Nikolai Kochetov	5da1d893fd	Merge branch 'master' into read-from-mt-in-io-pool	2022-11-18 21:10:45 +01:00
Nikita Taranov	7beb58b0cf	Optimize merge of uniqExact without_key (#43072) * impl for uniqExact * rm unused (read|write)Text methods * fix style * small fixes * impl for variadic uniqExact * refactor * fix style * more aggressive inlining * disable if max_threads=1 * small improvements * review fixes * Revert "rm unused (read|write)Text methods" This reverts commit `a7e7480584`. * encapsulate is_able_to_parallelize_merge in Data * encapsulate is_exact & argument_is_tuple in Data	2022-11-17 13:19:02 +01:00
Kruglov Pavel	1b68f605a2	Merge pull request #42761 from AlfVII/fix-slow-json-extract-with-low-cardinality Fixed slowness in JSONExtract with LowCardinality(String) tuples	2022-11-17 12:49:18 +01:00
Raúl Marín	97d6fc3071	Merge remote-tracking branch 'blessed/master' into perf_experiment	2022-11-17 11:48:46 +01:00
Nikolai Kochetov	10f449c6c1	Add a query to perftest.	2022-11-15 18:08:03 +00:00
李扬	1de5bb2392	Add function canonicalRand (#43124) * add function canonicalRand * add perf test * revert rand.xml	2022-11-15 00:27:19 +01:00


1012 Commits