Commit Graph

113 Commits

Author SHA1 Message Date
Anton Popov
3548e31afd fix corner cases in vertical merges with ReplacingMergeTree 2021-04-22 01:27:18 +03:00
Anton Popov
88cd775f6a Merge remote-tracking branch 'upstream/master' into HEAD 2021-04-02 00:14:03 +03:00
Nikolai Kochetov
ed19864e5b
Update CollapsingSortedAlgorithm.cpp
Add comment.
2021-03-23 11:25:45 +03:00
Nikolai Kochetov
d7fc5e69f4
Update CollapsingSortedAlgorithm.h
Added comment.
2021-03-23 11:18:49 +03:00
Nikolai Kochetov
35ff8925df Fix crash. 2021-03-22 17:43:45 +03:00
Nikolai Kochetov
22444045f0 CollapsingSortedAlgorithm should not return more than index_granularity rows. 2021-03-22 16:18:14 +03:00
Anton Popov
6f7c8894e9 fix bugs in aggregation by primary key 2021-03-18 23:57:42 +03:00
Alexander Kuzmenkov
4b0cbb6ed7
Update FinishAggregatingInOrderAlgorithm.h 2021-03-09 13:43:06 +03:00
Anton Popov
87da7f0589
Update FinishAggregatingInOrderAlgorithm.h 2021-03-05 12:50:28 +03:00
Anton Popov
2e8b45afc1 fix ubsan report 2021-02-01 16:35:08 +03:00
Anton Popov
38e8bab6b1 fix tests 2021-01-27 03:44:36 +03:00
Anton Popov
666aab676e add comments to algorithm 2021-01-26 21:45:22 +03:00
Anton Popov
573edbcd11 improve performance of aggregation in order of sorting key 2021-01-22 05:34:08 +03:00
Alexey Milovidov
dab4719aac Remove some headers 2021-01-05 06:22:06 +03:00
Amos Bird
6a644b2af1
Fix SimpleAggregateFunction in SummingMergeTree 2 2021-01-01 12:42:22 +08:00
alexey-milovidov
a2e1f21ef2
Merge pull request #18637 from amosbird/summingsimple
Fix SimpleAggregateFunction in SummingMergeTree
2020-12-31 15:23:26 +03:00
Amos Bird
ae72f96111
Fix SimpleAggregateFunction in SummingMergeTree 2020-12-30 22:38:11 +08:00
Alexey Milovidov
be884a89f8 Minor fixes for min/sim hash 2020-12-29 13:16:43 +03:00
Pavel Kruglov
4b58528b9e Rename getPos to getRow, change mergeBlock, pass setting instead of context 2020-12-04 19:25:30 +03:00
Pavel Kruglov
905ba78adc Merge branch 'master' of github.com:ClickHouse/ClickHouse into optimize-data-on-insert 2020-12-04 18:56:46 +03:00
Azat Khuzhin
9c801239cf
Better description of costly ArenaAllocChunks in AggregatingSortedAlgorithm
Co-authored-by: Alexander Kuzmenkov <36882414+akuzm@users.noreply.github.com>
2020-12-01 00:52:12 +03:00
Azat Khuzhin
35231662b3 Improve performance of AggregatingMergeTree w/ SimpleAggregateFunction(String)
While reading from AggregatingMergeTree with
SimpleAggregateFunction(String) in the primary key and
optimize_aggregation_in_order enabled, perf top shows:

    Samples: 1M of event 'cycles', 4000 Hz, Event count (approx.): 287759760270 lost: 0/0 drop: 0/0
      Children      Self  Shared Object         Symbol
    +   12.64%    11.39%  clickhouse            [.] memcpy
    +    9.08%     0.23%  [unknown]             [.] 0000000000000000
    +    8.45%     8.40%  clickhouse            [.] ProfileEvents::increment    # <-- this, and in debug it has not 0.08x overhead, but 5.8x overhead
    +    7.68%     7.67%  clickhouse            [.] LZ4_compress_fast_extState
    +    5.29%     5.22%  clickhouse            [.] DB::IAggregateFunctionHelper<DB::AggregateFunctionNullUnary<true, true> >::addFree

The reason is obvious: ProfileEvents are atomic counters (and they are
also nested):

<details>

```
    Samples: 7M of event 'cycles', 4000 Hz, Event count (approx.): 450726149337
    ProfileEvents::increment  /usr/bin/clickhouse [Percent: local period]
    Percent│
           │
           │
           │    Disassembly of section .text:
           │
           │    00000000078d8900 <ProfileEvents::increment(unsigned long, unsigned long)@@Base>:
           │    ProfileEvents::increment(unsigned long, unsigned long):
      0.17 │      push  %rbp
      0.00 │      mov   %rsi,%rbp
      0.04 │      push  %rbx
      0.20 │      mov   %rdi,%rbx
      0.17 │      sub   $0x8,%rsp
      0.26 │    → callq DB::CurrentThread::getProfileEvents
           │    ProfileEvents::Counters::increment(unsigned long, unsigned long):
      0.00 │      lea   0x0(,%rbx,8),%rdi
      0.05 │      nop
           │    unsigned long std::__1::__cxx_atomic_fetch_add<unsigned long, unsigned long>(std::__1::__cxx_atomic_base_impl<unsigned long>*, unsigned long, std::__1::memory_order):
      1.02 │      mov   (%rax),%rdx
     97.04 │      lock  add   %rbp,(%rdx,%rdi,1)
           │    ProfileEvents::Counters::increment(unsigned long, unsigned long):
      0.21 │      mov   0x10(%rax),%rax
      0.04 │      test  %rax,%rax
      0.00 │    → jne   78d8920 <ProfileEvents::increment(unsigned long, unsigned long)@@Base+0x20>
           │    ProfileEvents::increment(unsigned long, unsigned long):
      0.38 │      add   $0x8,%rsp
      0.00 │      pop   %rbx
      0.04 │      pop   %rbp
      0.38 │    ← retq
```

</details>
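
The nested atomic counters behind that `lock add` can be pictured with a minimal sketch. This is not the ClickHouse implementation: `NestedCounters`, `num_events` and the member names are hypothetical, and the only assumption carried over from the text is that each increment is an atomic add repeated on every level of the parent chain.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified model of nested event counters (illustrative names only).
struct NestedCounters
{
    static constexpr std::size_t num_events = 4;

    std::atomic<std::uint64_t> values[num_events];
    NestedCounters * parent;  // e.g. thread-level counters pointing at a global level

    explicit NestedCounters(NestedCounters * parent_ = nullptr) : parent(parent_)
    {
        for (auto & value : values)
            value.store(0, std::memory_order_relaxed);
    }

    // One atomic read-modify-write per nesting level: this is where the cost comes from.
    void increment(std::size_t event, std::uint64_t amount = 1)
    {
        assert(event < num_events);
        for (NestedCounters * level = this; level != nullptr; level = level->parent)
            level->values[event].fetch_add(amount, std::memory_order_relaxed);
    }
};
```

Even with relaxed ordering, every call pays for one locked instruction per level, so a counter bumped millions of times per second shows up in the profile.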

The ProfileEvents counter in question was ArenaAllocChunks (it shows
~1.5M events per second), and the reason is that the table has
SimpleAggregateFunction(String) in the PK, which requires an Arena.
But most of the time the Arena wasn't even used, so avoid this cost by
re-creating the Arena only if it was "used" (i.e. has new chunks).
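
The "re-create only if used" idea can be sketched as follows. This is an illustration under stated assumptions, not the actual patch: `ToyArena`, `GroupState`, `startGroup` and `chunk_allocations` are hypothetical names, the toy arena allocates its first chunk in the constructor (as the text says the real one does), and "used" is taken to mean that the arena grew beyond that first chunk.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Plays the role of the ArenaAllocChunks counter (hypothetical stand-in).
static std::size_t chunk_allocations = 0;

// Toy arena, not DB::Arena: the first chunk is allocated eagerly in the constructor.
class ToyArena
{
public:
    explicit ToyArena(std::size_t chunk_size = 4096) : chunk_size_(chunk_size)
    {
        addChunk(chunk_size_);
    }

    char * alloc(std::size_t size)
    {
        assert(size > 0);
        if (used_ + size > chunks_.back().size())
            addChunk(size > chunk_size_ ? size : chunk_size_);
        char * result = chunks_.back().data() + used_;
        used_ += size;
        return result;
    }

    // Sum of chunk sizes; grows only when a new chunk is added.
    std::size_t allocatedBytes() const { return allocated_bytes_; }

private:
    void addChunk(std::size_t size)
    {
        chunks_.emplace_back(size);
        allocated_bytes_ += size;
        used_ = 0;
        ++chunk_allocations;
    }

    std::size_t chunk_size_;
    std::size_t used_ = 0;
    std::size_t allocated_bytes_ = 0;
    std::vector<std::vector<char>> chunks_;
};

class GroupState
{
public:
    // Called at the start of every group of rows with equal key.
    void startGroup()
    {
        // Re-create the arena only if it acquired new chunks since creation.
        if (arena_->allocatedBytes() > initial_bytes_)
        {
            arena_ = std::make_unique<ToyArena>();
            initial_bytes_ = arena_->allocatedBytes();
        }
    }

    ToyArena & arena() { return *arena_; }

private:
    std::unique_ptr<ToyArena> arena_ = std::make_unique<ToyArena>();
    std::size_t initial_bytes_ = arena_->allocatedBytes();
};
```

With this check, calling `startGroup()` repeatedly on an arena that never grew performs no chunk allocations at all, whereas unconditionally re-creating the arena would allocate one chunk per group.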

Another possibility is to avoid populating Arena::head in the ctor, but
this would make the Arena code more complex, so for now re-creating the
Arena only when it was used is preferred.

Also, as a long-term solution, it is worth looking at implementing the
ProfileEvents counters via RCU (to move the extra overhead from the
write code path to the read side).
2020-11-19 23:06:12 +03:00
Pavel Kruglov
547ec19fb3 Merge branch 'master' of github.com:ClickHouse/ClickHouse into optimize-data-on-insert 2020-11-18 12:01:59 +03:00
Pavel Kruglov
6a57c0a8cf Move merge in MergeTreeDataWriter 2020-11-13 10:55:56 +03:00
Pavel Kruglov
8d5e0784d3 Add setting optimize_on_insert 2020-11-12 23:37:23 +03:00
Alexander Tokmakov
b94cc5c4e5 remove more stringstreams 2020-11-10 21:22:26 +03:00
Alexey Milovidov
fd84d16387 Fix "server failed to start" error 2020-11-07 03:14:53 +03:00
Alexey Milovidov
17b3dff0c2 Whitespaces 2020-11-06 20:58:04 +03:00
alexey-milovidov
10e9d14466
Merge pull request #15818 from ClickHouse/style-pragma-once
Check for #pragma once in headers
2020-10-11 13:14:09 +03:00
Alexey Milovidov
269b6383f5 Check for #pragma once in headers 2020-10-10 21:37:02 +03:00
Alexey Milovidov
5b482f4191 Cleanups 2020-10-10 19:31:10 +03:00
Alexey Milovidov
edd89a8610 Fix half of typos 2020-08-08 03:47:03 +03:00
Nikita Mikhaylov
d31ed58f01 done 2020-07-06 17:33:31 +03:00
Alexey Milovidov
1462a66d1e Fix typos 2020-06-27 22:05:00 +03:00
Anton Popov
14e09e5650 Merge remote-tracking branch 'upstream/master' into HEAD 2020-06-25 14:59:15 +03:00
Nikolai Kochetov
35ab9ad051 Merge branch 'master' into fix-parallel-final-stuck 2020-06-18 16:04:14 +03:00
Anton Popov
8ba5bd8530 Merge remote-tracking branch 'upstream/master' into distinct-combinator 2020-06-18 01:44:36 +03:00
Anton Popov
88b325dcdc rework distinct combinator 2020-06-17 22:36:27 +03:00
Nikolai Kochetov
b456a3cc77 Fix tests. 2020-06-15 20:48:04 +03:00
Nikolai Kochetov
5436ef38bf Fix MergingSortedAlgorithm. 2020-06-15 18:21:10 +03:00
Nikolai Kochetov
b5ecef6adf Fix tests. 2020-06-15 16:56:38 +03:00
Nikolai Kochetov
ccf2ceb876 Fix pipeline stuck for parallel final. 2020-06-15 14:02:47 +03:00
Alexey Milovidov
394fb64a9c Better way of implementation 2020-06-14 20:42:11 +03:00
Alexey Milovidov
25f941020b Remove namespace pollution 2020-05-31 00:57:37 +03:00
Alexey Milovidov
5aff138956 Preparation for structured logging 2020-05-31 00:35:52 +03:00
Alexey Milovidov
7e1813825b Return old names of macros 2020-05-24 01:24:01 +03:00
Alexey Milovidov
ce0619dabf Progress on task 2020-05-24 00:26:45 +03:00
Alexey Milovidov
9d2a0d2dd7 Apply all transformations again 2020-05-23 21:59:49 +03:00
Alexey Milovidov
a2ad11897f Remove duplicate whitespaces (preparation) 2020-05-23 21:53:58 +03:00
Alexey Milovidov
1f13515a65 Make all LOG in single line (preparation) 2020-05-23 21:31:37 +03:00