<p>We process about 25 billion events (page views, conversions, etc.).</p>
<p>We must generate and show reports in real time.</p>
</section>
<section class="slide">
<h2>The old Metrica (RIP 2008–2014)</h2>
<p>Everything worked fine. Users could view about 50 different reports.</p>
<p>But there was a problem. We wanted more than just 50 predefined reports. We needed to make every report infinitely customizable. Users had to be able to slice and dice, and to drill down in every report from the summary to individual visitors.</p>
</section>
<section class="slide">
<h2>The report builder</h2>
<p>We quickly built a prototype of a so-called "report builder".</p>
<p>This was in 2010. It was just a simple, specialized column-oriented data structure.</p>
<p>It worked well, and it showed us the right direction to go in.</p>
<p>We needed a good column-oriented DBMS.</p>
</section>
<section class="slide">
<h2>Why column-oriented?</h2>
<p>This is how "traditional" row-oriented databases work:</p>
<p><img src="pictures/row_oriented.gif"/></p>
</section>
<section class="slide">
<h2>Why column-oriented?</h2>
<p>And this is how column-oriented databases work:</p>
<p><img src="pictures/column_oriented.gif"/></p>
</section>
<section class="slide">
<h2>Why ClickHouse?</h2>
<p>In 2011 there was nothing suitable on the market. In fact, there is nothing comparable even now.</p>
<p>So we developed ClickHouse.</p>
<p>See the article «Evolution of data structures in Yandex.Metrica».</p>
<p>There were even cases when individual analysts installed ClickHouse on their VMs and started using it without asking any questions.</p>
</section>
<section class="slide">
<h2>Open-source</h2>
<p>Then we decided that ClickHouse is just too good to be used solely by Yandex.</p>
<p>To have more fun, we wanted more companies and people around the world to use ClickHouse and be happy with it. So we decided to go open source.</p>
</section>
<section class="slide">
<h2>Open-source</h2>
<p>Apache 2.0 license, which is very permissive.</p>
<p>The goal is the widest possible adoption of ClickHouse.</p>
<p>We want a product made by Yandex to be used everywhere.</p>
<h2 style="font-size: 40px;">ClickHouse vs. typical row-oriented DBMS</h2>
<p>Itai Shirav:<br/><br/>«I haven't made a rigorous comparison, but I did convert a time-series table with 9 million rows from Postgres to ClickHouse.</p>
<p>Under ClickHouse queries run about 100 times faster, and the table takes 20 times less disk space. Which is pretty amazing if you ask me».</p>
</section>
<section class="slide">
<h2> </h2>
<p>Bao Dang:<br/><br/>«Obviously, ClickHouse outperformed PostgreSQL at any metric».</p>
<p>Timur Shenkao:<br/><br/>«ClickHouse is extremely fast at simple SELECTs without joins, much faster than Vertica».</p>
</section>
<section class="slide">
<h2>ClickHouse vs. PrestoDB</h2>
<p>Ömer Osman Koçak:<br/><br/>
«When we evaluated ClickHouse the results were great compared to Prestodb. Even though the columnar storage optimizations for ORC and Clickhouse is quite similar, Clickhouse uses CPU and Memory resources more efficiently (Presto also uses vectorized execution but cannot take advantage of hardware level optimizations such as SIMD instruction sets because it's written in Java so that's fair) so we also wanted to add support for Clickhouse for our open-source analytics platform Rakam (https://github.com/rakam-io/rakam)»</p>
</section>
<section class="slide">
<h2>ClickHouse vs. Spark</h2>
<p>«I tested ClickHouse, and the speed is simply excellent: much faster than Spark on a single machine (I got about 3x, but I will keep comparing). The compression is also better».</p>
</section>
<section class="slide">
<h2>ClickHouse vs. Google BigQuery</h2>
<p>«ClickHouse shows comparable speed on <u>this query</u> over 30 days and is 8 times faster (!) on <u>this query</u>. We plan to test other queries as well, but have not gotten to them yet.<br/><br/>Query execution speed is stable. In Google BigQuery, during peak load periods, for example at 4:00 p.m. PDT or at the beginning of the month, query execution time can increase noticeably».</p>
</section>
<section class="slide">
<h2>ClickHouse vs. Druid</h2>
<p>«This year we deployed a build based on Druid (Imply Analytics Platform, along with Tranquility) and were already getting ready to launch it in production… But after ClickHouse was released, we dropped Druid immediately, even though we had spent two months studying and deploying it».</p>