Merge branch 'master' into bitcast

This commit is contained in:
Robert Schulze 2022-11-07 09:24:52 +01:00 committed by GitHub
commit cdaf0becfe
68 changed files with 1061 additions and 279 deletions

contrib/NuRaft vendored

@ -1 +1 @@
Subproject commit 1be805e7cb2494aa8170015493474379b0362dfc
Subproject commit e4e746a24eb56861a86f3672771e3308d8c40722

View File

@ -4,25 +4,39 @@ sidebar_label: Cell Towers
sidebar_position: 3
title: "Cell Towers"
---
import ConnectionDetails from '@site/docs/en/_snippets/_gather_your_details_http.mdx';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
import CodeBlock from '@theme/CodeBlock';
import ActionsMenu from '@site/docs/en/_snippets/_service_actions_menu.md';
import SQLConsoleDetail from '@site/docs/en/_snippets/_launch_sql_console.md';
import SupersetDocker from '@site/docs/en/_snippets/_add_superset_detail.md';
This dataset is from [OpenCellid](https://www.opencellid.org/) - The world's largest Open Database of Cell Towers.
## Goal
In this guide you will learn how to:
- Load the OpenCelliD data into ClickHouse
- Connect Apache Superset to ClickHouse
- Build a dashboard based on data available in the dataset
Here is a preview of the dashboard created in this guide:
![Dashboard of cell towers by radio type in mcc 204](@site/docs/en/getting-started/example-datasets/images/superset-cell-tower-dashboard.png)
## Get the Dataset {#get-the-dataset}
This dataset is from [OpenCelliD](https://www.opencellid.org/) - The world's largest Open Database of Cell Towers.
As of 2021, it contains more than 40 million records about cell towers (GSM, LTE, UMTS, etc.) around the world with their geographical coordinates and metadata (country code, network, etc.).
The OpenCelliD Project is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, and we redistribute a snapshot of this dataset under the terms of the same license. The up-to-date version of the dataset is available to download after signing in.
## Get the Dataset {#get-the-dataset}
<Tabs groupId="deployMethod">
<TabItem value="serverless" label="ClickHouse Cloud" default>
### Load the sample data
ClickHouse Cloud provides an easy-button for uploading this dataset from S3. Log in to your ClickHouse Cloud organization, or create a free trial at [ClickHouse.cloud](https://clickhouse.cloud).
<ActionsMenu menu="Load Data" />
@ -30,13 +44,33 @@ Choose the **Cell Towers** dataset from the **Sample data** tab, and **Load data
![Load cell towers dataset](@site/docs/en/_snippets/images/cloud-load-data-sample.png)
Examine the schema of the cell_towers table:
### Examine the schema of the cell_towers table
```sql
DESCRIBE TABLE cell_towers
```
<SQLConsoleDetail />
This is the output of `DESCRIBE`. The field type choices are described further down in this guide.
```response
┌─name──────────┬─type──────────────────────────────────────────────────────────────────┬
│ radio │ Enum8('' = 0, 'CDMA' = 1, 'GSM' = 2, 'LTE' = 3, 'NR' = 4, 'UMTS' = 5) │
│ mcc │ UInt16 │
│ net │ UInt16 │
│ area │ UInt16 │
│ cell │ UInt64 │
│ unit │ Int16 │
│ lon │ Float64 │
│ lat │ Float64 │
│ range │ UInt32 │
│ samples │ UInt32 │
│ changeable │ UInt8 │
│ created │ DateTime │
│ updated │ DateTime │
│ averageSignal │ UInt8 │
└───────────────┴───────────────────────────────────────────────────────────────────────┴
```
</TabItem>
<TabItem value="selfmanaged" label="Self-managed">
@ -86,7 +120,7 @@ clickhouse-client --query "INSERT INTO cell_towers FORMAT CSVWithNames" < cell_t
</TabItem>
</Tabs>
## Example queries {#examples}
## Run some example queries {#examples}
1. A number of cell towers by type:
@ -127,13 +161,13 @@ SELECT mcc, count() FROM cell_towers GROUP BY mcc ORDER BY count() DESC LIMIT 10
10 rows in set. Elapsed: 0.019 sec. Processed 43.28 million rows, 86.55 MB (2.33 billion rows/s., 4.65 GB/s.)
```
So, the top countries are: the USA, Germany, and Russia.
Based on the above query and the [MCC list](https://en.wikipedia.org/wiki/Mobile_country_code), the countries with the most cell towers are: the USA, Germany, and Russia.
You may want to create an [External Dictionary](../../sql-reference/dictionaries/external-dictionaries/external-dicts.md) in ClickHouse to decode these values.
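As a minimal sketch of such a dictionary (the dictionary name, CSV path, and attribute names below are illustrative and not part of the dataset), it could be defined and queried like this:
```sql
-- Assumes a two-column CSV (mcc, country) placed in user_files; adjust the path to your setup.
CREATE DICTIONARY mcc_names
(
    mcc UInt64,
    country String
)
PRIMARY KEY mcc
SOURCE(FILE(path '/var/lib/clickhouse/user_files/mcc_names.csv' format 'CSVWithNames'))
LAYOUT(FLAT())
LIFETIME(0);

-- Decode the MCC values from the previous query.
SELECT dictGet('mcc_names', 'country', toUInt64(mcc)) AS country, count()
FROM cell_towers
GROUP BY country
ORDER BY count() DESC
LIMIT 10;
```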
## Use case: Incorporate geo data {#use-case}
Using `pointInPolygon` function.
Using the [`pointInPolygon`](/docs/en/sql-reference/functions/geo/coordinates.md/#pointinpolygon) function.
1. Create a table where we will store polygons:
@ -224,6 +258,110 @@ WHERE pointInPolygon((lon, lat), (SELECT * FROM moscow))
1 rows in set. Elapsed: 0.067 sec. Processed 43.28 million rows, 692.42 MB (645.83 million rows/s., 10.33 GB/s.)
```
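As a quick, self-contained illustration of the function itself (a made-up square rather than the Moscow polygon used above), `pointInPolygon` takes a point tuple and an array of vertex tuples and returns 1 when the point lies inside the polygon:
```sql
SELECT pointInPolygon((3., 3.), [(1., 1.), (1., 5.), (5., 5.), (5., 1.)]) AS inside
```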
The data is also available for interactive queries in the [Playground](https://play.clickhouse.com/play?user=play), [example](https://play.clickhouse.com/play?user=play#U0VMRUNUIG1jYywgY291bnQoKSBGUk9NIGNlbGxfdG93ZXJzIEdST1VQIEJZIG1jYyBPUkRFUiBCWSBjb3VudCgpIERFU0M=), although you cannot create temporary tables there.
## Review of the schema
Before building visualizations in Superset, have a look at the columns that you will use. This dataset primarily provides the location (longitude and latitude) and radio types of mobile cellular towers worldwide. The column descriptions can be found in the [community forum](https://community.opencellid.org/t/documenting-the-columns-in-the-downloadable-cells-database-csv/186). The columns used in the visualizations that will be built are described below.
Here is a description of the columns taken from the OpenCelliD forum:
| Column | Description |
|--------------|--------------------------------------------------------|
| radio | Technology generation: CDMA, GSM, UMTS, 5G NR |
| mcc | Mobile Country Code: `204` is The Netherlands |
| lon | Longitude: With Latitude, approximate tower location |
| lat | Latitude: With Longitude, approximate tower location |
:::tip mcc
To find your MCC check [Mobile network codes](https://en.wikipedia.org/wiki/Mobile_country_code), and use the three digits in the **Mobile country code** column.
:::
The schema for this table was designed for compact storage on disk and query speed.
- The `radio` data is stored as an `Enum8` (`UInt8`) rather than a string.
- `mcc`, or Mobile Country Code, is stored as a `UInt16` as we know the range is 1 - 999.
- `lon` and `lat` are `Float64`.
None of the other fields are used in the queries or visualizations in this guide, but they are described in the forum linked above if you are interested.
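For reference, a `CREATE TABLE` statement consistent with the `DESCRIBE` output above could look like the sketch below; the `ORDER BY` key shown here is an assumption chosen to suit the queries in this guide, not necessarily what the hosted sample table uses.
```sql
CREATE TABLE cell_towers
(
    radio Enum8('' = 0, 'CDMA' = 1, 'GSM' = 2, 'LTE' = 3, 'NR' = 4, 'UMTS' = 5),
    mcc UInt16,
    net UInt16,
    area UInt16,
    cell UInt64,
    unit Int16,
    lon Float64,
    lat Float64,
    range UInt32,
    samples UInt32,
    changeable UInt8,
    created DateTime,
    updated DateTime,
    averageSignal UInt8
)
ENGINE = MergeTree
ORDER BY (radio, mcc, net, created);
```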
## Build visualizations with Apache Superset
Superset is easy to run from Docker. If you already have Superset running, all you need to do is add ClickHouse Connect with `pip install clickhouse-connect`. If you need to install Superset, open the **Launch Apache Superset in Docker** section directly below.
<SupersetDocker />
To build a Superset dashboard using the OpenCelliD dataset you should:
- Add your ClickHouse service as a Superset **database**
- Add the table **cell_towers** as a Superset **dataset**
- Create some **charts**
- Add the charts to a **dashboard**
### Add your ClickHouse service as a Superset database
<ConnectionDetails />
In Superset a database can be added by choosing the database type and then providing the connection details. Open Superset and look for the **+**; it has a menu with **Data** and then **Connect database** options.
![Add a database](@site/docs/en/getting-started/example-datasets/images/superset-add.png)
Choose **ClickHouse Connect** from the list:
![Choose clickhouse connect as database type](@site/docs/en/getting-started/example-datasets/images/superset-choose-a-database.png)
:::note
If **ClickHouse Connect** is not one of your options, then you will need to install it. The command is `pip install clickhouse-connect`, and more info is [available here](https://pypi.org/project/clickhouse-connect/).
:::
#### Add your connection details:
:::tip
Make sure that you set **SSL** on when connecting to ClickHouse Cloud or other ClickHouse systems that enforce the use of SSL.
:::
![Add ClickHouse as a Superset datasource](@site/docs/en/getting-started/example-datasets/images/superset-connect-a-database.png)
### Add the table **cell_towers** as a Superset **dataset**
In Superset a **dataset** maps to a table within a database. Click on add a dataset, choose your ClickHouse service, the database containing your table (`default`), and the `cell_towers` table:
![Add cell_towers table as a dataset](@site/docs/en/getting-started/example-datasets/images/superset-add-dataset.png)
### Create some **charts**
When you choose to add a chart in Superset you have to specify the dataset (`cell_towers`) and the chart type. Since the OpenCelliD dataset provides longitude and latitude coordinates for cell towers, we will create a **Map** chart. The **deck.gl Scatterplot** type is suited to this dataset as it works well with dense data points on a map.
![Create a map in Superset](@site/docs/en/getting-started/example-datasets/images/superset-create-map.png)
#### Specify the query used for the map
A deck.gl Scatterplot requires a longitude and latitude, and one or more filters can also be applied to the query. In this example two filters are applied, one for cell towers with UMTS radios, and one for the Mobile country code assigned to The Netherlands.
The fields `lon` and `lat` contain the longitude and latitude:
![Specify longitude and latitude fields](@site/docs/en/getting-started/example-datasets/images/superset-lon-lat.png)
Add a filter with `mcc` = `204` (or substitute any other `mcc` value):
![Filter on MCC 204](@site/docs/en/getting-started/example-datasets/images/superset-mcc-204.png)
Add a filter with `radio` = `'UMTS'` (or substitute any other `radio` value, you can see the choices in the output of `DESCRIBE TABLE cell_towers`):
![Filter on radio = UMTS](@site/docs/en/getting-started/example-datasets/images/superset-radio-umts.png)
This is the full configuration for the chart that filters on `radio = 'UMTS'` and `mcc = 204`:
![Chart for UMTS radios in MCC 204](@site/docs/en/getting-started/example-datasets/images/superset-umts-netherlands.png)
Click on **UPDATE CHART** to render the visualization.
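In SQL terms, the data behind this chart is roughly what the following query returns (a sketch for orientation; Superset generates its own query):
```sql
SELECT lon, lat
FROM cell_towers
WHERE radio = 'UMTS' AND mcc = 204;
```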
### Add the charts to a **dashboard**
This screenshot shows cell tower locations with LTE, UMTS, and GSM radios. The charts are all created in the same way and they are added to a dashboard.
![Dashboard of cell towers by radio type in mcc 204](@site/docs/en/getting-started/example-datasets/images/superset-cell-tower-dashboard.png)
:::tip
The data is also available for interactive queries in the [Playground](https://play.clickhouse.com/play?user=play).
This [example](https://play.clickhouse.com/play?user=play#U0VMRUNUIG1jYywgY291bnQoKSBGUk9NIGNlbGxfdG93ZXJzIEdST1VQIEJZIG1jYyBPUkRFUiBCWSBjb3VudCgpIERFU0M=) will populate the username and even the query for you.
Although you cannot create tables in the Playground, you can run all of the queries and even use Superset (adjust the hostname and port number).
:::

(10 binary image files added, not shown: 35 KiB, 35 KiB, 475 KiB, 53 KiB, 73 KiB, 290 KiB, 38 KiB, 12 KiB, 12 KiB, 46 KiB.)

View File

@ -4,7 +4,7 @@ sidebar_label: Recipes Dataset
title: "Recipes Dataset"
---
RecipeNLG dataset is available for download [here](https://recipenlg.cs.put.poznan.pl/dataset). It contains 2.2 million recipes. The size is slightly less than 1 GB.
The RecipeNLG dataset is available for download [here](https://recipenlg.cs.put.poznan.pl/dataset). It contains 2.2 million recipes. The size is slightly less than 1 GB.
## Download and Unpack the Dataset

View File

@ -126,7 +126,7 @@ clickhouse keeper --config /etc/your_path_to_config/config.xml
ClickHouse Keeper also provides 4lw commands which are almost the same as in ZooKeeper. Each command is composed of four letters such as `mntr`, `stat`, etc. There are some more interesting commands: `stat` gives some general information about the server and connected clients, while `srvr` and `cons` give extended details on the server and connections respectively.
The 4lw commands have a white list configuration `four_letter_word_white_list` which has the default value `conf,cons,crst,envi,ruok,srst,srvr,stat,wchc,wchs,dirs,mntr,isro`.
The 4lw commands have a white list configuration `four_letter_word_white_list` which has the default value `conf,cons,crst,envi,ruok,srst,srvr,stat,wchs,dirs,mntr,isro,rcvr,apiv,csnp,lgif`.
You can issue the commands to ClickHouse Keeper via telnet or nc, at the client port.
@ -309,6 +309,25 @@ Sessions with Ephemerals (1):
/clickhouse/task_queue/ddl
```
- `csnp`: Schedule a snapshot creation task. Returns the last committed log index of the scheduled snapshot if successful, or `Failed to schedule snapshot creation task.` if it failed. Note that the `lgif` command can help you determine whether the snapshot is done.
```
100
```
- `lgif`: Keeper log information. `first_log_idx` : my first log index in log store; `first_log_term` : my first log term; `last_log_idx` : my last log index in log store; `last_log_term` : my last log term; `last_committed_log_idx` : my last committed log index in state machine; `leader_committed_log_idx` : leader's committed log index from my perspective; `target_committed_log_idx` : target log index should be committed to; `last_snapshot_idx` : the largest committed log index in last snapshot.
```
first_log_idx 1
first_log_term 1
last_log_idx 101
last_log_term 1
last_committed_log_idx 100
leader_committed_log_idx 101
target_committed_log_idx 101
last_snapshot_idx 50
```
## Migration from ZooKeeper {#migration-from-zookeeper}
Seamless migration from ZooKeeper to ClickHouse Keeper is not possible: you have to stop your ZooKeeper cluster, convert the data, and start ClickHouse Keeper. The `clickhouse-keeper-converter` tool allows converting ZooKeeper logs and snapshots to a ClickHouse Keeper snapshot. It works only with ZooKeeper > 3.4. Steps for migration:

View File

@ -8,70 +8,69 @@ title: "Geo Functions"
## Geographical Coordinates Functions
- [greatCircleDistance](./coordinates.md#greatCircleDistance)
- [geoDistance](./coordinates.md#geoDistance)
- [greatCircleAngle](./coordinates.md#greatCircleAngle)
- [pointInEllipses](./coordinates.md#pointInEllipses)
- [pointInPolygon](./coordinates.md#pointInPolygon)
- [greatCircleDistance](./coordinates.md#greatcircledistance)
- [geoDistance](./coordinates.md#geodistance)
- [greatCircleAngle](./coordinates.md#greatcircleangle)
- [pointInEllipses](./coordinates.md#pointinellipses)
- [pointInPolygon](./coordinates.md#pointinpolygon)
## Geohash Functions
- [geohashEncode](./geohash.md#geohashEncode)
- [geohashDecode](./geohash.md#geohashDecode)
- [geohashesInBox](./geohash.md#geohashesInBox)
- [geohashEncode](./geohash.md#geohashencode)
- [geohashDecode](./geohash.md#geohashdecode)
- [geohashesInBox](./geohash.md#geohashesinbox)
## H3 Indexes Functions
- [h3IsValid](./h3.md#h3IsValid)
- [h3GetResolution](./h3.md#h3GetResolution)
- [h3EdgeAngle](./h3.md#h3EdgeAngle)
- [h3EdgeLengthM](./h3.md#h3EdgeLengthM)
- [h3EdgeLengthKm](./h3.md#h3EdgeLengthKm)
- [geoToH3](./h3.md#geoToH3)
- [h3ToGeo](./h3.md#h3ToGeo)
- [h3ToGeoBoundary](./h3.md#h3ToGeoBoundary)
- [h3kRing](./h3.md#h3kRing)
- [h3GetBaseCell](./h3.md#h3GetBaseCell)
- [h3HexAreaM2](./h3.md#h3HexAreaM2)
- [h3HexAreaKm2](./h3.md#h3HexAreaKm2)
- [h3IndexesAreNeighbors](./h3.md#h3IndexesAreNeighbors)
- [h3ToChildren](./h3.md#h3ToChildren)
- [h3ToParent](./h3.md#h3ToParent)
- [h3ToString](./h3.md#h3ToString)
- [stringToH3](./h3.md#stringToH3)
- [h3GetResolution](./h3.md#h3GetResolution)
- [h3IsResClassIII](./h3.md#h3IsResClassIII)
- [h3IsPentagon](./h3.md#h3IsPentagon)
- [h3GetFaces](./h3.md#h3GetFaces)
- [h3CellAreaM2](./h3.md#h3CellAreaM2)
- [h3CellAreaRads2](./h3.md#h3CellAreaRads2)
- [h3ToCenterChild](./h3.md#h3ToCenterChild)
- [h3ExactEdgeLengthM](./h3.md#h3ExactEdgeLengthM)
- [h3ExactEdgeLengthKm](./h3.md#h3ExactEdgeLengthKm)
- [h3ExactEdgeLengthRads](./h3.md#h3ExactEdgeLengthRads)
- [h3NumHexagons](./h3.md#h3NumHexagons)
- [h3Line](./h3.md#h3Line)
- [h3Distance](./h3.md#h3Distance)
- [h3HexRing](./h3.md#h3HexRing)
- [h3GetUnidirectionalEdge](./h3.md#h3GetUnidirectionalEdge)
- [h3UnidirectionalEdgeIsValid](./h3.md#h3UnidirectionalEdgeIsValid)
- [h3GetOriginIndexFromUnidirectionalEdge](./h3.md#h3GetOriginIndexFromUnidirectionalEdge)
- [h3GetDestinationIndexFromUnidirectionalEdge](./h3.md#h3GetDestinationIndexFromUnidirectionalEdge)
- [h3GetIndexesFromUnidirectionalEdge](./h3.md#h3GetIndexesFromUnidirectionalEdge)
- [h3GetUnidirectionalEdgesFromHexagon](./h3.md#h3GetUnidirectionalEdgesFromHexagon)
- [h3GetUnidirectionalEdgeBoundary](./h3.md#h3GetUnidirectionalEdgeBoundary)
- [h3IsValid](./h3.md#h3isvalid)
- [h3GetResolution](./h3.md#h3getresolution)
- [h3EdgeAngle](./h3.md#h3edgeangle)
- [h3EdgeLengthM](./h3.md#h3edgelengthm)
- [h3EdgeLengthKm](./h3.md#h3edgelengthkm)
- [geoToH3](./h3.md#geotoh3)
- [h3ToGeo](./h3.md#h3togeo)
- [h3ToGeoBoundary](./h3.md#h3togeoboundary)
- [h3kRing](./h3.md#h3kring)
- [h3GetBaseCell](./h3.md#h3getbasecell)
- [h3HexAreaM2](./h3.md#h3hexaream2)
- [h3HexAreaKm2](./h3.md#h3hexareakm2)
- [h3IndexesAreNeighbors](./h3.md#h3indexesareneighbors)
- [h3ToChildren](./h3.md#h3tochildren)
- [h3ToParent](./h3.md#h3toparent)
- [h3ToString](./h3.md#h3tostring)
- [stringToH3](./h3.md#stringtoh3)
- [h3GetResolution](./h3.md#h3getresolution)
- [h3IsResClassIII](./h3.md#h3isresclassiii)
- [h3IsPentagon](./h3.md#h3ispentagon)
- [h3GetFaces](./h3.md#h3getfaces)
- [h3CellAreaM2](./h3.md#h3cellaream2)
- [h3CellAreaRads2](./h3.md#h3cellarearads2)
- [h3ToCenterChild](./h3.md#h3tocenterchild)
- [h3ExactEdgeLengthM](./h3.md#h3exactedgelengthm)
- [h3ExactEdgeLengthKm](./h3.md#h3exactedgelengthkm)
- [h3ExactEdgeLengthRads](./h3.md#h3exactedgelengthrads)
- [h3NumHexagons](./h3.md#h3numhexagons)
- [h3Line](./h3.md#h3line)
- [h3Distance](./h3.md#h3distance)
- [h3HexRing](./h3.md#h3hexring)
- [h3GetUnidirectionalEdge](./h3.md#h3getunidirectionaledge)
- [h3UnidirectionalEdgeIsValid](./h3.md#h3unidirectionaledgeisvalid)
- [h3GetOriginIndexFromUnidirectionalEdge](./h3.md#h3getoriginindexfromunidirectionaledge)
- [h3GetDestinationIndexFromUnidirectionalEdge](./h3.md#h3getdestinationindexfromunidirectionaledge)
- [h3GetIndexesFromUnidirectionalEdge](./h3.md#h3getindexesfromunidirectionaledge)
- [h3GetUnidirectionalEdgesFromHexagon](./h3.md#h3getunidirectionaledgesfromhexagon)
- [h3GetUnidirectionalEdgeBoundary](./h3.md#h3getunidirectionaledgeboundary)
## S2 Index Functions
- [geoToS2](./s2.md#geoToS2)
- [s2ToGeo](./s2.md#s2ToGeo)
- [s2GetNeighbors](./s2.md#s2GetNeighbors)
- [s2CellsIntersect](./s2.md#s2CellsIntersect)
- [s2CapContains](./s2.md#s2CapContains)
- [s2CapUnion](./s2.md#s2CapUnion)
- [s2RectAdd](./s2.md#s2RectAdd)
- [s2RectContains](./s2.md#s2RectContains)
- [s2RectUinion](./s2.md#s2RectUinion)
- [s2RectIntersection](./s2.md#s2RectIntersection)
- [geoToS2](./s2.md#geotos2)
- [s2ToGeo](./s2.md#s2togeo)
- [s2GetNeighbors](./s2.md#s2getneighbors)
- [s2CellsIntersect](./s2.md#s2cellsintersect)
- [s2CapContains](./s2.md#s2capcontains)
- [s2CapUnion](./s2.md#s2capunion)
- [s2RectAdd](./s2.md#s2rectadd)
- [s2RectContains](./s2.md#s2rectcontains)
- [s2RectUnion](./s2.md#s2rectunion)
- [s2RectIntersection](./s2.md#s2rectintersection)
[Original article](https://clickhouse.com/docs/en/sql-reference/functions/geo/) <!--hide-->

View File

@ -593,6 +593,27 @@ LIMIT 10
└────────────────┴─────────┘
```
## formatReadableDecimalSize(x)
Accepts the size (number of bytes). Returns a rounded size with a suffix (KB, MB, etc.) as a string.
Example:
``` sql
SELECT
arrayJoin([1, 1024, 1024*1024, 192851925]) AS filesize_bytes,
formatReadableDecimalSize(filesize_bytes) AS filesize
```
``` text
┌─filesize_bytes─┬─filesize───┐
│ 1 │ 1.00 B │
│ 1024 │ 1.02 KB │
│ 1048576 │ 1.05 MB │
│ 192851925 │ 192.85 MB │
└────────────────┴────────────┘
```
## formatReadableSize(x)
Accepts the size (number of bytes). Returns a rounded size with a suffix (KiB, MiB, etc.) as a string.
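For comparison with `formatReadableDecimalSize` above, a minimal example for `formatReadableSize` (an illustrative sketch mirroring the previous example) shows the binary, 1024-based suffixes:
```sql
SELECT
    arrayJoin([1, 1024, 1024*1024, 192851925]) AS filesize_bytes,
    formatReadableSize(filesize_bytes) AS filesize
```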

View File

@ -1088,7 +1088,8 @@ void Client::processConfig()
}
else
{
need_render_progress = config().getBool("progress", false);
std::string progress = config().getString("progress", "tty");
need_render_progress = (Poco::icompare(progress, "off") && Poco::icompare(progress, "no") && Poco::icompare(progress, "false") && Poco::icompare(progress, "0"));
echo_queries = config().getBool("echo", false);
ignore_error = config().getBool("ignore-error", false);

View File

@ -489,7 +489,8 @@ void LocalServer::processConfig()
}
else
{
need_render_progress = config().getBool("progress", false);
std::string progress = config().getString("progress", "tty");
need_render_progress = (Poco::icompare(progress, "off") && Poco::icompare(progress, "no") && Poco::icompare(progress, "false") && Poco::icompare(progress, "0"));
echo_queries = config().hasOption("echo") || config().hasOption("verbose");
ignore_error = config().getBool("ignore-error", false);
is_multiquery = true;

View File

@ -65,10 +65,12 @@
#include <Interpreters/ReplaceQueryParameterVisitor.h>
#include <Interpreters/ProfileEventsExt.h>
#include <IO/WriteBufferFromOStream.h>
#include <IO/WriteBufferFromFileDescriptor.h>
#include <IO/CompressionMethod.h>
#include <Client/InternalTextLogs.h>
#include <IO/ForkWriteBuffer.h>
#include <Parsers/Kusto/ParserKQLStatement.h>
#include <boost/algorithm/string/case_conv.hpp>
namespace fs = std::filesystem;
@ -103,6 +105,7 @@ namespace ErrorCodes
extern const int CANNOT_SET_SIGNAL_HANDLER;
extern const int UNRECOGNIZED_ARGUMENTS;
extern const int LOGICAL_ERROR;
extern const int CANNOT_OPEN_FILE;
}
}
@ -116,6 +119,25 @@ namespace ProfileEvents
namespace DB
{
std::istream& operator>> (std::istream & in, ProgressOption & progress)
{
std::string token;
in >> token;
boost::to_upper(token);
if (token == "OFF" || token == "FALSE" || token == "0" || token == "NO")
progress = ProgressOption::OFF;
else if (token == "TTY" || token == "ON" || token == "TRUE" || token == "1" || token == "YES")
progress = ProgressOption::TTY;
else if (token == "ERR")
progress = ProgressOption::ERR;
else
throw boost::program_options::validation_error(boost::program_options::validation_error::invalid_option_value);
return in;
}
static ClientInfo::QueryKind parseQueryKind(const String & query_kind)
{
if (query_kind == "initial_query")
@ -413,8 +435,8 @@ void ClientBase::onData(Block & block, ASTPtr parsed_query)
return;
/// If results are written INTO OUTFILE, we can avoid clearing progress to avoid flicker.
if (need_render_progress && (stdout_is_a_tty || is_interactive) && (!select_into_file || select_into_file_and_stdout))
progress_indication.clearProgressOutput();
if (need_render_progress && tty_buf && (!select_into_file || select_into_file_and_stdout))
progress_indication.clearProgressOutput(*tty_buf);
try
{
@ -431,11 +453,11 @@ void ClientBase::onData(Block & block, ASTPtr parsed_query)
output_format->flush();
/// Restore progress bar after data block.
if (need_render_progress && (stdout_is_a_tty || is_interactive))
if (need_render_progress && tty_buf)
{
if (select_into_file && !select_into_file_and_stdout)
std::cerr << "\r";
progress_indication.writeProgress();
progress_indication.writeProgress(*tty_buf);
}
}
@ -443,7 +465,8 @@ void ClientBase::onData(Block & block, ASTPtr parsed_query)
void ClientBase::onLogData(Block & block)
{
initLogsOutputStream();
progress_indication.clearProgressOutput();
if (need_render_progress && tty_buf)
progress_indication.clearProgressOutput(*tty_buf);
logs_out_stream->writeLogs(block);
logs_out_stream->flush();
}
@ -639,6 +662,58 @@ void ClientBase::initLogsOutputStream()
}
}
void ClientBase::initTtyBuffer(bool to_err)
{
if (!tty_buf)
{
static constexpr auto tty_file_name = "/dev/tty";
/// Output all progress bar commands to terminal at once to avoid flicker.
/// This size is usually greater than the window size.
static constexpr size_t buf_size = 1024;
if (!to_err)
{
std::error_code ec;
std::filesystem::file_status tty = std::filesystem::status(tty_file_name, ec);
if (!ec && exists(tty) && is_character_file(tty)
&& (tty.permissions() & std::filesystem::perms::others_write) != std::filesystem::perms::none)
{
try
{
tty_buf = std::make_unique<WriteBufferFromFile>(tty_file_name, buf_size);
/// It is possible that the terminal file has writeable permissions
/// but we cannot write anything there. Check it with invisible character.
tty_buf->write('\0');
tty_buf->next();
return;
}
catch (const Exception & e)
{
if (tty_buf)
tty_buf.reset();
if (e.code() != ErrorCodes::CANNOT_OPEN_FILE)
throw;
/// It is normal if file exists, indicated as writeable but still cannot be opened.
/// Fallback to other options.
}
}
}
if (stderr_is_a_tty)
{
tty_buf = std::make_unique<WriteBufferFromFileDescriptor>(STDERR_FILENO, buf_size);
}
else
need_render_progress = false;
}
}
void ClientBase::updateSuggest(const ASTPtr & ast)
{
std::vector<std::string> new_words;
@ -937,14 +1012,15 @@ void ClientBase::onProgress(const Progress & value)
if (output_format)
output_format->onProgress(value);
if (need_render_progress)
progress_indication.writeProgress();
if (need_render_progress && tty_buf)
progress_indication.writeProgress(*tty_buf);
}
void ClientBase::onEndOfStream()
{
progress_indication.clearProgressOutput();
if (need_render_progress && tty_buf)
progress_indication.clearProgressOutput(*tty_buf);
if (output_format)
output_format->finalize();
@ -952,10 +1028,7 @@ void ClientBase::onEndOfStream()
resetOutput();
if (is_interactive && !written_first_block)
{
progress_indication.clearProgressOutput();
std::cout << "Ok." << std::endl;
}
}
@ -998,15 +1071,16 @@ void ClientBase::onProfileEvents(Block & block)
}
progress_indication.updateThreadEventData(thread_times);
if (need_render_progress)
progress_indication.writeProgress();
if (need_render_progress && tty_buf)
progress_indication.writeProgress(*tty_buf);
if (profile_events.print)
{
if (profile_events.watch.elapsedMilliseconds() >= profile_events.delay_ms)
{
initLogsOutputStream();
progress_indication.clearProgressOutput();
if (need_render_progress && tty_buf)
progress_indication.clearProgressOutput(*tty_buf);
logs_out_stream->writeProfileEvents(block);
logs_out_stream->flush();
@ -1180,7 +1254,8 @@ void ClientBase::sendData(Block & sample, const ColumnsDescription & columns_des
progress_indication.updateProgress(Progress(file_progress));
/// Set callback to be called on file progress.
progress_indication.setFileProgressCallback(global_context, true);
if (tty_buf)
progress_indication.setFileProgressCallback(global_context, *tty_buf);
}
/// If data fetched from file (maybe compressed file)
@ -1432,12 +1507,12 @@ bool ClientBase::receiveEndOfQuery()
void ClientBase::cancelQuery()
{
connection->sendCancel();
if (need_render_progress && tty_buf)
progress_indication.clearProgressOutput(*tty_buf);
if (is_interactive)
{
progress_indication.clearProgressOutput();
std::cout << "Cancelling query." << std::endl;
}
cancelled = true;
}
@ -1557,7 +1632,8 @@ void ClientBase::processParsedSingleQuery(const String & full_query, const Strin
if (profile_events.last_block)
{
initLogsOutputStream();
progress_indication.clearProgressOutput();
if (need_render_progress && tty_buf)
progress_indication.clearProgressOutput(*tty_buf);
logs_out_stream->writeProfileEvents(profile_events.last_block);
logs_out_stream->flush();
@ -2248,7 +2324,7 @@ void ClientBase::init(int argc, char ** argv)
("stage", po::value<std::string>()->default_value("complete"), "Request query processing up to specified stage: complete,fetch_columns,with_mergeable_state,with_mergeable_state_after_aggregation,with_mergeable_state_after_aggregation_and_limit")
("query_kind", po::value<std::string>()->default_value("initial_query"), "One of initial_query/secondary_query/no_query")
("query_id", po::value<std::string>(), "query_id")
("progress", "print progress of queries execution")
("progress", po::value<ProgressOption>()->implicit_value(ProgressOption::TTY, "tty")->default_value(ProgressOption::TTY, "tty"), "Print progress of queries execution - to TTY (default): tty|on|1|true|yes; to STDERR: err; OFF: off|0|false|no")
("disable_suggestion,A", "Disable loading suggestion data. Note that suggestion data is loaded asynchronously through a second connection to ClickHouse server. Also it is reasonable to disable suggestion if you want to paste a query with TAB characters. Shorthand option -A is for those who get used to mysql client.")
("time,t", "print query execution time to stderr in non-interactive mode (for benchmarks)")
@ -2303,6 +2379,11 @@ void ClientBase::init(int argc, char ** argv)
parseAndCheckOptions(options_description, options, common_arguments);
po::notify(options);
if (options["progress"].as<ProgressOption>() == ProgressOption::OFF)
need_render_progress = false;
else
initTtyBuffer(options["progress"].as<ProgressOption>() == ProgressOption::ERR);
if (options.count("version") || options.count("V"))
{
showClientVersion();
@ -2353,7 +2434,20 @@ void ClientBase::init(int argc, char ** argv)
if (options.count("profile-events-delay-ms"))
config().setUInt64("profile-events-delay-ms", options["profile-events-delay-ms"].as<UInt64>());
if (options.count("progress"))
config().setBool("progress", true);
{
switch (options["progress"].as<ProgressOption>())
{
case OFF:
config().setString("progress", "off");
break;
case TTY:
config().setString("progress", "tty");
break;
case ERR:
config().setString("progress", "err");
break;
}
}
if (options.count("echo"))
config().setBool("echo", true);
if (options.count("disable_suggestion"))

View File

@ -15,6 +15,7 @@
#include <Storages/StorageFile.h>
#include <Storages/SelectQueryInfo.h>
namespace po = boost::program_options;
@ -35,9 +36,18 @@ enum MultiQueryProcessingStage
PARSING_FAILED,
};
enum ProgressOption
{
OFF,
TTY,
ERR,
};
std::istream& operator>> (std::istream & in, ProgressOption & progress);
void interruptSignalHandler(int signum);
class InternalTextLogs;
class WriteBufferFromFileDescriptor;
class ClientBase : public Poco::Util::Application, public IHints<2, ClientBase>
{
@ -143,6 +153,7 @@ private:
void initOutputFormat(const Block & block, ASTPtr parsed_query);
void initLogsOutputStream();
void initTtyBuffer(bool to_err = false);
String prompt() const;
@ -218,6 +229,10 @@ protected:
String server_logs_file;
std::unique_ptr<InternalTextLogs> logs_out_stream;
/// /dev/tty if accessible or std::cerr - for progress bar.
/// We prefer to output progress bar directly to tty to allow user to redirect stdout and stderr and still get the progress indication.
std::unique_ptr<WriteBufferFromFileDescriptor> tty_buf;
String home_path;
String history_file; /// Path to a file containing command history.

View File

@ -2,6 +2,7 @@
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <filesystem>
#include <cmath>
#include <IO/WriteBufferFromFileDescriptor.h>
#include <base/types.h>
@ -11,6 +12,9 @@
#include "IO/WriteBufferFromString.h"
#include <Databases/DatabaseMemory.h>
/// http://en.wikipedia.org/wiki/ANSI_escape_code
#define CLEAR_TO_END_OF_LINE "\033[K"
namespace
{
@ -44,15 +48,6 @@ bool ProgressIndication::updateProgress(const Progress & value)
return progress.incrementPiecewiseAtomically(value);
}
void ProgressIndication::clearProgressOutput()
{
if (written_progress_chars)
{
written_progress_chars = 0;
std::cerr << "\r" CLEAR_TO_END_OF_LINE;
}
}
void ProgressIndication::resetProgress()
{
watch.restart();
@ -67,15 +62,12 @@ void ProgressIndication::resetProgress()
}
}
void ProgressIndication::setFileProgressCallback(ContextMutablePtr context, bool write_progress_on_update_)
void ProgressIndication::setFileProgressCallback(ContextMutablePtr context, WriteBufferFromFileDescriptor & message)
{
write_progress_on_update = write_progress_on_update_;
context->setFileProgressCallback([&](const FileProgress & file_progress)
{
progress.incrementPiecewiseAtomically(Progress(file_progress));
if (write_progress_on_update)
writeProgress();
writeProgress(message);
});
}
@ -142,13 +134,10 @@ void ProgressIndication::writeFinalProgress()
std::cout << ". ";
}
void ProgressIndication::writeProgress()
void ProgressIndication::writeProgress(WriteBufferFromFileDescriptor & message)
{
std::lock_guard lock(progress_mutex);
/// Output all progress bar commands to stderr at once to avoid flicker.
WriteBufferFromFileDescriptor message(STDERR_FILENO, 1024);
static size_t increment = 0;
static const char * indicators[8] = {
"\033[1;30m→\033[0m",
@ -307,4 +296,14 @@ void ProgressIndication::writeProgress()
message.next();
}
void ProgressIndication::clearProgressOutput(WriteBufferFromFileDescriptor & message)
{
if (written_progress_chars)
{
written_progress_chars = 0;
message << "\r" CLEAR_TO_END_OF_LINE;
message.next();
}
}
}

View File

@ -9,12 +9,12 @@
#include <Common/Stopwatch.h>
#include <Common/EventRateMeter.h>
/// http://en.wikipedia.org/wiki/ANSI_escape_code
#define CLEAR_TO_END_OF_LINE "\033[K"
namespace DB
{
class WriteBufferFromFileDescriptor;
struct ThreadEventData
{
UInt64 time() const noexcept { return user_ms + system_ms; }
@ -30,14 +30,13 @@ using HostToThreadTimesMap = std::unordered_map<String, ThreadIdToTimeMap>;
class ProgressIndication
{
public:
/// Write progress to stderr.
void writeProgress();
/// Write progress bar.
void writeProgress(WriteBufferFromFileDescriptor & message);
void clearProgressOutput(WriteBufferFromFileDescriptor & message);
/// Write summary.
void writeFinalProgress();
/// Clear stderr output.
void clearProgressOutput();
/// Reset progress values.
void resetProgress();
@ -52,7 +51,7 @@ public:
/// In some cases there is a need to update progress value, when there is no access to progress_indication object.
/// In this case it is added via context.
/// `write_progress_on_update` is needed to write progress for loading files data via pipe in non-interactive mode.
void setFileProgressCallback(ContextMutablePtr context, bool write_progress_on_update = false);
void setFileProgressCallback(ContextMutablePtr context, WriteBufferFromFileDescriptor & message);
/// How much seconds passed since query execution start.
double elapsedSeconds() const { return getElapsedNanoseconds() / 1e9; }

View File

@ -3,10 +3,10 @@
#include <cstring>
#include <iostream>
#include <Core/Defines.h>
#include <Common/Stopwatch.h>
#include <Common/TargetSpecific.h>
#include <base/types.h>
#include <base/unaligned.h>
#include <Common/Stopwatch.h>
#include <Common/TargetSpecific.h>
#ifdef __SSE2__
#include <emmintrin.h>
@ -599,6 +599,9 @@ bool NO_INLINE decompressImpl(const char * const source, char * const dest, size
copy_end = op + length;
if (unlikely(copy_end > output_end))
return false;
/** Here we can write up to copy_amount - 1 - 4 * 2 bytes after buffer.
* The worst case when offset = 1 and length = 4
*/

View File

@ -1,8 +1,5 @@
#include <Compression/CompressionFactory.h>
#include <Common/PODArray.h>
#include <Common/Stopwatch.h>
#include <base/types.h>
#include <DataTypes/DataTypesNumber.h>
#include <DataTypes/IDataType.h>
#include <IO/ReadBufferFromMemory.h>
@ -10,6 +7,12 @@
#include <Parsers/ExpressionElementParsers.h>
#include <Parsers/IParser.h>
#include <Parsers/TokenIterator.h>
#include <base/types.h>
#include <Common/PODArray.h>
#include <Common/Stopwatch.h>
#include <Compression/LZ4_decompress_faster.h>
#include <IO/BufferWithOwnMemory.h>
#include <random>
#include <bitset>
@ -1319,4 +1322,34 @@ INSTANTIATE_TEST_SUITE_P(Gorilla,
// ),
//);
TEST(LZ4Test, DecompressMalformedInput)
{
/// This malformed input was initially found by lz4_decompress_fuzzer and causes failure under UBSAN.
constexpr unsigned char data[]
= {0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x07, 0x00,
0x00, 0x20, 0x00, 0x00, 0x66, 0x66, 0x66, 0x66, 0xff, 0xff, 0xff, 0x17, 0xff, 0xff, 0x0f, 0x00, 0x00, 0xff, 0xff, 0xff, 0xff,
0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
0xfe, 0x1f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
const char * const source = reinterpret_cast<const char * const>(data);
const uint32_t source_size = std::size(data);
constexpr uint32_t uncompressed_size = 80;
DB::Memory<> memory;
memory.resize(ICompressionCodec::getHeaderSize() + uncompressed_size + LZ4::ADDITIONAL_BYTES_AT_END_OF_BUFFER);
unalignedStoreLE<uint8_t>(memory.data(), static_cast<uint8_t>(CompressionMethodByte::LZ4));
unalignedStoreLE<uint32_t>(&memory[1], source_size);
unalignedStoreLE<uint32_t>(&memory[5], uncompressed_size);
auto codec = CompressionCodecFactory::instance().get("LZ4", {});
ASSERT_THROW(codec->decompress(source, source_size, memory.data()), Exception);
}
}

View File

@ -36,7 +36,7 @@ void CoordinationSettings::loadFromConfig(const String & config_elem, const Poco
}
const String KeeperConfigurationAndSettings::DEFAULT_FOUR_LETTER_WORD_CMD = "conf,cons,crst,envi,ruok,srst,srvr,stat,wchs,dirs,mntr,isro,rcvr,apiv";
const String KeeperConfigurationAndSettings::DEFAULT_FOUR_LETTER_WORD_CMD = "conf,cons,crst,envi,ruok,srst,srvr,stat,wchs,dirs,mntr,isro,rcvr,apiv,csnp,lgif";
KeeperConfigurationAndSettings::KeeperConfigurationAndSettings()
: server_id(NOT_EXIST)

View File

@ -136,6 +136,12 @@ void FourLetterCommandFactory::registerCommands(KeeperDispatcher & keeper_dispat
FourLetterCommandPtr api_version_command = std::make_shared<ApiVersionCommand>(keeper_dispatcher);
factory.registerCommand(api_version_command);
FourLetterCommandPtr create_snapshot_command = std::make_shared<CreateSnapshotCommand>(keeper_dispatcher);
factory.registerCommand(create_snapshot_command);
FourLetterCommandPtr log_info_command = std::make_shared<LogInfoCommand>(keeper_dispatcher);
factory.registerCommand(log_info_command);
factory.initializeAllowList(keeper_dispatcher);
factory.setInitialize(true);
}
@ -472,4 +478,33 @@ String ApiVersionCommand::run()
return toString(static_cast<uint8_t>(Coordination::current_keeper_api_version));
}
String CreateSnapshotCommand::run()
{
auto log_index = keeper_dispatcher.createSnapshot();
return log_index > 0 ? std::to_string(log_index) : "Failed to schedule snapshot creation task.";
}
String LogInfoCommand::run()
{
KeeperLogInfo log_info = keeper_dispatcher.getKeeperLogInfo();
StringBuffer ret;
auto append = [&ret] (String key, uint64_t value) -> void
{
writeText(key, ret);
writeText('\t', ret);
writeText(std::to_string(value), ret);
writeText('\n', ret);
};
append("first_log_idx", log_info.first_log_idx);
append("first_log_term", log_info.first_log_term);
append("last_log_idx", log_info.last_log_idx);
append("last_log_term", log_info.last_log_term);
append("last_committed_log_idx", log_info.last_committed_log_idx);
append("leader_committed_log_idx", log_info.leader_committed_log_idx);
append("target_committed_log_idx", log_info.target_committed_log_idx);
append("last_snapshot_idx", log_info.last_snapshot_idx);
return ret.str();
}
}

View File

@ -17,6 +17,7 @@ using FourLetterCommandPtr = std::shared_ptr<DB::IFourLetterCommand>;
/// Just like zookeeper Four Letter Words commands, CH Keeper responds to a small set of commands.
/// Each command is composed of four letters; these commands are useful to monitor and diagnose system problems.
/// The feature is based on ZooKeeper 3.5.9, details are in https://zookeeper.apache.org/doc/r3.5.9/zookeeperAdmin.html#sc_zkCommands.
/// We also add some additional commands such as csnp, lgif etc.
struct IFourLetterCommand
{
public:
@ -327,4 +328,40 @@ struct ApiVersionCommand : public IFourLetterCommand
String run() override;
~ApiVersionCommand() override = default;
};
/// Create snapshot manually
struct CreateSnapshotCommand : public IFourLetterCommand
{
explicit CreateSnapshotCommand(KeeperDispatcher & keeper_dispatcher_)
: IFourLetterCommand(keeper_dispatcher_)
{
}
String name() override { return "csnp"; }
String run() override;
~CreateSnapshotCommand() override = default;
};
/** Raft log information:
* first_log_idx 1
* first_log_term 1
* last_log_idx 101
* last_log_term 1
* last_committed_log_idx 100
* leader_committed_log_idx 101
* target_committed_log_idx 101
* last_snapshot_idx 50
*/
struct LogInfoCommand : public IFourLetterCommand
{
explicit LogInfoCommand(KeeperDispatcher & keeper_dispatcher_)
: IFourLetterCommand(keeper_dispatcher_)
{
}
String name() override { return "lgif"; }
String run() override;
~LogInfoCommand() override = default;
};
}

View File

@ -47,4 +47,32 @@ struct Keeper4LWInfo
}
};
/// Keeper log information for 4lw commands
struct KeeperLogInfo
{
/// My first log index in log store.
uint64_t first_log_idx;
/// My first log term.
uint64_t first_log_term;
/// My last log index in log store.
uint64_t last_log_idx;
/// My last log term.
uint64_t last_log_term;
/// My last committed log index in state machine.
uint64_t last_committed_log_idx;
/// Leader's committed log index from my perspective.
uint64_t leader_committed_log_idx;
/// Target log index should be committed to.
uint64_t target_committed_log_idx;
/// The largest committed log index in last snapshot.
uint64_t last_snapshot_idx;
};
}

View File

@ -203,6 +203,18 @@ public:
{
keeper_stats.reset();
}
/// Create snapshot manually, return the last committed log index in the snapshot
uint64_t createSnapshot()
{
return server->createSnapshot();
}
/// Get Raft information
KeeperLogInfo getKeeperLogInfo()
{
return server->getKeeperLogInfo();
}
};
}

View File

@ -907,4 +907,29 @@ Keeper4LWInfo KeeperServer::getPartiallyFilled4LWInfo() const
return result;
}
uint64_t KeeperServer::createSnapshot()
{
uint64_t log_idx = raft_instance->create_snapshot();
if (log_idx != 0)
LOG_INFO(log, "Snapshot creation scheduled with last committed log index {}.", log_idx);
else
LOG_WARNING(log, "Failed to schedule snapshot creation task.");
return log_idx;
}
KeeperLogInfo KeeperServer::getKeeperLogInfo()
{
KeeperLogInfo log_info;
auto log_store = state_manager->load_log_store();
log_info.first_log_idx = log_store->start_index();
log_info.first_log_term = log_store->term_at(log_info.first_log_idx);
log_info.last_log_idx = raft_instance->get_last_log_idx();
log_info.last_log_term = raft_instance->get_last_log_term();
log_info.last_committed_log_idx = raft_instance->get_committed_log_idx();
log_info.leader_committed_log_idx = raft_instance->get_leader_committed_log_idx();
log_info.target_committed_log_idx = raft_instance->get_target_committed_log_idx();
log_info.last_snapshot_idx = raft_instance->get_last_snapshot_idx();
return log_info;
}
}

View File

@ -131,6 +131,10 @@ public:
/// Wait configuration update for action. Used by followers.
/// Return true if update was successfully received.
bool waitConfigurationUpdate(const ConfigUpdateAction & task);
uint64_t createSnapshot();
KeeperLogInfo getKeeperLogInfo();
};
}

View File

@ -377,6 +377,9 @@ void KeeperStorage::UncommittedState::commit(int64_t commit_zxid)
{
assert(deltas.empty() || deltas.front().zxid >= commit_zxid);
// collect nodes that have no further modification in the current transaction
std::unordered_set<std::string> modified_nodes;
while (!deltas.empty() && deltas.front().zxid == commit_zxid)
{
if (std::holds_alternative<SubDeltaEnd>(deltas.front().operation))
@ -393,7 +396,17 @@ void KeeperStorage::UncommittedState::commit(int64_t commit_zxid)
assert(path_deltas.front() == &front_delta);
path_deltas.pop_front();
if (path_deltas.empty())
{
deltas_for_path.erase(front_delta.path);
// no more deltas for path -> no modification
modified_nodes.insert(std::move(front_delta.path));
}
else if (path_deltas.front()->zxid > commit_zxid)
{
// next delta has a zxid from a different transaction -> no modification in this transaction
modified_nodes.insert(std::move(front_delta.path));
}
}
else if (auto * add_auth = std::get_if<AddAuthDelta>(&front_delta.operation))
{
@ -409,9 +422,12 @@ void KeeperStorage::UncommittedState::commit(int64_t commit_zxid)
}
// delete all cached nodes that were not modified after the commit_zxid
// the commit can end on SubDeltaEnd so we don't want to clear cached nodes too soon
if (deltas.empty() || deltas.front().zxid > commit_zxid)
std::erase_if(nodes, [commit_zxid](const auto & node) { return node.second.zxid == commit_zxid; });
// we only need to check the nodes that were modified in this transaction
for (const auto & node : modified_nodes)
{
if (nodes[node].zxid == commit_zxid)
nodes.erase(node);
}
}
void KeeperStorage::UncommittedState::rollback(int64_t rollback_zxid)

View File

@ -84,11 +84,12 @@ void SerializationString::deserializeBinary(IColumn & column, ReadBuffer & istr)
void SerializationString::serializeBinaryBulk(const IColumn & column, WriteBuffer & ostr, size_t offset, size_t limit) const
{
const ColumnString & column_string = typeid_cast<const ColumnString &>(column);
const auto & full_column = column.convertToFullColumnIfLowCardinality();
const ColumnString & column_string = typeid_cast<const ColumnString &>(*full_column);
const ColumnString::Chars & data = column_string.getChars();
const ColumnString::Offsets & offsets = column_string.getOffsets();
size_t size = column.size();
size_t size = column_string.size();
if (!size)
return;

View File

@ -443,6 +443,11 @@ ASTPtr DatabasePostgreSQL::getColumnDeclaration(const DataTypePtr & data_type) c
if (which.isArray())
return makeASTFunction("Array", getColumnDeclaration(typeid_cast<const DataTypeArray *>(data_type.get())->getNestedType()));
if (which.isDateTime64())
{
return makeASTFunction("DateTime64", std::make_shared<ASTLiteral>(static_cast<UInt32>(6)));
}
return std::make_shared<ASTIdentifier>(data_type->getName());
}

View File

@ -0,0 +1,35 @@
#include <Functions/FunctionFactory.h>
#include <Functions/formatReadable.h>
namespace DB
{
namespace
{
struct Impl
{
static constexpr auto name = "formatReadableDecimalSize";
static void format(double value, DB::WriteBuffer & out)
{
formatReadableSizeWithDecimalSuffix(value, out);
}
};
}
REGISTER_FUNCTION(FormatReadableDecimalSize)
{
factory.registerFunction<FunctionFormatReadable<Impl>>(
{
R"(
Accepts the size (number of bytes). Returns a rounded size with a suffix (KB, MB, etc.) as a string.
)",
Documentation::Examples{
{"formatReadableDecimalSize", "SELECT formatReadableDecimalSize(1000)"}},
Documentation::Categories{"OtherFunctions"}
},
FunctionFactory::CaseSensitive);
}
}

View File

@ -2637,7 +2637,7 @@ void NO_INLINE Aggregator::mergeBucketImpl(
ManyAggregatedDataVariants Aggregator::prepareVariantsToMerge(ManyAggregatedDataVariants & data_variants) const
{
if (data_variants.empty())
throw Exception("Empty data passed to Aggregator::mergeAndConvertToBlocks.", ErrorCodes::EMPTY_DATA_PASSED);
throw Exception("Empty data passed to Aggregator::prepareVariantsToMerge.", ErrorCodes::EMPTY_DATA_PASSED);
LOG_TRACE(log, "Merging aggregated data");

View File

@ -546,10 +546,13 @@ std::vector<TableNeededColumns> normalizeColumnNamesExtractNeeded(
{
auto alias = aliases.find(ident->name())->second;
auto alias_ident = alias->clone();
alias_ident->as<ASTIdentifier>()->restoreTable();
bool alias_equals_column_name = alias_ident->getColumnNameWithoutAlias() == ident->getColumnNameWithoutAlias();
if (!alias_equals_column_name)
throw Exception("Alias clashes with qualified column '" + ident->name() + "'", ErrorCodes::AMBIGUOUS_COLUMN_NAME);
if (auto * alias_ident_typed = alias_ident->as<ASTIdentifier>())
{
alias_ident_typed->restoreTable();
bool alias_equals_column_name = alias_ident->getColumnNameWithoutAlias() == ident->getColumnNameWithoutAlias();
if (!alias_equals_column_name)
throw Exception("Alias clashes with qualified column '" + ident->name() + "'", ErrorCodes::AMBIGUOUS_COLUMN_NAME);
}
}
String short_name = ident->shortName();
String original_long_name;

View File

@ -1197,6 +1197,9 @@ public:
if (!mergeElement())
return false;
if (elements.size() != 2)
return false;
elements = {makeASTFunction("CAST", elements[0], elements[1])};
finished = true;
return true;
@ -1406,7 +1409,7 @@ public:
protected:
bool getResultImpl(ASTPtr & node) override
{
if (state == 2)
if (state == 2 && elements.size() == 2)
std::swap(elements[1], elements[0]);
node = makeASTFunction("position", std::move(elements));

View File

@ -214,6 +214,14 @@ SelectPartsDecision MergeTreeDataMergerMutator::selectPartsToMerge(
/// Previous part only in boundaries of partition frame
const MergeTreeData::DataPartPtr * prev_part = nullptr;
/// collect min_age for each partition while iterating parts
struct PartitionInfo
{
time_t min_age{std::numeric_limits<time_t>::max()};
};
std::unordered_map<std::string, PartitionInfo> partitions_info;
size_t parts_selected_precondition = 0;
for (const MergeTreeData::DataPartPtr & part : data_parts)
{
@ -277,6 +285,9 @@ SelectPartsDecision MergeTreeDataMergerMutator::selectPartsToMerge(
part_info.compression_codec_desc = part->default_codec->getFullCodecDesc();
part_info.shall_participate_in_merges = has_volumes_with_disabled_merges ? part->shallParticipateInMerges(storage_policy) : true;
auto & partition_info = partitions_info[partition_id];
partition_info.min_age = std::min(partition_info.min_age, part_info.age);
++parts_selected_precondition;
parts_ranges.back().emplace_back(part_info);
@ -333,7 +344,8 @@ SelectPartsDecision MergeTreeDataMergerMutator::selectPartsToMerge(
SimpleMergeSelector::Settings merge_settings;
/// Override value from table settings
merge_settings.max_parts_to_merge_at_once = data_settings->max_parts_to_merge_at_once;
merge_settings.min_age_to_force_merge = data_settings->min_age_to_force_merge_seconds;
if (!data_settings->min_age_to_force_merge_on_partition_only)
merge_settings.min_age_to_force_merge = data_settings->min_age_to_force_merge_seconds;
if (aggressive)
merge_settings.base = 1;
@ -347,6 +359,20 @@ SelectPartsDecision MergeTreeDataMergerMutator::selectPartsToMerge(
if (parts_to_merge.empty())
{
if (data_settings->min_age_to_force_merge_on_partition_only && data_settings->min_age_to_force_merge_seconds)
{
auto best_partition_it = std::max_element(
partitions_info.begin(),
partitions_info.end(),
[](const auto & e1, const auto & e2) { return e1.second.min_age < e2.second.min_age; });
assert(best_partition_it != partitions_info.end());
if (static_cast<size_t>(best_partition_it->second.min_age) >= data_settings->min_age_to_force_merge_seconds)
return selectAllPartsToMergeWithinPartition(
future_part, can_merge_callback, best_partition_it->first, true, metadata_snapshot, txn, out_disable_reason);
}
if (out_disable_reason)
*out_disable_reason = "There is no need to merge parts according to merge selector algorithm";
return SelectPartsDecision::CANNOT_SELECT;

View File

@ -63,6 +63,7 @@ struct Settings;
M(UInt64, merge_tree_clear_old_parts_interval_seconds, 1, "The period of executing the clear old parts operation in background.", 0) \
M(UInt64, merge_tree_clear_old_broken_detached_parts_ttl_timeout_seconds, 1ULL * 3600 * 24 * 30, "Remove old broken detached parts in the background if they remained untouched for a period of time specified by this setting.", 0) \
M(UInt64, min_age_to_force_merge_seconds, 0, "If all parts in a certain range are older than this value, the range will always be eligible for merging. Set to 0 to disable.", 0) \
M(Bool, min_age_to_force_merge_on_partition_only, false, "Whether min_age_to_force_merge_seconds should be applied only on the entire partition and not on a subset.", false) \
M(UInt64, merge_tree_enable_clear_old_broken_detached, false, "Enable clearing old broken detached parts operation in background.", 0) \
M(Bool, remove_rolled_back_parts_immediately, 1, "Setting for an incomplete experimental feature.", 0) \
\

View File

@ -87,7 +87,7 @@ class PRInfo:
self.body = ""
self.diff_urls = []
self.release_pr = 0
ref = github_event.get("ref", "refs/head/master")
ref = github_event.get("ref", "refs/heads/master")
if ref and ref.startswith("refs/heads/"):
ref = ref[11:]

View File

@ -447,6 +447,7 @@
"FORMAT"
"formatDateTime"
"formatReadableQuantity"
"formatReadableDecimalSize"
"formatReadableSize"
"formatReadableTimeDelta"
"formatRow"

View File

@ -399,6 +399,7 @@
"demangle"
"toNullable"
"concat"
"formatReadableDecimalSize"
"formatReadableSize"
"shardCount"
"fromModifiedJulianDayOrNull"

View File

@ -596,3 +596,48 @@ def test_cmd_wchp(started_cluster):
assert "/test_4lw_normal_node_1" in list_data
finally:
destroy_zk_client(zk)
def test_cmd_csnp(started_cluster):
zk = None
try:
wait_nodes()
zk = get_fake_zk(node1.name, timeout=30.0)
data = keeper_utils.send_4lw_cmd(cluster, node1, cmd="csnp")
try:
int(data)
assert True
except ValueError:
assert False
finally:
destroy_zk_client(zk)
def test_cmd_lgif(started_cluster):
zk = None
try:
wait_nodes()
clear_znodes()
zk = get_fake_zk(node1.name, timeout=30.0)
do_some_action(zk, create_cnt=100)
data = keeper_utils.send_4lw_cmd(cluster, node1, cmd="lgif")
print(data)
reader = csv.reader(data.split("\n"), delimiter="\t")
result = {}
for row in reader:
if len(row) != 0:
result[row[0]] = row[1]
assert int(result["first_log_idx"]) == 1
assert int(result["first_log_term"]) == 1
assert int(result["last_log_idx"]) >= 1
assert int(result["last_log_term"]) == 1
assert int(result["last_committed_log_idx"]) >= 1
assert int(result["leader_committed_log_idx"]) >= 1
assert int(result["target_committed_log_idx"]) >= 1
assert int(result["last_snapshot_idx"]) >= 1
finally:
destroy_zk_client(zk)

View File

@ -1,8 +0,0 @@
<clickhouse>
<zookeeper>
<node index="1">
<host>zoo1</host>
<port>2181</port>
</node>
</zookeeper>
</clickhouse>

View File

@ -1,88 +0,0 @@
import pytest
import time
from helpers.client import QueryRuntimeException
from helpers.cluster import ClickHouseCluster
from helpers.test_tools import TSV
cluster = ClickHouseCluster(__file__)
node = cluster.add_instance(
"node",
main_configs=["configs/zookeeper_config.xml"],
with_zookeeper=True,
)
@pytest.fixture(scope="module")
def start_cluster():
try:
cluster.start()
yield cluster
finally:
cluster.shutdown()
def get_part_number(table_name):
return TSV(
node.query(
f"SELECT count(*) FROM system.parts where table='{table_name}' and active=1"
)
)
def check_expected_part_number(seconds, table_name, expected):
ok = False
for i in range(int(seconds) * 2):
result = get_part_number(table_name)
if result == expected:
ok = True
break
else:
time.sleep(1)
assert ok
def test_without_force_merge_old_parts(start_cluster):
node.query(
"CREATE TABLE test_without_merge (i Int64) ENGINE = MergeTree ORDER BY i;"
)
node.query("INSERT INTO test_without_merge SELECT 1")
node.query("INSERT INTO test_without_merge SELECT 2")
node.query("INSERT INTO test_without_merge SELECT 3")
expected = TSV("""3\n""")
# verify that the parts don't get merged
for i in range(10):
if get_part_number("test_without_merge") != expected:
assert False
time.sleep(1)
node.query("DROP TABLE test_without_merge;")
def test_force_merge_old_parts(start_cluster):
node.query(
"CREATE TABLE test_with_merge (i Int64) ENGINE = MergeTree ORDER BY i SETTINGS min_age_to_force_merge_seconds=5;"
)
node.query("INSERT INTO test_with_merge SELECT 1")
node.query("INSERT INTO test_with_merge SELECT 2")
node.query("INSERT INTO test_with_merge SELECT 3")
expected = TSV("""1\n""")
check_expected_part_number(10, "test_with_merge", expected)
node.query("DROP TABLE test_with_merge;")
def test_force_merge_old_parts_replicated_merge_tree(start_cluster):
node.query(
"CREATE TABLE test_replicated (i Int64) ENGINE = ReplicatedMergeTree('/clickhouse/testing/test', 'node') ORDER BY i SETTINGS min_age_to_force_merge_seconds=5;"
)
node.query("INSERT INTO test_replicated SELECT 1")
node.query("INSERT INTO test_replicated SELECT 2")
node.query("INSERT INTO test_replicated SELECT 3")
expected = TSV("""1\n""")
check_expected_part_number(10, "test_replicated", expected)
node.query("DROP TABLE test_replicated;")

View File

@ -693,6 +693,19 @@ def test_auto_close_connection(started_cluster):
assert count == 2

def test_datetime(started_cluster):
cursor = started_cluster.postgres_conn.cursor()
cursor.execute("drop table if exists test")
cursor.execute("create table test (u timestamp)")
node1.query("drop database if exists pg")
node1.query("create database pg engine = PostgreSQL(postgres1)")
assert "DateTime64(6)" in node1.query("show create table pg.test")
node1.query("detach table pg.test")
node1.query("attach table pg.test")
assert "DateTime64(6)" in node1.query("show create table pg.test")

if __name__ == "__main__":
cluster.start()
input("Cluster created, press any key to destroy...")

View File

@ -0,0 +1,70 @@
1.00 B 1.00 B 1.00 B
2.72 B 2.00 B 2.00 B
7.39 B 7.00 B 7.00 B
20.09 B 20.00 B 20.00 B
54.60 B 54.00 B 54.00 B
148.41 B 148.00 B 148.00 B
403.43 B 403.00 B 403.00 B
1.10 KB 1.10 KB 1.10 KB
2.98 KB 2.98 KB 2.98 KB
8.10 KB 8.10 KB 8.10 KB
22.03 KB 22.03 KB 22.03 KB
59.87 KB 59.87 KB 59.87 KB
162.75 KB 162.75 KB 162.75 KB
442.41 KB 442.41 KB 442.41 KB
1.20 MB 1.20 MB 1.20 MB
3.27 MB 3.27 MB 3.27 MB
8.89 MB 8.89 MB 8.89 MB
24.15 MB 24.15 MB 24.15 MB
65.66 MB 65.66 MB 65.66 MB
178.48 MB 178.48 MB 178.48 MB
485.17 MB 485.17 MB 485.17 MB
1.32 GB 1.32 GB 1.32 GB
3.58 GB 3.58 GB 2.15 GB
9.74 GB 9.74 GB 2.15 GB
26.49 GB 26.49 GB 2.15 GB
72.00 GB 72.00 GB 2.15 GB
195.73 GB 195.73 GB 2.15 GB
532.05 GB 532.05 GB 2.15 GB
1.45 TB 1.45 TB 2.15 GB
3.93 TB 3.93 TB 2.15 GB
10.69 TB 10.69 TB 2.15 GB
29.05 TB 29.05 TB 2.15 GB
78.96 TB 78.96 TB 2.15 GB
214.64 TB 214.64 TB 2.15 GB
583.46 TB 583.46 TB 2.15 GB
1.59 PB 1.59 PB 2.15 GB
4.31 PB 4.31 PB 2.15 GB
11.72 PB 11.72 PB 2.15 GB
31.86 PB 31.86 PB 2.15 GB
86.59 PB 86.59 PB 2.15 GB
235.39 PB 235.39 PB 2.15 GB
639.84 PB 639.84 PB 2.15 GB
1.74 EB 1.74 EB 2.15 GB
4.73 EB 4.73 EB 2.15 GB
12.85 EB 12.85 EB 2.15 GB
34.93 EB 18.45 EB 2.15 GB
94.96 EB 18.45 EB 2.15 GB
258.13 EB 18.45 EB 2.15 GB
701.67 EB 18.45 EB 2.15 GB
1.91 ZB 18.45 EB 2.15 GB
5.18 ZB 18.45 EB 2.15 GB
14.09 ZB 18.45 EB 2.15 GB
38.31 ZB 18.45 EB 2.15 GB
104.14 ZB 18.45 EB 2.15 GB
283.08 ZB 18.45 EB 2.15 GB
769.48 ZB 18.45 EB 2.15 GB
2.09 YB 18.45 EB 2.15 GB
5.69 YB 18.45 EB 2.15 GB
15.46 YB 18.45 EB 2.15 GB
42.01 YB 18.45 EB 2.15 GB
114.20 YB 18.45 EB 2.15 GB
310.43 YB 18.45 EB 2.15 GB
843.84 YB 18.45 EB 2.15 GB
2293.78 YB 18.45 EB 2.15 GB
6235.15 YB 18.45 EB 2.15 GB
16948.89 YB 18.45 EB 2.15 GB
46071.87 YB 18.45 EB 2.15 GB
125236.32 YB 18.45 EB 2.15 GB
340427.60 YB 18.45 EB 2.15 GB
925378.17 YB 18.45 EB 2.15 GB

View File

@ -0,0 +1,4 @@
WITH round(exp(number), 6) AS x, x > 0xFFFFFFFFFFFFFFFF ? 0xFFFFFFFFFFFFFFFF : toUInt64(x) AS y, x > 0x7FFFFFFF ? 0x7FFFFFFF : toInt32(x) AS z
SELECT formatReadableDecimalSize(x), formatReadableDecimalSize(y), formatReadableDecimalSize(z)
FROM system.numbers
LIMIT 70;
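
The reference output above is consistent with plain base-1000 formatting, with the second and third columns saturating at the UInt64 and Int32 maxima because of the casts that produce `y` and `z`. A minimal Python sketch of that base-1000 rendering, as an illustration of the expected values only and not the server-side implementation; the function name and unit list are made up for the example:

```python
import math


def format_readable_decimal_size(value: float) -> str:
    # Base-1000 ("decimal") units; values past yottabytes stay expressed in YB.
    units = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
    for unit in units:
        if value < 1000 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1000.0


print(format_readable_decimal_size(math.e ** 7))  # 1.10 KB
print(format_readable_decimal_size(2 ** 64 - 1))  # 18.45 EB (UInt64 saturation above)
print(format_readable_decimal_size(2 ** 31 - 1))  # 2.15 GB (Int32 saturation above)
```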

View File

@ -1,3 +1,8 @@
SELECT (if(a.test == 'a', b.test, c.test)) as `a.test` FROM
(SELECT 1 AS id, 'a' AS test) a
LEFT JOIN (SELECT 1 AS id, 'b' AS test) b ON b.id = a.id
LEFT JOIN (SELECT 1 AS id, 'c' AS test) c ON c.id = a.id;
SELECT COLUMNS('test') FROM
(SELECT 1 AS id, 'a' AS test) a
LEFT JOIN (SELECT 1 AS id, 'b' AS test) b ON b.id = a.id

View File

@ -0,0 +1,32 @@
#!/usr/bin/expect -f
# Tags: long
# This is a regression test for concurrent access in ProgressIndication,
# so it is important to read enough rows here (10e6).
#
# Initially this was 100e6, but under the thread fuzzer 10 minutes may not always be enough;
# CI should still catch possible issues even with fewer rows.
set basedir [file dirname $argv0]
set basename [file tail $argv0]
exp_internal -f $env(CLICKHOUSE_TMP)/$basename.debuglog 0
log_user 0
set timeout 60
match_max 100000
set stty_init "rows 25 cols 120"
expect_after {
eof { exp_continue }
timeout { exit 1 }
}
spawn bash
send "source $basedir/../shell_config.sh\r"
send "yes | head -n10000000 | \$CLICKHOUSE_CLIENT --query \"insert into function null('foo String') format TSV\" >/dev/null\r"
expect "Progress: "
send "\3"
send "exit\r"
expect eof

View File

@ -1,2 +0,0 @@
0
--progress produce some rows

View File

@ -1,19 +0,0 @@
#!/usr/bin/env bash
# Tags: long
# This is the regression for the concurrent access in ProgressIndication,
# so it is important to read enough rows here (10e6).
#
# Initially there was 100e6, but under thread fuzzer 10min may be not enough sometimes,
# but I believe that CI will catch possible issues even with less rows anyway.
CUR_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
# shellcheck source=../shell_config.sh
. "$CUR_DIR"/../shell_config.sh
tmp_file_progress="$(mktemp "$CUR_DIR/$CLICKHOUSE_TEST_UNIQUE_NAME.XXXXXX.progress")"
trap 'rm $tmp_file_progress' EXIT
yes | head -n10000000 | $CLICKHOUSE_CLIENT -q "insert into function null('foo String') format TSV" --progress 2> "$tmp_file_progress"
echo $?
test -s "$tmp_file_progress" && echo "--progress produce some rows" || echo "FAIL: no rows with --progress"

View File

@ -0,0 +1,32 @@
#!/usr/bin/expect -f
# Tags: long
# This is a regression test for concurrent access in ProgressIndication,
# so it is important to read enough rows here (10e6).
#
# Initially this was 100e6, but under the thread fuzzer 10 minutes may not always be enough;
# CI should still catch possible issues even with fewer rows.
set basedir [file dirname $argv0]
set basename [file tail $argv0]
exp_internal -f $env(CLICKHOUSE_TMP)/$basename.debuglog 0
log_user 0
set timeout 60
match_max 100000
set stty_init "rows 25 cols 120"
expect_after {
eof { exp_continue }
timeout { exit 1 }
}
spawn bash
send "source $basedir/../shell_config.sh\r"
send "yes | head -n10000000 | \$CLICKHOUSE_LOCAL --query \"insert into function null('foo String') format TSV\" >/dev/null\r"
expect "Progress: "
send "\3"
send "exit\r"
expect eof

View File

@ -1,2 +0,0 @@
0
--progress produce some rows

View File

@ -1,19 +0,0 @@
#!/usr/bin/env bash
# Tags: long
# This is the regression for the concurrent access in ProgressIndication,
# so it is important to read enough rows here (10e6).
#
# Initially there was 100e6, but under thread fuzzer 10min may be not enough sometimes,
# but I believe that CI will catch possible issues even with less rows anyway.
CUR_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
# shellcheck source=../shell_config.sh
. "$CUR_DIR"/../shell_config.sh
tmp_file_progress="$(mktemp "$CUR_DIR/$CLICKHOUSE_TEST_UNIQUE_NAME.XXXXXX.progress")"
trap 'rm $tmp_file_progress' EXIT
yes | head -n10000000 | $CLICKHOUSE_LOCAL -q "insert into function null('foo String') format TSV" --progress 2> "$tmp_file_progress"
echo $?
test -s "$tmp_file_progress" && echo "--progress produce some rows" || echo "FAIL: no rows with --progress"

View File

@ -0,0 +1,55 @@
#!/usr/bin/expect -f
set basedir [file dirname $argv0]
set basename [file tail $argv0]
exp_internal -f $env(CLICKHOUSE_TMP)/$basename.debuglog 0
log_user 0
set timeout 60
match_max 100000
set stty_init "rows 25 cols 120"
expect_after {
eof { exp_continue }
timeout { exit 1 }
}
spawn bash
send "source $basedir/../shell_config.sh\r"
# Progress is displayed by default
send "\$CLICKHOUSE_LOCAL --query 'SELECT sum(sleep(1) = 0) FROM numbers(3) SETTINGS max_block_size = 1' >/dev/null\r"
expect "Progress: "
expect "█"
send "\3"
# This is true even if we redirect both stdout and stderr to /dev/null
send "\$CLICKHOUSE_LOCAL --query 'SELECT sum(sleep(1) = 0) FROM numbers(3) SETTINGS max_block_size = 1' >/dev/null 2>&1\r"
expect "Progress: "
expect "█"
send "\3"
# The option --progress has an implicit value of true
send "\$CLICKHOUSE_LOCAL --progress --query 'SELECT sum(sleep(1) = 0) FROM numbers(3) SETTINGS max_block_size = 1' >/dev/null 2>&1\r"
expect "Progress: "
expect "█"
send "\3"
# But we can set it to false
send "\$CLICKHOUSE_LOCAL --progress false --query 'SELECT sleep(1), \$\$Hello\$\$ FROM numbers(3) SETTINGS max_block_size = 1' 2>/dev/null\r"
expect -exact "0\tHello\r\n"
send "\3"
# As well as to 0 for the same effect
send "\$CLICKHOUSE_LOCAL --progress 0 --query 'SELECT sleep(1), \$\$Hello\$\$ FROM numbers(3) SETTINGS max_block_size = 1' 2>/dev/null\r"
expect -exact "0\tHello\r\n"
send "\3"
# If we set it to 1, the progress will be displayed as well
send "\$CLICKHOUSE_LOCAL --progress 1 --query 'SELECT sum(sleep(1) = 0) FROM numbers(3) SETTINGS max_block_size = 1' >/dev/null 2>&1\r"
expect "Progress: "
expect "█"
send "\3"
send "exit\r"
expect eof

View File

@ -0,0 +1,12 @@
Without merge
3
With merge any part range
1
With merge partition only
1
With merge replicated any part range
1
With merge replicated partition only
1
With merge partition only and new parts
3

View File

@ -0,0 +1,87 @@
-- Tags: long
DROP TABLE IF EXISTS test_without_merge;
DROP TABLE IF EXISTS test_with_merge;
DROP TABLE IF EXISTS test_replicated;
SELECT 'Without merge';
CREATE TABLE test_without_merge (i Int64) ENGINE = MergeTree ORDER BY i;
INSERT INTO test_without_merge SELECT 1;
INSERT INTO test_without_merge SELECT 2;
INSERT INTO test_without_merge SELECT 3;
SELECT sleepEachRow(1) FROM numbers(9) FORMAT Null;
SELECT count(*) FROM system.parts WHERE database = currentDatabase() AND table='test_without_merge' AND active;
DROP TABLE test_without_merge;
SELECT 'With merge any part range';
CREATE TABLE test_with_merge (i Int64) ENGINE = MergeTree ORDER BY i
SETTINGS min_age_to_force_merge_seconds=3, min_age_to_force_merge_on_partition_only=false;
INSERT INTO test_with_merge SELECT 1;
INSERT INTO test_with_merge SELECT 2;
INSERT INTO test_with_merge SELECT 3;
SELECT sleepEachRow(1) FROM numbers(9) FORMAT Null;
SELECT count(*) FROM system.parts WHERE database = currentDatabase() AND table='test_with_merge' AND active;
DROP TABLE test_with_merge;
SELECT 'With merge partition only';
CREATE TABLE test_with_merge (i Int64) ENGINE = MergeTree ORDER BY i
SETTINGS min_age_to_force_merge_seconds=3, min_age_to_force_merge_on_partition_only=true;
INSERT INTO test_with_merge SELECT 1;
INSERT INTO test_with_merge SELECT 2;
INSERT INTO test_with_merge SELECT 3;
SELECT sleepEachRow(1) FROM numbers(9) FORMAT Null;
SELECT count(*) FROM system.parts WHERE database = currentDatabase() AND table='test_with_merge' AND active;
DROP TABLE test_with_merge;
SELECT 'With merge replicated any part range';
CREATE TABLE test_replicated (i Int64) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{database}/test02473', 'node') ORDER BY i
SETTINGS min_age_to_force_merge_seconds=3, min_age_to_force_merge_on_partition_only=false;
INSERT INTO test_replicated SELECT 1;
INSERT INTO test_replicated SELECT 2;
INSERT INTO test_replicated SELECT 3;
SELECT sleepEachRow(1) FROM numbers(9) FORMAT Null;
SELECT count(*) FROM system.parts WHERE database = currentDatabase() AND table='test_replicated' AND active;
DROP TABLE test_replicated;
SELECT 'With merge replicated partition only';
CREATE TABLE test_replicated (i Int64) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{database}/test02473_partition_only', 'node') ORDER BY i
SETTINGS min_age_to_force_merge_seconds=3, min_age_to_force_merge_on_partition_only=true;
INSERT INTO test_replicated SELECT 1;
INSERT INTO test_replicated SELECT 2;
INSERT INTO test_replicated SELECT 3;
SELECT sleepEachRow(1) FROM numbers(9) FORMAT Null;
SELECT count(*) FROM system.parts WHERE database = currentDatabase() AND table='test_replicated' AND active;
DROP TABLE test_replicated;
SELECT 'With merge partition only and new parts';
CREATE TABLE test_with_merge (i Int64) ENGINE = MergeTree ORDER BY i
SETTINGS min_age_to_force_merge_seconds=3, min_age_to_force_merge_on_partition_only=true;
SYSTEM STOP MERGES test_with_merge;
-- These two parts will be older than min_age_to_force_merge_seconds at the time of merge
INSERT INTO test_with_merge SELECT 1;
INSERT INTO test_with_merge SELECT 2;
SELECT sleepEachRow(1) FROM numbers(9) FORMAT Null;
-- This freshly inserted part will have min_age=0 at the time of merge; since the setting
-- applies to the whole partition, nothing will be merged.
INSERT INTO test_with_merge SELECT 3;
SYSTEM START MERGES test_with_merge;
SELECT count(*) FROM system.parts WHERE database = currentDatabase() AND table='test_with_merge' AND active;
DROP TABLE test_with_merge;

View File

@ -0,0 +1,2 @@
bbbbb
bbbbb

View File

@ -0,0 +1 @@
SELECT if(materialize(0), extract(materialize(CAST('aaaaaa', 'LowCardinality(String)')), '\\w'), extract(materialize(CAST('bbbbb', 'LowCardinality(String)')), '\\w*')) AS res FROM numbers(2);

View File

@ -0,0 +1 @@
SELECT CAST(a, b -> c) ++; -- { clientError SYNTAX_ERROR }

View File

@ -1,3 +1,7 @@
## This parser is unsupported
We keep it in this repository for your curiosity, but it is not the parser used by ClickHouse.
## How to generate source code files from the grammar
The grammar is located in the `ClickHouseLexer.g4` and `ClickHouseParser.g4` files.

View File

@ -59,7 +59,7 @@ std::string randomDate()
int32_t month = rng() % 12 + 1;
int32_t day = rng() % 12 + 1;
char answer[13];
size_t size = sprintf(answer, "'%04u-%02u-%02u'", year, month, day);
size_t size = snprintf(answer, sizeof(answer), "'%04u-%02u-%02u'", year, month, day);
return std::string(answer, size);
}
@ -72,8 +72,9 @@ std::string randomDatetime()
int32_t minutes = rng() % 60;
int32_t seconds = rng() % 60;
char answer[22];
size_t size = sprintf(
size_t size = snprintf(
answer,
sizeof(answer),
"'%04u-%02u-%02u %02u:%02u:%02u'",
year,
month,

View File

@ -179,7 +179,7 @@ void setCurrentBlockNumber(zkutil::ZooKeeper & zk, const std::string & path, Int
if (number != current_block_number)
{
char suffix[11] = "";
size_t size = sprintf(suffix, "%010lld", current_block_number);
size_t size = snprintf(suffix, sizeof(suffix), "%010lld", current_block_number);
std::string expected_path = block_prefix + std::string(suffix, size);
std::cerr << "\t" << path_created << ": Ephemeral node has been created with an unexpected path (expected something like "
<< expected_path << ")." << std::endl;