Merge pull request #36705 from gingerwizard/diagnostic-tool

New Diagnostics tool
Alexey Milovidov 2022-04-29 13:13:04 +03:00 committed by GitHub
commit d3dd5b78b3
84 changed files with 13477 additions and 0 deletions

tools/clickhouse-diagnostics/.gitignore

@ -0,0 +1,30 @@
# If you prefer the allow list template instead of the deny list, see community template:
# https://github.com/github/gitignore/blob/main/community/Golang/Go.AllowList.gitignore
#
# Binaries for programs and plugins
*.exe
*.exe~
*.dll
*.so
*.dylib
# Test binary, built with `go test -c`
*.test
# Output of the go coverage tool, specifically when used with LiteIDE
*.out
# Dependency directories (remove the comment below to include it)
# vendor/
# Go workspace file
go.work
.idea
clickhouse-diagnostics
output
vendor
bin
profile.cov
clickhouse-diagnostics.yml
dist/


@ -0,0 +1,49 @@
# Contribution
We keep things simple. Execute all commands in this folder.
## Requirements
- docker - tested on version 20.10.12.
- golang >= go1.17.6
## Building
Creates a binary `clickhouse-diagnostics` in the local folder. Build will be versioned according to a timestamp. For a versioned release see [Releasing](#releasing).
```bash
make build
```
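The timestamped version comes from the `-X` linker flags in the Makefile, which overwrite string variables at link time. A minimal sketch of that mechanism, assuming the variables live in `main` for brevity (the Makefile targets `cmd.Version` and `cmd.Commit`; the default values and `versionString` helper here are hypothetical):

```go
package main

import "fmt"

// Version and Commit are plain string variables so the linker can overwrite
// them, e.g. go build -ldflags "-X main.Version=v.dev-20220429-1313".
// The defaults below are illustrative placeholders, not the tool's values.
var (
	Version = "v.dev-unknown"
	Commit  = "unknown"
)

// versionString formats whatever values were injected at build time.
func versionString() string {
	return fmt.Sprintf("%s (commit %s)", Version, Commit)
}

func main() {
	fmt.Println(versionString())
}
```

Without `-ldflags` the defaults are printed, which makes an unversioned local build easy to spot.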
## Linting
We lint with [golangci-lint](https://golangci-lint.run/). It runs in a container, so no local installation is needed.
```bash
make lint-go
```
## Running Tests
```bash
make test
```
For a coverage report:
```bash
make test-coverage
```
## Adding Collectors
TODO
## Adding Outputs
TODO
## Frames
## Parameter Types


@ -0,0 +1,60 @@
GOCMD=go
GOTEST=$(GOCMD) test
BINARY_NAME=clickhouse-diagnostics
BUILD_DIR=dist
TIMESTAMP := $(shell date +%Y%m%d-%H%M)
COMMIT := $(shell git rev-parse --short HEAD)
DEVLDFLAGS = -ldflags "-X github.com/ClickHouse/clickhouse-diagnostics/cmd.Version=v.dev-${TIMESTAMP} -X github.com/ClickHouse/clickhouse-diagnostics/cmd.Commit=${COMMIT}"
# override with env variable to test other versions e.g. 21.11.10.1
CLICKHOUSE_VERSION ?= latest
GREEN := $(shell tput -Txterm setaf 2)
YELLOW := $(shell tput -Txterm setaf 3)
WHITE := $(shell tput -Txterm setaf 7)
CYAN := $(shell tput -Txterm setaf 6)
RESET := $(shell tput -Txterm sgr0)
.PHONY: all test build clean vendor release lint-go test-coverage dep help
all: help
release: ## Release is delegated to goreleaser
goreleaser release --rm-dist
## Build:
build: ## Build a binary for local use
# timestamped version
$(GOCMD) build ${DEVLDFLAGS} -o $(BINARY_NAME) .
clean: ## Remove build related file
rm -f ${BINARY_NAME}
rm -f checkstyle-report.xml ./coverage.xml ./profile.cov
vendor: ## Copy of all packages needed to support builds and tests in the vendor directory
$(GOCMD) mod vendor
test: ## Run the tests of the project
CLICKHOUSE_VERSION=$(CLICKHOUSE_VERSION) $(GOTEST) -v -race `go list ./... | grep -v ./internal/platform/test`
lint-go: ## Use golangci-lint
docker run --rm -v $(shell pwd):/app -w /app golangci/golangci-lint:latest-alpine golangci-lint run
test-coverage: ## Run the tests of the project and export the coverage
CLICKHOUSE_VERSION=$(CLICKHOUSE_VERSION) $(GOTEST) -cover -covermode=count -coverprofile=profile.cov `go list ./... | grep -v ./internal/platform/test`
$(GOCMD) tool cover -func profile.cov
dep:
$(GOCMD) mod download
help: ## Show this help.
@echo ''
@echo 'Usage:'
@echo ' ${YELLOW}make${RESET} ${GREEN}<target>${RESET}'
@echo ''
@echo 'Targets:'
@awk 'BEGIN {FS = ":.*?## "} { \
if (/^[a-zA-Z_-]+:.*?##.*$$/) {printf " ${YELLOW}%-20s${GREEN}%s${RESET}\n", $$1, $$2} \
else if (/^## .*$$/) {printf " ${CYAN}%s${RESET}\n", substr($$1,4)} \
}' $(MAKEFILE_LIST)


@ -0,0 +1,167 @@
# ClickHouse Diagnostics Tool
## Purpose
This tool provides a means of obtaining a diagnostic bundle from a ClickHouse instance. This bundle can be provided to your nearest ClickHouse support provider in order to assist with the diagnosis of issues.
## Design Philosophy
- **No local dependencies** to run. We compile to a platform-independent binary, hence Go.
- **Minimize resource overhead**. Improvements always welcome.
- **Extendable framework**. At its core, the tool provides collectors and outputs. Collectors are independent and are responsible for collecting a specific dataset e.g. system configuration. Outputs produce the diagnostic bundle in a specific format. It should be trivial to add both for contributors. See [Collectors](#collectors) and [Outputs](#outputs) for more details.
- **Convertible output formats**. Outputs produce diagnostic bundles in different formats e.g. archive, simple report etc. Where possible, these formats should be convertible into one another. For example, an administrator may provide a bundle as an archive to their support provider, who in turn wishes to visualise it as a report or even in ClickHouse itself.
- **Something is better than nothing**. Collectors execute independently. We never fail a collection because one fails - preferring to warn the user only. There are good reasons for a collector failure e.g. insufficient permissions or missing data.
- **Execute anywhere** - Ideally, this tool is executed on a ClickHouse host. Some collectors e.g. configuration file collection or system information, rely on this. However, collectors will obtain as much information remotely from the database as possible if executed remotely from the cluster - warning where collection fails. **We do currently require ClickHouse to be running, connecting over the native port**.
We recommend reading [Permissions, Warnings & Locality](#permissions-warnings--locality).
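The "something is better than nothing" principle above can be sketched as a loop that records failures as warnings rather than aborting. This is an illustration only: the `collector` type, `collectAll` and `sampleCollectors` are hypothetical names and do not reflect the tool's internal API.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// collector is a hypothetical stand-in for the tool's collector abstraction.
type collector struct {
	name string
	run  func() (string, error)
}

// collectAll runs every collector independently. A failing collector is
// reported on stderr and skipped; it never stops the remaining collectors.
func collectAll(cs []collector) (map[string]string, []string) {
	results := make(map[string]string)
	var failed []string
	for _, c := range cs {
		data, err := c.run()
		if err != nil {
			fmt.Fprintf(os.Stderr, "warning: collector %s failed: %v\n", c.name, err)
			failed = append(failed, c.name)
			continue
		}
		results[c.name] = data
	}
	return results, failed
}

// sampleCollectors returns one succeeding and one failing collector.
func sampleCollectors() []collector {
	return []collector{
		{name: "system", run: func() (string, error) { return "cpu=8", nil }},
		{name: "config", run: func() (string, error) { return "", errors.New("permission denied") }},
	}
}

func main() {
	results, failed := collectAll(sampleCollectors())
	fmt.Println(len(results), "succeeded;", len(failed), "failed")
}
```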
## Usage
### Collection
The `collect` command allows the collection of a diagnostic bundle. In its simplest form, assuming ClickHouse is running locally on default ports with no password:
```bash
clickhouse-diagnostics collect
```
This will use the default collectors and the simple output. This output produces a timestamped archive bundle in `gz` format in a subfolder named after the host. The folder name can be controlled via the parameter `--id`, or a specific directory can be set via the simple output parameter `output.simple.directory`.
Collectors, Outputs and ClickHouse connection credentials can be specified as shown below:
```bash
clickhouse-diagnostics collect --password random --username default --collectors=system_db,system --output=simple --id my_cluster_name
```
This collects the system database and host information from the cluster running locally. The archive bundle will be produced under a folder `my_cluster_name`.
For further details, use the built-in help (the commands below are equivalent):
```bash
clickhouse-diagnostics collect --help
./clickhouse-diagnostics help collect
```
### Help & Finding parameters for collectors & outputs
Collectors and outputs have their own parameters, which are not listed in the help for the `collect` command. These can be identified using the `help` command.
For more information about a specific collector:
```bash
clickhouse-diagnostics help --collector [collector]
```
For more information about a specific output:
```bash
clickhouse-diagnostics help --output [output]
```
### Convert
Coming soon to a cluster near you...
## Collectors
We currently support the following collectors. A `*` indicates this collector is enabled by default:
- `system_db*` - Collects all tables in the system database, except those which have been excluded and up to a specified row limit.
- `system*` - Collects summary OS and hardware statistics for the host.
- `config*` - Collects the ClickHouse configuration from the local filesystem. A best effort is made using process information if ClickHouse is not installed locally. Any `include_path` settings are also considered.
- `db_logs*` - Collects the ClickHouse logs directly from the database.
- `logs*` - Collects the ClickHouse logs from the logs directory on the local filesystem.
- `summary*` - Collects summary statistics on the database based on a set of known useful queries. This represents the easiest collector to extend - contributions are welcome to this set which can be found [here](https://github.com/ClickHouse/clickhouse-diagnostics/blob/main/internal/collectors/clickhouse/queries.json).
- `file` - Collects files based on glob patterns. Does not collect directories. To preview the files that will be collected, try `clickhouse-diagnostics collect --collectors=file --collector.file.file_pattern=<glob path> --output report`
- `command` - Collects the output of a user specified command. To preview output, `clickhouse-diagnostics collect --collectors=command --collector.command.command="<command>" --output report`
- `zookeeper_db` - Collects information about zookeeper using the `system.zookeeper` table, recursively iterating the zookeeper tree/table. Note: changing the default parameter values can cause extremely high load to be placed on the database. Use with caution. By default, uses the glob `/clickhouse/{task_queue}/**` to match zookeeper paths and iterates to a max depth of 8.
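As an illustration of the `file` collector's behaviour described above (match a glob, skip directories), here is a minimal sketch using only the standard library. `matchFiles` is a hypothetical helper, not the tool's implementation; note also that `filepath.Glob` does not understand `**`, so the real collector presumably relies on a different glob library.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// matchFiles expands a glob pattern and keeps regular files only,
// mirroring the "does not collect directories" rule.
func matchFiles(pattern string) ([]string, error) {
	matches, err := filepath.Glob(pattern)
	if err != nil {
		return nil, err // only returned for a malformed pattern
	}
	var files []string
	for _, m := range matches {
		info, err := os.Stat(m)
		if err != nil || info.IsDir() {
			continue // unreadable entries and directories are skipped
		}
		files = append(files, m)
	}
	return files, nil
}

func main() {
	files, err := matchFiles(filepath.Join(os.TempDir(), "*.log"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "bad pattern:", err)
		os.Exit(1)
	}
	for _, f := range files {
		fmt.Println(f)
	}
}
```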
## Outputs
We currently support the following outputs. The `simple` output is currently the default:
- `simple` - Writes out the diagnostic bundle as files in a structured directory, optionally producing a compressed archive.
- `report` - Writes out the diagnostic bundle to the terminal as a simple report. Supports an ascii table format or markdown.
- `clickhouse` - **Under development**. This will allow a bundle to be stored in a cluster allowing visualization in common tooling e.g. Grafana.
## Simple Output
Since the `simple` output is the default we provide additional details here.
This output produces a timestamped archive by default in `gz` format under a directory named after either the hostname or the specified collection `--id`. As shown below, a specific directory can also be specified. Compression can also be disabled, leaving just the contents of the folder:
```bash
./clickhouse-diagnostics help --output simple
Writes out the diagnostic bundle as files in a structured directory, optionally producing a compressed archive.
Usage:
--output=simple [flags]
Flags:
--output.simple.directory string Directory in which to create dump. Defaults to the current directory. (default "./")
--output.simple.format string Format of exported files (default "csv")
--output.simple.skip_archive Don't compress output to an archive
```
The archive itself contains a folder for each collector. Each collector can potentially produce many discrete sets of data, known as frames. Each of these typically results in a single file within the collector's folder. For example, each query for the `summary` collector results in a correspondingly named file within the `summary` folder.
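The layout described above (one folder per collector, one file per frame) can be sketched as a path computation. The exact naming scheme is an assumption: `framePath` and the frame name `disks` are hypothetical, and the real bundle also involves a timestamped archive that is not modelled here.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// framePath returns where a frame would be written under the assumed layout
// <directory>/<id>/<collector>/<frame>.<format>.
func framePath(directory, id, collector, frame, format string) string {
	return filepath.Join(directory, id, collector, frame+"."+format)
}

func main() {
	// "disks" is a made-up frame name used purely for illustration.
	fmt.Println(framePath("./", "my_cluster_name", "summary", "disks", "csv"))
}
```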
## Permissions, Warnings & Locality
Some collectors either require specific permissions for complete collection or should be executed on a ClickHouse host. We aim to collate these requirements below:
- `system_db` - This collector aims to collect all tables in the `system` database. Some tables may fail if certain features are not enabled. Specifically, [allow_introspection_functions](https://clickhouse.com/docs/en/operations/settings/settings/#settings-allow_introspection_functions) is required to collect the `stack_traces` table. [access_management](https://clickhouse.com/docs/en/operations/settings/settings-users/#access_management-user-setting) must be set for the ClickHouse user specified for collection, to permit access to access management tables e.g. `quota_usage`.
- `db_logs` - The ClickHouse user must have access to the tables `query_log`, `query_thread_log` and `text_log`.
- `logs` - The system user under which the tool is executed must have access to the logs directory. The tool must therefore be executed directly on the target ClickHouse server for this collector to work. Where the logs directory is not in the default location, e.g. `/var/log/clickhouse-server`, we will attempt to establish the location from the ClickHouse configuration. This requires permission to read the configuration files, which in most cases means granting specific permissions to the run user if you are not comfortable executing the tool under sudo or as the `clickhouse` user.
- `summary` - This collector executes pre-recorded queries. Some of these read tables concerning access management, thus requiring the ClickHouse user to have the [access_management](https://clickhouse.com/docs/en/operations/settings/settings-users/#access_management-user-setting) permission.
- `config` - This collector reads and copies the local configuration files. It thus requires permissions to read the configuration files - which in most cases requires specific permissions to be granted to the run user if you are not comfortable executing the tool under sudo or the `clickhouse` user.
**If a collector cannot collect specific data because of either execution location or permissions, it will log a warning to the terminal.**
## Logging
All logs are output to `stderr`. `stdout` is used exclusively for outputs to print information.
## Configuration file
In addition to supporting parameters via the command line, a configuration file can be specified via the `--config`, `-f` flag.
By default, we look for a configuration file `clickhouse-diagnostics.yml` in the same directory as the binary. If not present, we revert to command line flags.
**Values set via the command line always take precedence over those in the configuration file.**
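The precedence rule can be summarised as: an explicitly passed flag wins, then the configuration file, then the built-in default. A minimal sketch of that decision, using a hypothetical `setting` type rather than the tool's cobra and viper wiring:

```go
package main

import "fmt"

// setting models one parameter and its three possible sources.
// All field names are illustrative.
type setting struct {
	flagValue   string
	flagChanged bool // true when the user passed the flag explicitly
	configValue string
	configSet   bool // true when the key exists in the configuration file
	defaultVal  string
}

// resolve applies the precedence: command line, then config file, then default.
func (s setting) resolve() string {
	switch {
	case s.flagChanged:
		return s.flagValue
	case s.configSet:
		return s.configValue
	default:
		return s.defaultVal
	}
}

func main() {
	s := setting{flagValue: "report", flagChanged: true, configValue: "simple", configSet: true, defaultVal: "simple"}
	fmt.Println(s.resolve())
}
```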
All parameters can be set via the configuration file and can in most cases be converted to a yaml hierarchy, where periods indicate a nesting. For example,
`--collector.system_db.row_limit=1`
becomes
```yaml
collector:
  system_db:
    row_limit: 1
```
The following exceptions exist to avoid collisions:
| Command | Parameter | Configuration File |
|---------|------------|--------------------|
| collect | output | collect.output |
| collect | collectors | collect.collectors |
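To make the mapping concrete, the sketch below derives the configuration key for a flag (nesting `output` and `collectors` under the command name, as in the table) and expands a dotted key into the nested structure a YAML file would hold. `configKey` and `toHierarchy` are hypothetical helpers written for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// nested lists the flags that are placed under the command name
// in the configuration file to avoid collisions.
var nested = map[string]bool{"output": true, "collectors": true}

// configKey returns the configuration file key for a command line flag.
func configKey(command, flag string) string {
	if nested[flag] {
		return command + "." + flag
	}
	return flag
}

// toHierarchy expands a dotted key into nested maps, one level per period.
func toHierarchy(key string, value interface{}) map[string]interface{} {
	parts := strings.Split(key, ".")
	root := map[string]interface{}{}
	cur := root
	for _, p := range parts[:len(parts)-1] {
		next := map[string]interface{}{}
		cur[p] = next
		cur = next
	}
	cur[parts[len(parts)-1]] = value
	return root
}

func main() {
	fmt.Println(configKey("collect", "output"))
	fmt.Println(toHierarchy("collector.system_db.row_limit", 1))
}
```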
## FAQ
1. Does the collector need root permissions?
No. However, to read some local files e.g. configurations, the tool should be executed as the `clickhouse` user.
2. What ClickHouse database permissions does the collector need?
Read permissions on all system tables are required in most cases - although only specific collectors need this. [Access management permissions](https://clickhouse.com/docs/en/operations/settings/settings-users/#access_management-user-setting) will ensure full collection.
3. Is any processing done on logs for anonymization purposes?
Currently no. ClickHouse should not log sensitive information to logs e.g. passwords.
4. Is sensitive information removed from configuration files e.g. passwords?
Yes. We remove both passwords and hashed passwords. Please raise an issue if you require further information to be anonymized. We appreciate this is a sensitive topic.
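As a rough illustration of what masking passwords in an XML configuration can look like, the sketch below replaces the contents of any element whose name starts with `password`. This is an assumption-laden example, not the tool's actual method: the regular expression, the `maskPasswords` helper and the `Replaced` marker are all hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// passwordElement matches elements such as <password> or <password_sha256_hex>
// together with their text content. It is a simplification and does not
// handle attributes or nested markup.
var passwordElement = regexp.MustCompile(`<(password\w*)>[^<]*</password\w*>`)

// maskPasswords replaces the content of every matched element with a marker.
func maskPasswords(xml string) string {
	return passwordElement.ReplaceAllString(xml, "<${1}>Replaced</${1}>")
}

func main() {
	fmt.Println(maskPasswords("<default><password>secret</password></default>"))
}
```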


@ -0,0 +1,158 @@
package cmd
import (
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/cmd/params"
"github.com/ClickHouse/clickhouse-diagnostics/internal"
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors"
_ "github.com/ClickHouse/clickhouse-diagnostics/internal/collectors/clickhouse"
_ "github.com/ClickHouse/clickhouse-diagnostics/internal/collectors/system"
"github.com/ClickHouse/clickhouse-diagnostics/internal/outputs"
_ "github.com/ClickHouse/clickhouse-diagnostics/internal/outputs/file"
_ "github.com/ClickHouse/clickhouse-diagnostics/internal/outputs/terminal"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/utils"
"github.com/rs/zerolog/log"
"github.com/spf13/cobra"
"github.com/spf13/pflag"
"github.com/spf13/viper"
"os"
"strings"
)
var id string
var output = params.StringOptionsVar{
Options: outputs.GetOutputNames(),
Value: "simple",
}
// access credentials
var host string
var port uint16
var username string
var password string
var collectorNames = params.StringSliceOptionsVar{
Options: collectors.GetCollectorNames(false),
Values: collectors.GetCollectorNames(true),
}
// holds the collector params passed by the cli
var collectorParams params.ParamMap
// holds the output params passed by the cli
var outputParams params.ParamMap
const collectHelpTemplate = `Usage:{{if .Runnable}}
{{.UseLine}}{{end}}{{if .HasAvailableSubCommands}}
{{.CommandPath}} [command]{{end}}{{if gt (len .Aliases) 0}}
Aliases:
{{.NameAndAliases}}{{end}}{{if .HasExample}}
Examples:
{{.Example}}{{end}}{{if .HasAvailableSubCommands}}
Available Commands:{{range .Commands}}{{if (or .IsAvailableCommand (eq .Name "help"))}}
{{rpad .Name .NamePadding }} {{.Short}}{{end}}{{end}}{{end}}{{if .HasAvailableLocalFlags}}
Flags:
{{.LocalFlags.FlagUsages | trimTrailingWhitespaces}}{{end}}{{if .HasAvailableInheritedFlags}}
Global Flags:
{{.InheritedFlags.FlagUsages | trimTrailingWhitespaces}}{{end}}
Additional help topics:
Use "{{.CommandPath}} [command] --help" for more information about a command.
Use "{{.Parent.Name}} help --collector [collector]" for more information about a specific collector.
Use "{{.Parent.Name}} help --output [output]" for more information about a specific output.
`
func init() {
collectCmd.Flags().StringVar(&id, "id", getHostName(), "Id of diagnostic bundle")
// access credentials
collectCmd.Flags().StringVar(&host, "host", "localhost", "ClickHouse host")
collectCmd.Flags().Uint16VarP(&port, "port", "p", 9000, "ClickHouse native port")
collectCmd.Flags().StringVarP(&username, "username", "u", "", "ClickHouse username")
collectCmd.Flags().StringVar(&password, "password", "", "ClickHouse password")
// collectors and outputs
collectCmd.Flags().VarP(&output, "output", "o", fmt.Sprintf("Output Format for the diagnostic Bundle, options: [%s]\n", strings.Join(output.Options, ",")))
collectCmd.Flags().VarP(&collectorNames, "collectors", "c", fmt.Sprintf("Collectors to use, options: [%s]\n", strings.Join(collectorNames.Options, ",")))
collectorConfigs, err := collectors.BuildConfigurationOptions()
if err != nil {
log.Fatal().Err(err).Msg("Unable to build collector configurations")
}
collectorParams = params.NewParamMap(collectorConfigs)
outputConfigs, err := outputs.BuildConfigurationOptions()
if err != nil {
log.Fatal().Err(err).Msg("Unable to build output configurations")
}
params.AddParamMapToCmd(collectorParams, collectCmd, "collector", true)
outputParams = params.NewParamMap(outputConfigs)
params.AddParamMapToCmd(outputParams, collectCmd, "output", true)
collectCmd.SetFlagErrorFunc(handleFlagErrors)
collectCmd.SetHelpTemplate(collectHelpTemplate)
rootCmd.AddCommand(collectCmd)
}
var collectCmd = &cobra.Command{
Use: "collect",
Short: "Collect a diagnostic bundle",
Long: `Collect a ClickHouse diagnostic bundle for a specified ClickHouse instance`,
PreRun: func(cmd *cobra.Command, args []string) {
bindFlagsToConfig(cmd)
},
Example: fmt.Sprintf(`%s collect --username default --collectors=%s --output=simple`, rootCmd.Name(), strings.Join(collectorNames.Options[:2], ",")),
Run: func(cmd *cobra.Command, args []string) {
log.Info().Msgf("executing collect command with %v collectors and %s output", collectorNames.Values, output.Value)
outputConfig := params.ConvertParamsToConfig(outputParams)[output.Value]
runConfig := internal.NewRunConfiguration(id, host, port, username, password, output.Value, outputConfig, collectorNames.Values, params.ConvertParamsToConfig(collectorParams))
internal.Capture(runConfig)
os.Exit(0)
},
}
func getHostName() string {
name, err := os.Hostname()
if err != nil {
name = "clickhouse-diagnostics"
}
return name
}
// these flags are nested under the cmd name in the config file to prevent collisions
var flagsToNest = []string{"output", "collectors"}
// this saves us binding each command manually to viper
func bindFlagsToConfig(cmd *cobra.Command) {
cmd.Flags().VisitAll(func(f *pflag.Flag) {
err := viper.BindEnv(f.Name, fmt.Sprintf("%s_%s", envPrefix,
strings.ToUpper(strings.Replace(f.Name, ".", "_", -1))))
if err != nil {
log.Error().Msgf("Unable to bind %s to config", f.Name)
}
configFlagName := f.Name
if utils.Contains(flagsToNest, f.Name) {
configFlagName = fmt.Sprintf("%s.%s", cmd.Use, configFlagName)
}
err = viper.BindPFlag(configFlagName, f)
if err != nil {
log.Error().Msgf("Unable to bind %s to config", f.Name)
}
// here we prefer the config value when the param is not set on the cmd line
if !f.Changed && viper.IsSet(configFlagName) {
val := viper.Get(configFlagName)
log.Debug().Msgf("Setting parameter %s from configuration file", f.Name)
err = cmd.Flags().Set(f.Name, fmt.Sprintf("%v", val))
if err != nil {
log.Error().Msgf("Unable to read \"%s\" value from config", f.Name)
} else {
log.Debug().Msgf("Set parameter \"%s\" from configuration", f.Name)
}
}
})
}


@ -0,0 +1 @@
package cmd


@ -0,0 +1,123 @@
package cmd
import (
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/cmd/params"
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors"
"github.com/ClickHouse/clickhouse-diagnostics/internal/outputs"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/rs/zerolog/log"
"github.com/spf13/cobra"
"os"
)
var cHelp = params.StringOptionsVar{
Options: collectors.GetCollectorNames(false),
Value: "",
}
var oHelp = params.StringOptionsVar{
Options: outputs.GetOutputNames(),
Value: "",
}
func init() {
helpCmd.Flags().VarP(&cHelp, "collector", "c", "Specify collector to get description of available flags")
helpCmd.Flags().VarP(&oHelp, "output", "o", "Specify output to get description of available flags")
helpCmd.SetUsageTemplate(`Usage:{{if .Runnable}}
{{.UseLine}}{{end}}{{if .HasExample}}
Examples:
{{.Example}}{{end}}
Available Commands:{{range .Parent.Commands}}{{if (or .IsAvailableCommand (eq .Name "help"))}}
{{rpad .Name .NamePadding }} {{.Short}}{{end}}{{end}}{{if .HasAvailableLocalFlags}}
Flags:
{{.LocalFlags.FlagUsages | trimTrailingWhitespaces}}{{end}}
Alternatively use "{{.CommandPath}} [command] --help" for more information about a command.
`)
helpCmd.SetFlagErrorFunc(handleFlagErrors)
}
var helpCmd = &cobra.Command{
Use: "help [command]",
Short: "Help about any command, collector or output",
Long: `Help provides help for any command, collector or output in the application.`,
Example: fmt.Sprintf(`%[1]v help collect
%[1]v help --collector=config
%[1]v help --output=simple`, rootCmd.Name()),
Run: func(c *cobra.Command, args []string) {
if len(args) != 0 {
//find the command on which help is requested
cmd, _, e := c.Root().Find(args)
if cmd == nil || e != nil {
c.Printf("Unknown help topic %#q\n", args)
cobra.CheckErr(c.Root().Usage())
} else {
cmd.InitDefaultHelpFlag()
cobra.CheckErr(cmd.Help())
}
return
}
if cHelp.Value != "" && oHelp.Value != "" {
log.Error().Msg("Specify either --collector or --output not both")
_ = c.Help()
os.Exit(1)
}
if cHelp.Value != "" {
collector, err := collectors.GetCollectorByName(cHelp.Value)
if err != nil {
log.Fatal().Err(err).Msgf("Unable to initialize collector %s", cHelp.Value)
}
configHelp(collector.Configuration(), "collector", cHelp.Value, collector.Description())
} else if oHelp.Value != "" {
output, err := outputs.GetOutputByName(oHelp.Value)
if err != nil {
log.Fatal().Err(err).Msgf("Unable to initialize output %s", oHelp.Value)
}
configHelp(output.Configuration(), "output", oHelp.Value, output.Description())
} else {
_ = c.Help()
}
os.Exit(0)
},
}
func configHelp(conf config.Configuration, componentType, name, description string) {
paramMap := params.NewParamMap(map[string]config.Configuration{
name: conf,
})
tempHelpCmd := &cobra.Command{
Use: fmt.Sprintf("--%s=%s", componentType, name),
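// componentType is "collector" or "output", so the summary matches the component being described
Short: fmt.Sprintf("Help about the %s %s", name, componentType),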
Long: description,
SilenceErrors: true,
Run: func(c *cobra.Command, args []string) {
_ = c.Help()
},
}
params.AddParamMapToCmd(paramMap, tempHelpCmd, componentType, false)
// this is workaround to hide the help flag
tempHelpCmd.Flags().BoolP("help", "h", false, "Dummy help")
tempHelpCmd.Flags().Lookup("help").Hidden = true
tempHelpCmd.SetUsageTemplate(`
{{.Long}}
Usage:{{if .Runnable}}
{{.UseLine}}{{end}}{{if .HasExample}}
Examples:
{{.Example}}{{end}}
Flags:{{if .HasAvailableLocalFlags}}
{{.LocalFlags.FlagUsages | trimTrailingWhitespaces}}{{else}}
No configuration flags available
{{end}}
`)
_ = tempHelpCmd.Execute()
}


@ -0,0 +1,280 @@
package params
import (
"bytes"
"encoding/csv"
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/utils"
"github.com/spf13/cobra"
"strings"
)
type cliParamType uint8
const (
String cliParamType = iota
StringList
StringOptionsList
Integer
Boolean
)
type CliParam struct {
Description string
Default interface{}
//this should always be an address to a value - as required by cobra
Value interface{}
Type cliParamType
}
type ParamMap map[string]map[string]CliParam
func NewParamMap(configs map[string]config.Configuration) ParamMap {
paramMap := make(ParamMap)
for name, configuration := range configs {
for _, param := range configuration.Params {
switch p := param.(type) {
case config.StringParam:
paramMap = paramMap.createStringParam(name, p)
case config.StringListParam:
paramMap = paramMap.createStringListParam(name, p)
case config.StringOptions:
paramMap = paramMap.createStringOptionsParam(name, p)
case config.IntParam:
paramMap = paramMap.createIntegerParam(name, p)
case config.BoolParam:
paramMap = paramMap.createBoolParam(name, p)
}
}
}
return paramMap
}
func (m ParamMap) createBoolParam(rootKey string, bParam config.BoolParam) ParamMap {
if _, ok := m[rootKey]; !ok {
m[rootKey] = make(map[string]CliParam)
}
var value bool
param := CliParam{
Description: bParam.Description(),
Default: bParam.Value,
Value: &value,
Type: Boolean,
}
m[rootKey][bParam.Name()] = param
return m
}
func (m ParamMap) createStringParam(rootKey string, sParam config.StringParam) ParamMap {
if _, ok := m[rootKey]; !ok {
m[rootKey] = make(map[string]CliParam)
}
var value string
param := CliParam{
Description: sParam.Description(),
Default: sParam.Value,
Value: &value,
Type: String,
}
m[rootKey][sParam.Name()] = param
return m
}
func (m ParamMap) createStringListParam(rootKey string, lParam config.StringListParam) ParamMap {
if _, ok := m[rootKey]; !ok {
m[rootKey] = make(map[string]CliParam)
}
var value []string
param := CliParam{
Description: lParam.Description(),
Default: lParam.Values,
Value: &value,
Type: StringList,
}
m[rootKey][lParam.Name()] = param
return m
}
func (m ParamMap) createStringOptionsParam(rootKey string, oParam config.StringOptions) ParamMap {
if _, ok := m[rootKey]; !ok {
m[rootKey] = make(map[string]CliParam)
}
value := StringOptionsVar{
Options: oParam.Options,
Value: oParam.Value,
}
param := CliParam{
Description: oParam.Description(),
Default: oParam.Value,
Value: &value,
Type: StringOptionsList,
}
m[rootKey][oParam.Name()] = param
return m
}
func (m ParamMap) createIntegerParam(rootKey string, iParam config.IntParam) ParamMap {
if _, ok := m[rootKey]; !ok {
m[rootKey] = make(map[string]CliParam)
}
var value int64
param := CliParam{
Description: iParam.Description(),
Default: iParam.Value,
Value: &value,
Type: Integer,
}
m[rootKey][iParam.Name()] = param
return m
}
func (c CliParam) GetConfigParam(name string) config.ConfigParam {
// this is a config being passed to a collector - required can be false
param := config.NewParam(name, c.Description, false)
switch c.Type {
case String:
return config.StringParam{
Param: param,
// values will be pointers
Value: *(c.Value.(*string)),
}
case StringList:
return config.StringListParam{
Param: param,
Values: *(c.Value.(*[]string)),
}
case StringOptionsList:
optionsVar := *(c.Value.(*StringOptionsVar))
return config.StringOptions{
Param: param,
Options: optionsVar.Options,
Value: optionsVar.Value,
}
case Integer:
return config.IntParam{
Param: param,
Value: *(c.Value.(*int64)),
}
case Boolean:
return config.BoolParam{
Param: param,
Value: *(c.Value.(*bool)),
}
}
return param
}
type StringOptionsVar struct {
Options []string
Value string
}
func (o StringOptionsVar) String() string {
return o.Value
}
func (o *StringOptionsVar) Set(p string) error {
isIncluded := func(opts []string, val string) bool {
for _, opt := range opts {
if val == opt {
return true
}
}
return false
}
if !isIncluded(o.Options, p) {
return fmt.Errorf("%s is not included in options: %v", p, o.Options)
}
o.Value = p
return nil
}
func (o *StringOptionsVar) Type() string {
return "string"
}
type StringSliceOptionsVar struct {
Options []string
Values []string
}
func (o StringSliceOptionsVar) String() string {
str, _ := writeAsCSV(o.Values)
return "[" + str + "]"
}
func (o *StringSliceOptionsVar) Set(val string) error {
values, err := readAsCSV(val)
if err != nil {
return err
}
vValues := utils.Distinct(values, o.Options)
if len(vValues) > 0 {
return fmt.Errorf("%v are not included in options: %v", vValues, o.Options)
}
o.Values = values
return nil
}
func (o *StringSliceOptionsVar) Type() string {
return "stringSlice"
}
func writeAsCSV(vals []string) (string, error) {
b := &bytes.Buffer{}
w := csv.NewWriter(b)
err := w.Write(vals)
if err != nil {
return "", err
}
w.Flush()
return strings.TrimSuffix(b.String(), "\n"), nil
}
func readAsCSV(val string) ([]string, error) {
if val == "" {
return []string{}, nil
}
stringReader := strings.NewReader(val)
csvReader := csv.NewReader(stringReader)
return csvReader.Read()
}
func AddParamMapToCmd(paramMap ParamMap, cmd *cobra.Command, prefix string, hide bool) {
for rootKey, childMap := range paramMap {
for childKey, value := range childMap {
paramName := fmt.Sprintf("%s.%s.%s", prefix, rootKey, childKey)
switch value.Type {
case String:
cmd.Flags().StringVar(value.Value.(*string), paramName, value.Default.(string), value.Description)
case StringList:
cmd.Flags().StringSliceVar(value.Value.(*[]string), paramName, value.Default.([]string), value.Description)
case StringOptionsList:
cmd.Flags().Var(value.Value.(*StringOptionsVar), paramName, value.Description)
case Integer:
cmd.Flags().Int64Var(value.Value.(*int64), paramName, value.Default.(int64), value.Description)
case Boolean:
cmd.Flags().BoolVar(value.Value.(*bool), paramName, value.Default.(bool), value.Description)
}
// this ensures flags from collectors and outputs are not shown as they will pollute the output
if hide {
_ = cmd.Flags().MarkHidden(paramName)
}
}
}
}
func ConvertParamsToConfig(paramMap ParamMap) map[string]config.Configuration {
configuration := make(map[string]config.Configuration)
for rootKey, childMap := range paramMap {
if _, ok := configuration[rootKey]; !ok {
configuration[rootKey] = config.Configuration{}
}
for childKey, value := range childMap {
configParam := value.GetConfigParam(childKey)
configuration[rootKey] = config.Configuration{Params: append(configuration[rootKey].Params, configParam)}
}
}
return configuration
}


@ -0,0 +1,246 @@
package params_test
import (
"github.com/ClickHouse/clickhouse-diagnostics/cmd/params"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/spf13/cobra"
"github.com/stretchr/testify/require"
"os"
"sort"
"testing"
)
var conf = map[string]config.Configuration{
"config": {
Params: []config.ConfigParam{
config.StringParam{
Value: "",
Param: config.NewParam("directory", "A directory", false),
AllowEmpty: true,
},
},
},
"system": {
Params: []config.ConfigParam{
config.StringListParam{
// nil means include everything
Values: nil,
Param: config.NewParam("include_tables", "Include tables", false),
},
config.StringListParam{
Values: []string{"distributed_ddl_queue", "query_thread_log", "query_log", "asynchronous_metric_log", "zookeeper"},
Param: config.NewParam("exclude_tables", "Excluded tables", false),
},
config.IntParam{
Value: 100000,
Param: config.NewParam("row_limit", "Max rows", false),
},
},
},
"reader": {
Params: []config.ConfigParam{
config.StringOptions{
Value: "csv",
Options: []string{"csv"},
Param: config.NewParam("format", "Format of imported files", false),
},
config.BoolParam{
Value: true,
Param: config.NewParam("collect_archives", "Collect archives", false),
},
},
},
}
func TestNewParamMap(t *testing.T) {
// test each of the types via NewParamMap - one with each type. the keys here can represent anything e.g. a collector name
t.Run("test param map correctly converts types", func(t *testing.T) {
paramMap := params.NewParamMap(conf)
require.Len(t, paramMap, 3)
// check config
require.Contains(t, paramMap, "config")
require.Len(t, paramMap["config"], 1)
require.Contains(t, paramMap["config"], "directory")
require.IsType(t, params.CliParam{}, paramMap["config"]["directory"])
require.Equal(t, "A directory", paramMap["config"]["directory"].Description)
require.Equal(t, "", *(paramMap["config"]["directory"].Value.(*string)))
require.Equal(t, "", paramMap["config"]["directory"].Default)
require.Equal(t, params.String, paramMap["config"]["directory"].Type)
// check system
require.Contains(t, paramMap, "system")
require.Len(t, paramMap["system"], 3)
require.IsType(t, params.CliParam{}, paramMap["system"]["include_tables"])
require.Equal(t, "Include tables", paramMap["system"]["include_tables"].Description)
var value []string
require.Equal(t, &value, paramMap["system"]["include_tables"].Value)
require.Equal(t, value, paramMap["system"]["include_tables"].Default)
require.Equal(t, params.StringList, paramMap["system"]["include_tables"].Type)
require.Equal(t, "Excluded tables", paramMap["system"]["exclude_tables"].Description)
require.IsType(t, params.CliParam{}, paramMap["system"]["exclude_tables"])
require.Equal(t, &value, paramMap["system"]["exclude_tables"].Value)
require.Equal(t, []string{"distributed_ddl_queue", "query_thread_log", "query_log", "asynchronous_metric_log", "zookeeper"}, paramMap["system"]["exclude_tables"].Default)
require.Equal(t, params.StringList, paramMap["system"]["exclude_tables"].Type)
require.Equal(t, "Max rows", paramMap["system"]["row_limit"].Description)
require.IsType(t, params.CliParam{}, paramMap["system"]["row_limit"])
var iValue int64
require.Equal(t, &iValue, paramMap["system"]["row_limit"].Value)
require.Equal(t, int64(100000), paramMap["system"]["row_limit"].Default)
require.Equal(t, params.Integer, paramMap["system"]["row_limit"].Type)
// check reader
require.Contains(t, paramMap, "reader")
require.Len(t, paramMap["reader"], 2)
require.IsType(t, params.CliParam{}, paramMap["reader"]["format"])
require.Equal(t, "Format of imported files", paramMap["reader"]["format"].Description)
oValue := params.StringOptionsVar{
Options: []string{"csv"},
Value: "csv",
}
require.Equal(t, &oValue, paramMap["reader"]["format"].Value)
require.Equal(t, "csv", paramMap["reader"]["format"].Default)
require.Equal(t, params.StringOptionsList, paramMap["reader"]["format"].Type)
require.IsType(t, params.CliParam{}, paramMap["reader"]["collect_archives"])
require.Equal(t, "Collect archives", paramMap["reader"]["collect_archives"].Description)
var bVar bool
require.Equal(t, &bVar, paramMap["reader"]["collect_archives"].Value)
require.Equal(t, true, paramMap["reader"]["collect_archives"].Default)
require.Equal(t, params.Boolean, paramMap["reader"]["collect_archives"].Type)
})
}
// test GetConfigParam
func TestConvertParamsToConfig(t *testing.T) {
paramMap := params.NewParamMap(conf)
t.Run("test we can convert a param map back to a config", func(t *testing.T) {
cParam := params.ConvertParamsToConfig(paramMap)
// these will not be equal as we have some information loss e.g. AllowEmpty
//require.Equal(t, conf, cParam)
// instead, compare the fields that survive the round trip
for name := range conf {
require.Equal(t, len(conf[name].Params), len(cParam[name].Params))
// sort both consistently
sort.Slice(conf[name].Params, func(i, j int) bool {
return conf[name].Params[i].Name() < conf[name].Params[j].Name()
})
sort.Slice(cParam[name].Params, func(i, j int) bool {
return cParam[name].Params[i].Name() < cParam[name].Params[j].Name()
})
for i, param := range conf[name].Params {
require.Equal(t, param.Required(), cParam[name].Params[i].Required())
require.Equal(t, param.Name(), cParam[name].Params[i].Name())
require.Equal(t, param.Description(), cParam[name].Params[i].Description())
}
}
})
}
// create via NewParamMap and add to command AddParamMapToCmd - check contents
func TestAddParamMapToCmd(t *testing.T) {
paramMap := params.NewParamMap(conf)
t.Run("test we can add hidden params to a command", func(t *testing.T) {
testCommand := &cobra.Command{
Use: "test",
Short: "Run a test",
Long: `Longer description`,
Run: func(cmd *cobra.Command, args []string) {
os.Exit(0)
},
}
params.AddParamMapToCmd(paramMap, testCommand, "collector", true)
// check we get an error on one which doesn't exist
_, err := testCommand.Flags().GetString("collector.config.random")
require.NotNil(t, err)
// check getting incorrect type
_, err = testCommand.Flags().GetString("collector.system.include_tables")
require.NotNil(t, err)
// check existence of all flags
directory, err := testCommand.Flags().GetString("collector.config.directory")
require.Nil(t, err)
require.Equal(t, "", directory)
includeTables, err := testCommand.Flags().GetStringSlice("collector.system.include_tables")
require.Nil(t, err)
require.Equal(t, []string{}, includeTables)
excludeTables, err := testCommand.Flags().GetStringSlice("collector.system.exclude_tables")
require.Nil(t, err)
require.Equal(t, []string{"distributed_ddl_queue", "query_thread_log", "query_log", "asynchronous_metric_log", "zookeeper"}, excludeTables)
rowLimit, err := testCommand.Flags().GetInt64("collector.system.row_limit")
require.Nil(t, err)
require.Equal(t, int64(100000), rowLimit)
format, err := testCommand.Flags().GetString("collector.reader.format")
require.Nil(t, err)
require.Equal(t, "csv", format)
collectArchives, err := testCommand.Flags().GetBool("collector.reader.collect_archives")
require.Nil(t, err)
require.Equal(t, true, collectArchives)
})
}
// test StringOptionsVar
func TestStringOptionsVar(t *testing.T) {
t.Run("test we can set", func(t *testing.T) {
format := params.StringOptionsVar{
Options: []string{"csv", "tsv", "native"},
Value: "csv",
}
require.Equal(t, "csv", format.String())
err := format.Set("tsv")
require.Nil(t, err)
require.Equal(t, "tsv", format.String())
})
t.Run("test set invalid", func(t *testing.T) {
format := params.StringOptionsVar{
Options: []string{"csv", "tsv", "native"},
Value: "csv",
}
require.Equal(t, "csv", format.String())
err := format.Set("random")
require.NotNil(t, err)
require.Equal(t, "random is not included in options: [csv tsv native]", err.Error())
})
}
// test StringSliceOptionsVar
func TestStringSliceOptionsVar(t *testing.T) {
t.Run("test we can set", func(t *testing.T) {
formats := params.StringSliceOptionsVar{
Options: []string{"csv", "tsv", "native", "qsv"},
Values: []string{"csv", "tsv"},
}
require.Equal(t, "[csv,tsv]", formats.String())
err := formats.Set("tsv,native")
require.Nil(t, err)
require.Equal(t, "[tsv,native]", formats.String())
})
t.Run("test set invalid", func(t *testing.T) {
formats := params.StringSliceOptionsVar{
Options: []string{"csv", "tsv", "native", "qsv"},
Values: []string{"csv", "tsv"},
}
require.Equal(t, "[csv,tsv]", formats.String())
err := formats.Set("tsv,random")
require.NotNil(t, err)
require.Equal(t, "[random] are not included in options: [csv tsv native qsv]", err.Error())
err = formats.Set("msv,random")
require.NotNil(t, err)
require.Equal(t, "[msv random] are not included in options: [csv tsv native qsv]", err.Error())
})
}
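The two tests above pin down the behaviour of a flag value restricted to a fixed list of options, including the exact error strings. As a rough illustration of how such a type can be built, here is a sketch against the standard library `flag.Value` interface (pflag's `Value` additionally requires a `Type()` method, omitted here). `optionsVar` and `sliceOptionsVar` are hypothetical stand-ins and not the `params` package implementation, which is not shown in this part of the diff.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// optionsVar is a hypothetical flag.Value accepting only values from Options.
type optionsVar struct {
	Options []string
	Value   string
}

func (o *optionsVar) String() string { return o.Value }

// Set stores v if it is one of the allowed options and errors otherwise.
func (o *optionsVar) Set(v string) error {
	for _, opt := range o.Options {
		if opt == v {
			o.Value = v
			return nil
		}
	}
	return fmt.Errorf("%s is not included in options: %v", v, o.Options)
}

// sliceOptionsVar is the multi-value variant; Set takes a comma-separated list.
type sliceOptionsVar struct {
	Options []string
	Values  []string
}

func (s *sliceOptionsVar) String() string {
	return "[" + strings.Join(s.Values, ",") + "]"
}

// Set replaces Values only when every item is allowed; otherwise it reports
// all offending items at once and leaves Values untouched.
func (s *sliceOptionsVar) Set(v string) error {
	allowed := make(map[string]bool, len(s.Options))
	for _, opt := range s.Options {
		allowed[opt] = true
	}
	var values, invalid []string
	for _, item := range strings.Split(v, ",") {
		if allowed[item] {
			values = append(values, item)
		} else {
			invalid = append(invalid, item)
		}
	}
	if len(invalid) > 0 {
		return fmt.Errorf("%v are not included in options: %v", invalid, s.Options)
	}
	s.Values = values
	return nil
}

// Compile-time checks that both types satisfy flag.Value.
var (
	_ flag.Value = (*optionsVar)(nil)
	_ flag.Value = (*sliceOptionsVar)(nil)
)

func main() {
	format := &optionsVar{Options: []string{"csv", "tsv", "native"}, Value: "csv"}
	fmt.Println(format.Set("random"))

	formats := &sliceOptionsVar{Options: []string{"csv", "tsv", "native", "qsv"}, Values: []string{"csv", "tsv"}}
	fmt.Println(formats.Set("tsv,native"), formats.String())
}
```

Validating inside `Set` means an invalid value is rejected while flags are parsed, before any collector or output code runs.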

View File

@ -0,0 +1,173 @@
package cmd
import (
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/utils"
"github.com/pkg/errors"
"github.com/rs/zerolog"
"github.com/rs/zerolog/log"
"github.com/spf13/cobra"
"github.com/spf13/viper"
"net/http"
_ "net/http/pprof"
"os"
"strings"
"time"
)
func enableDebug() {
if debug {
zerolog.SetGlobalLevel(zerolog.DebugLevel)
go func() {
log.Debug().Msg("starting debugger on localhost:8080")
// ListenAndServe blocks while serving and always returns a non-nil error, so we log before the call
if err := http.ListenAndServe("localhost:8080", nil); err != nil {
log.Error().Err(err).Msg("unable to start debugger")
}
}()
}
}
var rootCmd = &cobra.Command{
Use: "clickhouse-diagnostics",
Short: "Capture and convert ClickHouse diagnostic bundles.",
Long: `Captures ClickHouse diagnostic bundles to a number of supported formats, including file and ClickHouse itself. Converts bundles between formats.`,
PersistentPreRunE: func(cmd *cobra.Command, args []string) error {
enableDebug()
err := initializeConfig()
if err != nil {
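// a zerolog event is only written once Msg or Send is called
log.Error().Err(err).Msg("unable to initialize configuration")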
os.Exit(1)
}
return nil
},
Example: `clickhouse-diagnostics collect`,
}
const (
colorRed = iota + 31
colorGreen
colorYellow
colorMagenta = 35
colorBold = 1
)
const TimeFormat = time.RFC3339
var debug bool
var configFiles []string
const (
// The environment variable prefix of all environment variables bound to our command line flags.
// For example, --output is bound to CLICKHOUSE_DIAGNOSTIC_OUTPUT.
envPrefix = "CLICKHOUSE_DIAGNOSTIC"
)
func init() {
rootCmd.PersistentFlags().BoolVarP(&debug, "debug", "d", false, "Enable debug mode")
rootCmd.PersistentFlags().StringSliceVarP(&configFiles, "config", "f", []string{"clickhouse-diagnostics.yml", "/etc/clickhouse-diagnostics.yml"}, "Configuration file path")
// set a usage template to ensure flags on root are listed as global
rootCmd.SetUsageTemplate(`Usage:{{if .Runnable}}
{{.UseLine}}{{end}}{{if .HasAvailableSubCommands}}
{{.CommandPath}} [command]{{end}}{{if gt (len .Aliases) 0}}
Aliases:
{{.NameAndAliases}}{{end}}{{if .HasExample}}
Examples:
{{.Example}}{{end}}{{if .HasAvailableSubCommands}}
Available Commands:{{range .Commands}}{{if (or .IsAvailableCommand (eq .Name "help"))}}
{{rpad .Name .NamePadding }} {{.Short}}{{end}}{{end}}{{end}}{{if .HasAvailableLocalFlags}}
Global Flags:
{{.LocalFlags.FlagUsages | trimTrailingWhitespaces}}{{end}}{{if .HasAvailableInheritedFlags}}
Additional help topics:{{range .Commands}}{{if .IsAdditionalHelpTopicCommand}}
{{rpad .CommandPath .CommandPathPadding}} {{.Short}}{{end}}{{end}}{{end}}{{if .HasAvailableSubCommands}}
Use "{{.CommandPath}} [command] --help" for more information about a command.{{end}}
`)
rootCmd.SetFlagErrorFunc(handleFlagErrors)
}
func Execute() {
// logs go to stderr - stdout is exclusive for outputs e.g. tables
output := zerolog.ConsoleWriter{Out: os.Stderr, TimeFormat: TimeFormat}
// override the colors
output.FormatLevel = func(i interface{}) string {
var l string
if ll, ok := i.(string); ok {
switch ll {
case zerolog.LevelTraceValue:
l = colorize("TRC", colorMagenta)
case zerolog.LevelDebugValue:
l = colorize("DBG", colorMagenta)
case zerolog.LevelInfoValue:
l = colorize("INF", colorGreen)
case zerolog.LevelWarnValue:
l = colorize(colorize("WRN", colorYellow), colorBold)
case zerolog.LevelErrorValue:
l = colorize(colorize("ERR", colorRed), colorBold)
case zerolog.LevelFatalValue:
l = colorize(colorize("FTL", colorRed), colorBold)
case zerolog.LevelPanicValue:
l = colorize(colorize("PNC", colorRed), colorBold)
default:
l = colorize("???", colorBold)
}
} else {
if i == nil {
l = colorize("???", colorBold)
} else {
l = strings.ToUpper(fmt.Sprintf("%s", i))[0:3]
}
}
return l
}
output.FormatTimestamp = func(i interface{}) string {
tt := i.(string)
return colorize(tt, colorGreen)
}
log.Logger = log.Output(output)
zerolog.SetGlobalLevel(zerolog.InfoLevel)
rootCmd.SetHelpCommand(helpCmd)
if err := rootCmd.Execute(); err != nil {
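// a zerolog event is only written (and Fatal only exits) once Msg or Send is called
log.Fatal().Err(err).Msg("command failed")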
}
}
// colorize returns the string s wrapped in ANSI code c
func colorize(s interface{}, c int) string {
return fmt.Sprintf("\x1b[%dm%v\x1b[0m", c, s)
}
func handleFlagErrors(cmd *cobra.Command, err error) error {
fmt.Println(colorize(colorize(fmt.Sprintf("Error: %s\n", err), colorRed), colorBold))
_ = cmd.Help()
os.Exit(1)
return nil
}
func initializeConfig() error {
// we use the first config file we find
var configFile string
for _, confFile := range configFiles {
if ok, _ := utils.FileExists(confFile); ok {
configFile = confFile
break
}
}
if configFile == "" {
log.Warn().Msgf("config file in %s not found - config file will be ignored", configFiles)
return nil
}
viper.SetConfigFile(configFile)
if err := viper.ReadInConfig(); err != nil {
return errors.Wrapf(err, "Unable to read configuration file at %s", configFile)
}
return nil
}

View File

@ -0,0 +1,24 @@
package cmd
import (
"fmt"
"github.com/spf13/cobra"
)
var (
Version = "" // set at compile time with -ldflags "-X github.com/ClickHouse/clickhouse-diagnostics/cmd.Version=x.y.z"
Commit = ""
)
func init() {
rootCmd.AddCommand(versionCmd)
}
var versionCmd = &cobra.Command{
Use: "version",
Short: "Print the version number of clickhouse-diagnostics",
Long: `All software has versions. This is clickhouse-diagnostics`,
Run: func(cmd *cobra.Command, args []string) {
fmt.Printf("ClickHouse Diagnostics %s (%s)\n", Version, Commit)
},
}
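`Version` and `Commit` are empty in source and are filled in by the linker, as the Makefile's `-ldflags "-X ..."` settings show. The sketch below illustrates the same mechanism in a single-file program. The `dev` and `unknown` fallbacks and the `versionString` helper are additions for illustration and do not exist in the tool.

```go
package main

import "fmt"

// Version and Commit are meant to be overridden at link time, for example:
//
//	go build -ldflags "-X main.Version=v1.2.3 -X main.Commit=abc1234"
//
// The -X flag only works on package-level string variables.
var (
	Version = ""
	Commit  = ""
)

// versionString renders the banner, substituting placeholders when the
// linker did not set a value (hypothetical fallback, not in the tool).
func versionString(version, commit string) string {
	if version == "" {
		version = "dev"
	}
	if commit == "" {
		commit = "unknown"
	}
	return fmt.Sprintf("ClickHouse Diagnostics %s (%s)", version, commit)
}

func main() {
	fmt.Println(versionString(Version, Commit))
}
```

Run with plain `go run`, this prints the placeholder banner; built with the `-ldflags` shown in the comment, it prints the injected values.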

View File

@ -0,0 +1,91 @@
module github.com/ClickHouse/clickhouse-diagnostics
go 1.17
require (
github.com/ClickHouse/clickhouse-go/v2 v2.0.12
github.com/DATA-DOG/go-sqlmock v1.5.0
github.com/Masterminds/semver v1.5.0
github.com/bmatcuk/doublestar/v4 v4.0.2
github.com/elastic/gosigar v0.14.2
github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510
github.com/jaypipes/ghw v0.8.0
github.com/matishsiao/goInfo v0.0.0-20210923090445-da2e3fa8d45f
github.com/mholt/archiver/v4 v4.0.0-alpha.4
github.com/olekukonko/tablewriter v0.0.5
github.com/pkg/errors v0.9.1
github.com/rs/zerolog v1.26.1
github.com/spf13/cobra v1.3.0
github.com/spf13/pflag v1.0.5
github.com/spf13/viper v1.10.1
github.com/stretchr/testify v1.7.0
github.com/testcontainers/testcontainers-go v0.12.0
github.com/yargevad/filepathx v1.0.0
gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b
)
require (
github.com/Azure/go-ansiterm v0.0.0-20170929234023-d6e3b3328b78 // indirect
github.com/Microsoft/go-winio v0.4.17-0.20210211115548-6eac466e5fa3 // indirect
github.com/Microsoft/hcsshim v0.8.16 // indirect
github.com/StackExchange/wmi v0.0.0-20190523213315-cbe66965904d // indirect
github.com/andybalholm/brotli v1.0.4 // indirect
github.com/cenkalti/backoff v2.2.1+incompatible // indirect
github.com/containerd/cgroups v0.0.0-20210114181951-8a68de567b68 // indirect
github.com/containerd/containerd v1.5.0-beta.4 // indirect
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/docker/distribution v2.7.1+incompatible // indirect
github.com/docker/docker v20.10.11+incompatible // indirect
github.com/docker/go-connections v0.4.0 // indirect
github.com/docker/go-units v0.4.0 // indirect
github.com/dsnet/compress v0.0.1 // indirect
github.com/fsnotify/fsnotify v1.5.1 // indirect
github.com/ghodss/yaml v1.0.0 // indirect
github.com/go-ole/go-ole v1.2.4 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect
github.com/golang/protobuf v1.5.2 // indirect
github.com/golang/snappy v0.0.4 // indirect
github.com/google/uuid v1.3.0 // indirect
github.com/hashicorp/hcl v1.0.0 // indirect
github.com/inconshreveable/mousetrap v1.0.0 // indirect
github.com/jaypipes/pcidb v0.6.0 // indirect
github.com/klauspost/compress v1.13.6 // indirect
github.com/klauspost/pgzip v1.2.5 // indirect
github.com/magiconair/properties v1.8.5 // indirect
github.com/mattn/go-runewidth v0.0.9 // indirect
github.com/mitchellh/go-homedir v1.1.0 // indirect
github.com/mitchellh/mapstructure v1.4.3 // indirect
github.com/moby/sys/mount v0.2.0 // indirect
github.com/moby/sys/mountinfo v0.5.0 // indirect
github.com/moby/term v0.0.0-20201216013528-df9cb8a40635 // indirect
github.com/morikuni/aec v0.0.0-20170113033406-39771216ff4c // indirect
github.com/nwaples/rardecode/v2 v2.0.0-beta.2 // indirect
github.com/opencontainers/go-digest v1.0.0 // indirect
github.com/opencontainers/image-spec v1.0.1 // indirect
github.com/opencontainers/runc v1.0.2 // indirect
github.com/paulmach/orb v0.4.0 // indirect
github.com/pelletier/go-toml v1.9.4 // indirect
github.com/pierrec/lz4/v4 v4.1.14 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/shopspring/decimal v1.3.1 // indirect
github.com/sirupsen/logrus v1.8.1 // indirect
github.com/spf13/afero v1.8.0 // indirect
github.com/spf13/cast v1.4.1 // indirect
github.com/spf13/jwalterweatherman v1.1.0 // indirect
github.com/subosito/gotenv v1.2.0 // indirect
github.com/therootcompany/xz v1.0.1 // indirect
github.com/ulikunitz/xz v0.5.10 // indirect
go.opencensus.io v0.23.0 // indirect
go.opentelemetry.io/otel v1.4.1 // indirect
go.opentelemetry.io/otel/trace v1.4.1 // indirect
golang.org/x/net v0.0.0-20211108170745-6635138e15ea // indirect
golang.org/x/sys v0.0.0-20220114195835-da31bd327af9 // indirect
golang.org/x/text v0.3.7 // indirect
google.golang.org/genproto v0.0.0-20211208223120-3a66f561d7aa // indirect
google.golang.org/grpc v1.43.0 // indirect
google.golang.org/protobuf v1.27.1 // indirect
gopkg.in/ini.v1 v1.66.2 // indirect
gopkg.in/yaml.v2 v2.4.0 // indirect
howett.net/plist v0.0.0-20181124034731-591f970eefbb // indirect
)

File diff suppressed because it is too large

View File

@ -0,0 +1,112 @@
package clickhouse
import (
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/utils"
"github.com/pkg/errors"
"path/filepath"
)
type ConfigCollector struct {
resourceManager *platform.ResourceManager
}
func NewConfigCollector(m *platform.ResourceManager) *ConfigCollector {
return &ConfigCollector{
resourceManager: m,
}
}
const DefaultConfigLocation = "/etc/clickhouse-server/"
const ProcessedConfigurationLocation = "/var/lib/clickhouse/preprocessed_configs"
func (c ConfigCollector) Collect(conf config.Configuration) (*data.DiagnosticBundle, error) {
conf, err := conf.ValidateConfig(c.Configuration())
if err != nil {
return &data.DiagnosticBundle{}, err
}
directory, err := config.ReadStringValue(conf, "directory")
if err != nil {
return &data.DiagnosticBundle{}, err
}
if directory != "" {
// user has specified a directory - we therefore skip all other efforts to locate the config
frame, errs := data.NewConfigFileFrame(directory)
return &data.DiagnosticBundle{
Frames: map[string]data.Frame{
"user_specified": frame,
},
Errors: data.FrameErrors{Errors: errs},
}, nil
}
configCandidates, err := FindConfigurationFiles()
if err != nil {
return &data.DiagnosticBundle{}, errors.Wrapf(err, "Unable to find configuration files")
}
frames := make(map[string]data.Frame)
var frameErrors []error
for frameName, confDir := range configCandidates {
frame, errs := data.NewConfigFileFrame(confDir)
frameErrors = append(frameErrors, errs...)
frames[frameName] = frame
}
return &data.DiagnosticBundle{
Frames: frames,
Errors: data.FrameErrors{Errors: frameErrors},
}, nil
}
func FindConfigurationFiles() (map[string]string, error) {
configCandidates := map[string]string{
"default": DefaultConfigLocation,
"preprocessed": ProcessedConfigurationLocation,
}
// we don't know specifically where the config is but try to find via processes
processConfigs, err := utils.FindConfigsFromClickHouseProcesses()
if err != nil {
return nil, err
}
for i, path := range processConfigs {
confDir := filepath.Dir(path)
if len(processConfigs) == 1 {
configCandidates["process"] = confDir
break
}
configCandidates[fmt.Sprintf("process_%d", i)] = confDir
}
return configCandidates, nil
}
func (c ConfigCollector) Configuration() config.Configuration {
return config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "",
Param: config.NewParam("directory", "Specify the location of the configuration files for ClickHouse Server e.g. /etc/clickhouse-server/", false),
AllowEmpty: true,
},
},
}
}
func (c ConfigCollector) Description() string {
return "Collects the ClickHouse configuration from the local filesystem."
}
func (c ConfigCollector) IsDefault() bool {
return true
}
// here we register the collector for use
func init() {
collectors.Register("config", func() (collectors.Collector, error) {
return &ConfigCollector{
resourceManager: platform.GetResourceManager(),
}, nil
})
}
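`FindConfigurationFiles` names the directories it discovers from running processes `process` when there is exactly one and `process_<index>` otherwise. The sketch below isolates that naming rule so it can be run without a ClickHouse process. `nameCandidates` is a hypothetical helper and the paths are made-up examples assuming a Unix-style filesystem.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
)

// nameCandidates maps config file paths to frame names: a single path is
// stored under "process", several paths under "process_0", "process_1", ...
func nameCandidates(processConfigs []string) map[string]string {
	candidates := make(map[string]string)
	for i, path := range processConfigs {
		confDir := filepath.Dir(path)
		if len(processConfigs) == 1 {
			candidates["process"] = confDir
			break
		}
		candidates[fmt.Sprintf("process_%d", i)] = confDir
	}
	return candidates
}

func main() {
	// Example paths only; nothing is read from disk.
	multi := nameCandidates([]string{
		"/etc/clickhouse-server/config.xml",
		"/opt/clickhouse/config.xml",
	})
	// Sort the keys because Go map iteration order is random.
	keys := make([]string, 0, len(multi))
	for key := range multi {
		keys = append(keys, key)
	}
	sort.Strings(keys)
	for _, key := range keys {
		fmt.Println(key, "->", multi[key])
	}
}
```

Note that two processes sharing one config directory still yield two entries, because the key is derived from the index rather than the directory.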

View File

@ -0,0 +1,127 @@
package clickhouse_test
import (
"encoding/xml"
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors/clickhouse"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/stretchr/testify/require"
"io"
"os"
"path"
"testing"
)
func TestConfigConfiguration(t *testing.T) {
t.Run("correct configuration is returned for config collector", func(t *testing.T) {
configCollector := clickhouse.NewConfigCollector(&platform.ResourceManager{})
conf := configCollector.Configuration()
require.Len(t, conf.Params, 1)
// check first param
require.IsType(t, config.StringParam{}, conf.Params[0])
directory, ok := conf.Params[0].(config.StringParam)
require.True(t, ok)
require.False(t, directory.Required())
require.Equal(t, "directory", directory.Name())
require.Equal(t, "", directory.Value)
})
}
func TestConfigCollect(t *testing.T) {
configCollector := clickhouse.NewConfigCollector(&platform.ResourceManager{})
t.Run("test default file collector configuration", func(t *testing.T) {
diagSet, err := configCollector.Collect(config.Configuration{})
require.Nil(t, err)
require.NotNil(t, diagSet)
// we won't be able to collect the default and preprocessed configs, even if clickhouse is installed:
// these directories should not be readable under the permissions these tests are realistically executed with.
// note: we may also pick up configs from a local clickhouse process - we thus allow a len >= 2 but don't check this
// further as it's non-deterministic
require.GreaterOrEqual(t, len(diagSet.Frames), 2)
// check default key
require.Contains(t, diagSet.Frames, "default")
require.Equal(t, "/etc/clickhouse-server/", diagSet.Frames["default"].Name())
require.Equal(t, []string{"config"}, diagSet.Frames["default"].Columns())
// collection will have failed
checkFrame(t, diagSet.Frames["default"], nil)
// check preprocessed key
require.Contains(t, diagSet.Frames, "preprocessed")
require.Equal(t, "/var/lib/clickhouse/preprocessed_configs", diagSet.Frames["preprocessed"].Name())
require.Equal(t, []string{"config"}, diagSet.Frames["preprocessed"].Columns())
// min of 2 - might be more if a local installation of clickhouse is running
require.GreaterOrEqual(t, len(diagSet.Errors.Errors), 2)
})
t.Run("test configuration when specified", func(t *testing.T) {
// create some test files
tempDir := t.TempDir()
confDir := path.Join(tempDir, "conf")
// create an includes file
includesDir := path.Join(tempDir, "includes")
err := os.MkdirAll(includesDir, os.ModePerm)
require.Nil(t, err)
includesPath := path.Join(includesDir, "random.xml")
includesFile, err := os.Create(includesPath)
require.Nil(t, err)
xmlWriter := io.Writer(includesFile)
enc := xml.NewEncoder(xmlWriter)
enc.Indent(" ", " ")
xmlConfig := data.XmlConfig{
XMLName: xml.Name{},
Clickhouse: data.XmlLoggerConfig{
XMLName: xml.Name{},
ErrorLog: "/var/log/clickhouse-server/clickhouse-server.err.log",
Log: "/var/log/clickhouse-server/clickhouse-server.log",
},
IncludeFrom: "",
}
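err = enc.Encode(xmlConfig)
require.Nil(t, err)
// close the file so the temp dir can be cleaned up on every platform
require.Nil(t, includesFile.Close())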
// create 5 temporary config files - length is 6 for the included file
rows := make([][]interface{}, 6)
for i := 0; i < 5; i++ {
if i == 4 {
// set the includes for the last doc
xmlConfig.IncludeFrom = includesPath
}
// we want to check hierarchies are walked so create a simple folder for each file
fileDir := path.Join(confDir, fmt.Sprintf("%d", i))
err := os.MkdirAll(fileDir, os.ModePerm)
require.Nil(t, err)
filepath := path.Join(fileDir, fmt.Sprintf("random-%d.xml", i))
row := make([]interface{}, 1)
row[0] = data.XmlConfigFile{Path: filepath}
rows[i] = row
xmlFile, err := os.Create(filepath)
require.Nil(t, err)
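// write a little xml so it's valid
xmlConfig := xmlConfig
xmlWriter := io.Writer(xmlFile)
enc := xml.NewEncoder(xmlWriter)
enc.Indent(" ", " ")
err = enc.Encode(xmlConfig)
require.Nil(t, err)
// close the file so the temp dir can be cleaned up on every platform
require.Nil(t, xmlFile.Close())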
}
diagSet, err := configCollector.Collect(config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: confDir,
Param: config.NewParam("directory", "File locations", false),
},
},
})
require.Nil(t, err)
require.NotNil(t, diagSet)
require.Len(t, diagSet.Frames, 1)
require.Contains(t, diagSet.Frames, "user_specified")
require.Equal(t, confDir, diagSet.Frames["user_specified"].Name())
require.Equal(t, []string{"config"}, diagSet.Frames["user_specified"].Columns())
iConf := make([]interface{}, 1)
iConf[0] = data.XmlConfigFile{Path: includesPath, Included: true}
rows[5] = iConf
checkFrame(t, diagSet.Frames["user_specified"], rows)
})
}

View File

@ -0,0 +1,108 @@
package clickhouse
import (
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/pkg/errors"
)
type DBLogTable struct {
orderBy data.OrderBy
excludeColumns []string
}
var DbLogTables = map[string]DBLogTable{
"query_log": {
orderBy: data.OrderBy{
Column: "event_time_microseconds",
Order: data.Asc,
},
excludeColumns: []string{},
},
"query_thread_log": {
orderBy: data.OrderBy{
Column: "event_time_microseconds",
Order: data.Asc,
},
excludeColumns: []string{},
},
"text_log": {
orderBy: data.OrderBy{
Column: "event_time_microseconds",
Order: data.Asc,
},
excludeColumns: []string{},
},
}
// DBLogsCollector collects the log tables listed in DbLogTables from the ClickHouse system database
type DBLogsCollector struct {
resourceManager *platform.ResourceManager
}
func NewDBLogsCollector(m *platform.ResourceManager) *DBLogsCollector {
return &DBLogsCollector{
resourceManager: m,
}
}
func (dc *DBLogsCollector) Collect(conf config.Configuration) (*data.DiagnosticBundle, error) {
conf, err := conf.ValidateConfig(dc.Configuration())
if err != nil {
return &data.DiagnosticBundle{}, err
}
rowLimit, err := config.ReadIntValue(conf, "row_limit")
if err != nil {
return &data.DiagnosticBundle{}, err
}
frames := make(map[string]data.Frame)
var frameErrors []error
for logTable, tableConfig := range DbLogTables {
frame, err := dc.resourceManager.DbClient.ReadTable("system", logTable, tableConfig.excludeColumns, tableConfig.orderBy, rowLimit)
if err != nil {
frameErrors = append(frameErrors, errors.Wrapf(err, "Unable to collect %s", logTable))
} else {
frames[logTable] = frame
}
}
fErrors := data.FrameErrors{
Errors: frameErrors,
}
return &data.DiagnosticBundle{
Frames: frames,
Errors: fErrors,
}, nil
}
func (dc *DBLogsCollector) Configuration() config.Configuration {
return config.Configuration{
Params: []config.ConfigParam{
config.IntParam{
Value: 100000,
Param: config.NewParam("row_limit", "Maximum number of log rows to collect. Negative values mean unlimited", false),
},
},
}
}
func (dc *DBLogsCollector) IsDefault() bool {
return true
}
func (dc *DBLogsCollector) Description() string {
return "Collects the ClickHouse logs directly from the database."
}
// here we register the collector for use
func init() {
collectors.Register("db_logs", func() (collectors.Collector, error) {
return &DBLogsCollector{
resourceManager: platform.GetResourceManager(),
}, nil
})
}

View File

@ -0,0 +1,118 @@
package clickhouse_test
import (
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors/clickhouse"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/test"
"github.com/stretchr/testify/require"
"testing"
)
func TestDbLogsConfiguration(t *testing.T) {
t.Run("correct configuration is returned for db logs collector", func(t *testing.T) {
client := test.NewFakeClickhouseClient(make(map[string][]string))
dbLogsCollector := clickhouse.NewDBLogsCollector(&platform.ResourceManager{
DbClient: client,
})
conf := dbLogsCollector.Configuration()
require.Len(t, conf.Params, 1)
require.IsType(t, config.IntParam{}, conf.Params[0])
rowLimit, ok := conf.Params[0].(config.IntParam)
require.True(t, ok)
require.False(t, rowLimit.Required())
require.Equal(t, "row_limit", rowLimit.Name())
require.Equal(t, int64(100000), rowLimit.Value)
})
}
func TestDbLogsCollect(t *testing.T) {
client := test.NewFakeClickhouseClient(make(map[string][]string))
dbLogsCollector := clickhouse.NewDBLogsCollector(&platform.ResourceManager{
DbClient: client,
})
queryLogColumns := []string{"type", "event_date", "event_time", "event_time_microseconds",
"query_start_time", "query_start_time_microseconds", "query_duration_ms", "read_rows", "read_bytes", "written_rows", "written_bytes",
"result_rows", "result_bytes", "memory_usage", "current_database", "query", "formatted_query", "normalized_query_hash",
"query_kind", "databases", "tables", "columns", "projections", "views", "exception_code", "exception", "stack_trace",
"is_initial_query", "user", "query_id", "address", "port", "initial_user", "initial_query_id", "initial_address", "initial_port",
"initial_query_start_time", "initial_query_start_time_microseconds", "interface", "os_user", "client_hostname", "client_name",
"client_revision", "client_version_major", "client_version_minor", "client_version_patch", "http_method", "http_user_agent",
"http_referer", "forwarded_for", "quota_key", "revision", "log_comment", "thread_ids", "ProfileEvents", "Settings",
"used_aggregate_functions", "used_aggregate_function_combinators", "used_database_engines", "used_data_type_families",
"used_dictionaries", "used_formats", "used_functions", "used_storages", "used_table_functions"}
queryLogFrame := test.NewFakeDataFrame("queryLog", queryLogColumns,
[][]interface{}{
{"QueryStart", "2021-12-13", "2021-12-13 12:53:20", "2021-12-13 12:53:20.590579", "2021-12-13 12:53:20", "2021-12-13 12:53:20.590579", "0", "0", "0", "0", "0", "0", "0", "0", "default", "SELECT DISTINCT arrayJoin(extractAll(name, '[\\w_]{2,}')) AS res FROM (SELECT name FROM system.functions UNION ALL SELECT name FROM system.table_engines UNION ALL SELECT name FROM system.formats UNION ALL SELECT name FROM system.table_functions UNION ALL SELECT name FROM system.data_type_families UNION ALL SELECT name FROM system.merge_tree_settings UNION ALL SELECT name FROM system.settings UNION ALL SELECT cluster FROM system.clusters UNION ALL SELECT macro FROM system.macros UNION ALL SELECT policy_name FROM system.storage_policies UNION ALL SELECT concat(func.name, comb.name) FROM system.functions AS func CROSS JOIN system.aggregate_function_combinators AS comb WHERE is_aggregate UNION ALL SELECT name FROM system.databases LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.tables LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.dictionaries LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.columns LIMIT 10000) WHERE notEmpty(res)", "", "6666026786019643712", "Select", "['system']", "['system.aggregate_function_combinators','system.clusters','system.columns','system.data_type_families','system.databases','system.dictionaries','system.formats','system.functions','system.macros','system.merge_tree_settings','system.settings','system.storage_policies','system.table_engines','system.table_functions','system.tables']", "['system.aggregate_function_combinators.name','system.clusters.cluster','system.columns.name','system.data_type_families.name','system.databases.name','system.dictionaries.name','system.formats.name','system.functions.is_aggregate','system.functions.name','system.macros.macro','system.merge_tree_settings.name','system.settings.name','system.storage_policies.policy_name','system.table_engines.name','system.table_functions.name','system.tables.name']", 
"[]", "[]", "0", "", "", "1", "default", "3b5feb6d-3086-4718-adb2-17464988ff12", "::ffff:127.0.0.1", "50920", "default", "3b5feb6d-3086-4718-adb2-17464988ff12", "::ffff:127.0.0.1", "50920", "2021-12-13 12:53:30", "2021-12-13 12:53:30.590579", "1", "", "", "ClickHouse client", "54450", "21", "11", "0", "0", "", "", "", "", "54456", "", "[]", "{}", "{'load_balancing':'random','max_memory_usage':'10000000000'}", "[]", "[]", "[]", "[]", "[]", "[]", "[]", "[]", "[]"},
{"QueryFinish", "2021-12-13", "2021-12-13 12:53:30", "2021-12-13 12:53:30.607292", "2021-12-13 12:53:30", "2021-12-13 12:53:30.590579", "15", "4512", "255694", "0", "0", "4358", "173248", "4415230", "default", "SELECT DISTINCT arrayJoin(extractAll(name, '[\\w_]{2,}')) AS res FROM (SELECT name FROM system.functions UNION ALL SELECT name FROM system.table_engines UNION ALL SELECT name FROM system.formats UNION ALL SELECT name FROM system.table_functions UNION ALL SELECT name FROM system.data_type_families UNION ALL SELECT name FROM system.merge_tree_settings UNION ALL SELECT name FROM system.settings UNION ALL SELECT cluster FROM system.clusters UNION ALL SELECT macro FROM system.macros UNION ALL SELECT policy_name FROM system.storage_policies UNION ALL SELECT concat(func.name, comb.name) FROM system.functions AS func CROSS JOIN system.aggregate_function_combinators AS comb WHERE is_aggregate UNION ALL SELECT name FROM system.databases LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.tables LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.dictionaries LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.columns LIMIT 10000) WHERE notEmpty(res)", "", "6666026786019643712", "Select", "['system']", "['system.aggregate_function_combinators','system.clusters','system.columns','system.data_type_families','system.databases','system.dictionaries','system.formats','system.functions','system.macros','system.merge_tree_settings','system.settings','system.storage_policies','system.table_engines','system.table_functions','system.tables']", 
"['system.aggregate_function_combinators.name','system.clusters.cluster','system.columns.name','system.data_type_families.name','system.databases.name','system.dictionaries.name','system.formats.name','system.functions.is_aggregate','system.functions.name','system.macros.macro','system.merge_tree_settings.name','system.settings.name','system.storage_policies.policy_name','system.table_engines.name','system.table_functions.name','system.tables.name']", "[]", "[]", "0", "", "", "1", "default", "3b5feb6d-3086-4718-adb2-17464988ff12", "::ffff:127.0.0.1", "50920", "default", "3b5feb6d-3086-4718-adb2-17464988ff12", "::ffff:127.0.0.1", "50920", "2021-12-13 12:53:30", "2021-12-13 12:53:30.590579", "1", "", "", "ClickHouse client", "54450", "21", "11", "0", "0", "", "", "", "", "54456", "", "[95298,95315,95587,95316,95312,95589,95318,95586,95588,95585]", "{'Query':1,'SelectQuery':1,'ArenaAllocChunks':41,'ArenaAllocBytes':401408,'FunctionExecute':62,'NetworkSendElapsedMicroseconds':463,'NetworkSendBytes':88452,'SelectedRows':4512,'SelectedBytes':255694,'RegexpCreated':6,'ContextLock':411,'RWLockAcquiredReadLocks':190,'RealTimeMicroseconds':49221,'UserTimeMicroseconds':19811,'SystemTimeMicroseconds':2817,'SoftPageFaults':1128,'OSCPUWaitMicroseconds':127,'OSCPUVirtualTimeMicroseconds':22624,'OSWriteBytes':12288,'OSWriteChars':13312}", "{'load_balancing':'random','max_memory_usage':'10000000000'}", "[]", "[]", "[]", "[]", "[]", "[]", "['concat','notEmpty','extractAll']", "[]", "[]"},
{"QueryStart", "2021-12-13", "2021-12-13 13:02:53", "2021-12-13 13:02:53.419528", "2021-12-13 13:02:53", "2021-12-13 13:02:53.419528", "0", "0", "0", "0", "0", "0", "0", "0", "default", "SELECT DISTINCT arrayJoin(extractAll(name, '[\\w_]{2,}')) AS res FROM (SELECT name FROM system.functions UNION ALL SELECT name FROM system.table_engines UNION ALL SELECT name FROM system.formats UNION ALL SELECT name FROM system.table_functions UNION ALL SELECT name FROM system.data_type_families UNION ALL SELECT name FROM system.merge_tree_settings UNION ALL SELECT name FROM system.settings UNION ALL SELECT cluster FROM system.clusters UNION ALL SELECT macro FROM system.macros UNION ALL SELECT policy_name FROM system.storage_policies UNION ALL SELECT concat(func.name, comb.name) FROM system.functions AS func CROSS JOIN system.aggregate_function_combinators AS comb WHERE is_aggregate UNION ALL SELECT name FROM system.databases LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.tables LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.dictionaries LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.columns LIMIT 10000) WHERE notEmpty(res)", "", "6666026786019643712", "Select", "['system']", "['system.aggregate_function_combinators','system.clusters','system.columns','system.data_type_families','system.databases','system.dictionaries','system.formats','system.functions','system.macros','system.merge_tree_settings','system.settings','system.storage_policies','system.table_engines','system.table_functions','system.tables']", "['system.aggregate_function_combinators.name','system.clusters.cluster','system.columns.name','system.data_type_families.name','system.databases.name','system.dictionaries.name','system.formats.name','system.functions.is_aggregate','system.functions.name','system.macros.macro','system.merge_tree_settings.name','system.settings.name','system.storage_policies.policy_name','system.table_engines.name','system.table_functions.name','system.tables.name']", 
"[]", "[]", "0", "", "", "1", "default", "351b58e4-6128-47d4-a7b8-03d78c1f84c6", "::ffff:127.0.0.1", "50968", "default", "351b58e4-6128-47d4-a7b8-03d78c1f84c6", "::ffff:127.0.0.1", "50968", "2021-12-13 13:02:53", "2021-12-13 13:02:53.419528", "1", "", "", "ClickHouse client", "54450", "21", "11", "0", "0", "", "", "", "", "54456", "", "[]", "{}", "{'load_balancing':'random','max_memory_usage':'10000000000'}", "[]", "[]", "[]", "[]", "[]", "[]", "[]", "[]", "[]"},
{"QueryFinish", "2021-12-13", "2021-12-13 13:02:56", "2021-12-13 13:02:56.437115", "2021-12-13 13:02:56", "2021-12-13 13:02:56.419528", "16", "4629", "258376", "0", "0", "4377", "174272", "4404694", "default", "SELECT DISTINCT arrayJoin(extractAll(name, '[\\w_]{2,}')) AS res FROM (SELECT name FROM system.functions UNION ALL SELECT name FROM system.table_engines UNION ALL SELECT name FROM system.formats UNION ALL SELECT name FROM system.table_functions UNION ALL SELECT name FROM system.data_type_families UNION ALL SELECT name FROM system.merge_tree_settings UNION ALL SELECT name FROM system.settings UNION ALL SELECT cluster FROM system.clusters UNION ALL SELECT macro FROM system.macros UNION ALL SELECT policy_name FROM system.storage_policies UNION ALL SELECT concat(func.name, comb.name) FROM system.functions AS func CROSS JOIN system.aggregate_function_combinators AS comb WHERE is_aggregate UNION ALL SELECT name FROM system.databases LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.tables LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.dictionaries LIMIT 10000 UNION ALL SELECT DISTINCT name FROM system.columns LIMIT 10000) WHERE notEmpty(res)", "", "6666026786019643712", "Select", "['system']", "['system.aggregate_function_combinators','system.clusters','system.columns','system.data_type_families','system.databases','system.dictionaries','system.formats','system.functions','system.macros','system.merge_tree_settings','system.settings','system.storage_policies','system.table_engines','system.table_functions','system.tables']", 
"['system.aggregate_function_combinators.name','system.clusters.cluster','system.columns.name','system.data_type_families.name','system.databases.name','system.dictionaries.name','system.formats.name','system.functions.is_aggregate','system.functions.name','system.macros.macro','system.merge_tree_settings.name','system.settings.name','system.storage_policies.policy_name','system.table_engines.name','system.table_functions.name','system.tables.name']", "[]", "[]", "0", "", "", "1", "default", "351b58e4-6128-47d4-a7b8-03d78c1f84c6", "::ffff:127.0.0.1", "50968", "default", "351b58e4-6128-47d4-a7b8-03d78c1f84c6", "::ffff:127.0.0.1", "50968", "2021-12-13 13:02:53", "2021-12-13 13:02:53.419528", "1", "", "", "ClickHouse client", "54450", "21", "11", "0", "0", "", "", "", "", "54456", "", "[95298,95318,95315,95316,95312,95588,95589,95586,95585,95587]", "{'Query':1,'SelectQuery':1,'ArenaAllocChunks':41,'ArenaAllocBytes':401408,'FunctionExecute':62,'NetworkSendElapsedMicroseconds':740,'NetworkSendBytes':88794,'SelectedRows':4629,'SelectedBytes':258376,'ContextLock':411,'RWLockAcquiredReadLocks':194,'RealTimeMicroseconds':52469,'UserTimeMicroseconds':17179,'SystemTimeMicroseconds':4218,'SoftPageFaults':569,'OSCPUWaitMicroseconds':303,'OSCPUVirtualTimeMicroseconds':25087,'OSWriteBytes':12288,'OSWriteChars':12288}", "{'load_balancing':'random','max_memory_usage':'10000000000'}", "[]", "[]", "[]", "[]", "[]", "[]", "['concat','notEmpty','extractAll']", "[]", "[]"},
})
client.QueryResponses["SELECT * FROM system.query_log ORDER BY event_time_microseconds ASC LIMIT 100000"] = &queryLogFrame
textLogColumns := []string{"event_date", "event_time", "event_time_microseconds", "microseconds", "thread_name", "thread_id", "level", "query_id", "logger_name", "message", "revision", "source_file", "source_line"}
textLogFrame := test.NewFakeDataFrame("textLog", textLogColumns,
[][]interface{}{
{"2022-02-03", "2022-02-03 16:17:47", "2022-02-03 16:37:17.056950", "56950", "clickhouse-serv", "68947", "Information", "", "DNSCacheUpdater", "Update period 15 seconds", "54458", "../src/Interpreters/DNSCacheUpdater.cpp; void DB::DNSCacheUpdater::start()", "46"},
{"2022-02-03", "2022-02-03 16:27:47", "2022-02-03 16:37:27.057022", "57022", "clickhouse-serv", "68947", "Information", "", "Application", "Available RAM: 62.24 GiB; physical cores: 8; logical cores: 16.", "54458", "../programs/server/Server.cpp; virtual int DB::Server::main(const std::vector<std::string> &)", "1380"},
{"2022-02-03", "2022-02-03 16:37:47", "2022-02-03 16:37:37.057484", "57484", "clickhouse-serv", "68947", "Information", "", "Application", "Listening for http://[::1]:8123", "54458", "../programs/server/Server.cpp; virtual int DB::Server::main(const std::vector<std::string> &)", "1444"},
{"2022-02-03", "2022-02-03 16:47:47", "2022-02-03 16:37:47.057527", "57527", "clickhouse-serv", "68947", "Information", "", "Application", "Listening for native protocol (tcp): [::1]:9000", "54458", "../programs/server/Server.cpp; virtual int DB::Server::main(const std::vector<std::string> &)", "1444"},
})
client.QueryResponses["SELECT * FROM system.text_log ORDER BY event_time_microseconds ASC LIMIT 100000"] = &textLogFrame
// skip query_thread_log frame - often it doesn't exist anyway unless enabled
t.Run("test default db logs collection", func(t *testing.T) {
bundle, errs := dbLogsCollector.Collect(config.Configuration{})
require.Empty(t, errs)
require.NotNil(t, bundle)
require.Len(t, bundle.Frames, 2)
require.Contains(t, bundle.Frames, "text_log")
require.Contains(t, bundle.Frames, "query_log")
require.Len(t, bundle.Errors.Errors, 1)
// check query_log frame
require.Contains(t, bundle.Frames, "query_log")
require.Equal(t, queryLogColumns, bundle.Frames["query_log"].Columns())
checkFrame(t, bundle.Frames["query_log"], queryLogFrame.Rows)
// check text_log frame
require.Contains(t, bundle.Frames, "text_log")
require.Equal(t, textLogColumns, bundle.Frames["text_log"].Columns())
checkFrame(t, bundle.Frames["text_log"], textLogFrame.Rows)
client.Reset()
})
t.Run("test db logs collection with limit", func(t *testing.T) {
conf := config.Configuration{
Params: []config.ConfigParam{
config.IntParam{
Value: 1,
Param: config.NewParam("row_limit", "Maximum number of log rows to collect. Negative values mean unlimited", false),
},
},
}
bundle, err := dbLogsCollector.Collect(conf)
require.Empty(t, err)
require.NotNil(t, bundle)
require.Len(t, bundle.Frames, 0)
require.Len(t, bundle.Errors.Errors, 3)
// populate client
client.QueryResponses["SELECT * FROM system.query_log ORDER BY event_time_microseconds ASC LIMIT 1"] = &queryLogFrame
client.QueryResponses["SELECT * FROM system.text_log ORDER BY event_time_microseconds ASC LIMIT 1"] = &textLogFrame
bundle, err = dbLogsCollector.Collect(conf)
require.Empty(t, err)
require.Len(t, bundle.Frames, 2)
require.Len(t, bundle.Errors.Errors, 1)
require.Contains(t, bundle.Frames, "text_log")
require.Contains(t, bundle.Frames, "query_log")
// check query_log frame
require.Contains(t, bundle.Frames, "query_log")
require.Equal(t, queryLogColumns, bundle.Frames["query_log"].Columns())
checkFrame(t, bundle.Frames["query_log"], queryLogFrame.Rows[:1])
// check text_log frame
require.Contains(t, bundle.Frames, "text_log")
require.Equal(t, textLogColumns, bundle.Frames["text_log"].Columns())
checkFrame(t, bundle.Frames["text_log"], textLogFrame.Rows[:1])
client.Reset()
})
}
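
Note on the subtests above: the fake client is keyed by the exact statement text, so the row limit has to appear in the statement as `LIMIT <n>`. The sketch below shows one way such a statement could be assembled. `buildLogQuery` is a hypothetical helper, not the collector's actual code, and treating a negative limit as "no LIMIT clause" is an assumption drawn only from the `row_limit` parameter description.

```go
package main

import "fmt"

// buildLogQuery is a hypothetical helper (not part of the collector).
// It reproduces the statement shape that the tests register with the fake client.
// Assumption: a negative rowLimit means "unlimited", so no LIMIT clause is emitted.
func buildLogQuery(table string, rowLimit int64) string {
	query := fmt.Sprintf("SELECT * FROM system.%s ORDER BY event_time_microseconds ASC", table)
	if rowLimit >= 0 {
		query = fmt.Sprintf("%s LIMIT %d", query, rowLimit)
	}
	return query
}

func main() {
	fmt.Println(buildLogQuery("query_log", 100000))
	fmt.Println(buildLogQuery("text_log", 1))
	fmt.Println(buildLogQuery("text_log", -1))
}
```

With a limit of 1 the result matches the key used in the "with limit" subtest, which is why the first `Collect` call in that subtest finds no registered response until the client is populated.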

View File

@ -0,0 +1,139 @@
package clickhouse
import (
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/utils"
"path/filepath"
)
// LogsCollector collects ClickHouse server log files from disk.
type LogsCollector struct {
resourceManager *platform.ResourceManager
}
func NewLogsCollector(m *platform.ResourceManager) *LogsCollector {
return &LogsCollector{
resourceManager: m,
}
}
var DefaultLogsLocation = filepath.Clean("/var/log/clickhouse-server/")
func (lc *LogsCollector) Collect(conf config.Configuration) (*data.DiagnosticBundle, error) {
conf, err := conf.ValidateConfig(lc.Configuration())
if err != nil {
return &data.DiagnosticBundle{}, err
}
directory, err := config.ReadStringValue(conf, "directory")
if err != nil {
return &data.DiagnosticBundle{}, err
}
collectArchives, err := config.ReadBoolValue(conf, "collect_archives")
if err != nil {
return &data.DiagnosticBundle{}, err
}
logPatterns := []string{"*.log"}
if collectArchives {
logPatterns = append(logPatterns, "*.gz")
}
if directory != "" {
// user has specified a directory - we therefore skip all other efforts to locate the logs
frame, errs := data.NewFileDirectoryFrame(directory, logPatterns)
return &data.DiagnosticBundle{
Frames: map[string]data.Frame{
"user_specified": frame,
},
Errors: data.FrameErrors{Errors: errs},
}, nil
}
// add the default
frames := make(map[string]data.Frame)
dirFrame, frameErrors := data.NewFileDirectoryFrame(DefaultLogsLocation, logPatterns)
frames["default"] = dirFrame
logFolders, errs := FindLogFileCandidates()
frameErrors = append(frameErrors, errs...)
i := 0
for folder, paths := range logFolders {
// we will collect the default location anyway above so skip these
if folder != DefaultLogsLocation {
if collectArchives {
paths = append(paths, "*.gz")
}
dirFrame, errs := data.NewFileDirectoryFrame(folder, paths)
frames[fmt.Sprintf("logs-%d", i)] = dirFrame
frameErrors = append(frameErrors, errs...)
// advance the index so each folder gets its own frame key instead of overwriting "logs-0"
i++
}
}
return &data.DiagnosticBundle{
Frames: frames,
Errors: data.FrameErrors{Errors: frameErrors},
}, nil
}
func (lc *LogsCollector) Configuration() config.Configuration {
return config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "",
Param: config.NewParam("directory", "Specify the location of the log files for ClickHouse Server e.g. /var/log/clickhouse-server/", false),
AllowEmpty: true,
},
config.BoolParam{
Param: config.NewParam("collect_archives", "Collect compressed log archive files", false),
},
},
}
}
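// FindLogFileCandidates locates the ClickHouse configuration files and returns the log file names
// they reference, grouped by containing folder, together with any errors encountered on the way.
func FindLogFileCandidates() (logFolders map[string][]string, configErrors []error) {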
// we need the config to determine the location of the logs
configCandidates := make(map[string]data.ConfigFileFrame)
configFiles, err := FindConfigurationFiles()
logFolders = make(map[string][]string)
if err != nil {
configErrors = append(configErrors, err)
return logFolders, configErrors
}
for _, folder := range configFiles {
configFrame, errs := data.NewConfigFileFrame(folder)
configErrors = append(configErrors, errs...)
configCandidates[filepath.Clean(folder)] = configFrame
}
// named candidate rather than config to avoid shadowing the imported config package
for _, candidate := range configCandidates {
paths, errs := candidate.FindLogPaths()
for _, path := range paths {
folder := filepath.Dir(path)
filename := filepath.Base(path)
if _, ok := logFolders[folder]; !ok {
logFolders[folder] = []string{}
}
logFolders[folder] = utils.Unique(append(logFolders[folder], filename))
}
configErrors = append(configErrors, errs...)
}
return logFolders, configErrors
}
func (lc *LogsCollector) IsDefault() bool {
return true
}
func (lc *LogsCollector) Description() string {
return "Collects the ClickHouse logs directly from disk."
}
// here we register the collector for use
func init() {
collectors.Register("logs", func() (collectors.Collector, error) {
return &LogsCollector{
resourceManager: platform.GetResourceManager(),
}, nil
})
}
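
Note on `FindLogFileCandidates` above: its core step splits each log path into a folder and a file name, then keeps the distinct file names per folder. The sketch below isolates that step using only the standard library. `groupByFolder` is a hypothetical stand-in, and the `seen` map replaces `utils.Unique`, whose implementation is not shown here.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// groupByFolder is a hypothetical, stdlib-only stand-in for the grouping step:
// it maps each folder to the distinct file names found in it, in first-seen order.
func groupByFolder(paths []string) map[string][]string {
	folders := make(map[string][]string)
	seen := make(map[string]bool)
	for _, p := range paths {
		p = filepath.Clean(p)
		if seen[p] {
			continue // the same file may be referenced by several config files
		}
		seen[p] = true
		folder := filepath.Dir(p)
		// appending to a missing key starts from a nil slice, so no explicit initialisation is needed
		folders[folder] = append(folders[folder], filepath.Base(p))
	}
	return folders
}

func main() {
	grouped := groupByFolder([]string{
		"/var/log/clickhouse-server/clickhouse-server.log",
		"/var/log/clickhouse-server/clickhouse-server.err.log",
		"/var/log/clickhouse-server/clickhouse-server.log",
		"/data/logs/custom.log",
	})
	for folder, files := range grouped {
		fmt.Println(folder, files)
	}
}
```

Because Go map iteration order is not fixed, any index derived while ranging over such a map (as the `logs-%d` frame keys are) will be unique but not stable between runs.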

View File

@ -0,0 +1,146 @@
package clickhouse_test
import (
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors/clickhouse"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/test"
"github.com/stretchr/testify/require"
"os"
"path"
"testing"
)
func TestLogsConfiguration(t *testing.T) {
t.Run("correct configuration is returned for logs collector", func(t *testing.T) {
client := test.NewFakeClickhouseClient(make(map[string][]string))
logsCollector := clickhouse.NewLogsCollector(&platform.ResourceManager{
DbClient: client,
})
conf := logsCollector.Configuration()
require.Len(t, conf.Params, 2)
// check directory
require.IsType(t, config.StringParam{}, conf.Params[0])
directory, ok := conf.Params[0].(config.StringParam)
require.True(t, ok)
require.False(t, directory.Required())
require.Equal(t, "directory", directory.Name())
require.Empty(t, directory.Value)
// check collect_archives
require.IsType(t, config.BoolParam{}, conf.Params[1])
collectArchives, ok := conf.Params[1].(config.BoolParam)
require.True(t, ok)
require.False(t, collectArchives.Required())
require.Equal(t, "collect_archives", collectArchives.Name())
require.False(t, collectArchives.Value)
})
}
func TestLogsCollect(t *testing.T) {
logsCollector := clickhouse.NewLogsCollector(&platform.ResourceManager{})
t.Run("test default logs collection", func(t *testing.T) {
// We can't rely on a local installation of ClickHouse being present for tests. If one is present (and running),
// results may vary, e.g. we may find a config. For now, we allow flexibility and test only the default location.
// TODO: we may want to test this within a container
bundle, err := logsCollector.Collect(config.Configuration{})
require.Nil(t, err)
require.NotNil(t, bundle)
// Errors are expected whether or not ClickHouse is installed: permission issues if it is, missing folders if it is not.
require.Greater(t, len(bundle.Errors.Errors), 0)
require.Len(t, bundle.Frames, 1)
require.Contains(t, bundle.Frames, "default")
_, ok := bundle.Frames["default"].(data.DirectoryFileFrame)
require.True(t, ok)
// no guarantees ClickHouse is installed, so only the presence and type of the default frame are asserted
})
t.Run("test logs collection when directory is specified", func(t *testing.T) {
cwd, err := os.Getwd()
require.Nil(t, err)
logsPath := path.Join(cwd, "../../../testdata", "logs", "var", "logs")
bundle, err := logsCollector.Collect(config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: logsPath,
Param: config.NewParam("directory", "Specify the location of the log files for ClickHouse Server e.g. /var/log/clickhouse-server/", false),
AllowEmpty: true,
},
},
})
require.Nil(t, err)
checkDirectoryBundle(t, bundle, logsPath, []string{"clickhouse-server.log", "clickhouse-server.err.log"})
})
t.Run("test logs collection of archives", func(t *testing.T) {
cwd, err := os.Getwd()
require.Nil(t, err)
logsPath := path.Join(cwd, "../../../testdata", "logs", "var", "logs")
bundle, err := logsCollector.Collect(config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: logsPath,
Param: config.NewParam("directory", "Specify the location of the log files for ClickHouse Server e.g. /var/log/clickhouse-server/", false),
AllowEmpty: true,
},
config.BoolParam{
Value: true,
Param: config.NewParam("collect_archives", "Collect compressed log archive files", false),
},
},
})
require.Nil(t, err)
checkDirectoryBundle(t, bundle, logsPath, []string{"clickhouse-server.log", "clickhouse-server.err.log", "clickhouse-server.log.gz"})
})
t.Run("test when directory does not exist", func(t *testing.T) {
tmpDir := t.TempDir()
logsPath := path.Join(tmpDir, "random")
bundle, err := logsCollector.Collect(config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: logsPath,
Param: config.NewParam("directory", "Specify the location of the log files for ClickHouse Server e.g. /var/log/clickhouse-server/", false),
AllowEmpty: true,
},
},
})
// not a fatal error currently
require.Nil(t, err)
require.Len(t, bundle.Errors.Errors, 1)
require.Equal(t, fmt.Sprintf("directory %s does not exist", logsPath), bundle.Errors.Errors[0].Error())
})
}
func checkDirectoryBundle(t *testing.T, bundle *data.DiagnosticBundle, logsPath string, expectedFiles []string) {
require.NotNil(t, bundle)
require.Nil(t, bundle.Errors.Errors)
require.Len(t, bundle.Frames, 1)
require.Contains(t, bundle.Frames, "user_specified")
dirFrame, ok := bundle.Frames["user_specified"].(data.DirectoryFileFrame)
require.True(t, ok)
require.Equal(t, logsPath, dirFrame.Directory)
require.Equal(t, []string{"files"}, dirFrame.Columns())
fullPaths := make([]string, len(expectedFiles))
for i, filePath := range expectedFiles {
fullPaths[i] = path.Join(logsPath, filePath)
}
// count the files returned by the frame; kept separate from the loop index above to avoid shadowing
count := 0
for {
values, ok, err := dirFrame.Next()
require.Nil(t, err)
if !ok {
break
}
require.Len(t, values, 1)
file, ok := values[0].(data.SimpleFile)
require.True(t, ok)
require.Contains(t, fullPaths, file.FilePath())
count++
}
require.Equal(t, len(fullPaths), count)
}

View File

@ -0,0 +1,153 @@
{
"queries": {
"version": [
{
"statement": "SELECT version()"
}
],
"databases": [
{
"statement": "SELECT name, engine, tables, partitions, parts, formatReadableSize(bytes_on_disk) \"disk_size\" FROM system.databases db LEFT JOIN ( SELECT database, uniq(table) \"tables\", uniq(table, partition) \"partitions\", count() AS parts, sum(bytes_on_disk) \"bytes_on_disk\" FROM system.parts WHERE active GROUP BY database ) AS db_stats ON db.name = db_stats.database ORDER BY bytes_on_disk DESC LIMIT {{.Limit}}"
}
],
"access": [
{
"statement": "SHOW ACCESS"
}
],
"quotas": [
{
"statement": "SHOW QUOTA"
}
],
"db_engines": [
{
"statement": "SELECT engine, count() \"count\" FROM system.databases GROUP BY engine"
}
],
"table_engines": [
{
"statement": "SELECT engine, count() \"count\" FROM system.tables WHERE database != 'system' GROUP BY engine"
}
],
"dictionaries": [
{
"statement": "SELECT source, type, status, count() \"count\" FROM system.dictionaries GROUP BY source, type, status ORDER BY status DESC, source"
}
],
"replicated_tables_by_delay": [
{
"statement": "SELECT database, table, is_leader, is_readonly, absolute_delay, queue_size, inserts_in_queue, merges_in_queue FROM system.replicas ORDER BY absolute_delay DESC LIMIT {{.Limit}}"
}
],
"replication_queue_by_oldest": [
{
"statement": "SELECT database, table, replica_name, position, node_name, type, source_replica, parts_to_merge, new_part_name, create_time, required_quorum, is_detach, is_currently_executing, num_tries, last_attempt_time, last_exception, concat( 'time: ', toString(last_postpone_time), ', number: ', toString(num_postponed), ', reason: ', postpone_reason ) postpone FROM system.replication_queue ORDER BY create_time ASC LIMIT {{.Limit}}"
}
],
"replicated_fetches": [
{
"statement": "SELECT database, table, round(elapsed, 1) \"elapsed\", round(100 * progress, 1) \"progress\", partition_id, result_part_name, result_part_path, total_size_bytes_compressed, bytes_read_compressed, source_replica_path, source_replica_hostname, source_replica_port, interserver_scheme, to_detached, thread_id FROM system.replicated_fetches"
}
],
"tables_by_max_partition_count": [
{
"statement": "SELECT database, table, count() \"partitions\", sum(part_count) \"parts\", max(part_count) \"max_parts_per_partition\" FROM ( SELECT database, table, partition, count() \"part_count\" FROM system.parts WHERE active GROUP BY database, table, partition ) partitions GROUP BY database, table ORDER BY max_parts_per_partition DESC LIMIT {{.Limit}}"
}
],
"stack_traces": [
{
"statement": "SELECT '\\n' || arrayStringConcat( arrayMap( x, y -> concat(x, ': ', y), arrayMap(x -> addressToLine(x), trace), arrayMap(x -> demangle(addressToSymbol(x)), trace) ), '\\n' ) AS trace FROM system.stack_trace"
}
],
"crash_log": [
{
"statement": "SELECT event_time, signal, thread_id, query_id, '\\n' || arrayStringConcat(trace_full, '\\n') AS trace, version FROM system.crash_log ORDER BY event_time DESC"
}
],
"merges": [
{
"statement": "SELECT database, table, round(elapsed, 1) \"elapsed\", round(100 * progress, 1) \"progress\", is_mutation, partition_id, result_part_path, source_part_paths, num_parts, formatReadableSize(total_size_bytes_compressed) \"total_size_compressed\", formatReadableSize(bytes_read_uncompressed) \"read_uncompressed\", formatReadableSize(bytes_written_uncompressed) \"written_uncompressed\", columns_written, formatReadableSize(memory_usage) \"memory_usage\", thread_id FROM system.merges",
"constraint": ">=20.3"
},
{
"statement": "SELECT database, table, round(elapsed, 1) \"elapsed\", round(100 * progress, 1) \"progress\", is_mutation, partition_id, num_parts, formatReadableSize(total_size_bytes_compressed) \"total_size_compressed\", formatReadableSize(bytes_read_uncompressed) \"read_uncompressed\", formatReadableSize(bytes_written_uncompressed) \"written_uncompressed\", columns_written, formatReadableSize(memory_usage) \"memory_usage\" FROM system.merges"
}
],
"mutations": [
{
"statement": "SELECT database, table, mutation_id, command, create_time, parts_to_do_names, parts_to_do, is_done, latest_failed_part, latest_fail_time, latest_fail_reason FROM system.mutations WHERE NOT is_done ORDER BY create_time DESC",
"constraint": ">=20.3"
},
{
"statement": "SELECT database, table, mutation_id, command, create_time, parts_to_do, is_done, latest_failed_part, latest_fail_time, latest_fail_reason FROM system.mutations WHERE NOT is_done ORDER BY create_time DESC"
}
],
"recent_data_parts": [
{
"statement": "SELECT database, table, engine, partition_id, name, part_type, active, level, disk_name, path, marks, rows, bytes_on_disk, data_compressed_bytes, data_uncompressed_bytes, marks_bytes, modification_time, remove_time, refcount, is_frozen, min_date, max_date, min_time, max_time, min_block_number, max_block_number FROM system.parts WHERE modification_time > now() - INTERVAL 3 MINUTE ORDER BY modification_time DESC",
"constraint": ">=20.3"
},
{
"statement": "SELECT database, table, engine, partition_id, name, active, level, path, marks, rows, bytes_on_disk, data_compressed_bytes, data_uncompressed_bytes, marks_bytes, modification_time, remove_time, refcount, is_frozen, min_date, max_date, min_time, max_time, min_block_number, max_block_number FROM system.parts WHERE modification_time > now() - INTERVAL 3 MINUTE ORDER BY modification_time DESC"
}
],
"detached_parts": [
{
"statement": "SELECT database, table, partition_id, name, disk, reason, min_block_number, max_block_number, level FROM system.detached_parts"
}
],
"processes": [
{
"statement": "SELECT elapsed, query_id, normalizeQuery(query) AS normalized_query, is_cancelled, concat( toString(read_rows), ' rows / ', formatReadableSize(read_bytes) ) AS read, concat( toString(written_rows), ' rows / ', formatReadableSize(written_bytes) ) AS written, formatReadableSize(memory_usage) AS \"memory usage\", user, multiIf( empty(client_name), http_user_agent, concat( client_name, ' ', toString(client_version_major), '.', toString(client_version_minor), '.', toString(client_version_patch) ) ) AS client, thread_ids, ProfileEvents, Settings FROM system.processes ORDER BY elapsed DESC",
"constraint": ">=21.8"
},
{
"statement": "SELECT elapsed, query_id, normalizeQuery(query) AS normalized_query, is_cancelled, concat( toString(read_rows), ' rows / ', formatReadableSize(read_bytes) ) AS read, concat( toString(written_rows), ' rows / ', formatReadableSize(written_bytes) ) AS written, formatReadableSize(memory_usage) AS \"memory usage\", user, multiIf( empty(client_name), http_user_agent, concat( client_name, ' ', toString(client_version_major), '.', toString(client_version_minor), '.', toString(client_version_patch) ) ) AS client, thread_ids, ProfileEvents.Names, ProfileEvents.Values, Settings.Names, Settings.Values FROM system.processes ORDER BY elapsed DESC",
"constraint": ">=21.3"
},
{
"statement": "SELECT elapsed, query_id, normalizeQuery(query) AS normalized_query, is_cancelled, concat( toString(read_rows), ' rows / ', formatReadableSize(read_bytes) ) AS read, concat( toString(written_rows), ' rows / ', formatReadableSize(written_bytes) ) AS written, formatReadableSize(memory_usage) AS \"memory usage\", user, multiIf( empty(client_name), http_user_agent, concat( client_name, ' ', toString(client_version_major), '.', toString(client_version_minor), '.', toString(client_version_patch) ) ) AS client, ProfileEvents.Names, ProfileEvents.Values, Settings.Names, Settings.Values FROM system.processes ORDER BY elapsed DESC"
}
],
"top_queries_by_duration": [
{
"statement": "SELECT type, query_start_time, query_duration_ms, query_id, query_kind, is_initial_query, normalizeQuery(query) AS normalized_query, concat( toString(read_rows), ' rows / ', formatReadableSize(read_bytes) ) AS read, concat( toString(written_rows), ' rows / ', formatReadableSize(written_bytes) ) AS written, concat( toString(result_rows), ' rows / ', formatReadableSize(result_bytes) ) AS result, formatReadableSize(memory_usage) AS \"memory usage\", exception, '\\n' || stack_trace AS stack_trace, user, initial_user, multiIf( empty(client_name), http_user_agent, concat( client_name, ' ', toString(client_version_major), '.', toString(client_version_minor), '.', toString(client_version_patch) ) ) AS client, client_hostname, databases, tables, columns, used_aggregate_functions, used_aggregate_function_combinators, used_database_engines, used_data_type_families, used_dictionaries, used_formats, used_functions, used_storages, used_table_functions, thread_ids, ProfileEvents, Settings FROM system.query_log WHERE type != 'QueryStart' AND event_date >= today() - 1 AND event_time >= now() - INTERVAL 1 DAY ORDER BY query_duration_ms DESC LIMIT {{.Limit}}",
"constraint": ">=21.8"
},
{
"statement": "SELECT type, query_start_time, query_duration_ms, query_id, query_kind, is_initial_query, normalizeQuery(query) AS normalized_query, concat( toString(read_rows), ' rows / ', formatReadableSize(read_bytes) ) AS read, concat( toString(written_rows), ' rows / ', formatReadableSize(written_bytes) ) AS written, concat( toString(result_rows), ' rows / ', formatReadableSize(result_bytes) ) AS result, formatReadableSize(memory_usage) AS \"memory usage\", exception, '\\n' || stack_trace AS stack_trace, user, initial_user, multiIf( empty(client_name), http_user_agent, concat( client_name, ' ', toString(client_version_major), '.', toString(client_version_minor), '.', toString(client_version_patch) ) ) AS client, client_hostname, databases, tables, columns, used_aggregate_functions, used_aggregate_function_combinators, used_database_engines, used_data_type_families, used_dictionaries, used_formats, used_functions, used_storages, used_table_functions, thread_ids, ProfileEvents.Names, ProfileEvents.Values, Settings.Names, Settings.Values FROM system.query_log WHERE type != 'QueryStart' AND event_date >= today() - 1 AND event_time >= now() - INTERVAL 1 DAY ORDER BY query_duration_ms DESC LIMIT {{.Limit}}",
"constraint": ">=21.3"
},
{
"statement": "SELECT type, query_start_time, query_duration_ms, query_id, query_kind, is_initial_query, normalizeQuery(query) AS normalized_query, concat( toString(read_rows), ' rows / ', formatReadableSize(read_bytes) ) AS read, concat( toString(written_rows), ' rows / ', formatReadableSize(written_bytes) ) AS written, concat( toString(result_rows), ' rows / ', formatReadableSize(result_bytes) ) AS result, formatReadableSize(memory_usage) AS \"memory usage\", exception, '\\n' || stack_trace AS stack_trace, user, initial_user, multiIf( empty(client_name), http_user_agent, concat( client_name, ' ', toString(client_version_major), '.', toString(client_version_minor), '.', toString(client_version_patch) ) ) AS client, client_hostname, ProfileEvents.Names, ProfileEvents.Values, Settings.Names, Settings.Values FROM system.query_log WHERE type != 'QueryStart' AND event_date >= today() - 1 AND event_time >= now() - INTERVAL 1 DAY ORDER BY query_duration_ms DESC LIMIT {{.Limit}}"
}
],
"top_queries_by_memory": [
{
"statement": "SELECT type, query_start_time, query_duration_ms, query_id, query_kind, is_initial_query, normalizeQuery(query) AS normalized_query, concat( toString(read_rows), ' rows / ', formatReadableSize(read_bytes) ) AS read, concat( toString(written_rows), ' rows / ', formatReadableSize(written_bytes) ) AS written, concat( toString(result_rows), ' rows / ', formatReadableSize(result_bytes) ) AS result, formatReadableSize(memory_usage) AS \"memory usage\", exception, '\\n' || stack_trace AS stack_trace, user, initial_user, multiIf( empty(client_name), http_user_agent, concat( client_name, ' ', toString(client_version_major), '.', toString(client_version_minor), '.', toString(client_version_patch) ) ) AS client, client_hostname, databases, tables, columns, used_aggregate_functions, used_aggregate_function_combinators, used_database_engines, used_data_type_families, used_dictionaries, used_formats, used_functions, used_storages, used_table_functions, thread_ids, ProfileEvents, Settings FROM system.query_log WHERE type != 'QueryStart' AND event_date >= today() - 1 AND event_time >= now() - INTERVAL 1 DAY ORDER BY memory_usage DESC LIMIT {{.Limit}}",
"constraint": ">=21.8"
},
{
"statement": "SELECT type, query_start_time, query_duration_ms, query_id, query_kind, is_initial_query, normalizeQuery(query) AS normalized_query, concat( toString(read_rows), ' rows / ', formatReadableSize(read_bytes) ) AS read, concat( toString(written_rows), ' rows / ', formatReadableSize(written_bytes) ) AS written, concat( toString(result_rows), ' rows / ', formatReadableSize(result_bytes) ) AS result, formatReadableSize(memory_usage) AS \"memory usage\", exception, '\\n' || stack_trace AS stack_trace, user, initial_user, multiIf( empty(client_name), http_user_agent, concat( client_name, ' ', toString(client_version_major), '.', toString(client_version_minor), '.', toString(client_version_patch) ) ) AS client, client_hostname, databases, tables, columns, used_aggregate_functions, used_aggregate_function_combinators, used_database_engines, used_data_type_families, used_dictionaries, used_formats, used_functions, used_storages, used_table_functions, thread_ids, ProfileEvents.Names, ProfileEvents.Values, Settings.Names, Settings.Values FROM system.query_log WHERE type != 'QueryStart' AND event_date >= today() - 1 AND event_time >= now() - INTERVAL 1 DAY ORDER BY memory_usage DESC LIMIT {{.Limit}}",
"constraint": ">=21.3"
},
{
"statement": "SELECT type, query_start_time, query_duration_ms, query_id, query_kind, is_initial_query, normalizeQuery(query) AS normalized_query, concat( toString(read_rows), ' rows / ', formatReadableSize(read_bytes) ) AS read, concat( toString(written_rows), ' rows / ', formatReadableSize(written_bytes) ) AS written, concat( toString(result_rows), ' rows / ', formatReadableSize(result_bytes) ) AS result, formatReadableSize(memory_usage) AS \"memory usage\", exception, '\\n' || stack_trace AS stack_trace, user, initial_user, multiIf( empty(client_name), http_user_agent, concat( client_name, ' ', toString(client_version_major), '.', toString(client_version_minor), '.', toString(client_version_patch) ) ) AS client, client_hostname, ProfileEvents.Names, ProfileEvents.Values, Settings.Names, Settings.Values FROM system.query_log WHERE type != 'QueryStart' AND event_date >= today() - 1 AND event_time >= now() - INTERVAL 1 DAY ORDER BY memory_usage DESC LIMIT {{.Limit}}"
}
],
"failed_queries": [
{
"statement": "SELECT type, query_start_time, query_duration_ms, query_id, query_kind, is_initial_query, normalizeQuery(query) AS normalized_query, concat( toString(read_rows), ' rows / ', formatReadableSize(read_bytes) ) AS read, concat( toString(written_rows), ' rows / ', formatReadableSize(written_bytes) ) AS written, concat( toString(result_rows), ' rows / ', formatReadableSize(result_bytes) ) AS result, formatReadableSize(memory_usage) AS \"memory usage\", exception, '\\n' || stack_trace AS stack_trace, user, initial_user, multiIf( empty(client_name), http_user_agent, concat( client_name, ' ', toString(client_version_major), '.', toString(client_version_minor), '.', toString(client_version_patch) ) ) AS client, client_hostname, databases, tables, columns, used_aggregate_functions, used_aggregate_function_combinators, used_database_engines, used_data_type_families, used_dictionaries, used_formats, used_functions, used_storages, used_table_functions, thread_ids, ProfileEvents, Settings FROM system.query_log WHERE type != 'QueryStart' AND event_date >= today() - 1 AND event_time >= now() - INTERVAL 1 DAY AND exception != '' ORDER BY query_start_time DESC LIMIT {{.Limit}}",
"constraint": ">=21.8"
},
{
"statement": "SELECT type, query_start_time, query_duration_ms, query_id, query_kind, is_initial_query, normalizeQuery(query) AS normalized_query, concat( toString(read_rows), ' rows / ', formatReadableSize(read_bytes) ) AS read, concat( toString(written_rows), ' rows / ', formatReadableSize(written_bytes) ) AS written, concat( toString(result_rows), ' rows / ', formatReadableSize(result_bytes) ) AS result, formatReadableSize(memory_usage) AS \"memory usage\", exception, '\\n' || stack_trace AS stack_trace, user, initial_user, multiIf( empty(client_name), http_user_agent, concat( client_name, ' ', toString(client_version_major), '.', toString(client_version_minor), '.', toString(client_version_patch) ) ) AS client, client_hostname, databases, tables, columns, used_aggregate_functions, used_aggregate_function_combinators, used_database_engines, used_data_type_families, used_dictionaries, used_formats, used_functions, used_storages, used_table_functions, thread_ids, ProfileEvents.Names, ProfileEvents.Values, Settings.Names, Settings.Values FROM system.query_log WHERE type != 'QueryStart' AND event_date >= today() - 1 AND event_time >= now() - INTERVAL 1 DAY AND exception != '' ORDER BY query_start_time DESC LIMIT {{.Limit}}",
"constraint": ">=21.3"
},
{
"statement": "SELECT type, query_start_time, query_duration_ms, query_id, query_kind, is_initial_query, normalizeQuery(query) AS normalized_query, concat( toString(read_rows), ' rows / ', formatReadableSize(read_bytes) ) AS read, concat( toString(written_rows), ' rows / ', formatReadableSize(written_bytes) ) AS written, concat( toString(result_rows), ' rows / ', formatReadableSize(result_bytes) ) AS result, formatReadableSize(memory_usage) AS \"memory usage\", exception, '\\n' || stack_trace AS stack_trace, user, initial_user, multiIf( empty(client_name), http_user_agent, concat( client_name, ' ', toString(client_version_major), '.', toString(client_version_minor), '.', toString(client_version_patch) ) ) AS client, client_hostname, ProfileEvents.Names, ProfileEvents.Values, Settings.Names, Settings.Values FROM system.query_log WHERE type != 'QueryStart' AND event_date >= today() - 1 AND event_time >= now() - INTERVAL 1 DAY AND exception != '' ORDER BY query_start_time DESC LIMIT {{.Limit}}"
}
]
}
}
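
The query catalogue above lists, for each query id, several statements ordered from the newest version constraint to an unconstrained fallback. A minimal sketch of the "first matching statement wins" selection follows. It is an illustration only: `satisfies` is a hypothetical, simplified stand-in that understands just an empty constraint and `>=MAJOR.MINOR`, whereas the collector itself relies on `github.com/Masterminds/semver`; the sample statements are shortened for readability.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// query mirrors one entry of a query list in queries.json.
type query struct {
	Statement  string `json:"statement"`
	Constraint string `json:"constraint"`
}

// sample is a shortened, illustrative version of one catalogue entry.
const sample = `[
 {"statement": "SELECT ProfileEvents, Settings FROM system.query_log", "constraint": ">=21.8"},
 {"statement": "SELECT ProfileEvents.Names, Settings.Names FROM system.query_log", "constraint": ">=21.3"},
 {"statement": "SELECT ProfileEvents.Names FROM system.query_log"}
]`

// loadSample parses the sample catalogue entry.
func loadSample() []query {
	var qs []query
	if err := json.Unmarshal([]byte(sample), &qs); err != nil {
		panic(err)
	}
	return qs
}

// majorMinor extracts the first two numeric components of a version.
func majorMinor(v string) (int, int, bool) {
	parts := strings.Split(v, ".")
	if len(parts) < 2 {
		return 0, 0, false
	}
	major, err1 := strconv.Atoi(parts[0])
	minor, err2 := strconv.Atoi(parts[1])
	return major, minor, err1 == nil && err2 == nil
}

// satisfies is a simplified stand-in for a semver constraint check (assumption).
func satisfies(constraint, version string) bool {
	if constraint == "" {
		return true // an empty constraint means any version
	}
	if !strings.HasPrefix(constraint, ">=") {
		return false
	}
	wantMajor, wantMinor, ok1 := majorMinor(strings.TrimPrefix(constraint, ">="))
	gotMajor, gotMinor, ok2 := majorMinor(version)
	if !ok1 || !ok2 {
		return false
	}
	return gotMajor > wantMajor || (gotMajor == wantMajor && gotMinor >= wantMinor)
}

// pick returns the first statement whose constraint the version satisfies.
func pick(candidates []query, version string) (string, bool) {
	for _, q := range candidates {
		if satisfies(q.Constraint, version) {
			return q.Statement, true
		}
	}
	return "", false
}

func main() {
	for _, v := range []string{"22.1.3", "21.4.0", "20.8.1"} {
		stmt, _ := pick(loadSample(), v)
		fmt.Printf("%s -> %s\n", v, stmt)
	}
}
```

Because the list is scanned in order, the most specific constraint has to come first and the unconstrained statement last. Note that the real collector additionally moves on to the next candidate when a matching statement fails to execute.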

View File

@ -0,0 +1,158 @@
package clickhouse
import (
"bytes"
_ "embed"
"encoding/json"
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/Masterminds/semver"
"github.com/pkg/errors"
"strings"
"text/template"
)
// SummaryCollector collects summary statistics by running a set of known, version-constrained queries
type SummaryCollector struct {
resourceManager *platform.ResourceManager
}
type querySet struct {
Queries map[string][]query `json:"queries"`
}
type query struct {
Statement string `json:"statement"`
Constraint string `json:"constraint"`
}
type ParameterTemplate struct {
Limit int64
}
//go:embed queries.json
var queryFile []byte
func NewSummaryCollector(m *platform.ResourceManager) *SummaryCollector {
return &SummaryCollector{
resourceManager: m,
}
}
func (sc *SummaryCollector) Collect(conf config.Configuration) (*data.DiagnosticBundle, error) {
conf, err := conf.ValidateConfig(sc.Configuration())
if err != nil {
return &data.DiagnosticBundle{}, err
}
var queries querySet
err = json.Unmarshal(queryFile, &queries)
if err != nil {
return &data.DiagnosticBundle{}, errors.Wrap(err, "Unable to parse embedded queries")
}
limit, err := config.ReadIntValue(conf, "row_limit")
if err != nil {
return &data.DiagnosticBundle{}, err
}
paramTemplate := ParameterTemplate{
Limit: limit,
}
frames := make(map[string]data.Frame)
serverVersion, err := getServerSemVersion(sc)
if err != nil {
return &data.DiagnosticBundle{}, errors.Wrapf(err, "Unable to read server version")
}
var frameErrors []error
for queryId, sqlQueries := range queries.Queries {
// we find the first matching query that satisfies the current version. Empty version means ANY version is
// supported
for _, sqlQuery := range sqlQueries {
var queryConstraint *semver.Constraints
if sqlQuery.Constraint != "" {
queryConstraint, err = semver.NewConstraint(sqlQuery.Constraint)
if err != nil {
//we try another one
frameErrors = append(frameErrors, errors.Wrapf(err, "Unable to parse version %s for query %s", sqlQuery.Constraint, queryId))
continue
}
}
if sqlQuery.Constraint == "" || queryConstraint.Check(serverVersion) {
tmpl, err := template.New(queryId).Parse(sqlQuery.Statement)
if err != nil {
frameErrors = append(frameErrors, errors.Wrapf(err, "Unable to parse query %s", queryId))
//we try another one
continue
}
buf := new(bytes.Buffer)
err = tmpl.Execute(buf, paramTemplate)
if err != nil {
frameErrors = append(frameErrors, errors.Wrapf(err, "Unable to process query %s template", queryId))
//we try another one
continue
}
frame, err := sc.resourceManager.DbClient.ExecuteStatement(queryId, buf.String())
if err != nil {
frameErrors = append(frameErrors, errors.Wrapf(err, "Unable to execute query %s", queryId))
//we try another one
} else {
frames[queryId] = frame
// only 1 query executed
break
}
}
}
}
fErrors := data.FrameErrors{
Errors: frameErrors,
}
return &data.DiagnosticBundle{
Frames: frames,
Errors: fErrors,
}, nil
}
func getServerSemVersion(sc *SummaryCollector) (*semver.Version, error) {
serverVersion, err := sc.resourceManager.DbClient.Version()
if err != nil {
return &semver.Version{}, err
}
//drop the build number - it is not a semantic version
versionComponents := strings.Split(serverVersion, ".")
serverVersion = strings.Join(versionComponents[:len(versionComponents)-1], ".")
return semver.NewVersion(serverVersion)
}
func (sc *SummaryCollector) Configuration() config.Configuration {
return config.Configuration{
Params: []config.ConfigParam{
config.IntParam{
Value: 20,
Param: config.NewParam("row_limit", "Limit rows on supported queries.", false),
},
},
}
}
func (sc *SummaryCollector) IsDefault() bool {
return true
}
func (sc *SummaryCollector) Description() string {
return "Collects summary statistics on the database based on a set of known useful queries."
}
// here we register the collector for use
func init() {
collectors.Register("summary", func() (collectors.Collector, error) {
return &SummaryCollector{
resourceManager: platform.GetResourceManager(),
}, nil
})
}
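
Two small steps in the collector above lend themselves to a standalone illustration: dropping the build number from the server version, and filling the `{{.Limit}}` placeholder of a statement. The sketch below uses only the standard library; the names `trimBuild`, `render` and `params` are hypothetical and the statement is a shortened example, not one taken from queries.json.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
	"text/template"
)

// params plays the role of ParameterTemplate in the collector.
type params struct {
	Limit int64
}

// trimBuild drops the last dot-separated component of a version string,
// mirroring what getServerSemVersion does before parsing.
func trimBuild(version string) string {
	parts := strings.Split(version, ".")
	return strings.Join(parts[:len(parts)-1], ".")
}

// render fills the template placeholders of a statement.
func render(name, statement string, p params) (string, error) {
	tmpl, err := template.New(name).Parse(statement)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, p); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	fmt.Println(trimBuild("22.1.3.7")) // prints "22.1.3"

	sql, err := render("databases", "SELECT name FROM system.databases LIMIT {{.Limit}}", params{Limit: 20})
	if err != nil {
		panic(err)
	}
	fmt.Println(sql) // prints "SELECT name FROM system.databases LIMIT 20"
}
```

One consequence worth keeping in mind: the trimming is unconditional, so a three-component input such as `22.1.3` would be shortened to `22.1`.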

View File

@ -0,0 +1,110 @@
package clickhouse_test
import (
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors/clickhouse"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/test"
"github.com/stretchr/testify/require"
"testing"
)
func TestSummaryConfiguration(t *testing.T) {
t.Run("correct configuration is returned for summary collector", func(t *testing.T) {
client := test.NewFakeClickhouseClient(make(map[string][]string))
summaryCollector := clickhouse.NewSummaryCollector(&platform.ResourceManager{
DbClient: client,
})
conf := summaryCollector.Configuration()
require.Len(t, conf.Params, 1)
require.IsType(t, config.IntParam{}, conf.Params[0])
limit, ok := conf.Params[0].(config.IntParam)
require.True(t, ok)
require.False(t, limit.Required())
require.Equal(t, "row_limit", limit.Name())
require.Equal(t, int64(20), limit.Value)
})
}
func TestSummaryCollection(t *testing.T) {
client := test.NewFakeClickhouseClient(make(map[string][]string))
versionFrame := test.NewFakeDataFrame("version", []string{"version()"},
[][]interface{}{
{"22.1.3.7"},
},
)
client.QueryResponses["SELECT version()"] = &versionFrame
databasesFrame := test.NewFakeDataFrame("databases", []string{"name", "engine", "tables", "partitions", "parts", "disk_size"},
[][]interface{}{
{"tutorial", "Atomic", 2, 2, 2, "1.70 GiB"},
{"default", "Atomic", 5, 5, 6, "1.08 GiB"},
{"system", "Atomic", 11, 24, 70, "1.05 GiB"},
{"INFORMATION_SCHEMA", "Memory", 0, 0, 0, "0.00 B"},
{"covid19db", "Atomic", 0, 0, 0, "0.00 B"},
{"information_schema", "Memory", 0, 0, 0, "0.00 B"}})
client.QueryResponses["SELECT name, engine, tables, partitions, parts, formatReadableSize(bytes_on_disk) \"disk_size\" "+
"FROM system.databases db LEFT JOIN ( SELECT database, uniq(table) \"tables\", uniq(table, partition) \"partitions\", "+
"count() AS parts, sum(bytes_on_disk) \"bytes_on_disk\" FROM system.parts WHERE active GROUP BY database ) AS db_stats "+
"ON db.name = db_stats.database ORDER BY bytes_on_disk DESC LIMIT 20"] = &databasesFrame
summaryCollector := clickhouse.NewSummaryCollector(&platform.ResourceManager{
DbClient: client,
})
t.Run("test default summary collection", func(t *testing.T) {
bundle, errs := summaryCollector.Collect(config.Configuration{})
require.NoError(t, errs)
require.NotNil(t, bundle)
require.Len(t, bundle.Errors.Errors, 30)
require.Len(t, bundle.Frames, 2)
// check version frame
require.Contains(t, bundle.Frames, "version")
require.Equal(t, []string{"version()"}, bundle.Frames["version"].Columns())
checkFrame(t, bundle.Frames["version"], versionFrame.Rows)
//check databases frame
require.Contains(t, bundle.Frames, "databases")
require.Equal(t, []string{"name", "engine", "tables", "partitions", "parts", "disk_size"}, bundle.Frames["databases"].Columns())
checkFrame(t, bundle.Frames["databases"], databasesFrame.Rows)
client.Reset()
})
t.Run("test summary collection with limit", func(t *testing.T) {
conf := config.Configuration{
Params: []config.ConfigParam{
config.IntParam{
Value: 1,
Param: config.NewParam("row_limit", "Limit rows on supported queries.", false),
},
},
}
bundle, errs := summaryCollector.Collect(conf)
require.NoError(t, errs)
require.NotNil(t, bundle)
require.Len(t, bundle.Errors.Errors, 31)
// databases will be absent due to limit
require.Len(t, bundle.Frames, 1)
// check version frame
require.Contains(t, bundle.Frames, "version")
require.Equal(t, []string{"version()"}, bundle.Frames["version"].Columns())
checkFrame(t, bundle.Frames["version"], versionFrame.Rows)
client.QueryResponses["SELECT name, engine, tables, partitions, parts, formatReadableSize(bytes_on_disk) \"disk_size\" "+
"FROM system.databases db LEFT JOIN ( SELECT database, uniq(table) \"tables\", uniq(table, partition) \"partitions\", "+
"count() AS parts, sum(bytes_on_disk) \"bytes_on_disk\" FROM system.parts WHERE active GROUP BY database ) AS db_stats "+
"ON db.name = db_stats.database ORDER BY bytes_on_disk DESC LIMIT 1"] = &databasesFrame
bundle, errs = summaryCollector.Collect(conf)
require.NoError(t, errs)
require.NotNil(t, bundle)
require.Len(t, bundle.Errors.Errors, 30)
require.Len(t, bundle.Frames, 2)
require.Contains(t, bundle.Frames, "version")
//check databases frame
require.Contains(t, bundle.Frames, "databases")
require.Equal(t, []string{"name", "engine", "tables", "partitions", "parts", "disk_size"}, bundle.Frames["databases"].Columns())
// this will pass as our mock client does not read the statement (specifically the LIMIT clause) when called with execute
checkFrame(t, bundle.Frames["databases"], databasesFrame.Rows)
})
}
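
The tests above register canned responses under the full statement text, which suggests why the `databases` frame is missing until a response is registered for the `LIMIT 1` variant. The sketch below illustrates that idea with a hypothetical `fakeClient`; the real `test.NewFakeClickhouseClient` is not shown in this section and may behave differently.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeClient is a hypothetical stand-in for the test client: it returns a
// canned response only when the statement text matches a key exactly.
type fakeClient struct {
	responses map[string][]string
}

// execute looks the statement up verbatim; it does not interpret the SQL,
// so the canned rows come back in full regardless of any LIMIT clause.
func (c *fakeClient) execute(statement string) ([]string, error) {
	rows, ok := c.responses[statement]
	if !ok {
		return nil, errors.New("no canned response for statement: " + statement)
	}
	return rows, nil
}

func main() {
	client := &fakeClient{responses: map[string][]string{
		"SELECT name FROM system.databases LIMIT 20": {"default", "system"},
	}}

	// The rendered statement differs only in its limit, so the lookup misses.
	_, err := client.execute("SELECT name FROM system.databases LIMIT 1")
	fmt.Println("error for LIMIT 1:", err != nil)

	rows, err := client.execute("SELECT name FROM system.databases LIMIT 20")
	fmt.Println(rows, err)
}
```

Under this assumption, changing `row_limit` changes the rendered statement and therefore the lookup key, which matches the test registering a second response before collecting again.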

View File

@ -0,0 +1,164 @@
package clickhouse
import (
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/utils"
"github.com/pkg/errors"
)
// SystemDatabaseCollector collects the tables of the system database
type SystemDatabaseCollector struct {
resourceManager *platform.ResourceManager
}
const SystemDatabase = "system"
// ExcludeColumns columns if we need - this will be refined over time [table_name][columnA, columnB]
var ExcludeColumns = map[string][]string{}
// BannedTables is a hardcoded list of tables that are always excluded, even if the user doesn't list them in exclude_tables.
// Attempts to export them via include_tables will work, but we will warn.
var BannedTables = []string{"numbers", "zeros"}
// OrderBy contains a map of tables to an order by clause - by default we don't order table dumps
var OrderBy = map[string]data.OrderBy{
"errors": {
Column: "last_error_message",
Order: data.Desc,
},
"replication_queue": {
Column: "create_time",
Order: data.Asc,
},
}
func NewSystemDatabaseCollector(m *platform.ResourceManager) *SystemDatabaseCollector {
return &SystemDatabaseCollector{
resourceManager: m,
}
}
func (sc *SystemDatabaseCollector) Collect(conf config.Configuration) (*data.DiagnosticBundle, error) {
conf, err := conf.ValidateConfig(sc.Configuration())
if err != nil {
return &data.DiagnosticBundle{}, err
}
includeTables, err := config.ReadStringListValues(conf, "include_tables")
if err != nil {
return &data.DiagnosticBundle{}, err
}
excludeTables, err := config.ReadStringListValues(conf, "exclude_tables")
if err != nil {
return &data.DiagnosticBundle{}, err
}
rowLimit, err := config.ReadIntValue(conf, "row_limit")
if err != nil {
return &data.DiagnosticBundle{}, err
}
excludeTables = checkBannedTables(includeTables, excludeTables)
ds, err := sc.readSystemAllTables(includeTables, excludeTables, rowLimit)
if err != nil {
return &data.DiagnosticBundle{}, err
}
return ds, nil
}
// checkBannedTables adds every banned table to the exclude list unless it is already excluded or explicitly included.
// It returns the new exclude_tables list.
func checkBannedTables(includeTables []string, excludeTables []string) []string {
for _, bannedTable := range BannedTables {
// if it's specified we don't add it to our exclude list - explicitly included tables take precedence
if !utils.Contains(includeTables, bannedTable) && !utils.Contains(excludeTables, bannedTable) {
excludeTables = append(excludeTables, bannedTable)
}
}
return excludeTables
}
func (sc *SystemDatabaseCollector) readSystemAllTables(include []string, exclude []string, limit int64) (*data.DiagnosticBundle, error) {
tableNames, err := sc.resourceManager.DbClient.ReadTableNamesForDatabase(SystemDatabase)
if err != nil {
return nil, err
}
var frameErrors []error
if include != nil {
// nil means include everything
tableNames = utils.Intersection(tableNames, include)
if len(tableNames) != len(include) {
// we warn that some included tables aren't present in db
frameErrors = append(frameErrors, fmt.Errorf("some tables specified in the include_tables are not in the system database and will not be exported: %v",
utils.Distinct(include, tableNames)))
}
}
// exclude tables unless specified in includes
excludedTables := utils.Distinct(exclude, include)
tableNames = utils.Distinct(tableNames, excludedTables)
frames := make(map[string]data.Frame)
for _, tableName := range tableNames {
// a missing key yields the zero value: nil columns (exclude nothing) and an empty OrderBy (no ordering)
excludeColumns := ExcludeColumns[tableName]
orderBy := OrderBy[tableName]
frame, err := sc.resourceManager.DbClient.ReadTable(SystemDatabase, tableName, excludeColumns, orderBy, limit)
if err != nil {
frameErrors = append(frameErrors, errors.Wrapf(err, "Unable to collect %s", tableName))
} else {
frames[tableName] = frame
}
}
fErrors := data.FrameErrors{
Errors: frameErrors,
}
return &data.DiagnosticBundle{
Frames: frames,
Errors: fErrors,
}, nil
}
func (sc *SystemDatabaseCollector) Configuration() config.Configuration {
return config.Configuration{
Params: []config.ConfigParam{
config.StringListParam{
// nil means include everything
Values: nil,
Param: config.NewParam("include_tables", "Specify list of tables to collect. Takes precedence over exclude_tables. If not specified (default) all tables except exclude_tables.", false),
},
config.StringListParam{
Values: []string{"licenses", "distributed_ddl_queue", "query_thread_log", "query_log", "asynchronous_metric_log", "zookeeper", "aggregate_function_combinators", "collations", "contributors", "data_type_families", "formats", "graphite_retentions", "numbers", "numbers_mt", "one", "parts_columns", "projection_parts", "projection_parts_columns", "table_engines", "time_zones", "zeros", "zeros_mt"},
Param: config.NewParam("exclude_tables", "Specify list of tables to not collect.", false),
},
config.IntParam{
Value: 100000,
Param: config.NewParam("row_limit", "Maximum number of rows to collect from any table. Negative values mean unlimited.", false),
},
},
}
}
func (sc *SystemDatabaseCollector) IsDefault() bool {
return true
}
func (sc *SystemDatabaseCollector) Description() string {
return "Collects all tables in the system database, except those which have been excluded."
}
// here we register the collector for use
func init() {
collectors.Register("system_db", func() (collectors.Collector, error) {
return &SystemDatabaseCollector{
resourceManager: platform.GetResourceManager(),
}, nil
})
}
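
The interaction between `include_tables`, `exclude_tables` and the banned tables can be hard to follow from the collector alone. The sketch below replays that resolution with plain slices. The helpers `contains`, `distinct` and `intersection` are hypothetical re-implementations: the `utils` package is not shown here, and their semantics (for example, `distinct(a, b)` returning the elements of `a` that are not in `b`) are assumptions inferred from how the collector and its tests use them.

```go
package main

import "fmt"

var bannedTables = []string{"numbers", "zeros"}

// contains reports whether s is present in list.
func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// distinct returns the elements of a that are not in b (assumed semantics).
func distinct(a, b []string) []string {
	var out []string
	for _, v := range a {
		if !contains(b, v) {
			out = append(out, v)
		}
	}
	return out
}

// intersection returns the elements of a that are also in b (assumed semantics).
func intersection(a, b []string) []string {
	var out []string
	for _, v := range a {
		if contains(b, v) {
			out = append(out, v)
		}
	}
	return out
}

// resolve returns the tables that would be read, given all table names in
// the database plus the include and exclude lists.
func resolve(all, include, exclude []string) []string {
	// copy so the caller's slice is never modified
	excluded := append([]string{}, exclude...)
	for _, banned := range bannedTables {
		if !contains(include, banned) && !contains(excluded, banned) {
			excluded = append(excluded, banned)
		}
	}
	tables := all
	if include != nil { // nil means include everything
		tables = intersection(tables, include)
	}
	// explicitly included tables take precedence over exclusions
	return distinct(tables, distinct(excluded, include))
}

func main() {
	all := []string{"disks", "clusters", "users", "numbers"}
	fmt.Println(resolve(all, nil, []string{"disks"}))                         // prints "[clusters users]"
	fmt.Println(resolve(all, []string{"disks", "numbers"}, []string{"disks"})) // prints "[disks numbers]"
}
```

The second call shows both precedence rules at once: `disks` is exported although it is excluded, and the banned `numbers` table is exported because it was included explicitly.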

View File

@ -0,0 +1,365 @@
package clickhouse_test
import (
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors/clickhouse"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/test"
"github.com/stretchr/testify/require"
"testing"
)
func TestSystemConfiguration(t *testing.T) {
t.Run("correct configuration is returned for system db collector", func(t *testing.T) {
client := test.NewFakeClickhouseClient(make(map[string][]string))
systemDbCollector := clickhouse.NewSystemDatabaseCollector(&platform.ResourceManager{
DbClient: client,
})
conf := systemDbCollector.Configuration()
require.Len(t, conf.Params, 3)
// check first param
require.IsType(t, config.StringListParam{}, conf.Params[0])
includeTables, ok := conf.Params[0].(config.StringListParam)
require.True(t, ok)
require.False(t, includeTables.Required())
require.Equal(t, "include_tables", includeTables.Name())
require.Nil(t, includeTables.Values)
// check second param
require.IsType(t, config.StringListParam{}, conf.Params[1])
excludeTables, ok := conf.Params[1].(config.StringListParam)
require.True(t, ok)
require.False(t, excludeTables.Required())
require.Equal(t, "exclude_tables", excludeTables.Name())
require.Equal(t, []string{"licenses", "distributed_ddl_queue", "query_thread_log", "query_log", "asynchronous_metric_log", "zookeeper", "aggregate_function_combinators", "collations", "contributors", "data_type_families", "formats", "graphite_retentions", "numbers", "numbers_mt", "one", "parts_columns", "projection_parts", "projection_parts_columns", "table_engines", "time_zones", "zeros", "zeros_mt"}, excludeTables.Values)
// check third param
require.IsType(t, config.IntParam{}, conf.Params[2])
rowLimit, ok := conf.Params[2].(config.IntParam)
require.True(t, ok)
require.False(t, rowLimit.Required())
require.Equal(t, "row_limit", rowLimit.Name())
require.Equal(t, int64(100000), rowLimit.Value)
})
}
func TestSystemDbCollect(t *testing.T) {
diskFrame := test.NewFakeDataFrame("disks", []string{"name", "path", "free_space", "total_space", "keep_free_space", "type"},
[][]interface{}{
{"default", "/var/lib/clickhouse", 1729659346944, 1938213220352, "", "local"},
},
)
clusterFrame := test.NewFakeDataFrame("clusters", []string{"cluster", "shard_num", "shard_weight", "replica_num", "host_name", "host_address", "port", "is_local", "user", "default_database", "errors_count", "slowdowns_count", "estimated_recovery_time"},
[][]interface{}{
{"events", 1, 1, 1, "dalem-local-clickhouse-blue-1", "192.168.144.2", 9000, 1, "default", "", 0, 0, 0},
{"events", 2, 1, 1, "dalem-local-clickhouse-blue-2", "192.168.144.4", 9000, 1, "default", "", 0, 0, 0},
{"events", 3, 1, 1, "dalem-local-clickhouse-blue-3", "192.168.144.3", 9000, 1, "default", "", 0, 0, 0},
},
)
userFrame := test.NewFakeDataFrame("users", []string{"name", "id", "storage", "auth_type", "auth_params", "host_ip", "host_names", "host_names_regexp", "host_names_like"},
[][]interface{}{
{"default", "94309d50-4f52-5250-31bd-74fecac179db", "users.xml", "plaintext_password", "sha256_password", []string{"::0"}, []string{}, []string{}, []string{}},
},
)
dbTables := map[string][]string{
clickhouse.SystemDatabase: {"disks", "clusters", "users"},
}
client := test.NewFakeClickhouseClient(dbTables)
client.QueryResponses["SELECT * FROM system.disks LIMIT 100000"] = &diskFrame
client.QueryResponses["SELECT * FROM system.clusters LIMIT 100000"] = &clusterFrame
client.QueryResponses["SELECT * FROM system.users LIMIT 100000"] = &userFrame
systemDbCollector := clickhouse.NewSystemDatabaseCollector(&platform.ResourceManager{
DbClient: client,
})
t.Run("test default system db collection", func(t *testing.T) {
diagSet, err := systemDbCollector.Collect(config.Configuration{})
require.Nil(t, err)
require.NotNil(t, diagSet)
require.Len(t, diagSet.Errors.Errors, 0)
require.Len(t, diagSet.Frames, 3)
// disks frame
require.Equal(t, "disks", diagSet.Frames["disks"].Name())
require.Equal(t, diskFrame.ColumnNames, diagSet.Frames["disks"].Columns())
checkFrame(t, diagSet.Frames["disks"], diskFrame.Rows)
// clusters frame
require.Equal(t, "clusters", diagSet.Frames["clusters"].Name())
require.Equal(t, clusterFrame.ColumnNames, diagSet.Frames["clusters"].Columns())
checkFrame(t, diagSet.Frames["clusters"], clusterFrame.Rows)
// users frame
require.Equal(t, "users", diagSet.Frames["users"].Name())
require.Equal(t, userFrame.ColumnNames, diagSet.Frames["users"].Columns())
checkFrame(t, diagSet.Frames["users"], userFrame.Rows)
client.Reset()
})
t.Run("test when we pass an includes", func(t *testing.T) {
conf := config.Configuration{
Params: []config.ConfigParam{
config.StringListParam{
// only the disks table is included
Values: []string{"disks"},
Param: config.NewParam("include_tables", "Inclusion", false),
},
},
}
diagSet, err := systemDbCollector.Collect(conf)
require.Nil(t, err)
require.NotNil(t, diagSet)
require.Len(t, diagSet.Errors.Errors, 0)
require.Len(t, diagSet.Frames, 1)
// disks frame
require.Equal(t, "disks", diagSet.Frames["disks"].Name())
require.Equal(t, diskFrame.ColumnNames, diagSet.Frames["disks"].Columns())
checkFrame(t, diagSet.Frames["disks"], diskFrame.Rows)
client.Reset()
})
// test excludes
t.Run("test when we pass an excludes", func(t *testing.T) {
conf := config.Configuration{
Params: []config.ConfigParam{
config.StringListParam{
Values: []string{"disks"},
Param: config.NewParam("exclude_tables", "Exclusion", false),
},
},
}
diagSet, err := systemDbCollector.Collect(conf)
require.Nil(t, err)
require.NotNil(t, diagSet)
require.Len(t, diagSet.Errors.Errors, 0)
require.Len(t, diagSet.Frames, 2)
// clusters frame
require.Equal(t, "clusters", diagSet.Frames["clusters"].Name())
require.Equal(t, clusterFrame.ColumnNames, diagSet.Frames["clusters"].Columns())
checkFrame(t, diagSet.Frames["clusters"], clusterFrame.Rows)
// users frame
require.Equal(t, "users", diagSet.Frames["users"].Name())
require.Equal(t, userFrame.ColumnNames, diagSet.Frames["users"].Columns())
checkFrame(t, diagSet.Frames["users"], userFrame.Rows)
client.Reset()
})
// test includes which isn't in the list
t.Run("test when we pass an invalid includes", func(t *testing.T) {
conf := config.Configuration{
Params: []config.ConfigParam{
config.StringListParam{
// "invalid" does not exist in the system database
Values: []string{"disks", "invalid"},
Param: config.NewParam("include_tables", "Inclusion", false),
},
},
}
diagSet, err := systemDbCollector.Collect(conf)
require.Nil(t, err)
require.NotNil(t, diagSet)
require.Len(t, diagSet.Errors.Errors, 1)
require.Equal(t, "some tables specified in the include_tables are not in the "+
"system database and will not be exported: [invalid]", diagSet.Errors.Error())
require.Len(t, diagSet.Frames, 1)
// disks frame
require.Equal(t, "disks", diagSet.Frames["disks"].Name())
require.Equal(t, diskFrame.ColumnNames, diagSet.Frames["disks"].Columns())
checkFrame(t, diagSet.Frames["disks"], diskFrame.Rows)
client.Reset()
})
t.Run("test when we use a table with excluded fields", func(t *testing.T) {
excludeDefault := clickhouse.ExcludeColumns
client.QueryResponses["SELECT * EXCEPT(keep_free_space,type) FROM system.disks LIMIT 100000"] = &diskFrame
clickhouse.ExcludeColumns = map[string][]string{
"disks": {"keep_free_space", "type"},
}
conf := config.Configuration{
Params: []config.ConfigParam{
config.StringListParam{
// only the disks table is included
Values: []string{"disks"},
Param: config.NewParam("include_tables", "Inclusion", false),
},
},
}
diagSet, err := systemDbCollector.Collect(conf)
require.Nil(t, err)
require.NotNil(t, diagSet)
require.Len(t, diagSet.Errors.Errors, 0)
require.Len(t, diagSet.Frames, 1)
// disks frame
require.Equal(t, "disks", diagSet.Frames["disks"].Name())
require.Equal(t, []string{"name", "path", "free_space", "total_space"}, diagSet.Frames["disks"].Columns())
eDiskFrame := test.NewFakeDataFrame("disks", []string{"name", "path", "free_space", "total_space"},
[][]interface{}{
{"default", "/var/lib/clickhouse", 1729659346944, 1938213220352},
},
)
checkFrame(t, diagSet.Frames["disks"], eDiskFrame.Rows)
clickhouse.ExcludeColumns = excludeDefault
client.Reset()
})
t.Run("test with a low row limit", func(t *testing.T) {
conf := config.Configuration{
Params: []config.ConfigParam{
config.IntParam{
Value: 1,
Param: config.NewParam("row_limit", "Maximum number of rows to collect from any table. Negative values mean unlimited.", false),
},
},
}
client.QueryResponses["SELECT * FROM system.disks LIMIT 1"] = &diskFrame
client.QueryResponses["SELECT * FROM system.clusters LIMIT 1"] = &clusterFrame
client.QueryResponses["SELECT * FROM system.users LIMIT 1"] = &userFrame
diagSet, err := systemDbCollector.Collect(conf)
require.Nil(t, err)
require.NotNil(t, diagSet)
require.Len(t, diagSet.Errors.Errors, 0)
require.Len(t, diagSet.Frames, 3)
// clusters frame
require.Equal(t, "clusters", diagSet.Frames["clusters"].Name())
require.Equal(t, clusterFrame.ColumnNames, diagSet.Frames["clusters"].Columns())
lClusterFrame := test.NewFakeDataFrame("clusters", []string{"cluster", "shard_num", "shard_weight", "replica_num", "host_name", "host_address", "port", "is_local", "user", "default_database", "errors_count", "slowdowns_count", "estimated_recovery_time"},
[][]interface{}{
{"events", 1, 1, 1, "dalem-local-clickhouse-blue-1", "192.168.144.2", 9000, 1, "default", "", 0, 0, 0},
})
checkFrame(t, diagSet.Frames["clusters"], lClusterFrame.Rows)
client.Reset()
})
t.Run("test with a negative low row limit", func(t *testing.T) {
conf := config.Configuration{
Params: []config.ConfigParam{
config.IntParam{
Value: -23,
Param: config.NewParam("row_limit", "Maximum number of rows to collect from any table. Negative values mean unlimited.", false),
},
},
}
client.QueryResponses["SELECT * FROM system.clusters"] = &clusterFrame
client.QueryResponses["SELECT * FROM system.disks"] = &diskFrame
client.QueryResponses["SELECT * FROM system.users"] = &userFrame
diagSet, err := systemDbCollector.Collect(conf)
require.Nil(t, err)
require.NotNil(t, diagSet)
require.Len(t, diagSet.Errors.Errors, 0)
require.Len(t, diagSet.Frames, 3)
// disks frame
require.Equal(t, "disks", diagSet.Frames["disks"].Name())
require.Equal(t, diskFrame.ColumnNames, diagSet.Frames["disks"].Columns())
checkFrame(t, diagSet.Frames["disks"], diskFrame.Rows)
// clusters frame
require.Equal(t, "clusters", diagSet.Frames["clusters"].Name())
require.Equal(t, clusterFrame.ColumnNames, diagSet.Frames["clusters"].Columns())
checkFrame(t, diagSet.Frames["clusters"], clusterFrame.Rows)
// users frame
require.Equal(t, "users", diagSet.Frames["users"].Name())
require.Equal(t, userFrame.ColumnNames, diagSet.Frames["users"].Columns())
checkFrame(t, diagSet.Frames["users"], userFrame.Rows)
client.Reset()
})
t.Run("test that includes overrides excludes", func(t *testing.T) {
conf := config.Configuration{
Params: []config.ConfigParam{
config.StringListParam{
// disks is excluded here but also listed in include_tables below
Values: []string{"disks"},
Param: config.NewParam("exclude_tables", "Excluded", false),
},
config.StringListParam{
// an explicit include takes precedence over the exclusion above
Values: []string{"disks", "clusters", "users"},
Param: config.NewParam("include_tables", "Included", false),
},
},
}
diagSet, err := systemDbCollector.Collect(conf)
require.Nil(t, err)
require.NotNil(t, diagSet)
require.Len(t, diagSet.Errors.Errors, 0)
require.Len(t, diagSet.Frames, 3)
client.Reset()
})
t.Run("test banned", func(t *testing.T) {
bannedDefault := clickhouse.BannedTables
clickhouse.BannedTables = []string{"disks"}
diagSet, err := systemDbCollector.Collect(config.Configuration{})
require.Nil(t, err)
require.NotNil(t, diagSet)
require.Len(t, diagSet.Errors.Errors, 0)
require.Len(t, diagSet.Frames, 2)
require.Contains(t, diagSet.Frames, "users")
require.Contains(t, diagSet.Frames, "clusters")
clickhouse.BannedTables = bannedDefault
client.Reset()
})
t.Run("test banned unless included", func(t *testing.T) {
bannedDefault := clickhouse.BannedTables
clickhouse.BannedTables = []string{"disks"}
conf := config.Configuration{
Params: []config.ConfigParam{
config.StringListParam{
// explicitly including a banned table overrides the ban
Values: []string{"disks", "clusters", "users"},
Param: config.NewParam("include_tables", "Included", false),
},
},
}
diagSet, err := systemDbCollector.Collect(conf)
require.Nil(t, err)
require.NotNil(t, diagSet)
require.Len(t, diagSet.Errors.Errors, 0)
require.Len(t, diagSet.Frames, 3)
require.Contains(t, diagSet.Frames, "disks")
require.Contains(t, diagSet.Frames, "users")
require.Contains(t, diagSet.Frames, "clusters")
clickhouse.BannedTables = bannedDefault
client.Reset()
})
t.Run("tables are ordered if configured", func(t *testing.T) {
defaultOrderBy := clickhouse.OrderBy
clickhouse.OrderBy = map[string]data.OrderBy{
"clusters": {
Column: "shard_num",
Order: data.Desc,
},
}
client.QueryResponses["SELECT * FROM system.clusters ORDER BY shard_num DESC LIMIT 100000"] = &clusterFrame
diagSet, err := systemDbCollector.Collect(config.Configuration{})
require.Nil(t, err)
require.NotNil(t, diagSet)
require.Len(t, diagSet.Errors.Errors, 0)
require.Len(t, diagSet.Frames, 3)
clickhouse.OrderBy = defaultOrderBy
oClusterFrame := test.NewFakeDataFrame("clusters", []string{"cluster", "shard_num", "shard_weight", "replica_num", "host_name", "host_address", "port", "is_local", "user", "default_database", "errors_count", "slowdowns_count", "estimated_recovery_time"},
[][]interface{}{
{"events", 3, 1, 1, "dalem-local-clickhouse-blue-3", "192.168.144.3", 9000, 1, "default", "", 0, 0, 0},
{"events", 2, 1, 1, "dalem-local-clickhouse-blue-2", "192.168.144.4", 9000, 1, "default", "", 0, 0, 0},
{"events", 1, 1, 1, "dalem-local-clickhouse-blue-1", "192.168.144.2", 9000, 1, "default", "", 0, 0, 0},
},
)
checkFrame(t, diagSet.Frames["clusters"], oClusterFrame.Rows)
client.Reset()
})
}
func checkFrame(t *testing.T, frame data.Frame, rows [][]interface{}) {
i := 0
for {
values, ok, err := frame.Next()
require.Nil(t, err)
if !ok {
break
}
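// guard against an index out of range panic if the frame returns more rows than expected
require.Less(t, i, len(rows), "frame returned more rows than expected")
require.ElementsMatch(t, rows[i], values)
i += 1
}
require.Equal(t, len(rows), i)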
}
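
The `Next()` iteration contract that `checkFrame` relies on (values, an `ok` flag, and an error) can be sketched in isolation as below. This is a minimal illustration, not the project's code: `rowIterator` and `compareRows` are hypothetical names standing in for `data.Frame` and `checkFrame`, and the comparison uses `reflect.DeepEqual`, which is order-sensitive within a row, whereas `checkFrame` uses `require.ElementsMatch`.

```go
package main

import (
	"fmt"
	"reflect"
)

// rowIterator is a hypothetical in-memory stand-in for data.Frame.
type rowIterator struct {
	rows [][]interface{}
	pos  int
}

// Next returns the next row, or ok=false once the rows are exhausted.
func (it *rowIterator) Next() ([]interface{}, bool, error) {
	if it.pos >= len(it.rows) {
		return nil, false, nil
	}
	row := it.rows[it.pos]
	it.pos++
	return row, true, nil
}

// compareRows drains the iterator and reports the first mismatch as an error.
// It checks the index before reading expected[i], so surplus rows are reported
// as an error rather than causing a panic.
func compareRows(it *rowIterator, expected [][]interface{}) error {
	i := 0
	for {
		values, ok, err := it.Next()
		if err != nil {
			return err
		}
		if !ok {
			break
		}
		if i >= len(expected) {
			return fmt.Errorf("frame returned more than the %d expected rows", len(expected))
		}
		if !reflect.DeepEqual(expected[i], values) {
			return fmt.Errorf("row %d: expected %v, got %v", i, expected[i], values)
		}
		i++
	}
	if i != len(expected) {
		return fmt.Errorf("expected %d rows, got %d", len(expected), i)
	}
	return nil
}
```

Returning an error keeps the helper usable outside of a test; inside a test the result could simply be passed to `require.NoError`.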

@ -0,0 +1,152 @@
package clickhouse
import (
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/bmatcuk/doublestar/v4"
"github.com/pkg/errors"
"github.com/rs/zerolog/log"
"strings"
)
// This collector collects the system zookeeper db
type ZookeeperCollector struct {
resourceManager *platform.ResourceManager
}
func NewZookeeperCollector(m *platform.ResourceManager) *ZookeeperCollector {
return &ZookeeperCollector{
resourceManager: m,
}
}
func (zkc *ZookeeperCollector) Collect(conf config.Configuration) (*data.DiagnosticBundle, error) {
conf, err := conf.ValidateConfig(zkc.Configuration())
if err != nil {
return &data.DiagnosticBundle{}, err
}
pathPattern, err := config.ReadStringValue(conf, "path_pattern")
if err != nil {
return &data.DiagnosticBundle{}, err
}
defaultPattern, _ := zkc.Configuration().GetConfigParam("path_pattern")
if defaultPattern.(config.StringParam).Value != pathPattern {
log.Warn().Msgf("Using non-default zookeeper glob pattern [%s] - this can potentially cause high query load", pathPattern)
}
maxDepth, err := config.ReadIntValue(conf, "max_depth")
if err != nil {
return &data.DiagnosticBundle{}, err
}
rowLimit, err := config.ReadIntValue(conf, "row_limit")
if err != nil {
return &data.DiagnosticBundle{}, err
}
// we use doublestar for globs as it provides us with ** but also allows us to identify prefix or base paths
if !doublestar.ValidatePattern(pathPattern) {
// err is nil at this point, and wrapping a nil error returns nil, so create a new error instead
return &data.DiagnosticBundle{}, errors.Errorf("%s is not a valid pattern", pathPattern)
}
base, _ := doublestar.SplitPattern(pathPattern)
frames := make(map[string]data.Frame)
hFrame, frameErrors := zkc.collectSubFrames(base, pathPattern, rowLimit, 0, maxDepth)
fErrors := data.FrameErrors{
Errors: frameErrors,
}
frames["zookeeper_db"] = hFrame
return &data.DiagnosticBundle{
Frames: frames,
Errors: fErrors,
}, nil
}
// recursively iterates over the zookeeper sub tables to a max depth, applying the filter and max rows per table
func (zkc *ZookeeperCollector) collectSubFrames(path, pathPattern string, rowLimit, currentDepth, maxDepth int64) (data.HierarchicalFrame, []error) {
var frameErrors []error
var subFrames []data.HierarchicalFrame
currentDepth += 1
if currentDepth == maxDepth {
return data.HierarchicalFrame{}, frameErrors
}
match, err := doublestar.PathMatch(pathPattern, path)
if err != nil {
frameErrors = append(frameErrors, errors.Wrapf(err, "Path match failed for pattern %s with path %s", pathPattern, path))
return data.HierarchicalFrame{}, frameErrors
}
// we allow a single level to be examined or we never get going
if !match && currentDepth > 1 {
return data.HierarchicalFrame{}, frameErrors
}
// negative row limits mean unlimited, so only add a LIMIT clause for non-negative values
limitClause := ""
if rowLimit >= 0 {
limitClause = fmt.Sprintf(" LIMIT %d", rowLimit)
}
frame, err := zkc.resourceManager.DbClient.ExecuteStatement(path, fmt.Sprintf("SELECT name FROM system.zookeeper WHERE path='%s'%s", path, limitClause))
if err != nil {
frameErrors = append(frameErrors, errors.Wrapf(err, "Unable to read zookeeper table path for sub paths %s", path))
return data.HierarchicalFrame{}, frameErrors
}
// this isn't ideal - we re-execute the query and add the result to our collection, as it will be consumed lazily by the output
outputFrame, err := zkc.resourceManager.DbClient.ExecuteStatement(path, fmt.Sprintf("SELECT * FROM system.zookeeper WHERE path='%s'%s", path, limitClause))
if err != nil {
frameErrors = append(frameErrors, errors.Wrapf(err, "Unable to read zookeeper table path %s", path))
return data.HierarchicalFrame{}, frameErrors
}
frameComponents := strings.Split(path, "/")
frameId := frameComponents[len(frameComponents)-1]
for {
values, ok, err := frame.Next()
if err != nil {
frameErrors = append(frameErrors, errors.Wrapf(err, "unable to read frame %s", frame.Name()))
return data.NewHierarchicalFrame(frameId, outputFrame, subFrames), frameErrors
}
if !ok {
return data.NewHierarchicalFrame(frameId, outputFrame, subFrames), frameErrors
}
subName := fmt.Sprintf("%v", values[0])
subPath := fmt.Sprintf("%s/%s", path, subName)
subFrame, errs := zkc.collectSubFrames(subPath, pathPattern, rowLimit, currentDepth, maxDepth)
if subFrame.Name() != "" {
subFrames = append(subFrames, subFrame)
}
frameErrors = append(frameErrors, errs...)
}
}
func (zkc *ZookeeperCollector) Configuration() config.Configuration {
return config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "/clickhouse/{task_queue}/**",
Param: config.NewParam("path_pattern", "Glob pattern for zookeeper path matching. Change with caution.", false),
},
config.IntParam{
Value: 8,
Param: config.NewParam("max_depth", "Max depth for zookeeper navigation.", false),
},
config.IntParam{
Value: 10,
Param: config.NewParam("row_limit", "Maximum number of rows/sub nodes to collect/expand from any zookeeper leaf. Negative values mean unlimited.", false),
},
},
}
}
func (zkc *ZookeeperCollector) IsDefault() bool {
return false
}
func (zkc *ZookeeperCollector) Description() string {
return "Collects Zookeeper information available within ClickHouse."
}
// here we register the collector for use
func init() {
collectors.Register("zookeeper_db", func() (collectors.Collector, error) {
return &ZookeeperCollector{
resourceManager: platform.GetResourceManager(),
}, nil
})
}
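
The recursion in `collectSubFrames` combines a depth cap with a per-node child limit. The sketch below shows only that shape against an in-memory tree; it is an illustration under stated assumptions, not the collector itself. `node`, `tree` and `walk` are hypothetical names, the map lookup stands in for the `SELECT name FROM system.zookeeper` query, glob matching is omitted entirely, and the depth check uses `>=` rather than the collector's `==`.

```go
package main

import (
	"sort"
	"strings"
)

// node is a hypothetical stand-in for data.HierarchicalFrame.
type node struct {
	Name     string
	Children []node
}

// tree maps a path to the names of its direct children. It stands in for
// querying system.zookeeper for the children of a path.
type tree map[string][]string

// walk visits path and its descendants. It stops once depth reaches maxDepth
// and expands at most rowLimit children per node; a negative rowLimit means
// unlimited. The second return value is false when the depth cap was hit.
func walk(t tree, path string, depth, maxDepth, rowLimit int) (node, bool) {
	depth++
	if depth >= maxDepth {
		return node{}, false
	}
	// copy before sorting so the caller's tree is left untouched
	children := append([]string(nil), t[path]...)
	sort.Strings(children)
	if rowLimit >= 0 && len(children) > rowLimit {
		children = children[:rowLimit]
	}
	// the node is named after the last path component, as in the collector
	n := node{Name: path[strings.LastIndex(path, "/")+1:]}
	for _, child := range children {
		if sub, ok := walk(t, path+"/"+child, depth, maxDepth, rowLimit); ok {
			n.Children = append(n.Children, sub)
		}
	}
	return n, true
}
```

Called as `walk(t, "/clickhouse", 0, 8, 10)`, the arguments mirror the collector's default `max_depth` and `row_limit`. Children are sorted here only to make the output deterministic.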

@ -0,0 +1,101 @@
package clickhouse_test
import (
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors/clickhouse"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/test"
"github.com/stretchr/testify/require"
"testing"
)
func TestZookeeperConfiguration(t *testing.T) {
t.Run("correct configuration is returned for system zookeeper collector", func(t *testing.T) {
client := test.NewFakeClickhouseClient(make(map[string][]string))
zkCollector := clickhouse.NewZookeeperCollector(&platform.ResourceManager{
DbClient: client,
})
conf := zkCollector.Configuration()
require.Len(t, conf.Params, 3)
// check first param
require.IsType(t, config.StringParam{}, conf.Params[0])
pathPattern, ok := conf.Params[0].(config.StringParam)
require.True(t, ok)
require.False(t, pathPattern.Required())
require.Equal(t, "path_pattern", pathPattern.Name())
require.Equal(t, "/clickhouse/{task_queue}/**", pathPattern.Value)
// check second param
require.IsType(t, config.IntParam{}, conf.Params[1])
maxDepth, ok := conf.Params[1].(config.IntParam)
require.True(t, ok)
require.False(t, maxDepth.Required())
require.Equal(t, "max_depth", maxDepth.Name())
require.Equal(t, int64(8), maxDepth.Value)
// check third param
require.IsType(t, config.IntParam{}, conf.Params[2])
rowLimit, ok := conf.Params[2].(config.IntParam)
require.True(t, ok)
require.False(t, rowLimit.Required())
require.Equal(t, "row_limit", rowLimit.Name())
require.Equal(t, int64(10), rowLimit.Value)
})
}
func TestZookeeperCollect(t *testing.T) {
level1 := test.NewFakeDataFrame("level_1", []string{"name", "value", "czxid", "mzxid", "ctime", "mtime", "version", "cversion", "aversion", "ephemeralOwner", "dataLength", "numChildren", "pzxid", "path"},
[][]interface{}{
{"name", "value", "czxid", "mzxid", "ctime", "mtime", "version", "cversion", "aversion", "ephemeralOwner", "dataLength", "numChildren", "pzxid", "path"},
{"task_queue", "", "4", "4", "2022-02-22 13:30:15", "2022-02-22 13:30:15", "0", "1", "0", "0", "0", "1", "5", "/clickhouse"},
{"copytasks", "", "525608", "525608", "2022-03-09 13:47:39", "2022-03-09 13:47:39", "0", "7", "0", "0", "0", "7", "526100", "/clickhouse"},
},
)
level2 := test.NewFakeDataFrame("level_2", []string{"name", "value", "czxid", "mzxid", "ctime", "mtime", "version", "cversion", "aversion", "ephemeralOwner", "dataLength", "numChildren", "pzxid", "path"},
[][]interface{}{
{"ddl", "", "5", "5", "2022-02-22 13:30:15", "2022-02-22 13:30:15", "0", "0", "0", "0", "0", "0", "5", "/clickhouse/task_queue"},
},
)
level3 := test.NewFakeDataFrame("level_3", []string{"name", "value", "czxid", "mzxid", "ctime", "mtime", "version", "cversion", "aversion", "ephemeralOwner", "dataLength", "numChildren", "pzxid", "path"},
[][]interface{}{},
)
dbTables := map[string][]string{
clickhouse.SystemDatabase: {"zookeeper"},
}
client := test.NewFakeClickhouseClient(dbTables)
client.QueryResponses["SELECT name FROM system.zookeeper WHERE path='/clickhouse' LIMIT 10"] = &level1
// can't reuse the frame as the first frame will be iterated as part of the recursive zookeeper search performed by the collector
cLevel1 := test.NewFakeDataFrame("level_1", level1.Columns(), level1.Rows)
client.QueryResponses["SELECT * FROM system.zookeeper WHERE path='/clickhouse' LIMIT 10"] = &cLevel1
client.QueryResponses["SELECT name FROM system.zookeeper WHERE path='/clickhouse/task_queue' LIMIT 10"] = &level2
cLevel2 := test.NewFakeDataFrame("level_2", level2.Columns(), level2.Rows)
client.QueryResponses["SELECT * FROM system.zookeeper WHERE path='/clickhouse/task_queue' LIMIT 10"] = &cLevel2
client.QueryResponses["SELECT name FROM system.zookeeper WHERE path='/clickhouse/task_queue/ddl' LIMIT 10"] = &level3
cLevel3 := test.NewFakeDataFrame("level_3", level3.Columns(), level3.Rows)
client.QueryResponses["SELECT * FROM system.zookeeper WHERE path='/clickhouse/task_queue/ddl' LIMIT 10"] = &cLevel3
zKCollector := clickhouse.NewZookeeperCollector(&platform.ResourceManager{
DbClient: client,
})
t.Run("test default zookeeper collection", func(t *testing.T) {
diagSet, err := zKCollector.Collect(config.Configuration{})
require.Nil(t, err)
require.NotNil(t, diagSet)
require.Len(t, diagSet.Errors.Errors, 0)
require.Len(t, diagSet.Frames, 1)
require.Contains(t, diagSet.Frames, "zookeeper_db")
require.Equal(t, "clickhouse", diagSet.Frames["zookeeper_db"].Name())
require.IsType(t, data.HierarchicalFrame{}, diagSet.Frames["zookeeper_db"])
checkFrame(t, diagSet.Frames["zookeeper_db"], level1.Rows)
require.Equal(t, level1.Columns(), diagSet.Frames["zookeeper_db"].Columns())
hierarchicalFrame := diagSet.Frames["zookeeper_db"].(data.HierarchicalFrame)
require.Len(t, hierarchicalFrame.SubFrames, 1)
checkFrame(t, hierarchicalFrame.SubFrames[0], cLevel2.Rows)
require.Equal(t, cLevel2.Columns(), hierarchicalFrame.SubFrames[0].Columns())
hierarchicalFrame = hierarchicalFrame.SubFrames[0]
require.Len(t, hierarchicalFrame.SubFrames, 1)
checkFrame(t, hierarchicalFrame.SubFrames[0], cLevel3.Rows)
require.Equal(t, cLevel3.Columns(), hierarchicalFrame.SubFrames[0].Columns())
})
}

@ -0,0 +1,74 @@
package collectors
import (
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/pkg/errors"
"github.com/rs/zerolog/log"
)
type Collector interface {
Collect(config config.Configuration) (*data.DiagnosticBundle, error)
Configuration() config.Configuration
IsDefault() bool
Description() string
}
// Register can be called from init() on a collector in this package
// It will automatically be added to the Collectors map to be called externally
func Register(name string, collector CollectorFactory) {
if name == "diag_trace" {
// we use this to record errors and warnings
log.Fatal().Msgf("diag_trace is a reserved collector name")
}
// names must be unique
if _, ok := Collectors[name]; ok {
log.Fatal().Msgf("More than 1 collector is trying to register under the name %s. Names must be unique.", name)
}
Collectors[name] = collector
}
// CollectorFactory lets us use a closure to get instances of the collector struct
type CollectorFactory func() (Collector, error)
var Collectors = map[string]CollectorFactory{}
func GetCollectorNames(defaultOnly bool) []string {
// can't pre-allocate as not all collectors may be default
var collectors []string
for collectorName := range Collectors {
collector, err := GetCollectorByName(collectorName)
if err != nil {
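// a zerolog event is only written (and Fatal only exits) once Msg, Msgf or Send is called
log.Fatal().Err(err).Msgf("collector %s could not be initialized", collectorName)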
}
if !defaultOnly || collector.IsDefault() {
collectors = append(collectors, collectorName)
}
}
return collectors
}
func GetCollectorByName(name string) (Collector, error) {
if collectorFactory, ok := Collectors[name]; ok {
// build a fresh collector instance from the registered factory
collector, err := collectorFactory()
if err != nil {
return nil, errors.Wrapf(err, "collector %s could not be initialized", name)
}
return collector, nil
}
return nil, fmt.Errorf("%s is not a valid collector name", name)
}
func BuildConfigurationOptions() (map[string]config.Configuration, error) {
configurations := make(map[string]config.Configuration)
for name, collectorFactory := range Collectors {
collector, err := collectorFactory()
if err != nil {
return nil, errors.Wrapf(err, "collector %s could not be initialized", name)
}
configurations[name] = collector.Configuration()
}
return configurations, nil
}
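
The registration pattern above (factories stored in a map, keyed by a unique name) can be sketched without the logging dependency as follows. This is a hypothetical variant rather than the package's API: `Registry`, `Factory`, `fakeCollector` and `fakeFactory` are illustrative names, `Register` returns an error instead of calling `log.Fatal`, and names are sorted because Go map iteration order is unspecified.

```go
package main

import (
	"fmt"
	"sort"
)

// Collector is a trimmed-down stand-in for the Collector interface.
type Collector interface {
	IsDefault() bool
}

// Factory mirrors CollectorFactory: a closure that builds a collector on demand.
type Factory func() (Collector, error)

// Registry is a hypothetical alternative to a package-level map.
type Registry struct {
	factories map[string]Factory
}

func NewRegistry() *Registry {
	return &Registry{factories: make(map[string]Factory)}
}

// Register rejects the reserved name and duplicates by returning an error.
func (r *Registry) Register(name string, f Factory) error {
	if name == "diag_trace" {
		return fmt.Errorf("%s is a reserved collector name", name)
	}
	if _, ok := r.factories[name]; ok {
		return fmt.Errorf("collector %s is already registered", name)
	}
	r.factories[name] = f
	return nil
}

// Names returns the registered names in sorted order, optionally restricted
// to collectors that report themselves as default.
func (r *Registry) Names(defaultOnly bool) ([]string, error) {
	var names []string
	for name, f := range r.factories {
		c, err := f()
		if err != nil {
			return nil, fmt.Errorf("collector %s could not be initialized: %w", name, err)
		}
		if !defaultOnly || c.IsDefault() {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names, nil
}

// fakeCollector is a test double used only for illustration.
type fakeCollector struct{ isDefault bool }

func (f fakeCollector) IsDefault() bool { return f.isDefault }

func fakeFactory(isDefault bool) Factory {
	return func() (Collector, error) { return fakeCollector{isDefault: isDefault}, nil }
}
```

Returning errors makes duplicate registration testable, at the cost of every `init()` having to handle the result; terminating at start-up, as the package does, is the simpler choice when registration only ever happens in `init()`.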

@ -0,0 +1,56 @@
package collectors_test
import (
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors"
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors/clickhouse"
_ "github.com/ClickHouse/clickhouse-diagnostics/internal/collectors/system"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/stretchr/testify/require"
"testing"
)
func TestGetCollectorNames(t *testing.T) {
t.Run("can get all collector names", func(t *testing.T) {
collectorNames := collectors.GetCollectorNames(false)
require.ElementsMatch(t, []string{"system_db", "config", "summary", "system", "logs", "db_logs", "file", "command", "zookeeper_db"}, collectorNames)
})
t.Run("can get default collector names", func(t *testing.T) {
collectorNames := collectors.GetCollectorNames(true)
require.ElementsMatch(t, []string{"system_db", "config", "summary", "system", "logs", "db_logs"}, collectorNames)
})
}
func TestGetCollectorByName(t *testing.T) {
t.Run("can get collector by name", func(t *testing.T) {
collector, err := collectors.GetCollectorByName("system_db")
require.Nil(t, err)
require.Equal(t, clickhouse.NewSystemDatabaseCollector(platform.GetResourceManager()), collector)
})
t.Run("fails on non existing collector", func(t *testing.T) {
collector, err := collectors.GetCollectorByName("random")
require.NotNil(t, err)
require.Equal(t, "random is not a valid collector name", err.Error())
require.Nil(t, collector)
})
}
func TestBuildConfigurationOptions(t *testing.T) {
t.Run("can get all collector configurations", func(t *testing.T) {
configs, err := collectors.BuildConfigurationOptions()
require.Nil(t, err)
require.Len(t, configs, 9)
require.Contains(t, configs, "system_db")
require.Contains(t, configs, "config")
require.Contains(t, configs, "summary")
require.Contains(t, configs, "system")
require.Contains(t, configs, "logs")
require.Contains(t, configs, "db_logs")
require.Contains(t, configs, "file")
require.Contains(t, configs, "command")
require.Contains(t, configs, "zookeeper_db")
})
}

@ -0,0 +1,89 @@
package system
import (
"bytes"
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/google/shlex"
"github.com/pkg/errors"
"os/exec"
)
// This collector runs a user specified command and collects it to a file
type CommandCollector struct {
resourceManager *platform.ResourceManager
}
func NewCommandCollector(m *platform.ResourceManager) *CommandCollector {
return &CommandCollector{
resourceManager: m,
}
}
func (c *CommandCollector) Collect(conf config.Configuration) (*data.DiagnosticBundle, error) {
conf, err := conf.ValidateConfig(c.Configuration())
if err != nil {
return &data.DiagnosticBundle{}, err
}
command, err := config.ReadStringValue(conf, "command")
if err != nil {
return &data.DiagnosticBundle{}, err
}
var frameErrors []error
// shlex to split the commands and args
cmdArgs, err := shlex.Split(command)
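if err != nil {
return &data.DiagnosticBundle{}, errors.Wrap(err, "Unable to parse command")
}
// wrapping a nil error returns nil, so the empty case needs its own error
if len(cmdArgs) == 0 {
return &data.DiagnosticBundle{}, errors.New("Unable to parse command: no arguments found")
}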
cmd := exec.Command(cmdArgs[0], cmdArgs[1:]...)
var stdout, stderr bytes.Buffer
cmd.Stdout = &stdout
cmd.Stderr = &stderr
err = cmd.Run()
var sError string
if err != nil {
frameErrors = append(frameErrors, errors.Wrap(err, "Unable to execute command"))
sError = err.Error()
}
memoryFrame := data.NewMemoryFrame("output", []string{"command", "stdout", "stderr", "error"}, [][]interface{}{
{command, stdout.String(), stderr.String(), sError},
})
return &data.DiagnosticBundle{
Errors: data.FrameErrors{Errors: frameErrors},
Frames: map[string]data.Frame{
"output": memoryFrame,
},
}, nil
}
func (c *CommandCollector) Configuration() config.Configuration {
return config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "",
Param: config.NewParam("command", "Command to execute", true),
AllowEmpty: false,
},
},
}
}
func (c *CommandCollector) IsDefault() bool {
return false
}
func (c *CommandCollector) Description() string {
return "Allows collection of the output from a user specified command"
}
// here we register the collector for use
func init() {
collectors.Register("command", func() (collectors.Collector, error) {
return &CommandCollector{
resourceManager: platform.GetResourceManager(),
}, nil
})
}
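
The split, run and capture flow of the command collector can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: `splitCommand`, `run` and `result` are hypothetical names, and `strings.Fields` stands in for `shlex.Split`, so quoted arguments are not handled here.

```go
package main

import (
	"bytes"
	"errors"
	"os/exec"
	"strings"
)

// result holds the captured output of a command.
type result struct {
	Stdout string
	Stderr string
}

// splitCommand splits on whitespace. Unlike shlex.Split it has no notion of
// quoting; it is used here only to keep the sketch free of dependencies.
func splitCommand(command string) ([]string, error) {
	args := strings.Fields(command)
	if len(args) == 0 {
		return nil, errors.New("unable to parse command: no arguments found")
	}
	return args, nil
}

// run executes the command and returns whatever was written to stdout and
// stderr. The output is returned even when the command fails, since a
// failing command often explains itself on stderr.
func run(command string) (result, error) {
	args, err := splitCommand(command)
	if err != nil {
		return result{}, err
	}
	var stdout, stderr bytes.Buffer
	cmd := exec.Command(args[0], args[1:]...)
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr
	err = cmd.Run()
	return result{Stdout: stdout.String(), Stderr: stderr.String()}, err
}
```

A whitespace-only command yields an explicit error from `splitCommand`, rather than reaching `exec.Command` with an empty argument list.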

@ -0,0 +1,106 @@
package system_test
import (
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors/system"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/stretchr/testify/require"
"testing"
)
func TestCommandConfiguration(t *testing.T) {
t.Run("correct configuration is returned for command collector", func(t *testing.T) {
commandCollector := system.NewCommandCollector(&platform.ResourceManager{})
conf := commandCollector.Configuration()
require.Len(t, conf.Params, 1)
require.IsType(t, config.StringParam{}, conf.Params[0])
command, ok := conf.Params[0].(config.StringParam)
require.True(t, ok)
require.True(t, command.Required())
require.Equal(t, "command", command.Name())
require.Equal(t, "", command.Value)
})
}
func TestCommandCollect(t *testing.T) {
t.Run("test simple command with args", func(t *testing.T) {
commandCollector := system.NewCommandCollector(&platform.ResourceManager{})
bundle, err := commandCollector.Collect(config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "ls -l ../../../testdata",
Param: config.NewParam("command", "Command to execute", true),
AllowEmpty: false,
},
},
})
require.Nil(t, err)
require.Nil(t, bundle.Errors.Errors)
require.Len(t, bundle.Frames, 1)
require.Contains(t, bundle.Frames, "output")
require.Equal(t, []string{"command", "stdout", "stderr", "error"}, bundle.Frames["output"].Columns())
memFrame := bundle.Frames["output"].(data.MemoryFrame)
values, ok, err := memFrame.Next()
require.True(t, ok)
require.Nil(t, err)
fmt.Println(values)
require.Len(t, values, 4)
require.Equal(t, "ls -l ../../../testdata", values[0])
require.Contains(t, values[1], "configs")
require.Contains(t, values[1], "docker")
require.Contains(t, values[1], "log")
require.Equal(t, "", values[2])
require.Equal(t, "", values[3])
values, ok, err = memFrame.Next()
require.False(t, ok)
require.Nil(t, err)
require.Nil(t, values)
})
t.Run("test empty command", func(t *testing.T) {
commandCollector := system.NewCommandCollector(&platform.ResourceManager{})
bundle, err := commandCollector.Collect(config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "",
Param: config.NewParam("command", "Command to execute", true),
AllowEmpty: false,
},
},
})
require.Equal(t, "parameter command is invalid - command cannot be empty", err.Error())
require.Equal(t, &data.DiagnosticBundle{}, bundle)
})
t.Run("test invalid command", func(t *testing.T) {
commandCollector := system.NewCommandCollector(&platform.ResourceManager{})
bundle, err := commandCollector.Collect(config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "ls --invalid ../../../testdata",
Param: config.NewParam("command", "Command to execute", true),
AllowEmpty: false,
},
},
})
// commands may error with output - we still capture on stderr
require.Nil(t, err)
require.Len(t, bundle.Errors.Errors, 1)
require.Equal(t, "Unable to execute command: exit status 2", bundle.Errors.Errors[0].Error())
require.Len(t, bundle.Frames, 1)
require.Contains(t, bundle.Frames, "output")
require.Equal(t, []string{"command", "stdout", "stderr", "error"}, bundle.Frames["output"].Columns())
memFrame := bundle.Frames["output"].(data.MemoryFrame)
values, ok, err := memFrame.Next()
require.True(t, ok)
require.Nil(t, err)
require.Len(t, values, 4)
require.Equal(t, "ls --invalid ../../../testdata", values[0])
require.Equal(t, "", values[1])
// exact values here may vary on platform
require.NotEmpty(t, values[2])
require.NotEmpty(t, values[3])
})
}

@ -0,0 +1,99 @@
package system
import (
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/pkg/errors"
"github.com/rs/zerolog/log"
"github.com/yargevad/filepathx"
"os"
)
// This collector collects arbitrary user files
type FileCollector struct {
resourceManager *platform.ResourceManager
}
func NewFileCollector(m *platform.ResourceManager) *FileCollector {
return &FileCollector{
resourceManager: m,
}
}
func (f *FileCollector) Collect(conf config.Configuration) (*data.DiagnosticBundle, error) {
conf, err := conf.ValidateConfig(f.Configuration())
if err != nil {
return &data.DiagnosticBundle{}, err
}
filePattern, err := config.ReadStringValue(conf, "file_pattern")
if err != nil {
return &data.DiagnosticBundle{}, err
}
var frameErrors []error
// this util package supports recursive file matching e.g. /**/*
matches, err := filepathx.Glob(filePattern)
if err != nil {
return &data.DiagnosticBundle{}, errors.Wrapf(err, "Invalid file_pattern \"%s\"", filePattern)
}
if len(matches) == 0 {
frameErrors = append(frameErrors, errors.New("0 files match glob pattern"))
return &data.DiagnosticBundle{
Errors: data.FrameErrors{Errors: frameErrors},
}, nil
}
var filePaths []string
for _, match := range matches {
fi, err := os.Stat(match)
if err != nil {
frameErrors = append(frameErrors, errors.Wrapf(err, "Unable to read file %s", match))
// fi is nil when Stat fails, so skip this match rather than dereference it below
continue
}
if !fi.IsDir() {
log.Debug().Msgf("Collecting file %s", match)
filePaths = append(filePaths, match)
}
}
frame := data.NewFileFrame("collection", filePaths)
return &data.DiagnosticBundle{
Errors: data.FrameErrors{Errors: frameErrors},
Frames: map[string]data.Frame{
"collection": frame,
},
}, nil
}
func (f *FileCollector) Configuration() config.Configuration {
return config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "",
Param: config.NewParam("file_pattern", "Glob based pattern to specify files for collection", true),
AllowEmpty: false,
},
},
}
}
func (f *FileCollector) IsDefault() bool {
return false
}
func (f *FileCollector) Description() string {
return "Allows collection of user specified files"
}
// here we register the collector for use
func init() {
collectors.Register("file", func() (collectors.Collector, error) {
return &FileCollector{
resourceManager: platform.GetResourceManager(),
}, nil
})
}
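
The glob-then-stat loop of the file collector can be sketched as below, skipping any match that cannot be inspected. This is an illustration only: `regularFiles` is a hypothetical helper, and it uses the standard library's `filepath.Glob`, which does not support the recursive `**` patterns that `filepathx.Glob` provides.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// regularFiles expands pattern and returns the matches that are not
// directories. Matches that cannot be inspected are reported in statErrs and
// skipped; err is only set when the pattern itself is malformed.
func regularFiles(pattern string) (files []string, statErrs []error, err error) {
	matches, err := filepath.Glob(pattern)
	if err != nil {
		return nil, nil, fmt.Errorf("invalid file pattern %q: %w", pattern, err)
	}
	for _, match := range matches {
		fi, statErr := os.Stat(match)
		if statErr != nil {
			statErrs = append(statErrs, fmt.Errorf("unable to read file %s: %w", match, statErr))
			// fi is nil here, so move on to the next match
			continue
		}
		if !fi.IsDir() {
			files = append(files, match)
		}
	}
	return files, statErrs, nil
}
```

Keeping per-file failures separate from the pattern error mirrors the collector's split between frame errors and the returned `error`.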

@ -0,0 +1,109 @@
package system_test
import (
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors/system"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/stretchr/testify/require"
"testing"
)
func TestFileConfiguration(t *testing.T) {
t.Run("correct configuration is returned for file collector", func(t *testing.T) {
fileCollector := system.NewFileCollector(&platform.ResourceManager{})
conf := fileCollector.Configuration()
require.Len(t, conf.Params, 1)
require.IsType(t, config.StringParam{}, conf.Params[0])
filePattern, ok := conf.Params[0].(config.StringParam)
require.True(t, ok)
require.True(t, filePattern.Required())
require.Equal(t, "file_pattern", filePattern.Name())
require.Equal(t, "", filePattern.Value)
})
}
func TestFileCollect(t *testing.T) {
t.Run("test filter patterns work", func(t *testing.T) {
fileCollector := system.NewFileCollector(&platform.ResourceManager{})
bundle, err := fileCollector.Collect(config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "../../../testdata/**/*.xml",
Param: config.NewParam("file_pattern", "Glob based pattern to specify files for collection", true),
AllowEmpty: false,
},
},
})
require.Nil(t, err)
require.Nil(t, bundle.Errors.Errors)
checkFileBundle(t, bundle,
[]string{"../../../testdata/configs/include/xml/server-include.xml",
"../../../testdata/configs/include/xml/user-include.xml",
"../../../testdata/configs/xml/config.xml",
"../../../testdata/configs/xml/users.xml",
"../../../testdata/configs/xml/users.d/default-password.xml",
"../../../testdata/configs/yandex_xml/config.xml",
"../../../testdata/docker/admin.xml",
"../../../testdata/docker/custom.xml"})
})
t.Run("invalid file patterns are detected", func(t *testing.T) {
fileCollector := system.NewFileCollector(&platform.ResourceManager{})
bundle, err := fileCollector.Collect(config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "",
Param: config.NewParam("file_pattern", "Glob based pattern to specify files for collection", true),
AllowEmpty: false,
},
},
})
require.NotNil(t, err)
require.Equal(t, "parameter file_pattern is invalid - file_pattern cannot be empty", err.Error())
require.Equal(t, &data.DiagnosticBundle{}, bundle)
})
t.Run("check empty matches are reported", func(t *testing.T) {
fileCollector := system.NewFileCollector(&platform.ResourceManager{})
bundle, err := fileCollector.Collect(config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "../../../testdata/**/*.random",
Param: config.NewParam("file_pattern", "Glob based pattern to specify files for collection", true),
AllowEmpty: false,
},
},
})
require.Nil(t, err)
require.Nil(t, bundle.Frames)
require.Len(t, bundle.Errors.Errors, 1)
require.Equal(t, "0 files match glob pattern", bundle.Errors.Errors[0].Error())
})
}
func checkFileBundle(t *testing.T, bundle *data.DiagnosticBundle, expectedFiles []string) {
require.NotNil(t, bundle)
require.Nil(t, bundle.Errors.Errors)
require.Len(t, bundle.Frames, 1)
require.Contains(t, bundle.Frames, "collection")
dirFrame, ok := bundle.Frames["collection"].(data.FileFrame)
require.True(t, ok)
require.Equal(t, []string{"files"}, dirFrame.Columns())
i := 0
for {
values, ok, err := dirFrame.Next()
require.Nil(t, err)
if !ok {
break
}
require.Len(t, values, 1)
file, ok := values[0].(data.SimpleFile)
require.True(t, ok)
require.Contains(t, expectedFiles, file.FilePath())
i += 1
}
require.Equal(t, len(expectedFiles), i)
}

View File

@ -0,0 +1,234 @@
package system
import (
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/elastic/gosigar"
"github.com/jaypipes/ghw"
"github.com/matishsiao/goInfo"
"github.com/pkg/errors"
"strings"
)
// This collector collects the system overview
type SystemCollector struct {
resourceManager *platform.ResourceManager
}
func NewSystemCollector(m *platform.ResourceManager) *SystemCollector {
return &SystemCollector{
resourceManager: m,
}
}
func (sc *SystemCollector) Collect(conf config.Configuration) (*data.DiagnosticBundle, error) {
conf, err := conf.ValidateConfig(sc.Configuration())
if err != nil {
return &data.DiagnosticBundle{}, err
}
frames := make(map[string]data.Frame)
var frameErrors []error
frameErrors = addStatsToFrame(frames, frameErrors, "disks", getDisk)
frameErrors = addStatsToFrame(frames, frameErrors, "disk_usage", getDiskUsage)
frameErrors = addStatsToFrame(frames, frameErrors, "memory", getMemory)
frameErrors = addStatsToFrame(frames, frameErrors, "memory_usage", getMemoryUsage)
frameErrors = addStatsToFrame(frames, frameErrors, "cpu", getCPU)
//frameErrors = addStatsToFrame(frames, frameErrors, "cpu_usage", getCPUUsage)
frameErrors = addStatsToFrame(frames, frameErrors, "processes", getProcessList)
frameErrors = addStatsToFrame(frames, frameErrors, "os", getHostDetails)
return &data.DiagnosticBundle{
Frames: frames,
Errors: data.FrameErrors{
Errors: frameErrors,
},
}, err
}
func addStatsToFrame(frames map[string]data.Frame, errors []error, name string, statFunc func() (data.MemoryFrame, error)) []error {
frame, err := statFunc()
if err != nil {
errors = append(errors, err)
}
frames[name] = frame
return errors
}
func (sc *SystemCollector) Configuration() config.Configuration {
return config.Configuration{
Params: []config.ConfigParam{},
}
}
func (sc *SystemCollector) IsDefault() bool {
return true
}
func getDisk() (data.MemoryFrame, error) {
block, err := ghw.Block()
if err != nil {
return data.MemoryFrame{}, errors.Wrapf(err, "unable to list block storage")
}
var rows [][]interface{}
columns := []string{"name", "size", "physicalBlockSize", "driveType", "controller", "vendor", "model", "partitionName", "partitionSize", "mountPoint", "readOnly"}
for _, disk := range block.Disks {
for _, part := range disk.Partitions {
rows = append(rows, []interface{}{disk.Name, disk.SizeBytes, disk.PhysicalBlockSizeBytes, disk.DriveType, disk.StorageController, disk.Vendor, disk.Model, part.Name, part.SizeBytes, part.MountPoint, part.IsReadOnly})
}
}
return data.NewMemoryFrame("disks", columns, rows), nil
}
func getDiskUsage() (data.MemoryFrame, error) {
fsList := gosigar.FileSystemList{}
err := fsList.Get()
if err != nil {
return data.MemoryFrame{}, errors.Wrapf(err, "unable to list filesystems for usage")
}
rows := make([][]interface{}, len(fsList.List))
columns := []string{"filesystem", "size", "used", "avail", "use%", "mounted on"}
for i, fs := range fsList.List {
dirName := fs.DirName
usage := gosigar.FileSystemUsage{}
err = usage.Get(dirName)
if err == nil {
rows[i] = []interface{}{fs.DevName, usage.Total, usage.Used, usage.Avail, usage.UsePercent(), dirName}
} else {
// we try to output something
rows[i] = []interface{}{fs.DevName, 0, 0, 0, 0, dirName}
}
}
return data.NewMemoryFrame("disk_usage", columns, rows), nil
}
func getMemory() (data.MemoryFrame, error) {
memory, err := ghw.Memory()
if err != nil {
return data.MemoryFrame{}, errors.Wrapf(err, "unable to read memory")
}
columns := []string{"totalPhysical", "totalUsable", "supportedPageSizes"}
rows := make([][]interface{}, 1)
rows[0] = []interface{}{memory.TotalPhysicalBytes, memory.TotalUsableBytes, memory.SupportedPageSizes}
return data.NewMemoryFrame("memory", columns, rows), nil
}
func getMemoryUsage() (data.MemoryFrame, error) {
mem := gosigar.Mem{}
swap := gosigar.Swap{}
err := mem.Get()
if err != nil {
return data.MemoryFrame{}, errors.Wrapf(err, "unable to read memory usage")
}
err = swap.Get()
if err != nil {
return data.MemoryFrame{}, errors.Wrapf(err, "unable to read swap")
}
columns := []string{"type", "total", "used", "free"}
rows := make([][]interface{}, 3)
rows[0] = []interface{}{"mem", mem.Total, mem.Used, mem.Free}
rows[1] = []interface{}{"buffers/cache", 0, mem.ActualUsed, mem.ActualFree}
rows[2] = []interface{}{"swap", swap.Total, swap.Used, swap.Free}
return data.NewMemoryFrame("memory_usage", columns, rows), nil
}
func getCPU() (data.MemoryFrame, error) {
cpu, err := ghw.CPU()
if err != nil {
return data.MemoryFrame{}, errors.Wrapf(err, "unable to list cpus")
}
columns := []string{"processor", "vendor", "model", "core", "numThreads", "logical", "capabilities"}
var rows [][]interface{}
for _, proc := range cpu.Processors {
for _, core := range proc.Cores {
rows = append(rows, []interface{}{proc.ID, proc.Vendor, proc.Model, core.ID, core.NumThreads, core.LogicalProcessors, strings.Join(proc.Capabilities, " ")})
}
}
return data.NewMemoryFrame("cpu", columns, rows), nil
}
// this gets cpu usage rather than a listing of arch etc - see getCPU(). It needs successive samples, as the values are ticks - not currently used
// see https://github.com/elastic/beats/blob/master/metricbeat/internal/metrics/cpu/metrics.go#L131 for inspiration
//nolint
func getCPUUsage() (data.MemoryFrame, error) {
cpuList := gosigar.CpuList{}
err := cpuList.Get()
if err != nil {
return data.MemoryFrame{}, errors.Wrapf(err, "unable to list cpus for usage")
}
columns := []string{"sys", "nice", "stolen", "irq", "idle", "softIrq", "user", "wait", "total"}
rows := make([][]interface{}, len(cpuList.List))
for i, cpu := range cpuList.List {
rows[i] = []interface{}{cpu.Sys, cpu.Nice, cpu.Stolen, cpu.Irq, cpu.Idle, cpu.SoftIrq, cpu.User, cpu.Wait, cpu.Total()}
}
return data.NewMemoryFrame("cpu_usage", columns, rows), nil
}
func getProcessList() (data.MemoryFrame, error) {
pidList := gosigar.ProcList{}
err := pidList.Get()
if err != nil {
return data.MemoryFrame{}, errors.Wrapf(err, "unable to list processes")
}
columns := []string{"pid", "ppid", "stime", "time", "rss", "size", "faults", "minorFaults", "majorFaults", "user", "state", "priority", "nice", "command"}
// rows are appended rather than indexed so that processes we cannot inspect are skipped instead of leaving empty rows
rows := make([][]interface{}, 0, len(pidList.List))
for _, pid := range pidList.List {
state := gosigar.ProcState{}
mem := gosigar.ProcMem{}
time := gosigar.ProcTime{}
args := gosigar.ProcArgs{}
if err := state.Get(pid); err != nil {
continue
}
if err := mem.Get(pid); err != nil {
continue
}
if err := time.Get(pid); err != nil {
continue
}
if err := args.Get(pid); err != nil {
continue
}
rows = append(rows, []interface{}{pid, state.Ppid, time.FormatStartTime(), time.FormatTotal(), mem.Resident, mem.Size,
mem.PageFaults, mem.MinorFaults, mem.MajorFaults, state.Username, state.State, state.Priority, state.Nice,
strings.Join(args.List, " ")})
}
return data.NewMemoryFrame("process_list", columns, rows), nil
}
func getHostDetails() (data.MemoryFrame, error) {
gi, err := goInfo.GetInfo()
if err != nil {
return data.MemoryFrame{}, errors.Wrapf(err, "unable to get host summary")
}
columns := []string{"hostname", "os", "goOs", "cpus", "core", "kernel", "platform"}
rows := [][]interface{}{
{gi.Hostname, gi.OS, gi.GoOS, gi.CPUs, gi.Core, gi.Kernel, gi.Platform},
}
return data.NewMemoryFrame("os", columns, rows), nil
}
func (sc *SystemCollector) Description() string {
return "Collects summary OS and hardware statistics for the host"
}
// here we register the collector for use
func init() {
collectors.Register("system", func() (collectors.Collector, error) {
return &SystemCollector{
resourceManager: platform.GetResourceManager(),
}, nil
})
}
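
The error-accumulation pattern used by `Collect` and `addStatsToFrame` above can be sketched in isolation. This is a minimal sketch, not the tool's actual API: the `frame` type and the `okStat`, `failingStat`, `addStat` and `collect` functions are hypothetical stand-ins for `data.MemoryFrame` and the real stat functions. It shows that a failing stat function contributes an error while the remaining frames are still collected.

```go
package main

import (
	"errors"
	"fmt"
)

// frame is a hypothetical stand-in for data.MemoryFrame.
type frame struct {
	name string
	rows [][]interface{}
}

// addStat runs statFunc, records any error, and always stores the returned
// frame under name, mirroring the shape of addStatsToFrame.
func addStat(frames map[string]frame, errs []error, name string, statFunc func() (frame, error)) []error {
	f, err := statFunc()
	if err != nil {
		errs = append(errs, err)
	}
	frames[name] = f
	return errs
}

// okStat is a hypothetical stat function that succeeds.
func okStat() (frame, error) {
	return frame{name: "memory", rows: [][]interface{}{{"mem", 16, 8, 8}}}, nil
}

// failingStat is a hypothetical stat function that fails.
func failingStat() (frame, error) {
	return frame{}, errors.New("unable to list block storage")
}

// collect gathers every frame and returns the errors alongside them
// instead of aborting on the first failure.
func collect() (map[string]frame, []error) {
	frames := make(map[string]frame)
	var errs []error
	errs = addStat(frames, errs, "memory", okStat)
	errs = addStat(frames, errs, "disks", failingStat)
	return frames, errs
}

func main() {
	frames, errs := collect()
	fmt.Println(len(frames), len(errs))
}
```

Note that in this shape a failed stat still leaves an empty frame under its key, which is why the collector test can assert a fixed frame count.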

View File

@ -0,0 +1,88 @@
package system_test
import (
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors/system"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/stretchr/testify/require"
"testing"
)
func TestSystemConfiguration(t *testing.T) {
t.Run("correct configuration is returned for system collector", func(t *testing.T) {
systemCollector := system.NewSystemCollector(&platform.ResourceManager{})
conf := systemCollector.Configuration()
require.Len(t, conf.Params, 0)
require.Equal(t, []config.ConfigParam{}, conf.Params)
})
}
func TestSystemCollect(t *testing.T) {
t.Run("test default system collection", func(t *testing.T) {
systemCollector := system.NewSystemCollector(&platform.ResourceManager{})
diagSet, err := systemCollector.Collect(config.Configuration{})
require.Nil(t, err)
require.NotNil(t, diagSet)
require.Len(t, diagSet.Errors.Errors, 0)
require.Len(t, diagSet.Frames, 7)
require.Contains(t, diagSet.Frames, "disks")
require.Contains(t, diagSet.Frames, "disk_usage")
require.Contains(t, diagSet.Frames, "memory")
require.Contains(t, diagSet.Frames, "memory_usage")
require.Contains(t, diagSet.Frames, "cpu")
require.Contains(t, diagSet.Frames, "processes")
require.Contains(t, diagSet.Frames, "os")
// responses here will vary depending on platform - mocking seems excessive so we test we have some data
// disks
require.Equal(t, []string{"name", "size", "physicalBlockSize", "driveType", "controller", "vendor", "model", "partitionName", "partitionSize", "mountPoint", "readOnly"}, diagSet.Frames["disks"].Columns())
diskFrames, err := countFrameRows(diagSet, "disks")
require.Greater(t, diskFrames, 0)
require.Nil(t, err)
// disk usage
require.Equal(t, []string{"filesystem", "size", "used", "avail", "use%", "mounted on"}, diagSet.Frames["disk_usage"].Columns())
diskUsageFrames, err := countFrameRows(diagSet, "disk_usage")
require.Greater(t, diskUsageFrames, 0)
require.Nil(t, err)
// memory
require.Equal(t, []string{"totalPhysical", "totalUsable", "supportedPageSizes"}, diagSet.Frames["memory"].Columns())
memoryFrames, err := countFrameRows(diagSet, "memory")
require.Greater(t, memoryFrames, 0)
require.Nil(t, err)
// memory_usage
require.Equal(t, []string{"type", "total", "used", "free"}, diagSet.Frames["memory_usage"].Columns())
memoryUsageFrames, err := countFrameRows(diagSet, "memory_usage")
require.Greater(t, memoryUsageFrames, 0)
require.Nil(t, err)
// cpu
require.Equal(t, []string{"processor", "vendor", "model", "core", "numThreads", "logical", "capabilities"}, diagSet.Frames["cpu"].Columns())
cpuFrames, err := countFrameRows(diagSet, "cpu")
require.Greater(t, cpuFrames, 0)
require.Nil(t, err)
// processes
require.Equal(t, []string{"pid", "ppid", "stime", "time", "rss", "size", "faults", "minorFaults", "majorFaults", "user", "state", "priority", "nice", "command"}, diagSet.Frames["processes"].Columns())
processesFrames, err := countFrameRows(diagSet, "processes")
require.Greater(t, processesFrames, 0)
require.Nil(t, err)
// os
require.Equal(t, []string{"hostname", "os", "goOs", "cpus", "core", "kernel", "platform"}, diagSet.Frames["os"].Columns())
osFrames, err := countFrameRows(diagSet, "os")
require.Greater(t, osFrames, 0)
require.Nil(t, err)
})
}
func countFrameRows(diagSet *data.DiagnosticBundle, frameName string) (int, error) {
frame := diagSet.Frames[frameName]
i := 0
for {
_, ok, err := frame.Next()
if !ok {
return i, err
}
if err != nil {
return i, err
}
i++
}
}
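
The `Next()` iteration used by `countFrameRows` above can be sketched against a stand-in frame. This is a minimal sketch under stated assumptions: `rowFrame` is a hypothetical in-memory type, and only the `(values, ok, err)` return shape is taken from the code above. It also illustrates that such an iterator is consumed by a full pass, which appears to be why the output tests call `resetFrames()` between runs.

```go
package main

import "fmt"

// rowFrame is a hypothetical in-memory frame; only the Next() return
// shape (values, ok, err) is taken from the code above.
type rowFrame struct {
	rows [][]interface{}
	pos  int
}

// Next returns the next row, or ok == false once the rows are exhausted.
func (f *rowFrame) Next() ([]interface{}, bool, error) {
	if f.pos >= len(f.rows) {
		return nil, false, nil
	}
	row := f.rows[f.pos]
	f.pos++
	return row, true, nil
}

// countRows drains the frame and reports how many rows it yielded.
func countRows(f *rowFrame) (int, error) {
	n := 0
	for {
		_, ok, err := f.Next()
		if err != nil {
			return n, err
		}
		if !ok {
			return n, nil
		}
		n++
	}
}

func main() {
	f := &rowFrame{rows: [][]interface{}{{"a"}, {"b"}, {"c"}}}
	first, _ := countRows(f)
	second, _ := countRows(f) // the iterator is already exhausted
	fmt.Println(first, second)
}
```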

View File

@ -0,0 +1,343 @@
package file
import (
"context"
"encoding/csv"
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/outputs"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/utils"
"github.com/mholt/archiver/v4"
"github.com/pkg/errors"
"github.com/rs/zerolog/log"
"os"
"path"
"path/filepath"
"strconv"
"strings"
)
const OutputName = "simple"
type SubFolderGenerator func() string
type SimpleOutput struct {
// mainly used for testing to make sub folder deterministic - which it won't be by default as it uses a timestamp
FolderGenerator SubFolderGenerator
}
func (o SimpleOutput) Write(id string, bundles map[string]*data.DiagnosticBundle, conf config.Configuration) (data.FrameErrors, error) {
conf, err := conf.ValidateConfig(o.Configuration())
if err != nil {
return data.FrameErrors{}, err
}
directory, err := config.ReadStringValue(conf, "directory")
if err != nil {
return data.FrameErrors{}, err
}
directory, err = getWorkingDirectory(directory)
if err != nil {
return data.FrameErrors{}, err
}
subFolder := strconv.FormatInt(utils.MakeTimestamp(), 10)
if o.FolderGenerator != nil {
subFolder = o.FolderGenerator()
}
skipArchive, err := config.ReadBoolValue(conf, "skip_archive")
if err != nil {
return data.FrameErrors{}, err
}
outputDir := filepath.Join(directory, id, subFolder)
log.Info().Msgf("creating bundle in %s", outputDir)
if err := os.MkdirAll(outputDir, os.ModePerm); err != nil {
return data.FrameErrors{}, err
}
frameErrors := data.FrameErrors{}
var filePaths []string
for name := range bundles {
bundlePaths, frameError := writeDiagnosticBundle(name, bundles[name], outputDir)
filePaths = append(filePaths, bundlePaths...)
frameErrors.Errors = append(frameErrors.Errors, frameError.Errors...)
}
log.Info().Msg("bundle created")
if !skipArchive {
archiveFilename := filepath.Join(directory, id, fmt.Sprintf("%s.tar.gz", subFolder))
log.Info().Msgf("compressing bundle to %s", archiveFilename)
// produce a map containing the input paths to the archive paths - we preserve the output directory and hierarchy
archiveMap := createArchiveMap(filePaths, directory)
if err := createArchive(archiveFilename, archiveMap); err != nil {
return frameErrors, err
}
// we delete the original directory leaving just the archive behind
if err := os.RemoveAll(outputDir); err != nil {
return frameErrors, err
}
log.Info().Msgf("archive ready at: %s", archiveFilename)
}
return frameErrors, nil
}
func writeDiagnosticBundle(name string, diag *data.DiagnosticBundle, baseDir string) ([]string, data.FrameErrors) {
diagDir := filepath.Join(baseDir, name)
if err := os.MkdirAll(diagDir, os.ModePerm); err != nil {
return nil, data.FrameErrors{Errors: []error{
errors.Wrapf(err, "unable to create directory for %s", name),
}}
}
frameErrors := data.FrameErrors{}
var filePaths []string
for frameId, frame := range diag.Frames {
fFilePath, errs := writeFrame(frameId, frame, diagDir)
filePaths = append(filePaths, fFilePath...)
if len(errs) > 0 {
// it would be nice if we could wrap this list of errors into something formal but this logs well
frameErrors.Errors = append(frameErrors.Errors, fmt.Errorf("unable to write frame %s for %s", frameId, name))
frameErrors.Errors = append(frameErrors.Errors, errs...)
}
}
return filePaths, frameErrors
}
func writeFrame(frameId string, frame data.Frame, baseDir string) ([]string, []error) {
switch f := frame.(type) {
case data.DatabaseFrame:
return writeDatabaseFrame(frameId, f, baseDir)
case data.ConfigFileFrame:
return writeConfigFrame(frameId, f, baseDir)
case data.DirectoryFileFrame:
return processDirectoryFileFrame(frameId, f, baseDir)
case data.FileFrame:
return processFileFrame(frameId, f, baseDir)
case data.HierarchicalFrame:
return writeHierarchicalFrame(frameId, f, baseDir)
default:
// for now our data frame writer supports all frames
return writeDatabaseFrame(frameId, frame, baseDir)
}
}
func writeHierarchicalFrame(frameId string, frame data.HierarchicalFrame, baseDir string) ([]string, []error) {
filePaths, errs := writeFrame(frameId, frame.DataFrame, baseDir)
for _, subFrame := range frame.SubFrames {
subDir := filepath.Join(baseDir, subFrame.Name())
if err := os.MkdirAll(subDir, os.ModePerm); err != nil {
errs = append(errs, err)
continue
}
subPaths, subErrs := writeFrame(subFrame.Name(), subFrame, subDir)
filePaths = append(filePaths, subPaths...)
errs = append(errs, subErrs...)
}
return filePaths, errs
}
func writeDatabaseFrame(frameId string, frame data.Frame, baseDir string) ([]string, []error) {
frameFilePath := filepath.Join(baseDir, fmt.Sprintf("%s.csv", frameId))
var errs []error
f, err := os.Create(frameFilePath)
if err != nil {
errs = append(errs, errors.Wrapf(err, "unable to create file for frame %s", frameId))
return []string{}, errs
}
defer f.Close()
w := csv.NewWriter(f)
defer w.Flush()
if err := w.Write(frame.Columns()); err != nil {
errs = append(errs, errors.Wrapf(err, "unable to write columns for frame %s", frameId))
return []string{}, errs
}
// we don't collect an error for every line here like configs and logs - could mean a lot of unnecessary noise
for {
values, ok, err := frame.Next()
if err != nil {
errs = append(errs, errors.Wrapf(err, "unable to read frame %s", frameId))
return []string{}, errs
}
if !ok {
return []string{frameFilePath}, errs
}
sValues := make([]string, len(values))
for i, value := range values {
sValues[i] = fmt.Sprintf("%v", value)
}
if err := w.Write(sValues); err != nil {
errs = append(errs, errors.Wrapf(err, "unable to write row for frame %s", frameId))
return []string{}, errs
}
}
}
func writeConfigFrame(frameId string, frame data.ConfigFileFrame, baseDir string) ([]string, []error) {
var errs []error
frameDirectory := filepath.Join(baseDir, frameId)
if err := os.MkdirAll(frameDirectory, os.ModePerm); err != nil {
errs = append(errs, errors.Wrapf(err, "unable to create directory for frame %s", frameId))
return []string{}, errs
}
// this holds our files included
includesDirectory := filepath.Join(frameDirectory, "includes")
if err := os.MkdirAll(includesDirectory, os.ModePerm); err != nil {
errs = append(errs, errors.Wrapf(err, "unable to create includes directory for frame %s", frameId))
return []string{}, errs
}
for {
values, ok, err := frame.Next()
if err != nil {
errs = append(errs, err)
return []string{frameDirectory}, errs
}
if !ok {
return []string{frameDirectory}, errs
}
configFile := values[0].(data.ConfigFile)
if !configFile.IsIncluded() {
relPath := strings.TrimPrefix(configFile.FilePath(), frame.Directory)
destPath := path.Join(frameDirectory, relPath)
if err = configFile.Copy(destPath, true); err != nil {
errs = append(errs, errors.Wrapf(err, "Unable to copy file %s", configFile.FilePath()))
}
} else {
// include files could be anywhere - potentially multiple with the same name. We thus recreate the directory
// hierarchy under includes to avoid collisions
destPath := path.Join(includesDirectory, configFile.FilePath())
if err = configFile.Copy(destPath, true); err != nil {
errs = append(errs, errors.Wrapf(err, "Unable to copy file %s", configFile.FilePath()))
}
}
}
}
func processDirectoryFileFrame(frameId string, frame data.DirectoryFileFrame, baseDir string) ([]string, []error) {
var errs []error
// each set of files goes under its own directory to preserve grouping
frameDirectory := filepath.Join(baseDir, frameId)
if err := os.MkdirAll(frameDirectory, os.ModePerm); err != nil {
errs = append(errs, errors.Wrapf(err, "unable to create directory for frame %s", frameId))
return []string{}, errs
}
for {
values, ok, err := frame.Next()
if err != nil {
errs = append(errs, err)
return []string{frameDirectory}, errs
}
if !ok {
return []string{frameDirectory}, errs
}
file := values[0].(data.SimpleFile)
relPath := strings.TrimPrefix(file.FilePath(), frame.Directory)
destPath := path.Join(frameDirectory, relPath)
if err = file.Copy(destPath, true); err != nil {
errs = append(errs, errors.Wrapf(err, "unable to copy file %s for frame %s", file.FilePath(), frameId))
}
}
}
func processFileFrame(frameId string, frame data.FileFrame, baseDir string) ([]string, []error) {
var errs []error
frameDirectory := filepath.Join(baseDir, frameId)
if err := os.MkdirAll(frameDirectory, os.ModePerm); err != nil {
errs = append(errs, errors.Wrapf(err, "unable to create directory for frame %s", frameId))
return []string{}, errs
}
for {
values, ok, err := frame.Next()
if err != nil {
errs = append(errs, err)
}
if !ok {
return []string{frameDirectory}, errs
}
file := values[0].(data.SimpleFile)
// we need an absolute path to preserve the directory hierarchy
dir, err := filepath.Abs(filepath.Dir(file.FilePath()))
if err != nil {
errs = append(errs, errors.Wrapf(err, "unable to determine dir for %s", file.FilePath()))
continue
}
outputDir := filepath.Join(frameDirectory, dir)
// MkdirAll is a no-op if the directory already exists, so files sharing a directory are all copied
if err := os.MkdirAll(outputDir, os.ModePerm); err != nil {
errs = append(errs, errors.Wrapf(err, "unable to create directory for %s", file.FilePath()))
continue
}
outputPath := filepath.Join(outputDir, filepath.Base(file.FilePath()))
if err := file.Copy(outputPath, false); err != nil {
errs = append(errs, errors.Wrapf(err, "unable to copy file %s", file.FilePath()))
}
}
}
func getWorkingDirectory(path string) (string, error) {
if !filepath.IsAbs(path) {
workingPath, err := os.Getwd()
if err != nil {
return "", err
}
return filepath.Join(workingPath, path), nil
}
return path, nil
}
func createArchiveMap(filePaths []string, prefix string) map[string]string {
archiveMap := make(map[string]string)
for _, path := range filePaths {
archiveMap[path] = strings.TrimPrefix(path, prefix)
}
return archiveMap
}
func createArchive(outputFile string, filePaths map[string]string) error {
files, err := archiver.FilesFromDisk(nil, filePaths)
if err != nil {
return err
}
out, err := os.Create(outputFile)
if err != nil {
return err
}
defer out.Close()
format := archiver.CompressedArchive{
Compression: archiver.Gz{},
Archival: archiver.Tar{},
}
err = format.Archive(context.Background(), out, files)
return err
}
func (o SimpleOutput) Configuration() config.Configuration {
return config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "./",
Param: config.NewParam("directory", "Directory in which to create dump. Defaults to the current directory.", false),
},
config.StringOptions{
Value: "csv",
// TODO: add tsv and others here later
Options: []string{"csv"},
Param: config.NewParam("format", "Format of exported files", false),
},
config.BoolParam{
Value: false,
Param: config.NewParam("skip_archive", "Don't compress output to an archive", false),
},
},
}
}
func (o SimpleOutput) Description() string {
return "Writes out the diagnostic bundle as files in a structured directory, optionally producing a compressed archive."
}
// here we register the output for use
func init() {
outputs.Register(OutputName, func() (outputs.Output, error) {
return SimpleOutput{}, nil
})
}
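
The mapping built by `createArchiveMap` above can be illustrated with the standard library alone. This is a minimal sketch: `archiveMap` is a hypothetical copy of the same logic, and the paths are invented for illustration. It shows how trimming the working directory from each path keeps the `<id>/<subFolder>/...` hierarchy inside the archive.

```go
package main

import (
	"fmt"
	"strings"
)

// archiveMap maps each path on disk to the path it should have inside the
// archive by trimming prefix, following the logic of createArchiveMap.
func archiveMap(filePaths []string, prefix string) map[string]string {
	m := make(map[string]string, len(filePaths))
	for _, p := range filePaths {
		m[p] = strings.TrimPrefix(p, prefix)
	}
	return m
}

func main() {
	// hypothetical paths, for illustration only
	paths := []string{
		"/tmp/out/test/123/systemA/disk.csv",
		"/tmp/out/test/123/systemB/user.csv",
	}
	for src, dst := range archiveMap(paths, "/tmp/out") {
		fmt.Printf("%s -> %s\n", src, dst)
	}
}
```

Because `strings.TrimPrefix` leaves a string unchanged when the prefix does not match, a path outside the prefix would keep its full on-disk path under this scheme.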

View File

@ -0,0 +1,467 @@
package file_test
import (
"bufio"
"encoding/xml"
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/outputs/file"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/test"
"github.com/stretchr/testify/require"
"io"
"os"
"path"
"strings"
"testing"
)
var clusterFrame = test.NewFakeDataFrame("clusters", []string{"cluster", "shard_num", "shard_weight", "replica_num", "host_name", "host_address", "port", "is_local", "user", "default_database", "errors_count", "slowdowns_count", "estimated_recovery_time"},
[][]interface{}{
{"events", 1, 1, 1, "dalem-local-clickhouse-blue-1", "192.168.144.2", 9000, 1, "default", "", 0, 0, 0},
{"events", 2, 1, 1, "dalem-local-clickhouse-blue-2", "192.168.144.4", 9001, 1, "default", "", 0, 0, 0},
{"events", 3, 1, 1, "dalem-local-clickhouse-blue-3", "192.168.144.3", 9002, 1, "default", "", 0, 0, 0},
},
)
var diskFrame = test.NewFakeDataFrame("disks", []string{"name", "path", "free_space", "total_space", "keep_free_space", "type"},
[][]interface{}{
{"default", "/var/lib/clickhouse", 1729659346944, 1938213220352, "", "local"},
},
)
var userFrame = test.NewFakeDataFrame("users", []string{"name", "id", "storage", "auth_type", "auth_params", "host_ip", "host_names", "host_names_regexp", "host_names_like"},
[][]interface{}{
{"default", "94309d50-4f52-5250-31bd-74fecac179db", "users.xml", "plaintext_password", "sha256_password", []string{"::0"}, []string{}, []string{}, []string{}},
},
)
func TestConfiguration(t *testing.T) {
t.Run("correct configuration is returned", func(t *testing.T) {
output := file.SimpleOutput{}
conf := output.Configuration()
require.Len(t, conf.Params, 3)
// check first directory param
require.IsType(t, config.StringParam{}, conf.Params[0])
directory, ok := conf.Params[0].(config.StringParam)
require.True(t, ok)
require.False(t, directory.Required())
require.Equal(t, "directory", directory.Name())
require.Equal(t, "./", directory.Value)
// check second format param
require.IsType(t, config.StringOptions{}, conf.Params[1])
format, ok := conf.Params[1].(config.StringOptions)
require.True(t, ok)
require.False(t, format.Required())
require.Equal(t, "format", format.Name())
require.Equal(t, "csv", format.Value)
require.Equal(t, []string{"csv"}, format.Options)
// check third skip_archive param
require.IsType(t, config.BoolParam{}, conf.Params[2])
skipArchive, ok := conf.Params[2].(config.BoolParam)
require.True(t, ok)
require.False(t, skipArchive.Required())
require.False(t, skipArchive.Value)
})
}
func TestWrite(t *testing.T) {
bundles := map[string]*data.DiagnosticBundle{
"systemA": {
Frames: map[string]data.Frame{
"disk": diskFrame,
"cluster": clusterFrame,
},
},
"systemB": {
Frames: map[string]data.Frame{
"user": userFrame,
},
},
}
t.Run("test we can write simple diagnostic sets", func(t *testing.T) {
tempDir := t.TempDir()
configuration := config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Param: config.NewParam("directory", "A directory", true),
Value: tempDir,
},
// turn compression off as the folder will be deleted by default
config.BoolParam{
Value: true,
Param: config.NewParam("skip_archive", "Skip archive", false),
},
},
}
output := file.SimpleOutput{FolderGenerator: staticFolderName}
frameErrors, err := output.Write("test", bundles, configuration)
require.Nil(t, err)
require.Equal(t, data.FrameErrors{}, frameErrors)
clusterFile := path.Join(tempDir, "test", "test", "systemA", "cluster.csv")
diskFile := path.Join(tempDir, "test", "test", "systemA", "disk.csv")
userFile := path.Join(tempDir, "test", "test", "systemB", "user.csv")
require.FileExists(t, clusterFile)
require.FileExists(t, diskFile)
require.FileExists(t, userFile)
diskLines, err := readFileLines(diskFile)
require.Nil(t, err)
require.Len(t, diskLines, 2)
usersLines, err := readFileLines(userFile)
require.Nil(t, err)
require.Len(t, usersLines, 2)
clusterLines, err := readFileLines(clusterFile)
require.Nil(t, err)
require.Len(t, clusterLines, 4)
require.Equal(t, strings.Join(clusterFrame.ColumnNames, ","), clusterLines[0])
require.Equal(t, "events,1,1,1,dalem-local-clickhouse-blue-1,192.168.144.2,9000,1,default,,0,0,0", clusterLines[1])
require.Equal(t, "events,2,1,1,dalem-local-clickhouse-blue-2,192.168.144.4,9001,1,default,,0,0,0", clusterLines[2])
require.Equal(t, "events,3,1,1,dalem-local-clickhouse-blue-3,192.168.144.3,9002,1,default,,0,0,0", clusterLines[3])
resetFrames()
})
t.Run("test invalid parameter", func(t *testing.T) {
tempDir := t.TempDir()
configuration := config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Param: config.NewParam("directory", "A directory", true),
Value: tempDir,
},
config.StringOptions{
Value: "random",
Options: []string{"csv"},
// TODO: add tsv and others here later
Param: config.NewParam("format", "Format of exported files", false),
},
config.BoolParam{
Value: true,
Param: config.NewParam("skip_archive", "Skip compressed archive", false),
},
},
}
output := file.SimpleOutput{FolderGenerator: staticFolderName}
frameErrors, err := output.Write("test", bundles, configuration)
require.Equal(t, data.FrameErrors{}, frameErrors)
require.NotNil(t, err)
require.Equal(t, "parameter format is invalid - random is not a valid value for format - [csv]", err.Error())
resetFrames()
})
t.Run("test compression", func(t *testing.T) {
tempDir := t.TempDir()
configuration := config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Param: config.NewParam("directory", "A directory", true),
Value: tempDir,
},
},
}
output := file.SimpleOutput{FolderGenerator: staticFolderName}
frameErrors, err := output.Write("test", bundles, configuration)
require.Nil(t, err)
require.Equal(t, data.FrameErrors{}, frameErrors)
archiveFileName := path.Join(tempDir, "test", "test.tar.gz")
fi, err := os.Stat(archiveFileName)
require.Nil(t, err)
require.FileExists(t, archiveFileName)
// compression will vary so let's test a range
require.Greater(t, int64(600), fi.Size())
require.Less(t, int64(200), fi.Size())
outputFolder := path.Join(tempDir, "test", "test")
// check the folder doesn't exist and is cleaned up
require.NoFileExists(t, outputFolder)
resetFrames()
})
t.Run("test support for directory frames", func(t *testing.T) {
// create 5 temporary files
tempDir := t.TempDir()
files := createRandomFiles(tempDir, 5)
dirFrame, errs := data.NewFileDirectoryFrame(tempDir, []string{"*.log"})
require.Empty(t, errs)
fileBundles := map[string]*data.DiagnosticBundle{
"systemA": {
Frames: map[string]data.Frame{
"disk": diskFrame,
"cluster": clusterFrame,
},
},
"config": {
Frames: map[string]data.Frame{
"logs": dirFrame,
},
},
}
destDir := t.TempDir()
configuration := config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Param: config.NewParam("directory", "A directory", true),
Value: destDir,
},
// turn compression off as the folder will be deleted by default
config.BoolParam{
Value: true,
Param: config.NewParam("skip_archive", "Skip archive", false),
},
},
}
output := file.SimpleOutput{FolderGenerator: staticFolderName}
frameErrors, err := output.Write("test", fileBundles, configuration)
require.Nil(t, err)
require.NotNil(t, frameErrors)
// test the usual frames still work
clusterFile := path.Join(destDir, "test", "test", "systemA", "cluster.csv")
diskFile := path.Join(destDir, "test", "test", "systemA", "disk.csv")
require.FileExists(t, clusterFile)
require.FileExists(t, diskFile)
diskLines, err := readFileLines(diskFile)
require.Nil(t, err)
require.Len(t, diskLines, 2)
clusterLines, err := readFileLines(clusterFile)
require.Nil(t, err)
require.Len(t, clusterLines, 4)
require.Equal(t, strings.Join(clusterFrame.ColumnNames, ","), clusterLines[0])
require.Equal(t, "events,1,1,1,dalem-local-clickhouse-blue-1,192.168.144.2,9000,1,default,,0,0,0", clusterLines[1])
require.Equal(t, "events,2,1,1,dalem-local-clickhouse-blue-2,192.168.144.4,9001,1,default,,0,0,0", clusterLines[2])
require.Equal(t, "events,3,1,1,dalem-local-clickhouse-blue-3,192.168.144.3,9002,1,default,,0,0,0", clusterLines[3])
//test our directory frame
for _, filepath := range files {
// check they were copied
subPath := strings.TrimPrefix(filepath, tempDir)
// path here will be <destDir>/<id>/test/config/logs/<sub path>
newPath := path.Join(destDir, "test", "test", "config", "logs", subPath)
require.FileExists(t, newPath)
}
resetFrames()
})
t.Run("test support for config frames", func(t *testing.T) {
xmlConfig := data.XmlConfig{
XMLName: xml.Name{},
Clickhouse: data.XmlLoggerConfig{
XMLName: xml.Name{},
ErrorLog: "/var/log/clickhouse-server/clickhouse-server.err.log",
Log: "/var/log/clickhouse-server/clickhouse-server.log",
},
IncludeFrom: "",
}
tempDir := t.TempDir()
confDir := path.Join(tempDir, "conf")
// create an includes file
includesDir := path.Join(tempDir, "includes")
err := os.MkdirAll(includesDir, os.ModePerm)
require.Nil(t, err)
includesPath := path.Join(includesDir, "random.xml")
includesFile, err := os.Create(includesPath)
require.Nil(t, err)
xmlWriter := io.Writer(includesFile)
enc := xml.NewEncoder(xmlWriter)
enc.Indent(" ", " ")
err = enc.Encode(xmlConfig)
require.Nil(t, err)
// create 5 temporary config files
files := make([]string, 5)
// set the includes
xmlConfig.IncludeFrom = includesPath
for i := 0; i < 5; i++ {
// we want to check hierarchies are preserved so create a simple folder for each file
fileDir := path.Join(confDir, fmt.Sprintf("%d", i))
err := os.MkdirAll(fileDir, os.ModePerm)
require.Nil(t, err)
filepath := path.Join(fileDir, fmt.Sprintf("random-%d.xml", i))
files[i] = filepath
xmlFile, err := os.Create(filepath)
require.Nil(t, err)
// write a little xml so it's valid
xmlWriter := io.Writer(xmlFile)
enc := xml.NewEncoder(xmlWriter)
enc.Indent(" ", " ")
err = enc.Encode(xmlConfig)
require.Nil(t, err)
}
configFrame, errs := data.NewConfigFileFrame(confDir)
require.Empty(t, errs)
fileBundles := map[string]*data.DiagnosticBundle{
"systemA": {
Frames: map[string]data.Frame{
"disk": diskFrame,
"cluster": clusterFrame,
},
},
"config": {
Frames: map[string]data.Frame{
"user_specified": configFrame,
},
},
}
destDir := t.TempDir()
configuration := config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Param: config.NewParam("directory", "A directory", true),
Value: destDir,
},
// turn compression off as the folder will be deleted by default
config.BoolParam{
Value: true,
Param: config.NewParam("skip_archive", "Skip archive", false),
},
},
}
output := file.SimpleOutput{FolderGenerator: staticFolderName}
frameErrors, err := output.Write("test", fileBundles, configuration)
require.Nil(t, err)
require.NotNil(t, frameErrors)
require.Empty(t, frameErrors.Errors)
//test our config frame
for _, filepath := range files {
// check they were copied
subPath := strings.TrimPrefix(filepath, confDir)
// path here will be <destDir>/<id>/test/config/user_specified/<sub path>
newPath := path.Join(destDir, "test", "test", "config", "user_specified", subPath)
require.FileExists(t, newPath)
}
// check our includes file exists
// path here will be <destDir>/<id>/test/config/user_specified/includes/<includes path>
require.FileExists(t, path.Join(destDir, "test", "test", "config", "user_specified", "includes", includesPath))
resetFrames()
})
t.Run("test support for file frames", func(t *testing.T) {
// create 5 temporary files
tempDir := t.TempDir()
files := createRandomFiles(tempDir, 5)
fileFrame := data.NewFileFrame("collection", files)
fileBundles := map[string]*data.DiagnosticBundle{
"systemA": {
Frames: map[string]data.Frame{
"disk": diskFrame,
"cluster": clusterFrame,
},
},
"file": {
Frames: map[string]data.Frame{
"collection": fileFrame,
},
},
}
destDir := t.TempDir()
configuration := config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Param: config.NewParam("directory", "A directory", true),
Value: destDir,
},
// turn compression off as the folder will be deleted by default
config.BoolParam{
Value: true,
Param: config.NewParam("skip_archive", "Skip archive", false),
},
},
}
output := file.SimpleOutput{FolderGenerator: staticFolderName}
frameErrors, err := output.Write("test", fileBundles, configuration)
require.Nil(t, err)
require.NotNil(t, frameErrors)
// test our file frame
for _, filepath := range files {
// path here will be <destDir>/<id>/test/file/collection/<sub path>
newPath := path.Join(destDir, "test", "test", "file", "collection", filepath)
require.FileExists(t, newPath)
}
resetFrames()
})
t.Run("test support for hierarchical frames", func(t *testing.T) {
bottomFrame := data.NewHierarchicalFrame("bottomLevel", userFrame, []data.HierarchicalFrame{})
middleFrame := data.NewHierarchicalFrame("middleLevel", diskFrame, []data.HierarchicalFrame{bottomFrame})
topFrame := data.NewHierarchicalFrame("topLevel", clusterFrame, []data.HierarchicalFrame{middleFrame})
tempDir := t.TempDir()
configuration := config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Param: config.NewParam("directory", "A directory", true),
Value: tempDir,
},
// turn compression off as the folder will be deleted by default
config.BoolParam{
Value: true,
Param: config.NewParam("skip_archive", "Skip archive", false),
},
},
}
output := file.SimpleOutput{FolderGenerator: staticFolderName}
hierarchicalBundle := map[string]*data.DiagnosticBundle{
"systemA": {
Frames: map[string]data.Frame{
"topLevel": topFrame,
},
},
}
frameErrors, err := output.Write("test", hierarchicalBundle, configuration)
require.Nil(t, err)
require.Equal(t, data.FrameErrors{}, frameErrors)
topFile := path.Join(tempDir, "test", "test", "systemA", "topLevel.csv")
middleFile := path.Join(tempDir, "test", "test", "systemA", "middleLevel", "middleLevel.csv")
bottomFile := path.Join(tempDir, "test", "test", "systemA", "middleLevel", "bottomLevel", "bottomLevel.csv")
require.FileExists(t, topFile)
require.FileExists(t, middleFile)
require.FileExists(t, bottomFile)
topLines, err := readFileLines(topFile)
require.Nil(t, err)
require.Len(t, topLines, 4)
middleLines, err := readFileLines(middleFile)
require.Nil(t, err)
require.Len(t, middleLines, 2)
bottomLines, err := readFileLines(bottomFile)
require.Nil(t, err)
require.Len(t, bottomLines, 2)
require.Equal(t, strings.Join(clusterFrame.ColumnNames, ","), topLines[0])
require.Equal(t, "events,1,1,1,dalem-local-clickhouse-blue-1,192.168.144.2,9000,1,default,,0,0,0", topLines[1])
require.Equal(t, "events,2,1,1,dalem-local-clickhouse-blue-2,192.168.144.4,9001,1,default,,0,0,0", topLines[2])
require.Equal(t, "events,3,1,1,dalem-local-clickhouse-blue-3,192.168.144.3,9002,1,default,,0,0,0", topLines[3])
resetFrames()
})
}
func createRandomFiles(tempDir string, num int) []string {
files := make([]string, num)
for i := 0; i < num; i++ {
// we want to check hierarchies are preserved so create a simple folder for each file
fileDir := path.Join(tempDir, fmt.Sprintf("%d", i))
os.MkdirAll(fileDir, os.ModePerm) //nolint:errcheck
filepath := path.Join(fileDir, fmt.Sprintf("random-%d.log", i))
files[i] = filepath
os.Create(filepath) //nolint:errcheck
}
return files
}
func resetFrames() {
clusterFrame.Reset()
userFrame.Reset()
diskFrame.Reset()
}
func readFileLines(filename string) ([]string, error) {
file, err := os.Open(filename)
if err != nil {
return nil, err
}
defer file.Close()
var lines []string
scanner := bufio.NewScanner(file)
for scanner.Scan() {
lines = append(lines, scanner.Text())
}
return lines, scanner.Err()
}
func staticFolderName() string {
return "test"
}

@ -0,0 +1,66 @@
package outputs
import (
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/pkg/errors"
"github.com/rs/zerolog/log"
)
type Output interface {
Write(id string, bundles map[string]*data.DiagnosticBundle, config config.Configuration) (data.FrameErrors, error)
Configuration() config.Configuration
Description() string
// TODO: we will need to implement this for the convert function
//Read(config config.Configuration) (data.DiagnosticBundle, error)
}
// Register can be called from init() on an output in this package
// It will automatically be added to the Outputs map to be called externally
func Register(name string, output OutputFactory) {
// names must be unique
if _, ok := Outputs[name]; ok {
log.Error().Msgf("More than 1 output is trying to register under the name %s. Names must be unique.", name)
}
Outputs[name] = output
}
// OutputFactory lets us use a closure to get instances of the output struct
type OutputFactory func() (Output, error)
var Outputs = map[string]OutputFactory{}
func GetOutputNames() []string {
outputs := make([]string, len(Outputs))
i := 0
for k := range Outputs {
outputs[i] = k
i++
}
return outputs
}
func GetOutputByName(name string) (Output, error) {
if outputFactory, ok := Outputs[name]; ok {
//do something here
output, err := outputFactory()
if err != nil {
return nil, errors.Wrapf(err, "output %s could not be initialized", name)
}
return output, nil
}
return nil, fmt.Errorf("%s is not a valid output name", name)
}
func BuildConfigurationOptions() (map[string]config.Configuration, error) {
configurations := make(map[string]config.Configuration)
for name, outputFactory := range Outputs {
output, err := outputFactory()
if err != nil {
return nil, errors.Wrapf(err, "output %s could not be initialized", name)
}
configurations[name] = output.Configuration()
}
return configurations, nil
}

@ -0,0 +1,44 @@
package outputs_test
import (
"github.com/ClickHouse/clickhouse-diagnostics/internal/outputs"
"github.com/ClickHouse/clickhouse-diagnostics/internal/outputs/file"
_ "github.com/ClickHouse/clickhouse-diagnostics/internal/outputs/terminal"
"github.com/stretchr/testify/require"
"testing"
)
func TestGetOutputNames(t *testing.T) {
t.Run("can get all output names", func(t *testing.T) {
outputNames := outputs.GetOutputNames()
require.ElementsMatch(t, []string{"simple", "report"}, outputNames)
})
}
func TestGetOutputByName(t *testing.T) {
t.Run("can get output by name", func(t *testing.T) {
output, err := outputs.GetOutputByName("simple")
require.Nil(t, err)
require.Equal(t, file.SimpleOutput{}, output)
})
t.Run("fails on non existing output", func(t *testing.T) {
output, err := outputs.GetOutputByName("random")
require.NotNil(t, err)
require.Equal(t, "random is not a valid output name", err.Error())
require.Nil(t, output)
})
}
func TestBuildConfigurationOptions(t *testing.T) {
t.Run("can get all output configurations", func(t *testing.T) {
outputs, err := outputs.BuildConfigurationOptions()
require.Nil(t, err)
require.Len(t, outputs, 2)
require.Contains(t, outputs, "simple")
require.Contains(t, outputs, "report")
})
}

@ -0,0 +1,283 @@
package terminal
import (
"bufio"
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/outputs"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/olekukonko/tablewriter"
"github.com/pkg/errors"
"os"
)
const OutputName = "report"
type ReportOutput struct {
}
func (r ReportOutput) Write(id string, bundles map[string]*data.DiagnosticBundle, conf config.Configuration) (data.FrameErrors, error) {
conf, err := conf.ValidateConfig(r.Configuration())
if err != nil {
return data.FrameErrors{}, err
}
format, err := config.ReadStringOptionsValue(conf, "format")
if err != nil {
return data.FrameErrors{}, err
}
nonInteractive, err := config.ReadBoolValue(conf, "continue")
if err != nil {
return data.FrameErrors{}, err
}
maxRows, err := config.ReadIntValue(conf, "row_limit")
if err != nil {
return data.FrameErrors{}, err
}
maxColumns, err := config.ReadIntValue(conf, "column_limit")
if err != nil {
return data.FrameErrors{}, err
}
frameErrors := data.FrameErrors{}
for name := range bundles {
frameError := printDiagnosticBundle(name, bundles[name], format, !nonInteractive, int(maxRows), int(maxColumns))
frameErrors.Errors = append(frameErrors.Errors, frameError.Errors...)
}
// return the errors collected while printing rather than discarding them
return frameErrors, nil
}
func printDiagnosticBundle(name string, diag *data.DiagnosticBundle, format string, interactive bool, maxRows, maxColumns int) data.FrameErrors {
frameErrors := data.FrameErrors{}
for frameId, frame := range diag.Frames {
printFrameHeader(fmt.Sprintf("%s.%s", name, frameId))
err := printFrame(frame, format, maxRows, maxColumns)
if err != nil {
frameErrors.Errors = append(frameErrors.Errors, err)
}
if interactive {
err := waitForEnter()
if err != nil {
frameErrors.Errors = append(frameErrors.Errors, err)
}
}
}
return frameErrors
}
func waitForEnter() error {
fmt.Println("Press the Enter Key to view the next frame report")
// create the reader once - a new reader per iteration would discard any buffered input
consoleReader := bufio.NewReader(os.Stdin)
for {
input, err := consoleReader.ReadByte()
if err != nil {
return errors.Wrap(err, "Unable to read user input")
}
if input == 3 {
// ctrl+c
fmt.Println("Exiting...")
os.Exit(0)
}
if input == 10 {
return nil
}
}
}
func printFrame(frame data.Frame, format string, maxRows, maxColumns int) error {
switch f := frame.(type) {
case data.DatabaseFrame:
return printDatabaseFrame(f, format, maxRows, maxColumns)
case data.ConfigFileFrame:
return printConfigFrame(f, format)
case data.DirectoryFileFrame:
return printDirectoryFileFrame(f, format, maxRows)
case data.HierarchicalFrame:
return printHierarchicalFrame(f, format, maxRows, maxColumns)
default:
// for now our data frame writer supports all frames
return printDatabaseFrame(f, format, maxRows, maxColumns)
}
}
func createTable(format string) *tablewriter.Table {
table := tablewriter.NewWriter(os.Stdout)
if format == "markdown" {
table.SetBorders(tablewriter.Border{Left: true, Top: false, Right: true, Bottom: false})
table.SetCenterSeparator("|")
}
return table
}
func printFrameHeader(title string) {
titleTable := tablewriter.NewWriter(os.Stdout)
titleTable.SetHeader([]string{title})
titleTable.SetAutoWrapText(false)
titleTable.SetAutoFormatHeaders(true)
titleTable.SetHeaderAlignment(tablewriter.ALIGN_CENTER)
titleTable.SetRowSeparator("\n")
titleTable.SetHeaderLine(false)
titleTable.SetBorder(false)
titleTable.SetTablePadding("\t") // pad with tabs
titleTable.SetNoWhiteSpace(true)
titleTable.Render()
}
func printHierarchicalFrame(frame data.HierarchicalFrame, format string, maxRows, maxColumns int) error {
err := printDatabaseFrame(frame, format, maxRows, maxColumns)
if err != nil {
return err
}
for _, subFrame := range frame.SubFrames {
err = printHierarchicalFrame(subFrame, format, maxRows, maxColumns)
if err != nil {
return err
}
}
return nil
}
func printDatabaseFrame(frame data.Frame, format string, maxRows, maxColumns int) error {
table := createTable(format)
table.SetAutoWrapText(false)
columns := len(frame.Columns())
if maxColumns > 0 && maxColumns < columns {
columns = maxColumns
}
table.SetHeader(frame.Columns()[:columns])
r := 0
trunColumns := 0
for {
values, ok, err := frame.Next()
if !ok || r == maxRows {
table.Render()
if trunColumns > 0 {
warning(fmt.Sprintf("Truncated %d columns, more available...", trunColumns))
}
if r == maxRows {
warning("Truncated rows, more available...")
}
return err
}
if err != nil {
return err
}
columns := len(values)
// -1 means unlimited
if maxColumns > 0 && maxColumns < columns {
trunColumns = columns - maxColumns
columns = maxColumns
}
row := make([]string, columns)
for i, value := range values {
if i == columns {
break
}
row[i] = fmt.Sprintf("%v", value)
}
table.Append(row)
r++
}
}
// currently we dump the whole config - useless in parts
func printConfigFrame(frame data.Frame, format string) error {
for {
values, ok, err := frame.Next()
if !ok {
return err
}
if err != nil {
return err
}
configFile := values[0].(data.File)
dat, err := os.ReadFile(configFile.FilePath())
if err != nil {
return err
}
// create a table per row - as each will be a file
table := createTable(format)
table.SetAutoWrapText(false)
table.SetAutoFormatHeaders(false)
table.ClearRows()
table.SetHeader([]string{configFile.FilePath()})
table.Append([]string{string(dat)})
table.Render()
}
}
func printDirectoryFileFrame(frame data.Frame, format string, maxRows int) error {
for {
values, ok, err := frame.Next()
if !ok {
return err
}
if err != nil {
return err
}
path := values[0].(data.SimpleFile)
file, err := os.Open(path.FilePath())
if err != nil {
// failure on one file causes rest to be ignored in frame...we could improve this
return errors.Wrapf(err, "Unable to read file %s", path.FilePath())
}
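scanner := bufio.NewScanner(file)
i := 0
truncated := false
// create a table per row - as each will be a file
table := createTable(format)
table.SetAutoWrapText(false)
table.SetAutoFormatHeaders(false)
table.ClearRows()
table.SetHeader([]string{path.FilePath()})
for scanner.Scan() {
if i == maxRows {
truncated = true
break
}
table.Append([]string{scanner.Text()})
i++
}
// close explicitly rather than defer, as we are inside a loop over files
file.Close() //nolint:errcheck
if err := scanner.Err(); err != nil {
return errors.Wrapf(err, "Unable to read file %s", path.FilePath())
}
// always render the table, not only when the file was truncated
fmt.Println()
table.Render()
if truncated {
warning("Truncated lines, more available...")
}
fmt.Print("\n")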
}
}
// prints a warning
func warning(s string) {
fmt.Printf("\x1b[%dm%v\x1b[0m%s\n", 33, "WARNING: ", s)
}
func (r ReportOutput) Configuration() config.Configuration {
return config.Configuration{
Params: []config.ConfigParam{
config.StringOptions{
Value: "default",
Options: []string{"default", "markdown"},
Param: config.NewParam("format", "Format of tables. Default is terminal friendly.", false),
},
config.BoolParam{
Value: false,
Param: config.NewParam("continue", "Print report with no interaction", false),
},
config.IntParam{
Value: 10,
Param: config.NewParam("row_limit", "Max Rows to print per frame.", false),
},
config.IntParam{
Value: 8,
Param: config.NewParam("column_limit", "Max Columns to print per frame. Negative is unlimited.", false),
},
},
}
}
func (r ReportOutput) Description() string {
return "Writes out the diagnostic bundle to the terminal as a simple report."
}
// here we register the output for use
func init() {
outputs.Register(OutputName, func() (outputs.Output, error) {
return ReportOutput{}, nil
})
}

@ -0,0 +1,128 @@
package config
import (
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/utils"
"strings"
)
type ConfigParam interface {
Name() string
Required() bool
Description() string
validate(defaultConfig ConfigParam) error
}
type Configuration struct {
Params []ConfigParam
}
type Param struct {
name string
description string
required bool
}
func NewParam(name string, description string, required bool) Param {
return Param{
name: name,
description: description,
required: required,
}
}
func (bp Param) Name() string {
return bp.name
}
func (bp Param) Required() bool {
return bp.required
}
func (bp Param) Description() string {
return bp.description
}
func (bp Param) validate(defaultConfig ConfigParam) error {
return nil
}
func (c Configuration) GetConfigParam(paramName string) (ConfigParam, error) {
for _, param := range c.Params {
if param.Name() == paramName {
return param, nil
}
}
return nil, fmt.Errorf("%s does not exist", paramName)
}
// ValidateConfig finds the intersection of a config c and a default config. Requires all possible params to be in default.
func (c Configuration) ValidateConfig(defaultConfig Configuration) (Configuration, error) {
var finalParams []ConfigParam
for _, defaultParam := range defaultConfig.Params {
setParam, err := c.GetConfigParam(defaultParam.Name())
if err == nil {
// check the set value is valid
if err := setParam.validate(defaultParam); err != nil {
return Configuration{}, fmt.Errorf("parameter %s is invalid - %s", defaultParam.Name(), err.Error())
}
finalParams = append(finalParams, setParam)
} else if defaultParam.Required() {
return Configuration{}, fmt.Errorf("missing required parameter %s - %s", defaultParam.Name(), err.Error())
} else {
finalParams = append(finalParams, defaultParam)
}
}
return Configuration{
Params: finalParams,
}, nil
}
type StringParam struct {
Param
Value string
AllowEmpty bool
}
func (sp StringParam) validate(defaultConfig ConfigParam) error {
dsp := defaultConfig.(StringParam)
if !dsp.AllowEmpty && strings.TrimSpace(sp.Value) == "" {
return fmt.Errorf("%s cannot be empty", sp.Name())
}
// if the parameter is not required it doesn't matter
return nil
}
type StringListParam struct {
Param
Values []string
}
type StringOptions struct {
Param
Options []string
Value string
AllowEmpty bool
}
func (so StringOptions) validate(defaultConfig ConfigParam) error {
dso := defaultConfig.(StringOptions)
if !dso.AllowEmpty && strings.TrimSpace(so.Value) == "" {
return fmt.Errorf("%s cannot be empty", so.Name())
}
if !utils.Contains(dso.Options, so.Value) {
return fmt.Errorf("%s is not a valid value for %s - %v", so.Value, so.Name(), dso.Options)
}
// if the parameter is not required it doesn't matter
return nil
}
type IntParam struct {
Param
Value int64
}
type BoolParam struct {
Param
Value bool
}

@ -0,0 +1,181 @@
package config_test
import (
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/stretchr/testify/require"
"testing"
)
var conf = config.Configuration{
Params: []config.ConfigParam{
config.StringListParam{
Values: []string{"some", "values"},
Param: config.NewParam("paramA", "", false),
},
config.StringParam{
Value: "random",
Param: config.NewParam("paramB", "", true),
},
config.StringParam{
Value: "",
AllowEmpty: true,
Param: config.NewParam("paramC", "", false),
},
config.StringOptions{
Value: "random",
Options: []string{"random", "very_random", "very_very_random"},
Param: config.NewParam("paramD", "", false),
AllowEmpty: true,
},
},
}
func TestGetConfigParam(t *testing.T) {
t.Run("can get config param by name", func(t *testing.T) {
paramA, err := conf.GetConfigParam("paramA")
require.Nil(t, err)
require.NotNil(t, paramA)
require.IsType(t, config.StringListParam{}, paramA)
stringListParam, ok := paramA.(config.StringListParam)
require.True(t, ok)
require.False(t, stringListParam.Required())
require.Equal(t, "paramA", stringListParam.Name())
require.ElementsMatch(t, stringListParam.Values, []string{"some", "values"})
})
t.Run("throws error on missing element", func(t *testing.T) {
paramZ, err := conf.GetConfigParam("paramZ")
require.Nil(t, paramZ)
require.NotNil(t, err)
require.Equal(t, "paramZ does not exist", err.Error())
})
}
func TestValidateConfig(t *testing.T) {
t.Run("validate adds the default and allows override", func(t *testing.T) {
customConf := config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "custom",
Param: config.NewParam("paramB", "", true),
},
},
}
newConf, err := customConf.ValidateConfig(conf)
require.Nil(t, err)
require.NotNil(t, newConf)
require.Len(t, newConf.Params, 4)
// check first param
require.IsType(t, config.StringListParam{}, newConf.Params[0])
stringListParam, ok := newConf.Params[0].(config.StringListParam)
require.True(t, ok)
require.False(t, stringListParam.Required())
require.Equal(t, "paramA", stringListParam.Name())
require.ElementsMatch(t, stringListParam.Values, []string{"some", "values"})
// check second param
require.IsType(t, config.StringParam{}, newConf.Params[1])
stringParam, ok := newConf.Params[1].(config.StringParam)
require.True(t, ok)
require.True(t, stringParam.Required())
require.Equal(t, "paramB", stringParam.Name())
require.Equal(t, "custom", stringParam.Value)
})
t.Run("validate errors if missing param", func(t *testing.T) {
//missing required paramB
customConf := config.Configuration{
Params: []config.ConfigParam{
config.StringListParam{
Values: []string{"some", "values"},
Param: config.NewParam("paramA", "", false),
},
},
}
newConf, err := customConf.ValidateConfig(conf)
require.Nil(t, newConf.Params)
require.NotNil(t, err)
require.Equal(t, "missing required parameter paramB - paramB does not exist", err.Error())
})
t.Run("validate errors if invalid string value", func(t *testing.T) {
// required paramB is set but empty, and the default does not allow empty values
customConf := config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "",
Param: config.NewParam("paramB", "", true),
},
},
}
newConf, err := customConf.ValidateConfig(conf)
require.Nil(t, newConf.Params)
require.NotNil(t, err)
require.Equal(t, "parameter paramB is invalid - paramB cannot be empty", err.Error())
})
t.Run("allow empty string value if specified", func(t *testing.T) {
// paramC is empty, which the default allows via AllowEmpty
customConf := config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "",
Param: config.NewParam("paramC", "", true),
},
config.StringParam{
Value: "custom",
Param: config.NewParam("paramB", "", true),
},
},
}
newConf, err := customConf.ValidateConfig(conf)
require.NotNil(t, newConf.Params)
require.Nil(t, err)
})
t.Run("validate errors if invalid string options value", func(t *testing.T) {
// paramD is set to a value that is not one of the allowed options
customConf := config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "not_random",
Param: config.NewParam("paramB", "", true),
},
config.StringOptions{
Value: "custom",
// this isn't ideal we need to ensure options are set for this to validate correctly
Options: []string{"random", "very_random", "very_very_random"},
Param: config.NewParam("paramD", "", true),
},
},
}
newConf, err := customConf.ValidateConfig(conf)
require.Nil(t, newConf.Params)
require.NotNil(t, err)
require.Equal(t, "parameter paramD is invalid - custom is not a valid value for paramD - [random very_random very_very_random]", err.Error())
})
t.Run("allow empty string value for StringOptions if specified", func(t *testing.T) {
// an empty StringOptions value is accepted
customConf := config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: "custom",
Param: config.NewParam("paramB", "", true),
},
config.StringOptions{
Param: config.Param{},
// this isn't ideal we need to ensure options are set for this to validate correctly
Options: []string{"random", "very_random", "very_very_random"},
Value: "",
},
},
}
newConf, err := customConf.ValidateConfig(conf)
require.NotNil(t, newConf.Params)
require.Nil(t, err)
})
//TODO: Do we need to test if parameters of the same name but wrong type are passed??
}

@ -0,0 +1,73 @@
package config
import (
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/utils"
)
func ReadStringListValues(conf Configuration, paramName string) ([]string, error) {
param, err := conf.GetConfigParam(paramName)
if err != nil {
return nil, err
}
value, ok := param.(StringListParam)
if !ok {
return nil, fmt.Errorf("%s must be a list of strings", paramName)
}
return value.Values, nil
}
func ReadStringValue(conf Configuration, paramName string) (string, error) {
param, err := conf.GetConfigParam(paramName)
if err != nil {
return "", err
}
value, ok := param.(StringParam)
if !ok {
return "", fmt.Errorf("%s must be a string", paramName)
}
return value.Value, nil
}
func ReadIntValue(conf Configuration, paramName string) (int64, error) {
param, err := conf.GetConfigParam(paramName)
if err != nil {
return 0, err
}
value, ok := param.(IntParam)
if !ok {
return 0, fmt.Errorf("%s must be an integer", paramName)
}
return value.Value, nil
}
func ReadBoolValue(conf Configuration, paramName string) (bool, error) {
param, err := conf.GetConfigParam(paramName)
if err != nil {
return false, err
}
value, ok := param.(BoolParam)
if !ok {
return false, fmt.Errorf("%s must be a boolean", paramName)
}
return value.Value, nil
}
func ReadStringOptionsValue(conf Configuration, paramName string) (string, error) {
param, err := conf.GetConfigParam(paramName)
if err != nil {
return "", err
}
value, ok := param.(StringOptions)
if !ok {
return "", fmt.Errorf("%s must be a string options", paramName)
}
if !utils.Contains(value.Options, value.Value) {
return "", fmt.Errorf("%s is not a valid option in %v for the the parameter %s", value.Value, value.Options, paramName)
}
return value.Value, nil
}

@ -0,0 +1,141 @@
package config_test
import (
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/stretchr/testify/require"
"testing"
)
func TestReadStringListValues(t *testing.T) {
t.Run("can find a string list param", func(t *testing.T) {
conf := config.Configuration{
Params: []config.ConfigParam{
config.StringListParam{
// nil means include everything
Values: nil,
Param: config.NewParam("include_tables", "Specify list of tables to collect", false),
},
config.StringListParam{
Values: []string{"licenses", "settings"},
Param: config.NewParam("exclude_tables", "Specify list of tables not to collect", false),
},
},
}
excludeTables, err := config.ReadStringListValues(conf, "exclude_tables")
require.Nil(t, err)
require.Equal(t, []string{"licenses", "settings"}, excludeTables)
})
}
func TestReadStringValue(t *testing.T) {
t.Run("can find a string param", func(t *testing.T) {
conf := config.Configuration{
Params: []config.ConfigParam{
config.StringListParam{
// nil means include everything
Values: nil,
Param: config.NewParam("include_tables", "Specify list of tables to collect", false),
},
config.StringParam{
Value: "/tmp/dump",
Param: config.NewParam("directory", "Specify a directory", false),
},
},
}
directory, err := config.ReadStringValue(conf, "directory")
require.Nil(t, err)
require.Equal(t, "/tmp/dump", directory)
})
}
func TestReadIntValue(t *testing.T) {
t.Run("can find an integer param", func(t *testing.T) {
conf := config.Configuration{
Params: []config.ConfigParam{
config.IntParam{
Value: 10000,
Param: config.NewParam("row_limit", "Max Rows to collect", false),
},
config.StringListParam{
// nil means include everything
Values: nil,
Param: config.NewParam("include_tables", "Specify list of tables to collect", false),
},
config.StringParam{
Value: "/tmp/dump",
Param: config.NewParam("directory", "Specify a directory", false),
},
},
}
rowLimit, err := config.ReadIntValue(conf, "row_limit")
require.Nil(t, err)
require.Equal(t, int64(10000), rowLimit)
})
}
func TestReadBoolValue(t *testing.T) {
t.Run("can find a boolean param", func(t *testing.T) {
conf := config.Configuration{
Params: []config.ConfigParam{
config.BoolParam{
Value: true,
Param: config.NewParam("compress", "Compress data", false),
},
config.StringListParam{
// nil means include everything
Values: nil,
Param: config.NewParam("include_tables", "Specify list of tables to collect", false),
},
config.StringParam{
Value: "/tmp/dump",
Param: config.NewParam("directory", "Specify a directory", false),
},
},
}
compress, err := config.ReadBoolValue(conf, "compress")
require.Nil(t, err)
require.True(t, compress)
})
}
func TestReadStringOptionsValue(t *testing.T) {
t.Run("can find a string value in a list of options", func(t *testing.T) {
conf := config.Configuration{
Params: []config.ConfigParam{
config.StringOptions{
Param: config.NewParam("format", "List of formats", false),
Options: []string{"csv", "tsv", "binary", "json", "ndjson"},
Value: "csv",
AllowEmpty: false,
},
},
}
format, err := config.ReadStringOptionsValue(conf, "format")
require.Nil(t, err)
require.Equal(t, "csv", format)
})
t.Run("errors on invalid value", func(t *testing.T) {
conf := config.Configuration{
Params: []config.ConfigParam{
config.StringOptions{
Param: config.NewParam("format", "List of formats", false),
Options: []string{"csv", "tsv", "binary", "json", "ndjson"},
Value: "random",
AllowEmpty: false,
},
},
}
format, err := config.ReadStringOptionsValue(conf, "format")
require.Equal(t, "random is not a valid option in [csv tsv binary json ndjson] for the the parameter format", err.Error())
require.Equal(t, "", format)
})
}

@ -0,0 +1,27 @@
package data
import (
"strings"
)
// DiagnosticBundle contains the results from a Collector
// each frame can represent a table or collection of data files. By allowing multiple frames a single DiagnosticBundle
// can potentially contain many related tables
type DiagnosticBundle struct {
Frames map[string]Frame
// Errors is a property to be set if the Collector has an error. This can be used to indicate a partial collection
// and failed frames
Errors FrameErrors
}
type FrameErrors struct {
Errors []error
}
func (fe *FrameErrors) Error() string {
errors := make([]string, len(fe.Errors))
for i := range errors {
errors[i] = fe.Errors[i].Error()
}
return strings.Join(errors, "\n")
}

@ -0,0 +1,25 @@
package data_test
import (
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/pkg/errors"
"github.com/stretchr/testify/require"
"testing"
)
func TestBundleError(t *testing.T) {
t.Run("can get a bundle error", func(t *testing.T) {
errs := make([]error, 3)
errs[0] = errors.New("Error 1")
errs[1] = errors.New("Error 2")
errs[2] = errors.New("Error 3")
fErrors := data.FrameErrors{
Errors: errs,
}
require.Equal(t, `Error 1
Error 2
Error 3`, fErrors.Error())
})
}

@ -0,0 +1,88 @@
package data
import (
"database/sql"
"fmt"
"reflect"
"strings"
)
type DatabaseFrame struct {
name string
ColumnNames []string
rows *sql.Rows
columnTypes []*sql.ColumnType
vars []interface{}
}
func NewDatabaseFrame(name string, rows *sql.Rows) (DatabaseFrame, error) {
databaseFrame := DatabaseFrame{}
columnTypes, err := rows.ColumnTypes()
if err != nil {
return DatabaseFrame{}, err
}
databaseFrame.columnTypes = columnTypes
databaseFrame.name = name
vars := make([]interface{}, len(columnTypes))
columnNames := make([]string, len(columnTypes))
for i := range columnTypes {
value := reflect.Zero(columnTypes[i].ScanType()).Interface()
vars[i] = &value
columnNames[i] = columnTypes[i].Name()
}
databaseFrame.ColumnNames = columnNames
databaseFrame.vars = vars
databaseFrame.rows = rows
return databaseFrame, nil
}
// Next returns the values of the next row. The bool is false once the rows are exhausted,
// at which point the underlying rows are closed.
func (f DatabaseFrame) Next() ([]interface{}, bool, error) {
if !f.rows.Next() {
// TODO: raise issue as this seems to always raise an error
//err := f.rows.Err()
f.rows.Close()
return nil, false, nil
}
if err := f.rows.Scan(f.vars...); err != nil {
return nil, false, err
}
values := make([]interface{}, len(f.columnTypes))
for i := range f.columnTypes {
ptr := reflect.ValueOf(f.vars[i])
values[i] = ptr.Elem().Interface()
}
return values, true, nil
}
func (f DatabaseFrame) Columns() []string {
return f.ColumnNames
}
func (f DatabaseFrame) Name() string {
return f.name
}
type Order int
const (
Asc Order = 1
Desc Order = 2
)
type OrderBy struct {
Column string
Order Order
}
func (o OrderBy) String() string {
if strings.TrimSpace(o.Column) == "" {
return ""
}
switch o.Order {
case Asc:
return fmt.Sprintf(" ORDER BY %s ASC", o.Column)
case Desc:
return fmt.Sprintf(" ORDER BY %s DESC", o.Column)
}
return ""
}

View File

@ -0,0 +1,85 @@
package data_test
import (
"database/sql"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/DATA-DOG/go-sqlmock"
"github.com/stretchr/testify/require"
"testing"
)
func TestString(t *testing.T) {
t.Run("can order by asc", func(t *testing.T) {
orderBy := data.OrderBy{
Column: "created_at",
Order: data.Asc,
}
require.Equal(t, " ORDER BY created_at ASC", orderBy.String())
})
t.Run("can order by desc", func(t *testing.T) {
orderBy := data.OrderBy{
Column: "created_at",
Order: data.Desc,
}
require.Equal(t, " ORDER BY created_at DESC", orderBy.String())
})
}
func TestNextDatabaseFrame(t *testing.T) {
t.Run("can iterate sql rows", func(t *testing.T) {
rowValues := [][]interface{}{
{int64(1), "post_1", "hello"},
{int64(2), "post_2", "world"},
{int64(3), "post_3", "goodbye"},
{int64(4), "post_4", "world"},
}
mockRows := sqlmock.NewRows([]string{"id", "title", "body"})
for i := range rowValues {
mockRows.AddRow(rowValues[i][0], rowValues[i][1], rowValues[i][2])
}
rows := mockRowsToSqlRows(mockRows)
dbFrame, err := data.NewDatabaseFrame("test", rows)
require.ElementsMatch(t, dbFrame.Columns(), []string{"id", "title", "body"})
require.Nil(t, err)
i := 0
for {
values, ok, err := dbFrame.Next()
require.Nil(t, err)
if !ok {
break
}
require.Len(t, values, 3)
require.ElementsMatch(t, values, rowValues[i])
i++
}
require.Equal(t, 4, i)
})
t.Run("can iterate empty sql rows", func(t *testing.T) {
mockRows := sqlmock.NewRows([]string{"id", "title", "body"})
rows := mockRowsToSqlRows(mockRows)
dbFrame, err := data.NewDatabaseFrame("test", rows)
require.ElementsMatch(t, dbFrame.Columns(), []string{"id", "title", "body"})
require.Nil(t, err)
i := 0
for {
_, ok, err := dbFrame.Next()
require.Nil(t, err)
if !ok {
break
}
i++
}
require.Equal(t, 0, i)
})
}
func mockRowsToSqlRows(mockRows *sqlmock.Rows) *sql.Rows {
db, mock, _ := sqlmock.New()
mock.ExpectQuery("select").WillReturnRows(mockRows)
rows, _ := db.Query("select")
return rows
}

View File

@ -0,0 +1,8 @@
package data
type Field struct {
// Name of the field
Name string
// A list of values, each of which must implement the FieldType interface
Values []interface{}
}

View File

@ -0,0 +1,443 @@
package data
import (
"bufio"
"encoding/xml"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/utils"
"github.com/pkg/errors"
"gopkg.in/yaml.v3"
"io/ioutil"
"os"
"path"
"path/filepath"
"regexp"
)
type File interface {
Copy(destPath string, removeSensitive bool) error
FilePath() string
}
type SimpleFile struct {
Path string
}
// Copy copies the file to destPath. removeSensitive is accepted to satisfy the File interface
// but is ignored for a SimpleFile: the content is always copied unchanged.
func (s SimpleFile) Copy(destPath string, removeSensitive bool) error {
// simple copy easiest
if err := utils.CopyFile(s.FilePath(), destPath); err != nil {
return errors.Wrapf(err, "unable to copy file %s", s.FilePath())
}
return nil
}
func (s SimpleFile) FilePath() string {
return s.Path
}
func NewFileFrame(name string, filePaths []string) FileFrame {
i := 0
files := make([]File, len(filePaths))
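// use distinct names so the loop does not shadow the cursor i or the imported path package
for idx, filePath := range filePaths {
files[idx] = SimpleFile{
Path: filePath,
}
}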
return FileFrame{
name: name,
i: &i,
files: files,
}
}
type FileFrame struct {
name string
i *int
files []File
}
func (f FileFrame) Next() ([]interface{}, bool, error) {
if len(f.files) == *(f.i) {
return nil, false, nil
}
file := f.files[*f.i]
*f.i++
value := make([]interface{}, 1)
value[0] = file
return value, true, nil
}
func (f FileFrame) Columns() []string {
return []string{"files"}
}
func (f FileFrame) Name() string {
return f.name
}
// DirectoryFileFrame represents a set of files under a directory
type DirectoryFileFrame struct {
FileFrame
Directory string
}
func NewFileDirectoryFrame(directory string, exts []string) (DirectoryFileFrame, []error) {
filePaths, errs := utils.ListFilesInDirectory(directory, exts)
files := make([]File, len(filePaths))
for i, path := range filePaths {
files[i] = SimpleFile{
Path: path,
}
}
i := 0
return DirectoryFileFrame{
Directory: directory,
FileFrame: FileFrame{
files: files,
i: &i,
},
}, errs
}
func (f DirectoryFileFrame) Next() ([]interface{}, bool, error) {
if len(f.files) == *(f.i) {
return nil, false, nil
}
file := f.files[*f.i]
*f.i++
value := make([]interface{}, 1)
value[0] = file
return value, true, nil
}
func (f DirectoryFileFrame) Columns() []string {
return []string{"files"}
}
func (f DirectoryFileFrame) Name() string {
return f.Directory
}
type ConfigFile interface {
File
FindLogPaths() ([]string, error)
FindIncludedConfig() (ConfigFile, error)
IsIncluded() bool
}
type ConfigFileFrame struct {
i *int
Directory string
files []ConfigFile
}
func (f ConfigFileFrame) Next() ([]interface{}, bool, error) {
if len(f.files) == *(f.i) {
return nil, false, nil
}
file := f.files[*f.i]
*f.i++
value := make([]interface{}, 1)
value[0] = file
return value, true, nil
}
func (f ConfigFileFrame) Name() string {
return f.Directory
}
func NewConfigFileFrame(directory string) (ConfigFileFrame, []error) {
files, errs := utils.ListFilesInDirectory(directory, []string{"*.xml", "*.yaml", "*.yml"})
// we can't predict the length because of include files
var configs []ConfigFile
for _, path := range files {
var configFile ConfigFile
switch ext := filepath.Ext(path); ext {
case ".xml":
configFile = XmlConfigFile{
Path: path,
Included: false,
}
case ".yml", ".yaml":
configFile = YamlConfigFile{
Path: path,
Included: false,
}
}
if configFile != nil {
configs = append(configs, configFile)
// add any included configs
iConf, err := configFile.FindIncludedConfig()
if err != nil {
errs = append(errs, err)
} else {
if iConf.FilePath() != "" {
configs = append(configs, iConf)
}
}
}
}
i := 0
return ConfigFileFrame{
i: &i,
Directory: directory,
files: configs,
}, errs
}
func (f ConfigFileFrame) Columns() []string {
return []string{"config"}
}
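// FindLogPaths collects the log paths declared in every config file of the frame.
// The error slice is named errs to avoid shadowing the imported errors package.
func (f ConfigFileFrame) FindLogPaths() (logPaths []string, errs []error) {
for _, configFile := range f.files {
paths, err := configFile.FindLogPaths()
if err != nil {
errs = append(errs, err)
} else {
logPaths = append(logPaths, paths...)
}
}
return logPaths, errs
}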
type XmlConfigFile struct {
Path string
Included bool
}
// these patterns will be used to remove sensitive content - matches of the pattern will be replaced with the key
var xmlSensitivePatterns = map[string]*regexp.Regexp{
"<password>Replaced</password>": regexp.MustCompile(`<password>(.*)</password>`),
"<password_sha256_hex>Replaced</password_sha256_hex>": regexp.MustCompile(`<password_sha256_hex>(.*)</password_sha256_hex>`),
"<secret_access_key>Replaced</secret_access_key>": regexp.MustCompile(`<secret_access_key>(.*)</secret_access_key>`),
"<access_key_id>Replaced</access_key_id>": regexp.MustCompile(`<access_key_id>(.*)</access_key_id>`),
"<secret>Replaced</secret>": regexp.MustCompile(`<secret>(.*)</secret>`),
}
func (x XmlConfigFile) Copy(destPath string, removeSensitive bool) error {
if !removeSensitive {
// simple copy easiest
if err := utils.CopyFile(x.FilePath(), destPath); err != nil {
return errors.Wrapf(err, "unable to copy file %s", x.FilePath())
}
return nil
}
return sensitiveFileCopy(x.FilePath(), destPath, xmlSensitivePatterns)
}
func (x XmlConfigFile) FilePath() string {
return x.Path
}
func (x XmlConfigFile) IsIncluded() bool {
return x.Included
}
type XmlLoggerConfig struct {
XMLName xml.Name `xml:"logger"`
ErrorLog string `xml:"errorlog"`
Log string `xml:"log"`
}
type YandexXMLConfig struct {
XMLName xml.Name `xml:"yandex"`
Clickhouse XmlLoggerConfig `xml:"logger"`
IncludeFrom string `xml:"include_from"`
}
type XmlConfig struct {
XMLName xml.Name `xml:"clickhouse"`
Clickhouse XmlLoggerConfig `xml:"logger"`
IncludeFrom string `xml:"include_from"`
}
func (x XmlConfigFile) UnmarshallConfig() (XmlConfig, error) {
inputFile, err := ioutil.ReadFile(x.Path)
if err != nil {
return XmlConfig{}, err
}
var cConfig XmlConfig
err = xml.Unmarshal(inputFile, &cConfig)
if err == nil {
return XmlConfig{
Clickhouse: cConfig.Clickhouse,
IncludeFrom: cConfig.IncludeFrom,
}, nil
}
// attempt to unmarshal as a legacy file with a yandex root tag
var yConfig YandexXMLConfig
err = xml.Unmarshal(inputFile, &yConfig)
if err != nil {
return XmlConfig{}, err
}
return XmlConfig{
Clickhouse: yConfig.Clickhouse,
IncludeFrom: yConfig.IncludeFrom,
}, nil
}
func (x XmlConfigFile) FindLogPaths() ([]string, error) {
var paths []string
config, err := x.UnmarshallConfig()
if err != nil {
return nil, err
}
if config.Clickhouse.Log != "" {
paths = append(paths, config.Clickhouse.Log)
}
if config.Clickhouse.ErrorLog != "" {
paths = append(paths, config.Clickhouse.ErrorLog)
}
return paths, nil
}
func (x XmlConfigFile) FindIncludedConfig() (ConfigFile, error) {
if x.Included {
// can't recurse: an included config is not searched for further includes
return XmlConfigFile{}, nil
}
config, err := x.UnmarshallConfig()
if err != nil {
return XmlConfigFile{}, err
}
// an absolute include path is used as-is; a relative one is resolved against this config file's directory
if config.IncludeFrom != "" {
if filepath.IsAbs(config.IncludeFrom) {
return XmlConfigFile{Path: config.IncludeFrom, Included: true}, nil
}
confDir := filepath.Dir(x.FilePath())
return XmlConfigFile{Path: path.Join(confDir, config.IncludeFrom), Included: true}, nil
}
return XmlConfigFile{}, nil
}
type YamlConfigFile struct {
Path string
Included bool
}
var ymlSensitivePatterns = map[string]*regexp.Regexp{
"password: 'Replaced'": regexp.MustCompile(`password:\s*.*$`),
"password_sha256_hex: 'Replaced'": regexp.MustCompile(`password_sha256_hex:\s*.*$`),
"access_key_id: 'Replaced'": regexp.MustCompile(`access_key_id:\s*.*$`),
"secret_access_key: 'Replaced'": regexp.MustCompile(`secret_access_key:\s*.*$`),
"secret: 'Replaced'": regexp.MustCompile(`secret:\s*.*$`),
}
func (y YamlConfigFile) Copy(destPath string, removeSensitive bool) error {
if !removeSensitive {
// simple copy easiest
if err := utils.CopyFile(y.FilePath(), destPath); err != nil {
return errors.Wrapf(err, "unable to copy file %s", y.FilePath())
}
return nil
}
return sensitiveFileCopy(y.FilePath(), destPath, ymlSensitivePatterns)
}
func (y YamlConfigFile) FilePath() string {
return y.Path
}
func (y YamlConfigFile) IsIncluded() bool {
return y.Included
}
type YamlLoggerConfig struct {
Log string
ErrorLog string
}
type YamlConfig struct {
Logger YamlLoggerConfig
Include_From string
}
func (y YamlConfigFile) FindLogPaths() ([]string, error) {
var paths []string
inputFile, err := ioutil.ReadFile(y.Path)
if err != nil {
return nil, err
}
var config YamlConfig
err = yaml.Unmarshal(inputFile, &config)
if err != nil {
return nil, err
}
if config.Logger.Log != "" {
paths = append(paths, config.Logger.Log)
}
if config.Logger.ErrorLog != "" {
paths = append(paths, config.Logger.ErrorLog)
}
return paths, nil
}
func (y YamlConfigFile) FindIncludedConfig() (ConfigFile, error) {
if y.Included {
// can't recurse: an included config is not searched for further includes
return YamlConfigFile{}, nil
}
inputFile, err := ioutil.ReadFile(y.Path)
if err != nil {
return YamlConfigFile{}, err
}
var config YamlConfig
err = yaml.Unmarshal(inputFile, &config)
if err != nil {
return YamlConfigFile{}, err
}
if config.Include_From != "" {
if filepath.IsAbs(config.Include_From) {
return YamlConfigFile{Path: config.Include_From, Included: true}, nil
}
confDir := filepath.Dir(y.FilePath())
return YamlConfigFile{Path: path.Join(confDir, config.Include_From), Included: true}, nil
}
return YamlConfigFile{}, nil
}
func sensitiveFileCopy(sourcePath string, destPath string, patterns map[string]*regexp.Regexp) error {
destDir := filepath.Dir(destPath)
if err := os.MkdirAll(destDir, os.ModePerm); err != nil {
return errors.Wrapf(err, "unable to create directory %s", destDir)
}
// currently, we don't unmarshall into a struct - we want to preserve structure and comments. Possibly could
// be handled but for simplicity we do a line parse for now
inputFile, err := os.Open(sourcePath)
if err != nil {
return err
}
defer inputFile.Close()
outputFile, err := os.Create(destPath)
if err != nil {
return err
}
defer outputFile.Close()
writer := bufio.NewWriter(outputFile)
scanner := bufio.NewScanner(inputFile)
for scanner.Scan() {
line := scanner.Text()
for repl, pattern := range patterns {
line = pattern.ReplaceAllString(line, repl)
}
_, err = writer.WriteString(line + "\n")
if err != nil {
return err
}
}
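// a scan error (e.g. an over-long line) would otherwise silently truncate the copy
if err := scanner.Err(); err != nil {
return err
}
return writer.Flush()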
}
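
The line-by-line redaction performed by sensitiveFileCopy can be illustrated in isolation. The following is a minimal sketch, not the tool's implementation: it works on an in-memory string instead of files, and the names `redactionPatterns` and `redact` are hypothetical. It assumes the same convention as the pattern maps above, where the map key is the replacement text.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// redactionPatterns maps a replacement string to the pattern it replaces,
// following the key/value convention of xmlSensitivePatterns.
// (Hypothetical subset, for illustration only.)
var redactionPatterns = map[string]*regexp.Regexp{
	"<password>Replaced</password>": regexp.MustCompile(`<password>(.*)</password>`),
	"<secret>Replaced</secret>":     regexp.MustCompile(`<secret>(.*)</secret>`),
}

// redact applies every pattern to each line of input and returns the result.
// Working per line preserves the structure and comments of the original text.
func redact(input string, patterns map[string]*regexp.Regexp) (string, error) {
	var out strings.Builder
	scanner := bufio.NewScanner(strings.NewReader(input))
	for scanner.Scan() {
		line := scanner.Text()
		for repl, pattern := range patterns {
			line = pattern.ReplaceAllString(line, repl)
		}
		out.WriteString(line + "\n")
	}
	// Report scan failures rather than returning truncated output.
	if err := scanner.Err(); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	redacted, err := redact("<user>\n<password>hunter2</password>\n</user>\n", redactionPatterns)
	if err != nil {
		panic(err)
	}
	fmt.Print(redacted)
}
```

Because each replacement no longer contains the secret and re-matching it yields the same text, the order in which the map is iterated does not affect the result for these patterns.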

View File

@ -0,0 +1,280 @@
package data_test
import (
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/stretchr/testify/require"
"io/ioutil"
"os"
"path"
"path/filepath"
"strings"
"testing"
)
func TestNextFileDirectoryFrame(t *testing.T) {
t.Run("can iterate file frame", func(t *testing.T) {
tempDir := t.TempDir()
files := make([]string, 5)
for i := 0; i < 5; i++ {
fileDir := path.Join(tempDir, fmt.Sprintf("%d", i))
err := os.MkdirAll(fileDir, os.ModePerm)
require.Nil(t, err)
filepath := path.Join(fileDir, fmt.Sprintf("random-%d.txt", i))
files[i] = filepath
_, err = os.Create(filepath)
require.Nil(t, err)
}
fileFrame, errs := data.NewFileDirectoryFrame(tempDir, []string{"*.txt"})
require.Empty(t, errs)
i := 0
for {
values, ok, err := fileFrame.Next()
require.Nil(t, err)
if !ok {
break
}
require.Len(t, values, 1)
require.Equal(t, files[i], values[0].(data.SimpleFile).Path)
i += 1
}
require.Equal(t, 5, i)
})
t.Run("can iterate file frame when empty", func(t *testing.T) {
// an empty temporary directory should yield no files
tempDir := t.TempDir()
fileFrame, errs := data.NewFileDirectoryFrame(tempDir, []string{"*"})
require.Empty(t, errs)
i := 0
for {
_, ok, err := fileFrame.Next()
require.Nil(t, err)
if !ok {
break
}
// count the rows so that the assertion below is meaningful
i++
}
require.Equal(t, 0, i)
})
}
func TestNewConfigFileFrame(t *testing.T) {
t.Run("can iterate config file frame", func(t *testing.T) {
cwd, err := os.Getwd()
require.Nil(t, err)
configFrame, errs := data.NewConfigFileFrame(path.Join(cwd, "../../../testdata", "configs", "xml"))
require.Empty(t, errs)
i := 0
for {
values, ok, err := configFrame.Next()
require.Nil(t, err)
if !ok {
break
}
require.Len(t, values, 1)
filePath := values[0].(data.XmlConfigFile).FilePath()
require.True(t, strings.Contains(filePath, ".xml"))
i += 1
}
// 5 not 3 due to the includes
require.Equal(t, 5, i)
})
t.Run("can iterate file frame when empty", func(t *testing.T) {
// an empty temporary directory should yield no config files
tempDir := t.TempDir()
configFrame, errs := data.NewConfigFileFrame(tempDir)
require.Empty(t, errs)
i := 0
for {
_, ok, err := configFrame.Next()
require.Nil(t, err)
if !ok {
break
}
// count the rows so that the assertion below is meaningful
i++
}
require.Equal(t, 0, i)
})
}
func TestConfigFileFrameCopy(t *testing.T) {
t.Run("can copy non-sensitive xml config files", func(t *testing.T) {
tmrDir := t.TempDir()
cwd, err := os.Getwd()
require.Nil(t, err)
configFrame, errs := data.NewConfigFileFrame(path.Join(cwd, "../../../testdata", "configs", "xml"))
require.Empty(t, errs)
for {
values, ok, err := configFrame.Next()
require.Nil(t, err)
if !ok {
break
}
require.Nil(t, err)
require.True(t, ok)
configFile := values[0].(data.XmlConfigFile)
newPath := path.Join(tmrDir, filepath.Base(configFile.FilePath()))
err = configFile.Copy(newPath, false)
require.FileExists(t, newPath)
sourceInfo, _ := os.Stat(configFile.FilePath())
destInfo, _ := os.Stat(newPath)
require.Equal(t, sourceInfo.Size(), destInfo.Size())
require.Nil(t, err)
}
})
t.Run("can copy sensitive xml config files", func(t *testing.T) {
tmrDir := t.TempDir()
cwd, err := os.Getwd()
require.Nil(t, err)
configFrame, errs := data.NewConfigFileFrame(path.Join(cwd, "../../../testdata", "configs", "xml"))
require.Empty(t, errs)
i := 0
sizes := map[string]int64{
"users.xml": int64(2039),
"default-password.xml": int64(188),
"config.xml": int64(61282),
"server-include.xml": int64(168),
"user-include.xml": int64(582),
}
var checkedFiles []string
for {
values, ok, err := configFrame.Next()
require.Nil(t, err)
if !ok {
break
}
require.Nil(t, err)
require.True(t, ok)
configFile := values[0].(data.XmlConfigFile)
fileName := filepath.Base(configFile.FilePath())
newPath := path.Join(tmrDir, fileName)
err = configFile.Copy(newPath, true)
require.FileExists(t, newPath)
destInfo, _ := os.Stat(newPath)
require.Equal(t, sizes[fileName], destInfo.Size())
require.Nil(t, err)
bytes, err := ioutil.ReadFile(newPath)
require.Nil(t, err)
s := string(bytes)
checkedFiles = append(checkedFiles, fileName)
if fileName == "users.xml" || fileName == "default-password.xml" || fileName == "user-include.xml" {
require.True(t, strings.Contains(s, "<password>Replaced</password>") ||
strings.Contains(s, "<password_sha256_hex>Replaced</password_sha256_hex>"))
require.NotContains(t, s, "<password>REPLACE_ME</password>")
require.NotContains(t, s, "<password_sha256_hex>REPLACE_ME</password_sha256_hex>")
} else if fileName == "config.xml" {
require.True(t, strings.Contains(s, "<access_key_id>Replaced</access_key_id>"))
require.True(t, strings.Contains(s, "<secret_access_key>Replaced</secret_access_key>"))
require.True(t, strings.Contains(s, "<secret>Replaced</secret>"))
require.NotContains(t, s, "<access_key_id>REPLACE_ME</access_key_id>")
require.NotContains(t, s, "<secret_access_key>REPLACE_ME</secret_access_key>")
require.NotContains(t, s, "<secret>REPLACE_ME</secret>")
}
i++
}
require.ElementsMatch(t, []string{"users.xml", "default-password.xml", "user-include.xml", "config.xml", "server-include.xml"}, checkedFiles)
require.Equal(t, 5, i)
})
t.Run("can copy sensitive yaml config files", func(t *testing.T) {
tmrDir := t.TempDir()
cwd, err := os.Getwd()
require.Nil(t, err)
configFrame, errs := data.NewConfigFileFrame(path.Join(cwd, "../../../testdata", "configs", "yaml"))
require.Empty(t, errs)
i := 0
sizes := map[string]int64{
"users.yaml": int64(1023),
"default-password.yaml": int64(132),
"config.yaml": int64(42512),
"server-include.yaml": int64(21),
"user-include.yaml": int64(120),
}
var checkedFiles []string
for {
values, ok, err := configFrame.Next()
require.Nil(t, err)
if !ok {
break
}
require.Nil(t, err)
require.True(t, ok)
configFile := values[0].(data.YamlConfigFile)
fileName := filepath.Base(configFile.FilePath())
newPath := path.Join(tmrDir, fileName)
err = configFile.Copy(newPath, true)
require.FileExists(t, newPath)
destInfo, _ := os.Stat(newPath)
require.Equal(t, sizes[fileName], destInfo.Size())
require.Nil(t, err)
bytes, err := ioutil.ReadFile(newPath)
require.Nil(t, err)
s := string(bytes)
checkedFiles = append(checkedFiles, fileName)
if fileName == "users.yaml" || fileName == "default-password.yaml" || fileName == "user-include.yaml" {
require.True(t, strings.Contains(s, "password: 'Replaced'") ||
strings.Contains(s, "password_sha256_hex: 'Replaced'"))
require.NotContains(t, s, "password: 'REPLACE_ME'")
require.NotContains(t, s, "password_sha256_hex: \"REPLACE_ME\"")
} else if fileName == "config.yaml" {
require.True(t, strings.Contains(s, "access_key_id: 'Replaced'"))
require.True(t, strings.Contains(s, "secret_access_key: 'Replaced'"))
require.True(t, strings.Contains(s, "secret: 'Replaced'"))
require.NotContains(t, s, "access_key_id: 'REPLACE_ME'")
require.NotContains(t, s, "secret_access_key: REPLACE_ME")
require.NotContains(t, s, "secret: REPLACE_ME")
}
i++
}
require.ElementsMatch(t, []string{"users.yaml", "default-password.yaml", "user-include.yaml", "config.yaml", "server-include.yaml"}, checkedFiles)
require.Equal(t, 5, i)
})
}
func TestConfigFileFrameFindLogPaths(t *testing.T) {
t.Run("can find xml log paths", func(t *testing.T) {
cwd, err := os.Getwd()
require.Nil(t, err)
configFrame, errs := data.NewConfigFileFrame(path.Join(cwd, "../../../testdata", "configs", "xml"))
require.Empty(t, errs)
paths, errs := configFrame.FindLogPaths()
require.Empty(t, errs)
require.ElementsMatch(t, []string{"/var/log/clickhouse-server/clickhouse-server.log",
"/var/log/clickhouse-server/clickhouse-server.err.log"}, paths)
})
t.Run("can handle empty log paths", func(t *testing.T) {
configFrame, errs := data.NewConfigFileFrame(t.TempDir())
require.Empty(t, errs)
paths, errs := configFrame.FindLogPaths()
require.Empty(t, errs)
require.Empty(t, paths)
})
t.Run("can find yaml log paths", func(t *testing.T) {
cwd, err := os.Getwd()
require.Nil(t, err)
configFrame, errs := data.NewConfigFileFrame(path.Join(cwd, "../../../testdata", "configs", "yaml"))
require.Empty(t, errs)
paths, errs := configFrame.FindLogPaths()
require.Empty(t, errs)
require.ElementsMatch(t, []string{"/var/log/clickhouse-server/clickhouse-server.log",
"/var/log/clickhouse-server/clickhouse-server.err.log"}, paths)
})
}
// test the legacy format for ClickHouse xml config files with a yandex root tag
func TestYandexConfigFile(t *testing.T) {
t.Run("can find xml log paths with yandex root", func(t *testing.T) {
cwd, err := os.Getwd()
require.Nil(t, err)
configFrame, errs := data.NewConfigFileFrame(path.Join(cwd, "../../../testdata", "configs", "yandex_xml"))
require.Empty(t, errs)
paths, errs := configFrame.FindLogPaths()
require.Empty(t, errs)
require.ElementsMatch(t, []string{"/var/log/clickhouse-server/clickhouse-server.log",
"/var/log/clickhouse-server/clickhouse-server.err.log"}, paths)
})
}

View File

@ -0,0 +1,11 @@
package data
type BaseFrame struct {
Name string
}
type Frame interface {
Next() ([]interface{}, bool, error)
Columns() []string
Name() string
}
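
The Frame interface is an iterator: callers invoke Next until the returned bool is false or an error occurs. The sketch below shows that consumption loop under stated assumptions. It redeclares Frame locally so that it is self-contained, and `sliceFrame`, `newSliceFrame` and `drain` are hypothetical names that do not exist in the package.

```go
package main

import "fmt"

// Frame mirrors the interface above: Next returns the next row, a flag that is
// false once the frame is exhausted, and any error hit while reading.
type Frame interface {
	Next() ([]interface{}, bool, error)
	Columns() []string
	Name() string
}

// sliceFrame is a hypothetical in-memory Frame used only for this illustration.
// The cursor is a pointer so that value receivers share the same position.
type sliceFrame struct {
	name    string
	columns []string
	rows    [][]interface{}
	pos     *int
}

func newSliceFrame(name string, columns []string, rows [][]interface{}) sliceFrame {
	pos := 0
	return sliceFrame{name: name, columns: columns, rows: rows, pos: &pos}
}

func (f sliceFrame) Next() ([]interface{}, bool, error) {
	if *f.pos >= len(f.rows) {
		return nil, false, nil
	}
	row := f.rows[*f.pos]
	*f.pos++
	return row, true, nil
}

func (f sliceFrame) Columns() []string { return f.columns }
func (f sliceFrame) Name() string      { return f.name }

// drain reads every remaining row from a frame, stopping at the first error.
func drain(f Frame) ([][]interface{}, error) {
	var out [][]interface{}
	for {
		values, ok, err := f.Next()
		if err != nil {
			return out, err
		}
		if !ok {
			return out, nil
		}
		out = append(out, values)
	}
}

func main() {
	frame := newSliceFrame("disks", []string{"name", "path"}, [][]interface{}{
		{"default", "/var/lib/clickhouse/"},
		{"backup", "/mnt/backup/"},
	})
	rows, err := drain(frame)
	fmt.Println(frame.Name(), frame.Columns(), len(rows), err)
}
```

A frame is single-pass in this sketch: once drained, a second call to drain returns no rows, which matches how the frames in this package advance a shared cursor.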

View File

@ -0,0 +1,35 @@
package data
type MemoryFrame struct {
i *int
ColumnNames []string
Rows [][]interface{}
name string
}
func NewMemoryFrame(name string, columns []string, rows [][]interface{}) MemoryFrame {
i := 0
return MemoryFrame{
i: &i,
Rows: rows,
ColumnNames: columns,
name: name,
}
}
func (f MemoryFrame) Next() ([]interface{}, bool, error) {
if f.Rows == nil || len(f.Rows) == *(f.i) {
return nil, false, nil
}
value := f.Rows[*f.i]
*f.i++
return value, true, nil
}
func (f MemoryFrame) Columns() []string {
return f.ColumnNames
}
func (f MemoryFrame) Name() string {
return f.name
}

View File

@ -0,0 +1,60 @@
package data_test
import (
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/stretchr/testify/require"
"testing"
)
func TestNextMemoryFrame(t *testing.T) {
t.Run("can iterate memory frame", func(t *testing.T) {
columns := []string{"Filesystem", "Size", "Used", "Avail", "Use%", "Mounted on"}
rows := [][]interface{}{
{"sysfs", 0, 0, 0, 0, "/sys"},
{"proc", 0, 0, 0, 0, "/proc"},
{"udev", 33357840384, 0, 33357840384, 0, "/dev"},
{"devpts", 0, 0, 0, 0, "/dev/pts"},
{"tmpfs", 6682607616, 2228224, 6680379392, 1, "/run"},
{"/dev/mapper/system-root", 1938213220352, 118136926208, 1721548947456, 7.000000000000001, "/"},
}
memoryFrame := data.NewMemoryFrame("disks", columns, rows)
i := 0
for {
values, ok, err := memoryFrame.Next()
require.Nil(t, err)
if !ok {
break
}
require.ElementsMatch(t, values, rows[i])
require.Len(t, values, 6)
i += 1
}
require.Equal(t, 6, i)
})
t.Run("can iterate memory frame when empty", func(t *testing.T) {
memoryFrame := data.NewMemoryFrame("test", []string{}, [][]interface{}{})
i := 0
for {
_, ok, err := memoryFrame.Next()
require.Nil(t, err)
if !ok {
break
}
i++
}
require.Equal(t, 0, i)
})
t.Run("can iterate memory frame when zero value", func(t *testing.T) {
memoryFrame := data.MemoryFrame{}
i := 0
for {
_, ok, err := memoryFrame.Next()
require.Nil(t, err)
if !ok {
break
}
i++
}
require.Equal(t, 0, i)
})
}

View File

@ -0,0 +1,27 @@
package data
func NewHierarchicalFrame(name string, frame Frame, subFrames []HierarchicalFrame) HierarchicalFrame {
return HierarchicalFrame{
name: name,
DataFrame: frame,
SubFrames: subFrames,
}
}
type HierarchicalFrame struct {
name string
DataFrame Frame
SubFrames []HierarchicalFrame
}
func (hf HierarchicalFrame) Name() string {
return hf.name
}
func (hf HierarchicalFrame) Columns() []string {
return hf.DataFrame.Columns()
}
func (hf HierarchicalFrame) Next() ([]interface{}, bool, error) {
return hf.DataFrame.Next()
}

View File

@ -0,0 +1,93 @@
package database
import (
"database/sql"
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
_ "github.com/ClickHouse/clickhouse-go/v2"
"github.com/pkg/errors"
"net/url"
"strings"
)
type ClickhouseNativeClient struct {
host string
connection *sql.DB
}
func NewNativeClient(host string, port uint16, username string, password string) (*ClickhouseNativeClient, error) {
// escape the credentials so that special characters (e.g. @ or /) cannot corrupt the DSN
userInfo := url.UserPassword(username, password).String()
// append ?debug=true to the DSN for driver debug output
connection, err := sql.Open("clickhouse", fmt.Sprintf("clickhouse://%s@%s:%d/", userInfo, host, port))
if err != nil {
return &ClickhouseNativeClient{}, err
}
if err := connection.Ping(); err != nil {
// release the pool if the server is unreachable
_ = connection.Close()
return &ClickhouseNativeClient{}, err
}
return &ClickhouseNativeClient{
host: host,
connection: connection,
}, nil
}
func (c *ClickhouseNativeClient) Ping() error {
return c.connection.Ping()
}
func (c *ClickhouseNativeClient) ReadTable(databaseName string, tableName string, excludeColumns []string, orderBy data.OrderBy, limit int64) (data.Frame, error) {
exceptClause := ""
if len(excludeColumns) > 0 {
exceptClause = fmt.Sprintf("EXCEPT(%s) ", strings.Join(excludeColumns, ","))
}
limitClause := ""
if limit >= 0 {
limitClause = fmt.Sprintf(" LIMIT %d", limit)
}
rows, err := c.connection.Query(fmt.Sprintf("SELECT * %sFROM %s.%s%s%s", exceptClause, databaseName, tableName, orderBy.String(), limitClause))
if err != nil {
return data.DatabaseFrame{}, err
}
return data.NewDatabaseFrame(fmt.Sprintf("%s.%s", databaseName, tableName), rows)
}
func (c *ClickhouseNativeClient) ReadTableNamesForDatabase(databaseName string) ([]string, error) {
rows, err := c.connection.Query(fmt.Sprintf("SHOW TABLES FROM %s", databaseName))
if err != nil {
return nil, err
}
defer rows.Close()
var tableNames []string
var name string
for rows.Next() {
if err := rows.Scan(&name); err != nil {
return nil, err
}
tableNames = append(tableNames, name)
}
return tableNames, nil
}
func (c *ClickhouseNativeClient) ExecuteStatement(id string, statement string) (data.Frame, error) {
rows, err := c.connection.Query(statement)
if err != nil {
return data.DatabaseFrame{}, err
}
return data.NewDatabaseFrame(id, rows)
}
func (c *ClickhouseNativeClient) Version() (string, error) {
frame, err := c.ExecuteStatement("version", "SELECT version() as version")
if err != nil {
return "", err
}
values, ok, err := frame.Next()
if err != nil {
return "", err
}
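if !ok {
return "", errors.New("unable to read ClickHouse version - no rows returned")
}
if len(values) != 1 {
return "", errors.Errorf("unable to read ClickHouse version - expected 1 column, got %d", len(values))
}
// guard the type assertion so an unexpected column type returns an error instead of panicking
version, ok := values[0].(string)
if !ok {
return "", errors.Errorf("unable to read ClickHouse version - unexpected type %T", values[0])
}
return version, nil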
}
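
To make the statement that ReadTable sends easier to see, the clause assembly can be sketched as a pure function. This is an illustration only: `buildSelect` is a hypothetical helper that does not exist in the package, and it takes the ORDER BY clause as an already rendered string (what OrderBy.String() returns) so that it stays self-contained.

```go
package main

import (
	"fmt"
	"strings"
)

// buildSelect is a hypothetical helper that mirrors how ReadTable composes its
// statement: an optional EXCEPT clause, the table, an optional ORDER BY clause
// and an optional LIMIT. A negative limit means no limit.
func buildSelect(database, table string, excludeColumns []string, orderBy string, limit int64) string {
	exceptClause := ""
	if len(excludeColumns) > 0 {
		exceptClause = fmt.Sprintf("EXCEPT(%s) ", strings.Join(excludeColumns, ","))
	}
	limitClause := ""
	if limit >= 0 {
		limitClause = fmt.Sprintf(" LIMIT %d", limit)
	}
	return fmt.Sprintf("SELECT * %sFROM %s.%s%s%s", exceptClause, database, table, orderBy, limitClause)
}

func main() {
	// prints: SELECT * EXCEPT(data_path,comment) FROM system.databases ORDER BY engine ASC LIMIT 10
	fmt.Println(buildSelect("system", "databases", []string{"data_path", "comment"}, " ORDER BY engine ASC", 10))
}
```

The identifiers are interpolated without quoting, as in ReadTable itself, so this approach is only appropriate for trusted database, table and column names.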

View File

@ -0,0 +1,231 @@
package database_test
import (
"context"
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/database"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/test"
"github.com/stretchr/testify/require"
"github.com/testcontainers/testcontainers-go"
"github.com/testcontainers/testcontainers-go/wait"
"os"
"path"
"strconv"
"testing"
)
func TestMain(m *testing.M) {
// create a ClickHouse container
ctx := context.Background()
cwd, err := os.Getwd()
if err != nil {
// can't test without container
panic(err)
}
// for now, we test against a hardcoded database-server version but we should make this a property
req := testcontainers.ContainerRequest{
Image: fmt.Sprintf("clickhouse/clickhouse-server:%s", test.GetClickHouseTestVersion()),
ExposedPorts: []string{"9000/tcp"},
WaitingFor: wait.ForLog("Ready for connections"),
BindMounts: map[string]string{
"/etc/clickhouse-server/config.d/custom.xml": path.Join(cwd, "../../../testdata/docker/custom.xml"),
"/etc/clickhouse-server/users.d/admin.xml": path.Join(cwd, "../../../testdata/docker/admin.xml"),
},
}
clickhouseContainer, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
ContainerRequest: req,
Started: true,
})
if err != nil {
// can't test without container
panic(err)
}
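p, err := clickhouseContainer.MappedPort(ctx, "9000")
if err != nil {
// can't test without a mapped port
clickhouseContainer.Terminate(ctx) //nolint
panic(err)
}
os.Setenv("CLICKHOUSE_DB_PORT", p.Port())
// os.Exit does not run deferred calls, so terminate the container explicitly before exiting
code := m.Run()
clickhouseContainer.Terminate(ctx) //nolint
os.Exit(code)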
}
func getClient(t *testing.T) *database.ClickhouseNativeClient {
mappedPort, err := strconv.Atoi(os.Getenv("CLICKHOUSE_DB_PORT"))
if err != nil {
t.Fatal("Unable to read port value from environment")
}
clickhouseClient, err := database.NewNativeClient("localhost", uint16(mappedPort), "", "")
if err != nil {
t.Fatalf("unable to build client : %v", err)
}
return clickhouseClient
}
func TestReadTableNamesForDatabase(t *testing.T) {
clickhouseClient := getClient(t)
t.Run("client can read tables for a database", func(t *testing.T) {
tables, err := clickhouseClient.ReadTableNamesForDatabase("system")
require.Nil(t, err)
require.Equal(t, 70, len(tables))
require.Contains(t, tables, "merge_tree_settings")
})
}
func TestReadTable(t *testing.T) {
clickhouseClient := getClient(t)
t.Run("client can get all rows for system.disks table", func(t *testing.T) {
// we read the table system.disks as this should contain only 1 row
frame, err := clickhouseClient.ReadTable("system", "disks", []string{}, data.OrderBy{}, 10)
require.Nil(t, err)
require.ElementsMatch(t, frame.Columns(), [7]string{"name", "path", "free_space", "total_space", "keep_free_space", "type", "cache_path"})
i := 0
for {
values, ok, err := frame.Next()
if i == 0 {
require.Nil(t, err)
require.True(t, ok)
require.Equal(t, "default", values[0])
require.Equal(t, "/var/lib/clickhouse/", values[1])
require.Greater(t, values[2], uint64(0))
require.Greater(t, values[3], uint64(0))
require.Equal(t, values[4], uint64(0))
require.Equal(t, "local", values[5])
} else {
require.False(t, ok)
break
}
i += 1
}
})
t.Run("client can get all rows for system.databases table", func(t *testing.T) {
// we read the table system.databases as this should be small and consistent on fresh db instances
frame, err := clickhouseClient.ReadTable("system", "databases", []string{}, data.OrderBy{}, 10)
require.Nil(t, err)
require.ElementsMatch(t, frame.Columns(), [6]string{"name", "engine", "data_path", "metadata_path", "uuid", "comment"})
expectedRows := [4][3]string{{"INFORMATION_SCHEMA", "Memory", "/var/lib/clickhouse/"},
{"default", "Atomic", "/var/lib/clickhouse/store/"},
{"information_schema", "Memory", "/var/lib/clickhouse/"},
{"system", "Atomic", "/var/lib/clickhouse/store/"}}
i := 0
for {
values, ok, err := frame.Next()
if i < 4 {
require.Nil(t, err)
require.True(t, ok)
require.Equal(t, expectedRows[i][0], values[0])
require.Equal(t, expectedRows[i][1], values[1])
require.Equal(t, expectedRows[i][2], values[2])
require.NotNil(t, values[3])
require.NotNil(t, values[4])
require.Equal(t, "", values[5])
} else {
require.False(t, ok)
break
}
i += 1
}
})
t.Run("client can get all rows for system.databases table with except", func(t *testing.T) {
frame, err := clickhouseClient.ReadTable("system", "databases", []string{"data_path", "comment"}, data.OrderBy{}, 10)
require.Nil(t, err)
require.ElementsMatch(t, frame.Columns(), [4]string{"name", "engine", "metadata_path", "uuid"})
})
t.Run("client can limit rows for system.databases", func(t *testing.T) {
frame, err := clickhouseClient.ReadTable("system", "databases", []string{}, data.OrderBy{}, 1)
require.Nil(t, err)
require.ElementsMatch(t, frame.Columns(), [6]string{"name", "engine", "data_path", "metadata_path", "uuid", "comment"})
expectedRows := [1][3]string{{"INFORMATION_SCHEMA", "Memory", "/var/lib/clickhouse/"}}
i := 0
for {
values, ok, err := frame.Next()
if i == 0 {
require.Nil(t, err)
require.True(t, ok)
require.Equal(t, expectedRows[i][0], values[0])
require.Equal(t, expectedRows[i][1], values[1])
require.Equal(t, expectedRows[i][2], values[2])
require.NotNil(t, values[3])
require.NotNil(t, values[4])
require.Equal(t, "", values[5])
} else {
require.False(t, ok)
break
}
i += 1
}
})
t.Run("client can order rows for system.databases", func(t *testing.T) {
frame, err := clickhouseClient.ReadTable("system", "databases", []string{}, data.OrderBy{
Column: "engine",
Order: data.Asc,
}, 10)
require.Nil(t, err)
require.ElementsMatch(t, frame.Columns(), [6]string{"name", "engine", "data_path", "metadata_path", "uuid", "comment"})
expectedRows := [4][3]string{
{"default", "Atomic", "/var/lib/clickhouse/store/"},
{"system", "Atomic", "/var/lib/clickhouse/store/"},
{"INFORMATION_SCHEMA", "Memory", "/var/lib/clickhouse/"},
{"information_schema", "Memory", "/var/lib/clickhouse/"},
}
i := 0
for {
values, ok, err := frame.Next()
if i < 4 {
require.Nil(t, err)
require.True(t, ok)
require.Equal(t, expectedRows[i][0], values[0])
require.Equal(t, expectedRows[i][1], values[1])
require.Equal(t, expectedRows[i][2], values[2])
require.NotNil(t, values[3])
require.NotNil(t, values[4])
require.Equal(t, "", values[5])
} else {
require.False(t, ok)
break
}
i += 1
}
})
}
func TestExecuteStatement(t *testing.T) {
clickhouseClient := getClient(t)
t.Run("client can execute any statement", func(t *testing.T) {
statement := "SELECT path, count(*) as count FROM system.disks GROUP BY path;"
frame, err := clickhouseClient.ExecuteStatement("engines", statement)
require.Nil(t, err)
require.ElementsMatch(t, frame.Columns(), [2]string{"path", "count"})
expectedRows := [1][2]interface{}{
{"/var/lib/clickhouse/", uint64(1)},
}
i := 0
for {
values, ok, err := frame.Next()
if !ok {
require.Nil(t, err)
break
}
require.Nil(t, err)
require.Equal(t, expectedRows[i][0], values[0])
require.Equal(t, expectedRows[i][1], values[1])
i++
}
fmt.Println(i)
})
}
func TestVersion(t *testing.T) {
clickhouseClient := getClient(t)
t.Run("client can read version", func(t *testing.T) {
version, err := clickhouseClient.Version()
require.Nil(t, err)
require.NotEmpty(t, version)
})
}

@ -0,0 +1,48 @@
package platform
import (
"errors"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/database"
"sync"
)
var once sync.Once
var dbInit sync.Once
// manages all resources that collectors and outputs may wish to ensure inc. db connections
type DBClient interface {
ReadTableNamesForDatabase(databaseName string) ([]string, error)
ReadTable(databaseName string, tableName string, excludeColumns []string, orderBy data.OrderBy, limit int64) (data.Frame, error)
ExecuteStatement(id string, statement string) (data.Frame, error)
Version() (string, error)
}
var manager *ResourceManager
type ResourceManager struct {
DbClient DBClient
}
func GetResourceManager() *ResourceManager {
once.Do(func() {
manager = &ResourceManager{}
})
return manager
}
func (m *ResourceManager) Connect(host string, port uint16, username string, password string) error {
var err error
var clientInstance DBClient
init := false
dbInit.Do(func() {
clientInstance, err = database.NewNativeClient(host, port, username, password)
manager.DbClient = clientInstance
init = true
})
if !init {
return errors.New("connect can only be called once")
}
return err
}
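
The connect-once behaviour of `Connect` can be illustrated in isolation. The following is only a minimal sketch under stated assumptions: `connector` and its `target` field are hypothetical stand-ins for `ResourceManager` and its database client, and no real connection is opened.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// connector is a hypothetical stand-in for ResourceManager; it is not part of the commit.
type connector struct {
	once   sync.Once
	target string
}

// Connect records the target on the first call and rejects every later call.
func (c *connector) Connect(target string) error {
	first := false
	c.once.Do(func() {
		c.target = target
		first = true
	})
	if !first {
		return errors.New("connect can only be called once")
	}
	return nil
}

func main() {
	c := &connector{}
	fmt.Println(c.Connect("localhost:9000")) // first call succeeds
	fmt.Println(c.Connect("localhost:9001")) // second call is rejected
}
```

Keeping the `sync.Once` on the struct rather than in a package variable is a design choice of this sketch only; it makes each instance independently testable.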

@ -0,0 +1,82 @@
package platform_test
import (
"context"
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/test"
"github.com/stretchr/testify/require"
"github.com/testcontainers/testcontainers-go"
"github.com/testcontainers/testcontainers-go/wait"
"os"
"path"
"strconv"
"testing"
)
func TestMain(m *testing.M) {
// create a ClickHouse container
ctx := context.Background()
cwd, err := os.Getwd()
if err != nil {
fmt.Println("unable to read current directory", err)
os.Exit(1)
}
// the image tag comes from test.GetClickHouseTestVersion(), which reads CLICKHOUSE_VERSION and falls back to a default
req := testcontainers.ContainerRequest{
Image: fmt.Sprintf("clickhouse/clickhouse-server:%s", test.GetClickHouseTestVersion()),
ExposedPorts: []string{"9000/tcp"},
WaitingFor: wait.ForLog("Ready for connections"),
BindMounts: map[string]string{
"/etc/clickhouse-server/config.d/custom.xml": path.Join(cwd, "../../testdata/docker/custom.xml"),
"/etc/clickhouse-server/users.d/admin.xml": path.Join(cwd, "../../testdata/docker/admin.xml"),
},
}
clickhouseContainer, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
ContainerRequest: req,
Started: true,
})
if err != nil {
// can't test without container
panic(err)
}
p, _ := clickhouseContainer.MappedPort(ctx, "9000")
os.Setenv("CLICKHOUSE_DB_PORT", p.Port())
// os.Exit does not run deferred calls, so terminate the container explicitly before exiting
code := m.Run()
_ = clickhouseContainer.Terminate(ctx)
os.Exit(code)
}
func TestConnect(t *testing.T) {
mappedPort, err := strconv.Atoi(os.Getenv("CLICKHOUSE_DB_PORT"))
if err != nil {
t.Fatal("Unable to read port value from environment")
}
t.Run("can only connect once", func(t *testing.T) {
// get before connection
manager := platform.GetResourceManager()
require.Nil(t, manager.DbClient)
// init connection
err = manager.Connect("localhost", uint16(mappedPort), "", "")
require.Nil(t, err)
require.NotNil(t, manager.DbClient)
// try and re-fetch connection
err = manager.Connect("localhost", uint16(mappedPort), "", "")
require.NotNil(t, err)
require.Equal(t, "connect can only be called once", err.Error())
})
}
func TestGetResourceManager(t *testing.T) {
t.Run("get resource manager", func(t *testing.T) {
manager := platform.GetResourceManager()
require.NotNil(t, manager)
manager2 := platform.GetResourceManager()
require.NotNil(t, manager2)
// both calls must return the very same singleton instance
require.Same(t, manager, manager2)
})
}
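
A note on `TestMain` cleanup: `os.Exit` ends the process immediately and does not run deferred calls, so container teardown is usually performed before the exit code is handed over. The sketch below shows one common way to arrange this. The `run` helper and the two function arguments are hypothetical and only stand in for `m.Run()` and the container's `Terminate`.

```go
package main

import "fmt"

// run is a hypothetical wrapper, not part of the commit. The deferred cleanup
// executes when run returns, so it has finished before the caller exits.
func run(tests func() int, cleanup func()) int {
	defer cleanup()
	return tests()
}

func main() {
	code := run(
		func() int { return 0 },                        // stands in for m.Run()
		func() { fmt.Println("container terminated") }, // stands in for Terminate(ctx)
	)
	// A real TestMain would pass this value to os.Exit.
	fmt.Println("exit code:", code)
}
```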

@ -0,0 +1,165 @@
package test
import (
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/utils"
"github.com/pkg/errors"
"sort"
"strings"
)
type fakeClickhouseClient struct {
tables map[string][]string
QueryResponses map[string]*FakeDataFrame
}
func NewFakeClickhouseClient(tables map[string][]string) fakeClickhouseClient {
queryResponses := make(map[string]*FakeDataFrame)
return fakeClickhouseClient{
tables: tables,
QueryResponses: queryResponses,
}
}
func (f fakeClickhouseClient) ReadTableNamesForDatabase(databaseName string) ([]string, error) {
if _, ok := f.tables[databaseName]; ok {
return f.tables[databaseName], nil
}
return nil, fmt.Errorf("database %s does not exist", databaseName)
}
func (f fakeClickhouseClient) ReadTable(databaseName string, tableName string, excludeColumns []string, orderBy data.OrderBy, limit int64) (data.Frame, error) {
exceptClause := ""
if len(excludeColumns) > 0 {
exceptClause = fmt.Sprintf("EXCEPT(%s) ", strings.Join(excludeColumns, ","))
}
limitClause := ""
if limit >= 0 {
limitClause = fmt.Sprintf(" LIMIT %d", limit)
}
query := fmt.Sprintf("SELECT * %sFROM %s.%s%s%s", exceptClause, databaseName, tableName, orderBy.String(), limitClause)
// use err rather than error so the builtin error type is not shadowed
frame, err := f.ExecuteStatement(fmt.Sprintf("read_table_%s.%s", databaseName, tableName), query)
if err != nil {
return frame, err
}
fFrame := *(frame.(*FakeDataFrame))
fFrame = fFrame.FilterColumns(excludeColumns)
fFrame = fFrame.Order(orderBy)
fFrame = fFrame.Limit(limit)
return fFrame, nil
}
func (f fakeClickhouseClient) ExecuteStatement(id string, statement string) (data.Frame, error) {
if frame, ok := f.QueryResponses[statement]; ok {
return frame, nil
}
return FakeDataFrame{}, errors.Errorf("No recorded response for %s", statement)
}
func (f fakeClickhouseClient) Version() (string, error) {
return "21.12.3", nil
}
func (f fakeClickhouseClient) Reset() {
for key, frame := range f.QueryResponses {
frame.Reset()
f.QueryResponses[key] = frame
}
}
type FakeDataFrame struct {
i *int
Rows [][]interface{}
ColumnNames []string
name string
}
func NewFakeDataFrame(name string, columns []string, rows [][]interface{}) FakeDataFrame {
i := 0
return FakeDataFrame{
i: &i,
Rows: rows,
ColumnNames: columns,
name: name,
}
}
func (f FakeDataFrame) Next() ([]interface{}, bool, error) {
if len(f.Rows) == *(f.i) {
return nil, false, nil
}
value := f.Rows[*f.i]
*f.i++
return value, true, nil
}
func (f FakeDataFrame) Columns() []string {
return f.ColumnNames
}
func (f FakeDataFrame) Name() string {
return f.name
}
func (f *FakeDataFrame) Reset() {
i := 0
f.i = &i
}
func (f FakeDataFrame) FilterColumns(excludeColumns []string) FakeDataFrame {
// get columns we can remove
rColumns := utils.Intersection(f.ColumnNames, excludeColumns)
rIndexes := make([]int, len(rColumns))
// find the indexes of the columns to remove
for i, column := range rColumns {
rIndexes[i] = utils.IndexOf(f.ColumnNames, column)
}
newRows := make([][]interface{}, len(f.Rows))
for r, row := range f.Rows {
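// copy the row first: utils.Remove reuses the backing array and would otherwise corrupt the recorded rows
newRow := append([]interface{}{}, row...)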
for i, index := range rIndexes {
newRow = utils.Remove(newRow, index-i)
}
newRows[r] = newRow
}
f.Rows = newRows
f.ColumnNames = utils.Distinct(f.ColumnNames, excludeColumns)
return f
}
func (f FakeDataFrame) Limit(rowLimit int64) FakeDataFrame {
if rowLimit >= 0 {
if int64(len(f.Rows)) > rowLimit {
f.Rows = f.Rows[:rowLimit]
}
}
return f
}
func (f FakeDataFrame) Order(orderBy data.OrderBy) FakeDataFrame {
if orderBy.Column == "" {
return f
}
cIndex := utils.IndexOf(f.ColumnNames, orderBy.Column)
sort.Slice(f.Rows, func(i, j int) bool {
left := f.Rows[i][cIndex]
right := f.Rows[j][cIndex]
if iLeft, ok := left.(int); ok {
if orderBy.Order == data.Asc {
return iLeft < right.(int)
}
return iLeft > right.(int)
} else {
// we aren't a full db - revert to string order
sLeft := left.(string)
sRight := right.(string)
if orderBy.Order == data.Asc {
return sLeft < sRight
}
return sLeft > sRight
}
})
return f
}

@ -0,0 +1,16 @@
package test
import "os"
const defaultClickHouseVersion = "latest"
func GetClickHouseTestVersion() string {
return GetEnv("CLICKHOUSE_VERSION", defaultClickHouseVersion)
}
func GetEnv(key, fallback string) string {
if value, ok := os.LookupEnv(key); ok {
return value
}
return fallback
}

@ -0,0 +1,94 @@
package utils
import (
"fmt"
"github.com/pkg/errors"
"io"
"io/fs"
"os"
"path/filepath"
)
func FileExists(name string) (bool, error) {
f, err := os.Stat(name)
if err == nil {
if !f.IsDir() {
return true, nil
}
return false, fmt.Errorf("%s is a directory", name)
}
if errors.Is(err, os.ErrNotExist) {
return false, nil
}
return false, err
}
func DirExists(name string) (bool, error) {
f, err := os.Stat(name)
if err == nil {
if f.IsDir() {
return true, nil
}
return false, fmt.Errorf("%s is a file", name)
}
if errors.Is(err, os.ErrNotExist) {
return false, nil
}
return false, err
}
func CopyFile(sourceFilename string, destFilename string) error {
exists, err := FileExists(sourceFilename)
if err != nil {
return err
}
if !exists {
return fmt.Errorf("%s does not exist", sourceFilename)
}
source, err := os.Open(sourceFilename)
if err != nil {
return err
}
defer source.Close()
destDir := filepath.Dir(destFilename)
if err := os.MkdirAll(destDir, os.ModePerm); err != nil {
return errors.Wrapf(err, "unable to create directory %s", destDir)
}
destination, err := os.Create(destFilename)
if err != nil {
return err
}
defer destination.Close()
_, err = io.Copy(destination, source)
return err
}
// patterns passed are an OR - any can be satisfied and the file will be listed
func ListFilesInDirectory(directory string, patterns []string) ([]string, []error) {
var files []string
exists, err := DirExists(directory)
if err != nil {
return files, []error{err}
}
if !exists {
return files, []error{fmt.Errorf("directory %s does not exist", directory)}
}
var pathErrors []error
_ = filepath.Walk(directory, func(path string, info fs.FileInfo, err error) error {
if err != nil {
pathErrors = append(pathErrors, err)
} else if !info.IsDir() {
for _, pattern := range patterns {
if matched, err := filepath.Match(pattern, filepath.Base(path)); err != nil {
pathErrors = append(pathErrors, err)
} else if matched {
files = append(files, path)
// one match is enough; stop so a file is not listed once per matching pattern
break
}
}
}
return nil
})
return files, pathErrors
}
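
The OR semantics of the pattern list can be sketched on their own. This is a minimal illustration, assuming only the standard `path/filepath` package. `matchesAny` is a hypothetical helper that does not exist in the commit, and it returns on the first match so that a path is never counted twice.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// matchesAny is a hypothetical helper, not part of the commit. It reports whether
// the base name of path matches at least one of the glob patterns.
func matchesAny(path string, patterns []string) (bool, error) {
	for _, pattern := range patterns {
		matched, err := filepath.Match(pattern, filepath.Base(path))
		if err != nil {
			return false, err // malformed pattern
		}
		if matched {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	ok, err := matchesAny("/tmp/0/random-0.csv", []string{"*.csv", "random-*"})
	fmt.Println(ok, err)
}
```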

@ -0,0 +1,133 @@
package utils_test
import (
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/utils"
"github.com/stretchr/testify/require"
"os"
"path"
"testing"
)
func TestFileExists(t *testing.T) {
t.Run("returns true for file", func(t *testing.T) {
tempDir := t.TempDir()
filepath := path.Join(tempDir, "random.txt")
_, err := os.Create(filepath)
require.Nil(t, err)
exists, err := utils.FileExists(filepath)
require.True(t, exists)
require.Nil(t, err)
})
t.Run("doesn't return true for non-existent file", func(t *testing.T) {
tempDir := t.TempDir()
file := path.Join(tempDir, "random.txt")
exists, err := utils.FileExists(file)
require.False(t, exists)
require.Nil(t, err)
})
t.Run("doesn't return true for directory", func(t *testing.T) {
tempDir := t.TempDir()
exists, err := utils.FileExists(tempDir)
require.False(t, exists)
require.NotNil(t, err)
require.Equal(t, fmt.Sprintf("%s is a directory", tempDir), err.Error())
})
}
func TestDirExists(t *testing.T) {
t.Run("doesn't return true for file", func(t *testing.T) {
tempDir := t.TempDir()
filepath := path.Join(tempDir, "random.txt")
_, err := os.Create(filepath)
require.Nil(t, err)
exists, err := utils.DirExists(filepath)
require.False(t, exists)
require.NotNil(t, err)
require.Equal(t, fmt.Sprintf("%s is a file", filepath), err.Error())
})
t.Run("returns true for directory", func(t *testing.T) {
tempDir := t.TempDir()
exists, err := utils.DirExists(tempDir)
require.True(t, exists)
require.Nil(t, err)
})
t.Run("doesn't return true for random directory", func(t *testing.T) {
exists, err := utils.DirExists(fmt.Sprintf("%d", utils.MakeTimestamp()))
require.False(t, exists)
require.Nil(t, err)
})
}
func TestCopyFile(t *testing.T) {
t.Run("can copy file", func(t *testing.T) {
tempDir := t.TempDir()
sourcePath := path.Join(tempDir, "random.txt")
_, err := os.Create(sourcePath)
require.Nil(t, err)
destPath := path.Join(tempDir, "random-2.txt")
err = utils.CopyFile(sourcePath, destPath)
require.Nil(t, err)
})
t.Run("can copy nested file", func(t *testing.T) {
tempDir := t.TempDir()
sourcePath := path.Join(tempDir, "random.txt")
_, err := os.Create(sourcePath)
require.Nil(t, err)
destPath := path.Join(tempDir, "sub_dir", "random-2.txt")
err = utils.CopyFile(sourcePath, destPath)
require.Nil(t, err)
})
t.Run("fails when file does not exist", func(t *testing.T) {
tempDir := t.TempDir()
sourcePath := path.Join(tempDir, "random.txt")
destPath := path.Join(tempDir, "random-2.txt")
err := utils.CopyFile(sourcePath, destPath)
require.NotNil(t, err)
require.Equal(t, fmt.Sprintf("%s does not exist", sourcePath), err.Error())
})
}
func TestListFilesInDirectory(t *testing.T) {
tempDir := t.TempDir()
files := make([]string, 5)
for i := 0; i < 5; i++ {
fileDir := path.Join(tempDir, fmt.Sprintf("%d", i))
err := os.MkdirAll(fileDir, os.ModePerm)
require.Nil(t, err)
ext := ".txt"
if i%2 == 0 {
ext = ".csv"
}
filepath := path.Join(fileDir, fmt.Sprintf("random-%d%s", i, ext))
files[i] = filepath
_, err = os.Create(filepath)
require.Nil(t, err)
}
t.Run("can list all files", func(t *testing.T) {
mFiles, errs := utils.ListFilesInDirectory(tempDir, []string{"*"})
require.Len(t, mFiles, 5)
require.Empty(t, errs)
})
t.Run("can list by extension", func(t *testing.T) {
mFiles, errs := utils.ListFilesInDirectory(tempDir, []string{"*.csv"})
require.Len(t, mFiles, 3)
require.Empty(t, errs)
require.ElementsMatch(t, []string{files[0], files[2], files[4]}, mFiles)
})
t.Run("can list on multiple extensions files", func(t *testing.T) {
mFiles, errs := utils.ListFilesInDirectory(tempDir, []string{"*.csv", "*.txt"})
require.Len(t, mFiles, 5)
require.Empty(t, errs)
})
}

@ -0,0 +1,49 @@
package utils
import (
"github.com/elastic/gosigar"
"strings"
)
func FindClickHouseProcesses() ([]gosigar.ProcArgs, error) {
pids := gosigar.ProcList{}
err := pids.Get()
if err != nil {
return nil, err
}
var clickhousePs []gosigar.ProcArgs
for _, pid := range pids.List {
args := gosigar.ProcArgs{}
if err := args.Get(pid); err != nil {
continue
}
if len(args.List) > 0 {
if strings.Contains(args.List[0], "clickhouse-server") {
clickhousePs = append(clickhousePs, args)
}
}
}
return clickhousePs, nil
}
func FindConfigsFromClickHouseProcesses() ([]string, error) {
clickhouseProcesses, err := FindClickHouseProcesses()
if err != nil {
return nil, err
}
var configs []string
if len(clickhouseProcesses) > 0 {
// we have candidate matches
for _, ps := range clickhouseProcesses {
for _, arg := range ps.List {
if strings.Contains(arg, "--config") {
configFile := strings.ReplaceAll(arg, "--config-file=", "")
// containers receive config with --config
configFile = strings.ReplaceAll(configFile, "--config=", "")
configs = append(configs, configFile)
}
}
}
}
return configs, nil
}

@ -0,0 +1,71 @@
package utils_test
import (
"context"
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/test"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/utils"
"github.com/stretchr/testify/require"
"github.com/testcontainers/testcontainers-go"
"github.com/testcontainers/testcontainers-go/wait"
"os"
"path"
"testing"
)
func TestMain(m *testing.M) {
// create a ClickHouse container
ctx := context.Background()
cwd, err := os.Getwd()
if err != nil {
fmt.Println("unable to read current directory", err)
os.Exit(1)
}
// the image tag comes from test.GetClickHouseTestVersion(), which reads CLICKHOUSE_VERSION and falls back to a default
req := testcontainers.ContainerRequest{
Image: fmt.Sprintf("clickhouse/clickhouse-server:%s", test.GetClickHouseTestVersion()),
ExposedPorts: []string{"9000/tcp"},
WaitingFor: wait.ForLog("Ready for connections"),
BindMounts: map[string]string{
"/etc/clickhouse-server/config.d/custom.xml": path.Join(cwd, "../../../testdata/docker/custom.xml"),
},
}
clickhouseContainer, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
ContainerRequest: req,
Started: true,
})
if err != nil {
// can't test without container
panic(err)
}
p, _ := clickhouseContainer.MappedPort(ctx, "9000")
os.Setenv("CLICKHOUSE_DB_PORT", p.Port())
// os.Exit does not run deferred calls, so terminate the container explicitly before exiting
code := m.Run()
_ = clickhouseContainer.Terminate(ctx)
os.Exit(code)
}
func TestFindClickHouseProcesses(t *testing.T) {
t.Run("can find ClickHouse processes", func(t *testing.T) {
processes, err := utils.FindClickHouseProcesses()
require.Nil(t, err)
// we might have clickhouse running locally during development as well as the above container so we allow 1 or more
require.GreaterOrEqual(t, len(processes), 1)
require.Equal(t, processes[0].List[0], "/usr/bin/clickhouse-server")
// flexible as services/containers pass the config differently
require.Contains(t, processes[0].List[1], "/etc/clickhouse-server/config.xml")
})
}
func TestFindConfigsFromClickHouseProcesses(t *testing.T) {
t.Run("can find ClickHouse configs", func(t *testing.T) {
configs, err := utils.FindConfigsFromClickHouseProcesses()
require.Nil(t, err)
require.GreaterOrEqual(t, len(configs), 1)
require.Equal(t, configs[0], "/etc/clickhouse-server/config.xml")
})
}

@ -0,0 +1,68 @@
package utils
// Intersection of elements in s1 and s2
func Intersection(s1, s2 []string) (inter []string) {
hash := make(map[string]bool)
for _, e := range s1 {
hash[e] = false
}
for _, e := range s2 {
// If elements present in the hashmap then append intersection list.
if val, ok := hash[e]; ok {
if !val {
// only add once
inter = append(inter, e)
hash[e] = true
}
}
}
return inter
}
// Distinct returns elements in s1, not in s2
func Distinct(s1, s2 []string) (distinct []string) {
hash := make(map[string]bool)
for _, e := range s2 {
hash[e] = true
}
for _, e := range s1 {
if _, ok := hash[e]; !ok {
distinct = append(distinct, e)
}
}
return distinct
}
// Unique returns the unique elements in s1, keeping the order of first occurrence
func Unique(s1 []string) (unique []string) {
hash := make(map[string]bool)
for _, e := range s1 {
if _, ok := hash[e]; !ok {
unique = append(unique, e)
}
hash[e] = true
}
return unique
}
func Contains(s []string, e string) bool {
for _, a := range s {
if a == e {
return true
}
}
return false
}
func IndexOf(s []string, e string) int {
for i, a := range s {
if a == e {
return i
}
}
return -1
}
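// Remove deletes the element at index s. It reuses the backing array of slice, so the caller's slice is modified.
func Remove(slice []interface{}, s int) []interface{} {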
return append(slice[:s], slice[s+1:]...)
}
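
The append-based removal idiom used by `Remove` writes into the backing array of its argument. The sketch below contrasts it with a copying variant. Both function names are hypothetical and exist only for this illustration; which behaviour is preferable depends on whether callers still need the original slice.

```go
package main

import "fmt"

// removeInPlace mirrors the append-based idiom: it reuses the backing array of s.
func removeInPlace(s []interface{}, i int) []interface{} {
	return append(s[:i], s[i+1:]...)
}

// removeCopy is a hypothetical alternative that leaves the input untouched.
func removeCopy(s []interface{}, i int) []interface{} {
	out := make([]interface{}, 0, len(s)-1)
	out = append(out, s[:i]...)
	return append(out, s[i+1:]...)
}

func main() {
	a := []interface{}{"name", "engine", "uuid"}
	_ = removeInPlace(a, 0)
	fmt.Println(a) // the original slice has been overwritten

	b := []interface{}{"name", "engine", "uuid"}
	fmt.Println(removeCopy(b, 0), b) // b keeps its original contents
}
```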

@ -0,0 +1,63 @@
package utils_test
import (
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/utils"
"github.com/stretchr/testify/require"
"testing"
)
func TestIntersection(t *testing.T) {
t.Run("can perform intersection", func(t *testing.T) {
setA := []string{"A", "b", "C", "D", "E"}
setB := []string{"A", "B", "F", "C", "G"}
setC := utils.Intersection(setA, setB)
require.Len(t, setC, 2)
require.ElementsMatch(t, []string{"A", "C"}, setC)
})
}
func TestDistinct(t *testing.T) {
t.Run("can perform distinct", func(t *testing.T) {
setA := []string{"A", "b", "C", "D", "E"}
setB := []string{"A", "B", "F", "C", "G"}
setC := utils.Distinct(setA, setB)
require.Len(t, setC, 3)
require.ElementsMatch(t, []string{"b", "D", "E"}, setC)
})
t.Run("can perform distinct on empty", func(t *testing.T) {
setA := []string{"A", "b", "C", "D", "E"}
var setB []string
setC := utils.Distinct(setA, setB)
require.Len(t, setC, 5)
require.ElementsMatch(t, []string{"A", "b", "C", "D", "E"}, setC)
})
}
func TestContains(t *testing.T) {
t.Run("can perform contains", func(t *testing.T) {
setA := []string{"A", "b", "C", "D", "E"}
require.True(t, utils.Contains(setA, "A"))
require.True(t, utils.Contains(setA, "b"))
require.True(t, utils.Contains(setA, "C"))
require.True(t, utils.Contains(setA, "D"))
require.True(t, utils.Contains(setA, "E"))
require.False(t, utils.Contains(setA, "B"))
})
}
func TestUnique(t *testing.T) {
t.Run("can perform unique", func(t *testing.T) {
setA := []string{"A", "b", "D", "D", "E", "E", "A"}
setC := utils.Unique(setA)
require.Len(t, setC, 4)
require.ElementsMatch(t, []string{"A", "b", "D", "E"}, setC)
})
t.Run("can perform unique on empty", func(t *testing.T) {
var setA []string
setC := utils.Unique(setA)
require.Len(t, setC, 0)
})
}

@ -0,0 +1,7 @@
package utils
import "time"
func MakeTimestamp() int64 {
return time.Now().UnixNano() / int64(time.Millisecond)
}

@ -0,0 +1,115 @@
package internal
import (
c "github.com/ClickHouse/clickhouse-diagnostics/internal/collectors"
o "github.com/ClickHouse/clickhouse-diagnostics/internal/outputs"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/data"
"github.com/pkg/errors"
"github.com/rs/zerolog/log"
)
type runConfiguration struct {
id string
host string
port uint16
username string
password string
output string
collectors []string
collectorConfigs map[string]config.Configuration
outputConfig config.Configuration
}
func NewRunConfiguration(id string, host string, port uint16, username string, password string, output string, outputConfig config.Configuration,
collectors []string, collectorConfigs map[string]config.Configuration) *runConfiguration {
config := runConfiguration{
id: id,
host: host,
port: port,
username: username,
password: password,
collectors: collectors,
output: output,
collectorConfigs: collectorConfigs,
outputConfig: outputConfig,
}
return &config
}
func Capture(config *runConfiguration) {
bundles, err := collect(config)
if err != nil {
log.Fatal().Err(err).Msg("unable to perform collection")
}
log.Info().Msgf("collectors initialized")
if err = output(config, bundles); err != nil {
log.Fatal().Err(err).Msg("unable to create output")
}
log.Info().Msgf("bundle export complete")
}
func collect(config *runConfiguration) (map[string]*data.DiagnosticBundle, error) {
resourceManager := platform.GetResourceManager()
err := resourceManager.Connect(config.host, config.port, config.username, config.password)
if err != nil {
// if we can't connect this is fatal
log.Fatal().Err(err).Msg("Unable to connect to database")
}
// grab the required collectors - we pass what we can
bundles := make(map[string]*data.DiagnosticBundle)
log.Info().Msgf("connection established")
//these store our collection errors and will be output in the bundle
var collectorErrors [][]interface{}
for _, collectorName := range config.collectors {
collectorConfig := config.collectorConfigs[collectorName]
log.Info().Msgf("initializing %s collector", collectorName)
collector, err := c.GetCollectorByName(collectorName)
if err != nil {
log.Error().Err(err).Msgf("Unable to fetch collector %s", collectorName)
collectorErrors = append(collectorErrors, []interface{}{err.Error()})
continue
}
bundle, err := collector.Collect(collectorConfig)
if err != nil {
log.Error().Err(err).Msgf("Error in collector %s", collectorName)
collectorErrors = append(collectorErrors, []interface{}{err.Error()})
// this indicates a fatal error in the collector
continue
}
for _, fError := range bundle.Errors.Errors {
err = errors.Wrapf(fError, "Failure to collect frame in collector %s", collectorName)
collectorErrors = append(collectorErrors, []interface{}{err.Error()})
log.Warn().Msg(err.Error())
}
bundles[collectorName] = bundle
}
bundles["diag_trace"] = buildTraceBundle(collectorErrors)
return bundles, nil
}
func output(config *runConfiguration, bundles map[string]*data.DiagnosticBundle) error {
log.Info().Msgf("attempting to export bundle using %s output...", config.output)
output, err := o.GetOutputByName(config.output)
if err != nil {
return err
}
frameErrors, err := output.Write(config.id, bundles, config.outputConfig)
// we report frame errors rather than failing hard on them - it is up to the output to signal what is fatal via the returned error
for _, fError := range frameErrors.Errors {
log.Warn().Msgf("failure to write frame in output %s - %s", config.output, fError)
}
return err
}
func buildTraceBundle(collectorErrors [][]interface{}) *data.DiagnosticBundle {
errorBundle := data.DiagnosticBundle{
Frames: map[string]data.Frame{
"errors": data.NewMemoryFrame("errors", []string{"errors"}, collectorErrors),
},
Errors: data.FrameErrors{},
}
// add any other metrics from collection
return &errorBundle
}

@ -0,0 +1,123 @@
package internal_test
import (
"context"
"fmt"
"github.com/ClickHouse/clickhouse-diagnostics/internal"
"github.com/ClickHouse/clickhouse-diagnostics/internal/collectors"
_ "github.com/ClickHouse/clickhouse-diagnostics/internal/collectors/clickhouse"
_ "github.com/ClickHouse/clickhouse-diagnostics/internal/collectors/system"
"github.com/ClickHouse/clickhouse-diagnostics/internal/outputs"
_ "github.com/ClickHouse/clickhouse-diagnostics/internal/outputs/file"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/config"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/test"
"github.com/ClickHouse/clickhouse-diagnostics/internal/platform/utils"
"github.com/stretchr/testify/require"
"github.com/testcontainers/testcontainers-go"
"github.com/testcontainers/testcontainers-go/wait"
"io/ioutil"
"os"
"path"
"strconv"
"testing"
)
func TestMain(m *testing.M) {
// create a ClickHouse container
ctx := context.Background()
cwd, err := os.Getwd()
if err != nil {
// can't locate the test data without the working directory
panic(err)
}
// the image tag comes from test.GetClickHouseTestVersion(), which reads CLICKHOUSE_VERSION and falls back to a default
req := testcontainers.ContainerRequest{
Image: fmt.Sprintf("clickhouse/clickhouse-server:%s", test.GetClickHouseTestVersion()),
ExposedPorts: []string{"9000/tcp"},
WaitingFor: wait.ForLog("Ready for connections"),
BindMounts: map[string]string{
"/etc/clickhouse-server/config.d/custom.xml": path.Join(cwd, "../testdata/docker/custom.xml"),
"/etc/clickhouse-server/users.d/admin.xml": path.Join(cwd, "../testdata/docker/admin.xml"),
},
}
clickhouseContainer, err := testcontainers.GenericContainer(ctx, testcontainers.GenericContainerRequest{
ContainerRequest: req,
Started: true,
})
if err != nil {
// can't test without container
panic(err)
}
p, _ := clickhouseContainer.MappedPort(ctx, "9000")
os.Setenv("CLICKHOUSE_DB_PORT", p.Port())
// os.Exit does not run deferred calls, so terminate the container explicitly before exiting
code := m.Run()
_ = clickhouseContainer.Terminate(ctx)
os.Exit(code)
}
// Execute a full default capture, with simple output, and check if a bundle is produced and it's not empty
func TestCapture(t *testing.T) {
tmrDir := t.TempDir()
port, err := strconv.ParseUint(os.Getenv("CLICKHOUSE_DB_PORT"), 10, 16)
if err != nil {
t.Fatal("Unable to read port value from environment")
}
// test a simple output exists
_, err = outputs.GetOutputByName("simple")
require.Nil(t, err)
// this relies on the simple out not changing its params - test will likely fail if so
outputConfig := config.Configuration{
Params: []config.ConfigParam{
config.StringParam{
Value: tmrDir,
Param: config.NewParam("directory", "Directory in which to create dump. Defaults to the current directory.", false),
},
config.StringOptions{
Value: "csv",
Options: []string{"csv"},
Param: config.NewParam("format", "Format of exported files", false),
},
config.BoolParam{
Value: true,
Param: config.NewParam("skip_archive", "Don't compress output to an archive", false),
},
},
}
// test default collectors
collectorNames := collectors.GetCollectorNames(true)
// grab all configs - only default will be used because of collectorNames
collectorConfigs, err := collectors.BuildConfigurationOptions()
require.Nil(t, err)
conf := internal.NewRunConfiguration("random", "localhost", uint16(port), "", "", "simple", outputConfig, collectorNames, collectorConfigs)
internal.Capture(conf)
outputDir := path.Join(tmrDir, "random")
_, err = os.Stat(outputDir)
require.Nil(t, err)
require.True(t, !os.IsNotExist(err))
files, err := ioutil.ReadDir(outputDir)
require.Nil(t, err)
require.Len(t, files, 1)
outputDir = path.Join(outputDir, files[0].Name())
// check we have a folder per collector i.e. collectorNames + diag_trace
files, err = ioutil.ReadDir(outputDir)
require.Nil(t, err)
require.Len(t, files, len(collectorNames)+1)
expectedFolders := append(collectorNames, "diag_trace")
for _, file := range files {
require.True(t, file.IsDir())
// assert on the result; a bare call to utils.Contains checks nothing
require.True(t, utils.Contains(expectedFolders, file.Name()))
}
// we don't test the specific collector outputs but make sure something was written to system
systemFolder := path.Join(outputDir, "system")
files, err = ioutil.ReadDir(systemFolder)
require.Nil(t, err)
require.Greater(t, len(files), 0)
// test diag_trace
diagFolder := path.Join(outputDir, "diag_trace")
files, err = ioutil.ReadDir(diagFolder)
require.Nil(t, err)
require.Equal(t, 1, len(files))
require.FileExists(t, path.Join(diagFolder, "errors.csv"))
}

@ -0,0 +1,9 @@
package main
import (
"github.com/ClickHouse/clickhouse-diagnostics/cmd"
)
func main() {
cmd.Execute()
}

@ -0,0 +1,8 @@
<clickhouse>
<network_max>5000000</network_max>
<test_profile>
<test_p>
</test_p>
</test_profile>
<pg_port>9008</pg_port>
</clickhouse>

@ -0,0 +1,21 @@
<?xml version="1.0" ?>
<clickhouse>
<test_user>
<networks>
<ip>::/0</ip>
</networks>
<profile>default</profile>
<quota>default</quota>
<password_sha256_hex>REPLACE_ME</password_sha256_hex>
<access_management>1</access_management>
</test_user>
<another_user>
<networks>
<ip>::/0</ip>
</networks>
<profile>default</profile>
<quota>default</quota>
<passwird>REPLACE_ME</passwird>
<access_management>1</access_management>
</another_user>
</clickhouse>

@ -0,0 +1 @@
network_max: 5000000

@ -0,0 +1,7 @@
test_user:
password: 'REPLACE_ME'
networks:
ip: '::/0'
profile: default
quota: default
access_management: 1

File diff suppressed because it is too large

@ -0,0 +1,8 @@
<clickhouse>
<users>
<default>
<password remove="1"/>
<password_sha256_hex>REPLACE_ME</password_sha256_hex>
</default>
</users>
</clickhouse>

@ -0,0 +1,58 @@
<?xml version="1.0"?>
<clickhouse>
<!-- See also the files in users.d directory where the settings can be overridden. -->
<!-- Profiles of settings. -->
<include_from>../include/xml/user-include.xml</include_from>
<profiles>
<!-- Default settings. -->
<default>
<!-- Maximum memory usage for processing single query, in bytes. -->
<max_memory_usage>10000000000</max_memory_usage>
<load_balancing>random</load_balancing>
<log_query_threads>1</log_query_threads>
</default>
<!-- Profile that allows only read queries. -->
<readonly>
<readonly>1</readonly>
</readonly>
</profiles>
<!-- Users and ACL. -->
<users>
<test_user>
<include incl="test_user"></include>
</test_user>
<!-- If user name was not specified, 'default' user is used. -->
<default>
<password>REPLACE_ME</password>
<networks>
<ip>::/0</ip>
</networks>
<!-- Settings profile for user. -->
<profile>default</profile>
<!-- Quota for user. -->
<quota>default</quota>
<!-- User can create other users and grant rights to them. -->
<!-- <access_management>1</access_management> -->
</default>
</users>
<!-- Quotas. -->
<quotas>
<!-- Name of quota. -->
<default>
<!-- Limits for time interval. You could specify many intervals with different limits. -->
<interval>
<!-- Length of interval. -->
<duration>3600</duration>
<!-- No limits. Just calculate resource usage for time interval. -->
<queries>0</queries>
<errors>0</errors>
<result_rows>0</result_rows>
<read_rows>0</read_rows>
<execution_time>0</execution_time>
</interval>
</default>
</quotas>
</clickhouse>

@ -0,0 +1,969 @@
# This is an example of a configuration file "config.xml" rewritten in YAML
# You can read this documentation for detailed information about YAML configuration:
# https://clickhouse.com/docs/en/operations/configuration-files/
# NOTE: User and query level settings are set up in "users.yaml" file.
# If you have accidentally specified user-level settings here, server won't start.
# You can either move the settings to the right place inside the "users.yaml" file
# or add skip_check_for_incorrect_settings: 1 here.
include_from: "../include/yaml/server-include.yaml"
logger:
# Possible levels [1]:
# - none (turns off logging)
# - fatal
# - critical
# - error
# - warning
# - notice
# - information
# - debug
# - trace
# [1]: https://github.com/pocoproject/poco/blob/poco-1.9.4-release/Foundation/include/Poco/Logger.h#L105-L114
level: trace
log: /var/log/clickhouse-server/clickhouse-server.log
errorlog: /var/log/clickhouse-server/clickhouse-server.err.log
# Rotation policy
# See https://github.com/pocoproject/poco/blob/poco-1.9.4-release/Foundation/include/Poco/FileChannel.h#L54-L85
size: 1000M
count: 10
# console: 1
# Default behavior is autodetection (log to console if not daemon mode and is tty)
# Per level overrides (legacy):
# For example to suppress logging of the ConfigReloader you can use:
# NOTE: levels.logger is reserved, see below.
# levels:
# ConfigReloader: none
# Per level overrides:
# For example to suppress logging of the RBAC for default user you can use:
# (But please note that the logger name may change from version to version, even after a minor upgrade)
# levels:
# - logger:
# name: 'ContextAccess (default)'
# level: none
# - logger:
# name: 'DatabaseOrdinary (test)'
# level: none
# It is the name that will be shown in the clickhouse-client.
# By default, anything with "production" will be highlighted in red in query prompt.
# display_name: production
# Port for HTTP API. See also 'https_port' for secure connections.
# This interface is also used by ODBC and JDBC drivers (DataGrip, Dbeaver, ...)
# and by most of web interfaces (embedded UI, Grafana, Redash, ...).
http_port: 8123
# Port for interaction by native protocol with:
# - clickhouse-client and other native ClickHouse tools (clickhouse-benchmark, clickhouse-copier);
# - clickhouse-server with other clickhouse-servers for distributed query processing;
# - ClickHouse drivers and applications supporting native protocol
# (this protocol is also informally called "the TCP protocol");
# See also 'tcp_port_secure' for secure connections.
tcp_port: 9000
# Compatibility with MySQL protocol.
# ClickHouse will pretend to be MySQL for applications connecting to this port.
mysql_port: 9004
# Compatibility with PostgreSQL protocol.
# ClickHouse will pretend to be PostgreSQL for applications connecting to this port.
postgresql_port: 9005
# HTTP API with TLS (HTTPS).
# You have to configure certificate to enable this interface.
# See the openSSL section below.
# https_port: 8443
# Native interface with TLS.
# You have to configure certificate to enable this interface.
# See the openSSL section below.
# tcp_port_secure: 9440
# Native interface wrapped with PROXYv1 protocol
# PROXYv1 header sent for every connection.
# ClickHouse will extract information about proxy-forwarded client address from the header.
# tcp_with_proxy_port: 9011
# Port for communication between replicas. Used for data exchange.
# It provides low-level data access between servers.
# This port should not be accessible from untrusted networks.
# See also 'interserver_http_credentials'.
# Data transferred over connections to this port should not go through untrusted networks.
# See also 'interserver_https_port'.
interserver_http_port: 9009
# Port for communication between replicas with TLS.
# You have to configure certificate to enable this interface.
# See the openSSL section below.
# See also 'interserver_http_credentials'.
# interserver_https_port: 9010
# Hostname that is used by other replicas to request this server.
# If not specified, then it is determined analogously to the 'hostname -f' command.
# This setting could be used to switch replication to another network interface
# (the server may be connected to multiple networks via multiple addresses)
# interserver_http_host: example.yandex.ru
# You can specify credentials for authentication between replicas.
# This is required when interserver_https_port is accessible from untrusted networks,
# and also recommended to avoid SSRF attacks from possibly compromised services in your network.
# interserver_http_credentials:
# user: interserver
# password: ''
# Listen specified address.
# Use :: (wildcard IPv6 address), if you want to accept connections both with IPv4 and IPv6 from everywhere.
# Notes:
# If you open connections from wildcard address, make sure that at least one of the following measures is applied:
# - server is protected by firewall and not accessible from untrusted networks;
# - all users are restricted to subset of network addresses (see users.xml);
# - all users have strong passwords, only secure (TLS) interfaces are accessible, or connections are only made via TLS interfaces.
# - users without password have readonly access.
# See also: https://www.shodan.io/search?query=clickhouse
# listen_host: '::'
# Same for hosts without support for IPv6:
# listen_host: 0.0.0.0
# Default values - try to listen on localhost on IPv4 and IPv6.
# listen_host: '::1'
# listen_host: 127.0.0.1
# Don't exit if IPv6 or IPv4 networks are unavailable while trying to listen.
# listen_try: 0
# Allow multiple servers to listen on the same address:port. This is not recommended.
# listen_reuse_port: 0
# listen_backlog: 64
max_connections: 4096
# For 'Connection: keep-alive' in HTTP 1.1
keep_alive_timeout: 3
# gRPC protocol (see src/Server/grpc_protos/clickhouse_grpc.proto for the API)
# grpc_port: 9100
grpc:
enable_ssl: false
# The following two files are used only if enable_ssl=1
ssl_cert_file: /path/to/ssl_cert_file
ssl_key_file: /path/to/ssl_key_file
# Whether server will request client for a certificate
ssl_require_client_auth: false
# The following file is used only if ssl_require_client_auth=1
ssl_ca_cert_file: /path/to/ssl_ca_cert_file
# Default compression algorithm (applied if client doesn't specify another algorithm).
# Supported algorithms: none, deflate, gzip, stream_gzip
compression: deflate
# Default compression level (applied if client doesn't specify another level).
# Supported levels: none, low, medium, high
compression_level: medium
# Send/receive message size limits in bytes. -1 means unlimited
max_send_message_size: -1
max_receive_message_size: -1
# Enable if you want very detailed logs
verbose_logs: false
# Used with https_port and tcp_port_secure. Full ssl options list: https://github.com/ClickHouse-Extras/poco/blob/master/NetSSL_OpenSSL/include/Poco/Net/SSLManager.h#L71
openSSL:
server:
# Used for https server AND secure tcp port
# openssl req -subj "/CN=localhost" -new -newkey rsa:2048 -days 365 -nodes -x509 -keyout /etc/clickhouse-server/server.key -out /etc/clickhouse-server/server.crt
certificateFile: /etc/clickhouse-server/server.crt
privateKeyFile: /etc/clickhouse-server/server.key
# dhparams are optional. You can delete the dhParamsFile: element.
# To generate dhparams, use the following command:
# openssl dhparam -out /etc/clickhouse-server/dhparam.pem 4096
# Only file format with BEGIN DH PARAMETERS is supported.
dhParamsFile: /etc/clickhouse-server/dhparam.pem
verificationMode: none
loadDefaultCAFile: true
cacheSessions: true
disableProtocols: 'sslv2,sslv3'
preferServerCiphers: true
client:
# Used for connecting to https dictionary source and secured Zookeeper communication
loadDefaultCAFile: true
cacheSessions: true
disableProtocols: 'sslv2,sslv3'
preferServerCiphers: true
# Use for self-signed: verificationMode: none
invalidCertificateHandler:
# Use for self-signed: name: AcceptCertificateHandler
name: RejectCertificateHandler
# Default root page on http[s] server. For example load UI from https://tabix.io/ when opening http://localhost:8123
# http_server_default_response: |-
# <html ng-app="SMI2"><head><base href="http://ui.tabix.io/"></head><body><div ui-view="" class="content-ui"></div><script src="http://loader.tabix.io/master.js"></script></body></html>
# Maximum number of concurrent queries.
max_concurrent_queries: 100
# Maximum memory usage (resident set size) for server process.
# Zero value or unset means default. Default is "max_server_memory_usage_to_ram_ratio" of available physical RAM.
# If the value is larger than "max_server_memory_usage_to_ram_ratio" of available physical RAM, it will be cut down.
# The constraint is checked on query execution time.
# If a query tries to allocate memory and the current memory usage plus allocation is greater
# than specified threshold, exception will be thrown.
# It is not practical to set this constraint to small values like just a few gigabytes,
# because memory allocator will keep this amount of memory in caches and the server will deny service of queries.
max_server_memory_usage: 0
# Maximum number of threads in the Global thread pool.
# This will default to a maximum of 10000 threads if not specified.
# This setting will be useful in scenarios where there are a large number
# of distributed queries that are running concurrently but are idling most
# of the time, in which case a higher number of threads might be required.
max_thread_pool_size: 10000
# In memory-constrained environments you may have to set this to a value larger than 1.
max_server_memory_usage_to_ram_ratio: 0.9
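# Worked example (illustrative numbers, not taken from this configuration): on a host with
# 64 GiB of physical RAM, a ratio of 0.9 together with max_server_memory_usage: 0
# gives an effective limit of 0.9 * 64 GiB = 57.6 GiB.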
# Simple server-wide memory profiler. Collect a stack trace at every peak allocation step (in bytes).
# Data will be stored in system.trace_log table with query_id = empty string.
# Zero means disabled.
total_memory_profiler_step: 4194304
# Collect random allocations and deallocations and write them into system.trace_log with 'MemorySample' trace_type.
# The probability is for every alloc/free regardless of the size of the allocation.
# Note that sampling happens only when the amount of untracked memory exceeds the untracked memory limit,
# which is 4 MiB by default but can be lowered if 'total_memory_profiler_step' is lowered.
# You may want to set 'total_memory_profiler_step' to 1 for extra fine grained sampling.
total_memory_tracker_sample_probability: 0
# Set limit on number of open files (default: maximum). This setting makes sense on Mac OS X because getrlimit() fails to retrieve
# correct maximum value.
# max_open_files: 262144
# Size of cache of uncompressed blocks of data, used in tables of MergeTree family.
# In bytes. Cache is single for server. Memory is allocated only on demand.
# Cache is used when 'use_uncompressed_cache' user setting turned on (off by default).
# Uncompressed cache is advantageous only for very short queries and in rare cases.
# Note: uncompressed cache can be pointless for lz4, because memory bandwidth
# is slower than multi-core decompression on some server configurations.
# Enabling it can sometimes paradoxically make queries slower.
uncompressed_cache_size: 8589934592
# Approximate size of mark cache, used in tables of MergeTree family.
# In bytes. Cache is single for server. Memory is allocated only on demand.
# You should not lower this value.
mark_cache_size: 5368709120
# If you enable the `min_bytes_to_use_mmap_io` setting,
# the data in MergeTree tables can be read with mmap to avoid copying from kernel to userspace.
# It makes sense only for large files and helps only if data reside in page cache.
# To avoid frequent open/mmap/munmap/close calls (which are very expensive due to consequent page faults)
# and to reuse mappings from several threads and queries,
# the cache of mapped files is maintained. Its size is the number of mapped regions (usually equal to the number of mapped files).
# The amount of data in mapped files can be monitored
# in system.metrics, system.metric_log by the MMappedFiles, MMappedFileBytes metrics
# and in system.asynchronous_metrics, system.asynchronous_metrics_log by the MMapCacheCells metric,
# and also in system.events, system.processes, system.query_log, system.query_thread_log, system.query_views_log by the
# CreatedReadBufferMMap, CreatedReadBufferMMapFailed, MMappedFileCacheHits, MMappedFileCacheMisses events.
# Note that the amount of data in mapped files does not consume memory directly and is not accounted
# in query or server memory usage - because this memory can be discarded similar to OS page cache.
# The cache is dropped (the files are closed) automatically on removal of old parts in MergeTree,
# also it can be dropped manually by the SYSTEM DROP MMAP CACHE query.
mmap_cache_size: 1000
# Cache size in bytes for compiled expressions.
compiled_expression_cache_size: 134217728
# Cache size in elements for compiled expressions.
compiled_expression_cache_elements_size: 10000
# Path to data directory, with trailing slash.
path: /var/lib/clickhouse/
# Path to temporary data for processing hard queries.
tmp_path: /var/lib/clickhouse/tmp/
# Policy from the <storage_configuration> for the temporary files.
# If not set <tmp_path> is used, otherwise <tmp_path> is ignored.
# Notes:
# - move_factor is ignored
# - keep_free_space_bytes is ignored
# - max_data_part_size_bytes is ignored
# - you must have exactly one volume in that policy
# tmp_policy: tmp
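# A minimal sketch of a storage policy that tmp_policy could reference. The disk name
# 'tmp_disk', its path and the volume name 'main' are hypothetical and not part of this
# configuration. The policy has exactly one volume, as required above:
# storage_configuration:
#   disks:
#     tmp_disk:
#       path: /mnt/tmp_disk/
#   policies:
#     tmp:
#       volumes:
#         main:
#           disk: tmp_disk
# tmp_policy: tmp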
# Directory with user provided files that are accessible by 'file' table function.
user_files_path: /var/lib/clickhouse/user_files/
# LDAP server definitions.
ldap_servers: ''
# List LDAP servers with their connection parameters here to later 1) use them as authenticators for dedicated local users,
# who have 'ldap' authentication mechanism specified instead of 'password', or to 2) use them as remote user directories.
# Parameters:
# host - LDAP server hostname or IP, this parameter is mandatory and cannot be empty.
# port - LDAP server port, default is 636 if enable_tls is set to true, 389 otherwise.
# bind_dn - template used to construct the DN to bind to.
# The resulting DN will be constructed by replacing all '{user_name}' substrings of the template with the actual
# user name during each authentication attempt.
# user_dn_detection - section with LDAP search parameters for detecting the actual user DN of the bound user.
# This is mainly used in search filters for further role mapping when the server is Active Directory. The
# resulting user DN will be used when replacing '{user_dn}' substrings wherever they are allowed. By default,
# user DN is set equal to bind DN, but once search is performed, it will be updated to the actual detected
# user DN value.
# base_dn - template used to construct the base DN for the LDAP search.
# The resulting DN will be constructed by replacing all '{user_name}' and '{bind_dn}' substrings
# of the template with the actual user name and bind DN during the LDAP search.
# scope - scope of the LDAP search.
# Accepted values are: 'base', 'one_level', 'children', 'subtree' (the default).
# search_filter - template used to construct the search filter for the LDAP search.
# The resulting filter will be constructed by replacing all '{user_name}', '{bind_dn}', and '{base_dn}'
# substrings of the template with the actual user name, bind DN, and base DN during the LDAP search.
# Note, that the special characters must be escaped properly in XML.
# verification_cooldown - a period of time, in seconds, after a successful bind attempt, during which a user will be assumed
# to be successfully authenticated for all consecutive requests without contacting the LDAP server.
# Specify 0 (the default) to disable caching and force contacting the LDAP server for each authentication request.
# enable_tls - flag to trigger use of secure connection to the LDAP server.
# Specify 'no' for plain text (ldap://) protocol (not recommended).
# Specify 'yes' for LDAP over SSL/TLS (ldaps://) protocol (recommended, the default).
# Specify 'starttls' for legacy StartTLS protocol (plain text (ldap://) protocol, upgraded to TLS).
# tls_minimum_protocol_version - the minimum protocol version of SSL/TLS.
# Accepted values are: 'ssl2', 'ssl3', 'tls1.0', 'tls1.1', 'tls1.2' (the default).
# tls_require_cert - SSL/TLS peer certificate verification behavior.
# Accepted values are: 'never', 'allow', 'try', 'demand' (the default).
# tls_cert_file - path to certificate file.
# tls_key_file - path to certificate key file.
# tls_ca_cert_file - path to CA certificate file.
# tls_ca_cert_dir - path to the directory containing CA certificates.
# tls_cipher_suite - allowed cipher suite (in OpenSSL notation).
# Example:
# my_ldap_server:
# host: localhost
# port: 636
# bind_dn: 'uid={user_name},ou=users,dc=example,dc=com'
# verification_cooldown: 300
# enable_tls: yes
# tls_minimum_protocol_version: tls1.2
# tls_require_cert: demand
# tls_cert_file: /path/to/tls_cert_file
# tls_key_file: /path/to/tls_key_file
# tls_ca_cert_file: /path/to/tls_ca_cert_file
# tls_ca_cert_dir: /path/to/tls_ca_cert_dir
# tls_cipher_suite: ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:AES256-GCM-SHA384
# Example (typical Active Directory with configured user DN detection for further role mapping):
# my_ad_server:
# host: localhost
# port: 389
# bind_dn: 'EXAMPLE\{user_name}'
# user_dn_detection:
# base_dn: CN=Users,DC=example,DC=com
# search_filter: '(&amp;(objectClass=user)(sAMAccountName={user_name}))'
# enable_tls: no
# To enable Kerberos authentication support for HTTP requests (GSS-SPNEGO), for those users who are explicitly configured
# to authenticate via Kerberos, define a single 'kerberos' section here.
# Parameters:
# principal - canonical service principal name, that will be acquired and used when accepting security contexts.
# This parameter is optional, if omitted, the default principal will be used.
# This parameter cannot be specified together with 'realm' parameter.
# realm - a realm, that will be used to restrict authentication to only those requests whose initiator's realm matches it.
# This parameter is optional, if omitted, no additional filtering by realm will be applied.
# This parameter cannot be specified together with 'principal' parameter.
# Example:
# kerberos: ''
# Example:
# kerberos:
# principal: HTTP/clickhouse.example.com@EXAMPLE.COM
# Example:
# kerberos:
# realm: EXAMPLE.COM
# Sources to read users, roles, access rights, profiles of settings, quotas.
user_directories:
users_xml:
# Path to configuration file with predefined users.
path: users.yaml
local_directory:
# Path to folder where users created by SQL commands are stored.
path: /var/lib/clickhouse/access/
# # To add an LDAP server as a remote user directory of users that are not defined locally, define a single 'ldap' section
# # with the following parameters:
# # server - one of LDAP server names defined in 'ldap_servers' config section above.
# # This parameter is mandatory and cannot be empty.
# # roles - section with a list of locally defined roles that will be assigned to each user retrieved from the LDAP server.
# # If no roles are specified here or assigned during role mapping (below), user will not be able to perform any
# # actions after authentication.
# # role_mapping - section with LDAP search parameters and mapping rules.
# # When a user authenticates, while still bound to LDAP, an LDAP search is performed using search_filter and the
# # name of the logged in user. For each entry found during that search, the value of the specified attribute is
# # extracted. For each attribute value that has the specified prefix, the prefix is removed, and the rest of the
# # value becomes the name of a local role defined in ClickHouse, which is expected to be created beforehand by
# # CREATE ROLE command.
# # There can be multiple 'role_mapping' sections defined inside the same 'ldap' section. All of them will be
# # applied.
# # base_dn - template used to construct the base DN for the LDAP search.
# # The resulting DN will be constructed by replacing all '{user_name}', '{bind_dn}', and '{user_dn}'
# # substrings of the template with the actual user name, bind DN, and user DN during each LDAP search.
# # scope - scope of the LDAP search.
# # Accepted values are: 'base', 'one_level', 'children', 'subtree' (the default).
# # search_filter - template used to construct the search filter for the LDAP search.
# # The resulting filter will be constructed by replacing all '{user_name}', '{bind_dn}', '{user_dn}', and
# # '{base_dn}' substrings of the template with the actual user name, bind DN, user DN, and base DN during
# # each LDAP search.
# # Note, that the special characters must be escaped properly in XML.
# # attribute - attribute name whose values will be returned by the LDAP search. 'cn', by default.
# # prefix - prefix, that will be expected to be in front of each string in the original list of strings returned by
# # the LDAP search. Prefix will be removed from the original strings and resulting strings will be treated
# # as local role names. Empty, by default.
# # Example:
# # ldap:
# # server: my_ldap_server
# # roles:
# # my_local_role1: ''
# # my_local_role2: ''
# # role_mapping:
# # base_dn: 'ou=groups,dc=example,dc=com'
# # scope: subtree
# # search_filter: '(&amp;(objectClass=groupOfNames)(member={bind_dn}))'
# # attribute: cn
# # prefix: clickhouse_
# # Example (typical Active Directory with role mapping that relies on the detected user DN):
# # ldap:
# # server: my_ad_server
# # role_mapping:
# # base_dn: 'CN=Users,DC=example,DC=com'
# # attribute: CN
# # scope: subtree
# # search_filter: '(&amp;(objectClass=group)(member={user_dn}))'
# # prefix: clickhouse_
# Default profile of settings.
default_profile: default
# Comma-separated list of prefixes for user-defined settings.
# custom_settings_prefixes: ''
# System profile of settings. These settings are used by internal processes (Distributed DDL worker and so on).
# system_profile: default
# Buffer profile of settings.
# These settings are used by Buffer storage to flush data to the underlying table.
# Default: used from system_profile directive.
# buffer_profile: default
# Default database.
default_database: default
# Server time zone could be set here.
# Time zone is used when converting between String and DateTime types,
# when printing DateTime in text formats and parsing DateTime from text,
# it is used in date and time related functions, if specific time zone was not passed as an argument.
# Time zone is specified as identifier from IANA time zone database, like UTC or Africa/Abidjan.
# If not specified, system time zone at server startup is used.
# Please note, that server could display time zone alias instead of specified name.
# Example: W-SU is an alias for Europe/Moscow and Zulu is an alias for UTC.
# timezone: Europe/Moscow
# You can specify umask here (see "man umask"). Server will apply it on startup.
# Number is always parsed as octal. Default umask is 027 (other users cannot read logs, data files, etc; group can only read).
# umask: 022
# Perform mlockall after startup to lower first queries latency
# and to prevent clickhouse executable from being paged out under high IO load.
# Enabling this option is recommended but will lead to increased startup time for up to a few seconds.
mlock_executable: true
# Reallocate memory for machine code ("text") using huge pages. Highly experimental.
remap_executable: false
# Uncomment below in order to use JDBC table engine and function.
# To install and run JDBC bridge in background:
# * [Debian/Ubuntu]
# export MVN_URL=https://repo1.maven.org/maven2/ru/yandex/clickhouse/clickhouse-jdbc-bridge
# export PKG_VER=$(curl -sL $MVN_URL/maven-metadata.xml | grep '<release>' | sed -e 's|.*>\(.*\)<.*|\1|')
# wget https://github.com/ClickHouse/clickhouse-jdbc-bridge/releases/download/v$PKG_VER/clickhouse-jdbc-bridge_$PKG_VER-1_all.deb
# apt install --no-install-recommends -f ./clickhouse-jdbc-bridge_$PKG_VER-1_all.deb
# clickhouse-jdbc-bridge &
# * [CentOS/RHEL]
# export MVN_URL=https://repo1.maven.org/maven2/ru/yandex/clickhouse/clickhouse-jdbc-bridge
# export PKG_VER=$(curl -sL $MVN_URL/maven-metadata.xml | grep '<release>' | sed -e 's|.*>\(.*\)<.*|\1|')
# wget https://github.com/ClickHouse/clickhouse-jdbc-bridge/releases/download/v$PKG_VER/clickhouse-jdbc-bridge-$PKG_VER-1.noarch.rpm
# yum localinstall -y clickhouse-jdbc-bridge-$PKG_VER-1.noarch.rpm
# clickhouse-jdbc-bridge &
# Please refer to https://github.com/ClickHouse/clickhouse-jdbc-bridge#usage for more information.
# jdbc_bridge:
# host: 127.0.0.1
# port: 9019
# Configuration of clusters that could be used in Distributed tables.
# https://clickhouse.com/docs/en/operations/table_engines/distributed/
remote_servers:
# Test only shard config for testing distributed storage
test_shard_localhost:
# Inter-server per-cluster secret for Distributed queries
# default: no secret (no authentication will be performed)
# If set, then Distributed queries will be validated on shards, so at least:
# - such cluster should exist on the shard,
# - such cluster should have the same secret.
# And also (and which is more important), the initial_user will
# be used as current user for the query.
# Right now the protocol is pretty simple and it only takes into account:
# - cluster name
# - query
# It would also be nice if the following were implemented:
# - source hostname (see interserver_http_host), but then it would depend on DNS;
# an IP address could be used instead, but then you need to get it right on the initiator node.
# - target hostname / ip address (same notes as for source hostname)
# - time-based security tokens
secret: 'REPLACE_ME'
shard:
# Optional. Whether to write data to just one of the replicas. Default: false (write data to all replicas).
# internal_replication: false
# Optional. Shard weight when writing data. Default: 1.
# weight: 1
replica:
host: localhost
port: 9000
# Optional. Priority of the replica for load_balancing. Default: 1 (less value has more priority).
# priority: 1
test_cluster_two_shards_localhost:
shard:
- replica:
host: localhost
port: 9000
- replica:
host: localhost
port: 9000
test_cluster_two_shards:
shard:
- replica:
host: 127.0.0.1
port: 9000
- replica:
host: 127.0.0.2
port: 9000
test_cluster_two_shards_internal_replication:
shard:
- internal_replication: true
replica:
host: 127.0.0.1
port: 9000
- internal_replication: true
replica:
host: 127.0.0.2
port: 9000
test_shard_localhost_secure:
shard:
replica:
host: localhost
port: 9440
secure: 1
test_unavailable_shard:
shard:
- replica:
host: localhost
port: 9000
- replica:
host: localhost
port: 1
# The list of hosts allowed to use in URL-related storage engines and table functions.
# If this section is not present in configuration, all hosts are allowed.
# remote_url_allow_hosts:
# Host should be specified exactly as in URL. The name is checked before DNS resolution.
# Example: "yandex.ru", "yandex.ru." and "www.yandex.ru" are different hosts.
# If port is explicitly specified in URL, the host:port is checked as a whole.
# If a host is specified here without a port, any port on this host is allowed.
# "yandex.ru" -> "yandex.ru:443", "yandex.ru:80" etc. is allowed, but "yandex.ru:80" -> only "yandex.ru:80" is allowed.
# If the host is specified as IP address, it is checked as specified in URL. Example: "[2a02:6b8:a::a]".
# If there are redirects and support for redirects is enabled, every redirect (the Location field) is checked.
# Regular expression can be specified. RE2 engine is used for regexps.
# Regexps are not anchored: don't forget to add ^ and $. Also don't forget to escape dot (.) metacharacter
# (forgetting to do so is a common source of error).
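# Example (a sketch: the host names are placeholders, and the list form assumes that a YAML
# sequence maps to repeated elements of the same name):
# remote_url_allow_hosts:
#   host:
#     - example.com
#     - 'example.com:8443'
#   host_regexp: '^.*\.example\.com$'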
# If an element has the 'incl' attribute, the corresponding substitution from another file is used as its value.
# By default, the path to the file with substitutions is /etc/metrika.xml. It can be changed with the 'include_from' element in the config.
# Values for substitutions are specified in /clickhouse/name_of_substitution elements in that file.
# ZooKeeper is used to store metadata about replicas, when using Replicated tables.
# Optional. If you don't use replicated tables, you could omit that.
# See https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/
# zookeeper:
# - node:
# host: example1
# port: 2181
# - node:
# host: example2
# port: 2181
# - node:
# host: example3
# port: 2181
# Substitutions for parameters of replicated tables.
# Optional. If you don't use replicated tables, you could omit that.
# See https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/#creating-replicated-tables
# macros:
# shard: 01
# replica: example01-01-1
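# With macros like those above, a replicated table could be declared as follows (a sketch:
# the table name, columns and ZooKeeper path are illustrative). The server substitutes
# {shard} and {replica} with its own values:
#   CREATE TABLE example_table (d Date, x UInt32)
#   ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/example_table', '{replica}')
#   ORDER BY x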
# Reloading interval for embedded dictionaries, in seconds. Default: 3600.
builtin_dictionaries_reload_interval: 3600
# Maximum session timeout, in seconds. Default: 3600.
max_session_timeout: 3600
# Default session timeout, in seconds. Default: 60.
default_session_timeout: 60
# Sending data to Graphite for monitoring. Several sections can be defined.
# interval - send every X second
# root_path - prefix for keys
# hostname_in_path - append hostname to root_path (default = true)
# metrics - send data from table system.metrics
# events - send data from table system.events
# asynchronous_metrics - send data from table system.asynchronous_metrics
# graphite:
# host: localhost
# port: 42000
# timeout: 0.1
# interval: 60
# root_path: one_min
# hostname_in_path: true
# metrics: true
# events: true
# events_cumulative: false
# asynchronous_metrics: true
# graphite:
# host: localhost
# port: 42000
# timeout: 0.1
# interval: 1
# root_path: one_sec
# metrics: true
# events: true
# events_cumulative: false
# asynchronous_metrics: false
# Serve endpoint for Prometheus monitoring.
# endpoint - metrics path (relative to root, starting with "/")
# port - port to set up the server on. If not defined or 0, then http_port is used
# metrics - send data from table system.metrics
# events - send data from table system.events
# asynchronous_metrics - send data from table system.asynchronous_metrics
# status_info - send data from different components of CH, e.g. dictionaries status
# prometheus:
# endpoint: /metrics
# port: 9363
# metrics: true
# events: true
# asynchronous_metrics: true
# status_info: true
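# With the commented-out example above enabled, the metrics could be fetched as follows
# (a sketch, assuming the server runs on localhost and the example port and endpoint are kept):
# curl http://localhost:9363/metrics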
# Query log. Used only for queries with setting log_queries = 1.
query_log:
# Table to insert data into. If the table does not exist, it will be created.
# When the query log structure changes after a system update,
# the old table is renamed and a new table is created automatically.
database: system
table: query_log
# PARTITION BY expr: https://clickhouse.com/docs/en/table_engines/mergetree-family/custom_partitioning_key/
# Example:
# event_date
# toMonday(event_date)
# toYYYYMM(event_date)
# toStartOfHour(event_time)
partition_by: toYYYYMM(event_date)
# Table TTL specification: https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/mergetree/#mergetree-table-ttl
# Example:
# event_date + INTERVAL 1 WEEK
# event_date + INTERVAL 7 DAY DELETE
# event_date + INTERVAL 2 WEEK TO DISK 'bbb'
# ttl: 'event_date + INTERVAL 30 DAY DELETE'
# Instead of partition_by, you can provide a full engine expression (starting with ENGINE = ) with parameters.
# Example: engine: 'ENGINE = MergeTree PARTITION BY toYYYYMM(event_date) ORDER BY (event_date, event_time) SETTINGS index_granularity = 1024'
# Interval of flushing data.
flush_interval_milliseconds: 7500
# Trace log. Stores stack traces collected by query profilers.
# See query_profiler_real_time_period_ns and query_profiler_cpu_time_period_ns settings.
trace_log:
database: system
table: trace_log
partition_by: toYYYYMM(event_date)
flush_interval_milliseconds: 7500
# Query thread log. Has information about all threads that participated in query execution.
# Used only for queries with setting log_query_threads = 1.
query_thread_log:
database: system
table: query_thread_log
partition_by: toYYYYMM(event_date)
flush_interval_milliseconds: 7500
# Query views log. Has information about all dependent views associated with a query.
# Used only for queries with setting log_query_views = 1.
query_views_log:
database: system
table: query_views_log
partition_by: toYYYYMM(event_date)
flush_interval_milliseconds: 7500
# Uncomment to use the part log.
# Part log contains information about all actions with parts in MergeTree tables (creation, deletion, merges, downloads).
part_log:
database: system
table: part_log
partition_by: toYYYYMM(event_date)
flush_interval_milliseconds: 7500
# Uncomment to write text log into table.
# Text log contains all information from the usual server log but stores it in a structured and efficient way.
# The level of the messages that go to the table can be limited (<level>). If not specified, all messages go to the table.
# text_log:
# database: system
# table: text_log
# flush_interval_milliseconds: 7500
# level: ''
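# For example, to store only warnings and more severe messages
# (an illustrative value, not part of the original config; assumes the same level names as the server logger):
# level: warning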
# Metric log contains rows with current values of ProfileEvents, CurrentMetrics collected with "collect_interval_milliseconds" interval.
metric_log:
database: system
table: metric_log
flush_interval_milliseconds: 7500
collect_interval_milliseconds: 1000
# Asynchronous metric log contains values of metrics from
# system.asynchronous_metrics.
asynchronous_metric_log:
database: system
table: asynchronous_metric_log
# Asynchronous metrics are updated once a minute, so there is
# no need to flush more often.
flush_interval_milliseconds: 60000
# OpenTelemetry log contains OpenTelemetry trace spans.
opentelemetry_span_log:
# The default table creation code is insufficient; this <engine> spec
# is a workaround. There is no 'event_time' for this log, but two times,
# start and finish. It is sorted by finish time, to avoid inserting
# data too far away in the past (probably we can sometimes insert a span
# that is seconds earlier than the last span in the table, due to a race
# between several spans inserted in parallel). This gives the spans a
# global order that we can use to e.g. retry insertion into some external
# system.
engine: |-
engine MergeTree
partition by toYYYYMM(finish_date)
order by (finish_date, finish_time_us, trace_id)
database: system
table: opentelemetry_span_log
flush_interval_milliseconds: 7500
# Crash log. Stores stack traces for fatal errors.
# This table is normally empty.
crash_log:
database: system
table: crash_log
partition_by: ''
flush_interval_milliseconds: 1000
# Parameters for embedded dictionaries, used in Yandex.Metrica.
# See https://clickhouse.com/docs/en/dicts/internal_dicts/
# Path to file with region hierarchy.
# path_to_regions_hierarchy_file: /opt/geo/regions_hierarchy.txt
# Path to directory with files containing names of regions
# path_to_regions_names_files: /opt/geo/
# top_level_domains_path: /var/lib/clickhouse/top_level_domains/
# Custom TLD lists.
# Format: name: /path/to/file
# Changes will not be applied w/o server restart.
# Path to the list is under top_level_domains_path (see above).
top_level_domains_lists: ''
# public_suffix_list: /path/to/public_suffix_list.dat
# Configuration of external dictionaries. See:
# https://clickhouse.com/docs/en/sql-reference/dictionaries/external-dictionaries/external-dicts
dictionaries_config: '*_dictionary.xml'
# Uncomment if you want data to be compressed 30-100% better.
# Don't do that if you just started using ClickHouse.
# compression:
# # Set of variants. Checked in order. Last matching case wins. If nothing matches, lz4 will be used.
# case:
# # Conditions. All must be satisfied. Some conditions may be omitted.
# # min_part_size: 10000000000 # Min part size in bytes.
# # min_part_size_ratio: 0.01 # Min size of part relative to whole table size.
# # What compression method to use.
# method: zstd
# Allows executing distributed DDL queries (CREATE, DROP, ALTER, RENAME) on a cluster.
# Works only if ZooKeeper is enabled. Comment it out if this functionality isn't required.
distributed_ddl:
# Path in ZooKeeper to queue with DDL queries
path: /clickhouse/task_queue/ddl
# Settings from this profile will be used to execute DDL queries
# profile: default
# Controls how many ON CLUSTER queries can be run simultaneously.
# pool_size: 1
# Cleanup settings (active tasks will not be removed)
# Controls task TTL (default 1 week)
# task_max_lifetime: 604800
# Controls how often cleanup should be performed (in seconds)
# cleanup_delay_period: 60
# Controls how many tasks could be in the queue
# max_tasks_in_queue: 1000
# Settings to fine tune MergeTree tables. See documentation in source code, in MergeTreeSettings.h
# merge_tree:
# max_suspicious_broken_parts: 5
# Protection from accidental DROP.
# If the size of a MergeTree table is greater than max_table_size_to_drop (in bytes), then the table cannot be dropped with any DROP query.
# If you want to delete one table and don't want to change the clickhouse-server config, you can create the special file <clickhouse-path>/flags/force_drop_table and make DROP once.
# By default max_table_size_to_drop is 50GB; max_table_size_to_drop=0 allows dropping any table.
# The same for max_partition_size_to_drop.
# Uncomment to disable protection.
# max_table_size_to_drop: 0
# max_partition_size_to_drop: 0
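# A sketch of the one-off override described above, assuming <clickhouse-path> is /var/lib/clickhouse
# (an assumption; adjust to your installation). The flag file permits a single DROP:
# sudo touch /var/lib/clickhouse/flags/force_drop_table
# sudo chmod 666 /var/lib/clickhouse/flags/force_drop_table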
# Example of parameters for GraphiteMergeTree table engine
graphite_rollup_example:
pattern:
regexp: click_cost
function: any
retention:
- age: 0
precision: 3600
- age: 86400
precision: 60
default:
function: max
retention:
- age: 0
precision: 60
- age: 3600
precision: 300
- age: 86400
precision: 3600
# Directory in <clickhouse-path> containing schema files for various input formats.
# The directory will be created if it doesn't exist.
format_schema_path: /var/lib/clickhouse/format_schemas/
# Default query masking rules. Text matching a rule is replaced with the substitution string in the logs
# (both text logs and system.query_log).
# name - name for the rule (optional)
# regexp - RE2 compatible regular expression (mandatory)
# replace - substitution string for sensitive data (optional, by default - six asterisks)
query_masking_rules:
rule:
name: hide encrypt/decrypt arguments
regexp: '((?:aes_)?(?:encrypt|decrypt)(?:_mysql)?)\s*\(\s*(?:''(?:\\''|.)+''|.*?)\s*\)'
# or more secure, but also more invasive:
# (aes_\w+)\s*\(.*\)
replace: \1(???)
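# Illustration of the rule above (the query is a made-up example, not from the original config).
# A query such as
# SELECT encrypt('aes-128-ecb', 'secret', 'key')
# would be expected to appear in the logs as
# SELECT encrypt(???)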
# Uncomment to use custom HTTP handlers.
# Rules are checked from top to bottom; the first match runs the handler.
# url - matches the request URL; use the 'regex:' prefix for a regex match (optional)
# methods - matches the request method; separate multiple methods with commas (optional)
# headers - matches request headers; each child element is matched (the child element name is the header name); use the 'regex:' prefix for a regex match (optional)
# handler - the request handler
# type - supported types: static, dynamic_query_handler, predefined_query_handler
# query - used with the predefined_query_handler type; the query executed when the handler is called
# query_param_name - used with the dynamic_query_handler type; extracts and executes the value corresponding to <query_param_name> in the HTTP request params
# status - used with the static type; response status code
# content_type - used with the static type; response content-type
# response_content - used with the static type; response content sent to the client. With the 'file://' or 'config://' prefix, the content is read from the file or configuration and sent to the client.
# http_handlers:
# - rule:
# url: /
# methods: POST,GET
# headers:
# pragma: no-cache
# handler:
# type: dynamic_query_handler
# query_param_name: query
# - rule:
# url: /predefined_query
# methods: POST,GET
# handler:
# type: predefined_query_handler
# query: 'SELECT * FROM system.settings'
# - rule:
# handler:
# type: static
# status: 200
# content_type: 'text/plain; charset=UTF-8'
# response_content: config://http_server_default_response
send_crash_reports:
# Changing <enabled> to true allows sending crash reports to
# the ClickHouse core developers team via Sentry https://sentry.io
# Doing so at least in pre-production environments is highly appreciated
enabled: false
# Change <anonymize> to true if you don't feel comfortable attaching the server hostname to the crash report
anonymize: false
# Default endpoint should be changed to different Sentry DSN only if you have
# some in-house engineers or hired consultants who're going to debug ClickHouse issues for you
endpoint: 'https://6f33034cfe684dd7a3ab9875e57b1c8d@o388870.ingest.sentry.io/5226277'
# Uncomment to disable ClickHouse internal DNS caching.
# disable_internal_dns_cache: 1
storage_configuration:
disks:
s3:
secret_access_key: REPLACE_ME
access_key_id: 'REPLACE_ME'

@ -0,0 +1,6 @@
# Users and ACL.
users:
# If user name was not specified, 'default' user is used.
default:
password_sha256_hex: "REPLACE_ME"
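# A value for password_sha256_hex could be generated like this
# (a sketch, assuming a shell with sha256sum available; 'your_password' is a placeholder):
# printf '%s' 'your_password' | sha256sum | tr -d ' -'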

@ -0,0 +1,47 @@
include_from: "../include/yaml/user-include.yaml"
# Profiles of settings.
profiles:
# Default settings.
default:
# Maximum memory usage for processing single query, in bytes.
max_memory_usage: 10000000000
load_balancing: random
# Profile that allows only read queries.
readonly:
readonly: 1
# Users and ACL.
users:
# If user name was not specified, 'default' user is used.
default:
password: 'REPLACE_ME'
networks:
ip: '::/0'
# Settings profile for user.
profile: default
# Quota for user.
quota: default
# User can create other users and grant rights to them.
# access_management: 1
# Quotas.
quotas:
# Name of quota.
default:
# Limits for time interval. You could specify many intervals with different limits.
interval:
# Length of interval.
duration: 3600
# No limits. Just calculate resource usage for time interval.
queries: 0
errors: 0
result_rows: 0
read_rows: 0
execution_time: 0

File diff suppressed because it is too large

@ -0,0 +1,15 @@
<clickhouse>
<!-- Profiles of settings. -->
<profiles>
<!-- Default settings. -->
<default>
<!-- Allows us to create replicated databases. -->
<allow_experimental_database_replicated>1</allow_experimental_database_replicated>
</default>
</profiles>
<users>
<default>
<access_management>1</access_management>
</default>
</users>
</clickhouse>

@ -0,0 +1,9 @@
<?xml version="1.0" ?>
<clickhouse>
<listen_host>::</listen_host>
<listen_host>0.0.0.0</listen_host>
<listen_try>1</listen_try>
<logger>
<console>1</console>
</logger>
</clickhouse>

@ -0,0 +1 @@
dummy hz file for tests