WIP update-aggregate-funcions-in-zh

benbiti 2021-03-09 19:20:52 +08:00
parent c1e5dc92a4
commit c24207037f
2 changed files with 17 additions and 75 deletions


@ -396,11 +396,6 @@ SELECT quantileTDigestWeighted(number, 1) FROM numbers(10)
- [median](#median)
- [quantiles](#quantiles)
## quantiles(level1, level2, …)(x) {#quantiles}
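Every quantile function has a corresponding quantiles function: `quantiles`, `quantilesDeterministic`, `quantilesTiming`, `quantilesTimingWeighted`, `quantilesExact`, `quantilesExactWeighted`, `quantilesTDigest`. These functions calculate all the quantiles of the listed levels in one pass and return an array of the resulting values.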
## stochasticLinearRegression {#agg_functions-stochasticlinearregression}
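This function implements stochastic linear regression. It supports custom parameters for the learning rate, the L2 regularization coefficient, and the mini-batch size, and offers several methods for updating weights: [Adam](https://en.wikipedia.org/wiki/Stochastic_gradient_descent#Adam) (the default), [simple SGD](https://en.wikipedia.org/wiki/Stochastic_gradient_descent), [Momentum](https://en.wikipedia.org/wiki/Stochastic_gradient_descent#Momentum), and [Nesterov](https://mipt.ru/upload/medialibrary/d7e/41-91.pdf).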
@ -473,57 +468,5 @@ evalMLMethod(model, param1, param2) FROM test_data
- [stochasticLogisticRegression](#agg_functions-stochasticlogisticregression)
- [Difference between linear and logistic regressions](https://stackoverflow.com/questions/12146914/what-is-the-difference-between-linear-regression-and-logistic-regression)
## stochasticLogisticRegression {#agg_functions-stochasticlogisticregression}
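This function implements stochastic logistic regression. It can be used for binary classification problems, supports the same custom parameters as stochasticLinearRegression, and works the same way.
### Parameters {#agg_functions-stochasticlogisticregression-parameters}
The parameters are exactly the same as in stochasticLinearRegression:
`learning rate`, `l2 regularization coefficient`, `mini-batch size`, `method for updating weights`.
For more information, see [parameters](#agg_functions-stochasticlinearregression-parameters).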
``` text
stochasticLogisticRegression(1.0, 1.0, 10, 'SGD')
```
**1.** Fitting
<!-- -->
See the stochasticLinearRegression documentation.
Predicted labels must be in [-1, 1].
**2.** Predicting
<!-- -->
Using the saved state, we can predict the probability that an object has label `1`.
``` sql
WITH (SELECT state FROM your_model) AS model SELECT
evalMLMethod(model, param1, param2) FROM test_data
```
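The query returns a column of probabilities. Note that the first argument of `evalMLMethod` is an `AggregateFunctionState` object; the remaining arguments are the feature columns.
We can also set a probability bound, which assigns elements to different labels.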
``` sql
SELECT ans < 1.1 AND ans > 0.5 FROM
(WITH (SELECT state FROM your_model) AS model SELECT
evalMLMethod(model, param1, param2) AS ans FROM test_data)
```
The result is then a column of labels.
`test_data` is a table like `train_data`, but it does not contain the target value.
**See Also**
- [stochasticLinearRegression](#agg_functions-stochasticlinearregression)
- [Difference between linear and logistic regressions](https://stackoverflow.com/questions/12146914/what-is-the-difference-between-linear-regression-and-logistic-regression)
[Original article](https://clickhouse.tech/docs/en/query_language/agg_functions/reference/) <!--hide-->


@ -4,40 +4,39 @@ toc_priority: 222
# stochasticLogisticRegression {#agg_functions-stochasticlogisticregression}
This function implements stochastic logistic regression. It can be used for binary classification problems, supports the same custom parameters as stochasticLinearRegression and works the same way.
This function implements stochastic logistic regression. It can be used for binary classification problems, supports the same custom parameters as stochasticLinearRegression, and works in the same way.
### Parameters {#agg_functions-stochasticlogisticregression-parameters}
### Parameters {#agg_functions-stochasticlogisticregression-parameters}
Parameters are exactly the same as in stochasticLinearRegression:
The parameters are exactly the same as those of stochasticLinearRegression:
`learning rate`, `l2 regularization coefficient`, `mini-batch size`, `method for updating weights`.
For more information see [parameters](#agg_functions-stochasticlinearregression-parameters).
For more information, see [parameters](#agg_functions-stochasticlinearregression-parameters).
``` text
``` sql
stochasticLogisticRegression(1.0, 1.0, 10, 'SGD')
```
**1.** Fitting
**1.** Fitting
<!-- -->
See the `Fitting` section in the [stochasticLinearRegression](#stochasticlinearregression-usage-fitting) description.
See the `Fitting` section of the [stochasticLinearRegression](#stochasticlinearregression-usage-fitting) documentation.
Predicted labels have to be in \[-1, 1\].
Predicted labels take values in \[-1, 1\].
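As a minimal sketch of the fitting step, assuming the same `train_data` layout (`param1`, `param2`, `target`) that the stochasticLinearRegression description uses and the `-State` combinator form of the function, the model state could be saved as shown here. The table and column names are illustrative and are not prescribed by this page.

``` sql
-- Hypothetical training table: two feature columns and a target label in [-1, 1].
CREATE TABLE IF NOT EXISTS train_data
(
    param1 Float64,
    param2 Float64,
    target Float64
) ENGINE = Memory;

-- Fit the model and keep its aggregate state so evalMLMethod can use it later.
-- The parameter values (1.0, 1.0, 10, 'SGD') are taken from the example call above.
CREATE TABLE your_model ENGINE = Memory AS SELECT
stochasticLogisticRegressionState(1.0, 1.0, 10, 'SGD')(target, param1, param2)
AS state FROM train_data;
```

The target column comes first in the argument list, followed by the feature columns, which matches the argument order later passed to `evalMLMethod` (minus the target).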
**2.** Predicting
**2.** Predicting
<!-- -->
Using the saved state, we can predict the probability of an object having label `1`.
Using the saved state, we can predict the probability that an object has label `1`.
``` sql
WITH (SELECT state FROM your_model) AS model SELECT
evalMLMethod(model, param1, param2) FROM test_data
```
The query will return a column of probabilities. Note that the first argument of `evalMLMethod` is an `AggregateFunctionState` object; the next arguments are columns of features.
The query returns a column of probabilities. Note that the first argument of `evalMLMethod` is an `AggregateFunctionState` object; the remaining arguments are the feature columns.
We can also set a bound of probability, which assigns elements to different labels.
We can also set a probability bound, which assigns elements to different labels.
``` sql
SELECT ans < 1.1 AND ans > 0.5 FROM
@ -45,11 +44,11 @@ stochasticLogisticRegression(1.0, 1.0, 10, 'SGD')
evalMLMethod(model, param1, param2) AS ans FROM test_data)
```
Then the result will be labels.
The result is a column of labels.
`test_data` is a table like `train_data` but may not contain target value.
`test_data` is a table like `train_data`, but it does not contain the target value.
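The thresholding step can also return explicit label values rather than a boolean. The sketch below assumes a decision threshold of 0.5 and assumes that the two classes are encoded as `1` and `-1`; both are illustrative choices, not something this page specifies.

``` sql
-- Assumption: 0.5 is the decision threshold and the classes are labelled 1 and -1.
WITH (SELECT state FROM your_model) AS model
SELECT
    evalMLMethod(model, param1, param2) AS probability,
    if(probability > 0.5, 1, -1) AS label
FROM test_data
```

Keeping the `probability` column next to `label` makes it easier to check how close each prediction is to the threshold.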
**See Also**
**See Also**
- [stochasticLinearRegression](../../../sql-reference/aggregate-functions/reference/stochasticlinearregression.md#agg_functions-stochasticlinearregression)
- [Difference between linear and logistic regressions.](https://stackoverflow.com/questions/12146914/what-is-the-difference-between-linear-regression-and-logistic-regression)
- [stochasticLinearRegression](../../../sql-reference/aggregate-functions/reference/stochasticlinearregression.md#agg_functions-stochasticlinearregression)
- [Difference between linear and logistic regressions](https://stackoverflow.com/questions/12146914/what-is-the-difference-between-linear-regression-and-logistic-regression)