mirror of https://github.com/ClickHouse/ClickHouse.git synced 2024-12-13 01:41:59 +00:00

Yong Wang 6f6a103c5f Kusto-phase2: remove dialect auto option. use table function instead of subquery for kql() function fix type.

2023-08-26 07:40:39 -07:00

52 KiB

Raw Blame History

KQL implemented features

October 9, 2022

operator

distinct
Customers | distinct *
Customers | distinct Occupation
Customers | distinct Occupation, Education
Customers | where Age <30 | distinct Occupation, Education
Customers | where Age <30 | order by Age| distinct Occupation, Education

String functions

reverse
print reverse(123)
print reverse(123.34)
print reverse('clickhouse')
print reverse(3h)
print reverse(datetime(2017-1-1 12:23:34))
parse_command_line
print parse_command_line('echo \"hello world!\" print$?', \"Windows\")
parse_csv
print result=parse_csv('aa,b,cc')
print result_multi_record=parse_csv('record1,a,b,c\nrecord2,x,y,z')
parse_json
print parse_json( dynamic([1, 2, 3]))
print parse_json('{"a":123.5, "b":"{\\"c\\":456}"}')
extract_json
print extract_json( "$.a" , '{"a":123, "b":"{\\"c\\":456}"}' , typeof(int))
parse_version
print parse_version('1')
print parse_version('1.2.3.40')

Bug fixed

correct array index in expression
array index should start with 0

Summarize should generate alias or use correct columns

if bin is used , the column should be in select list if no alias include
if no column included in aggregate functions, ( like count() ), should has alias with fun name + '',e.g count
if column name included in aggregate functions, should have fun name + "_" + column name , like count(Age) -> count_Age

if argument of an aggregate functions is an exprision, Columns1 ... Columnsn should be used as alias

Customers | summarize count() by bin(Age, 10)  
┌─Age─┬─count_─┐
│  40 │      2 │
│  20 │      6 │
│  30 │      4 │
└─────┴────────┘
Customers | summarize count(Age) by bin(Age, 10)  
┌─Age─┬─count_Age─┐
│  40 │         2 │
│  20 │         6 │
│  30 │         4 │
└─────┴───────────┘
Customers | summarize count(Age+1) by bin(Age+1, 10)
┌─Columns1─┬─count_─┐
│       40 │      2 │
│       20 │      6 │
│       30 │      4 │
└──────────┴────────┘

extend doesn't replace existing columns
throw exception if use quoted string as alias
repeat() doesn't work with count argument as negative value
substring() doesn't work right with negative offsets
endofmonth() doesn't return correct result
split() outputs array instead of string
split() returns empty string when arg goes out of bound
split() doesn't work with negative index

September 26, 2022

Bug fixed :

"select * from kql" results in syntax error
Parsing ipv4 with arrayStringConcat throws exception
CH Client crashes on invalid function name
extract() doesn't work right with 4th argument i.e typeof()
parse_ipv6_mask return incorrect results
timespan returns wrong output in seconds
timespan doesn't work for nanoseconds and tick
totimespan() doesn't work for nanoseconds and tick timespan unit
data types should throw exception in certain cases
decimal does not support scientific notation
extend statement causes client core dumping
extend crashes with array sorting
Core dump happens when WHERE keyword doesn't follow field name
Null values are missing in the result of `make_list_with_nulls'
trim functions use non-unique aliases
format_ipv4_mask returns incorrect mask value

September 12, 2022

Extend operator

https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/extendoperator
T | extend T | extend duration = endTime - startTime
T | project endTime, startTime | extend duration = endTime - startTime

Array functions

array_reverse
print array_reverse(dynamic(["this", "is", "an", "example"])) == dynamic(["example","an","is","this"])
array_rotate_left
print array_rotate_left(dynamic([1,2,3,4,5]), 2) == dynamic([3,4,5,1,2])
print array_rotate_left(dynamic([1,2,3,4,5]), -2) == dynamic([4,5,1,2,3])
array_rotate_right
print array_rotate_right(dynamic([1,2,3,4,5]), -2) == dynamic([3,4,5,1,2])
print array_rotate_right(dynamic([1,2,3,4,5]), 2) == dynamic([4,5,1,2,3])
array_shift_left
print array_shift_left(dynamic([1,2,3,4,5]), 2) == dynamic([3,4,5,null,null])
print array_shift_left(dynamic([1,2,3,4,5]), -2) == dynamic([null,null,1,2,3])
print array_shift_left(dynamic([1,2,3,4,5]), 2, -1) == dynamic([3,4,5,-1,-1])
print array_shift_left(dynamic(['a', 'b', 'c']), 2) == dynamic(['c','',''])
array_shift_right
print array_shift_right(dynamic([1,2,3,4,5]), -2) == dynamic([3,4,5,null,null])
print array_shift_right(dynamic([1,2,3,4,5]), 2) == dynamic([null,null,1,2,3])
print array_shift_right(dynamic([1,2,3,4,5]), -2, -1) == dynamic([3,4,5,-1,-1])
print array_shift_right(dynamic(['a', 'b', 'c']), -2) == dynamic(['c','',''])
pack_array
print x = 1, y = x * 2, z = y * 2, pack_array(x,y,z)

Please note that only arrays of elements of the same type may be created at this time. The underlying reasons are explained under the release note section of the dynamic data type.
repeat
print repeat(1, 0) == dynamic([])
print repeat(1, 3) == dynamic([1, 1, 1])
print repeat("asd", 3) == dynamic(['asd', 'asd', 'asd'])
print repeat(timespan(1d), 3) == dynamic([86400, 86400, 86400])
print repeat(true, 3) == dynamic([true, true, true])
zip
print zip(dynamic([1,3,5]), dynamic([2,4,6]))

Please note that only arrays of the same type are supported in our current implementation. The underlying reasons are explained under the release note section of the dynamic data type.

Data types

dynamic
print isnull(dynamic(null))
print dynamic(1) == 1
print dynamic(timespan(1d)) == 86400
print dynamic([1, 2, 3])
print dynamic([[1], [2], [3]])
print dynamic(['a', "b", 'c'])

According to the KQL specifications dynamic is a literal, which means that no function calls are permitted. Expressions producing literals such as datetime and timespan and their aliases (ie. date and time, respectively) along with nested dynamic literals are allowed.

Please note that our current implementation supports only scalars and arrays made up of elements of the same type. Support for mixed types and property bags is deferred for now, based on our understanding of the required effort and discussion with representatives of the QRadar team.

Mathematical functions

isnan
print isnan(double(nan)) == true
print isnan(4.2) == false
print isnan(4) == false
print isnan(real(+inf)) == false

Set functions

Please note that functions returning arrays with set semantics may return them in any particular order, which may be subject to change in the future.

jaccard_index
print jaccard_index(dynamic([1, 1, 2, 2, 3, 3]), dynamic([1, 2, 3, 4, 4, 4])) == 0.75
print jaccard_index(dynamic([1, 2, 3]), dynamic([])) == 0
print jaccard_index(dynamic([]), dynamic([1, 2, 3, 4])) == 0
print isnan(jaccard_index(dynamic([]), dynamic([])))
print jaccard_index(dynamic([1, 2, 3]), dynamic([4, 5, 6, 7])) == 0
print jaccard_index(dynamic(['a', 's', 'd']), dynamic(['f', 'd', 's', 'a'])) == 0.75
print jaccard_index(dynamic(['Chewbacca', 'Darth Vader', 'Han Solo']), dynamic(['Darth Sidious', 'Darth Vader'])) == 0.25
set_difference
print set_difference(dynamic([1, 1, 2, 2, 3, 3]), dynamic([1, 2, 3])) == dynamic([])
print array_sort_asc(set_difference(dynamic([1, 4, 2, 3, 5, 4, 6]), dynamic([1, 2, 3])))[1] == dynamic([4, 5, 6])
print set_difference(dynamic([4]), dynamic([1, 2, 3])) == dynamic([4])
print array_sort_asc(set_difference(dynamic([1, 2, 3, 4, 5]), dynamic([5]), dynamic([2, 4])))[1] == dynamic([1, 3])
print array_sort_asc(set_difference(dynamic([1, 2, 3]), dynamic([])))[1] == dynamic([1, 2, 3])
print array_sort_asc(set_difference(dynamic(['a', 's', 'd']), dynamic(['a', 'f'])))[1] == dynamic(['d', 's'])
print array_sort_asc(set_difference(dynamic(['Chewbacca', 'Darth Vader', 'Han Solo']), dynamic(['Darth Sidious', 'Darth Vader'])))[1] == dynamic(['Chewbacca', 'Han Solo'])
set_has_element
print set_has_element(dynamic(["this", "is", "an", "example"]), "example") == true
print set_has_element(dynamic(["this", "is", "an", "example"]), "test") == false
print set_has_element(dynamic([1, 2, 3]), 2) == true
print set_has_element(dynamic([1, 2, 3, 4.2]), 4) == false
set_intersect
print array_sort_asc(set_intersect(dynamic([1, 1, 2, 2, 3, 3]), dynamic([1, 2, 3])))[1] == dynamic([1, 2, 3])
print array_sort_asc(set_intersect(dynamic([1, 4, 2, 3, 5, 4, 6]), dynamic([1, 2, 3])))[1] == dynamic([1, 2, 3])
print set_intersect(dynamic([4]), dynamic([1, 2, 3])) == dynamic([])
print set_intersect(dynamic([1, 2, 3, 4, 5]), dynamic([1, 3, 5]), dynamic([2, 5])) == dynamic([5])
print set_intersect(dynamic([1, 2, 3]), dynamic([])) == dynamic([])
print set_intersect(dynamic(['a', 's', 'd']), dynamic(['a', 'f'])) == dynamic(['a'])
print set_intersect(dynamic(['Chewbacca', 'Darth Vader', 'Han Solo']), dynamic(['Darth Sidious', 'Darth Vader'])) == dynamic(['Darth Vader'])
set_union
print array_sort_asc(set_union(dynamic([1, 1, 2, 2, 3, 3]), dynamic([1, 2, 3])))[1] == dynamic([1, 2, 3])
print array_sort_asc(set_union(dynamic([1, 4, 2, 3, 5, 4, 6]), dynamic([1, 2, 3])))[1] == dynamic([1, 2, 3, 4, 5, 6])
print array_sort_asc(set_union(dynamic([4]), dynamic([1, 2, 3])))[1] == dynamic([1, 2, 3, 4])
print array_sort_asc(set_union(dynamic([1, 3, 4]), dynamic([5]), dynamic([2, 4])))[1] == dynamic([1, 2, 3, 4, 5])
print array_sort_asc(set_union(dynamic([1, 2, 3]), dynamic([])))[1] == dynamic([1, 2, 3])
print array_sort_asc(set_union(dynamic(['a', 's', 'd']), dynamic(['a', 'f'])))[1] == dynamic(['a', 'd', 'f', 's'])
print array_sort_asc(set_union(dynamic(['Chewbacca', 'Darth Vader', 'Han Solo']), dynamic(['Darth Sidious', 'Darth Vader'])))[1] == dynamic(['Chewbacca', 'Darth Sidious', 'Darth Vader', 'Han Solo'])

August 29, 2022

mv-expand operator

https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/mvexpandoperator Note: expand on array columns only

test cases

CREATE TABLE T
(    
   a UInt8,
   b Array(String),
   c Array(Int8),
   d Array(Int8)
) ENGINE = Memory;

INSERT INTO T VALUES (1, ['Salmon', 'Steak','Chicken'],[1,2,3,4],[5,6,7,8])

T | mv-expand c  
T | mv-expand c, d  
T | mv-expand b | mv-expand c  
T | mv-expand c to typeof(bool)  
T | mv-expand with_itemindex=index b, c, d  
T | mv-expand array_concat(c,d)   
T | mv-expand x = c, y = d   
T | mv-expand xy = array_concat(c, d)  
T | mv-expand with_itemindex=index c,d to typeof(bool)

make-series operator

https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/make-seriesoperator

test case make-series on datetime column

CREATE TABLE T
(    
   Supplier Nullable(String),
   Fruit String ,
   Price Float64,
   Purchase Date 
) ENGINE = Memory;

INSERT INTO T VALUES  ('Aldi','Apple',4,'2016-09-10');
INSERT INTO T VALUES  ('Costco','Apple',2,'2016-09-11');
INSERT INTO T VALUES  ('Aldi','Apple',6,'2016-09-10');
INSERT INTO T VALUES  ('Costco','Snargaluff',100,'2016-09-12');
INSERT INTO T VALUES  ('Aldi','Apple',7,'2016-09-12');
INSERT INTO T VALUES  ('Aldi','Snargaluff',400,'2016-09-11');
INSERT INTO T VALUES  ('Costco','Snargaluff',104,'2016-09-12');
INSERT INTO T VALUES  ('Aldi','Apple',5,'2016-09-12');
INSERT INTO T VALUES  ('Aldi','Snargaluff',600,'2016-09-11');
INSERT INTO T VALUES  ('Costco','Snargaluff',200,'2016-09-10');

Have from and to

T |  make-series PriceAvg = avg(Price) default=0 on Purchase from datetime(2016-09-10)  to datetime(2016-09-13) step 1d by Supplier, Fruit

Has from , without to

T |  make-series PriceAvg = avg(Price) default=0 on Purchase from datetime(2016-09-10)  step 1d by Supplier, Fruit

Without from , has to

T |  make-series PriceAvg = avg(Price) default=0 on Purchase  to datetime(2016-09-13) step 1d by Supplier, Fruit

Without from , without to

T |  make-series PriceAvg = avg(Price) default=0 on Purchase step 1d by Supplier, Fruit

Without by clause

T |  make-series PriceAvg = avg(Price) default=0 on Purchase step 1d

Without aggregation alias

T |  make-series avg(Price) default=0 on Purchase step 1d by Supplier, Fruit

Has group expression alias

T |  make-series avg(Price) default=0 on Purchase step 1d by Supplier_Name = Supplier, Fruit

Use different step value

T |  make-series PriceAvg = avg(Price) default=0 on Purchase from datetime(2016-09-10)  to datetime(2016-09-13) step 3d by Supplier, Fruit

test case make-series on numeric column

CREATE TABLE T2
(    
   Supplier Nullable(String),
   Fruit String ,
   Price Int32,
   Purchase Int32  
) ENGINE = Memory;

INSERT INTO T2 VALUES  ('Aldi','Apple',4,10);
INSERT INTO T2 VALUES  ('Costco','Apple',2,11);
INSERT INTO T2 VALUES  ('Aldi','Apple',6,10);
INSERT INTO T2 VALUES  ('Costco','Snargaluff',100,12);
INSERT INTO T2 VALUES  ('Aldi','Apple',7,12);
INSERT INTO T2 VALUES  ('Aldi','Snargaluff',400,11);
INSERT INTO T2 VALUES  ('Costco','Snargaluff',104,12);
INSERT INTO T2 VALUES  ('Aldi','Apple',5,12);
INSERT INTO T2 VALUES  ('Aldi','Snargaluff',600,11);
INSERT INTO T2 VALUES  ('Costco','Snargaluff',200,10);

Have from and to

T2 | make-series PriceAvg=avg(Price) default=0 on Purchase from 10 to  15 step  1.0  by Supplier, Fruit;

Has from , without to

T2 | make-series PriceAvg=avg(Price) default=0 on Purchase from 10 step  1.0  by Supplier, Fruit;

Without from , has to

T2 | make-series PriceAvg=avg(Price) default=0 on Purchase to 18 step  4.0  by Supplier, Fruit;

Without from , without to

T2 | make-series PriceAvg=avg(Price) default=0 on Purchase step  2.0  by Supplier, Fruit;

Without by clause

T2 | make-series PriceAvg=avg(Price) default=0 on Purchase step  2.0;

Aggregate Functions

bin
print bin(4.5, 1)
print bin(time(16d), 7d)
print bin(datetime(1970-05-11 13:45:07), 1d)
stdev
Customers | summarize t = stdev(Age) by FirstName
stdevif
Customers | summarize t = stdevif(Age, Age < 10) by FirstName
binary_all_and
Customers | summarize t = binary_all_and(Age) by FirstName
binary_all_or
Customers | summarize t = binary_all_or(Age) by FirstName
binary_all_xor
Customers | summarize t = binary_all_xor(Age) by FirstName
percentiles
Customers | summarize percentiles(Age, 30, 40, 50, 60, 70) by FirstName
percentilesw
DataTable | summarize t = percentilesw(Bucket, Frequency, 50, 75, 99.9)
percentile
Customers | summarize t = percentile(Age, 50) by FirstName
percentilew
DataTable | summarize t = percentilew(Bucket, Frequency, 50)

Dynamic functions

array_sort_asc
Only support the constant dynamic array.
Returns an array. So, each element of the input has to be of same datatype.
print t = array_sort_asc(dynamic([null, 'd', 'a', 'c', 'c']))
print t = array_sort_asc(dynamic([4, 1, 3, 2]))
print t = array_sort_asc(dynamic(['b', 'a', 'c']), dynamic(['q', 'p', 'r']))
print t = array_sort_asc(dynamic(['q', 'p', 'r']), dynamic(['clickhouse','hello', 'world']))
print t = array_sort_asc( dynamic(['d', null, 'a', 'c', 'c']) , false)
print t = array_sort_asc( dynamic(['d', null, 'a', 'c', 'c']) , 1 > 2)
print t = array_sort_asc( dynamic([null, 'd', null, null, 'a', 'c', 'c', null, null, null]) , false)
print t = array_sort_asc( dynamic([null, null, null]) , false)
print t = array_sort_asc(dynamic([2, 1, null,3]), dynamic([20, 10, 40, 30]), 1 > 2)
print t = array_sort_asc(dynamic([2, 1, null,3]), dynamic([20, 10, 40, 30, 50, 3]), 1 > 2)
array_sort_desc (only support the constant dynamic array)

print t = array_sort_desc(dynamic([null, 'd', 'a', 'c', 'c']))
print t = array_sort_desc(dynamic([4, 1, 3, 2]))
print t = array_sort_desc(dynamic(['b', 'a', 'c']), dynamic(['q', 'p', 'r']))
print t = array_sort_desc(dynamic(['q', 'p', 'r']), dynamic(['clickhouse','hello', 'world']))
print t = array_sort_desc( dynamic(['d', null, 'a', 'c', 'c']) , false)
print t = array_sort_desc( dynamic(['d', null, 'a', 'c', 'c']) , 1 > 2)
print t = array_sort_desc( dynamic([null, 'd', null, null, 'a', 'c', 'c', null, null, null]) , false)
print t = array_sort_desc( dynamic([null, null, null]) , false)
print t = array_sort_desc(dynamic([2, 1, null, 3]), dynamic([20, 10, 40, 30]), 1 > 2)
print t = array_sort_desc(dynamic([2, 1, null,3, null]), dynamic([20, 10, 40, 30, 50, 3]), 1 > 2)
array_concat
print array_concat(dynamic([1, 2, 3]), dynamic([4, 5]), dynamic([6, 7, 8, 9])) == dynamic([1, 2, 3, 4, 5, 6, 7, 8, 9])
array_iff / array_iif
print array_iif(dynamic([true, false, true]), dynamic([1, 2, 3]), dynamic([4, 5, 6])) == dynamic([1, 5, 3])
print array_iif(dynamic([true, false, true]), dynamic([1, 2, 3, 4]), dynamic([4, 5, 6])) == dynamic([1, 5, 3])
print array_iif(dynamic([true, false, true, false]), dynamic([1, 2, 3, 4]), dynamic([4, 5, 6])) == dynamic([1, 5, 3, null])
print array_iif(dynamic([1, 0, -1, 44, 0]), dynamic([1, 2, 3, 4]), dynamic([4, 5, 6])) == dynamic([1, 5, 3, 4, null])
array_slice
print array_slice(dynamic([1,2,3]), 1, 2) == dynamic([2, 3])
print array_slice(dynamic([1,2,3,4,5]), 2, -1) == dynamic([3, 4, 5])
print array_slice(dynamic([1,2,3,4,5]), -3, -2) == dynamic([3, 4])
array_split
print array_split(dynamic([1,2,3,4,5]), 2) == dynamic([[1,2],[3,4,5]])
print array_split(dynamic([1,2,3,4,5]), dynamic([1,3])) == dynamic([[1],[2,3],[4,5]])

DateTimeFunctions

ago
print ago(2h)
endofday
print endofday(datetime(2017-01-01 10:10:17), -1)
print endofday(datetime(2017-01-01 10:10:17), 1)
print endofday(datetime(2017-01-01 10:10:17))
endofmonth
print endofmonth(datetime(2017-01-01 10:10:17), -1)
print endofmonth(datetime(2017-01-01 10:10:17), 1)
print endofmonth(datetime(2017-01-01 10:10:17))
endofweek
print endofweek(datetime(2017-01-01 10:10:17), 1)
print endofweek(datetime(2017-01-01 10:10:17), -1)
print endofweek(datetime(2017-01-01 10:10:17))
endofyear
print endofyear(datetime(2017-01-01 10:10:17), -1)
print endofyear(datetime(2017-01-01 10:10:17), 1)
print endofyear(datetime(2017-01-01 10:10:17))
make_datetime
print make_datetime(2017,10,01)
print make_datetime(2017,10,01,12,10)
print make_datetime(2017,10,01,12,11,0.1234567)
datetime_diff
print datetime_diff('year',datetime(2017-01-01),datetime(2000-12-31))
print datetime_diff('quarter',datetime(2017-07-01),datetime(2017-03-30))
print datetime_diff('minute',datetime(2017-10-30 23:05:01),datetime(2017-10-30 23:00:59))
unixtime_microseconds_todatetime
print unixtime_microseconds_todatetime(1546300800000000)
unixtime_milliseconds_todatetime
print unixtime_milliseconds_todatetime(1546300800000)
unixtime_nanoseconds_todatetime
print unixtime_nanoseconds_todatetime(1546300800000000000)
datetime_part
print datetime_part('day', datetime(2017-10-30 01:02:03.7654321))
datetime_add
print datetime_add('day',1,datetime(2017-10-30 01:02:03.7654321))
format_timespan
print format_timespan(time(1d), 'd-[hh:mm:ss]')
print format_timespan(time('12:30:55.123'), 'ddddd-[hh:mm:ss.ffff]')
format_datetime
print format_datetime(todatetime('2009-06-15T13:45:30.6175425'), 'yy-M-dd [H:mm:ss.fff]')
print format_datetime(datetime(2015-12-14 02:03:04.12345), 'y-M-d h:m:s tt')
todatetime
print todatetime('2014-05-25T08:20:03.123456Z')
print todatetime('2014-05-25 20:03.123')
[totimespan] (https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/totimespanfunction) print totimespan('0.01:34:23') print totimespan(1d)

August 15, 2022

double quote support
print res = strcat("double ","quote")

Aggregate functions

bin_at
print res = bin_at(6.5, 2.5, 7)
print res = bin_at(1h, 1d, 12h)
print res = bin_at(datetime(2017-05-15 10:20:00.0), 1d, datetime(1970-01-01 12:00:00.0))
print res = bin_at(datetime(2017-05-17 10:20:00.0), 7d, datetime(2017-06-04 00:00:00.0))
array_index_of
Supports only basic lookup. Do not support start_index, length and occurrence
print output = array_index_of(dynamic(['John', 'Denver', 'Bob', 'Marley']), 'Marley')
print output = array_index_of(dynamic([1, 2, 3]), 2)
array_sum
print output = array_sum(dynamic([2, 5, 3]))
print output = array_sum(dynamic([2.5, 5.5, 3]))
array_length
print output = array_length(dynamic(['John', 'Denver', 'Bob', 'Marley']))
print output = array_length(dynamic([1, 2, 3]))

Conversion

tobool / toboolean print tobool(true) == true print toboolean(false) == false print tobool(0) == false print toboolean(19819823) == true print tobool(-2) == true print isnull(toboolean('a')) print tobool('true') == true print toboolean('false') == false
todouble / toreal print todouble(4) == 4 print toreal(4.2) == 4.2 print isnull(todouble('a')) print toreal('-0.3') == -0.3
toint print isnull(toint('a'))
print toint(4) == 4
print toint('4') == 4
print isnull(toint(4.2))
tostring print tostring(123) == '123'
print tostring('asd') == 'asd'

Data Types

dynamic
Supports only 1D array
print output = dynamic(['a', 'b', 'c'])
print output = dynamic([1, 2, 3])
bool,boolean
print bool(1)
print boolean(0)
datetime
print datetime(2015-12-31 23:59:59.9)
print datetime('2015-12-31 23:59:59.9')
print datetime("2015-12-31:)
guid
print guid(74be27de-1e4e-49d9-b579-fe0b331d3642)
print guid('74be27de-1e4e-49d9-b579-fe0b331d3642')
print guid('74be27de1e4e49d9b579fe0b331d3642')
int
print int(1)
long
print long(16)
real
print real(1)
timespan ,time
Note the timespan is used for calculating datatime, so the output is in seconds. e.g. time(1h) = 3600 print 1d
print 30m
print time('0.12:34:56.7')
print time(2h)
print timespan(2h)

StringFunctions

base64_encode_fromguid
print Quine = base64_encode_fromguid('ae3133f2-6e22-49ae-b06a-16e6a9b212eb')
base64_decode_toarray
print base64_decode_toarray('S3VzdG8=')
base64_decode_toguid
print base64_decode_toguid('YWUzMTMzZjItNmUyMi00OWFlLWIwNmEtMTZlNmE5YjIxMmVi')
replace_regex
print replace_regex('Hello, World!', '.', '\\0\\0')
has_any_index
print idx = has_any_index('this is an example', dynamic(['this', 'example']))
translate
print translate('krasp', 'otsku', 'spark')
trim
print trim('--', '--https://bing.com--')
trim_end
print trim_end('.com', 'bing.com')
trim_start
print trim_start('[^\\w]+', strcat('- ','Te st1','// $'))

DateTimeFunctions

startofyear
print startofyear(datetime(2017-01-01 10:10:17), -1)
print startofyear(datetime(2017-01-01 10:10:17), 0)
print startofyear(datetime(2017-01-01 10:10:17), 1)
weekofyear
print week_of_year(datetime(2020-12-31))
print week_of_year(datetime(2020-06-15))
print week_of_year(datetime(1970-01-01))
print week_of_year(datetime(2000-01-01))
startofweek
print startofweek(datetime(2017-01-01 10:10:17), -1)
print startofweek(datetime(2017-01-01 10:10:17), 0)
print startofweek(datetime(2017-01-01 10:10:17), 1)
startofmonth
print startofmonth(datetime(2017-01-01 10:10:17), -1)
print startofmonth(datetime(2017-01-01 10:10:17), 0)
print startofmonth(datetime(2017-01-01 10:10:17), 1)
startofday
print startofday(datetime(2017-01-01 10:10:17), -1)
print startofday(datetime(2017-01-01 10:10:17), 0)
print startofday(datetime(2017-01-01 10:10:17), 1)
monthofyear
print monthofyear(datetime("2015-12-14"))
hourofday
print hourofday(datetime(2015-12-14 18:54:00))
getyear
print getyear(datetime(2015-10-12))
getmonth
print getmonth(datetime(2015-10-12))
dayofyear
print dayofyear(datetime(2015-12-14))
dayofmonth
print (datetime(2015-12-14))
unixtime_seconds_todatetime
print unixtime_seconds_todatetime(1546300800)
dayofweek
print dayofweek(datetime(2015-12-20))
now
print now()
print now(2d)
print now(-2h)
print now(5microseconds)
print now(5seconds)
print now(6minutes)
print now(-2d)
print now(time(1d))

Binary functions

binary_and
print binary_and(15, 3) == 3
print binary_and(1, 2) == 0
binary_not
print binary_not(1) == -2
binary_or
print binary_or(3, 8) == 11
print binary_or(1, 2) == 3
binary_shift_left
print binary_shift_left(1, 1) == 2
print binary_shift_left(1, 64) == 1
binary_shift_right
print binary_shift_right(1, 1) == 0 print binary_shift_right(1, 64) == 1
binary_xor
print binary_xor(1, 3) == 2
bitset_count_ones
print bitset_count_ones(42) == 3

IP functions

format_ipv4
print format_ipv4('192.168.1.255', 24) == '192.168.1.0'
print format_ipv4(3232236031, 24) == '192.168.1.0'
format_ipv4_mask
print format_ipv4_mask('192.168.1.255', 24) == '192.168.1.0/24'
print format_ipv4_mask(3232236031, 24) == '192.168.1.0/24'
ipv4_compare
print ipv4_compare('127.0.0.1', '127.0.0.1') == 0
print ipv4_compare('192.168.1.1', '192.168.1.255') < 0
print ipv4_compare('192.168.1.1/24', '192.168.1.255/24') == 0
print ipv4_compare('192.168.1.1', '192.168.1.255', 24) == 0
ipv4_is_match
print ipv4_is_match('127.0.0.1', '127.0.0.1') == true
print ipv4_is_match('192.168.1.1', '192.168.1.255') == false
print ipv4_is_match('192.168.1.1/24', '192.168.1.255/24') == true
print ipv4_is_match('192.168.1.1', '192.168.1.255', 24) == true
ipv6_compare
print ipv6_compare('::ffff:7f00:1', '127.0.0.1') == 0
print ipv6_compare('fe80::85d:e82c:9446:7994', 'fe80::85d:e82c:9446:7995') < 0
print ipv6_compare('192.168.1.1/24', '192.168.1.255/24') == 0
print ipv6_compare('fe80::85d:e82c:9446:7994/127', 'fe80::85d:e82c:9446:7995/127') == 0
print ipv6_compare('fe80::85d:e82c:9446:7994', 'fe80::85d:e82c:9446:7995', 127) == 0
ipv6_is_match
print ipv6_is_match('::ffff:7f00:1', '127.0.0.1') == true
print ipv6_is_match('fe80::85d:e82c:9446:7994', 'fe80::85d:e82c:9446:7995') == false
print ipv6_is_match('192.168.1.1/24', '192.168.1.255/24') == true
print ipv6_is_match('fe80::85d:e82c:9446:7994/127', 'fe80::85d:e82c:9446:7995/127') == true
print ipv6_is_match('fe80::85d:e82c:9446:7994', 'fe80::85d:e82c:9446:7995', 127) == true
parse_ipv4_mask
print parse_ipv4_mask('127.0.0.1', 24) == 2130706432
print parse_ipv4_mask('192.1.168.2', 31) == 3221334018
print parse_ipv4_mask('192.1.168.3', 31) == 3221334018
print parse_ipv4_mask('127.2.3.4', 32) == 2130838276
parse_ipv6_mask
print parse_ipv6_mask('127.0.0.1', 24) == '0000:0000:0000:0000:0000:ffff:7f00:0000'
print parse_ipv6_mask('fe80::85d:e82c:9446:7994', 120) == 'fe80:0000:0000:0000:085d:e82c:9446:7900'

August 1, 2022

The config setting to allow modify dialect setting.

Set dialect setting in server configuration XML at user level(users.xml). This sets the dialect at server startup and CH will do query parsing for all users with default profile according to dialect value.

For example: <profiles>  <default> <load_balancing>random</load_balancing> <dialect>kusto</dialect> </default>

Query can be executed with HTTP client as below once dialect is set in users.xml echo "KQL query" | curl -sS "http://localhost:8123/?" --data-binary @-
To execute the query using clickhouse-client , Update clickhouse-client.xml as below and connect clickhouse-client with --config-file option (clickhouse-client --config-file=<config-file path>)

<config> <dialect>kusto</dialect> </config>

OR pass dialect setting with '--'. For example : clickhouse-client --dialect='kusto' -q "KQL query"

strcmp (https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/strcmpfunction)
print strcmp('abc','ABC')
parse_url (https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/parseurlfunction)
print Result = parse_url('scheme://username:password@www.google.com:1234/this/is/a/path?k1=v1&k2=v2#fragment')
parse_urlquery (https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/parseurlqueryfunction)
print Result = parse_urlquery('k1=v1&k2=v2&k3=v3')
print operator (https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/printoperator)
print x=1, s=strcat('Hello', ', ', 'World!')
Aggregate Functions:
make_list()
Customers | summarize t = make_list(FirstName) by FirstName Customers | summarize t = make_list(FirstName, 10) by FirstName
make_list_if()
Customers | summarize t = make_list_if(FirstName, Age > 10) by FirstName Customers | summarize t = make_list_if(FirstName, Age > 10, 10) by FirstName
make_list_with_nulls()
Customers | summarize t = make_list_with_nulls(Age) by FirstName
make_set()
Customers | summarize t = make_set(FirstName) by FirstName Customers | summarize t = make_set(FirstName, 10) by FirstName
make_set_if()
Customers | summarize t = make_set_if(FirstName, Age > 10) by FirstName Customers | summarize t = make_set_if(FirstName, Age > 10, 10) by FirstName

IP functions

The following functions now support arbitrary expressions as their argument:

July 17, 2022

Renamed dialect from sql_dialect to dialect

set dialect='clickhouse'
set dialect='kusto'

IP functions

parse_ipv4 "Customers | project parse_ipv4('127.0.0.1')"
parse_ipv6 "Customers | project parse_ipv6('127.0.0.1')"

Please note that the functions listed below only take constant parameters for now. Further improvement is to be expected to support expressions.

ipv4_is_private "Customers | project ipv4_is_private('192.168.1.6/24')" "Customers | project ipv4_is_private('192.168.1.6')"
ipv4_is_in_range "Customers | project ipv4_is_in_range('127.0.0.1', '127.0.0.1')" "Customers | project ipv4_is_in_range('192.168.1.6', '192.168.1.1/24')"
ipv4_netmask_suffix "Customers | project ipv4_netmask_suffix('192.168.1.1/24')" "Customers | project ipv4_netmask_suffix('192.168.1.1')"

string functions

support subquery for in orerator (https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/in-cs-operator)
(subquery need to be wrapped with bracket inside bracket)

Customers | where Age in ((Customers|project Age|where Age < 30)) Note: case-insensitive not supported yet
has_all (https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/has-all-operator)
Customers|where Occupation has_any ('Skilled','abcd')
note : subquery not supported yet
has _any (https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/has-anyoperator)
Customers|where Occupation has_all ('Skilled','abcd')
note : subquery not supported yet
countof (https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/countoffunction)
Customers | project countof('The cat sat on the mat', 'at')
Customers | project countof('The cat sat on the mat', 'at', 'normal')
Customers | project countof('The cat sat on the mat', 'at', 'regex')
extract ( https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/extractfunction)
Customers | project extract('(\\b[A-Z]+\\b).+(\\b\\d+)', 0, 'The price of PINEAPPLE ice cream is 20')
Customers | project extract('(\\b[A-Z]+\\b).+(\\b\\d+)', 1, 'The price of PINEAPPLE ice cream is 20')
Customers | project extract('(\\b[A-Z]+\\b).+(\\b\\d+)', 2, 'The price of PINEAPPLE ice cream is 20')
Customers | project extract('(\\b[A-Z]+\\b).+(\\b\\d+)', 3, 'The price of PINEAPPLE ice cream is 20')
Customers | project extract('(\\b[A-Z]+\\b).+(\\b\\d+)', 2, 'The price of PINEAPPLE ice cream is 20', typeof(real))
extract_all (https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/extractallfunction)

Customers | project extract_all('(\\w)(\\w+)(\\w)','The price of PINEAPPLE ice cream is 20')
note: captureGroups not supported yet
split (https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/splitfunction)
Customers | project split('aa_bb', '_')
Customers | project split('aaa_bbb_ccc', '_', 1)
Customers | project split('', '_')
Customers | project split('a__b', '_')
Customers | project split('aabbcc', 'bb')
strcat_delim (https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/strcat-delimfunction)
Customers | project strcat_delim('-', '1', '2', 'A') , 1s)
Customers | project strcat_delim('-', '1', '2', strcat('A','b'))
note: only support string now.
indexof (https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/indexoffunction)
Customers | project indexof('abcdefg','cde')
Customers | project indexof('abcdefg','cde',2)
Customers | project indexof('abcdefg','cde',6)
note: length and occurrence not supported yet

July 4, 2022

sql_dialect

default is clickhouse
set sql_dialect='clickhouse'
only process kql
set sql_dialect='kusto'

KQL() function

create table
CREATE TABLE kql_table4 ENGINE = Memory AS select *, now() as new_column From kql(Customers | project LastName,Age);
verify the content of kql_table
select * from kql_table
insert into table
create a tmp table:
```
CREATE TABLE temp
(    
    FirstName Nullable(String),
    LastName String, 
    Age Nullable(UInt8)
) ENGINE = Memory;
```
INSERT INTO temp select * from kql(Customers|project FirstName,LastName,Age);
verify the content of temp
select * from temp
Select from kql()
Select * from kql(Customers|project FirstName)

KQL operators:

Tabular expression statements
Customers
Select Column
Customers | project FirstName,LastName,Occupation
Limit returned results
Customers | project FirstName,LastName,Occupation | take 1 | take 3
sort, order
Customers | order by Age desc , FirstName asc
Filter
Customers | where Occupation == 'Skilled Manual'
summarize
Customers |summarize max(Age) by Occupation

KQL string operators and functions

contains
Customers |where Education contains 'degree'
!contains
Customers |where Education !contains 'degree'
contains_cs
Customers |where Education contains 'Degree'
!contains_cs
Customers |where Education !contains 'Degree'
endswith
Customers | where FirstName endswith 'RE'
!endswith
Customers | where !FirstName endswith 'RE'
endswith_cs
Customers | where FirstName endswith_cs 're'
!endswith_cs
Customers | where FirstName !endswith_cs 're'
==
Customers | where Occupation == 'Skilled Manual'
!=
Customers | where Occupation != 'Skilled Manual'
has
Customers | where Occupation has 'skilled'
!has
Customers | where Occupation !has 'skilled'
has_cs
Customers | where Occupation has 'Skilled'
!has_cs
Customers | where Occupation !has 'Skilled'
hasprefix
Customers | where Occupation hasprefix_cs 'Ab'
!hasprefix
Customers | where Occupation !hasprefix_cs 'Ab'
hasprefix_cs
Customers | where Occupation hasprefix_cs 'ab'
!hasprefix_cs
Customers | where Occupation! hasprefix_cs 'ab'
hassuffix
Customers | where Occupation hassuffix 'Ent'
!hassuffix
Customers | where Occupation !hassuffix 'Ent'
hassuffix_cs
Customers | where Occupation hassuffix 'ent'
!hassuffix_cs
Customers | where Occupation hassuffix 'ent'
in
Customers |where Education in ('Bachelors','High School')
!in
Customers | where Education !in ('Bachelors','High School')
matches regex
Customers | where FirstName matches regex 'P.*r'
startswith
Customers | where FirstName startswith 'pet'
!startswith
Customers | where FirstName !startswith 'pet'
startswith_cs
Customers | where FirstName startswith_cs 'pet'
!startswith_cs
Customers | where FirstName !startswith_cs 'pet'
base64_encode_tostring()
Customers | project base64_encode_tostring('Kusto1') | take 1
base64_decode_tostring()
Customers | project base64_decode_tostring('S3VzdG8x') | take 1
isempty()
Customers | where isempty(LastName)
isnotempty()
Customers | where isnotempty(LastName)
isnotnull()
Customers | where isnotnull(FirstName)
isnull()
Customers | where isnull(FirstName)
url_decode()
Customers | project url_decode('https%3A%2F%2Fwww.test.com%2Fhello%20word') | take 1
url_encode()
Customers | project url_encode('https://www.test.com/hello word') | take 1
substring()
Customers | project name_abbr = strcat(substring(FirstName,0,3), ' ', substring(LastName,2))
strcat()
Customers | project name = strcat(FirstName, ' ', LastName)
strlen()
Customers | project FirstName, strlen(FirstName)
strrep()
Customers | project strrep(FirstName,2,'_')
toupper()
Customers | project toupper(FirstName)
tolower()
Customers | project tolower(FirstName)

Aggregate Functions

arg_max()
arg_min()
avg()
avgif()
count()
countif()
max()
maxif()
min()
minif()
sum()
sumif()
dcount()
dcountif()
bin

52 KiB Raw Blame History

KQL implemented features

October 9, 2022

operator

String functions

Bug fixed

September 26, 2022

Bug fixed :

September 12, 2022

Extend operator

Array functions

Data types

Mathematical functions

Set functions

August 29, 2022

mv-expand operator

make-series operator

Aggregate Functions

Dynamic functions

DateTimeFunctions

August 15, 2022

Aggregate functions

Conversion

Data Types

StringFunctions

DateTimeFunctions

Binary functions

IP functions

August 1, 2022

IP functions

July 17, 2022

Renamed dialect from sql_dialect to dialect

IP functions

string functions

July 4, 2022

sql_dialect

KQL() function

KQL operators:

KQL string operators and functions

Aggregate Functions

52 KiB

Raw Blame History