ClickHouse/src/Parsers/Kusto/KQL_ReleaseNote.md


KQL implemented features

October 9, 2022

operator

  • distinct
    Customers | distinct *
    Customers | distinct Occupation
    Customers | distinct Occupation, Education
    Customers | where Age <30 | distinct Occupation, Education
    Customers | where Age <30 | order by Age| distinct Occupation, Education

String functions

  • reverse
    print reverse(123)
    print reverse(123.34)
    print reverse('clickhouse')
    print reverse(3h)
    print reverse(datetime(2017-1-1 12:23:34))

  • parse_command_line
    print parse_command_line('echo \"hello world!\" print$?', "Windows")

  • parse_csv
    print result=parse_csv('aa,b,cc')
    print result_multi_record=parse_csv('record1,a,b,c\nrecord2,x,y,z')

  • parse_json
    print parse_json( dynamic([1, 2, 3]))
    print parse_json('{"a":123.5, "b":"{\\"c\\":456}"}')

  • extract_json
    print extract_json( "$.a" , '{"a":123, "b":"{\\"c\\":456}"}' , typeof(int))

  • parse_version
    print parse_version('1')
    print parse_version('1.2.3.40')

Bug fixed

September 26, 2022

Bugs fixed:

"select * from kql" results in syntax error
Parsing ipv4 with arrayStringConcat throws exception
CH Client crashes on invalid function name
extract() doesn't work correctly with a 4th argument, i.e. typeof()
parse_ipv6_mask returns incorrect results
timespan returns wrong output in seconds
timespan doesn't work for nanoseconds and tick
totimespan() doesn't work for nanoseconds and tick timespan unit
data types should throw exception in certain cases
decimal does not support scientific notation
extend statement causes client core dumping
extend crashes with array sorting
Core dump happens when WHERE keyword doesn't follow field name
Null values are missing in the result of `make_list_with_nulls`
trim functions use non-unique aliases
format_ipv4_mask returns incorrect mask value

September 12, 2022

Extend operator

https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/extendoperator
T | extend duration = endTime - startTime
T | project endTime, startTime | extend duration = endTime - startTime

Array functions

  • array_reverse
    print array_reverse(dynamic(["this", "is", "an", "example"])) == dynamic(["example","an","is","this"])

  • array_rotate_left
    print array_rotate_left(dynamic([1,2,3,4,5]), 2) == dynamic([3,4,5,1,2])
    print array_rotate_left(dynamic([1,2,3,4,5]), -2) == dynamic([4,5,1,2,3])

  • array_rotate_right
    print array_rotate_right(dynamic([1,2,3,4,5]), -2) == dynamic([3,4,5,1,2])
    print array_rotate_right(dynamic([1,2,3,4,5]), 2) == dynamic([4,5,1,2,3])

  • array_shift_left
    print array_shift_left(dynamic([1,2,3,4,5]), 2) == dynamic([3,4,5,null,null])
    print array_shift_left(dynamic([1,2,3,4,5]), -2) == dynamic([null,null,1,2,3])
    print array_shift_left(dynamic([1,2,3,4,5]), 2, -1) == dynamic([3,4,5,-1,-1])
    print array_shift_left(dynamic(['a', 'b', 'c']), 2) == dynamic(['c','',''])

  • array_shift_right
    print array_shift_right(dynamic([1,2,3,4,5]), -2) == dynamic([3,4,5,null,null])
    print array_shift_right(dynamic([1,2,3,4,5]), 2) == dynamic([null,null,1,2,3])
    print array_shift_right(dynamic([1,2,3,4,5]), -2, -1) == dynamic([3,4,5,-1,-1])
    print array_shift_right(dynamic(['a', 'b', 'c']), -2) == dynamic(['c','',''])

  • pack_array
    print x = 1, y = x * 2, z = y * 2, pack_array(x,y,z)

    Please note that only arrays of elements of the same type may be created at this time. The underlying reasons are explained under the release note section of the dynamic data type.

  • repeat
    print repeat(1, 0) == dynamic([])
    print repeat(1, 3) == dynamic([1, 1, 1])
    print repeat("asd", 3) == dynamic(['asd', 'asd', 'asd'])
    print repeat(timespan(1d), 3) == dynamic([86400, 86400, 86400])
    print repeat(true, 3) == dynamic([true, true, true])

  • zip
    print zip(dynamic([1,3,5]), dynamic([2,4,6]))

    Please note that only arrays of the same type are supported in our current implementation. The underlying reasons are explained under the release note section of the dynamic data type.
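
The rotate and shift examples above differ only in what happens to elements pushed past the end of the array: rotation wraps them around, while shifting drops them and pads the vacated slots. The following Python sketch mirrors the integer examples listed above. The helper bodies are illustrative only and are not the ClickHouse implementation; the sketch pads with None where the examples show null and does not model the empty-string padding shown for string arrays.

```python
def array_rotate_left(arr, n):
    """Rotate left by n positions; a negative n rotates to the right."""
    if not arr:
        return []
    k = n % len(arr)  # Python's % keeps k in [0, len) even for negative n
    return arr[k:] + arr[:k]


def array_rotate_right(arr, n):
    """Rotating right by n is rotating left by -n."""
    return array_rotate_left(arr, -n)


def array_shift_left(arr, n, fill=None):
    """Shift left by n positions, padding vacated slots with fill.

    A negative n shifts to the right. The result keeps the input length.
    """
    size = len(arr)
    if n >= 0:
        kept = arr[n:]
        return kept + [fill] * (size - len(kept))
    kept = arr[:max(size + n, 0)]
    return [fill] * (size - len(kept)) + kept


def array_shift_right(arr, n, fill=None):
    """Shifting right by n is shifting left by -n."""
    return array_shift_left(arr, -n, fill)
```

The right-hand variants are defined in terms of the left-hand ones, which matches the symmetry visible in the examples (rotating left by 2 and rotating right by -2 give the same result).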

Data types

  • dynamic
    print isnull(dynamic(null))
    print dynamic(1) == 1
    print dynamic(timespan(1d)) == 86400
    print dynamic([1, 2, 3])
    print dynamic([[1], [2], [3]])
    print dynamic(['a', "b", 'c'])

    According to the KQL specifications dynamic is a literal, which means that no function calls are permitted. Expressions producing literals such as datetime and timespan and their aliases (i.e. date and time, respectively) along with nested dynamic literals are allowed.

    Please note that our current implementation supports only scalars and arrays made up of elements of the same type. Support for mixed types and property bags is deferred for now, based on our understanding of the required effort and discussion with representatives of the QRadar team.

Mathematical functions

  • isnan
    print isnan(double(nan)) == true
    print isnan(4.2) == false
    print isnan(4) == false
    print isnan(real(+inf)) == false

Set functions

Please note that functions returning arrays with set semantics may return the elements in any order, and that order may change in the future.

  • jaccard_index
    print jaccard_index(dynamic([1, 1, 2, 2, 3, 3]), dynamic([1, 2, 3, 4, 4, 4])) == 0.75
    print jaccard_index(dynamic([1, 2, 3]), dynamic([])) == 0
    print jaccard_index(dynamic([]), dynamic([1, 2, 3, 4])) == 0
    print isnan(jaccard_index(dynamic([]), dynamic([])))
    print jaccard_index(dynamic([1, 2, 3]), dynamic([4, 5, 6, 7])) == 0
    print jaccard_index(dynamic(['a', 's', 'd']), dynamic(['f', 'd', 's', 'a'])) == 0.75
    print jaccard_index(dynamic(['Chewbacca', 'Darth Vader', 'Han Solo']), dynamic(['Darth Sidious', 'Darth Vader'])) == 0.25

  • set_difference
    print set_difference(dynamic([1, 1, 2, 2, 3, 3]), dynamic([1, 2, 3])) == dynamic([])
    print array_sort_asc(set_difference(dynamic([1, 4, 2, 3, 5, 4, 6]), dynamic([1, 2, 3])))[1] == dynamic([4, 5, 6])
    print set_difference(dynamic([4]), dynamic([1, 2, 3])) == dynamic([4])
    print array_sort_asc(set_difference(dynamic([1, 2, 3, 4, 5]), dynamic([5]), dynamic([2, 4])))[1] == dynamic([1, 3])
    print array_sort_asc(set_difference(dynamic([1, 2, 3]), dynamic([])))[1] == dynamic([1, 2, 3])
    print array_sort_asc(set_difference(dynamic(['a', 's', 'd']), dynamic(['a', 'f'])))[1] == dynamic(['d', 's'])
    print array_sort_asc(set_difference(dynamic(['Chewbacca', 'Darth Vader', 'Han Solo']), dynamic(['Darth Sidious', 'Darth Vader'])))[1] == dynamic(['Chewbacca', 'Han Solo'])

  • set_has_element
    print set_has_element(dynamic(["this", "is", "an", "example"]), "example") == true
    print set_has_element(dynamic(["this", "is", "an", "example"]), "test") == false
    print set_has_element(dynamic([1, 2, 3]), 2) == true
    print set_has_element(dynamic([1, 2, 3, 4.2]), 4) == false

  • set_intersect
    print array_sort_asc(set_intersect(dynamic([1, 1, 2, 2, 3, 3]), dynamic([1, 2, 3])))[1] == dynamic([1, 2, 3])
    print array_sort_asc(set_intersect(dynamic([1, 4, 2, 3, 5, 4, 6]), dynamic([1, 2, 3])))[1] == dynamic([1, 2, 3])
    print set_intersect(dynamic([4]), dynamic([1, 2, 3])) == dynamic([])
    print set_intersect(dynamic([1, 2, 3, 4, 5]), dynamic([1, 3, 5]), dynamic([2, 5])) == dynamic([5])
    print set_intersect(dynamic([1, 2, 3]), dynamic([])) == dynamic([])
    print set_intersect(dynamic(['a', 's', 'd']), dynamic(['a', 'f'])) == dynamic(['a'])
    print set_intersect(dynamic(['Chewbacca', 'Darth Vader', 'Han Solo']), dynamic(['Darth Sidious', 'Darth Vader'])) == dynamic(['Darth Vader'])

  • set_union
    print array_sort_asc(set_union(dynamic([1, 1, 2, 2, 3, 3]), dynamic([1, 2, 3])))[1] == dynamic([1, 2, 3])
    print array_sort_asc(set_union(dynamic([1, 4, 2, 3, 5, 4, 6]), dynamic([1, 2, 3])))[1] == dynamic([1, 2, 3, 4, 5, 6])
    print array_sort_asc(set_union(dynamic([4]), dynamic([1, 2, 3])))[1] == dynamic([1, 2, 3, 4])
    print array_sort_asc(set_union(dynamic([1, 3, 4]), dynamic([5]), dynamic([2, 4])))[1] == dynamic([1, 2, 3, 4, 5])
    print array_sort_asc(set_union(dynamic([1, 2, 3]), dynamic([])))[1] == dynamic([1, 2, 3])
    print array_sort_asc(set_union(dynamic(['a', 's', 'd']), dynamic(['a', 'f'])))[1] == dynamic(['a', 'd', 'f', 's'])
    print array_sort_asc(set_union(dynamic(['Chewbacca', 'Darth Vader', 'Han Solo']), dynamic(['Darth Sidious', 'Darth Vader'])))[1] == dynamic(['Chewbacca', 'Darth Sidious', 'Darth Vader', 'Han Solo'])
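
The expected values in the set examples above can be reproduced with ordinary set arithmetic. The Python sketch below is only an illustration of the semantics, not the ClickHouse implementation: inputs are de-duplicated first, jaccard_index is the size of the intersection divided by the size of the union, and the set_* helpers accept any number of additional arrays. Results are returned sorted, which plays the same role as the array_sort_asc wrapper used in the examples, since the element order is otherwise unspecified.

```python
import math


def jaccard_index(a, b):
    """|A ∩ B| / |A ∪ B| over distinct elements; NaN when both inputs are empty."""
    sa, sb = set(a), set(b)
    union = sa | sb
    if not union:
        return math.nan
    return len(sa & sb) / len(union)


def set_difference(first, *others):
    """Distinct elements of first that appear in none of the other arrays."""
    return sorted(set(first).difference(*others))


def set_intersect(first, *others):
    """Distinct elements present in every array."""
    return sorted(set(first).intersection(*others))


def set_union(first, *others):
    """Distinct elements present in at least one array."""
    return sorted(set(first).union(*others))
```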

August 29, 2022

mv-expand operator

https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/mvexpandoperator
Note: expand on array columns only

  • test cases
    CREATE TABLE T
    (    
       a UInt8,
       b Array(String),
       c Array(Int8),
       d Array(Int8)
    ) ENGINE = Memory;
    
    INSERT INTO T VALUES (1, ['Salmon', 'Steak','Chicken'],[1,2,3,4],[5,6,7,8])
    
    T | mv-expand c  
    T | mv-expand c, d  
    T | mv-expand b | mv-expand c  
    T | mv-expand c to typeof(bool)  
    T | mv-expand with_itemindex=index b, c, d  
    T | mv-expand array_concat(c,d)   
    T | mv-expand x = c, y = d   
    T | mv-expand xy = array_concat(c, d)  
    T | mv-expand with_itemindex=index c,d to typeof(bool)  
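
The difference between `T | mv-expand c, d` and `T | mv-expand b | mv-expand c` in the test cases above is easiest to see on plain data. The Python sketch below follows the behaviour described in the Kusto documentation: columns expanded in a single mv-expand are zipped position by position (shorter arrays padded with null), while chained mv-expand operators produce every combination. The helper mv_expand and the dictionary representation of rows are assumptions made for illustration, and whether the ClickHouse implementation pads in exactly the same way is not stated in this note.

```python
from itertools import zip_longest


def mv_expand(rows, *columns):
    """Expand the named array columns in parallel, one output row per array position.

    Shorter arrays are padded with None, following the Kusto documentation.
    """
    out = []
    for row in rows:
        for values in zip_longest(*(row[c] for c in columns)):
            expanded = dict(row)                    # copy the untouched columns
            expanded.update(zip(columns, values))   # replace arrays with single items
            out.append(expanded)
    return out


# The single row inserted into T in the test case above.
T = [{"a": 1, "b": ["Salmon", "Steak", "Chicken"], "c": [1, 2, 3, 4], "d": [5, 6, 7, 8]}]

zipped = mv_expand(T, "c", "d")               # like: T | mv-expand c, d
chained = mv_expand(mv_expand(T, "b"), "c")   # like: T | mv-expand b | mv-expand c
```

Here zipped has one row per position of the longest array, whereas chained has one row for every (b, c) pair.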
    

make-series operator

https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/make-seriesoperator

  • test case make-series on datetime column
    CREATE TABLE T
    (    
       Supplier Nullable(String),
       Fruit String ,
       Price Float64,
       Purchase Date 
    ) ENGINE = Memory;
    
    INSERT INTO T VALUES  ('Aldi','Apple',4,'2016-09-10');
    INSERT INTO T VALUES  ('Costco','Apple',2,'2016-09-11');
    INSERT INTO T VALUES  ('Aldi','Apple',6,'2016-09-10');
    INSERT INTO T VALUES  ('Costco','Snargaluff',100,'2016-09-12');
    INSERT INTO T VALUES  ('Aldi','Apple',7,'2016-09-12');
    INSERT INTO T VALUES  ('Aldi','Snargaluff',400,'2016-09-11');
    INSERT INTO T VALUES  ('Costco','Snargaluff',104,'2016-09-12');
    INSERT INTO T VALUES  ('Aldi','Apple',5,'2016-09-12');
    INSERT INTO T VALUES  ('Aldi','Snargaluff',600,'2016-09-11');
    INSERT INTO T VALUES  ('Costco','Snargaluff',200,'2016-09-10');
    
    Have from and to
    T |  make-series PriceAvg = avg(Price) default=0 on Purchase from datetime(2016-09-10)  to datetime(2016-09-13) step 1d by Supplier, Fruit
    
    Has from , without to
    T |  make-series PriceAvg = avg(Price) default=0 on Purchase from datetime(2016-09-10)  step 1d by Supplier, Fruit
    
    Without from , has to
    T |  make-series PriceAvg = avg(Price) default=0 on Purchase  to datetime(2016-09-13) step 1d by Supplier, Fruit
    
    Without from , without to
    T |  make-series PriceAvg = avg(Price) default=0 on Purchase step 1d by Supplier, Fruit
    
    Without by clause
    T |  make-series PriceAvg = avg(Price) default=0 on Purchase step 1d
    
    Without aggregation alias
    T |  make-series avg(Price) default=0 on Purchase step 1d by Supplier, Fruit
    
    Has group expression alias
    T |  make-series avg(Price) default=0 on Purchase step 1d by Supplier_Name = Supplier, Fruit
    
    Use different step value
    T |  make-series PriceAvg = avg(Price) default=0 on Purchase from datetime(2016-09-10)  to datetime(2016-09-13) step 3d by Supplier, Fruit
    
  • test case make-series on numeric column
    CREATE TABLE T2
    (    
       Supplier Nullable(String),
       Fruit String ,
       Price Int32,
       Purchase Int32  
    ) ENGINE = Memory;
    
    INSERT INTO T2 VALUES  ('Aldi','Apple',4,10);
    INSERT INTO T2 VALUES  ('Costco','Apple',2,11);
    INSERT INTO T2 VALUES  ('Aldi','Apple',6,10);
    INSERT INTO T2 VALUES  ('Costco','Snargaluff',100,12);
    INSERT INTO T2 VALUES  ('Aldi','Apple',7,12);
    INSERT INTO T2 VALUES  ('Aldi','Snargaluff',400,11);
    INSERT INTO T2 VALUES  ('Costco','Snargaluff',104,12);
    INSERT INTO T2 VALUES  ('Aldi','Apple',5,12);
    INSERT INTO T2 VALUES  ('Aldi','Snargaluff',600,11);
    INSERT INTO T2 VALUES  ('Costco','Snargaluff',200,10);
    
    Have from and to
    T2 | make-series PriceAvg=avg(Price) default=0 on Purchase from 10 to  15 step  1.0  by Supplier, Fruit;
    
    Has from , without to
    T2 | make-series PriceAvg=avg(Price) default=0 on Purchase from 10 step  1.0  by Supplier, Fruit;
    
    Without from , has to
    T2 | make-series PriceAvg=avg(Price) default=0 on Purchase to 18 step  4.0  by Supplier, Fruit;
    
    Without from , without to
    T2 | make-series PriceAvg=avg(Price) default=0 on Purchase step  2.0  by Supplier, Fruit;
    
    Without by clause
    T2 | make-series PriceAvg=avg(Price) default=0 on Purchase step  2.0;
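
To make the numeric test case above concrete, the following Python sketch computes what `make-series PriceAvg=avg(Price) default=0 on Purchase from 10 to 15 step 1.0 by Supplier, Fruit` is expected to produce for the rows inserted into T2. It assumes the behaviour described in the Kusto documentation: the axis runs from `from` (inclusive) to `to` (exclusive) in increments of `step`, and buckets without data receive the default value. The function make_series_avg is a hypothetical helper, and the exact output layout in ClickHouse may differ.

```python
import math
from collections import defaultdict


def make_series_avg(rows, value, axis, start, stop, step, by, default=0):
    """Average `value` per axis bucket in [start, stop), one series per `by` key."""
    count = math.ceil((stop - start) / step)
    points = [start + i * step for i in range(count)]
    sums = defaultdict(lambda: [0.0] * count)
    hits = defaultdict(lambda: [0] * count)
    for row in rows:
        i = math.floor((row[axis] - start) / step)
        if 0 <= i < count:                      # ignore rows outside the axis range
            key = tuple(row[c] for c in by)
            sums[key][i] += row[value]
            hits[key][i] += 1
    series = {
        key: [sums[key][i] / hits[key][i] if hits[key][i] else default
              for i in range(count)]
        for key in sums
    }
    return points, series


# Rows of T2 from the test case above: (Supplier, Fruit, Price, Purchase).
raw = [
    ("Aldi", "Apple", 4, 10), ("Costco", "Apple", 2, 11), ("Aldi", "Apple", 6, 10),
    ("Costco", "Snargaluff", 100, 12), ("Aldi", "Apple", 7, 12),
    ("Aldi", "Snargaluff", 400, 11), ("Costco", "Snargaluff", 104, 12),
    ("Aldi", "Apple", 5, 12), ("Aldi", "Snargaluff", 600, 11),
    ("Costco", "Snargaluff", 200, 10),
]
T2 = [dict(zip(("Supplier", "Fruit", "Price", "Purchase"), r)) for r in raw]

points, series = make_series_avg(T2, "Price", "Purchase", 10, 15, 1, ("Supplier", "Fruit"))
```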
    

Aggregate Functions

  • bin
    print bin(4.5, 1)
    print bin(time(16d), 7d)
    print bin(datetime(1970-05-11 13:45:07), 1d)

  • stdev
    Customers | summarize t = stdev(Age) by FirstName

  • stdevif
    Customers | summarize t = stdevif(Age, Age < 10) by FirstName

  • binary_all_and
    Customers | summarize t = binary_all_and(Age) by FirstName

  • binary_all_or
    Customers | summarize t = binary_all_or(Age) by FirstName

  • binary_all_xor
    Customers | summarize t = binary_all_xor(Age) by FirstName

  • percentiles
    Customers | summarize percentiles(Age, 30, 40, 50, 60, 70) by FirstName

  • percentilesw
    DataTable | summarize t = percentilesw(Bucket, Frequency, 50, 75, 99.9)

  • percentile
    Customers | summarize t = percentile(Age, 50) by FirstName

  • percentilew
    DataTable | summarize t = percentilew(Bucket, Frequency, 50)

Dynamic functions

  • array_sort_asc
    Only constant dynamic arrays are supported.
    The function returns an array, so all elements of the input must be of the same data type.
    print t = array_sort_asc(dynamic([null, 'd', 'a', 'c', 'c']))
    print t = array_sort_asc(dynamic([4, 1, 3, 2]))
    print t = array_sort_asc(dynamic(['b', 'a', 'c']), dynamic(['q', 'p', 'r']))
    print t = array_sort_asc(dynamic(['q', 'p', 'r']), dynamic(['clickhouse','hello', 'world']))
    print t = array_sort_asc( dynamic(['d', null, 'a', 'c', 'c']) , false)
    print t = array_sort_asc( dynamic(['d', null, 'a', 'c', 'c']) , 1 > 2)
    print t = array_sort_asc( dynamic([null, 'd', null, null, 'a', 'c', 'c', null, null, null]) , false)
    print t = array_sort_asc( dynamic([null, null, null]) , false)
    print t = array_sort_asc(dynamic([2, 1, null,3]), dynamic([20, 10, 40, 30]), 1 > 2)
    print t = array_sort_asc(dynamic([2, 1, null,3]), dynamic([20, 10, 40, 30, 50, 3]), 1 > 2)

  • array_sort_desc (only constant dynamic arrays are supported)

    print t = array_sort_desc(dynamic([null, 'd', 'a', 'c', 'c']))
    print t = array_sort_desc(dynamic([4, 1, 3, 2]))
    print t = array_sort_desc(dynamic(['b', 'a', 'c']), dynamic(['q', 'p', 'r']))
    print t = array_sort_desc(dynamic(['q', 'p', 'r']), dynamic(['clickhouse','hello', 'world']))
    print t = array_sort_desc( dynamic(['d', null, 'a', 'c', 'c']) , false)
    print t = array_sort_desc( dynamic(['d', null, 'a', 'c', 'c']) , 1 > 2)
    print t = array_sort_desc( dynamic([null, 'd', null, null, 'a', 'c', 'c', null, null, null]) , false)
    print t = array_sort_desc( dynamic([null, null, null]) , false)
    print t = array_sort_desc(dynamic([2, 1, null, 3]), dynamic([20, 10, 40, 30]), 1 > 2)
    print t = array_sort_desc(dynamic([2, 1, null,3, null]), dynamic([20, 10, 40, 30, 50, 3]), 1 > 2)

  • array_concat
    print array_concat(dynamic([1, 2, 3]), dynamic([4, 5]), dynamic([6, 7, 8, 9])) == dynamic([1, 2, 3, 4, 5, 6, 7, 8, 9])

  • array_iff / array_iif
    print array_iif(dynamic([true, false, true]), dynamic([1, 2, 3]), dynamic([4, 5, 6])) == dynamic([1, 5, 3])
    print array_iif(dynamic([true, false, true]), dynamic([1, 2, 3, 4]), dynamic([4, 5, 6])) == dynamic([1, 5, 3])
    print array_iif(dynamic([true, false, true, false]), dynamic([1, 2, 3, 4]), dynamic([4, 5, 6])) == dynamic([1, 5, 3, null])
    print array_iif(dynamic([1, 0, -1, 44, 0]), dynamic([1, 2, 3, 4]), dynamic([4, 5, 6])) == dynamic([1, 5, 3, 4, null])

  • array_slice
    print array_slice(dynamic([1,2,3]), 1, 2) == dynamic([2, 3])
    print array_slice(dynamic([1,2,3,4,5]), 2, -1) == dynamic([3, 4, 5])
    print array_slice(dynamic([1,2,3,4,5]), -3, -2) == dynamic([3, 4])

  • array_split
    print array_split(dynamic([1,2,3,4,5]), 2) == dynamic([[1,2],[3,4,5]])
    print array_split(dynamic([1,2,3,4,5]), dynamic([1,3])) == dynamic([[1],[2,3],[4,5]])
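
The indexing rules behind the array_iif, array_slice, and array_split examples above can be summarised in a few lines of Python. This is a sketch inferred from the listed examples only, not the ClickHouse implementation: array_slice uses zero-based, inclusive bounds with negative indices counting from the end; array_split cuts before each given index; array_iif follows the length of the condition array and yields null where the chosen branch has no element at that position.

```python
def array_iif(condition, when_true, when_false):
    """Pick element i from when_true or when_false depending on condition[i]."""
    out = []
    for i, flag in enumerate(condition):
        source = when_true if flag else when_false   # non-zero numbers count as true
        out.append(source[i] if i < len(source) else None)
    return out


def array_slice(arr, start, end):
    """Slice with zero-based, inclusive bounds; negative indices count from the end."""
    size = len(arr)
    lo = max(start + size if start < 0 else start, 0)
    hi = end + size if end < 0 else end
    if hi < lo:
        return []
    return arr[lo:hi + 1]


def array_split(arr, indices):
    """Split before each index; a single int is treated as a one-element index list."""
    cuts = [indices] if isinstance(indices, int) else list(indices)
    bounds = [0] + cuts + [len(arr)]
    return [arr[a:b] for a, b in zip(bounds, bounds[1:])]
```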

DateTimeFunctions

  • ago
    print ago(2h)

  • endofday
    print endofday(datetime(2017-01-01 10:10:17), -1)
    print endofday(datetime(2017-01-01 10:10:17), 1)
    print endofday(datetime(2017-01-01 10:10:17))

  • endofmonth
    print endofmonth(datetime(2017-01-01 10:10:17), -1)
    print endofmonth(datetime(2017-01-01 10:10:17), 1)
    print endofmonth(datetime(2017-01-01 10:10:17))

  • endofweek
    print endofweek(datetime(2017-01-01 10:10:17), 1)
    print endofweek(datetime(2017-01-01 10:10:17), -1)
    print endofweek(datetime(2017-01-01 10:10:17))

  • endofyear
    print endofyear(datetime(2017-01-01 10:10:17), -1)
    print endofyear(datetime(2017-01-01 10:10:17), 1)
    print endofyear(datetime(2017-01-01 10:10:17))

  • make_datetime
    print make_datetime(2017,10,01)
    print make_datetime(2017,10,01,12,10)
    print make_datetime(2017,10,01,12,11,0.1234567)

  • datetime_diff
    print datetime_diff('year',datetime(2017-01-01),datetime(2000-12-31))
    print datetime_diff('quarter',datetime(2017-07-01),datetime(2017-03-30))
    print datetime_diff('minute',datetime(2017-10-30 23:05:01),datetime(2017-10-30 23:00:59))

  • unixtime_microseconds_todatetime
    print unixtime_microseconds_todatetime(1546300800000000)

  • unixtime_milliseconds_todatetime
    print unixtime_milliseconds_todatetime(1546300800000)

  • unixtime_nanoseconds_todatetime
    print unixtime_nanoseconds_todatetime(1546300800000000000)

  • datetime_part
    print datetime_part('day', datetime(2017-10-30 01:02:03.7654321))

  • datetime_add
    print datetime_add('day',1,datetime(2017-10-30 01:02:03.7654321))

  • format_timespan
    print format_timespan(time(1d), 'd-[hh:mm:ss]')
    print format_timespan(time('12:30:55.123'), 'ddddd-[hh:mm:ss.ffff]')

  • format_datetime
    print format_datetime(todatetime('2009-06-15T13:45:30.6175425'), 'yy-M-dd [H:mm:ss.fff]')
    print format_datetime(datetime(2015-12-14 02:03:04.12345), 'y-M-d h:m:s tt')

  • todatetime
    print todatetime('2014-05-25T08:20:03.123456Z')
    print todatetime('2014-05-25 20:03.123')

  • totimespan
    https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/totimespanfunction
    print totimespan('0.01:34:23')
    print totimespan(1d)
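
The three unixtime_*_todatetime examples above all use the same instant expressed in different units. The Python sketch below shows the conversion, assuming the input counts units since 1970-01-01 00:00:00 UTC; the helper unixtime_to_datetime is hypothetical and truncates to microseconds, which is the resolution of Python's datetime rather than a statement about ClickHouse.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unixtime_to_datetime(value, units_per_second):
    """Convert a count of time units since the Unix epoch to a UTC datetime."""
    microseconds = value * 1_000_000 // units_per_second
    return EPOCH + timedelta(microseconds=microseconds)


from_micro = unixtime_to_datetime(1546300800000000, 1_000_000)
from_milli = unixtime_to_datetime(1546300800000, 1_000)
from_nano = unixtime_to_datetime(1546300800000000000, 1_000_000_000)
```

All three calls yield 2019-01-01 00:00:00 UTC.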

August 15, 2022

double quote support
print res = strcat("double ","quote")

Aggregate functions

  • bin_at
    print res = bin_at(6.5, 2.5, 7)
    print res = bin_at(1h, 1d, 12h)
    print res = bin_at(datetime(2017-05-15 10:20:00.0), 1d, datetime(1970-01-01 12:00:00.0))
    print res = bin_at(datetime(2017-05-17 10:20:00.0), 7d, datetime(2017-06-04 00:00:00.0))

  • array_index_of
    Supports only basic lookup; start_index, length, and occurrence are not supported.
    print output = array_index_of(dynamic(['John', 'Denver', 'Bob', 'Marley']), 'Marley')
    print output = array_index_of(dynamic([1, 2, 3]), 2)

  • array_sum
    print output = array_sum(dynamic([2, 5, 3]))
    print output = array_sum(dynamic([2.5, 5.5, 3]))

  • array_length
    print output = array_length(dynamic(['John', 'Denver', 'Bob', 'Marley']))
    print output = array_length(dynamic([1, 2, 3]))
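
The bin_at examples above round a value down to the start of a fixed-size bin, where the bins are aligned so that one of them starts at the given fixed point. A short Python sketch of that arithmetic follows; it relies on floor division, which works for numbers and for timedelta values alike. It is an illustration of the rule as described in the Kusto documentation, not the ClickHouse code.

```python
from datetime import datetime, timedelta


def bin_at(value, bin_size, fixed_point):
    """Round value down to the start of its bin, with bins aligned to fixed_point."""
    steps = (value - fixed_point) // bin_size   # floor division, may be negative
    return fixed_point + steps * bin_size


numeric = bin_at(6.5, 2.5, 7)
span = bin_at(timedelta(hours=1), timedelta(days=1), timedelta(hours=12))
daily = bin_at(datetime(2017, 5, 15, 10, 20), timedelta(days=1), datetime(1970, 1, 1, 12))
weekly = bin_at(datetime(2017, 5, 17, 10, 20), timedelta(days=7), datetime(2017, 6, 4))
```

Note that the result can lie below the fixed point: 6.5 falls into the bin that starts at 4.5, and 1h falls into the bin that starts at -12h.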

Conversion

  • tobool / toboolean
    print tobool(true) == true
    print toboolean(false) == false
    print tobool(0) == false
    print toboolean(19819823) == true
    print tobool(-2) == true
    print isnull(toboolean('a'))
    print tobool('true') == true
    print toboolean('false') == false

  • todouble / toreal
    print todouble(4) == 4
    print toreal(4.2) == 4.2
    print isnull(todouble('a'))
    print toreal('-0.3') == -0.3

  • toint
    print isnull(toint('a'))
    print toint(4) == 4
    print toint('4') == 4
    print isnull(toint(4.2))

  • tostring
    print tostring(123) == '123'
    print tostring('asd') == 'asd'

Data Types

  • dynamic
    Only one-dimensional (1D) arrays are supported.
    print output = dynamic(['a', 'b', 'c'])
    print output = dynamic([1, 2, 3])

  • bool, boolean
    print bool(1)
    print boolean(0)

  • datetime
    print datetime(2015-12-31 23:59:59.9)
    print datetime('2015-12-31 23:59:59.9')
    print datetime("2015-12-31")

  • guid
    print guid(74be27de-1e4e-49d9-b579-fe0b331d3642)
    print guid('74be27de-1e4e-49d9-b579-fe0b331d3642')
    print guid('74be27de1e4e49d9b579fe0b331d3642')

  • int
    print int(1)

  • long
    print long(16)

  • real
    print real(1)

  • timespan, time
    Note that timespan is used for datetime calculations, so the output is in seconds, e.g. time(1h) = 3600.
    print 1d
    print 30m
    print time('0.12:34:56.7')
    print time(2h)
    print timespan(2h)
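
Since timespan values are output in seconds, as noted above for time(1h) = 3600, it can help to see how the literal forms used in this section map to a number. The Python sketch below handles only the two shapes that appear in the examples: a number with a d, h, m, or s suffix, and the d.hh:mm:ss.f form. The function timespan_seconds is a hypothetical helper and does not cover the full KQL timespan grammar.

```python
import re

UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def timespan_seconds(text):
    """Convert a timespan literal such as '2h' or '0.12:34:56.7' to seconds."""
    short = re.fullmatch(r"(\d+(?:\.\d+)?)([dhms])", text)
    if short:
        return float(short.group(1)) * UNIT_SECONDS[short.group(2)]
    full = re.fullmatch(r"(?:(\d+)\.)?(\d+):(\d+):(\d+(?:\.\d+)?)", text)
    if not full:
        raise ValueError(f"unsupported timespan literal: {text!r}")
    days, hours, minutes, seconds = full.groups()
    return int(days or 0) * 86400 + int(hours) * 3600 + int(minutes) * 60 + float(seconds)
```

For instance, '0.12:34:56.7' is 0 days, 12 hours, 34 minutes, and 56.7 seconds, i.e. 43200 + 2040 + 56.7 seconds.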

StringFunctions

  • base64_encode_fromguid
    print Quine = base64_encode_fromguid('ae3133f2-6e22-49ae-b06a-16e6a9b212eb')
  • base64_decode_toarray
    print base64_decode_toarray('S3VzdG8=')
  • base64_decode_toguid
    print base64_decode_toguid('YWUzMTMzZjItNmUyMi00OWFlLWIwNmEtMTZlNmE5YjIxMmVi')
  • replace_regex
    print replace_regex('Hello, World!', '.', '\\0\\0')
  • has_any_index
    print idx = has_any_index('this is an example', dynamic(['this', 'example']))
  • translate
    print translate('krasp', 'otsku', 'spark')
  • trim
    print trim('--', '--https://bing.com--')
  • trim_end
    print trim_end('.com', 'bing.com')
  • trim_start
    print trim_start('[^\\w]+', strcat('- ','Te st1','// $'))

DateTimeFunctions

  • startofyear
    print startofyear(datetime(2017-01-01 10:10:17), -1)
    print startofyear(datetime(2017-01-01 10:10:17), 0)
    print startofyear(datetime(2017-01-01 10:10:17), 1)

  • week_of_year
    print week_of_year(datetime(2020-12-31))
    print week_of_year(datetime(2020-06-15))
    print week_of_year(datetime(1970-01-01))
    print week_of_year(datetime(2000-01-01))

  • startofweek
    print startofweek(datetime(2017-01-01 10:10:17), -1)
    print startofweek(datetime(2017-01-01 10:10:17), 0)
    print startofweek(datetime(2017-01-01 10:10:17), 1)

  • startofmonth
    print startofmonth(datetime(2017-01-01 10:10:17), -1)
    print startofmonth(datetime(2017-01-01 10:10:17), 0)
    print startofmonth(datetime(2017-01-01 10:10:17), 1)

  • startofday
    print startofday(datetime(2017-01-01 10:10:17), -1)
    print startofday(datetime(2017-01-01 10:10:17), 0)
    print startofday(datetime(2017-01-01 10:10:17), 1)

  • monthofyear
    print monthofyear(datetime("2015-12-14"))

  • hourofday
    print hourofday(datetime(2015-12-14 18:54:00))

  • getyear
    print getyear(datetime(2015-10-12))

  • getmonth
    print getmonth(datetime(2015-10-12))

  • dayofyear
    print dayofyear(datetime(2015-12-14))

  • dayofmonth
    print dayofmonth(datetime(2015-12-14))

  • unixtime_seconds_todatetime
    print unixtime_seconds_todatetime(1546300800)

  • dayofweek
    print dayofweek(datetime(2015-12-20))

  • now
    print now()
    print now(2d)
    print now(-2h)
    print now(5microseconds)
    print now(5seconds)
    print now(6minutes)
    print now(-2d)
    print now(time(1d))

Binary functions

IP functions

  • format_ipv4
    print format_ipv4('192.168.1.255', 24) == '192.168.1.0'
    print format_ipv4(3232236031, 24) == '192.168.1.0'
  • format_ipv4_mask
    print format_ipv4_mask('192.168.1.255', 24) == '192.168.1.0/24'
    print format_ipv4_mask(3232236031, 24) == '192.168.1.0/24'
  • ipv4_compare
    print ipv4_compare('127.0.0.1', '127.0.0.1') == 0
    print ipv4_compare('192.168.1.1', '192.168.1.255') < 0
    print ipv4_compare('192.168.1.1/24', '192.168.1.255/24') == 0
    print ipv4_compare('192.168.1.1', '192.168.1.255', 24) == 0
  • ipv4_is_match
    print ipv4_is_match('127.0.0.1', '127.0.0.1') == true
    print ipv4_is_match('192.168.1.1', '192.168.1.255') == false
    print ipv4_is_match('192.168.1.1/24', '192.168.1.255/24') == true
    print ipv4_is_match('192.168.1.1', '192.168.1.255', 24) == true
  • ipv6_compare
    print ipv6_compare('::ffff:7f00:1', '127.0.0.1') == 0
    print ipv6_compare('fe80::85d:e82c:9446:7994', 'fe80::85d:e82c:9446:7995') < 0
    print ipv6_compare('192.168.1.1/24', '192.168.1.255/24') == 0
    print ipv6_compare('fe80::85d:e82c:9446:7994/127', 'fe80::85d:e82c:9446:7995/127') == 0
    print ipv6_compare('fe80::85d:e82c:9446:7994', 'fe80::85d:e82c:9446:7995', 127) == 0
  • ipv6_is_match
    print ipv6_is_match('::ffff:7f00:1', '127.0.0.1') == true
    print ipv6_is_match('fe80::85d:e82c:9446:7994', 'fe80::85d:e82c:9446:7995') == false
    print ipv6_is_match('192.168.1.1/24', '192.168.1.255/24') == true
    print ipv6_is_match('fe80::85d:e82c:9446:7994/127', 'fe80::85d:e82c:9446:7995/127') == true
    print ipv6_is_match('fe80::85d:e82c:9446:7994', 'fe80::85d:e82c:9446:7995', 127) == true
  • parse_ipv4_mask
    print parse_ipv4_mask('127.0.0.1', 24) == 2130706432
    print parse_ipv4_mask('192.1.168.2', 31) == 3221334018
    print parse_ipv4_mask('192.1.168.3', 31) == 3221334018
    print parse_ipv4_mask('127.2.3.4', 32) == 2130838276
  • parse_ipv6_mask
    print parse_ipv6_mask('127.0.0.1', 24) == '0000:0000:0000:0000:0000:ffff:7f00:0000'
    print parse_ipv6_mask('fe80::85d:e82c:9446:7994', 120) == 'fe80:0000:0000:0000:085d:e82c:9446:7900'
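
The IPv4 masking examples above can be cross-checked with Python's standard ipaddress module. The sketch below reproduces the expected values for parse_ipv4_mask, format_ipv4, and format_ipv4_mask by applying the prefix length and reading back the network address. The function bodies are an illustration only and say nothing about how ClickHouse implements these functions.

```python
import ipaddress


def _network(address, prefix):
    """Build the IPv4 network containing address; address may be a string or an integer."""
    host = ipaddress.IPv4Address(address)
    return ipaddress.IPv4Network(f"{host}/{prefix}", strict=False)


def parse_ipv4_mask(address, prefix):
    """Masked address as an unsigned 32-bit integer."""
    return int(_network(address, prefix).network_address)


def format_ipv4(address, prefix=32):
    """Masked address in dotted-decimal notation."""
    return str(_network(address, prefix).network_address)


def format_ipv4_mask(address, prefix=32):
    """Masked address in CIDR notation."""
    return str(_network(address, prefix))
```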

August 1, 2022

Config setting to allow modifying the dialect setting.

  • Set the dialect in the server configuration XML at user level (users.xml). This sets the dialect at server startup, and ClickHouse parses queries for all users with the default profile according to the dialect value.

    For example:
    <profiles>
        <!-- Default settings. -->
        <default>
            <load_balancing>random</load_balancing>
            <dialect>kusto</dialect>
        </default>

  • Once the dialect is set in users.xml, a query can be executed with an HTTP client as follows:
    echo "KQL query" | curl -sS "http://localhost:8123/?" --data-binary @-

  • To execute a query using clickhouse-client, update clickhouse-client.xml as below and start clickhouse-client with the --config-file option (clickhouse-client --config-file=<config-file path>):

    <config>
        <dialect>kusto</dialect>
    </config>

    Or pass the dialect setting on the command line with '--'. For example:
    clickhouse-client --dialect='kusto' -q "KQL query"
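
The curl invocation above simply posts the query text as the request body. A rough Python equivalent using only the standard library might look like the sketch below. It assumes a server listening on localhost:8123 with the dialect already set to kusto in users.xml, as described above; the function names build_kql_request and run_kql are hypothetical.

```python
from urllib.request import Request, urlopen


def build_kql_request(query, base_url="http://localhost:8123/?"):
    """Build a POST request whose body is the KQL query text."""
    return Request(base_url, data=query.encode("utf-8"), method="POST")


def run_kql(query, base_url="http://localhost:8123/?"):
    """Send the query to the HTTP interface and return the response body as text."""
    with urlopen(build_kql_request(query, base_url)) as response:
        return response.read().decode("utf-8")
```

Calling run_kql("Customers | take 1") requires a running server; build_kql_request can be inspected without one.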

IP functions

July 17, 2022

Renamed dialect from sql_dialect to dialect

set dialect='clickhouse'
set dialect='kusto'

IP functions

  • parse_ipv4
    "Customers | project parse_ipv4('127.0.0.1')"
  • parse_ipv6
    "Customers | project parse_ipv6('127.0.0.1')"

Please note that the functions listed below only take constant parameters for now. Support for expressions is expected in a later release.

  • ipv4_is_private
    "Customers | project ipv4_is_private('192.168.1.6/24')"
    "Customers | project ipv4_is_private('192.168.1.6')"
  • ipv4_is_in_range
    "Customers | project ipv4_is_in_range('127.0.0.1', '127.0.0.1')"
    "Customers | project ipv4_is_in_range('192.168.1.6', '192.168.1.1/24')"
  • ipv4_netmask_suffix
    "Customers | project ipv4_netmask_suffix('192.168.1.1/24')"
    "Customers | project ipv4_netmask_suffix('192.168.1.1')"

string functions

July 4, 2022

sql_dialect

  • default is clickhouse
    set sql_dialect='clickhouse'
  • only process kql
    set sql_dialect='kusto'

KQL() function

  • create table
    CREATE TABLE kql_table4 ENGINE = Memory AS select *, now() as new_column From kql(Customers | project LastName,Age);
    verify the content of kql_table4
    select * from kql_table4

  • insert into table
    create a tmp table:

    CREATE TABLE temp
    (    
        FirstName Nullable(String),
        LastName String, 
        Age Nullable(UInt8)
    ) ENGINE = Memory;
    

    INSERT INTO temp select * from kql(Customers|project FirstName,LastName,Age);
    verify the content of temp
    select * from temp

  • Select from kql()
    Select * from kql(Customers|project FirstName)

KQL operators:

  • Tabular expression statements
    Customers
  • Select Column
    Customers | project FirstName,LastName,Occupation
  • Limit returned results
    Customers | project FirstName,LastName,Occupation | take 1 | take 3
  • sort, order
    Customers | order by Age desc , FirstName asc
  • Filter
    Customers | where Occupation == 'Skilled Manual'
  • summarize
    Customers |summarize max(Age) by Occupation

KQL string operators and functions

  • contains
    Customers |where Education contains 'degree'

  • !contains
    Customers |where Education !contains 'degree'

  • contains_cs
    Customers |where Education contains_cs 'Degree'

  • !contains_cs
    Customers |where Education !contains_cs 'Degree'

  • endswith
    Customers | where FirstName endswith 'RE'

  • !endswith
    Customers | where FirstName !endswith 'RE'

  • endswith_cs
    Customers | where FirstName endswith_cs 're'

  • !endswith_cs
    Customers | where FirstName !endswith_cs 're'

  • ==
    Customers | where Occupation == 'Skilled Manual'

  • !=
    Customers | where Occupation != 'Skilled Manual'

  • has
    Customers | where Occupation has 'skilled'

  • !has
    Customers | where Occupation !has 'skilled'

  • has_cs
    Customers | where Occupation has_cs 'Skilled'

  • !has_cs
    Customers | where Occupation !has_cs 'Skilled'

  • hasprefix
    Customers | where Occupation hasprefix 'Ab'

  • !hasprefix
    Customers | where Occupation !hasprefix 'Ab'

  • hasprefix_cs
    Customers | where Occupation hasprefix_cs 'ab'

  • !hasprefix_cs
    Customers | where Occupation !hasprefix_cs 'ab'

  • hassuffix
    Customers | where Occupation hassuffix 'Ent'

  • !hassuffix
    Customers | where Occupation !hassuffix 'Ent'

  • hassuffix_cs
    Customers | where Occupation hassuffix_cs 'ent'

  • !hassuffix_cs
    Customers | where Occupation !hassuffix_cs 'ent'

  • in
    Customers |where Education in ('Bachelors','High School')

  • !in
    Customers | where Education !in ('Bachelors','High School')

  • matches regex
    Customers | where FirstName matches regex 'P.*r'

  • startswith
    Customers | where FirstName startswith 'pet'

  • !startswith
    Customers | where FirstName !startswith 'pet'

  • startswith_cs
    Customers | where FirstName startswith_cs 'pet'

  • !startswith_cs
    Customers | where FirstName !startswith_cs 'pet'

  • base64_encode_tostring()
    Customers | project base64_encode_tostring('Kusto1') | take 1

  • base64_decode_tostring()
    Customers | project base64_decode_tostring('S3VzdG8x') | take 1

  • isempty()
    Customers | where isempty(LastName)

  • isnotempty()
    Customers | where isnotempty(LastName)

  • isnotnull()
    Customers | where isnotnull(FirstName)

  • isnull()
    Customers | where isnull(FirstName)

  • url_decode()
    Customers | project url_decode('https%3A%2F%2Fwww.test.com%2Fhello%20word') | take 1

  • url_encode()
    Customers | project url_encode('https://www.test.com/hello word') | take 1

  • substring()
    Customers | project name_abbr = strcat(substring(FirstName,0,3), ' ', substring(LastName,2))

  • strcat()
    Customers | project name = strcat(FirstName, ' ', LastName)

  • strlen()
    Customers | project FirstName, strlen(FirstName)

  • strrep()
    Customers | project strrep(FirstName,2,'_')

  • toupper()
    Customers | project toupper(FirstName)

  • tolower()
    Customers | project tolower(FirstName)
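
The string operators listed above fall into two families: contains-style operators match any substring, while has-style operators match whole terms, and the _cs suffix makes either one case-sensitive. The Python sketch below illustrates the distinction following the Kusto documentation's description of terms. Treating a term as a run of ASCII letters and digits is a simplifying assumption of this sketch, and this note does not state whether ClickHouse tokenizes text in exactly the same way.

```python
import re


def _fold(text, needle, case_sensitive):
    """Lower-case both strings unless the comparison is case-sensitive."""
    return (text, needle) if case_sensitive else (text.lower(), needle.lower())


def _terms(text):
    """Split text into terms; here a term is a run of ASCII letters and digits."""
    return re.findall(r"[A-Za-z0-9]+", text)


def contains(text, needle, case_sensitive=False):
    """Substring match (contains / contains_cs)."""
    text, needle = _fold(text, needle, case_sensitive)
    return needle in text


def has(text, needle, case_sensitive=False):
    """Whole-term match (has / has_cs)."""
    text, needle = _fold(text, needle, case_sensitive)
    return needle in _terms(text)


def hasprefix(text, needle, case_sensitive=False):
    """Some term starts with needle (hasprefix / hasprefix_cs)."""
    text, needle = _fold(text, needle, case_sensitive)
    return any(term.startswith(needle) for term in _terms(text))


def hassuffix(text, needle, case_sensitive=False):
    """Some term ends with needle (hassuffix / hassuffix_cs)."""
    text, needle = _fold(text, needle, case_sensitive)
    return any(term.endswith(needle) for term in _terms(text))
```

With the value 'Skilled Manual', contains matches 'kill' but has does not, because 'kill' is not a complete term.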

Aggregate Functions

  • arg_max()
  • arg_min()
  • avg()
  • avgif()
  • count()
  • countif()
  • max()
  • maxif()
  • min()
  • minif()
  • sum()
  • sumif()
  • dcount()
  • dcountif()
  • bin