mirror of
https://github.com/ClickHouse/ClickHouse.git
synced 2024-12-01 03:52:15 +00:00
48645eae33
Sometimes it is odd to get the TLD itself from cutToFirstSignificantSubdomain() (since you will not get the TLD itself if you pass it directly):
- cutToFirstSignificantSubdomain('org') -> ""
- cutToFirstSignificantSubdomain('www.org') -> org
- cutToFirstSignificantSubdomain('kernel.org') -> kernel.org
- cutToFirstSignificantSubdomain('www.kernel.org') -> kernel.org

So add one more function that returns www.org in this case:
- cutToFirstSignificantSubdomainWithWWW('org') -> ""
- cutToFirstSignificantSubdomainWithWWW('www.org') -> www.org
- cutToFirstSignificantSubdomainWithWWW('kernel.org') -> kernel.org
- cutToFirstSignificantSubdomainWithWWW('www.kernel.org') -> kernel.org

P.S. Not sure about the naming though, so it would be great if someone has a suggestion for the name.
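The difference between the two functions can be illustrated with a small Python model. This is only a sketch that reproduces the eight examples listed above; it is not ClickHouse's implementation. The Python function names are hypothetical, the inputs are assumed to be bare host names rather than full URLs, and multi-label suffixes (such as co.uk) are deliberately ignored.

```python
def _labels(host):
    # Split a bare host name into its dot-separated labels, dropping empty ones.
    return [part for part in host.split(".") if part]


def cut_to_first_significant_subdomain(host):
    # Toy model (assumption): a leading "www" label is dropped first,
    # which is why 'www.org' collapses to the bare TLD 'org'.
    labels = _labels(host)
    if len(labels) < 2:
        return ""
    if labels[0] == "www":
        labels = labels[1:]
    return ".".join(labels[-2:])


def cut_to_first_significant_subdomain_with_www(host):
    # Toy model (assumption): keep the last two labels as they are,
    # so 'www.org' stays 'www.org' instead of collapsing to 'org'.
    labels = _labels(host)
    if len(labels) < 2:
        return ""
    return ".".join(labels[-2:])


if __name__ == "__main__":
    for host in ("org", "www.org", "kernel.org", "www.kernel.org"):
        print(
            host,
            repr(cut_to_first_significant_subdomain(host)),
            repr(cut_to_first_significant_subdomain_with_www(host)),
        )
```

In this model the two functions differ only for a two-label host whose first label is www, which is exactly the case the commit message describes.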
123 lines
2.0 KiB
Plaintext
====SCHEMA====
http
https
svn+ssh

http

====HOST====
www.example.com

www.example.com
127.0.0.1
www.example.com
www.example.com
www.example.com
example.com
example.com
example.com
====NETLOC====
paul@www.example.com:80
127.0.0.1:443
127.0.0.1:443
example.ru
example.ru
paul:zozo@example.ru
paul:zozo@example.ru
www.example.com
www.example.com
example.com
====DOMAIN====
com

ru

com
com
com
====PATH====
П
%D%9
/?query=hello world+foo+bar
/?query=hello world+foo+bar
/?query=hello world+foo+bar
/?query=hello world+foo+bar

/a/b/c
/a/b/c
/a/b/c
/a/b/c
====QUERY STRING====


query=hello world+foo+bar
query=hello world+foo+bar
query=hello world+foo+bar
query=hello world+foo+bar
====FRAGMENT====


a=b
a=b
a=b
====QUERY STRING AND FRAGMENT====


query=hello world+foo+bar
query=hello world+foo+bar#a=b
query=hello world+foo+bar#a=b
query=hello world+foo+bar#a=b
#a=b
====CUT TO FIRST SIGNIFICANT SUBDOMAIN====
example.com
example.com
example.com
example.com
example.com
example.com
example.com
example.com
example.com
com

====CUT TO FIRST SIGNIFICANT SUBDOMAIN WITH WWW====

www.com
example.com
example.com
example.com
example.com
====CUT WWW====
http://example.com
http://example.com:1234
http://example.com/a/b/c
http://example.com/a/b/c?a=b
http://example.com/a/b/c?a=b#d=f
http://paul@example.com/a/b/c?a=b#d=f
//paul@example.com/a/b/c?a=b#d=f
====CUT QUERY STRING====
http://www.example.com
http://www.example.com:1234
http://www.example.com/a/b/c
http://www.example.com/a/b/c
http://www.example.com/a/b/c#d=f
http://paul@www.example.com/a/b/c#d=f
//paul@www.example.com/a/b/c#d=f
====CUT FRAGMENT====
http://www.example.com
http://www.example.com:1234
http://www.example.com/a/b/c
http://www.example.com/a/b/c?a=b
http://www.example.com/a/b/c?a=b
http://paul@www.example.com/a/b/c?a=b
//paul@www.example.com/a/b/c?a=b
====CUT QUERY STRING AND FRAGMENT====
http://www.example.com
http://www.example.com:1234
http://www.example.com/a/b/c
http://www.example.com/a/b/c
http://www.example.com/a/b/c
http://paul@www.example.com/a/b/c
//paul@www.example.com/a/b/c
//paul@www.example.com/a/b/c