12 Commits

Author SHA1 Message Date
Azat Khuzhin
78bc48236b Relax symbols that are allowed in userinfo in *domain*RFC()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-02-24 20:51:48 +01:00
Azat Khuzhin
066389e6ff Relax symbols that are allowed in userinfo in netloc()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-02-24 20:51:44 +01:00
Azat Khuzhin
83071164cc Fix off-by-one error in netloc()
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2023-02-24 20:15:29 +01:00
vdimir
14d0f6457b Add tests and doc for some url-related functions 2022-10-26 10:52:57 +00:00
Quanfa Fu
b07f65343d Add functions: domainRFC, topLevelDomainRFC, domainWithoutWWWRFC... 2022-10-23 12:01:26 +08:00
Quanfa Fu
dbe68ab0a8 Fix wrong behavior of domain func for URLs containing a UserInfo part and '@'
When a UserInfo part and '@' appear in the URL, the host after '@' should
be returned. For example, when the URL is "https://user:pass@clickhouse.com/",
start_of_host should be the char 'c' after '@', and end_of_host should be '/'
rather than ':'.
2022-10-19 14:27:06 +08:00
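The host boundaries described in dbe68ab0a8 can be illustrated with a minimal Python sketch. This is not the code from the commit: the function name `host_bounds` is hypothetical, and the sketch assumes a simple `scheme://userinfo@host:port/path` layout with no IPv6 literals in brackets.

```python
def host_bounds(url: str) -> tuple[int, int]:
    """Return (start_of_host, end_of_host) as indices into url.

    Hypothetical helper for illustration only; end_of_host is exclusive.
    """
    scheme_end = url.find("://")
    start = scheme_end + 3 if scheme_end != -1 else 0

    # The authority part ends at the first '/', '?' or '#' after the scheme.
    end = len(url)
    for i in range(start, len(url)):
        if url[i] in "/?#":
            end = i
            break

    # If userinfo is present, the host starts right after the last '@'
    # inside the authority part.
    at = url.rfind("@", start, end)
    if at != -1:
        start = at + 1

    # A ':' after the host start introduces the port, not the password.
    colon = url.find(":", start, end)
    if colon != -1:
        end = colon
    return start, end


url = "https://user:pass@clickhouse.com/"
start, end = host_bounds(url)
host = url[start:end]  # "clickhouse.com"
```

Here `url[start]` is the 'c' after '@' and `url[end]` is '/', which matches the commit message; the ':' inside the userinfo is skipped because the search for a port delimiter only begins after '@'.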
zzsmdfj
4dcb411f4f to #31092_add_encodeURLComponent_function 2022-02-16 10:19:20 +08:00
cmsxbc
37349a9d0f add function decodeURLFormComponent 2022-01-07 20:51:30 +08:00
Azat Khuzhin
e0c1780370 Fix topLevelDomain() for IDN hosts 2021-06-09 10:59:56 +03:00
Azat Khuzhin
48645eae33 Add cutToFirstSignificantSubdomainWithWWW()
Sometimes it is odd to get the TLD itself from
cutToFirstSignificantSubdomain() (since you will not get the TLD itself if
you pass it directly):
- cutToFirstSignificantSubdomain('org')            -> ""
- cutToFirstSignificantSubdomain('www.org')        -> org
- cutToFirstSignificantSubdomain('kernel.org')     -> kernel.org
- cutToFirstSignificantSubdomain('www.kernel.org') -> kernel.org

So add one more function to get www.org in this case:
- cutToFirstSignificantSubdomainWithWWW('org')            -> ""
- cutToFirstSignificantSubdomainWithWWW('www.org')        -> www.org
- cutToFirstSignificantSubdomainWithWWW('kernel.org')     -> kernel.org
- cutToFirstSignificantSubdomainWithWWW('www.kernel.org') -> kernel.org

P.S. not sure about the naming though, so it would be great if someone has
a suggestion for the name.
2020-11-18 21:09:27 +03:00
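The difference between the two functions in 48645eae33 can be modelled with a short Python sketch. It is a simplified model that only reproduces the eight cases listed in the commit message: the function name and the `keep_www` flag are hypothetical, and the sketch ignores the handling of multi-label public suffixes that a real implementation would need.

```python
def cut_to_first_significant_subdomain(host: str, keep_www: bool = False) -> str:
    """Simplified model of the behaviour listed in the commit message.

    Hypothetical helper: keep_www=False mimics cutToFirstSignificantSubdomain,
    keep_www=True mimics cutToFirstSignificantSubdomainWithWWW.
    """
    labels = host.split(".")
    if len(labels) < 2:
        # A bare TLD such as 'org' yields an empty result in both variants.
        return ""
    if not keep_www and labels[0] == "www":
        # The original variant drops a leading 'www' before cutting.
        labels = labels[1:]
    # Keep at most the last two labels: first significant subdomain + TLD.
    return ".".join(labels[-2:])


hosts = ["org", "www.org", "kernel.org", "www.kernel.org"]
without_www = [cut_to_first_significant_subdomain(h) for h in hosts]
with_www = [cut_to_first_significant_subdomain(h, keep_www=True) for h in hosts]
```

The two variants differ only for `www.org`: the original one strips `www` and is left with the bare TLD, while the WithWWW variant treats `www` as the significant subdomain.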
Guillaume Tassery
500a8d22fa create netloc function 2020-06-02 15:34:08 +07:00
Ivan
97f2a2213e Move all folders inside /dbms one level up (#9974)
* Move some code outside dbms/src folder
* Fix paths
2020-04-02 02:51:21 +03:00