mirror of
https://github.com/ClickHouse/ClickHouse.git
synced 2024-11-26 01:22:04 +00:00
Update url-functions.md
parent
a68ebaafd7
commit
33dc8ce456
@@ -7,9 +7,35 @@ sidebar_label: URLs
|
|||||||
# Functions for Working with URLs
|
# Functions for Working with URLs
|
||||||
|
|
||||||
:::note
|
:::note
|
||||||
The functions mentioned in this section for the most part do not follow the RFC convention as they are maximally simplified for improved performance. Functions following a specific RFC convention have `RFC` appended to the function name and are generally less performant.
|
The functions mentioned in this section for the most part do not follow the RFC-3986 convention as they are maximally simplified for improved performance. Functions following the RFC-3986 convention have `RFC` appended to the function name and are generally less performant.
|
||||||
|
|
||||||
|
- When should I pick the non-`RFC` variant?
|
||||||
|
Pick the non-`RFC` variant when working with domains that can be publicly registered and when neither userinfo nor the `@` symbol appears in the URL.
|
||||||
:::
|
:::
|
||||||
|
|
||||||
|
The table below shows which symbols are restricted (`✗`) and which are allowed (`✔`) anywhere in the URL for each of the two variants.
|
||||||
|
|
||||||
|
| Symbol | non-`RFC` | `RFC` |
|--------|-----------|-------|
| ' ' | ✗ | ✗ |
| \t | ✗ | ✗ |
| < | ✗ | ✗ |
| > | ✗ | ✗ |
| % | ✗ | ✔* |
| { | ✗ | ✗ |
| } | ✗ | ✗ |
| \| | ✗ | ✗ |
| \\\ | ✗ | ✗ |
| ^ | ✗ | ✗ |
| ~ | ✗ | ✔* |
| [ | ✗ | ✗ |
| ] | ✗ | ✔ |
| ; | ✗ | ✔* |
| = | ✗ | ✔* |
| & | ✗ | ✔* |
|
||||||
|
|
||||||
|
The symbols above marked `*` are treated as sub-delimiters under the RFC 3986 convention and are allowed in the userinfo part of the URL, which precedes the `@` symbol.
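
To illustrate the difference between the two variants, the following is a minimal sketch that passes the same URL to both. The URL is a made-up example, and `domain`/`domainRFC` are assumed here to be a representative non-`RFC`/`RFC` pair. Because the URL carries userinfo and an `@` symbol, the non-`RFC` function is expected to return an empty string, while the `RFC` function should return the host.

```sql
-- Hypothetical URL containing userinfo (user:password@) before the host.
SELECT
    domain('http://user:password@example.com:8080/path?query=value#fragment')    AS non_rfc,
    domainRFC('http://user:password@example.com:8080/path?query=value#fragment') AS rfc;
```

If the two columns differ for your data, the `RFC` variant is likely the safer choice despite its lower performance.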
|
||||||
|
|
||||||
## Functions that Extract Parts of a URL
|
## Functions that Extract Parts of a URL
|
||||||
|
|
||||||
If the relevant part isn’t present in a URL, an empty string is returned.
|
If the relevant part isn’t present in a URL, an empty string is returned.
|
||||||
@@ -18,7 +44,7 @@ If the relevant part isn’t present in a URL, an empty string is returned.
|
|||||||
|
|
||||||
Extracts the protocol from a URL.
|
Extracts the protocol from a URL.
|
||||||
|
|
||||||
Examples of typical returned values: http, https, ftp, mailto, tel, magnet...
|
Examples of typical returned values: http, https, ftp, mailto, tel, magnet…
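
As a short sketch of the behavior described above (the URL is only an illustrative input):

```sql
-- Extract the protocol (scheme) from a URL.
SELECT protocol('https://clickhouse.com/') AS scheme;
-- returns 'https'
```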
|
||||||
|
|
||||||
### domain
|
### domain
|
||||||
|
|
||||||
@@ -32,7 +58,7 @@ domain(URL)
|
|||||||
|
|
||||||
**Arguments**
|
**Arguments**
|
||||||
|
|
||||||
- `url` — URL. [String](../data-types/string.md).
|
- `URL` — URL. Type: [String](../../sql-reference/data-types/string.md).
|
||||||
|
|
||||||
The URL can be specified with or without a protocol. Examples:
|
The URL can be specified with or without a protocol. Examples:
|
||||||
|
|
||||||
@@ -114,7 +140,7 @@ topLevelDomain(URL)
|
|||||||
|
|
||||||
**Arguments**
|
**Arguments**
|
||||||
|
|
||||||
- `url` — URL. [String](../data-types/string.md).
|
- `URL` — URL. [String](../../sql-reference/data-types/string.md).
|
||||||
|
|
||||||
:::note
|
:::note
|
||||||
The URL can be specified with or without a protocol. Examples:
|
The URL can be specified with or without a protocol. Examples:
|
||||||
@@ -128,7 +154,7 @@ https://clickhouse.com/time/
|
|||||||
|
|
||||||
**Returned values**
|
**Returned values**
|
||||||
|
|
||||||
- Domain name if ClickHouse can parse the input string as a URL. Otherwise, an empty string. [String](../data-types/string.md).
|
- Domain name if ClickHouse can parse the input string as a URL. Otherwise, an empty string. [String](../../sql-reference/data-types/string.md).
|
||||||
|
|
||||||
**Example**
|
**Example**
|
||||||
|
|
||||||
@@ -156,7 +182,7 @@ topLevelDomainRFC(URL)
|
|||||||
|
|
||||||
**Arguments**
|
**Arguments**
|
||||||
|
|
||||||
- `url` — URL. [String](../data-types/string.md).
|
- `URL` — URL. [String](../../sql-reference/data-types/string.md).
|
||||||
|
|
||||||
:::note
|
:::note
|
||||||
The URL can be specified with or without a protocol. Examples:
|
The URL can be specified with or without a protocol. Examples:
|
||||||
@@ -170,7 +196,7 @@ https://clickhouse.com/time/
|
|||||||
|
|
||||||
**Returned values**
|
**Returned values**
|
||||||
|
|
||||||
- Domain name if ClickHouse can parse the input string as a URL. Otherwise, an empty string. [String](../data-types/string.md).
|
- Domain name if ClickHouse can parse the input string as a URL. Otherwise, an empty string. [String](../../sql-reference/data-types/string.md).
|
||||||
|
|
||||||
**Example**
|
**Example**
|
||||||
|
|
||||||
@@ -425,17 +451,17 @@ This can be useful if you need a fresh TLD list or if you have a custom list.
|
|||||||
**Syntax**
|
**Syntax**
|
||||||
|
|
||||||
``` sql
|
``` sql
|
||||||
cutToFirstSignificantSubdomainCustom(URL, TLD)
|
cutToFirstSignificantSubdomain(URL, TLD)
|
||||||
```
|
```
|
||||||
|
|
||||||
**Arguments**
|
**Arguments**
|
||||||
|
|
||||||
- `URL` — URL. [String](../data-types/string.md).
|
- `URL` — URL. [String](../../sql-reference/data-types/string.md).
|
||||||
- `TLD` — Custom TLD list name. [String](../data-types/string.md).
|
- `TLD` — Custom TLD list name. [String](../../sql-reference/data-types/string.md).
|
||||||
|
|
||||||
**Returned value**
|
**Returned value**
|
||||||
|
|
||||||
- Part of the domain that includes top-level subdomains up to the first significant subdomain. [String](../data-types/string.md).
|
- Part of the domain that includes top-level subdomains up to the first significant subdomain. [String](../../sql-reference/data-types/string.md).
|
||||||
|
|
||||||
**Example**
|
**Example**
|
||||||
|
|
||||||
@@ -505,12 +531,14 @@ cutToFirstSignificantSubdomainCustomWithWWW(URL, TLD)
|
|||||||
|
|
||||||
**Arguments**
|
**Arguments**
|
||||||
|
|
||||||
- `URL` — URL. [String](../data-types/string.md).
|
- `URL` — URL. [String](../../sql-reference/data-types/string.md).
|
||||||
- `TLD` — Custom TLD list name. [String](../data-types/string.md).
|
- `TLD` — Custom TLD list name. [String](../../sql-reference/data-types/string.md).
|
||||||
|
|
||||||
**Returned value**
|
**Returned value**
|
||||||
|
|
||||||
- Part of the domain that includes top-level subdomains up to the first significant subdomain without stripping `www`. [String](../data-types/string.md).
|
- Part of the domain that includes top-level subdomains up to the first significant subdomain without stripping `www`.
|
||||||
|
|
||||||
|
Type: [String](../../sql-reference/data-types/string.md).
|
||||||
|
|
||||||
**Example**
|
**Example**
|
||||||
|
|
||||||
@@ -580,12 +608,14 @@ firstSignificantSubdomainCustom(URL, TLD)
|
|||||||
|
|
||||||
**Arguments**
|
**Arguments**
|
||||||
|
|
||||||
- `URL` — URL. [String](../data-types/string.md).
|
- `URL` — URL. [String](../../sql-reference/data-types/string.md).
|
||||||
- `TLD` — Custom TLD list name. [String](../data-types/string.md).
|
- `TLD` — Custom TLD list name. [String](../../sql-reference/data-types/string.md).
|
||||||
|
|
||||||
**Returned value**
|
**Returned value**
|
||||||
|
|
||||||
- First significant subdomain. [String](../data-types/string.md).
|
- First significant subdomain.
|
||||||
|
|
||||||
|
Type: [String](../../sql-reference/data-types/string.md).
|
||||||
|
|
||||||
**Example**
|
**Example**
|
||||||
|
|
||||||
@@ -825,11 +855,13 @@ netloc(URL)
|
|||||||
|
|
||||||
**Arguments**
|
**Arguments**
|
||||||
|
|
||||||
- `url` — URL. [String](../data-types/string.md).
|
- `URL` — URL. [String](../../sql-reference/data-types/string.md).
|
||||||
|
|
||||||
**Returned value**
|
**Returned value**
|
||||||
|
|
||||||
- `username:password@host:port`. [String](../data-types/string.md).
|
- `username:password@host:port`.
|
||||||
|
|
||||||
|
Type: `String`.
|
||||||
|
|
||||||
**Example**
|
**Example**
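
The diff context ends before the example itself, so the following is only a sketch of how `netloc` might be called; the URL is an illustrative input, not necessarily the one used in the file.

```sql
-- Extract the network locality (userinfo, host, and port) from a URL.
SELECT netloc('http://paul@www.example.com:80/') AS locality;
-- returns 'paul@www.example.com:80'
```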
|
||||||
|
|
||||||
@@ -879,12 +911,14 @@ cutURLParameter(URL, name)
|
|||||||
|
|
||||||
**Arguments**
|
**Arguments**
|
||||||
|
|
||||||
- `URL` — URL. [String](../data-types/string.md).
|
- `URL` — URL. [String](../../sql-reference/data-types/string.md).
|
||||||
- `name` — name of URL parameter. [String](../data-types/string.md) or [Array](../data-types/array.md) of Strings.
|
- `name` — name of URL parameter. [String](../../sql-reference/data-types/string.md) or [Array](../../sql-reference/data-types/array.md) of Strings.
|
||||||
|
|
||||||
**Returned value**
|
**Returned value**
|
||||||
|
|
||||||
- URL with `name` URL parameter removed. [String](../data-types/string.md).
|
- URL with `name` URL parameter removed.
|
||||||
|
|
||||||
|
Type: `String`.
|
||||||
|
|
||||||
**Example**
|
**Example**
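
The diff context ends before the example itself, so the following is only a sketch of the two accepted forms of the `name` argument, a single String and an Array of Strings; the URL is an illustrative input.

```sql
-- Remove one parameter by name, then several at once via an array.
SELECT
    cutURLParameter('http://bigmir.net/?a=b&c=d&e=f#g', 'a')        AS without_a,
    cutURLParameter('http://bigmir.net/?a=b&c=d&e=f#g', ['c', 'e']) AS without_c_e;
-- without_a:   'http://bigmir.net/?c=d&e=f#g'
-- without_c_e: 'http://bigmir.net/?a=b#g'
```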
|
||||||
|
|
||||||
|