Hash functions
--------------
Hash functions can be used for deterministic pseudo-random shuffling of elements.

halfMD5
~~~~~~~
Calculates the MD5 of a string, then takes the first 8 bytes of the hash and interprets them as a big-endian UInt64.
Accepts a String-type argument. Returns UInt64.
This function works fairly slowly (about 5 million short strings per second per processor core).
If you don't need MD5 in particular, use the ``sipHash64`` function instead.

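The halfMD5 definition can be reproduced outside ClickHouse to cross-check values. The following is a minimal sketch in Python, assuming the string is hashed as raw bytes; the helper name ``half_md5`` is hypothetical, and the code illustrates the described behaviour rather than ClickHouse's own implementation:

```python
import hashlib


def half_md5(data: bytes) -> int:
    """Return the first 8 bytes of MD5(data) as a big-endian unsigned 64-bit integer."""
    digest = hashlib.md5(data).digest()  # 16 raw bytes
    # Assumption: the input is hashed byte for byte, with no extra normalization.
    return int.from_bytes(digest[:8], byteorder="big")


if __name__ == "__main__":
    print(half_md5(b"ClickHouse"))
```

The result always fits in the UInt64 range because exactly 8 bytes of the digest are used.
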
MD5
~~~
Calculates the MD5 of a string and returns the resulting set of bytes as FixedString(16).
If you don't need MD5 in particular, but you need a decent cryptographic 128-bit hash, use the ``sipHash128`` function instead.
If you need the same result as the ``md5sum`` utility produces, write ``lower(hex(MD5(s)))``.

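The difference between the raw 16-byte result and the ``md5sum`` format can be sketched in Python. The helper names ``md5_raw`` and ``md5_hex_lower`` are hypothetical; they only mirror what ``MD5(s)`` and ``lower(hex(MD5(s)))`` are described as returning:

```python
import hashlib


def md5_raw(data: bytes) -> bytes:
    """Raw 16-byte MD5 digest, analogous to the FixedString(16) result."""
    return hashlib.md5(data).digest()


def md5_hex_lower(data: bytes) -> str:
    """Lowercase hexadecimal form of the digest, the format printed by md5sum."""
    return md5_raw(data).hex()  # bytes.hex() produces lowercase digits


if __name__ == "__main__":
    print(len(md5_raw(b"hello")), md5_hex_lower(b"hello"))
```

The raw form is half the size of the hexadecimal form, which matters when the value is stored in a table.
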
sipHash64
~~~~~~~~~
Calculates SipHash from a string.
Accepts a String-type argument. Returns UInt64.
SipHash is a cryptographic hash function. It is at least three times faster than MD5. For more information, see https://131002.net/siphash/

sipHash128
~~~~~~~~~~
Calculates SipHash from a string.
Accepts a String-type argument. Returns FixedString(16).
Differs from sipHash64 in that the final xor-folding of the state is only done up to 128 bits.

cityHash64
~~~~~~~~~~
Calculates CityHash64 from a string, or a similar hash function for any number of arguments of any type.
For String-type arguments, CityHash is used. This is a fast non-cryptographic hash function for strings with decent quality.
For other types of arguments, a decent implementation-specific fast non-cryptographic hash function is used.
If multiple arguments are passed, each is hashed by the same rules and the results are combined in a chain using the CityHash combinator.
For example, you can compute a checksum of an entire table that does not depend on the row order: ``SELECT sum(cityHash64(*)) FROM table``.

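The table checksum is insensitive to row order because addition is commutative. The sketch below illustrates this in Python under two assumptions: the sum wraps around modulo 2^64, and a truncated BLAKE2b digest stands in for the row hash (it is not CityHash64, so the values will not match ClickHouse). The names ``row_hash`` and ``table_checksum`` are hypothetical:

```python
import hashlib

MASK64 = (1 << 64) - 1


def row_hash(row: tuple) -> int:
    """Stand-in 64-bit row hash (BLAKE2b truncated to 8 bytes), NOT CityHash64."""
    encoded = repr(row).encode("utf-8")
    return int.from_bytes(hashlib.blake2b(encoded, digest_size=8).digest(), "big")


def table_checksum(rows) -> int:
    """Sum of per-row hashes modulo 2**64; the order of rows does not affect it."""
    total = 0
    for row in rows:
        total = (total + row_hash(row)) & MASK64
    return total


if __name__ == "__main__":
    rows = [(1, "a"), (2, "b"), (3, "c")]
    print(table_checksum(rows) == table_checksum(rows[::-1]))
```

A checksum of this kind detects changed, missing, or extra rows, but by design it cannot detect reordering.
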
intHash32
~~~~~~~~~
Calculates a 32-bit hash code from any type of integer.
This is a relatively fast non-cryptographic hash function of average quality for numbers.

intHash64
~~~~~~~~~
Calculates a 64-bit hash code from any type of integer.
It works faster than intHash32. Average quality.

SHA1
~~~~

SHA224
~~~~~~

SHA256
~~~~~~
Calculates SHA-1, SHA-224, or SHA-256 from a string and returns the resulting set of bytes as FixedString(20), FixedString(28), or FixedString(32), respectively.
These functions work fairly slowly (SHA-1 processes about 5 million short strings per second per processor core, while SHA-224 and SHA-256 process about 2.2 million).
We recommend using these functions only when you need a specific hash function and cannot choose a different one.
Even in these cases, we recommend applying the function offline and pre-calculating values when inserting them into the table, instead of applying it in SELECT queries.

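The digest sizes behind the three FixedString lengths can be checked with a short Python sketch. The same kind of client-side code is one possible way to pre-calculate digests before inserting rows. The helper name ``sha_digests`` is hypothetical and not part of ClickHouse:

```python
import hashlib


def sha_digests(data: bytes) -> dict:
    """Raw SHA-1, SHA-224 and SHA-256 digests of data, keyed by algorithm name."""
    return {
        name: hashlib.new(name, data).digest()
        for name in ("sha1", "sha224", "sha256")
    }


if __name__ == "__main__":
    for name, digest in sha_digests(b"ClickHouse").items():
        print(name, len(digest))  # digest sizes are 20, 28 and 32 bytes
```
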
URLHash(url[, N])
~~~~~~~~~~~~~~~~~
A fast, decent-quality non-cryptographic hash function for a string obtained from a URL using some type of normalization.

``URLHash(s)`` - Calculates a hash from a string after removing a single trailing ``/``, ``?`` or ``#``, if present.

``URLHash(s, N)`` - Calculates a hash from a string up to the N level in the URL hierarchy, after removing a single trailing ``/``, ``?`` or ``#``, if present.
Levels are the same as in URLHierarchy. This function is specific to Yandex.Metrica.

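The normalization step for the single-argument form can be sketched in Python as follows. Only the removal of the trailing symbol is shown; the hash itself and the N-level variant are not reproduced. The helper name ``strip_one_trailing_symbol`` is hypothetical:

```python
def strip_one_trailing_symbol(url: str) -> str:
    """Remove a single trailing '/', '?' or '#', if present."""
    if url and url[-1] in "/?#":
        return url[:-1]
    return url


if __name__ == "__main__":
    for url in ("http://example.com/", "http://example.com/a?", "http://example.com"):
        print(strip_one_trailing_symbol(url))
```

Under this reading, only one symbol is removed, so a URL ending in ``//`` keeps one slash.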