Improve docs

2024-11-26 01:22:04 +00:00 · 2021-08-02 12:32:45 +00:00 · 2021-08-02 12:32:45 +00:00 · d9db3dcff8
commit d9db3dcff8
parent 6951e8147d
2 changed files with 22 additions and 8 deletions
--- a/docs/en/sql-reference/functions/nlp-functions.md
+++ b/docs/en/sql-reference/functions/nlp-functions.md
@@ -3,11 +3,14 @@ toc_priority: 67
 toc_title: NLP
 ---
-# Natural Language Processing functions {#nlp-functions}
+# [experimental] Natural Language Processing functions {#nlp-functions}
 !!! warning "Warning"
    This is an experimental feature that is currently in development and is not ready for general use. It will change in unpredictable, backwards-incompatible ways in future releases. Set `allow_experimental_nlp_functions = 1` to enable it.
 ## stem {#stem}
-Performs stemming on a previously tokenized text.
+Performs stemming on a given word.
 **Syntax**
@@ -38,7 +41,7 @@ Result:
 ## lemmatize {#lemmatize}
-Performs lemmatization on a given word.
+Performs lemmatization on a given word. It requires dictionaries to operate, which can be obtained [here](https://github.com/vpodpecan/lemmagen3/tree/master/src/lemmagen3/models).
 **Syntax**
@@ -79,7 +82,11 @@ Configuration:
 ## synonyms {#synonyms}
-Finds synonyms to a given word. 
+Finds synonyms of a given word. There are two types of synonym extensions: `plain` and `wordnet`.
 With the `plain` extension type, we need to provide a path to a simple text file in which each line corresponds to one synonym set. Words on a line must be separated by space or tab characters.
 With the `wordnet` extension type, we need to provide a path to a directory containing a WordNet thesaurus. The thesaurus must contain a WordNet sense index.
 **Syntax**
@@ -89,7 +96,7 @@ synonyms('extension_name', word)
 **Arguments**
-   `extension_name` — Name of the extention in which search will be performed. [String](../../sql-reference/data-types/string.md#string).
+-   `extension_name` — Name of the extension in which search will be performed. [String](../../sql-reference/data-types/string.md#string).
 -   `word` — Word that will be searched in extension. [String](../../sql-reference/data-types/string.md#string).
 **Examples**
--- a/docs/ru/sql-reference/functions/nlp-functions.md
+++ b/docs/ru/sql-reference/functions/nlp-functions.md
@@ -3,7 +3,10 @@ toc_priority: 67
 toc_title: NLP
 ---
-# Functions for working with natural language {#nlp-functions}
+# [experimental] Functions for working with natural language {#nlp-functions}
 !!! warning "Warning"
    The functions for working with natural language are currently an experimental feature. To use them, enable the setting `allow_experimental_nlp_functions = 1`.
 ## stem {#stem}
@@ -38,7 +41,7 @@ Result:
 ## lemmatize {#lemmatize}
-This function performs lemmatization on a given word.
+This function performs lemmatization on a given word. The lemmatizer requires dictionaries, which can be found [here](https://github.com/vpodpecan/lemmagen3/tree/master/src/lemmagen3/models).
 **Syntax**
@@ -79,7 +82,11 @@ SELECT lemmatize('en', 'wolves');
 ## synonyms {#synonyms}
-Finds synonyms of a given word.
+Finds synonyms of a given word. There are two types of dictionary extensions: `plain` and `wordnet`.
 For the `plain` extension type, specify the path to a simple text file in which each line corresponds to one synonym set. Words on a line must be separated by space or tab characters.
 For the `wordnet` extension type, specify the path to a WordNet thesaurus. The thesaurus must contain a WordNet sense index.
 **Syntax**
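
Reviewer note on the `plain` extension format described in the `synonyms` hunks: the description (one synonym set per line, words separated by spaces or tabs) can be illustrated with a small Python sketch. This is only an illustration of the file layout, not the server's implementation; the function name `parse_plain_synonyms`, the sample words, and the choice to merge sets when a word appears on several lines are all assumptions made for the example.

```python
from collections import defaultdict


def parse_plain_synonyms(text):
    """Map each word to the other words that share a line with it.

    Hypothetical helper: each non-empty line is treated as one synonym set,
    with words separated by spaces or tabs, as the docs describe.
    """
    index = defaultdict(set)
    for line in text.splitlines():
        words = line.split()  # no-argument split() handles spaces and tabs
        for word in words:
            # Assumption: a word seen on several lines accumulates all sets.
            index[word].update(w for w in words if w != word)
    return dict(index)


# Made-up sample content: one space-separated set, one tab-separated set.
sample = "important big critical\ncrucial\tessential\n"
index = parse_plain_synonyms(sample)
print(sorted(index["big"]))
```

A file laid out like `sample` is what the `plain` extension path would presumably point to; how the server treats duplicates, case, or empty lines is not stated in the docs and is left open here.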