The following new profile events have been added:
- FileSync - Number of times the F_FULLFSYNC/fsync/fdatasync function was called for files.
- DirectorySync - Number of times the F_FULLFSYNC/fsync/fdatasync function was called for directories.
- FileSyncElapsedMicroseconds - Total time spent waiting for F_FULLFSYNC/fsync/fdatasync syscall for files.
- DirectorySyncElapsedMicroseconds - Total time spent waiting for F_FULLFSYNC/fsync/fdatasync syscall for directories.
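
A minimal sketch of how such a counter pair could be maintained around
the syscall. SyncCounters and timedFsync() are hypothetical names used
only for illustration; they are not the actual ProfileEvents code, and
whether failed calls are counted there is an assumption made here.

```cpp
#include <chrono>
#include <cstdint>
#include <unistd.h>

// Hypothetical stand-ins for the FileSync / FileSyncElapsedMicroseconds events.
struct SyncCounters
{
    uint64_t file_sync = 0;
    uint64_t file_sync_elapsed_microseconds = 0;
};

// Calls fsync(fd), counts the call and adds the time spent waiting for it.
// Returns whatever fsync() returned (0 on success, -1 on error).
inline int timedFsync(int fd, SyncCounters & counters)
{
    const auto start = std::chrono::steady_clock::now();
    const int res = ::fsync(fd);
    const auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(
        std::chrono::steady_clock::now() - start);

    ++counters.file_sync;
    counters.file_sync_elapsed_microseconds += static_cast<uint64_t>(elapsed.count());
    return res;
}
```

The directory variants would follow the same pattern with a descriptor
opened on the directory.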
v2: rewrite test to sh with retries
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
- Introduced with the C++20 <bit> header
- The problem with __builtin_c(l|t)z() is that 0 as input has an
undefined result (*) and the code did not always check. The std::
versions do not have this issue.
- In some cases, we continue to use __builtin_c(l|t)z() (e.g. in
src/Common/BitHelpers.h) because the std:: versions only accept
unsigned inputs (and they also check that) and the casting would be
ugly.
(*) https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
- Updated exception message in ForkWriteBuffer
- Added test case to tests/queries/0_stateless/02346_into_outfile_and_stdout.sh for calling nextImpl more than once
Implementation:
- Added a new buffer, ForkWriteBuffer, which takes a vector of WriteBuffer and writes data to all of them. It uses the buffer of the first element as its own buffer and copies data from the first buffer to all the other buffers.
Testing:
- Updated tests/queries/0_stateless/02346_into_outfile_and_stdout.sh
Documentation:
- Updated the English documentation for SELECT ... INTO OUTFILE with AND STDOUT.
For parts exchange, an error message is currently stored in the
HTTPSession via HTTPSession::attachSessionData(), and on each request
HTTPSession::sessionData() is checked and, if not empty, printed into
the log.
However, it should be reported only once; otherwise you will get this
message over and over, even for different tables. Here is an example of
such messages:
2022.07.13 07:56:09.342997 [ 683 ] {} <Error> test_juq8qk.rmt2 (ede90518-4710-48bb-ab43-caf0b1b157f4): auto DB::StorageReplicatedMergeTree::processQueueEntry(ReplicatedMergeTreeQueue::SelectedEntryPtr)::(anonymous class)::operator()(DB::StorageReplicatedMergeTree::LogEntryPtr &) const: Code: 86. DB::Exception: Received error from remote server /?endpoint=DataPartsExchange%3A%2Ftest%2F01165%2Ftest_juq8qk%2Frmt%2Freplicas%2F1&part=206--1406300905-20220713-1657673769_0_0_0&client_protocol_version=7&compress=false&remote_fs_metadata=s3. HTTP status code: 500 Internal Server Error, body: Code: 236. DB::Exception: Transferring part to replica was cancelled. (ABORTED) (version 22.7.1.1781 (official build)). (RECEIVED_ERROR_FROM_REMOTE_IO_SERVER), Stack trace (when copying this message, always include the lines below):
# this is the time when this message is written ^
2022.07.13 07:56:10.528554 [ 814 ] {} <Information> DatabaseCatalog: Removing metadata /var/lib/clickhouse/metadata_dropped/test_juq8qk.rmt2.ede90518-4710-48bb-ab43-caf0b1b157f4.sql of dropped table test_juq8qk.rmt2 (ede90518-4710-48bb-ab43-caf0b1b157f4)
# now this table has been removed ^
2022.07.13 07:56:27.442003 [ 683 ] {} <Debug> test_1orbeb.mutations_and_quorum2 (bc811afd-1f57-4f9c-a04d-9e1e1b60e891): Fetching part 201901_0_0_0 from /clickhouse/tables/test_1orbeb/test_01090/mutations_and_quorum/replicas/1
# here fetch part is scheduled for another table ^
2022.07.13 07:56:27.442213 [ 683 ] {} <Trace> HTTPCommon: Failed communicating with 250c6af4615d with error 'Received error from remote server /?endpoint=DataPartsExchange%3A%2Ftest%2F01165%2Ftest_juq8qk%2Frmt%2Freplicas%2F1&part=255--1372061156-20220713-1657673769_0_0_0&client_protocol_version=7&compress=false&remote_fs_metadata=s3. HTTP status code: 500 Internal Server Error, body: Code: 236. DB::Exception: Transferring part to replica was cancelled. (ABORTED) (version 22.7.1.1781 (official build))' will try to reconnect session
# however it still reports an error for the already removed table test_juq8qk.rmt
2022.07.13 07:56:27.442246 [ 683 ] {} <Trace> ReadWriteBufferFromHTTP: Sending request to http://250c6af4615d:9009/?endpoint=DataPartsExchange%3A%2Fclickhouse%2Ftables%2Ftest_1orbeb%
# but this is just a noisy message and it is still doing the job correctly ^
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
A simple HelloWorld program with zero includes except iostream triggers
a build of ca. 2000 source files. The reason is that ClickHouse's
top-level CMakeLists.txt overrides "add_executable()" to link all
binaries against "clickhouse_new_delete". This links against
"clickhouse_common_io", which in turn has lots of 3rd party library
dependencies ... Without linking "clickhouse_new_delete", the number of
compiled files for "HelloWorld" goes down to ca. 70.
As an example, the self-extracting-executable needs none of its current
dependencies but other programs may also benefit.
In order to restore access to the original "add_executable()", the
overriding version is now prefixed. There is precedent for a
"clickhouse_" prefix (as opposed to "ch_"), for example
"clickhouse_split_debug_symbols". In general prefixing makes sense also
because overriding CMake commands relies on undocumented behavior and is
considered not-so-great practice (*).
(*) https://crascit.com/2018/09/14/do-not-redefine-cmake-commands/
Implementation:
- Added a bool to ASTQueryWithOutput & patched the usage in ClientBase.
- Added a new buffer, TeeWriteBuffer, which extends WriteBufferFromFile (used to write data to the file) and holds a WriteBufferFromFileDescriptor (used to write data to stdout). The WriteBufferFromFileDescriptor uses the same buffer as TeeWriteBuffer.
- Added a new bool select_into_outfile_and_stdout in ClientBase to enable/disable progress rendering.
Testing:
- Added a test tests/queries/0_stateless/02346_into_outfile_and_stdout.sh
Documentation:
- Updated the English documentation for the new option in SELECT.
First part: updated most of the UTF8, hashing, memory and codecs code,
except utf8lower and upper, which may follow a little later.
This includes a huge amount of research on dealing with movemask. Exact
details and blog post TBD.
cmake/target.cmake defines macros for the supported platforms. This
commit replaces predefined system macros with our own macros:
__linux__ --> OS_LINUX
__APPLE__ --> OS_DARWIN
__FreeBSD__ --> OS_FREEBSD
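
A small sketch of what platform-specific code looks like with these
macros. platformName() is a hypothetical function; the OS_* macros are
assumed to be passed in by the build system, so a stand-alone build
without them takes the fallback branch.

```cpp
#include <string_view>

// Hypothetical example: select a branch by the build-system macros
// instead of the compiler-predefined __linux__ / __APPLE__ / __FreeBSD__.
constexpr std::string_view platformName()
{
#if defined(OS_LINUX)
    return "linux";
#elif defined(OS_DARWIN)
    return "darwin";
#elif defined(OS_FREEBSD)
    return "freebsd";
#else
    return "unknown"; // none of the OS_* macros was defined
#endif
}
```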
It is possible for ReadBufferFromS3::nextImpl() to be called even after
eof(), at least once. In this case, if the file was empty, the local
working_buffer will be null, while impl->working_buffer will be empty
but not null, so the local position() after impl->position() =
position() will be incorrect.
I found this with test_storage_s3/test.py::test_empty_file in a debug
build, where an assertion caught it, so maybe it is worth bringing back
the debug integration build...
v2: fix test_log_family_s3 failures
https://s3.amazonaws.com/clickhouse-test-reports/37801/b5e6e2ddae94d6a7eac551309cb67003dff97df1/integration_tests__asan__actions__[2/3].html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
WriteBufferFromS3::is_finalized is not set if finalizeImpl() throws,
while WriteBuffer::finalized is correctly set even in case of an
exception, so it should be used instead.
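
The difference between the two flags can be sketched like this.
BufferBase and FailingBuffer are hypothetical minimal classes, not the
actual WriteBuffer / WriteBufferFromS3 code; the point is only that a
flag owned by the base class can be set regardless of whether the
implementation throws.

```cpp
#include <stdexcept>

// Hypothetical minimal base class (not ClickHouse's WriteBuffer).
class BufferBase
{
public:
    virtual ~BufferBase() = default;

    void finalize()
    {
        if (finalized)
            return;
        // Set the flag first, so it stays set even if finalizeImpl() throws.
        finalized = true;
        finalizeImpl();
    }

    bool isFinalized() const { return finalized; }

protected:
    virtual void finalizeImpl() = 0;

private:
    bool finalized = false;
};

// Hypothetical derived class with its own flag, set only on success.
class FailingBuffer : public BufferBase
{
public:
    bool fail = true;
    bool is_finalized = false;

protected:
    void finalizeImpl() override
    {
        if (fail)
            throw std::runtime_error("upload failed");
        is_finalized = true; // never reached when the upload fails
    }
};
```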
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>