- Updated exception message in ForkWriteBuffer
- Added test case to tests/queries/0_stateless/02346_into_outfile_and_stdout.sh for calling nextImpl more than once
Implementation:
- Added a new buffer, ForkWriteBuffer, which takes a vector of WriteBuffer objects and writes data to all of them. It uses the buffer of the first element as its own buffer and copies data from that first buffer to all the other buffers.
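The fan-out idea can be sketched roughly as follows. This is a simplified illustration built on std::ostream, not the actual WriteBuffer interface: the class name ForkWriter, its write() method and the forkDemo() helper are hypothetical, and the sketch does not reproduce the sharing of the first element's internal buffer.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in for the fan-out idea: one write() call is
// forwarded to every registered sink.
class ForkWriter
{
public:
    explicit ForkWriter(std::vector<std::ostream *> sinks_)
        : sinks(std::move(sinks_))
    {
        assert(!sinks.empty());
    }

    void write(std::string_view data)
    {
        for (std::ostream * sink : sinks)
            sink->write(data.data(), static_cast<std::streamsize>(data.size()));
    }

private:
    std::vector<std::ostream *> sinks;
};

// Usage: both sinks receive exactly the same bytes.
inline std::string forkDemo()
{
    std::ostringstream file_like;
    std::ostringstream stdout_like;
    ForkWriter writer({&file_like, &stdout_like});
    writer.write("1\t2\n");
    writer.write("3\t4\n");
    return file_like.str() + "|" + stdout_like.str();
}
```

Non-owning pointers keep the sketch short; a real implementation has to decide who owns the underlying buffers.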
Testing:
- Updated tests/queries/0_stateless/02346_into_outfile_and_stdout.sh
Documentation:
- Updated the English documentation for SELECT ... INTO OUTFILE with AND STDOUT.
For parts exchange, an error message is currently stored in the
HTTPSession via HTTPSession::attachSessionData(), and for each request
HTTPSession::sessionData() is checked and, if not empty, printed into
the log.
However, it should be reported only once; otherwise you will get this
message over and over, even for different tables. Here is an example of
such messages:
2022.07.13 07:56:09.342997 [ 683 ] {} <Error> test_juq8qk.rmt2 (ede90518-4710-48bb-ab43-caf0b1b157f4): auto DB::StorageReplicatedMergeTree::processQueueEntry(ReplicatedMergeTreeQueue::SelectedEntryPtr)::(anonymous class)::operator()(DB::StorageReplicatedMergeTree::LogEntryPtr &) const: Code: 86. DB::Exception: Received error from remote server /?endpoint=DataPartsExchange%3A%2Ftest%2F01165%2Ftest_juq8qk%2Frmt%2Freplicas%2F1&part=206--1406300905-20220713-1657673769_0_0_0&client_protocol_version=7&compress=false&remote_fs_metadata=s3. HTTP status code: 500 Internal Server Error, body: Code: 236. DB::Exception: Transferring part to replica was cancelled. (ABORTED) (version 22.7.1.1781 (official build)). (RECEIVED_ERROR_FROM_REMOTE_IO_SERVER), Stack trace (when copying this message, always include the lines below):
# this is the time when this message is written ^
2022.07.13 07:56:10.528554 [ 814 ] {} <Information> DatabaseCatalog: Removing metadata /var/lib/clickhouse/metadata_dropped/test_juq8qk.rmt2.ede90518-4710-48bb-ab43-caf0b1b157f4.sql of dropped table test_juq8qk.rmt2 (ede90518-4710-48bb-ab43-caf0b1b157f4)
# now this table has been removed ^
2022.07.13 07:56:27.442003 [ 683 ] {} <Debug> test_1orbeb.mutations_and_quorum2 (bc811afd-1f57-4f9c-a04d-9e1e1b60e891): Fetching part 201901_0_0_0 from /clickhouse/tables/test_1orbeb/test_01090/mutations_and_quorum/replicas/1
# here fetch part is scheduled for another table ^
2022.07.13 07:56:27.442213 [ 683 ] {} <Trace> HTTPCommon: Failed communicating with 250c6af4615d with error 'Received error from remote server /?endpoint=DataPartsExchange%3A%2Ftest%2F01165%2Ftest_juq8qk%2Frmt%2Freplicas%2F1&part=255--1372061156-20220713-1657673769_0_0_0&client_protocol_version=7&compress=false&remote_fs_metadata=s3. HTTP status code: 500 Internal Server Error, body: Code: 236. DB::Exception: Transferring part to replica was cancelled. (ABORTED) (version 22.7.1.1781 (official build))' will try to reconnect session
# however it still reports an error for the already removed table test_juq8qk.rmt
2022.07.13 07:56:27.442246 [ 683 ] {} <Trace> ReadWriteBufferFromHTTP: Sending request to http://250c6af4615d:9009/?endpoint=DataPartsExchange%3A%2Fclickhouse%2Ftables%2Ftest_1orbeb%
# but this is just a noisy message, and it is still doing the job correctly ^
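Reporting the error only once amounts to consuming the stored message when it is read. A minimal sketch under that assumption: FakeSession, attachError(), takeError() and countReports() are hypothetical names and do not use the real HTTPSession API.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical session that remembers the last error of a failed exchange.
struct FakeSession
{
    std::string stored_error;

    void attachError(std::string message) { stored_error = std::move(message); }

    // Returns the stored error and clears it, so the next request on the
    // same session does not see (and log) it again.
    std::string takeError() { return std::exchange(stored_error, std::string{}); }
};

// Three requests reuse one session; only the first one finds an error.
inline int countReports()
{
    FakeSession session;
    session.attachError("Transferring part to replica was cancelled");

    int reports = 0;
    for (int request = 0; request < 3; ++request)
        if (!session.takeError().empty())
            ++reports;
    return reports;
}
```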
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
A simple HelloWorld program with zero includes except iostream triggers
a build of ca. 2000 source files. The reason is that ClickHouse's
top-level CMakeLists.txt overrides "add_executable()" to link all
binaries against "clickhouse_new_delete". This links against
"clickhouse_common_io", which in turn has lots of 3rd party library
dependencies ... Without linking "clickhouse_new_delete", the number of
compiled files for "HelloWorld" goes down to ca. 70.
As an example, the self-extracting-executable needs none of its current
dependencies but other programs may also benefit.
In order to restore access to the original "add_executable()", the
overriding version is now prefixed. There is precedent for a
"clickhouse_" prefix (as opposed to "ch_"), for example
"clickhouse_split_debug_symbols". In general prefixing makes sense also
because overriding CMake commands relies on undocumented behavior and is
considered not-so-great practice (*).
(*) https://crascit.com/2018/09/14/do-not-redefine-cmake-commands/
Implementation:
- Added a bool to ASTQueryWithOutput & patched the usage in ClientBase.
- Added a new buffer, TeeWriteBuffer, which extends WriteBufferFromFile (used to write data to the file) and holds a WriteBufferFromFileDescriptor (used to write data to stdout). The WriteBufferFromFileDescriptor uses the same buffer as TeeWriteBuffer.
- Added a new bool select_into_outfile_and_stdout in ClientBase to enable/disable progress rendering.
Testing:
- Added a test tests/queries/0_stateless/02346_into_outfile_and_stdout.sh
Documentation:
- Updated the English documentation for the new option in SELECT.
First part: updated most of the UTF8, hashing, memory and codec code,
except utf8lower and upper, which may follow a little later.
That includes a huge amount of research into dealing with movemask.
Exact details and a blog post are TBD.
cmake/target.cmake defines macros for the supported platforms; this
commit changes predefined system macros to our own macros.
__linux__ --> OS_LINUX
__APPLE__ --> OS_DARWIN
__FreeBSD__ --> OS_FREEBSD
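A short sketch of how code might test the project macros instead of the compiler's predefined ones. In the real build the OS_* macros are supplied by the build system; the fallback block below exists only to keep this example self-contained and is an assumption, as is the helper platformName().

```cpp
#include <cassert>
#include <string_view>

// Fallback for this standalone sketch only: derive the OS_* macros from the
// compiler's predefined macros when the build system has not set them.
#if !defined(OS_LINUX) && !defined(OS_DARWIN) && !defined(OS_FREEBSD)
#    if defined(__linux__)
#        define OS_LINUX 1
#    elif defined(__APPLE__)
#        define OS_DARWIN 1
#    elif defined(__FreeBSD__)
#        define OS_FREEBSD 1
#    endif
#endif

// Platform-specific code checks only the OS_* macros from here on.
constexpr std::string_view platformName()
{
#if defined(OS_LINUX)
    return "linux";
#elif defined(OS_DARWIN)
    return "darwin";
#elif defined(OS_FREEBSD)
    return "freebsd";
#else
    return "unknown";
#endif
}
```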
It is possible for ReadBufferFromS3::nextImpl() to be called even after
eof(), at least once. In this case, if the file was empty, the local
working_buffer will be null, while impl.working_buffer will be empty,
but not null, and so the local position() after impl->position() =
position() will be incorrect.
I found this with test_storage_s3/test.py::test_empty_file in a debug
build, where an assertion caught it, so maybe it is worth bringing back
the debug integration build...
v2: fix test_log_family_s3 failures
https://s3.amazonaws.com/clickhouse-test-reports/37801/b5e6e2ddae94d6a7eac551309cb67003dff97df1/integration_tests__asan__actions__[2/3].html
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
WriteBufferFromS3::is_finalized is not set if finalizeImpl() throws,
while WriteBuffer::finalized is correctly set even in case of an
exception, so it should be used instead.
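The difference between the two flags can be illustrated with a small sketch. BaseWriter, FailingWriter, own_flag and flagsDisagreeAfterFailure() are hypothetical stand-ins that mirror only the behaviour described above, not the actual classes.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical base class: the flag is set whether or not finalizeImpl() throws.
class BaseWriter
{
public:
    virtual ~BaseWriter() = default;

    void finalize()
    {
        if (finalized)
            return;
        try
        {
            finalizeImpl();
            finalized = true;
        }
        catch (...)
        {
            finalized = true;
            throw;
        }
    }

    bool isFinalized() const { return finalized; }

protected:
    virtual void finalizeImpl() = 0;

private:
    bool finalized = false;
};

// Hypothetical derived writer with its own flag, set only on success.
class FailingWriter : public BaseWriter
{
public:
    explicit FailingWriter(bool should_fail_) : should_fail(should_fail_) {}

    bool own_flag = false;

protected:
    void finalizeImpl() override
    {
        if (should_fail)
            throw std::runtime_error("upload failed");
        own_flag = true; // never reached when the upload fails
    }

private:
    bool should_fail;
};

// After a failed finalize() the base flag is set, the derived one is not.
inline bool flagsDisagreeAfterFailure()
{
    FailingWriter writer(true);
    try
    {
        writer.finalize();
    }
    catch (const std::runtime_error &)
    {
    }
    return writer.isFinalized() && !writer.own_flag;
}
```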
Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Enable:
- bugprone-lambda-function-name: "Checks for attempts to get the name of
a function from within a lambda expression. The name of a lambda is
always something like operator(), which is almost never what was
intended."
- bugprone-unhandled-self-assignment: "Finds user-defined copy
assignment operators which do not protect the code against
self-assignment either by checking self-assignment explicitly or using
  the copy-and-swap or the copy-and-move method."
- hicpp-invalid-access-moved: "Warns if an object is used after it has
been moved."
- hicpp-use-noexcept: "This check replaces deprecated dynamic exception
specifications with the appropriate noexcept specification (introduced
in C++11)"
- hicpp-use-override: "Adds override (introduced in C++11) to overridden
virtual functions and removes virtual from those functions as it is
not required."
- performance-type-promotion-in-math-fn: "Finds calls to C math library
functions (from math.h or, in C++, cmath) with implicit float to
double promotions."
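As an illustration of what bugprone-unhandled-self-assignment looks for, here is a minimal sketch of a copy assignment operator protected by copy-and-swap. The Blob class and survivesSelfAssignment() are hypothetical examples, not code from this repository.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical owning type; copy-and-swap makes `x = x` harmless.
class Blob
{
public:
    explicit Blob(const char * text)
        : size(std::strlen(text)), data(new char[size + 1])
    {
        std::memcpy(data, text, size + 1);
    }

    Blob(const Blob & other)
        : size(other.size), data(new char[size + 1])
    {
        std::memcpy(data, other.data, size + 1);
    }

    // The argument is taken by value, so the copy is made first; swapping
    // afterwards is safe even when the source is the object itself.
    Blob & operator=(Blob other) noexcept
    {
        std::swap(size, other.size);
        std::swap(data, other.data);
        return *this;
    }

    ~Blob() { delete[] data; }

    const char * c_str() const { return data; }

private:
    std::size_t size; // declared before data: data's initializer uses it
    char * data;
};

inline bool survivesSelfAssignment()
{
    Blob blob("part_0_0_0");
    Blob & alias = blob;
    blob = alias; // self-assignment through an alias
    return std::strcmp(blob.c_str(), "part_0_0_0") == 0;
}
```

An assignment operator that deleted its own buffer before copying from the source would read freed memory in the same situation.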
Split up:
- cppcoreguidelines-*. Some of them may be useful (haven't checked in
  detail), therefore allow toggling them individually.
Disable:
- linuxkernel-*. Obvious.
Official docs:
Some headers from C library were deprecated in C++ and are no longer
welcome in C++ codebases. Some have no effect in C++. For more details
refer to the C++ 14 Standard [depr.c.headers] section. This check
replaces C standard library headers with their C++ alternatives and
removes redundant ones.
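The quoted description matches clang-tidy's modernize-deprecated-headers check. A minimal sketch of the resulting style, with a hypothetical helper nameLength():

```cpp
#include <cassert>
#include <cstddef> // instead of <stddef.h>
#include <cstring> // instead of <string.h>

// With the C++ headers, the library names are used through namespace std.
inline std::size_t nameLength(const char * name)
{
    return std::strlen(name);
}
```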
The original motivation for this commit was that shared_ptr_helper used
std::shared_ptr<>() which does two heap allocations instead of
make_shared<>() which does a single allocation. It turned out that
1. the affected code (--> Storages/) is not on a hot path (rendering the
performance argument moot ...)
2. yet copying Storage objects is potentially dangerous and was
previously allowed.
Hence, this change
- removes shared_ptr_helper and as a result all inherited create() methods,
- instead, Storage objects are now created using make_shared<>() by the
caller (for that to work, many constructors had to be made public), and
- all Storage classes were marked as noncopyable using boost::noncopyable.
In sum, we are (likely) not making things faster but the code becomes
cleaner and harder to misuse.
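The resulting pattern can be sketched as follows. StorageSketch and createStorage() are hypothetical names; the copy operations are deleted by hand here only to avoid a Boost dependency in the example, whereas the change itself uses boost::noncopyable.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical storage class: public constructor, copying disabled.
class StorageSketch
{
public:
    explicit StorageSketch(std::string name_) : name(std::move(name_)) {}

    StorageSketch(const StorageSketch &) = delete;
    StorageSketch & operator=(const StorageSketch &) = delete;

    const std::string & getName() const { return name; }

private:
    std::string name;
};

// Accidental copies are now rejected at compile time.
static_assert(!std::is_copy_constructible_v<StorageSketch>);
static_assert(!std::is_copy_assignable_v<StorageSketch>);

// The caller creates the object directly; make_shared places the object and
// its control block in a single allocation.
inline std::shared_ptr<StorageSketch> createStorage(std::string name)
{
    return std::make_shared<StorageSketch>(std::move(name));
}
```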