Commit Graph

1447 Commits

Author SHA1 Message Date
avogar
2583e6d3ce Use string_view 2022-09-28 13:14:54 +00:00
Robert Schulze
09c62f6728
Consolidate config_formats.h into config.h
Less duplication, less confusion ...
2022-09-28 12:59:05 +00:00
Robert Schulze
78fc36ca49
Generate config.h into ${CONFIG_INCLUDE_PATH}
This makes the target location consistent with other auto-generated
files like config_formats.h, config_core.h, and config_functions.h and
simplifies the build of clickhouse_common.
2022-09-28 12:48:26 +00:00
Kruglov Pavel
f1ac2d66be
Merge branch 'master' into json-object-each-row 2022-09-28 14:15:02 +02:00
avogar
1bd7e531db Better exception message for duplicate column names in schema inference 2022-09-28 12:07:25 +00:00
avogar
6a1cb604c4 Style 2022-09-22 17:06:56 +00:00
avogar
4f32ef9bb7 Add docs 2022-09-22 17:04:42 +00:00
avogar
d3d06251a3 Add setting to obtain object name as column value in JSONObjectEachRow format 2022-09-22 16:48:54 +00:00
avogar
f23a77156f Check file path for path traversal attacks in errors logger for input formats 2022-09-22 13:56:51 +00:00
Kruglov Pavel
55d7addcfe
Merge branch 'master' into fix-format-row 2022-09-22 12:32:58 +02:00
Kruglov Pavel
2c83abaaba
Merge pull request #41614 from ClickHouse/Avogar-patch-1
Fix typos in JSON formats after #40910
2022-09-22 10:58:47 +02:00
Alexey Milovidov
45afacdae4
Merge pull request #41186 from ClickHouse/fix-three-fourth-of-trash
Fix more than half of the trash
2022-09-22 07:28:26 +03:00
Arthur Passos
cf1ed58710 Use separate functions for parquet time32 and time64 2022-09-21 14:56:11 +02:00
Vladimir C
efa34b4013 Fix style 2022-09-21 14:56:11 +02:00
Arthur Passos
c0914a39a7 Add Parquet Time32/64 conversion to CH DateTime32/64 2022-09-21 14:56:11 +02:00
Kruglov Pavel
dcb8fbc3f8
Fix JSONEachRow 2022-09-21 14:25:34 +02:00
Kruglov Pavel
95135e1e31
Fix typos in JSON formats after #40910 2022-09-21 14:24:26 +02:00
Kruglov Pavel
22e11aef2d
Merge pull request #40910 from Avogar/new-json-formats
Add new JSON formats, add improvements and refactoring
2022-09-21 14:19:08 +02:00
avogar
6239a1a235 Fix build 2022-09-21 11:29:00 +00:00
avogar
f956e7915e Fix tests 2022-09-20 20:37:30 +00:00
Alexey Milovidov
45bd3cfc30 Merge branch 'master' into fix-three-fourth-of-trash 2022-09-20 21:27:41 +02:00
avogar
b86aec41d4 Remove unused file after renaming 2022-09-20 13:54:54 +00:00
avogar
868ce8bc16 Fix comments, make better naming, add docs, add setting output_format_json_quote_64bit_floats 2022-09-20 13:49:17 +00:00
avogar
a7de3daa13 Fix tests 2022-09-19 14:13:46 +00:00
Kruglov Pavel
57f0dc1f89
Merge branch 'master' into fix-format-row 2022-09-19 14:37:58 +02:00
Alexey Milovidov
730655d4fd Fix 8/9 of trash 2022-09-19 08:53:20 +02:00
Alexey Milovidov
91baedf03a Fix 6/7 of trash 2022-09-19 08:53:20 +02:00
Alexey Milovidov
84f42e0874 Fix 3/4 of trash 2022-09-19 08:50:53 +02:00
Alexey Milovidov
81e8cb4be6
Merge branch 'master' into fix-bug-orc 2022-09-19 06:38:17 +03:00
Alexey Milovidov
d4b9fe41be
Merge pull request #41457 from ClickHouse/remove-trash-5
Remove trash from Field
2022-09-19 06:36:48 +03:00
Alexey Milovidov
8764fa4439 Fix very strange behavior of Apache ORC 2022-09-18 08:25:25 +02:00
Alexey Milovidov
791de6592b Remove trash from Field 2022-09-18 05:16:08 +02:00
avogar
1de7b65b97 Fix tests 2022-09-16 14:05:08 +00:00
Kruglov Pavel
2d4a6b38af
Merge branch 'master' into fix-totals-extremes 2022-09-16 15:03:59 +02:00
Alexey Milovidov
da01982652
Merge pull request #41046 from azat/build/llvm-15
Switch to llvm/clang 15
2022-09-16 07:31:06 +03:00
Kruglov Pavel
73cf72a5a4
Merge pull request #41309 from Avogar/fix-msgpack
Add column type check before UUID insertion in MsgPack format
2022-09-15 11:37:57 +02:00
avogar
59e7eb084c Add column type check before UUID insertion in MsgPack format 2022-09-14 11:15:10 +00:00
Kruglov Pavel
3396ff6c3a
Merge pull request #40516 from zjial/record_errors_for_import_by_csv
Record errors while reading text formats (CSV, TSV).
2022-09-14 12:52:32 +02:00
Kruglov Pavel
110be0688e
Merge pull request #40909 from ClickHouse/Avogar-patch-1
Make better exception message in schema inference
2022-09-13 14:44:29 +02:00
Kruglov Pavel
3f4e998802
Merge branch 'master' into fix-format-row 2022-09-13 14:37:10 +02:00
Kruglov Pavel
17621b5607
Merge branch 'master' into fix-totals-extremes 2022-09-13 14:36:31 +02:00
zhenjial
5841d9e9b0 sync before destruct 2022-09-13 15:53:24 +08:00
zhenjial
67c08e3e22 sync before destruct 2022-09-13 15:06:22 +08:00
zhenjial
16c8cd0bd3 wait write finish 2022-09-13 14:19:40 +08:00
Kruglov Pavel
702ddff5f6
Fix style 2022-09-12 19:38:34 +02:00
Kruglov Pavel
060adfbe93
Merge branch 'master' into new-json-formats 2022-09-12 19:37:46 +02:00
avogar
8ac2fc7b26 Don't outout totals/extremes in all row formats, update docs 2022-09-12 17:21:40 +00:00
Alexey Milovidov
2aedd41023
Remove strange code (#40195)
* Remove strange code

* Even more code removal

* Fix style

* Remove even more code

* Simplify code by making it slower

* Attempt to do something

* Attempt to do something

* Well do something with this horrible trash

* Add a test
2022-09-12 16:29:23 +02:00
avogar
846e6b0f61 Fix tests 2022-09-12 11:27:11 +00:00
Kruglov Pavel
6535301888
Merge branch 'master' into Avogar-patch-1 2022-09-12 12:23:28 +02:00
Azat Khuzhin
c1e70169d2 Suppress clang-analyzer-cplusplus.NewDelete in MsgPackRowInputFormat
Appartently there is some issue with clang-15, since even the following
example shows error [1].

  [1]: https://gist.github.com/azat/027f0e949ea836fc2e6269113ceb8752

clang-tidy report [1]:

    FAILED: src/CMakeFiles/dbms.dir/Processors/Formats/Impl/MsgPackRowInputFormat.cpp.o                                                                                                            /usr/bin/cmake -E __run_co_compile --launcher="prlimit;--as=10000000000;--data=5000000000;--cpu=1000;/usr/bin/ccache" --tidy=/usr/bin/clang-tidy-15 --source=/ch/src/Processors/Formats/Impl/MsgPackRowInputFormat.cpp -- /usr/bin/clang++-15 --target=x86_64-linux-gnu --sysroot=/ch/cmake/linux/../../contrib/sysroot/linux-x86_64/x86_64-linux-gnu/libc  -DAWS_SDK_VERSION_MAJOR=1 -DAWS_SDK_VERSION_MINOR=7 -DAWS_SDK_VERSION_PATCH=231 -DBOOST_ASIO_HAS_STD_INVOKE_RESULT=1 -DBOOST_ASIO_STANDALONE=1 -DCARES_STATICLIB -DCONFIGDIR=\"\" -DENABLE_MULTITARGET_CODE=1 -DENABLE_OPENSSL_ENCRYPTION -DHAS_RESERVED_IDENTIFIER -DHAVE_CONFIG_H -DLIBSASL_EXPORTS=1 -DLZ4_DISABLE_DEPRECATE_WARNINGS=1 -DOBSOLETE_CRAM_ATTR=1 -DOBSOLETE_DIGEST_ATTR=1 -DPLUGINDIR=\"\" -DPOCO_ENABLE_CPP11 -DPOCO_HAVE_FD_EPOLL -DPOCO_OS_FAMILY_UNIX -DSASLAUTHD_CONF_FILE_DEFAULT=\"\" -DSNAPPY_CODEC_AVAILABLE -DSTD_EXCEPTION_HAS_STACK_TRACE=1 -DUNALIGNED_OK -DWITH_COVERAGE=0 -DWITH_GZFILEOP -DX86_64 -DZLIB_COMPAT -D_LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS -Iincludes/configs -I/ch/src -Isrc -Isrc/Core/include -I/ch/base/glibc-compatibility/memcpy -I/ch/base/base/.. -Ibase/base/.. -I/ch/contrib/cctz/include -I/ch/base/pcg-random/. -I/ch/contrib/miniselect/include -I/ch/contrib/zstd/lib -Icontrib/cyrus-sasl-cmake -I/ch/contrib/lz4/lib -I/ch/src/Common/mysqlxx/. -Icontrib/c-ares -I/ch/contrib/c-ares -I/ch/contrib/c-ares/include -isystem /ch/contrib/libcxx/include -isystem /ch/contrib/libcxxabi/include -isystem /ch/contrib/libunwind/include -isystem /ch/contrib/libdivide/. -isystem /ch/contrib/jemalloc-cmake/include -isystem /ch/contrib/llvm/llvm/include -isystem contrib/llvm/llvm/include -isystem /ch/contrib/abseil-cpp -isystem /ch/contrib/croaring/cpp -isystem /ch/contrib/croaring/include -isystem /ch/contrib/cityhash102/include -isystem /ch/contrib/boost -isystem /ch/contrib/poco/Net/include -isystem /ch/contrib/poco/Foundation/include -isystem /ch/contrib/poco/NetSSL_OpenSSL/include -isystem /ch/contrib/poco/Crypto/include -isystem /ch/contrib/boringssl/include -isystem /ch/contrib/poco/Util/include -isystem /ch/contrib/poco/JSON/include -isystem /ch/contrib/poco/XML/include -isystem /ch/contrib/replxx/include -isystem /ch/contrib/fmtlib-cmake/../fmtlib/include -isystem /ch/contrib/magic_enum/include -isystem /ch/contrib/double-conversion -isystem /ch/contrib/dragonbox/include -isystem /ch/contrib/re2 -isystem contrib/re2-cmake -isystem /ch/contrib/zlib-ng -isystem contrib/zlib-ng-cmake -isystem /ch/contrib/pdqsort -isystem /ch/contrib/xz/src/liblzma/api -isystem /ch/contrib/aws-c-common/include -isystem /ch/contrib/aws-c-event-stream/include -isystem /ch/contrib/aws/aws-cpp-sdk-s3/include -isystem /ch/contrib/aws/aws-cpp-sdk-core/include -isystem contrib/aws-s3-cmake/include -isystem /ch/contrib/snappy -isystem contrib/snappy-cmake -isystem /ch/contrib/msgpack-c/include -isystem /ch/contrib/fast_float/include -isystem /ch/contrib/librdkafka-cmake/include -isystem /ch/contrib/librdkafka/src -isystem contrib/librdkafka-cmake/auxdir -isystem /ch/contrib/cppkafka/include -isystem /ch/contrib/nats-io/src -isystem /ch/contrib/nats-io/src/adapters -isystem /ch/contrib/nats-io/src/include -isystem /ch/contrib/nats-io/src/unix -isystem /ch/contrib/libuv/include -isystem /ch/contrib/krb5/src/include -isystem contrib/krb5-cmake/include -isystem /ch/contrib/NuRaft/include -isystem /ch/contrib/poco/MongoDB/include -isystem contrib/mariadb-connector-c-cmake/include-public -isystem /ch/contrib/mariadb-connector-c/include -isystem /ch/contrib/mariadb-connector-c/libmariadb -isystem /ch/contrib/icu/icu4c/source/i18n -isystem /ch/contrib/icu/icu4c/source/common -isystem /ch/contrib/capnproto/c++/src -isystem /ch/contrib/arrow/cpp/src -isystem /ch/contrib/arrow-cmake/cpp/src -isystem contrib/arrow-cmake/cpp/src -isystem contrib/arrow-cmake/../orc/c++/include -isystem /ch/contrib/orc/c++/include -isystem contrib/avro-cmake/include -isystem /ch/contrib/avro/lang/c++/api -isystem /ch/contrib/openldap-cmake/linux_x86_64/include -isystem /ch/contrib/openldap/include -isystem /ch/contrib/sparsehash-c11 -isystem /ch/contrib/protobuf/src -isystem src/Server/grpc_protos -isystem /ch/contrib/grpc/include -isystem /ch/contrib/libhdfs3/include -isystem /ch/contrib/hive-metastore -isystem /ch/contrib/thrift/lib/cpp/src -isystem contrib/thrift-cmake -isystem /ch/contrib/azure/sdk/core/azure-core/inc-isystem /ch/contrib/azure/sdk/identity/azure-identity/inc -isystem /ch/contrib/azure/sdk/storage/azure-storage-common/inc -isystem /ch/contrib/azure/sdk/storage/azure-storage-blobs/inc -isystem /ch/contrib/s2geometry/src -isystem /ch/contrib/AMQP-CPP/include -isystem /ch/contrib/AMQP-CPP -isystem /ch/contrib/sqlite-amalgamation -isystem /ch/contrib/rocksdb/include -isystem /ch/contrib/libpqxx/include -isystem /ch/contrib/libpq -isystem /ch/contrib/libpq/include -isystem /ch/contrib/libstemmer_c/include -isystem /ch/contrib/wordnet-blast -isystem /ch/contrib/lemmagen-c/include -isystem /ch/contrib/simdjson/include -isystem /ch/contrib/rapidjson/include -isystem /ch/contrib/consistent-hashing --gcc-toolchain=/ch/cmake/linux/../../contrib/sysroot/linux-x86_64 -std=c++20 -fdiagnostics-color=always -Xclang -fuse-ctor-homing -fsized-deallocation  -UNDEBUG -gdwarf-aranges -pipe -mssse3 -msse4.1 -msse4.2 -mpclmul -mpopcnt -fasynchronous-unwind-tables -falign-functions=32 -mbranches-within-32B-boundaries -fdiagnostics-absolute-paths -fstrict-vtable-pointers -fexperimental-new-pass-manager -Wall -Wextra -Weverything -Wpedantic -Wno-zero -length-array -Wno-c++98-compat-pedantic -Wno-c++98-compat -Wno-c++20-compat -Wno-conversion -Wno-ctad-maybe-unsupported -Wno-disabled-macro-expansion -Wno-documentation-unknown-command -Wno-double-promotion -Wno-exit-time-destructors -Wno-float-equal -Wno-global-constructors -Wno-missing-prototypes -Wno-missing-variable-declarations -Wno-padded -Wno-switch-enum -Wno-undefined-func-template -Wno-unused-template -Wno-vla -Wno-weak-template-vtables -Wno-weak-vtables -Wno-thread-safety-negative -g -O0 -g -gdwarf-4 -fno-inline  -D_LIBCPP_DEBUG=0   -D OS_LINUX -I/ch/base -I/ch/contrib/magic_enum/include -include /ch/src/Core/iostream_debug_helpers.h -Werror -nostdinc++ -std=gnu++2a -MD -MT src/CMakeFiles/dbms.dir/Processors/Formats/Impl/MsgPackRowInputFormat.cpp.o -MF src/CMakeFiles/dbms.dir/Processors/Formats/Impl/MsgPackRowInputFormat.cpp.o.d -o src/CMakeFiles/dbms.dir/Processors/Formats/Impl/MsgPackRowInputFormat.cpp.o -c /ch/src/Processors/Formats/Impl/MsgPackRowInputFormat.cpp

    /ch/contrib/msgpack-c/include/msgpack/v1/detail/cpp11_zone.hpp:195:9: error: Attempt to free released memory [clang-analyzer-cplusplus.NewDelete,-warnings-as-errors]
            ::free(p);
            ^
    /ch/src/Processors/Formats/Impl/MsgPackRowInputFormat.cpp:509:5: note: Taking false branch
        if (buf.eof())
        ^
    /ch/src/Processors/Formats/Impl/MsgPackRowInputFormat.cpp:514:24: note: Assuming 'i' is not equal to field 'number_of_columns'
        for (size_t i = 0; i != number_of_columns; ++i)
                           ^~~~~~~~~~~~~~~~~~~~~~
    /ch/src/Processors/Formats/Impl/MsgPackRowInputFormat.cpp:514:5: note: Loop condition is true.  Entering loop body
        for (size_t i = 0; i != number_of_columns; ++i)
        ^
    /ch/src/Processors/Formats/Impl/MsgPackRowInputFormat.cpp:516:30: note: Calling 'MsgPackSchemaReader::readObject'
            auto object_handle = readObject();
                                 ^~~~~~~~~~~~
    /ch/src/Processors/Formats/Impl/MsgPackRowInputFormat.cpp:426:5: note: Taking false branch
        if (buf.eof())
        ^
    /ch/src/Processors/Formats/Impl/MsgPackRowInputFormat.cpp:433:5: note: Loop condition is true.  Entering loop body
        while (need_more_data)
        ^
    /ch/src/Processors/Formats/Impl/MsgPackRowInputFormat.cpp:438:29: note: Calling 'unpack'
                object_handle = msgpack::unpack(buf.position(), buf.buffer().end() - buf.position(), offset);
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    /ch/contrib/msgpack-c/include/msgpack/v3/unpack.hpp:52:12: note: Calling 'unpack'
        return msgpack::v3::unpack(data, len, off, referenced, f, user_data, limit);
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    /ch/contrib/msgpack-c/include/msgpack/v3/unpack.hpp:35:5: note: Control jumps to the 'default' case at line 40
        switch(ret) {
        ^
    /ch/contrib/msgpack-c/include/msgpack/v3/unpack.hpp:41:9: note:  Execution continues on line 43
            break;
            ^
    /ch/contrib/msgpack-c/include/msgpack/v3/unpack.hpp:43:35: note: Calling '~unique_ptr'
        return msgpack::object_handle();
                                      ^
    /ch/contrib/libcxx/include/__memory/unique_ptr.h:269:19: note: Calling 'unique_ptr::reset'
      ~unique_ptr() { reset(); }
                      ^~~~~~~
    /ch/contrib/libcxx/include/__memory/unique_ptr.h:314:9: note: '__tmp' is non-null
        if (__tmp)
            ^~~~~
    /ch/contrib/libcxx/include/__memory/unique_ptr.h:314:5: note: Taking true branch
        if (__tmp)
        ^
    /ch/contrib/libcxx/include/__memory/unique_ptr.h:315:7: note: Calling 'default_delete::operator()'
          __ptr_.second()(__tmp);
          ^~~~~~~~~~~~~~~~~~~~~~
    /ch/contrib/libcxx/include/__memory/unique_ptr.h:54:5: note: Memory is released
        delete __ptr;
        ^~~~~~~~~~~~
    /ch/contrib/libcxx/include/__memory/unique_ptr.h:54:5: note: Calling 'zone::operator delete'
        delete __ptr;
        ^~~~~~~~~~~~
    /ch/contrib/msgpack-c/include/msgpack/v1/detail/cpp11_zone.hpp:195:9: note: Attempt to free released memory
            ::free(p);
            ^~~~~~~~~

  [1]: https://s3.amazonaws.com/clickhouse-builds/41046/9677898b3b234a5ba0371edaf719ea8890d084ff/binary_tidy/build_log.log

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
2022-09-10 21:38:35 +02:00
Alexey Milovidov
fa62c7e982 Fix half of trash 2022-09-10 04:08:16 +02:00
avogar
6d5f9e5554 Proper implementation for rowFormat function, delete rowFormatNoNewLine function 2022-09-09 17:42:33 +00:00
Kruglov Pavel
c33aa54032
Fix 2022-09-09 17:53:26 +02:00
Kruglov Pavel
f669d305b6
Fix comment 2022-09-09 17:45:47 +02:00
zhenjial
bd9fabc3f7 code optimization, add test 2022-09-09 23:27:42 +08:00
avogar
ad68b7be0f Better 2022-09-09 15:01:45 +00:00
avogar
46a0318a36 Support JSONColumnsWithMetadata input format 2022-09-08 17:58:44 +00:00
zhenjial
469ceaa156 code optimization 2022-09-09 00:47:43 +08:00
avogar
c380decbbb Make better, add new settings 2022-09-08 16:07:20 +00:00
avogar
545be27f81 Merge branch 'master' of github.com:ClickHouse/ClickHouse into new-json-formats 2022-09-08 13:48:10 +00:00
Anton Popov
f0a404e2c8 Merge remote-tracking branch 'upstream/master' into HEAD 2022-09-06 15:51:16 +00:00
zhenjial
0f788d98f5 new implementation 2022-09-06 20:39:54 +08:00
zhenjial
18db90dcfc Record errors while reading text formats (CSV, TSV). 2022-09-06 17:19:15 +08:00
Kruglov Pavel
77071381e4
fix build 2022-09-02 16:37:33 +02:00
avogar
afc34dca41 Add new JSON formats, add improvements and refactoring 2022-09-01 19:00:24 +00:00
Kruglov Pavel
7a4a65bc36
Make better exception message in schema inference 2022-09-01 20:36:08 +02:00
Kruglov Pavel
f53aa86a20
Merge pull request #40485 from arthurpassos/fix-parquet-chunked-array-deserialization
Add support for extended (chunked) arrays for Parquet format
2022-09-01 19:40:40 +02:00
Alexey Milovidov
6b2e227c8b Fix integration test 2022-08-27 22:28:38 +02:00
Kruglov Pavel
e6e7f5db93
Merge pull request #40491 from mini4/fix-settings-input_format_tsv_skip_first_lines
Fix bug in settings input_format_tsv_skip_first_lines of format TSV
2022-08-24 15:57:45 +02:00
Kruglov Pavel
0781e8b4f7
Merge pull request #40534 from Avogar/nested-in-avro
Support reading Array(Record) into flatten nested table in Avro
2022-08-24 13:33:12 +02:00
kgurjev
f62c2c3221 Fix bug in settings input_format_tsv_skip_first_lines of format TSV 2022-08-24 10:02:57 +03:00
avogar
29a887578b Fix 2022-08-23 11:42:57 +00:00
avogar
581e569d04 Support reading Array(Record) into flatten nested table in Avro 2022-08-23 11:05:02 +00:00
Arthur Passos
f8e2ab0a20 Use FileReader::GetRecordBatchReader instead of FileReader::ReadRowGroup to parse Parquet 2022-08-22 08:21:32 -03:00
avogar
612ffaffde Make schema inference cache better, respect format settings that can change the schema 2022-08-19 16:39:13 +00:00
Kruglov Pavel
b67cb9e378
Merge pull request #40173 from Avogar/arrow-dict
Improve and fix dictionaries in Arrow format
2022-08-18 20:54:55 +02:00
Kruglov Pavel
09a2ff8843
Merge pull request #40293 from joshuataylor/feature/arrow-large-binary-string
Add support for LARGE_BINARY/LARGE_STRING with Arrow
2022-08-18 14:01:58 +02:00
avogar
a6318cecd5 Fix hive test 2022-08-18 11:32:42 +00:00
Nikolai Kochetov
5a85531ef7
Merge pull request #38286 from Avogar/schema-inference-cache
Add schema inference cache for s3/hdfs/file/url
2022-08-18 13:07:50 +02:00
Yakov Olkhovskiy
40fd6e189a
call readColumnWithStringData 2022-08-17 09:54:01 -04:00
Kruglov Pavel
19af748737
Fix typo 2022-08-17 14:29:09 +02:00
Kruglov Pavel
00d04456ff
Try reduce code duplication 2022-08-17 14:28:15 +02:00
avogar
8dd54c043d Merge branch 'master' of github.com:ClickHouse/ClickHouse into schema-inference-cache 2022-08-17 11:47:40 +00:00
Josh Taylor
628d2bbff5 Add support for LARGE_BINARY/LARGE_STRING with Arrow 2022-08-17 10:25:06 +08:00
avogar
99d8727335 Fix tests 2022-08-16 12:56:51 +00:00
avogar
936c457734 Remove unnended field 2022-08-16 09:51:52 +00:00
avogar
e1ff996ec3 Allow to specify structure hints in schema inference 2022-08-16 09:46:57 +00:00
Kruglov Pavel
2c5c0d6d47
Fix typo 2022-08-15 19:55:28 +02:00
avogar
ca0d883c0f Fix possible segfault in CapnProto input format 2022-08-15 15:36:18 +00:00
avogar
c160033837 Fix 2022-08-15 11:38:28 +00:00
avogar
78e197063c Better example 2022-08-12 19:08:36 +00:00
avogar
763f84b623 Remove bad comment 2022-08-12 19:05:57 +00:00
avogar
9addded80e Remove logging 2022-08-12 19:01:02 +00:00
avogar
000336622a Remove logging 2022-08-12 18:59:52 +00:00
avogar
398576e9c9 Improve and fix dictionaries in Arrow format 2022-08-12 18:56:21 +00:00
Kseniia Sumarokova
a6cfc7bc3b
Merge pull request #34651 from alexX512/master
New caching strategies
2022-08-12 17:23:37 +02:00
Anton Popov
3fdf428834
Merge pull request #39186 from Avogar/numbers-schema-inference
Add new features in schema inference
2022-08-11 00:53:54 +02:00
Arthur Passos
c4d8ad2222 Add docs 2022-08-09 15:58:46 -03:00
Arthur Passos
e724e7bef6 Update arrow dict to lc comment 2022-08-09 15:52:37 -03:00
Arthur Passos
6eb89fd780 Fix both arrow dict de-serialization and dict of nullable de-serialization 2022-08-09 15:06:22 -03:00
Arthur Passos
be1e32c3f1
Merge branch 'ClickHouse:master' into fix_arrow_column_dictionary_to_ch_lc 2022-08-09 15:04:06 -03:00
Kruglov Pavel
088e8cf9bd
Merge branch 'master' into numbers-schema-inference 2022-08-09 14:00:36 +02:00
Kruglov Pavel
99b9e85a8f
Merge pull request #39646 from Avogar/more-formats
Add more Pretty formats
2022-08-09 13:59:47 +02:00
avogar
1304e3487c Add comments, remove unneded stuff 2022-08-08 13:43:14 +00:00
avogar
2f95726b06 Fix comments 2022-08-08 12:41:00 +00:00
alexX512
6bf29cb610 Change class LRUCache to class CachBase. Check running CacheBase with default pcahce policy SLRU 2022-08-07 19:59:30 +00:00
avogar
9b1a267203 Refactor, remove TTL, add size limit, add system table and system query 2022-08-05 16:20:15 +00:00
Arthur Passos
62d48053c0 Use insertDefault instead of insert(0) 2022-08-04 15:53:44 -03:00
Arthur Passos
c307e9a228 Fix ArrowColumn dictionary to CH low cardinality conversion 2022-08-04 15:34:44 -03:00
Kruglov Pavel
235649cb98
Merge pull request #39458 from Avogar/fix-cancel-insert-into-function
Fix WriteBuffer finalize when cancel insert into function
2022-08-04 13:02:08 +02:00
Kruglov Pavel
6b2186bfeb
Merge branch 'master' into numbers-schema-inference 2022-08-02 19:34:53 +02:00
Kruglov Pavel
42136b7630
Merge pull request #39647 from Avogar/fix-arrow-strings
Fix strings in dictionary in Arrow format
2022-08-01 12:46:07 +02:00
Alexey Milovidov
4828be7fc4 Fix double escaping in the metadata of FORMAT JSON 2022-07-30 23:56:41 +02:00
Kruglov Pavel
ccd1e1bdb8
Merge branch 'master' into fix-cancel-insert-into-function 2022-07-29 20:27:32 +02:00
avogar
01a309d4e3 Fix strings in dictionary in Arrow format 2022-07-27 12:02:27 +00:00
avogar
f925046dc4 Add more Pretty formats 2022-07-27 11:37:02 +00:00
Kruglov Pavel
381ea139c2
Merge branch 'master' into schema-inference-cache 2022-07-27 11:35:36 +02:00
Kruglov Pavel
53159db782
Merge branch 'master' into numbers-schema-inference 2022-07-26 12:32:49 +02:00
Kruglov Pavel
83c7da6e88
Merge branch 'master' into fix-protobuf-capnp-empty-message 2022-07-25 13:02:41 +02:00
Alexey Milovidov
388d06fda1
Merge pull request #39535 from ClickHouse/stringref
Less usage of StringRef
2022-07-25 04:06:11 +03:00
Robert Schulze
4333750985
Less usage of StringRef
... replaced by std::string_view, see #39262
2022-07-24 18:33:52 +00:00
Alexander Tokmakov
bed2206ae9
Merge pull request #39460 from ClickHouse/remove_some_dead_and_commented_code
Remove some dead and commented code
2022-07-22 13:24:34 +03:00
avogar
794aa691bc Merge branch 'master' of github.com:ClickHouse/ClickHouse into fix-protobuf-capnp-empty-message 2022-07-21 17:04:37 +00:00
Kruglov Pavel
9252f42b4c
Merge branch 'master' into schema-inference-cache 2022-07-21 18:59:14 +02:00
avogar
fd534aa3fa wqMerge branch 'master' of github.com:ClickHouse/ClickHouse into numbers-schema-inference 2022-07-21 15:43:17 +00:00
Alexander Tokmakov
a8da5d96fc remove some dead and commented code 2022-07-21 15:05:48 +02:00
avogar
6b541aa98f Fix WriteBuffer finalize when cancel insert into function 2022-07-21 12:18:37 +00:00
Nikolai Kochetov
e15967e9db
Merge pull request #38475 from ClickHouse/additional-filters
Additional filters for a table (from setting)
2022-07-21 07:52:04 +02:00
Alexey Milovidov
844042fc18
Merge pull request #39433 from ClickHouse/revert-39396-try-fix-write-buffer-terminate
Revert "Fix WriteBuffer finalize in destructor when cacnel query"
2022-07-21 07:04:07 +03:00
Alexey Milovidov
dcda9d3bd1
Merge pull request #39365 from Avogar/fix-capnproto-abort
Avoid possible abort() in CapnProto on exception descruction
2022-07-21 05:20:45 +03:00
Kruglov Pavel
92995a832b
Revert "Fix WriteBuffer finalize in destructor when cacnel query" 2022-07-21 01:45:16 +02:00
Nikolai Kochetov
91043351aa Fixing build. 2022-07-20 20:30:16 +00:00
Kruglov Pavel
46da17ca8c
Merge branch 'master' into numbers-schema-inference 2022-07-20 13:32:39 +02:00
Kruglov Pavel
3046cd6d29
Merge branch 'master' into schema-inference-cache 2022-07-20 13:30:42 +02:00
avogar
784ee11594 Add settings to skip fields with unsupported types in Protobuf/CapnProto schema inference 2022-07-20 11:16:25 +00:00
Kruglov Pavel
a1b63b4a02
Fix style 2022-07-20 12:07:22 +02:00
Kruglov Pavel
7722b647b7
Merge pull request #39396 from Avogar/try-fix-write-buffer-terminate
Fix WriteBuffer finalize in destructor when cacnel query
2022-07-20 12:06:20 +02:00
avogar
5c16d6b553 Fix WriteBuffer finalize in destructor when cacnel query 2022-07-19 19:21:30 +00:00
avogar
4f020654be Get rid of unneded ifdefs 2022-07-19 12:12:40 +00:00
avogar
6eb234a1cc Avoid abort() in capnproto on exception descruction 2022-07-18 19:53:24 +00:00
Robert Schulze
32637cb1b9
Fix build 2022-07-18 07:58:59 +00:00
Robert Schulze
13482af4ee
First try at reducing the use of StringRef
- to be replaced by std::string_view
- suggested in #39262
2022-07-17 17:26:02 +00:00
Robert Schulze
deda29b46b
Pass const StringRef by value, not by reference
See #39224
2022-07-15 11:34:56 +00:00
Kruglov Pavel
b38241b08a
Merge branch 'master' into schema-inference-cache 2022-07-14 12:29:54 +02:00
avogar
7cde9d3b40 Add new features in schema inference 2022-07-13 15:57:55 +00:00
vdimir
63aebd17b2 Remove TabSeparatedSorted 2022-07-12 20:22:35 +02:00
vdimir
46df417c2e Fix empty line sorting in TabSeparatedSorted 2022-07-12 20:22:35 +02:00
vdimir
f51b25b262 clickhouse test ignore order via special format 2022-07-12 20:22:35 +02:00
Kruglov Pavel
4080f055b6
Merge pull request #38477 from Avogar/sql-insert-format
Add SQLInsert output format
2022-07-04 15:06:33 +02:00
avogar
5b0fd31c64 Put column names in quotes 2022-06-30 16:14:30 +00:00
Antonio Andelic
de264117fd
Merge pull request #38118 from bigo-sg/storagehive_struct_type
Add struct type support in `StorageHive`
2022-06-30 09:11:13 +02:00
mergify[bot]
9482c99ab8
Merge branch 'master' into sql-insert-format 2022-06-29 11:03:07 +00:00
Robert Schulze
f692ead6ad
Don't use std::unique_lock unless we have to
Replace where possible by std::lock_guard which is more light-weight.
2022-06-28 19:19:06 +00:00
avogar
9bb68bc6de Add SQLInsert output format 2022-06-27 18:31:57 +00:00
avogar
5155262a16 Add some additional information to cache keys 2022-06-27 12:43:24 +00:00
lgbo-ustc
cd8e5c7c49 update headers 2022-06-23 17:43:54 +08:00
lgbo-ustc
96e6f9a2d0 fixed code style 2022-06-23 16:10:01 +08:00
lgbo-ustc
c1770c22b9 Merge remote-tracking branch 'ck/master' into storagehive_struct_type 2022-06-23 15:54:20 +08:00
Kseniia Sumarokova
e48ce50863
Update ArrowBufferedStreams.cpp 2022-06-20 19:12:51 +02:00
kssenii
5dd1bb2fd8 improvements for getFileSize 2022-06-20 15:22:56 +02:00
lgbo-ustc
8c629085e4 simplified code 2022-06-17 09:36:59 +08:00
lgbo-ustc
35d534c213 nested struct in struct 2022-06-16 16:45:05 +08:00
Alexey Milovidov
5e9e5a4eaf
Merge pull request #37525 from Avogar/avro-structs
Support Maps and Records, allow to insert null as default in Avro format
2022-06-15 00:04:29 +03:00
Kseniia Sumarokova
0ae2168fb6
Merge pull request #36328 from bigo-sg/async_hdfs_read_buffer
Apply read_method 'threadpool' for StorageHive
2022-06-10 15:04:21 +02:00
taiyang-li
9fd9ff66bd remove some test code 2022-06-09 09:55:50 +08:00
taiyang-li
c65c56fd48 fix typo 2022-06-07 09:58:29 +08:00
mergify[bot]
ddf7210ecc
Merge branch 'master' into remove-useless-code-2 2022-06-03 13:58:45 +00:00
taiyang-li
f202c35311 Merge branch 'master' into async_hdfs_read_buffer 2022-06-03 17:52:09 +08:00
Paul Loyd
32d267ec6c
Stop removing UTF-8 BOM in RowBinary* formats
Fixes #37420
2022-06-01 13:12:55 +08:00
Maksim Kita
bacee7f19c
Merge pull request #37195 from kitaisreal/merging-sorted-algorithm-single-column-specialization
MergingSortedAlgorithm single column specialization
2022-05-31 16:46:18 +02:00
taiyang-li
047387bf1c fix 2 bugs: 1. select count(1) from hive_table; 2. select _file, _path from hive_table 2022-05-31 17:39:02 +08:00
avogar
4c9812d4c1 Allow to skip some of the first rows in CSV/TSV formats 2022-05-25 15:00:11 +00:00
avogar
038a422aeb Add setting to insert null as default 2022-05-25 12:56:59 +00:00
avogar
7817d6aea3 Support Maps and Records in Avro format 2022-05-25 11:20:28 +00:00
Maksim Kita
83554d1f2d Fixed style 2022-05-25 13:05:39 +02:00
Maksim Kita
9a9df26eec Fixed tests 2022-05-25 11:44:37 +02:00
Kruglov Pavel
6c9a524f6b
Merge pull request #37192 from Avogar/formats-with-names
Improve performance and memory usage for select of subset of columns for some formats
2022-05-24 13:28:14 +02:00
avogar
3651ef93fe Fix performance test 2022-05-23 17:42:13 +00:00
avogar
034c7122be Mark JSONColumns supports subset of columns 2022-05-23 15:26:01 +00:00
avogar
ce4adb447f Fix named tuples output in ORC/Arrow/Parquet formats 2022-05-23 14:21:08 +00:00
Kruglov Pavel
f539fb835d
Merge branch 'master' into formats-with-names 2022-05-23 12:14:20 +02:00
Kruglov Pavel
ce48e8e102
Merge pull request #36975 from Avogar/json-columns-formats
Add columnar JSON formats
2022-05-23 12:11:28 +02:00
Kruglov Pavel
9bc74439c1
Merge pull request #37327 from Avogar/arrow-strings
Allow to use String type instead of Binary in Arrow/Parquet/ORC formats
2022-05-23 12:05:33 +02:00
mergify[bot]
747aa5575c
Merge branch 'master' into remove-useless-code-2 2022-05-22 17:41:57 +00:00
Kruglov Pavel
704c78063f
Fix special build 2022-05-20 19:54:02 +02:00
Anton Popov
cb0e6c2718 mark all operators bool() as explicit 2022-05-20 15:29:54 +00:00
avogar
566d1b15fd Merge branch 'master' of github.com:ClickHouse/ClickHouse into formats-with-names 2022-05-20 13:54:52 +00:00
avogar
d2304f5d15 Make better 2022-05-20 12:07:29 +00:00
avogar
a6a430c5ee Merge branch 'master' of github.com:ClickHouse/ClickHouse into json-columns-formats 2022-05-20 11:08:30 +00:00
mergify[bot]
1ac4199e78
Merge branch 'master' into arrow-strings 2022-05-20 10:43:33 +00:00
avogar
cd6a29897e Apply input_format_max_rows_to_read_for_schema_inference for all files in globs in total 2022-05-18 17:56:36 +00:00
Kruglov Pavel
d81616ff65
Remove unnecessary include 2022-05-18 17:44:39 +02:00
avogar
a0369fb9a6 Allow to use String type instead of Binary in Arrow/Parquet/ORC formats 2022-05-18 14:51:21 +00:00
avogar
12010a81b7 Make better 2022-05-18 09:25:26 +00:00
Robert Schulze
0c55ac76d2
A few clangtidy updates
Enable:

- bugprone-lambda-function-name: "Checks for attempts to get the name of
  a function from within a lambda expression. The name of a lambda is
  always something like operator(), which is almost never what was
  intended."

- bugprone-unhandled-self-assignment: "Finds user-defined copy
  assignment operators which do not protect the code against
  self-assignment either by checking self-assignment explicitly or using
  the copy-and-swap or the copy-and-move method.""

- hicpp-invalid-access-moved: "Warns if an object is used after it has
  been moved."

- hicpp-use-noexcept: "This check replaces deprecated dynamic exception
  specifications with the appropriate noexcept specification (introduced
  in C++11)"

- hicpp-use-override: "Adds override (introduced in C++11) to overridden
  virtual functions and removes virtual from those functions as it is
  not required."

- performance-type-promotion-in-math-fn: "Finds calls to C math library
  functions (from math.h or, in C++, cmath) with implicit float to
  double promotions."

Split up:

- cppcoreguidelines-*. Some of them may be useful (haven't checked in
  detail), therefore allow to toggle them individually.

Disable:

- linuxkernel-*. Obvious.
2022-05-17 20:56:57 +02:00
Kruglov Pavel
8572879c37
Remove redundant code 2022-05-16 17:58:20 +02:00
Robert Schulze
e3cfec5b09
Merge remote-tracking branch 'origin/master' into clangtidies 2022-05-16 10:12:50 +02:00
avogar
68bb07d166 Better naming 2022-05-13 18:39:19 +00:00
avogar
cef13c2c02 Allow to skip unknown columns in Native format 2022-05-13 14:27:15 +00:00
avogar
b17fec659a Improve performance and memory usage for select of subset of columns for some formats 2022-05-13 13:51:28 +00:00
mergify[bot]
4a661b6e78
Merge branch 'master' into json-columns-formats 2022-05-13 11:32:03 +00:00
avogar
02679c7222 Fix tests 2022-05-10 16:27:59 +00:00
avogar
ea0362b3a3 Fix tests 2022-05-10 16:20:38 +00:00
avogar
9abdacdd2e Remove logging 2022-05-09 13:30:41 +00:00
avogar
054318b555 Fix invalid output LowCardinality -> ArrowDictionary 2022-05-09 13:29:42 +00:00
avogar
1e8d7ae749 Fix 2022-05-09 11:29:40 +00:00
avogar
04fdd75c56 Make JSONColumns frormats mono block by default 2022-05-09 11:13:44 +00:00
Robert Schulze
1b81bb49b4
Enable clang-tidy modernize-deprecated-headers & hicpp-deprecated-headers
Official docs:

  Some headers from C library were deprecated in C++ and are no longer
  welcome in C++ codebases. Some have no effect in C++. For more details
  refer to the C++ 14 Standard [depr.c.headers] section. This check
  replaces C standard library headers with their C++ alternatives and
  removes redundant ones.
2022-05-09 08:23:33 +02:00
Robert Schulze
7d3913f350
Enable clang-tidy bugprone-assert-side-effect
Official docs:

  Finds assert() with side effect. The condition of assert() is
  evaluated only in debug builds so a condition with side effect can
  cause different behavior in debug / release builds.
2022-05-08 19:15:55 +02:00
avogar
3a13c3e372 Fix comments 2022-05-06 16:50:34 +00:00
avogar
62a7ba3f26 Add columnar JSON formats 2022-05-06 16:48:48 +00:00
Anton Popov
515f68eead Merge remote-tracking branch 'upstream/master' into dynamic-columns-14 2022-05-06 16:10:51 +00:00
Anton Popov
566c08086a support Object type inside other types 2022-05-06 14:44:00 +00:00
Anton Popov
13e8db6299
Merge pull request #36762 from CurtizJ/dynamic-columns-12
Fix insertion to columns of type `Object` from multiple files
2022-05-06 14:14:32 +02:00
Kruglov Pavel
77e55c344c
Merge pull request #36667 from Avogar/mysqldump-format
Add MySQLDump input format
2022-05-04 19:49:48 +02:00
Kruglov Pavel
ffec3655fe
Fix special build 2022-05-04 17:14:15 +02:00
mergify[bot]
64084b5e32
Merge branch 'master' into shared_ptr_helper3 2022-05-03 20:46:16 +00:00
Dmitry Novik
5ba7a55c18
Merge pull request #36650 from bigo-sg/hive_text_parallel_parsing
Parallel parsing of hive text format
2022-05-03 15:56:28 +02:00
Kruglov Pavel
d613f7eab0
Merge branch 'master' into mysqldump-format 2022-05-02 13:31:57 +02:00
Antonio Andelic
a1a22b0007
Merge pull request #35149 from ContentSquare/nullables_with_proto3
Nullables with proto3 using Google wrappers
2022-05-02 09:49:37 +02:00
Robert Schulze
330212e0f4
Remove inherited create() method + disallow copying
The original motivation for this commit was that shared_ptr_helper used
std::shared_ptr<>() which does two heap allocations instead of
make_shared<>() which does a single allocation. Turned out that
1. the affected code (--> Storages/) is not on a hot path (rendering the
performance argument moot ...)
2. yet copying Storage objects is potentially dangerous and was
   previously allowed.

Hence, this change

- removes shared_ptr_helper and as a result all inherited create() methods,

- instead, Storage objects are now created using make_shared<>() by the
  caller (for that to work, many constructors had to be made public), and

- all Storage classes were marked as noncopyable using boost::noncopyable.

In sum, we are (likely) not making things faster but the code becomes
cleaner and harder to misuse.
2022-05-02 08:46:52 +02:00
Robert Schulze
89aa9ae00f
Fixed clang-tidy check "bugprone-branch-clone"
The check is currently *not* part of .clang-tidy. It complains about:
(1) "switch has multiple consecutive identical branches"
(2) "repeated branch in conditional chain"

About (1): Lots of findings in switches were about redundant
"[[fallthrough]]" in places where the compiler would not warn anyways. I
have cleaned these up.

About (2): In if-else_if-else chains, fixing the warning would usually
mean concatenating multiple if-conditions. As this would reduce
readability in most cases, I did not fix these places.

Because of (2), I also refrained from adding "bugprone-branch-clone" to
.clang-tidy.
2022-04-30 19:40:28 +02:00
mergify[bot]
cc08ccb420
Merge branch 'master' into remove-useless-code-2 2022-04-30 12:48:15 +00:00
Jakub Kuklis
a1f2dd6d34 Adding two settings in place of one, improvements to the test clarity 2022-04-29 10:01:51 +02:00
Jakub Kuklis
507ba1042c Adding a setting to enable Google wrappers special treatment 2022-04-29 10:01:51 +02:00
Jakub Kuklis
6d5c1e2fc0 Adding a setting to enable special treatment of google wrappers 2022-04-29 10:01:50 +02:00
Amos Bird
4a5e4274f0
base should not depend on Common 2022-04-29 10:26:35 +08:00
Anton Popov
1fc51e09ff fix insertion to column of type Object from multiple files via table function 2022-04-28 18:51:13 +00:00
avogar
d295de1689 Fix comments and test 2022-04-28 14:59:35 +00:00
Kruglov Pavel
4d08587559
Merge branch 'master' into mysqldump-format 2022-04-28 15:58:18 +02:00
Kseniia Sumarokova
4c371f710e
Merge pull request #36676 from kssenii/refactor-with-size-buffer
Better version of SeekableReadBufferWithSize
2022-04-28 13:44:25 +02:00
taiyang-li
99aa5fdc81 remove useless code 2022-04-27 11:15:04 +08:00
vdimir
81b86799e7
Fixup PrometheusTextOutputFormat 2022-04-26 14:57:37 +00:00
vdimir
d5d98ed951
PrometheusTextOutputFormat: support lables, histograms and summaries 2022-04-26 14:57:36 +00:00
vdimir
be0aa06958
Add output format Prometheus 2022-04-26 14:57:35 +00:00
kssenii
9d364cdce2 Refactor 2022-04-26 15:33:53 +02:00
Kruglov Pavel
a462d94157
Fix error codes 2022-04-26 13:25:07 +02:00
Kruglov Pavel
e3b222b519
Fix typo 2022-04-26 13:24:10 +02:00
avogar
33d845dade Add MySQLDump input format 2022-04-26 10:42:56 +00:00
taiyang-li
b7cc344d62 remove useless codes 2022-04-26 14:42:43 +08:00
taiyang-li
99dee35b6e parallel parsing of hive text format 2022-04-26 14:33:10 +08:00
Kruglov Pavel
34c342fdd3
Merge pull request #36205 from Avogar/improve-globs
Some refactoring around schema inference with globs
2022-04-25 13:14:46 +02:00
avogar
80eacc8533 Merge branch 'master' of github.com:ClickHouse/ClickHouse into improve-json-schema-inference 2022-04-22 17:18:44 +00:00
Kseniia Sumarokova
33bb48106f
Merge pull request #36314 from CurtizJ/print-bad-filenames
Show names of erroneous files in case of parsing errors while executing table functions
2022-04-22 13:24:55 +02:00
mergify[bot]
e38a3c3595
Merge branch 'master' into alias 2022-04-21 15:02:30 +00:00
Maksim Kita
57444fc7d3
Merge pull request #36444 from rschu1ze/clang-tidy-fixes
Clang tidy fixes
2022-04-21 16:11:27 +02:00
mergify[bot]
1ba1cad5cf
Merge branch 'master' into improve-globs 2022-04-21 11:52:13 +00:00
Kruglov Pavel
a6186f7ba4
Merge pull request #36333 from ClickHouse/bool-sync-after-error
Fix tech debt for Bool and Map data types
2022-04-21 13:32:14 +02:00
Kruglov Pavel
813e228fcc
Merge branch 'master' into improve-globs 2022-04-20 16:31:47 +02:00
Anton Popov
d4df38a0e6 fix tests 2022-04-20 14:13:04 +00:00
Alexander Tokmakov
1d30a97fd2 Merge branch 'master' into remove-useless-code-2 2022-04-20 11:45:56 +02:00
Robert Schulze
b24ca8de52
Fix various clang-tidy warnings
When I tried to add cool new clang-tidy 14 warnings, I noticed that the
current clang-tidy settings already produce a ton of warnings. This
commit addresses many of these. Almost all of them were non-critical,
i.e. C vs. C++ style casts.
2022-04-20 10:29:05 +02:00
Anton Popov
bee4ca9b62 add more tests for error diagnostics in files 2022-04-19 15:56:34 +00:00
Anton Popov
3e361c9759 Merge remote-tracking branch 'upstream/master' into HEAD 2022-04-19 14:18:04 +00:00
Alexey Milovidov
f6ab2bd523
Merge pull request #36312 from ClickHouse/remove-arcadia
Remove remaining parts of Arcadia
2022-04-18 07:02:54 +03:00
Alexey Milovidov
242919eddd Remove abbreviation 2022-04-18 01:02:49 +02:00
mergify[bot]
4fed033dca
Merge branch 'master' into alias 2022-04-17 14:37:04 +00:00
fenglv
2392d4e2b5 fix 2022-04-16 16:08:28 +00:00
Alexey Milovidov
7206838c75 Fix tech debt for Bool and Map data types 2022-04-16 16:09:04 +02:00
fenglv
58111115c5 fix style 2022-04-16 06:21:09 +00:00
fenglv
74ef1b0198 Add aliases JSONLines and NDJSON for JSONEachRow 2022-04-16 06:01:07 +00:00
Anton Popov
2de6668b3f show names of erroneous files 2022-04-16 00:10:47 +00:00
Alexey Milovidov
cbeeb7ec4f Remove Arcadia 2022-04-16 00:20:47 +02:00
avogar
42726639f3 Check ORC/Parquet/Arrow format magic bytes before loading file in memory 2022-04-13 19:27:38 +00:00
avogar
f5f1db86d9 Remove commented code 2022-04-13 19:15:52 +00:00
avogar
8b60aeb7bc Improve schema inference for json objects 2022-04-13 19:13:40 +00:00
avogar
1c065f8c7a Some refactoring around schema inference with globs 2022-04-13 17:02:48 +00:00
Alexey Milovidov
a54c01cf72 Remove useless code in ReplicatedMergeTreeRestartingThread 2022-04-11 00:44:30 +02:00
avogar
1c783ed88a Resolve conflicts 2022-04-07 12:17:48 +00:00
avogar
d2017a63b1 Merge branch 'master' of github.com:ClickHouse/ClickHouse into improve-schema-inference 2022-04-07 11:36:40 +00:00
Kruglov Pavel
f3f8f27db5
Merge pull request #35735 from Avogar/allow-read-bools-as-numbers
Allow to infer and parse bools as numbers in JSON input formats
2022-04-07 13:20:49 +02:00
taiyang-li
2ef316801c Merge branch 'master' into use_minmax_index 2022-04-07 10:53:25 +08:00
Kruglov Pavel
ec2213493f
Merge branch 'master' into allow-read-bools-as-numbers 2022-04-06 14:53:02 +02:00
Kruglov Pavel
9141066de3
Merge branch 'master' into improve-schema-inference 2022-04-06 13:51:07 +02:00
taiyang-li
acb9f1632e suppoort skip splits in orc and parquet 2022-04-06 16:40:22 +08:00
mergify[bot]
1e43e26fa1
Merge branch 'master' into fix-order 2022-04-02 12:00:29 +00:00
avogar
ab2a963287 Merge branch 'master' of github.com:ClickHouse/ClickHouse into allow-read-bools-as-numbers 2022-03-31 14:09:43 +00:00
Kruglov Pavel
252d66e80d
Update src/Processors/Formats/ISchemaReader.cpp
Co-authored-by: Antonio Andelic <antonio2368@users.noreply.github.com>
2022-03-31 16:08:37 +02:00
mergify[bot]
24ade25d61
Merge branch 'master' into improve-schema-inference 2022-03-31 13:42:47 +00:00
avogar
836e7dae67 Fix bug in indexes of not presented columns in -WithNames formats 2022-03-31 12:24:40 +00:00
avogar
d272356324 Minor code improvement 2022-03-31 10:55:09 +00:00
avogar
74275da7ee Make better 2022-03-31 10:52:34 +00:00
avogar
000f3043e7 Make better 2022-03-29 17:40:07 +00:00
avogar
3fc36627b3 Allow to infer and parse bools as numbers in JSON input formats 2022-03-29 17:37:31 +00:00
avogar
ce97ccbfb9 Improve schema inference for JSONEachRow and TSKV formats 2022-03-29 14:47:51 +00:00
Antonio Andelic
9990abb76a Use compile-time check for Exception messages, fix wrong messages 2022-03-29 13:16:11 +00:00
avogar
97f5033ea9 Fix tests 2022-03-29 13:07:37 +00:00
mergify[bot]
343588de2c
Merge branch 'master' into improve-schema-inference 2022-03-29 13:06:00 +00:00
Anton Popov
9610139477
Merge pull request #35629 from CurtizJ/dynamic-columns-5
Support schema inference for type `Object` in format `JSONEachRow`
2022-03-29 14:17:09 +02:00
Anton Popov
d677635cd8
Merge pull request #35592 from CurtizJ/dynamic-columns-4
Add parallel parsing and schema inference for format `JSONAsObject`
2022-03-28 19:29:55 +02:00
Anton Popov
67195bfdd5 support schema inference for type Object in format JSONEachRow 2022-03-25 21:51:53 +00:00
avogar
6fb3c3be04 Fix comments and build 2022-03-25 12:02:21 +00:00
Kruglov Pavel
d45143ffe0
Merge branch 'master' into improve-schema-inference 2022-03-25 12:05:40 +01:00
Vladimir C
ae92963b15
Fix build error in Formats/ISchemaReader.cpp 2022-03-25 11:30:25 +01:00
Kruglov Pavel
287e1a6efc
Update src/Processors/Formats/ISchemaReader.cpp
Co-authored-by: Vladimir C <vdimir@clickhouse.com>
2022-03-24 19:16:52 +01:00
Kruglov Pavel
6a9df9d471
Update src/Processors/Formats/ISchemaReader.cpp
Co-authored-by: Vladimir C <vdimir@clickhouse.com>
2022-03-24 19:16:47 +01:00
Kruglov Pavel
3b801a4093
Update src/Processors/Formats/ISchemaReader.cpp
Co-authored-by: Vladimir C <vdimir@clickhouse.com>
2022-03-24 19:16:41 +01:00
Anton Popov
78100abc5f add parallel parsing and schema inference for type Object 2022-03-24 17:51:35 +00:00
avogar
557edbd172 Add some improvements and fixes in schema inference 2022-03-24 12:54:12 +00:00