Commit Graph

213 Commits

Author SHA1 Message Date
lgbo-ustc
35d534c213 nested struct in struct 2022-06-16 16:45:05 +08:00
lgbo-ustc
e115e3f731 remove unused header 2022-06-16 09:53:04 +08:00
lgbo-ustc
655e42c9bc remove trace logs 2022-06-16 09:44:41 +08:00
lgbo-ustc
4f13521aa6 struct type support for storage hive 2022-06-16 09:35:34 +08:00
taiyang-li
57b6cf6c09 fix build error 2022-06-08 09:58:09 +08:00
taiyang-li
73a484256e Merge branch 'master' into async_hdfs_read_buffer 2022-06-07 12:16:46 +08:00
taiyang-li
c65c56fd48 fix typo 2022-06-07 09:58:29 +08:00
taiyang-li
b36d9f8143 refactor readinto 2022-06-06 12:58:22 +08:00
taiyang-li
047387bf1c fix 2 bugs: 1. select count(1) from hive_table; 2. select _file, _path from hive_table 2022-05-31 17:39:02 +08:00
taiyang-li
dbb8a09825 merge master and solve conflict 2022-05-30 10:47:04 +08:00
taiyang-li
ea450b86cb add some prefetch metric codes 2022-05-27 18:06:40 +08:00
taiyang-li
561c87222d add prefetch for hive text 2022-05-26 11:04:35 +08:00
Nikolai Kochetov
1b85f2c1d6 Merge branch 'master' into refactor-read-metrics-and-callbacks 2022-05-25 16:27:40 +02:00
taiyang-li
29e2157469 change as request 2022-05-23 18:42:54 +08:00
taiyang-li
14f84f02d5 Merge branch 'master' into async_hdfs_read_buffer 2022-05-23 18:36:21 +08:00
Nikolai Kochetov
56feef01e7 Move some resources 2022-05-20 19:49:31 +00:00
avogar
a4cf07708c Fix comments 2022-05-20 14:57:27 +00:00
avogar
566d1b15fd Merge branch 'master' of github.com:ClickHouse/ClickHouse into formats-with-names 2022-05-20 13:54:52 +00:00
Kseniia Sumarokova
d4ad138a04
Merge pull request #37103 from bigo-sg/hive_partition_key_read
optimization for reading hive file when all columns to read are partition keys
2022-05-19 14:24:00 +02:00
lgbo-ustc
1497e08301 update exception msg 2022-05-17 19:27:43 +08:00
taiyang-li
14ab7eb5a3 merge master and solve conflict 2022-05-17 16:28:08 +08:00
lgbo-ustc
0b3468a150 TOO_MANY_PARTITIONS 2022-05-17 15:50:03 +08:00
lgbo-ustc
f4f4a2d85b reuse setting max_partitions_to_read 2022-05-17 15:49:14 +08:00
lgbo-ustc
a161a21992 add max partitions check for each hive table 2022-05-17 15:37:32 +08:00
lgbo-ustc
d8ad9ad2a6 update codes 2022-05-17 09:27:03 +08:00
avogar
68bb07d166 Better naming 2022-05-13 18:39:19 +00:00
avogar
b17fec659a Improve performance and memory usage for select of subset of columns for some formats 2022-05-13 13:51:28 +00:00
lgbo-ustc
4411fd87c8 reading optimization when all columns to read are partition keys 2022-05-11 16:49:30 +08:00
Robert Schulze
e583099158 Fix build, pt. V 2022-05-04 15:50:52 +02:00
mergify[bot]
64084b5e32 Merge branch 'master' into shared_ptr_helper3 2022-05-03 20:46:16 +00:00
Dmitry Novik
5ba7a55c18
Merge pull request #36650 from bigo-sg/hive_text_parallel_parsing
Parallel parsing of hive text format
2022-05-03 15:56:28 +02:00
Robert Schulze
777b5bc15b
Don't let storages inherit from boost::noncopyable
... IStorage has deleted copy ctor / assignment already
2022-05-03 09:07:08 +02:00
Robert Schulze
330212e0f4
Remove inherited create() method + disallow copying
The original motivation for this commit was that shared_ptr_helper used
std::shared_ptr<>() which does two heap allocations instead of
make_shared<>() which does a single allocation. Turned out that
1. the affected code (--> Storages/) is not on a hot path (rendering the
   performance argument moot ...)
2. yet copying Storage objects is potentially dangerous and was
   previously allowed.

Hence, this change

- removes shared_ptr_helper and as a result all inherited create() methods,

- instead, Storage objects are now created using make_shared<>() by the
  caller (for that to work, many constructors had to be made public), and

- all Storage classes were marked as noncopyable using boost::noncopyable.

In sum, we are (likely) not making things faster but the code becomes
cleaner and harder to misuse.
2022-05-02 08:46:52 +02:00
Amos Bird
4a5e4274f0 base should not depend on Common 2022-04-29 10:26:35 +08:00
taiyang-li
99dee35b6e parallel parsing of hive text format 2022-04-26 14:33:10 +08:00
taiyang-li
9e37764bb0 fix build error 2022-04-22 12:37:01 +08:00
taiyang-li
883139ff69 fix code syle 2022-04-21 18:31:13 +08:00
taiyang-li
94d0358b15 fix code style 2022-04-21 17:40:55 +08:00
taiyang-li
169dae2a35 ready for review 2022-04-21 17:37:12 +08:00
taiyang-li
9a251a820b fix bug 2022-04-21 17:13:59 +08:00
taiyang-li
87e76a1757 add swtich contril 2022-04-21 12:30:14 +08:00
taiyang-li
3b722eea7a profileing 2022-04-20 20:59:36 +08:00
taiyang-li
d533b569ad debugging 2022-04-20 19:58:31 +08:00
taiyang-li
56fe6fa608 finish dev 2022-04-20 17:49:53 +08:00
taiyang-li
fb6a56d4b0 finish debug 2022-04-20 16:24:18 +08:00
taiyang-li
0ad2a76fae Merge remote-tracking branch 'origin/master' into async_hdfs_read_buffer 2022-04-16 18:45:39 +08:00
taiyang-li
cd83fd5f8a tobe debug 2022-04-16 18:41:18 +08:00
avogar
42726639f3 Check ORC/Parquet/Arrow format magic bytes before loading file in memory 2022-04-13 19:27:38 +00:00
taiyang-li
090fd72884 fix bug 2022-04-11 11:19:31 +08:00
taiyang-li
7e89f760f3 remove useless code 2022-04-09 10:43:58 +08:00
taiyang-li
70f4503ba5 use global context for cache 2022-04-09 00:28:07 +08:00
taiyang-li
cd807da838 finish test 2022-04-09 00:15:33 +08:00
taiyang-li
e319df1799 finish dev 2022-04-08 23:58:56 +08:00
taiyang-li
2c99ef0ecc refactor HiveTableMetadata 2022-04-08 23:04:24 +08:00
taiyang-li
2e6f0db825 first commit 2022-04-08 15:12:24 +08:00
taiyang-li
87507ec9e8 fix conflicts 2022-04-07 20:52:54 +08:00
taiyang-li
d7c79c3a54 merge master and solve conflicts 2022-04-07 20:48:16 +08:00
taiyang-li
e9de38c52b fix bug 2022-04-07 20:45:07 +08:00
taiyang-li
2dc420c66b rename some symbols in hivefile 2022-04-07 15:48:42 +08:00
taiyang-li
4763a39802 merge bigo-sg/use_minmax_index and solve conflict 2022-04-07 15:45:28 +08:00
taiyang-li
046a2ba51c rename some symboles 2022-04-07 15:35:08 +08:00
taiyang-li
ad074fee91 merge use_minmax_index and solve conflict 2022-04-07 15:19:45 +08:00
taiyang-li
f02d769343 fix build error 2022-04-07 14:29:35 +08:00
taiyang-li
acc7046d54 remove some useless virtual and rename some functions in HiveFile 2022-04-07 11:46:57 +08:00
taiyang-li
df00bd214d merge bigo-sg/use_minmax_index and solve conflict 2022-04-07 11:18:24 +08:00
taiyang-li
2ef316801c Merge branch 'master' into use_minmax_index 2022-04-07 10:53:25 +08:00
taiyang-li
0b0c8ef09e add integration tests 2022-04-06 18:47:34 +08:00
taiyang-li
acb9f1632e suppoort skip splits in orc and parquet 2022-04-06 16:40:22 +08:00
taiyang-li
43e8af697a fix code style 2022-04-06 11:41:16 +08:00
taiyang-li
38f149b533 optimize trivial count hive query 2022-04-04 15:28:26 +08:00
taiyang-li
4e2d5f1841 Merge remote-tracking branch 'bigo-sg/use_minmax_index' into optimize_trivial_hive_query 2022-04-04 10:42:28 +08:00
taiyang-li
cbfc0f6bac fix typo 2022-04-04 10:42:22 +08:00
Kseniia Sumarokova
d3b3294872
Merge pull request #35365 from bigo-sg/improve_access_type
Improve check access in table functions
2022-04-01 10:47:02 +02:00
taiyang-li
16bb4c4ad0 respect remote_url_allow_hosts for hive 2022-03-30 15:33:59 +08:00
taiyang-li
0af6fdb576 fix building 2022-03-30 11:28:21 +08:00
taiyang-li
b79cec6806 Merge branch 'use_minmax_index' of https://github.com/bigo-sg/ClickHouse into use_minmax_index 2022-03-25 23:33:49 +08:00
taiyang-li
eee8949150 fix code 2022-03-25 23:33:46 +08:00
taiyang-li
4aaa361f2e Merge remote-tracking branch 'ck/master' into use_minmax_index 2022-03-25 22:48:03 +08:00
李扬
9cc528b01f Update HiveFile.h 2022-03-23 21:57:58 +08:00
taiyang-li
ae3d55c6a2 merge master and fix conflict 2022-03-23 14:31:12 +08:00
taiyang-li
68d5b538aa fix build error 2022-03-23 11:15:42 +08:00
lgbo-ustc
967d5a8055 Merge remote-tracking branch 'ck/master' into hive_column_pruning_bug 2022-03-21 19:52:06 +08:00
taiyang-li
49b6f3dfc5 merge master and fix conflict 2022-03-21 15:05:43 +08:00
taiyang-li
bf05b94940 fix build 2022-03-21 15:03:28 +08:00
taiyang-li
7d50bd1eb3 add access type hive 2022-03-21 11:19:45 +08:00
lgbo-ustc
f7aa40af5b update codes 2022-03-21 09:25:20 +08:00
lgbo-ustc
e78cfe3b26 update codes 2022-03-18 15:07:52 +08:00
lgbo-ustc
abfaa82bca fixed hive query bugs 2022-03-15 12:01:34 +08:00
Anton Popov
36ec379aeb Merge remote-tracking branch 'upstream/master' into HEAD 2022-03-14 16:28:35 +00:00
Kseniia Sumarokova
e6ee891c9c
Merge pull request #34957 from bigo-sg/hive_random_access_file_cache
Optimization for first time to read a random access readbuffer in hive
2022-03-10 11:36:22 +01:00
Kseniia Sumarokova
1eb2bae792
Merge pull request #34954 from bigo-sg/hive_read_columns_pruning
read columns pruning for hive
2022-03-08 10:17:24 +01:00
lgbo-ustc
256e92ffee Merge remote-tracking branch 'ck/master' into hive_random_access_file_cache 2022-03-08 14:14:40 +08:00
lgbo-ustc
a8cfc2458a update codes 2022-03-08 11:55:15 +08:00
Kseniia Sumarokova
5511f2f6e6
Merge pull request #34940 from bigo-sg/hive_client_connection_pool
Use connection pool in HiveMetastoreClient
2022-03-07 17:14:56 +01:00
Kseniia Sumarokova
28b9ec01c0
Merge pull request #34945 from bigo-sg/hive_bug_fixed
unexpected result when use `in` in hive query
2022-03-07 17:13:11 +01:00
lgbo-ustc
8ae5296ee8 fixed compile errors 2022-03-07 17:26:48 +08:00
lgbo-ustc
cfeedd2cb5 fixed code style 2022-03-07 12:28:31 +08:00
lgbo-ustc
c37eedd887 update codes 2022-03-07 10:30:54 +08:00
lgbo-ustc
75a50a30c4 update codes 2022-03-07 09:43:53 +08:00
lgbo-ustc
d907b70cc4 update codes: get actual read block 2022-03-07 09:26:05 +08:00