Commit Graph

5 Commits

Author SHA1 Message Date
Azat Khuzhin
31a6532f01 Fix 01814_distributed_push_down_limit flakiness
Previously it was possible that the query had already been cancelled, so
you would get both ExceptionWhileProcessing and QueryFinish:

    ┌─query───────────────────────────────────────────────────────────────────┬─query_id─────────────────────────────┬─read_rows─┬─type─────────────────────┐
    │ SELECT `key` FROM `test_qvcjdo`.`data_01814` GROUP BY `key` LIMIT 0, 10 │ 2d39bfd6-f0e7-404a-9990-7703f7a4ec3a │        40 │ QueryFinish              │
    │ SELECT `key` FROM `test_qvcjdo`.`data_01814` GROUP BY `key` LIMIT 0, 10 │ d930bf54-e965-42fd-9d48-5e54a588187a │        40 │ QueryFinish              │
    │ SELECT `key` FROM `test_qvcjdo`.`data_01814` GROUP BY `key` LIMIT 0, 10 │ d930bf54-e965-42fd-9d48-5e54a588187a │        40 │ ExceptionWhileProcessing │
    └─────────────────────────────────────────────────────────────────────────┴──────────────────────────────────────┴───────────┴──────────────────────────┘

And so you would get more than two rows.
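The effect on a row-count check can be sketched as follows. This is only an illustration built from the three rows shown above; the helper name `finished_only` is hypothetical, and filtering on the `type` column is an assumption about how such a check could be made robust, not a description of the actual change in this commit.

```python
# Rows copied from the excerpt above: (query_id, read_rows, type).
rows = [
    ("2d39bfd6-f0e7-404a-9990-7703f7a4ec3a", 40, "QueryFinish"),
    ("d930bf54-e965-42fd-9d48-5e54a588187a", 40, "QueryFinish"),
    ("d930bf54-e965-42fd-9d48-5e54a588187a", 40, "ExceptionWhileProcessing"),
]


def finished_only(log_rows):
    """Keep only rows whose type is 'QueryFinish' (hypothetical helper)."""
    return [row for row in log_rows if row[2] == "QueryFinish"]


# Without a filter the cancelled query contributes an extra row.
unfiltered_count = len(rows)
# With the filter only the two finished entries remain.
filtered_count = len(finished_only(rows))

print(unfiltered_count, filtered_count)
```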

CI: https://s3.amazonaws.com/clickhouse-test-reports/33660/a4e8e61d57eab14116983a340c65e5e2d7039ed5/stateless_tests__release__wide_parts_enabled__actions_.html
2022-01-15 15:04:24 +03:00
Azat Khuzhin
bcce1d70b2 tests: update tests with event_time/event_date = today() to >= yesterday()
This will fix failures like the one in [1]; query_log from the artifacts shows:

    SELECT
        query,
        event_time
    FROM system.query_log
    WHERE (NOT is_initial_query) AND (query NOT LIKE '%system%query_log%') AND (query LIKE concat('WITH%', 'test_84qkvq', '%AS `id_no` %')) AND (type = 'QueryFinish')

    Query id: c5d70aba-b0aa-4f92-bdb3-29547b9aabb1

    ┌─query──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬──────────event_time─┐
    │ WITH _CAST('test_84qkvq', 'Nullable(String)') AS `id_no` SELECT `one`.`dummy`, ignore(`id_no`) FROM `system`.`one` WHERE `dummy` IN (0, 2) │ 2021-12-25 23:59:59 │
    │ WITH _CAST('test_84qkvq', 'Nullable(String)') AS `id_no` SELECT `one`.`dummy`, ignore(`id_no`) FROM `system`.`one` WHERE `dummy` IN (0, 2) │ 2021-12-25 23:59:59 │
    └────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴─────────────────────┘

    2 rows in set. Elapsed: 0.032 sec.
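Why an equality check against today() can miss these rows is sketched below. The event time is taken from the output above; the time at which the test performs its check (`check_time`) is an assumption chosen for illustration, and both helper functions are hypothetical stand-ins for the two filter styles.

```python
from datetime import datetime, timedelta

# event_time as logged in the output above.
event_time = datetime(2021, 12, 25, 23, 59, 59)
# Assumed for illustration: the test inspects the log just after midnight.
check_time = datetime(2021, 12, 26, 0, 0, 1)


def matches_today(event, now):
    """Model of a filter like `event_date = today()`."""
    return event.date() == now.date()


def matches_since_yesterday(event, now):
    """Model of a filter like `event_date >= yesterday()`."""
    return event.date() >= now.date() - timedelta(days=1)


# The strict filter loses the row once the date rolls over.
print(matches_today(event_time, check_time))
# The relaxed filter still finds it.
print(matches_since_yesterday(event_time, check_time))
```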

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/33175/465a9bf615e1b233606460f956c09f71931c99a2/stateless_tests__debug__actions__[2/3].html
2021-12-26 16:35:15 +03:00
Vitaly Baranov
39d73c01b2 Add tags to tests.
2021-09-12 17:15:28 +03:00
Azat Khuzhin
2fb95d9ee0 Rework SELECT from Distributed query stages optimization
Before this patch it wasn't possible to optimize a simple SELECT * FROM
dist ORDER BY (w/o GROUP BY and DISTINCT) to a more optimal stage
(QueryProcessingStage::WithMergeableStateAfterAggregationAndLimit),
since that code was under
allow_nondeterministic_optimize_skip_unused_shards. Rework it to make
this possible.

Also, distributed_push_down_limit is now respected for
optimize_distributed_group_by_sharding_key.

The next step will be to enable distributed_push_down_limit by default.

v2: fix detection of aggregates
2021-08-02 21:04:29 +03:00
Azat Khuzhin
18e8f0eb5e Add ability to push down LIMIT for distributed queries
This way the remote nodes will not need to send all the rows, which
decreases network IO. It also makes queries w/
optimize_aggregation_in_order=1/LIMIT X and w/o ORDER BY faster, since
the initiator will not need to read all the rows, only the first X (but
note that for this your data needs to be sharded correctly, or you may
get inaccurate results).
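The trade-off described above can be illustrated with a toy model. It is a simplified sketch and not ClickHouse's implementation: the function `select_count_by_key` and the sample shards are hypothetical, and the model assumes that with push-down each shard finishes its aggregation and applies LIMIT locally while the initiator only concatenates the parts.

```python
from collections import Counter


def select_count_by_key(shards, limit, push_down):
    """Toy model of `SELECT key, count() FROM dist GROUP BY key LIMIT limit`.

    Returns (result_rows, rows_sent_to_initiator). Rows are sorted only to
    keep the toy output deterministic.
    """
    if push_down:
        # Each shard aggregates and applies LIMIT locally; the initiator
        # concatenates the parts and re-applies LIMIT without merging.
        parts = [sorted(Counter(rows).items())[:limit] for rows in shards]
        rows_sent = sum(len(part) for part in parts)
        result = [row for part in parts for row in part][:limit]
    else:
        # Each shard sends every aggregated key; the initiator merges counts.
        parts = [sorted(Counter(rows).items()) for rows in shards]
        rows_sent = sum(len(part) for part in parts)
        merged = Counter()
        for part in parts:
            for key, count in part:
                merged[key] += count
        result = sorted(merged.items())[:limit]
    return result, rows_sent


# Each key lives on exactly one shard (sharded by key).
well_sharded = [[0, 0, 2, 2, 4, 4], [1, 1, 3, 3, 5, 5]]
# The same keys appear on both shards (not sharded by key).
badly_sharded = [[0, 1, 2], [0, 1, 2]]

print(select_count_by_key(well_sharded, 2, push_down=True))
print(select_count_by_key(well_sharded, 2, push_down=False))
print(select_count_by_key(badly_sharded, 2, push_down=True))
print(select_count_by_key(badly_sharded, 2, push_down=False))
```

In this model the well-sharded case sends fewer rows with push-down and still returns correct counts (which keys are returned may differ, which is expected without ORDER BY), while the badly sharded case returns per-shard counts that were never merged.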

Note that having lots of processing stages will increase the complexity
of the interpreter (it is already not that clean and simple right now).

That said, using a separate QueryProcessingStage looks pretty natural.

Another option is to always use WithMergeableStateAfterAggregation, but
in this case you will not be able to disable only this optimization,
i.e. in case there is some issue with it.

v2: fix OFFSET
v3: convert 01814_distributed_push_down_limit test to .sh and add retries
v4: add test with OFFSET
v5: add new query stage into the bash completion
v6/tests: use LIMIT O,L syntax over LIMIT L OFFSET O since the latter is broken in the ANTLR parser
          https://clickhouse-test-reports.s3.yandex.net/23027/a18a06399b7aeacba7c50b5d1e981ada5df19745/functional_stateless_tests_(antlr_debug).html#fail1
v7/tests: set use_hedged_requests to 0, to avoid excessive log entries on retries
          https://clickhouse-test-reports.s3.yandex.net/23027/a18a06399b7aeacba7c50b5d1e981ada5df19745/functional_stateless_tests_flaky_check_(address).html#fail1
2021-06-09 02:29:50 +03:00