Skip to content

branch-4.1: [fix](be) Fix RuntimeFilter selectivity sampling_frequency lost during VExprContext recreation #62355#62810

Open
BiteTheDDDDt wants to merge 1 commit intoapache:branch-4.1from
BiteTheDDDDt:pick-62355-to-4.1
Open

branch-4.1: [fix](be) Fix RuntimeFilter selectivity sampling_frequency lost during VExprContext recreation #62355#62810
BiteTheDDDDt wants to merge 1 commit intoapache:branch-4.1from
BiteTheDDDDt:pick-62355-to-4.1

Conversation

@BiteTheDDDDt
Copy link
Copy Markdown
Contributor

Cherry-pick of #62355 to branch-4.1.

Conflict resolution

be/src/exec/runtime_filter/runtime_filter_consumer.cpp — branch-4.1 already replaced if (null_aware) return Status::InternalError(...) with DCHECK(null_aware == false) << "..." for the bitmap-predicate branch. Kept 4.1's DCHECK style and applied the new sampling_frequency argument in VRuntimeFilterWrapper::create_shared(...). The other two min/max branches in the same file already used DCHECK on both branches and merged automatically.

No code logic from #62355 was changed.

Original PR

#62355

Copilot AI review requested due to automatic review settings April 24, 2026 08:48
@BiteTheDDDDt BiteTheDDDDt requested a review from yiguolei as a code owner April 24, 2026 08:48
@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@BiteTheDDDDt
Copy link
Copy Markdown
Contributor Author

run buildall

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Cherry-picks the fix from #62355 into branch-4.1 to ensure runtime filter selectivity tracking remains functional when VExprContext is recreated, by preserving and propagating sampling_frequency via the VRuntimeFilterWrapper.

Changes:

  • Persist sampling_frequency inside VRuntimeFilterWrapper and apply it to VExprContext during open().
  • Update RuntimeFilterConsumer::_get_push_exprs() to pass the computed sampling_frequency into all created VRuntimeFilterWrapper instances.
  • Add/extend unit tests covering context recreation and sampling/reset behavior.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
be/src/exprs/vruntimefilter_wrapper.h Extends wrapper constructor to carry sampling_frequency (defaulting to DISABLE).
be/src/exprs/vruntimefilter_wrapper.cpp Propagates wrapper-stored sampling_frequency into VExprContext during open().
be/src/exec/runtime_filter/runtime_filter_consumer.cpp Computes sampling frequency once and passes it into wrapper creation for all RF predicate types.
be/test/exec/runtime_filter/runtime_filter_selectivity_test.cpp Adds regression/unit tests documenting reset behavior and proper accumulation under valid sampling.
be/test/exec/runtime_filter/vruntimefilter_wrapper_sampling_test.cpp Adds focused tests validating propagation across VExprContext recreation paths.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@hello-stephen
Copy link
Copy Markdown
Contributor

BE UT Coverage Report

Increment line coverage 84.62% (11/13) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.28% (20013/37563)
Line Coverage 36.70% (188533/513670)
Region Coverage 33.05% (146551/443368)
Branch Coverage 34.13% (64113/187841)

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 15.38% (2/13) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 54.43% (20020/36783)
Line Coverage 35.45% (181424/511845)
Region Coverage 29.62% (132815/448344)
Branch Coverage 31.21% (59250/189819)

…g VExprContext recreation (apache#62355)

introduced by apache#59832

Problem Summary:

RuntimeFilter selectivity tracking was completely non-functional because
`sampling_frequency` was lost during VExprContext recreation.

In `RuntimeFilterConsumer::_get_push_exprs()`, `sampling_frequency=32`
was set
on a temporary `probe_ctx` VExprContext. However, only
VRuntimeFilterWrapper
expressions (VExpr) were returned, not the VExprContext. When
`_append_rf_into_conjuncts()` later created a new VExprContext via
`VExprContext::create_shared(expr)`, the new context had default
`_sampling_frequency=-1` (DISABLE_SAMPLING).

With `_sampling_frequency=-1`, the condition `(_judge_counter++) >= -1`
evaluated to `0 >= -1 → true` on every call, causing
`reset_judge_selectivity()`
to fire every time. This meant selectivity counters were perpetually
reset
and never accumulated, making the runtime filter selectivity
optimization
completely ineffective.

**Fix**: Store `sampling_frequency` in VRuntimeFilterWrapper (which
survives
VExprContext recreation) and propagate it to VExprContext in
`VRuntimeFilterWrapper::open()`, which is called on both original and
cloned
contexts.

Fixed a bug where RuntimeFilter selectivity tracking was non-functional
due to
sampling_frequency being lost during VExprContext recreation, causing
runtime
filters that should be skipped (due to low selectivity) to never be
identified.

- Test: Unit Test
    - Added 2 regression tests to runtime_filter_selectivity_test.cpp
    - Added 3 new tests in vruntimefilter_wrapper_sampling_test.cpp
    - All 22 tests pass
- Behavior changed: No (selectivity tracking was broken before, this
makes it work as designed)
- Does this need documentation: No

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
(cherry picked from commit 191061a)
@morningman
Copy link
Copy Markdown
Contributor

run buildall

@hello-stephen
Copy link
Copy Markdown
Contributor

BE UT Coverage Report

Increment line coverage 84.62% (11/13) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.25% (20004/37568)
Line Coverage 36.69% (188516/513789)
Region Coverage 33.05% (146560/443494)
Branch Coverage 34.13% (64129/187889)

@hello-stephen
Copy link
Copy Markdown
Contributor

BE Regression && UT Coverage Report

Increment line coverage 100.00% (13/13) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 70.79% (26043/36788)
Line Coverage 52.94% (271023/511972)
Region Coverage 46.34% (207817/448506)
Branch Coverage 49.77% (94500/189873)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants