[fix](variant) Skip full footer scan when constructing VariantStatsCaculator #62819
Open
csun5285 wants to merge 1 commit into apache:master
…culator

SegmentWriter::init() can run multiple times against the same writer (vertical compaction's key columns + per value-column-group calls), and the footer accumulates entries across calls. The calculator was scanning the whole footer on every construction, so each additional init() walked an ever-larger footer that included entries it cannot address via the init's own `column_ids`.

Snapshot the footer size before _create_writers appends new entries and pass it to VariantStatsCaculator as `footer_column_offset`, so the constructor only scans its own slice.

Per-init() construction cost goes from O(footer accumulated size) to O(this init's column_ids size); the total cost across N vertical-compaction init() calls drops from O(N^2) to O(N).

All existing behavior is preserved (including the defensive Status::NotFound on missing footer entries). One new unit test covers the offset case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
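The cost claim above can be checked with a small counting model. This is only a sketch under stated assumptions: `ScanCost` and `count_visited_entries` are hypothetical names, not part of Doris, and each `init()` is modeled as nothing more than "append some footer entries, then scan".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cost model, not Doris code: counts how many footer entries a
// calculator would visit across a sequence of init() calls.
struct ScanCost {
    std::size_t full_scan = 0;   // old behavior: walk the whole footer each time
    std::size_t slice_scan = 0;  // new behavior: walk only [offset, end)
};

// group_sizes[i] is the number of footer entries appended by the i-th init().
ScanCost count_visited_entries(const std::vector<std::size_t>& group_sizes) {
    ScanCost cost;
    std::size_t footer_size = 0;
    for (std::size_t group : group_sizes) {
        const std::size_t offset = footer_size;   // snapshot before appending
        footer_size += group;                     // stands in for _create_writers
        cost.full_scan += footer_size;            // scan everything accumulated so far
        cost.slice_scan += footer_size - offset;  // scan this init's entries only
    }
    // The slice scan can never visit more entries than the full scan.
    assert(cost.slice_scan <= cost.full_scan);
    return cost;
}
```

With four single-column groups, the full scan visits 1 + 2 + 3 + 4 = 10 entries while the slice scan visits 4, which is the quadratic versus linear growth the description refers to.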
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
run buildall
BE Regression && UT Coverage Report
Increment line coverage
Increment coverage report
Problem

In vertical compaction, `SegmentWriter::init()` is called multiple times against the same writer (key columns first, then each value-column group). The footer accumulates across calls, so every additional `init()` re-scans an ever-larger footer, including entries from prior `init()`s that the current `column_ids` cannot address.

Fix

Snapshot `_footer.columns_size()` before `_create_writers` appends new entries and pass it to `VariantStatsCaculator` as `footer_column_offset`. The constructor only walks `[offset, end)`, which is its own slice.

What problem does this PR solve?
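As a rough illustration of the offset-limited walk described in this PR, the sketch below indexes only the entries at or after a snapshot offset. It assumes a simplified footer: `FooterColumn` and `index_own_slice` are hypothetical stand-ins, and the real footer type and the real `VariantStatsCaculator` constructor may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for one footer column entry; not the real Doris type.
struct FooterColumn {
    int32_t unique_id;
};

// Maps unique_id -> footer index, visiting only [footer_column_offset, end).
// Entries appended by earlier init() calls sit before the offset and are skipped.
std::unordered_map<int32_t, std::size_t> index_own_slice(
        const std::vector<FooterColumn>& footer, std::size_t footer_column_offset) {
    assert(footer_column_offset <= footer.size());
    std::unordered_map<int32_t, std::size_t> index;
    for (std::size_t i = footer_column_offset; i < footer.size(); ++i) {
        index.emplace(footer[i].unique_id, i);
    }
    return index;
}
```

A caller would take the snapshot (the footer size) before appending its own entries and pass that value as the offset, so the loop length depends only on what the current call added.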
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)