fix: Nested self-referential CASE chains should not cause exponential hashing work during physical planning #22175
Merged
Nested self-referential CASE chains cause exponential hash work during physical planning because CaseExpr derives Hash over both the original CaseBody and the derived ProjectedCaseBody in EvalMethod.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
EvalMethod (including ProjectedCaseBody) is deterministically derived from CaseBody in try_new(), so hashing it is redundant. With the derived Hash, nested CASE chains caused exponential hash work during ProjectionExec::compute_properties because each nesting level doubled the tree traversal. Implementing Hash/Eq manually over the body only fixes this.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
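To make the doubling concrete, here is a minimal standalone sketch. It does not use DataFusion's actual types: `Expr`, `Leaf`, `DerivedCase`, `ManualCase`, `nest`, and `leaf_hashes` are all hypothetical names invented for illustration, and the `projected` field only stands in for state that is derived from the body. The sketch counts how often the single leaf is hashed when a node hashes both fields (derived `Hash`) versus the body only (manual `Hash`).

```rust
use std::cell::Cell;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::rc::Rc;

thread_local! {
    // Counts how many times a leaf is hashed on this thread.
    static LEAF_HASHES: Cell<u64> = Cell::new(0);
}

// Hypothetical leaf expression; its Hash impl bumps the counter.
#[derive(PartialEq, Eq)]
struct Leaf(u32);

impl Hash for Leaf {
    fn hash<H: Hasher>(&self, state: &mut H) {
        LEAF_HASHES.with(|c| c.set(c.get() + 1));
        self.0.hash(state);
    }
}

// Hypothetical expression tree with two flavours of CASE-like node.
#[derive(PartialEq, Eq, Hash)]
enum Expr {
    Leaf(Leaf),
    Derived(DerivedCase),
    Manual(ManualCase),
}

// Derived Hash visits `body` AND `projected`: two traversals per level.
#[derive(PartialEq, Eq, Hash)]
struct DerivedCase {
    body: Rc<Expr>,
    projected: Rc<Expr>,
}

// Manual Hash/Eq look at `body` only; `projected` is assumed to be
// fully determined by `body`, so skipping it loses no information.
struct ManualCase {
    body: Rc<Expr>,
    #[allow(dead_code)]
    projected: Rc<Expr>,
}

impl PartialEq for ManualCase {
    fn eq(&self, other: &Self) -> bool {
        self.body == other.body
    }
}

impl Eq for ManualCase {}

impl Hash for ManualCase {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.body.hash(state);
    }
}

// Builds a chain of `depth` nested nodes over a single shared leaf.
fn nest(depth: u32, manual: bool) -> Rc<Expr> {
    let mut expr = Rc::new(Expr::Leaf(Leaf(0)));
    for _ in 0..depth {
        let body = Rc::clone(&expr);
        let projected = Rc::clone(&expr);
        expr = Rc::new(if manual {
            Expr::Manual(ManualCase { body, projected })
        } else {
            Expr::Derived(DerivedCase { body, projected })
        });
    }
    expr
}

// Hashes `expr` once and reports how many leaf hashes that took.
fn leaf_hashes(expr: &Expr) -> u64 {
    LEAF_HASHES.with(|c| c.set(0));
    let mut hasher = DefaultHasher::new();
    expr.hash(&mut hasher);
    LEAF_HASHES.with(|c| c.get())
}

fn main() {
    println!("derived: {}", leaf_hashes(&nest(10, false))); // prints "derived: 1024"
    println!("manual:  {}", leaf_hashes(&nest(10, true))); // prints "manual:  1"
}
```

In this toy model the derived variant hashes the leaf 2^depth times while the manual variant hashes it once, which mirrors the behaviour described in the commit message above. Skipping a field in Hash/Eq is only sound when that field really is a pure function of the fields that are compared.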
rluvaton approved these changes May 14, 2026
rluvaton reviewed May 14, 2026
Contributor
Author
@alamb this doesn't look related to our change... do you know who to talk to about CI issues?
Dandandan reviewed May 14, 2026
// eval_method is functionally derived from body, so excluding it from
// Hash/Eq avoids redundantly hashing the expression tree twice. For
// nested CASE chains this prevents exponential blowup (see #22173).
Contributor
Suggested change
- // nested CASE chains this prevents exponential blowup (see #22173).
+ // nested CASE chains this prevents exponential blowup (see https://github.com/apache/datafusion/issues/22173).
Contributor
Restarted - seems some flaky thing
Dandandan approved these changes May 14, 2026
Contributor
FYI @pepijnve I wonder if there is any way to test for this (a benchmark in sql_planner perhaps) so we don't break it in the future
Contributor
Dang completely missed that. Would it make sense to use something like |
Contributor
I think in general we are trying to keep our (already quite extensive) dependency chain low |
Which issue does this PR close?
Rationale for this change
Explained in issue
What changes are included in this PR?
Manual Hash/Eq implementations for CaseExpr that exclude the eval_method field
Are these changes tested?
A unit test was added
Are there any user-facing changes?
Pathologically nested CASE/WHEN queries will plan significantly faster.