Skip to content

[CALCITE-7479] Remove redundant aggregate group keys with FD#4892

Open
xiedeyantu wants to merge 1 commit intoapache:mainfrom
xiedeyantu:agg-remove
Open

[CALCITE-7479] Remove redundant aggregate group keys with FD#4892
xiedeyantu wants to merge 1 commit intoapache:mainfrom
xiedeyantu:agg-remove

Conversation

@xiedeyantu
Copy link
Copy Markdown
Member

@xiedeyantu xiedeyantu commented Apr 18, 2026

Jira Link

CALCITE-7479

Changes Proposed

@xiedeyantu xiedeyantu marked this pull request as draft April 18, 2026 16:08
@xiedeyantu
Copy link
Copy Markdown
Member Author

Just draft.

@xiedeyantu xiedeyantu force-pushed the agg-remove branch 3 times, most recently from 137af74 to ebcde09 Compare April 19, 2026 02:06
@xiedeyantu xiedeyantu changed the title [CALCITE-0000] Remove redundant aggregate group keys with FD [CALCITE-7479] Remove redundant aggregate group keys with FD Apr 19, 2026
Copy link
Copy Markdown
Contributor

@mihaibudiu mihaibudiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine.
How come it is never applicable?
Are all our tests using optimal group by keys?
Or we just don't have many tables with keys?

@xiedeyantu
Copy link
Copy Markdown
Member Author

This looks fine. How come it is never applicable? Are all our tests using optimal group by keys? Or we just don't have many tables with keys?

"How come it is never applicable?" I didn't quite understand the meaning of this sentence. Because the functional dependencies we currently support are limited, I used the primary key as an example.

@mihaibudiu
Copy link
Copy Markdown
Contributor

No other tests were modified

@xiedeyantu
Copy link
Copy Markdown
Member Author

I'm concerned about whether this rewrite is meaningful, as it requires the introduction of
"single_value".

@mihaibudiu
Copy link
Copy Markdown
Contributor

I think there are other uses of SINGLE_VALUE in the compiler.

@xiedeyantu
Copy link
Copy Markdown
Member Author

No other tests were modified

Because this new rule is not set as a "default" rule, it doesn't have any other impact.

@mihaibudiu
Copy link
Copy Markdown
Contributor

Can you make it default and see what happens?

@xiedeyantu
Copy link
Copy Markdown
Member Author

I think there are other uses of SINGLE_VALUE in the compiler.

Are you suggesting that I don't need to worry too much about "single_value"? Can the compiler do more?

@xiedeyantu
Copy link
Copy Markdown
Member Author

Can you make it default and see what happens?

Of course, I can create a new branch to add it as a default rule and observe the outcome later.

@mihaibudiu
Copy link
Copy Markdown
Contributor

Doesn't even have to be a branch, you can run the experiment and report the results.

@mihaibudiu
Copy link
Copy Markdown
Contributor

There are lots of uses of SINGLE_VALUE, I think that one is fine.
I am more worried about whether the functional dependency is sufficiently conservative.

@xiedeyantu
Copy link
Copy Markdown
Member Author

@mihaibudiu If you have time, could you please take another look at this PR?

@mihaibudiu
Copy link
Copy Markdown
Contributor

I have approved it already. Anything in particular you'd like me to check?

@xiedeyantu
Copy link
Copy Markdown
Member Author

Since there were some changes in the latest commit, I wanted to see if you need to confirm it again, even though it followed the solution discussed in the other PR.

Copy link
Copy Markdown
Contributor

@mihaibudiu mihaibudiu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine.

* <p>{@code ANY_VALUE} is chosen over {@code SINGLE_VALUE} because functional
* dependence only guarantees that a removed key maps to a single <em>value</em>
* per group, not that the group contains a single <em>row</em>.
* {@code SINGLE_VALUE} enforces the stricter, row-count constraint at runtime
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this explanation is not necessary. If I was confused, it doesn't mean other people will as well. In fact, SINGLE_VALUE is documented properly.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about me attaching the examples and explanations from that PR to the Jira ticket, and then removing this section of explanation from the rules?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's fine.

@sonarqubecloud
Copy link
Copy Markdown

@xiedeyantu xiedeyantu added the LGTM-will-merge-soon Overall PR looks OK. Only minor things left. label Apr 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

LGTM-will-merge-soon Overall PR looks OK. Only minor things left.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants