Skip to content

[CALCITE-7484] Add a rule to eliminate redundant aggregates functions over GROUP BY keys#4903

Open
xuzifu666 wants to merge 5 commits intoapache:mainfrom
xuzifu666:agg_function_rule
Open

[CALCITE-7484] Add a rule to eliminate redundant aggregates functions over GROUP BY keys#4903
xuzifu666 wants to merge 5 commits intoapache:mainfrom
xuzifu666:agg_function_rule

Conversation

@xuzifu666
Copy link
Copy Markdown
Member

sql(sql).withRule(CoreRules.AGGREGATE_REDUCE_FUNCTIONS, CoreRules.PROJECT_MERGE).check();
}

/** Test case for
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please look at this PR #4808, we should create new test files for the rule tests.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion! done.

* <li>{@code MIN}</li>
* <li>{@code AVG}</li>
* <li>{@code ANY_VALUE}</li>
* <li>{@code FIRST_VALUE}</li>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think FIRST_VALUE and LAST_VALUE only work for window aggregates.

Copy link
Copy Markdown
Member Author

@xuzifu666 xuzifu666 Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, should not support FIRST_VALUE/LAST_VALUE, I had removed them.

*/
@Test void testAggregateFunctionOfGroupByKeys() {
final String sql = "select sal, max(sal) as sal_max, min(sal) as sal_min,\n"
+ "avg(sal) sal_avg, any_value(sal) as sal_val, first_value(sal) as sal_first,\n"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This query should be illegal, since FIRST_VALUE and LAST_VALUE are only defined for window aggregates. Please file an issue - this program should not be accepted.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out this issue; I have already documented the relevant jira https://issues.apache.org/jira/browse/CALCITE-7485.

@xuzifu666 xuzifu666 changed the title [CALCITE-7484] Add AggregateFunctionOfGroupByKeysRule to eliminate redundant aggregates over GROUP BY keys [CALCITE-7484] Add a rule to eliminate redundant aggregates over GROUP BY keys Apr 22, 2026
@xuzifu666 xuzifu666 changed the title [CALCITE-7484] Add a rule to eliminate redundant aggregates over GROUP BY keys [CALCITE-7484] Add a rule to eliminate redundant aggregates functions over GROUP BY keys Apr 22, 2026
default:
return null;
}
final int groupIndex = aggregate.getGroupSet().asList().indexOf(arg);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-- query 1
SELECT
    deptno,
    MAX(deptno + sal) AS "max(deptno + sal)"
FROM emp
GROUP BY deptno, sal
ORDER BY deptno;

-- output
 deptno | max(deptno + sal) 
--------+-------------------
     10 |           1310.00
     10 |           5010.00
     10 |           2460.00
     20 |           2995.00
     20 |           3020.00
     20 |            820.00
     20 |           1120.00
     30 |           1530.00
     30 |           1280.00
     30 |           1630.00
     30 |           2880.00
     30 |            980.00
(12 rows)

-- query 2
SELECT
    deptno,
    deptno + sal AS "deptno + sal"
FROM emp
GROUP BY deptno, sal
ORDER BY deptno;

-- output
 deptno | deptno + sal 
--------+--------------
     10 |      1310.00
     10 |      5010.00
     10 |      2460.00
     20 |      2995.00
     20 |      3020.00
     20 |       820.00
     20 |      1120.00
     30 |      1530.00
     30 |      1280.00
     30 |      1630.00
     30 |      2880.00
     30 |       980.00
(12 rows)

Could you rewrite complex expressions such as AggFunc(col1 + col2)or something similar?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense!This scenario likely involves many branching scenarios and requires consideration of many factors. Perhaps we could first enable basic single-field optimization in this version, and I would document an issue to gradually improve the relevant rules.

* @see CoreRules#AGGREGATE_FUNCTION_OF_GROUP_BY_KEYS
*/
@Value.Enclosing
public class AggregateFunctionOfGroupByKeysRule
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the name AggregateReduceFunctionsOnGroupKeysRule might be more appropriate?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion, done.

@sonarqubecloud
Copy link
Copy Markdown

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants