[CALCITE-7484] Add a rule to eliminate redundant aggregates functions over GROUP BY keys#4903
[CALCITE-7484] Add a rule to eliminate redundant aggregates functions over GROUP BY keys#4903xuzifu666 wants to merge 5 commits intoapache:mainfrom
Conversation
…dundant aggregates over GROUP BY keys
| sql(sql).withRule(CoreRules.AGGREGATE_REDUCE_FUNCTIONS, CoreRules.PROJECT_MERGE).check(); | ||
| } | ||
|
|
||
| /** Test case for |
There was a problem hiding this comment.
Please look at this PR #4808, we should create new test files for the rule tests.
There was a problem hiding this comment.
Thanks for the suggestion! done.
| * <li>{@code MIN}</li> | ||
| * <li>{@code AVG}</li> | ||
| * <li>{@code ANY_VALUE}</li> | ||
| * <li>{@code FIRST_VALUE}</li> |
There was a problem hiding this comment.
I think FIRST_VALUE and LAST_VALUE only work for window aggregates.
There was a problem hiding this comment.
Yes, should not support FIRST_VALUE/LAST_VALUE, I had removed them.
| */ | ||
| @Test void testAggregateFunctionOfGroupByKeys() { | ||
| final String sql = "select sal, max(sal) as sal_max, min(sal) as sal_min,\n" | ||
| + "avg(sal) sal_avg, any_value(sal) as sal_val, first_value(sal) as sal_first,\n" |
There was a problem hiding this comment.
This query should be illegal, since FIRST_VALUE and LAST_VALUE are only defined for window aggregates. Please file an issue - this program should not be accepted.
There was a problem hiding this comment.
Thanks for pointing out this issue; I have already documented the relevant jira https://issues.apache.org/jira/browse/CALCITE-7485.
| default: | ||
| return null; | ||
| } | ||
| final int groupIndex = aggregate.getGroupSet().asList().indexOf(arg); |
There was a problem hiding this comment.
-- query 1
SELECT
deptno,
MAX(deptno + sal) AS "max(deptno + sal)"
FROM emp
GROUP BY deptno, sal
ORDER BY deptno;
-- output
deptno | max(deptno + sal)
--------+-------------------
10 | 1310.00
10 | 5010.00
10 | 2460.00
20 | 2995.00
20 | 3020.00
20 | 820.00
20 | 1120.00
30 | 1530.00
30 | 1280.00
30 | 1630.00
30 | 2880.00
30 | 980.00
(12 rows)
-- query 2
SELECT
deptno,
deptno + sal AS "deptno + sal"
FROM emp
GROUP BY deptno, sal
ORDER BY deptno;
-- output
deptno | deptno + sal
--------+--------------
10 | 1310.00
10 | 5010.00
10 | 2460.00
20 | 2995.00
20 | 3020.00
20 | 820.00
20 | 1120.00
30 | 1530.00
30 | 1280.00
30 | 1630.00
30 | 2880.00
30 | 980.00
(12 rows)
Could you rewrite complex expressions such as AggFunc(col1 + col2)or something similar?
There was a problem hiding this comment.
That makes sense!This scenario likely involves many branching scenarios and requires consideration of many factors. Perhaps we could first enable basic single-field optimization in this version, and I would document an issue to gradually improve the relevant rules.
| * @see CoreRules#AGGREGATE_FUNCTION_OF_GROUP_BY_KEYS | ||
| */ | ||
| @Value.Enclosing | ||
| public class AggregateFunctionOfGroupByKeysRule |
There was a problem hiding this comment.
I wonder if the name AggregateReduceFunctionsOnGroupKeysRule might be more appropriate?
There was a problem hiding this comment.
Thanks for the suggestion, done.
|



jira: https://issues.apache.org/jira/browse/CALCITE-7484