[CALCITE-7437] Type coercion for quantifier operators is incomplete #4877

Open
Dwrite wants to merge 6 commits into apache:main from Dwrite:calcite-7437

Dwrite commented Apr 11, 2026

Summary
Fix the RuntimeException thrown when SOME/ANY/ALL is used with mismatched operand types, so that all quantifier operators coerce types the same way IN does.
The issue:
Calcite currently crashes with a RuntimeException when a quantifier operator (SOME, ANY, or ALL) compares operands of different types, for example a VARCHAR on the left and a SMALLINT from the subquery.

The error (SELECT deptno, dname > SOME(SELECT empno FROM emp) AS b FROM dept):
java.lang.RuntimeException: while resolving method 'gt[class java.lang.String, short]' in class org.apache.calcite.runtime.SqlFunctions

This happens because the SqlValidator doesn't trigger TypeCoercion for these operators, so no CAST is inserted. At code generation, Linq4j then cannot find a gt method that compares a raw String with a primitive short.
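A minimal, self-contained sketch of why such a lookup fails. The nested class `Fns` is a hypothetical stand-in for SqlFunctions that only offers same-typed `gt` overloads. The real class and Linq4j's method resolution are more involved, so this only illustrates the idea and is not Calcite's actual code path.

```java
import java.lang.reflect.Method;

final class GtLookupSketch {
  /** Hypothetical stand-in for SqlFunctions: only same-typed overloads exist. */
  static final class Fns {
    public static boolean gt(String a, String b) {
      return a.compareTo(b) > 0;
    }

    public static boolean gt(short a, short b) {
      return a > b;
    }
  }

  /** Returns true if Fns has a public gt overload with exactly these parameter types. */
  static boolean canResolveGt(Class<?> left, Class<?> right) {
    try {
      Method m = Fns.class.getMethod("gt", left, right);
      return m != null;
    } catch (NoSuchMethodException e) {
      // No overload with this exact signature, e.g. gt(String, short).
      return false;
    }
  }

  public static void main(String[] args) {
    // Mismatched operand types: no gt(String, short) overload exists.
    System.out.println(canResolveGt(String.class, short.class));
    // After a cast aligns both operands to short, the lookup succeeds.
    System.out.println(canResolveGt(short.class, short.class));
  }
}
```

Once a CAST brings both operands to the same type, an exact-match overload exists and the lookup succeeds.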

The fix:
I updated TypeCoercionImpl to handle these quantifier operators. Now, it follows the same implicit cast rules we use for standard comparisons or the IN operator.

Testing:
Added a case in sub-query.iq. The plan now correctly shows a CAST (e.g., CAST($1):SMALLINT) before the join condition.

Note: If the data itself is a non-numeric string (like "ACCOUNTING"), you'll still get a NumberFormatException at runtime, but this is expected behavior (similar to how CAST('A' AS INT) behaves in other engines). The fix ensures the engine can at least generate the correct execution plan instead of crashing during the validation or code-gen phase.
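The runtime failure mentioned in the note can be reproduced in plain Java. This sketch assumes `Short.parseShort` as a stand-in for whatever conversion the generated cast code actually performs; the helper name `tryCastToShort` is hypothetical.

```java
final class CastFailureSketch {
  /** Mimics a string-to-SMALLINT cast; returns the error message, or null on success. */
  static String tryCastToShort(String s) {
    try {
      Short.parseShort(s.trim());
      return null;
    } catch (NumberFormatException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    // A numeric string converts cleanly.
    System.out.println(tryCastToShort("10"));
    // A non-numeric string such as a department name fails at runtime.
    System.out.println(tryCastToShort("ACCOUNTING"));
  }
}
```

The second call yields the message `For input string: "ACCOUNTING"`, the same fragment that appears in the sub-query.iq expectation for this query.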

xiedeyantu (Member) commented

This appears to be an AI-generated solution. We don't object to using AI to solve problems, but it's necessary to confirm that the code changes and comments meet review criteria before submission. For example, the title should be [CALCITE-7437], and the PR description should ideally follow the existing template.

@Dwrite Dwrite changed the title [Calcite 7437] Type coercion for quantifier operators is incomplete [Calcite-7437] Type coercion for quantifier operators is incomplete Apr 11, 2026
@Dwrite Dwrite changed the title [Calcite-7437] Type coercion for quantifier operators is incomplete [CALCITE-7437] Type coercion for quantifier operators is incomplete Apr 11, 2026
@Dwrite Dwrite changed the title [CALCITE-7437] Type coercion for quantifier operators is incomplete [[CALCITE-7437](https://issues.apache.org/jira/browse/CALCITE-7437)]Type coercion for quantifier operators is incomplete Apr 11, 2026
@Dwrite Dwrite changed the title [[CALCITE-7437](https://issues.apache.org/jira/browse/CALCITE-7437)]Type coercion for quantifier operators is incomplete [CALCITE-7437] Type coercion for quantifier operators is incomplete Apr 11, 2026
Dwrite (Author) commented Apr 11, 2026

> This appears to be an AI-generated solution. We don't object to using AI to solve problems, but it's necessary to confirm that the code changes and comments meet review criteria before submission. For example, the title should be [CALCITE-7437], and the PR description should ideally follow the existing template.

Yes. The initial draft of this description was generated with the assistance of Gemini for clarity and phrasing, then reviewed and refined by me to accurately reflect the technical implementation. I developed the core solution and logic myself.

Comment thread core/src/main/java/org/apache/calcite/sql/validate/implicit/TypeCoercionImpl.java Outdated
Comment thread core/src/main/java/org/apache/calcite/sql/validate/implicit/TypeCoercionImpl.java Outdated
<![CDATA[
LogicalProject(DEPTNO=[$0], EXPR$1=[OR(AND(IS NOT NULL($5), <>($2, 0)), AND(<($3, $2), null, <>($2, 0), IS NULL($5)))])
LogicalJoin(condition=[=($1, $4)], joinType=[left])
LogicalJoin(condition=[=(CAST($1):INTEGER NOT NULL, $4)], joinType=[left])
Contributor

This looks dangerous: how do we know that $1 is not null?
A cast to a NOT NULL type may fail at runtime for a NULL value.

Contributor

You didn't address this question.
Why is it safe to assume $1 is not null?

Author

I’m still looking into this issue.

Dwrite (Author), Apr 24, 2026

The NOT NULL in the CAST reflects the nullability of the original column (name is defined as NOT NULL in the DEPT table), preserved by syncAttributes inside coerceOperandType. This is consistent with how inOperationCoercion handles the same case. Is this behavior acceptable?

Contributor

If the original column is indeed NOT NULL, then this is fine.
I will review this PR.
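The nullability point settled in this thread can be sketched as follows. The `Type` class and `castTarget` helper are hypothetical and only model a type name plus a nullable flag; they are not Calcite's RelDataType or syncAttributes.

```java
final class NullabilitySketch {
  /** Hypothetical, minimal model of a SQL type: a name and a nullability flag. */
  static final class Type {
    final String name;
    final boolean nullable;

    Type(String name, boolean nullable) {
      this.name = name;
      this.nullable = nullable;
    }

    @Override public String toString() {
      return nullable ? name : name + " NOT NULL";
    }
  }

  /** Builds the cast target: the desired type name with the source operand's nullability. */
  static Type castTarget(Type source, String desiredName) {
    return new Type(desiredName, source.nullable);
  }

  public static void main(String[] args) {
    // A NOT NULL source column keeps NOT NULL on the cast target.
    System.out.println(castTarget(new Type("VARCHAR", false), "INTEGER"));
    // A nullable source column yields a nullable cast target.
    System.out.println(castTarget(new Type("VARCHAR", true), "INTEGER"));
  }
}
```

Under this rule a cast to a NOT NULL type is only produced when the source operand is itself declared NOT NULL, which is the condition the reviewer accepted.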

@Dwrite Dwrite requested a review from mihaibudiu April 24, 2026 14:53
# 'gt[class java.lang.String, short]' in class SqlFunctions
# After fix: incompatibleValueType validation error
SELECT deptno, dname > SOME(SELECT empno FROM emp) AS b FROM dept;
For input string: "ACCOUNTING"
mihaibudiu (Contributor), Apr 24, 2026

Isn't there a more descriptive error fragment you can show?

Author

yeah. just added

argTypes[0] = type1;
final boolean isSubQuery = node2 instanceof SqlSelect;
// For subquery, use the row type directly.
// For collection, use the component type for comparison, not the array type.
Contributor

array -> collection

RelDataType widenType =
commonTypeForBinaryComparison(columnIthTypes.get(0), columnIthTypes.get(1));
if (widenType == null) {
widenType = getTightestCommonType(columnIthTypes.get(0), columnIthTypes.get(1));
Contributor

This is a bit strange, trying twice, but if that's what "IN" does, keep it this way

Author

yeah. This logic is derived from the IN operator
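The "try one rule, then fall back to another" pattern from the diff above can be sketched in isolation. The two rules here are toy placeholders operating on type names; they are assumptions for illustration and do not reproduce what Calcite's commonTypeForBinaryComparison or getTightestCommonType actually compute.

```java
import java.util.Arrays;
import java.util.List;

final class WidenFallbackSketch {
  // Toy integer ladder, narrowest to widest (illustrative only).
  static final List<String> INTEGERS =
      Arrays.asList("TINYINT", "SMALLINT", "INTEGER", "BIGINT");

  /** Toy first rule: a character type compared with an integer type yields the integer type. */
  static String comparisonRule(String a, String b) {
    if (a.equals("VARCHAR") && INTEGERS.contains(b)) {
      return b;
    }
    if (b.equals("VARCHAR") && INTEGERS.contains(a)) {
      return a;
    }
    return null;
  }

  /** Toy second rule: two integer types yield the wider of the two. */
  static String tightestRule(String a, String b) {
    int i = INTEGERS.indexOf(a);
    int j = INTEGERS.indexOf(b);
    if (i < 0 || j < 0) {
      return null;
    }
    return INTEGERS.get(Math.max(i, j));
  }

  /** Tries the first rule, then falls back to the second if no type was found. */
  static String widen(String a, String b) {
    String t = comparisonRule(a, b);
    return t != null ? t : tightestRule(a, b);
  }

  public static void main(String[] args) {
    System.out.println(widen("VARCHAR", "SMALLINT")); // resolved by the first rule
    System.out.println(widen("SMALLINT", "INTEGER")); // resolved by the fallback
    System.out.println(widen("VARCHAR", "DATE"));     // neither rule applies
  }
}
```

When neither rule produces a type, the result is null and the caller has to decide whether to skip coercion or report an error.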

.createTypeWithNullability(desired, source.isNullable() || desired.isNullable());
coerced = rowTypeCoercion(scope1, node2, i, target) || coerced;
} else {
// Collection path (ARRAY[...]): coerce the whole array operand once.
Contributor

e.g., ARRAY.
Coerce the collection

collectionWidenType =
binding.getTypeFactory()
.enforceTypeWithNullability(collectionWidenType, type2.isNullable());
if (coerceOperandType(scope, binding.getCall(), 1, collectionWidenType)) {
Contributor

Can you please make sure there is test coverage for both cases (subquery and collections)?
If there are already tests, fine, but if not, please write some.
In particular, this will be tricky when the row type on the RHS contains nested collections or ROW, e.g., ROW(ROW()) or ROW(MAP<VARCHAR, INT ARRAY>).

4 participants