test: add PostgreSQL TPC-H integration tests by bestbeforetoday · Pull Request #855 · substrait-io/substrait-java

bestbeforetoday · 2026-06-08T17:33:12Z

Extends the changes in #700 to generate TPC-H data on demand during test execution and avoid checking in large amounts of test data.

Signed-off-by: Niels Pardon <par@zurich.ibm.com>

Signed-off-by: Mark S. Lewis <Mark.S.Lewis@outlook.com>

nielspardon · 2026-06-10T07:38:33Z

+  // TODO: These queries produce different results when generated from Substrait
+  private static final List<Integer> EXCLUDED_QUERIES = List.of(14);


Interesting that query 14 is not producing the same result for you, while for my PR with the static data it was query 21 that was not producing the same result.

The Calcite version has been bumped up between those two PRs. Possibly that has made a difference.

I notice that with larger scale factors more failures start to appear. I suspect this might be due to resource constraints in the containerized test environment, so I stuck to a small scale factor. It might also be that a larger variety of data exposes edge-case failures.

I downgraded the Calcite version (to 1.41.0) and ran this test locally with identical (scale factor 0.001) test data. This gives a failure for TPC-H query 21, just as you were seeing before. Using Calcite 1.42.0 produces a failure only for TPC-H query 14.

Increasing the scale factor to 0.01, query 21 remains the only failure with Calcite 1.41.0 whereas with Calcite 1.42.0 both queries 8 and 14 fail:

PostgreSqlIntegrationTest > testTpcH(int) > [8] 8 FAILED
org.opentest4j.AssertionFailedError: Reference and generated SQL produce 2 different results.
Reference SQL: select "O_YEAR", sum(case when "NATION" = 'EGYPT' then "VOLUME" else 0 end) / sum("VOLUME") as "MKT_SHARE" from ( select extract(year from "O"."O_ORDERDATE") as "O_YEAR", "L"."L_EXTENDEDPRICE" * (1 - "L"."L_DISCOUNT") as "VOLUME", "N2"."N_NAME" as "NATION" from "PART" "P", "SUPPLIER" "S", "LINEITEM" "L", "ORDERS" "O", "CUSTOMER" "C", "NATION" "N1", "NATION" "N2", "REGION" "R" where "P"."P_PARTKEY" = "L"."L_PARTKEY" and "S"."S_SUPPKEY" = "L"."L_SUPPKEY" and "L"."L_ORDERKEY" = "O"."O_ORDERKEY" and "O"."O_CUSTKEY" = "C"."C_CUSTKEY" and "C"."C_NATIONKEY" = "N1"."N_NATIONKEY" and "N1"."N_REGIONKEY" = "R"."R_REGIONKEY" and "R"."R_NAME" = 'MIDDLE EAST' and "S"."S_NATIONKEY" = "N2"."N_NATIONKEY" and "O"."O_ORDERDATE" between date '1995-01-01' and date '1996-12-31' and "P"."P_TYPE" = 'PROMO BRUSHED COPPER' ) as "ALL_NATIONS" group by "O_YEAR" order by "O_YEAR"
Generated SQL: SELECT "t3"."$f600" AS "O_YEAR", "t3"."$f4" AS "MKT_SHARE" FROM (SELECT EXTRACT(YEAR FROM "ORDERS"."O_ORDERDATE") AS "$f600", SUM(CAST(CASE WHEN CAST("NATION0"."N_NAME" AS VARCHAR(25)) = 'EGYPT' THEN "LINEITEM"."L_EXTENDEDPRICE" * (1 - "LINEITEM"."L_DISCOUNT") ELSE 0 END AS DECIMAL(19, 0))) / SUM("LINEITEM"."L_EXTENDEDPRICE" * (1 - "LINEITEM"."L_DISCOUNT")) AS "$f4" FROM "PART", "SUPPLIER", "LINEITEM", "ORDERS", "CUSTOMER", "NATION", "NATION" AS "NATION0", "REGION" WHERE "PART"."P_PARTKEY" = "LINEITEM"."L_PARTKEY" AND "SUPPLIER"."S_SUPPKEY" = "LINEITEM"."L_SUPPKEY" AND ("LINEITEM"."L_ORDERKEY" = "ORDERS"."O_ORDERKEY" AND ("ORDERS"."O_CUSTKEY" = "CUSTOMER"."C_CUSTKEY" AND "CUSTOMER"."C_NATIONKEY" = "NATION"."N_NATIONKEY")) AND ("NATION"."N_REGIONKEY" = "REGION"."R_REGIONKEY" AND CAST("REGION"."R_NAME" AS VARCHAR(25)) = 'MIDDLE EAST' AND ("SUPPLIER"."S_NATIONKEY" = "NATION0"."N_NATIONKEY" AND ("ORDERS"."O_ORDERDATE" >= DATE '1995-01-01' AND "ORDERS"."O_ORDERDATE" <= DATE '1996-12-31' AND "PART"."P_TYPE" = 'PROMO BRUSHED COPPER'))) GROUP BY EXTRACT(YEAR FROM "ORDERS"."O_ORDERDATE") ORDER BY 1) AS "t3"

PostgreSqlIntegrationTest > testTpcH(int) > [14] 14 FAILED
org.opentest4j.AssertionFailedError: Reference and generated SQL produce 2 different results.
Reference SQL: select 100.00 * sum(case when "P"."P_TYPE" like 'PROMO%' then "L"."L_EXTENDEDPRICE" * (1 - "L"."L_DISCOUNT") else 0 end) / sum("L"."L_EXTENDEDPRICE" * (1 - "L"."L_DISCOUNT")) as "PROMO_REVENUE" from "LINEITEM" "L", "PART" "P" where "L"."L_PARTKEY" = "P"."P_PARTKEY" and "L"."L_SHIPDATE" >= date '1994-08-01' and "L"."L_SHIPDATE" < date '1994-08-01' + interval '1 month'
Generated SQL: SELECT 100.00 * SUM(CAST(CASE WHEN "PART"."P_TYPE" LIKE 'PROMO%' THEN "LINEITEM"."L_EXTENDEDPRICE" * (1 - "LINEITEM"."L_DISCOUNT") ELSE 0 END AS DECIMAL(19, 0))) / SUM("LINEITEM"."L_EXTENDEDPRICE" * (1 - "LINEITEM"."L_DISCOUNT")) AS "PROMO_REVENUE" FROM "LINEITEM", "PART" WHERE "LINEITEM"."L_PARTKEY" = "PART"."P_PARTKEY" AND "LINEITEM"."L_SHIPDATE" >= DATE '1994-08-01' AND "LINEITEM"."L_SHIPDATE" < (DATE '1994-08-01' + INTERVAL '0-1' YEAR TO MONTH)

The failures above demonstrate the value of these tests, since they are not picked up by any of the existing unit tests. This change aims to deliver the tests, not to resolve existing problems that they highlight. That should happen in other PRs.
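One difference visible in both failing queries is that the generated SQL wraps the CASE expression in CAST(... AS DECIMAL(19, 0)) before summing, while the reference SQL sums the unrounded values. This thread does not establish that the cast is the cause of the mismatch, so the following is only a sketch of how such a cast could change a ratio. The class name, helper names and sample prices are hypothetical and not taken from the test data, and half-up rounding is assumed to approximate the database's cast behaviour.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.List;

// Illustrative only: shows how rounding each addend to scale 0 before summing
// can change a percentage. Names and sample values are hypothetical.
class DecimalCastSketch {

  // Sums the values, optionally rounding each one to zero decimal places first,
  // which is an assumed stand-in for CAST(x AS DECIMAL(19, 0)).
  static BigDecimal sum(List<BigDecimal> values, boolean castToScaleZero) {
    BigDecimal total = BigDecimal.ZERO;
    for (BigDecimal value : values) {
      total = total.add(castToScaleZero ? value.setScale(0, RoundingMode.HALF_UP) : value);
    }
    return total;
  }

  // Computes 100.00 * numerator / denominator to six decimal places.
  static BigDecimal share(BigDecimal numerator, BigDecimal denominator) {
    return new BigDecimal("100.00")
        .multiply(numerator)
        .divide(denominator, 6, RoundingMode.HALF_UP);
  }

  public static void main(String[] args) {
    List<BigDecimal> promo = List.of(new BigDecimal("10.40"), new BigDecimal("20.40"));
    List<BigDecimal> all =
        List.of(new BigDecimal("10.40"), new BigDecimal("20.40"), new BigDecimal("30.40"));

    BigDecimal denominator = sum(all, false);
    System.out.println("without cast: " + share(sum(promo, false), denominator));
    System.out.println("with cast:    " + share(sum(promo, true), denominator));
  }
}
```

If the cast does turn out to be responsible, a small scale factor could hide the problem whenever the rounded and unrounded sums happen to agree, which would be consistent with more failures appearing as the data grows.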

Assert that expected failures occur to ensure the list of expected failures is accurate. Also increase the scale factor used to generate TPC-H test data to increase the chances of detecting edge-case inconsistencies in query results.

Signed-off-by: Mark S. Lewis <Mark.S.Lewis@outlook.com>
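The idea of asserting that expected failures still occur can be sketched roughly as follows. This is not the code from the PR: the class, the method and the contents of the set are hypothetical, and a real test would obtain resultsMatch by running the reference and generated SQL against PostgreSQL.

```java
import java.util.Optional;
import java.util.Set;

// Illustrative only: keeps a list of known-bad queries honest by reporting a
// problem both when an unexpected query fails and when an expected failure passes.
class ExpectedFailureCheck {

  // Hypothetical set of TPC-H query numbers currently expected to differ.
  static final Set<Integer> EXPECTED_FAILURES = Set.of(8, 14);

  // Returns an error message if the outcome contradicts the expectation,
  // or an empty Optional if the outcome is as expected.
  static Optional<String> verify(int query, boolean resultsMatch) {
    boolean expectedToFail = EXPECTED_FAILURES.contains(query);
    if (expectedToFail && resultsMatch) {
      return Optional.of(
          "Query " + query + " now produces matching results; remove it from EXPECTED_FAILURES");
    }
    if (!expectedToFail && !resultsMatch) {
      return Optional.of("Query " + query + " produces different results");
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // A known failure that still fails is fine; one that starts passing is flagged.
    System.out.println(verify(14, false).orElse("as expected"));
    System.out.println(verify(14, true).orElse("as expected"));
  }
}
```

Compared with silently skipping excluded queries, this approach makes the test fail once an upstream fix lands, so a stale entry cannot linger in the list.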

benbellick · 2026-06-11T18:33:48Z

I'm not super familiar with PostgreSQL TPC-H, but where do these test files actually come from?

I believe Niels created them (for PR #700) by modifying the existing TPC-H queries so that the SQL syntax was acceptable to PostgreSQL. He can speak to that better than I can, though.

I would personally prefer that exactly the same input SQL used to generate the Substrait plan also be used as the reference SQL. I will look at that when I get a chance.

feat(isthmus): add PostgreSQL TPC-H integration testing

a288923

Signed-off-by: Niels Pardon <par@zurich.ibm.com>

bestbeforetoday force-pushed the tpch-reference-tests branch from 553da51 to 2489ce5 on June 8, 2026 18:06

bestbeforetoday marked this pull request as ready for review June 8, 2026 18:44

bestbeforetoday requested a review from benbellick June 8, 2026 18:45

bestbeforetoday changed the title ~~feat: add PostgreSQL TPC-H integration tests~~ test: add PostgreSQL TPC-H integration tests Jun 8, 2026

bestbeforetoday force-pushed the tpch-reference-tests branch from 2489ce5 to c9be0fd on June 9, 2026 12:30

feat: use tpchgen-cli to generate test data

3f317ba

Signed-off-by: Mark S. Lewis <Mark.S.Lewis@outlook.com>

bestbeforetoday force-pushed the tpch-reference-tests branch from c9be0fd to 3f317ba on June 9, 2026 13:06

nielspardon reviewed Jun 9, 2026

Comment thread isthmus/src/test/java/io/substrait/isthmus/integration/SuccessfulExitCheckStrategy.java Outdated

fix: typo on SuccessfulExitCheckStrategy JavaDoc

3f36dca

Signed-off-by: Mark S. Lewis <Mark.S.Lewis@outlook.com>

bestbeforetoday force-pushed the tpch-reference-tests branch from 72f0420 to 3f36dca on June 9, 2026 17:42

test: better PostgreSqlIntegrationTest failure reporting

73eddfd

Signed-off-by: Mark S. Lewis <Mark.S.Lewis@outlook.com>

nielspardon reviewed Jun 10, 2026

nielspardon mentioned this pull request Jun 11, 2026

feat(isthmus): add PostgreSQL TPC-H integration testing #700

Closed

benbellick reviewed Jun 11, 2026

bestbeforetoday wants to merge 5 commits into substrait-io:main from bestbeforetoday:tpch-reference-tests
