Add postgres_synced_tables bundle resource #5268
Co-authored-by: Isaac
Register postgres_synced_tables in apitypes.yml, add a testserver route for the operation-polling URL, and wire up the TestAll fixture entry. Co-authored-by: Isaac
…/bind/mock allow-lists

Add postgres_synced_tables to the per-resource allow-lists and mock fixtures that enumerate all bundle resource types:
- unsupportedResources in apply_bundle_permissions_test (no ACL API)
- allResourceTypes expected list and allowList in run_as_test (no run_as concept)
- mockBundle in apply_target_mode_test + notUserNamed carve-out
- TestResourcesBindSupport fixture + GetSyncedTable mock expectation
- Refresh acceptance/bundle/refschema/out.fields.txt snapshot

Co-authored-by: Isaac
Running ./task generate-direct-resources populates the missing ignore_remote_changes block for postgres_synced_tables. Every spec field is now marked spec:input_only, so the planner stops flagging the empty spec returned by GET as drift. The manual recreate_on_changes block in resources.yml is unchanged on purpose: it covers the intent side (a user editing databricks.yml must still trigger delete+create because no UpdateSyncedTable endpoint exists). Added a comment at the top of the block explaining how the two declarations cooperate. Same pattern as secret_scopes, which is the other no-Update resource.
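The way the two declarations cooperate can be sketched as a small decision function. This is a simplified model written for illustration, not the planner's actual code: `fieldRule`, `plan`, and the `Action` values are hypothetical names, and the model assumes a resource with no Update endpoint, so any unsuppressed difference ends in a recreate.

```go
package main

import "fmt"

// Action is what this illustrative planner decides for one field.
type Action string

const (
	Skip     Action = "skip"
	Recreate Action = "recreate"
)

// fieldRule is a hypothetical stand-in for the two declarations in resources.yml.
type fieldRule struct {
	recreateOnChange   bool // a local edit to this field forces delete+create
	ignoreRemoteChange bool // a state-vs-GET difference is not treated as drift
}

// plan compares one field across the local config, the saved state,
// and the value returned by the remote GET.
func plan(rule fieldRule, config, state, remote string) Action {
	// Intent side: the user edited databricks.yml since the last deploy.
	if config != state && rule.recreateOnChange {
		return Recreate
	}
	// Drift side: the remote no longer matches what was last deployed.
	// Without an Update endpoint, unsuppressed drift can only be fixed by recreating.
	if remote != state && !rule.ignoreRemoteChange {
		return Recreate
	}
	return Skip
}

func main() {
	both := fieldRule{recreateOnChange: true, ignoreRemoteChange: true}

	// No-op deploy: GET returns an empty spec, which is ignored.
	fmt.Println(plan(both, "v1", "v1", ""))

	// Real config edit: still triggers a recreate.
	fmt.Println(plan(both, "v2", "v1", ""))

	// Missing ignore_remote_changes: every deploy recreates.
	onlyRecreate := fieldRule{recreateOnChange: true}
	fmt.Println(plan(onlyRecreate, "v1", "v1", ""))
}
```

The third case is the failure mode that the ignore block exists to prevent: with only `recreate_on_changes` declared, an empty spec from GET looks like drift on every deploy.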
Adds a no_drift invariant test config for postgres_synced_tables. This is the regression guard for the V12 forever-recreate bug — if RemapState ever drops the ignore_remote_changes coverage on a spec field, this test will catch the bug at CI time instead of customer deploy time. Excluded from the Cloud variant for the same reason as the other postgres_* configs: Lakebase Autoscaling is AWS-only and the production fixture used by the cloud variant doesn't have a Lakebase project bound to the test workspace.
## Changes
New `postgres_catalogs` resource binding a Unity Catalog catalog to a Postgres database on a Lakebase Autoscaling branch. Supported on both direct and terraform deployment engines.
The spec fields are classified as both `recreate_on_changes` and `ignore_remote_changes: input_only`. The two cover orthogonal diffs the planner runs — recreate fires on local edits to an immutable field, and ignore_remote silences the phantom drift from GET not echoing spec back today. Lift the `input_only` entries once the backend starts returning spec.
## Tests
Acceptance coverage: `basic` and `recreate` exercise each engine, plus the existing `no_drift` and `migrate` invariants pick up the new resource. Both engines produce identical human-readable output and identical wire bodies; only the captured request streams diverge by filename (`out.requests.{direct,terraform}.json`).
Verified end to end on a live workspace: the bundle deploys a project and catalog, a row written directly into the bound Postgres database becomes visible through the UC federated view, and a follow-up write shows up on re-read.
_This PR was written by Claude Code._
Resolves sibling-add conflicts across:
- Bundle config registration (Resources struct, AllResources, SupportedResources)
- Direct engine all.go + apitypes.yml + resources.yml
- Testserver fake_workspace, postgres CRUD switch, handler routes
- All-resources allow-lists (type_test, statemgmt fixtures, mutator tests)
- No-drift invariant matrix
- Changelog

All conflicts were the same shape: both branches added a new entry next to the existing postgres_* siblings. Kept both, ordered catalogs before synced_tables to match the production sequencing (catalog must exist before a synced table can reference it).
Mirrors what postgres-catalog did: the resource is now produced for both the direct and terraform engines.
- New tfdyn converter in bundle/deploy/terraform/tfdyn/, with unit tests that lock in the wire shape (spec block, scheduling_policy enum, primary_key_columns list, nested new_pipeline_spec).
- Wired into GroupToTerraformName (databricks_postgres_synced_table), the postgres-resource set in interpolate.go and util.go, and removed from lifecycle_test.go's direct-only ignore list.
- Acceptance test test.toml now runs both engines (direct + terraform) on AWS only, matching the catalog config. basic/script writes out.requests.$DATABRICKS_BUNDLE_ENGINE.json so the captured wire bodies are visible per engine.
- Renamed the existing single-engine out.requests.json to out.requests.direct.json and generated out.requests.terraform.json.
- Regenerated affected baselines. The migrate invariant test now passes for postgres_synced_tables too, since the resource is no longer direct-only.
The bundle now declares its own postgres_project + postgres_catalog chain alongside the synced table, so the cloud variant can deploy against a real workspace without out-of-band setup.
- Source table is samples.nyctaxi.trips directly (ships on every UC-enabled workspace; no intermediate CREATE TABLE needed).
- A single UC schema is still created in main for the pipeline's internal storage (storage_catalog/storage_schema), which must pre-exist on the workspace.
- recreate test toggles timeseries_key instead of scheduling_policy, so the second deploy doesn't require CDF on samples.nyctaxi.trips (which is read-only).
- Cross-resource references go through the catalog's catalog_id (synced_table_id) and the project's id (branch path), exercising the interpolate-postgres-resources path on both engines.
- test.toml gains [[Server]] stubs for the SQL statements API and the UC tables-delete API so the local variant can run the schema create.
- Regenerated baselines for both engines.
Drop the schema-create / schema-delete shell commands from the test
scripts and declare the storage schema as a schemas resource in the
bundle. Same lifecycle as everything else — bundle destroy walks the
dependency graph and tears it down in order, so a partial failure
leaks one fewer thing.
new_pipeline_spec now references the schema via:
storage_catalog: ${resources.schemas.pipeline_storage.catalog_name}
storage_schema: ${resources.schemas.pipeline_storage.name}
which exercises one more piece of cross-resource interpolation.
Also drops the SQL / UC tables-delete server stubs from test.toml
since the local scripts no longer hit those endpoints.
Found on aws-prod-ucws: a deploy targeting samples.nyctaxi.trips as the synced-table source returns "Cannot create more than 20 synced database table(s) per source table." (400 BAD_REQUEST). There's a hard server-side limit of 20 synced tables per source, and samples.nyctaxi.trips has already reached that limit on shared workspaces. The original script created a per-test source table for this reason (see synced_database_tables/basic for the same workaround). I removed it chasing simplicity; this restores it. The pipeline-storage schema stays bundle-managed (the schemas resource added in the previous commit); only the source-table side goes back to being script-managed.
The previous jq filter deleted only the random fields (timestamps, uid, pipeline_id, message). It left detailed_state, which is timing-dependent on cloud: real workspaces are still in SYNCED_TABLE_PROVISIONING_PIPELINE_RESOURCES at the GET, while the fake testserver always returns SYNCED_TABLE_ONLINE. The cloud response also carries ongoing_sync_progress and project, which the fake doesn't. Switch to projecting just the deterministic identity + UC provisioning state, which is ACTIVE in both environments.
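The difference between deleting known-random fields and projecting known-stable ones can be sketched in Go. This is an illustration of the allow-list idea only, not the jq filter used in the test script; `project` is a hypothetical helper, and the `provisioning_state` key is an assumed placeholder for whatever field carries the UC provisioning state.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// project keeps only the allow-listed top-level keys of a JSON object.
// Fields that one environment adds and the other lacks can never leak
// into the output, which a deny-list cannot guarantee.
func project(raw []byte, keep ...string) (string, error) {
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(raw, &obj); err != nil {
		return "", err
	}
	out := make(map[string]json.RawMessage, len(keep))
	for _, k := range keep {
		if v, ok := obj[k]; ok {
			out[k] = v
		}
	}
	// json.Marshal sorts map keys, so the result is stable across runs.
	b, err := json.Marshal(out)
	return string(b), err
}

func main() {
	// Hypothetical responses: the cloud one carries extra, timing-dependent fields.
	cloud := []byte(`{"name":"synced_tables/main.public.trips","provisioning_state":"ACTIVE",
		"detailed_state":"SYNCED_TABLE_PROVISIONING_PIPELINE_RESOURCES","uid":"abc"}`)
	fake := []byte(`{"name":"synced_tables/main.public.trips","provisioning_state":"ACTIVE",
		"detailed_state":"SYNCED_TABLE_ONLINE"}`)

	a, _ := project(cloud, "name", "provisioning_state")
	b, _ := project(fake, "name", "provisioning_state")
	fmt.Println(a == b)
}
```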
# Conflicts:
# NEXT_CHANGELOG.md
# acceptance/bundle/invariant/test.toml
# bundle/config/mutator/resourcemutator/apply_bundle_permissions_test.go
# bundle/config/mutator/resourcemutator/apply_target_mode_test.go
# bundle/config/resources.go
# bundle/config/resources_test.go
# bundle/deploy/terraform/interpolate.go
# bundle/deploy/terraform/pkg.go
# bundle/deploy/terraform/util.go
# bundle/direct/dresources/all.go
# bundle/direct/dresources/all_test.go
# bundle/direct/dresources/apitypes.yml
# bundle/direct/dresources/postgres_catalog.go
# bundle/direct/dresources/resources.yml
# bundle/direct/dresources/type_test.go
# bundle/statemgmt/state_load_test.go
# libs/testserver/fake_workspace.go
# libs/testserver/handlers.go
# libs/testserver/postgres.go
Drop the TrimSyncedTablesPrefix helper + unit test. Source synced_table_id inline in RemapState with strings.TrimPrefix(remote.Name, "synced_tables/"), mirroring postgres_catalogs.RemapState which sources catalog_id inline via remote.Status.CatalogId. The synced-table API doesn't expose the user-facing id as a named field on either SyncedTable or its Status — it only appears as the trailing component of remote.Name, so the prefix strip is structurally necessary. The docstring on the field calls out the asymmetry with postgres_catalogs. Also: move PostgresSyncedTables above PostgresOperations in the testserver struct + constructor (so the operations map stays after the resource maps). Also: link the changelog entry to #5268.
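The inline prefix strip relies on standard `strings.TrimPrefix` behavior, which removes the prefix only when it appears at the start of the string. A minimal sketch of that behavior follows; the wrapper name `syncedTableID` is hypothetical and exists only to make the example callable.

```go
package main

import (
	"fmt"
	"strings"
)

// syncedTableID is a hypothetical wrapper around the inline expression
// strings.TrimPrefix(remote.Name, "synced_tables/").
func syncedTableID(name string) string {
	return strings.TrimPrefix(name, "synced_tables/")
}

func main() {
	names := []string{
		"synced_tables/main.public.trips",    // leading prefix is removed
		"main.public.trips_synced",           // no prefix: returned unchanged
		"synced_tables/main.synced_tables/x", // only the leading occurrence is removed
	}
	for _, n := range names {
		fmt.Printf("%q -> %q\n", n, syncedTableID(n))
	}
}
```

The third case covers the concern that a later occurrence of `synced_tables/` inside the string must be left alone.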
// TrimSyncedTablesPrefix extracts the user-facing synced table id from the API name.
// E.g. "synced_tables/main.public.trips" -> "main.public.trips".
func TrimSyncedTablesPrefix(name string) string {
seems this can just live in bundle/direct/dresources/postgres_synced_table.go
want string
}{
{"happy path", "synced_tables/main.public.trips_synced", "main.public.trips_synced"},
{"missing prefix is returned unchanged", "main.public.trips_synced", "main.public.trips_synced"},
add test that this only trims the prefix and not other occurrences of synced_tables/ inside the string
# drift for the same fields because the GET API does not echo back the
# spec. Together they make no-op deploys idempotent while a real config
# edit still triggers a recreate. Same pattern as secret_scopes.
recreate_on_changes:
Plan: 1 to add, 0 to change, 1 to delete, 3 unchanged
contains error: 'Plan: 1 to add, 0 to change, 1 to delete, 0 unchanged' not found in the output.
seems that's the wrong output here?
if s.SyncedTableId == "" {
return
}
baseURL.Path = "explore/data/" + s.SyncedTableId
From comment above
SyncedTableId is the user-specified three-part UC name (catalog.schema.table).
then we need to do something like https://github.com/databricks/cli/pull/5123/changes#diff-6455f929cc9efa54a058012ab9bccafce863f12cf53943d94e5fc34a2aae0f44R65
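One possible reading of this comment, sketched under an assumption: if the explore URL expects the catalog, schema, and table as separate path segments, the dots in the three-part name need to become slashes. The linked change is not reproduced here, and `explorePath` is a hypothetical helper, so treat this as an illustration of the idea rather than the fix that was applied.

```go
package main

import (
	"fmt"
	"strings"
)

// explorePath is a hypothetical helper. It ASSUMES the explore URL takes
// catalog/schema/table as path segments, so each "." in the three-part
// UC name is replaced with "/".
func explorePath(syncedTableID string) string {
	if syncedTableID == "" {
		return ""
	}
	return "explore/data/" + strings.ReplaceAll(syncedTableID, ".", "/")
}

func main() {
	fmt.Println(explorePath("main.public.trips_synced"))
	// Output: explore/data/main/public/trips_synced
}
```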
// Postgres Synced Tables:
server.Handle("POST", "/api/2.0/postgres/synced_tables", func(req Request) any {
syncedTableID := req.URL.Query().Get("synced_table_id")
any reason we extract it here and not inside PostgresSyncedTableCreate? I guess testserver/handlers.go is a bit inconsistent in that regard
## Changes
New `postgres_synced_tables` resource that syncs a Unity Catalog Delta table into a Postgres table on a Lakebase Autoscaling branch. Supported on both direct and terraform deployment engines.
## Tests
Acceptance coverage: `basic` and `recreate` exercise each engine, plus the existing `no_drift` and `migrate` invariants pick up the new resource. Both engines produce identical human-readable output and identical wire bodies; only the captured request streams diverge by filename (`out.requests.{direct,terraform}.json`).
Verified end to end on a live workspace: the bundle deploys a project, lakebase catalog, pipeline-storage schema, and synced table; the pipeline materializes in under a minute; `SELECT` against the destination through the UC federated view returns the rows from the source Delta table; and `bundle destroy` cleans up the full chain.
This pull request and its description were written by Isaac.