Skip to content

[python] Align SnapshotManager constructor with Java#7758

Open
TheR1sing3un wants to merge 1 commit intoapache:masterfrom
TheR1sing3un:py-snapshot-manager-constructor-align
Open

[python] Align SnapshotManager constructor with Java#7758
TheR1sing3un wants to merge 1 commit intoapache:masterfrom
TheR1sing3un:py-snapshot-manager-constructor-align

Conversation

@TheR1sing3un
Copy link
Copy Markdown
Member

Purpose

Switch the Python SnapshotManager.__init__ from (table) to the Java-aligned (file_io, table_path, branch=None, snapshot_loader=None), so the class no longer depends on FileStoreTable. Mirrors paimon-core/.../utils/SnapshotManager.java's constructor (which takes FileIO, Path tablePath, @Nullable String branchName, @Nullable SnapshotLoader, @Nullable Cache); the Cache parameter is left out of scope (Caffeine has no Python equivalent yet).

The previous Python signature held self.table = table and reached through self.table.file_io / self.table.catalog_environment.snapshot_loader() / self.table.current_branch() to populate fields. That made SnapshotManager a layering-violation -- the snapshot layer reaching back into the table layer -- and forced mock tests to stub a four-attribute mock table just to construct the manager. The new constructor takes the four basics directly.

In scope

  • pypaimon/snapshot/snapshot_manager.py: new constructor; self.table field removed; copy_with_branch rewritten to construct the rebranched manager via the new constructor (no field-swap).
  • pypaimon/table/file_store_table.py: snapshot_manager() remains the canonical factory and now wires (file_io, table_path, current_branch(), catalog_environment.snapshot_loader()) directly.
  • 11 production call sites + 7 raw-construction test sites migrated to table.snapshot_manager().
  • Mock-style tests in streaming_table_scan_test.py / file_store_commit_test.py / partition_predicate_test.py: drop the @patch('...SnapshotManager') decorator and the mock parameter; wire each test's mock_snapshot_manager through table.snapshot_manager.return_value, matching how production code obtains its instance.
  • snapshot_manager_test.py: simplified mock setUp -- no more layered `Mock()` table, just `SnapshotManager(file_io, "/tmp/test_table")`.

Out of scope

  • Caffeine-style `Cache<Path, Snapshot>` parameter (Java-only; Python Cache port is a separate effort).
  • Other Java SnapshotManager API gaps (latest_snapshot_id / earliest_snapshot_id / snapshot_exists / etc.) -- separate follow-up PRs.

Tests

From `paimon-python/`:

```
pytest pypaimon/tests/snapshot_manager_test.py pypaimon/tests/branch_manager_test.py \
pypaimon/tests/streaming_table_scan_test.py pypaimon/tests/file_store_commit_test.py \
pypaimon/tests/partition_predicate_test.py -q # 76 passed
pytest pypaimon/tests/ --ignore=pypaimon/tests/rest --ignore=pypaimon/tests/e2e_rest \
--ignore=pypaimon/tests/sql_context_test.py # 963 passed
flake8 --config=dev/cfg.ini # clean
```

(`sql_context_test.py` requires the optional `datafusion` extra which isn't installed in this environment; failures there are pre-existing on master and unrelated to this diff.)

Anti-divergence checklist

  • Constructor parameter order matches Java `SnapshotManager(FileIO, Path, @nullable String, @nullable SnapshotLoader, ...)`.
  • `branch == "main"` keeps existing paths bit-identical (`BranchManager.branch_path(p, "main") == p`); call sites that passed `current_branch() == "main"` get zero-regression behaviour.
  • `copy_with_branch` mirrors Java's factory: new `(file_io, table_path, branch_name, snapshot_loader.copy_with_branch(...) if loader else None)`.

Generative AI disclosure

Drafted with assistance from a generative AI tool. All code, tests, and Java alignment were reviewed and validated by the contributor.


Note: built on top of #7756. Once #7756 merges, this branch will rebase cleanly onto master.

Switch SnapshotManager.__init__ from (table) to the Java-aligned
(file_io, table_path, branch=None, snapshot_loader=None) so the class
no longer depends on FileStoreTable. The new manager carries its own
file_io / table_path / branch / snapshot_loader fields, mirroring
paimon-core/.../utils/SnapshotManager.java.

FileStoreTable.snapshot_manager() remains the canonical factory and
now wires those four basics. All raw SnapshotManager(table) call sites
across production and tests are migrated to table.snapshot_manager().
copy_with_branch is rewritten to construct the rebranched manager
directly via the new constructor (no field-swap).

Mock-style tests that patched the SnapshotManager class to intercept
its instances now wire the mock through table.snapshot_manager.return_value,
which matches how production code obtains its instance.

Follow-up to apache#7756.
@TheR1sing3un TheR1sing3un force-pushed the py-snapshot-manager-constructor-align branch from d0aa481 to 7e1153a Compare May 3, 2026 01:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant