
fix: tables schema command replace manual _delta_log JSON parsing with DeltaTable.schema() from the deltalake library#229

Open
pkontek wants to merge 4 commits into microsoft:main from SoletPL:tables_schema_fix

Conversation


pkontek commented Apr 30, 2026

📥 Pull Request

✨ Description of new changes

Summary: The fab tables schema command failed with [InvalidDeltaTable] Failed to extract the table schema for Delta tables where pre-checkpoint JSON commit log files had been cleaned up. The previous implementation manually scanned _delta_log/*.json files via the OneLake API looking for a metaData entry — an approach that breaks once Delta Lake compacts logs into .checkpoint.parquet files and removes the preceding JSON entries.

Context: Delta Lake creates a checkpoint every 10 transactions by default and retains log files according to the configured retention policy. Once older JSON commit files are removed, the metaData entry (which carries the schema) is no longer available in the remaining JSON logs, causing the command to fail even for healthy, accessible tables.
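To make the failure mode above concrete, here is a minimal sketch that models a `_delta_log` listing as a plain list of file names. The helper names, the sample listing, and the `metadata_version` parameter are hypothetical and are not taken from the CLI. The sketch only recognizes single-file checkpoints named `<version>.checkpoint.parquet`.

```python
import re

# Delta log file names use a zero-padded 20-digit version number.
_COMMIT_RE = re.compile(r"^(\d{20})\.json$")
_CHECKPOINT_RE = re.compile(r"^(\d{20})\.checkpoint\.parquet$")


def _versions(file_names, pattern):
    """Return the sorted versions of all file names matching the pattern."""
    found = []
    for name in file_names:
        match = pattern.match(name)
        if match:
            found.append(int(match.group(1)))
    return sorted(found)


def latest_checkpoint_version(file_names):
    """Return the highest single-file checkpoint version, or None."""
    versions = _versions(file_names, _CHECKPOINT_RE)
    return versions[-1] if versions else None


def schema_only_in_checkpoint(file_names, metadata_version=0):
    """True when the JSON commit that carried metaData is no longer listed.

    metadata_version is an assumption of this sketch: the version whose
    commit wrote the metaData action (0 for a table whose schema never changed).
    """
    return metadata_version not in _versions(file_names, _COMMIT_RE)


# Hypothetical listing after older JSON commits were cleaned up.
SAMPLE_LISTING = [
    "_last_checkpoint",
    "00000000000000000010.checkpoint.parquet",
    "00000000000000000010.json",
    "00000000000000000011.json",
    "00000000000000000012.json",
]
```

With this listing, a scan of the remaining JSON files cannot find the original `metaData` action, while a reader that starts from the checkpoint at version 10 still can.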

Dependencies: Adds deltalake>=0.18.0 as a new dependency. The deltalake library correctly resolves the table schema from both JSON logs and Parquet checkpoints using the standard Delta protocol, making the implementation robust and protocol-compliant.

Changes:

  • Replace manual _delta_log JSON parsing with DeltaTable.schema() from the deltalake library
  • Add deltalake>=0.18.0 to project dependencies in pyproject.toml
  • Simplify fab_tables_schema.py by removing ~50 lines of fragile log-parsing logic
  • Correct typo in error constant for invalid Delta table
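The replacement approach in the list above could look roughly like the following sketch. It is not the code from this PR: the function names, the ABFSS URI layout, and the flattened output shape are assumptions made for illustration. The `DeltaTable(...)` constructor, `schema()`, and `Schema.to_json()` calls are from the deltalake Python package, and the storage options mirror the ones quoted in the review thread.

```python
import json


def build_table_uri(workspace_id, lakehouse_id, table_name):
    """Build an ABFSS URI for a lakehouse table (layout assumed for this sketch)."""
    return (
        f"abfss://{workspace_id}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse_id}/Tables/{table_name}"
    )


def schema_fields(schema_json):
    """Flatten a Delta schema JSON string into name/type rows.

    Primitive types are plain strings in the Delta schema JSON; complex
    types (struct, array, map) are objects, so only their kind is reported.
    """
    rows = []
    for field in json.loads(schema_json)["fields"]:
        field_type = field["type"]
        if not isinstance(field_type, str):
            field_type = field_type["type"]
        rows.append({"name": field["name"], "type": field_type})
    return rows


def read_schema(table_uri, token):
    """Load the table through deltalake and return its flattened schema."""
    # Imported lazily so the helpers above work without deltalake installed.
    from deltalake import DeltaTable

    table = DeltaTable(
        table_uri,
        storage_options={
            "bearer_token": token,
            "use_fabric_endpoint": "true",
        },
    )
    return schema_fields(table.schema().to_json())
```

Because the library replays the log from the latest checkpoint, `read_schema` does not depend on any particular JSON commit file still being present.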

Copilot AI review requested due to automatic review settings April 30, 2026 13:17
pkontek requested a review from a team as a code owner April 30, 2026 13:17
Contributor

Copilot AI left a comment


Pull request overview

Fixes fab tables schema failing on Delta tables whose pre-checkpoint JSON logs were removed by log cleanup, by switching schema extraction to the deltalake library, which reads the schema according to the Delta protocol.

Changes:

  • Replaced manual _delta_log/*.json scanning with deltalake.DeltaTable(...).schema() for checkpoint-aware schema extraction.
  • Added deltalake>=0.18.0 to project dependencies.
  • Renamed typo’d error constant ERROR_INVALID_DETLA_TABLE to ERROR_INVALID_DELTA_TABLE and wired it into the tables schema command.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/fabric_cli/core/fab_constant.py | Fixes typo in error constant name; minor formatting cleanup. |
| src/fabric_cli/commands/tables/fab_tables_schema.py | Refactors schema retrieval to use deltalake over ABFSS + token, removing fragile log parsing. |
| pyproject.toml | Adds deltalake dependency required for robust schema extraction. |
| .changes/unreleased/fixed-20260430-130558.yaml | Adds release note entry for the schema extraction fix. |

Comment thread src/fabric_cli/commands/tables/fab_tables_schema.py Outdated
Comment on lines +35 to +39

    table = DeltaTable(
        table_uri,
        storage_options={
            "bearer_token": token,
            "use_fabric_endpoint": "true",

Copilot AI Apr 30, 2026


This change introduces a new schema extraction path via deltalake.DeltaTable (ABFSS URI + token-based access) but there are no tests covering the success path or the TableNotFoundError -> FabricCLIError(ERROR_INVALID_DELTA_TABLE) mapping. Please add command-level tests that mock DeltaTable/schema() to validate output structure and error handling.

Copilot generated this review using guidance from repository custom instructions.
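As a rough illustration of the tests requested above, the sketch below mocks the table factory and checks both the success path and the error mapping. Everything in it is a stand-in: `TableNotFoundError` and `FabricCLIError` are local placeholder classes (the real ones live in deltalake and the CLI, and their signatures may differ), `get_schema` is a hypothetical function rather than the command's actual entry point, and the value of `ERROR_INVALID_DELTA_TABLE` is assumed from the error text quoted in the PR description.

```python
import json
from unittest import mock

# Assumed value, based on the "[InvalidDeltaTable]" prefix in the PR description.
ERROR_INVALID_DELTA_TABLE = "InvalidDeltaTable"


class TableNotFoundError(Exception):
    """Stand-in for deltalake.exceptions.TableNotFoundError."""


class FabricCLIError(Exception):
    """Stand-in for the CLI error type; the real signature may differ."""

    def __init__(self, message, status_code):
        super().__init__(message)
        self.status_code = status_code


def get_schema(table_uri, token, table_factory):
    """Hypothetical schema helper with the table constructor injected."""
    options = {"bearer_token": token, "use_fabric_endpoint": "true"}
    try:
        table = table_factory(table_uri, storage_options=options)
    except TableNotFoundError as exc:
        raise FabricCLIError(
            "Failed to extract the table schema", ERROR_INVALID_DELTA_TABLE
        ) from exc
    return json.loads(table.schema().to_json())["fields"]


def test_success_path():
    schema_json = json.dumps(
        {"type": "struct", "fields": [{"name": "id", "type": "long"}]}
    )
    factory = mock.MagicMock()
    factory.return_value.schema.return_value.to_json.return_value = schema_json

    fields = get_schema("abfss://example", "fake-token", factory)

    assert [f["name"] for f in fields] == ["id"]
    factory.assert_called_once()


def test_table_not_found_is_mapped():
    factory = mock.MagicMock(side_effect=TableNotFoundError("no log found"))
    try:
        get_schema("abfss://example", "fake-token", factory)
    except FabricCLIError as err:
        assert err.status_code == ERROR_INVALID_DELTA_TABLE
    else:
        raise AssertionError("expected FabricCLIError")
```

In the real test suite the same idea would more likely be expressed by patching `DeltaTable` where the command module imports it, rather than by injecting a factory.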
Author


@copilot apply changes based on this feedback

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 30, 2026 13:26
Contributor

Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

pkontek changed the title from "Tables schema fix" to "fix: tables schema command replace manual _delta_log JSON parsing with DeltaTable.schema() from the deltalake library" Apr 30, 2026

Development

Successfully merging this pull request may close these issues.

[BUG] tables schema command fails to extract schema from Delta tables with checkpoint files
