Add more (high-level) details about Pebble compression features#23245

Open

rmloveland wants to merge 1 commit into main from 20260430-DOC-16758-pebble-compression-features

Conversation

@rmloveland
Contributor

Fixes DOC-16758

@netlify

netlify Bot commented Apr 30, 2026

Deploy Preview for cockroachdb-interactivetutorials-docs canceled.

Name Link
🔨 Latest commit c04c3b7
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-interactivetutorials-docs/deploys/69f397e7cb24280008ba1b41

@netlify

netlify Bot commented Apr 30, 2026

Deploy Preview for cockroachdb-api-docs canceled.

Name Link
🔨 Latest commit c04c3b7
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-api-docs/deploys/69f397e7cb24280008ba1b45

@netlify

netlify Bot commented Apr 30, 2026

Netlify Preview

Name Link
🔨 Latest commit c04c3b7
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-docs/deploys/69f397e716d6360008868005
😎 Deploy Preview https://deploy-preview-23245--cockroachdb-docs.netlify.app


##### SST compression

Pebble compresses SSTable and blob value data to reduce physical storage use. The default profile, `fastest`, is optimized for low CPU overhead and is appropriate for most workloads.
Member


I would say that fastest uses MinLZ on amd64 and arm64 platforms.


Pebble compresses SSTable and blob value data to reduce physical storage use. The default profile, `fastest`, is optimized for low CPU overhead and is appropriate for most workloads.

For advanced storage tuning, CockroachDB exposes the `storage.sstable.compression_algorithm` [cluster setting]({% link {{ page.version.version }}/cluster-settings.md %}). The profile values are ordered by increasing compression effort: `fastest`, `fast`, `balanced`, and `good`. Higher-effort profiles can improve compression for some workloads, but can also increase CPU usage for writes, compactions, and reads that decompress data. Most users do not need to tune this setting. Work with [Cockroach Labs Support](https://support.cockroachlabs.com/) before changing this setting in production.
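As a hedged illustration of the setting described above (the profile value shown is one of the four listed; change this only in coordination with Cockroach Labs Support):

```sql
-- Inspect the current SSTable compression profile.
SHOW CLUSTER SETTING storage.sstable.compression_algorithm;

-- Opt into a higher-effort profile (illustrative; coordinate with Support first).
SET CLUSTER SETTING storage.sstable.compression_algorithm = 'balanced';
```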
Member


These profiles enable selective use of Zstd depending on the block type, LSM level, and compression benefit. Higher-effort profiles use Zstd more frequently and can improve ..


The output CSV file is periodically rewritten while the command is running. Even if the command is interrupted, you can still use the most recently written output.

### Interpret results
Member


I don't think we should encourage customers to interpret these results in their current form. We should say that they should just consult us with the data.


Changing `storage.sstable.compression_algorithm` does not immediately recompress existing SST files. SSTs are immutable, so a new setting applies as Pebble writes new SSTs or rewrites existing SSTs during compaction, ingestion, restore, or other SST-writing work. During a transition, a store can contain SSTs compressed with multiple algorithms.

To evaluate the compression behavior of an existing store or backup, use [`cockroach debug pebble db analyze-data`]({% link {{ page.version.version }}/cockroach-debug-pebble-db-analyze-data.md %}). Compare the compression ratio with the compression and decompression throughput for representative data before changing the cluster setting.
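A minimal sketch of invoking the command named above. The store path is a hypothetical placeholder, and any additional flags should be taken from the linked reference page rather than from this example:

```shell
# Analyze compression ratio and (de)compression throughput for an existing store.
# /mnt/data1/cockroach is a hypothetical store directory; substitute your own.
cockroach debug pebble db analyze-data /mnt/data1/cockroach
```

As noted elsewhere in this page, the output CSV is periodically rewritten while the command runs, so a partial run still yields usable data.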
Member


I'd say "To evaluate the CPU usage vs size tradeoff on your particular data, use .."
