326 changes: 326 additions & 0 deletions content/en/docs/next/applications/backup-and-recovery.md
@@ -0,0 +1,326 @@
---
title: "Application Backup and Recovery"
linkTitle: "Backup and Recovery"
description: "Back up and restore managed databases (Postgres, MariaDB, ClickHouse, FoundationDB) with BackupJob, Plan, and RestoreJob."
weight: 4
---

This guide covers backing up and restoring **Cozystack-managed databases** — Postgres, MariaDB, ClickHouse, and FoundationDB — as a tenant user: running one-off and scheduled backups, checking status, and restoring from a backup either in place or into a separate target instance.

{{% alert color="warning" %}}
**These backups are data-only.** Each strategy snapshots the database contents through the operator's native mechanism (CloudNativePG Barman, mariadb-operator dumps, Altinity `clickhouse-backup`, FoundationDB `backup_agent`). They do **not** capture the `apps.cozystack.io/*` CR, its `HelmRelease`, chart values, or operator-managed Secrets.

To restore you must either:
- keep the source application alive and restore in place (each driver re-bootstraps data into the existing operator-managed cluster), **or**
- pre-provision an empty target application of the same Kind, then restore into it.

For backups that include the application's Helm release, CRs, and PVC snapshots (used for VMInstance / VMDisk), see [Backup and Recovery (VMs)]({{% ref "/docs/next/virtualization/backup-and-recovery" %}}).
{{% /alert %}}

## Prerequisites

- A `BackupClass` exists in the cluster for the application Kind you want to back up. Run `kubectl get backupclasses` to confirm; if none is present, ask your administrator to follow the [Managed Application Backup Configuration]({{% ref "/docs/next/operations/services/managed-app-backup-configuration" %}}) guide.
- S3-compatible storage. Either provision an in-cluster `Bucket` (shown below) or use external S3 coordinates supplied by your administrator.
- `kubectl` and a kubeconfig for the management cluster.

## List available BackupClasses

`BackupClass` resources are cluster-scoped and tell you which application Kinds can be backed up and which driver handles each:

```bash
kubectl get backupclasses
```

Example output:

```
NAME                       AGE
postgres-data-backup       14m
mariadb-data-backup        14m
clickhouse-data-backup     14m
foundationdb-data-backup   14m
velero                     1d
```

Use the `BackupClass` name when creating a `BackupJob` or `Plan`. The examples below assume `tenant-user` for the tenant namespace; substitute your own.
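
To check which application Kind a given class targets before wiring jobs to it, `kubectl describe` works as usual (the exact fields shown depend on the `BackupClass` CRD, so treat the output layout as indicative):

```bash
kubectl describe backupclass postgres-data-backup
```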

## Provision the storage Bucket

If your administrator has not pre-configured external S3, provision an in-cluster `Bucket` in the tenant namespace:

```yaml
apiVersion: apps.cozystack.io/v1alpha1
kind: Bucket
metadata:
  name: db-backups
  namespace: tenant-user
spec:
  users:
    backup:
      readonly: false
```

```bash
kubectl apply -f bucket.yaml
kubectl -n tenant-user wait hr/bucket-db-backups --for=condition=ready --timeout=300s
```

The `Bucket` controller materialises a Secret named `bucket-<name>-backup` in the namespace, carrying a `BucketInfo` JSON blob; the S3 endpoint, bucket name, and access keys all come from there.
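
To see the coordinates the drivers will use, decode the blob — a quick sketch; the `.spec.secretS3.*` paths match the credential snippet below, while `.spec.bucketName` is an assumption based on the standard `BucketInfo` shape:

```bash
# Dump the S3 endpoint and bucket name carried by the Secret.
kubectl -n tenant-user get secret bucket-db-backups-backup \
  -o jsonpath='{.data.BucketInfo}' | base64 -d \
  | jq '{endpoint: .spec.secretS3.endpoint, bucket: .spec.bucketName}'
```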

## Create per-application backup credentials

Each driver expects per-application credential Secrets in the application namespace — the strategy templates reference them by name. The snippets below assume a single shell session: first read the bucket credentials, then run only the per-driver block for the application Kind you are setting up.

### Read the bucket credentials

Run this once per shell session. Every per-driver block below reuses `$ACCESS_KEY`, `$SECRET_KEY`, and `/tmp/bucket.json`:

```bash
kubectl -n tenant-user get secret bucket-db-backups-backup \
  -o jsonpath='{.data.BucketInfo}' | base64 -d > /tmp/bucket.json
ACCESS_KEY=$(jq -r .spec.secretS3.accessKeyID /tmp/bucket.json)
SECRET_KEY=$(jq -r .spec.secretS3.accessSecretKey /tmp/bucket.json)
```

If you start a new shell, re-run that snippet before continuing.
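
A quick sanity check that the current session still has everything the later blocks rely on:

```bash
# Both keys exported and the BucketInfo dump still on disk?
[ -n "$ACCESS_KEY" ] && [ -n "$SECRET_KEY" ] && [ -s /tmp/bucket.json ] \
  && echo "credentials ready" || echo "re-run the snippet above"
```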

### Postgres

Project the credentials into the keys CNPG's Barman client expects:

```bash
kubectl -n tenant-user create secret generic my-postgres-cnpg-backup-creds \
  --from-literal=ACCESS_KEY_ID="$ACCESS_KEY" \
  --from-literal=ACCESS_SECRET_KEY="$SECRET_KEY"
```

When the S3 endpoint uses a self-signed certificate (the SeaweedFS default), also create a CA Secret:

```bash
kubectl -n tenant-user create secret generic my-postgres-cnpg-backup-ca \
  --from-file=ca.crt=/path/to/ca.crt
```

### MariaDB

```bash
kubectl -n tenant-user create secret generic my-mariadb-mariadb-backup-creds \
  --from-literal=AWS_ACCESS_KEY_ID="$ACCESS_KEY" \
  --from-literal=AWS_SECRET_ACCESS_KEY="$SECRET_KEY"
```

For self-signed endpoints, add `my-mariadb-mariadb-backup-ca` carrying `ca.crt` the same way.
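
A minimal sketch, mirroring the Postgres CA Secret above:

```bash
kubectl -n tenant-user create secret generic my-mariadb-mariadb-backup-ca \
  --from-file=ca.crt=/path/to/ca.crt
```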

### ClickHouse

ClickHouse backups read S3 credentials from the chart-emitted `<release>-backup-s3` Secret directly. Set `backup.enabled: true` on the ClickHouse application and fill in `backup.*` with the bucket coordinates — no extra Secret is needed for the BackupClass flow. See the [ClickHouse application reference]({{% ref "/docs/next/applications/clickhouse" %}}) for the `backup.*` values.
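
For illustration only, a sketch of switching the sidecar on via a merge patch — the plural resource name `clickhouses.apps.cozystack.io` and the exact `backup.*` keys are assumptions; confirm both against the reference above:

```bash
kubectl -n tenant-user patch clickhouses.apps.cozystack.io my-clickhouse --type=merge \
  -p '{"spec":{"backup":{"enabled":true,"s3Bucket":"db-backups","s3Region":"<region>","endpoint":"https://<s3-endpoint>","s3CredentialsSecret":"<secret-name>"}}}'
```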

### FoundationDB

FoundationDB's `backup_agent` requires a `blob_credentials.json` payload in a specific shape. This block reads the bucket endpoint from `/tmp/bucket.json` (created by [Read the bucket credentials](#read-the-bucket-credentials) above) and reuses `$ACCESS_KEY` / `$SECRET_KEY` from the same step:

```bash
ENDPOINT_FULL=$(jq -r .spec.secretS3.endpoint /tmp/bucket.json)
ENDPOINT_HOSTPORT=${ENDPOINT_FULL#http://}
ENDPOINT_HOSTPORT=${ENDPOINT_HOSTPORT#https://}
ACCOUNT_NAME="${ACCESS_KEY}@${ENDPOINT_HOSTPORT}"

jq -nc \
  --arg account "$ACCOUNT_NAME" \
  --arg key "$ACCESS_KEY" \
  --arg secret "$SECRET_KEY" \
  '{accounts: {($account): {api_key: $key, secret: $secret}}}' \
  > /tmp/blob_credentials.json

kubectl -n tenant-user create secret generic my-fdb-fdb-backup-creds \
  --from-file=blob_credentials.json=/tmp/blob_credentials.json
```

> **Review comment (Contributor, medium):** Using the dynamic `$ACCESS_KEY` as part of `ACCOUNT_NAME` makes it difficult for administrators to pre-configure a cluster-scoped `BackupClass`, as they would need to know each tenant's access key in advance. It is recommended to use a fixed, descriptive account name (e.g. `fdb-backup`) that matches the `accountName` parameter defined by the administrator in the `BackupClass`, so the same `BackupClass` can serve multiple tenants with their own credentials. Suggested change: `ACCOUNT_NAME="fdb-backup@${ENDPOINT_HOSTPORT}"`.

Your administrator must also patch the FoundationDB `BackupClass` parameters with the resolved `accountName`, `bucket`, `region`, and `secureConnection` values before the first backup runs. Otherwise the first `BackupJob` fails fast with a validation error (`accountName is required`) — this is the intentional fail-loud behaviour for a half-configured tenant.
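
You can verify the patch landed before submitting the first `BackupJob` — a sketch; the `parameters` layout within the `BackupClass` is an assumption, so adjust the grep as needed:

```bash
kubectl get backupclass foundationdb-data-backup -o yaml | grep -A6 'parameters:'
```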

## Run a backup

### One-off backup

Use a `BackupJob` for an ad-hoc backup (for example, before a risky change):

```yaml
apiVersion: backups.cozystack.io/v1alpha1
kind: BackupJob
metadata:
  name: my-postgres-adhoc
  namespace: tenant-user
spec:
  applicationRef:
    apiGroup: apps.cozystack.io
    kind: Postgres
    name: my-postgres
  backupClassName: postgres-data-backup
```

```bash
kubectl apply -f backupjob.yaml
kubectl -n tenant-user get backupjobs
kubectl -n tenant-user describe backupjob my-postgres-adhoc
```

When the `BackupJob` reaches `phase: Succeeded`, the driver creates a `Backup` object with the same name. That name is what you reference when restoring.
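
To block until the job completes and confirm the resulting `Backup`, a sketch assuming the phase above is surfaced at `.status.phase`:

```bash
kubectl -n tenant-user wait backupjob/my-postgres-adhoc \
  --for=jsonpath='{.status.phase}'=Succeeded --timeout=10m
kubectl -n tenant-user get backup my-postgres-adhoc
```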

Replace `Postgres` / `postgres-data-backup` with `MariaDB` / `mariadb-data-backup`, `ClickHouse` / `clickhouse-data-backup`, or `FoundationDB` / `foundationdb-data-backup` for the other drivers.

### Scheduled backup

Use a `Plan` for cron-driven recurring backups:

```yaml
apiVersion: backups.cozystack.io/v1alpha1
kind: Plan
metadata:
  name: my-postgres-daily
  namespace: tenant-user
spec:
  applicationRef:
    apiGroup: apps.cozystack.io
    kind: Postgres
    name: my-postgres
  backupClassName: postgres-data-backup
  schedule:
    type: cron
    cron: "0 */6 * * *"  # every 6 hours
```

Each scheduled run creates a `BackupJob` (and, on success, a `Backup`) named after the `Plan` with a timestamp suffix.

```bash
kubectl apply -f plan.yaml
kubectl -n tenant-user get plans
kubectl -n tenant-user get backupjobs -l backups.cozystack.io/plan=my-postgres-daily
```

## Check backup status

List `BackupJob` and `Backup` resources in the namespace:

```bash
kubectl -n tenant-user get backupjobs
kubectl -n tenant-user get backups
```

Inspect a failed run:

```bash
kubectl -n tenant-user get backupjob my-postgres-adhoc -o jsonpath='{.status.message}'
kubectl -n tenant-user describe backupjob my-postgres-adhoc
```

For driver-side detail, inspect the operator-native CR each driver materialises — one of `cnpg.io/Backup`, `k8s.mariadb.com/Backup`, `apps.foundationdb.org/FoundationDBBackup`, or the ClickHouse strategy `Pod`; the [Troubleshooting](#troubleshooting) section below gives the exact commands.

## Restore in place

An **in-place restore** replays the backup into the **same** application. Use this to roll back accidental deletion or corruption on a live database you intend to keep using under the same name.

{{% alert color="warning" %}}
In-place restore is **destructive**. Each driver wipes or replaces existing data on the source application; any writes since the backup point are lost. If you cannot afford to lose recent writes, use [Restore to a copy](#restore-to-a-copy) instead.
{{% /alert %}}

```yaml
apiVersion: backups.cozystack.io/v1alpha1
kind: RestoreJob
metadata:
  name: my-postgres-restore-inplace
  namespace: tenant-user
spec:
  backupRef:
    name: my-postgres-adhoc
  # targetApplicationRef omitted: driver restores into Backup.spec.applicationRef.
  # options:
  #   recoveryTime: "2026-05-01T12:00:00Z"  # Postgres only; RFC3339 PITR
```

```bash
kubectl apply -f restorejob.yaml
kubectl -n tenant-user get restorejobs
kubectl -n tenant-user describe restorejob my-postgres-restore-inplace
```
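
For Postgres point-in-time recovery, set the `options.recoveryTime` field shown commented out above — a sketch using the same fields:

```bash
kubectl apply -f - <<'EOF'
apiVersion: backups.cozystack.io/v1alpha1
kind: RestoreJob
metadata:
  name: my-postgres-restore-pitr
  namespace: tenant-user
spec:
  backupRef:
    name: my-postgres-adhoc
  options:
    recoveryTime: "2026-05-01T12:00:00Z"  # Postgres only; RFC3339 PITR
EOF
```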

### Per-driver caveats

- **Postgres (CNPG)** — the driver deletes the live `cnpg.io/Cluster` and its PVCs, then re-bootstraps from the Barman archive. Connections drop for the duration. `spec.options.recoveryTime` (RFC3339) is supported for point-in-time recovery; omit it to restore to the latest WAL.
- **MariaDB** — the operator replays the logical dump into the live `MariaDB` via `mariadb-import`. Pre-existing tables will collide; pre-truncate the relevant schemas if your dump does not include `DROP TABLE`.
- **ClickHouse** — the Altinity strategy does **not** pass `clickhouse-backup --rm`. You are responsible for dropping conflicting tables on the source before submitting the `RestoreJob`; otherwise the operation fails with a duplicate-table error.
- **FoundationDB** — the operator pauses the FoundationDB cluster, clears the keyspace, and replays the backup via `fdbrestore`. Any data written after the snapshot is lost. Only one `FoundationDBBackup` directory may exist per cluster at a time — the driver stops any prior backup before starting a new one.

## Restore to a copy

A **to-copy restore** replays the backup into a **different**, freshly provisioned application of the same Kind. Use this for disaster-recovery drills, side-by-side validation, branch databases, or migrating to a new version of the upstream operator.

First, provision an empty target application with the same Kind. For example, an empty `Postgres`:

```yaml
apiVersion: apps.cozystack.io/v1alpha1
kind: Postgres
metadata:
  name: my-postgres-restored
  namespace: tenant-user
spec:
  # ...same shape as the source, no bootstrap data required...
```
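
You can wait on the target's HelmRelease the same way as for the Bucket earlier — a sketch assuming the release follows the same `<kind>-<name>` naming convention:

```bash
kubectl -n tenant-user wait hr/postgres-my-postgres-restored \
  --for=condition=ready --timeout=300s
```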

Wait for the target to become Ready, then submit a `RestoreJob` that points at it:

```yaml
apiVersion: backups.cozystack.io/v1alpha1
kind: RestoreJob
metadata:
  name: my-postgres-restore-to-copy
  namespace: tenant-user
spec:
  backupRef:
    name: my-postgres-adhoc
  targetApplicationRef:
    apiGroup: apps.cozystack.io
    kind: Postgres
    name: my-postgres-restored
```

The source application stays untouched. Cross-namespace restores are **not** supported — `targetApplicationRef` is a local reference; the target must live in the same namespace as the `RestoreJob`.

## Limitations and lifecycle

- **Data-only scope.** Application CRs, HelmReleases, chart values, and operator-managed Secrets (e.g. `cnpg.io` superuser secret, `clickhouse-installation` users) are not captured. Pre-provision the target application before a to-copy restore.
- **Archive retention is driver-owned.** Deleting a Cozystack `Backup` CR removes the artefact reference but leaves the actual S3 object intact. Each driver enforces its own retention:
  - CNPG: `retentionPolicy` on the strategy (`30d` default in the admin example).
  - MariaDB: configure `cleanupStrategy` on the operator-side `Backup` CR or rotate at the bucket level.
  - ClickHouse: governed by the in-pod sidecar's retention configuration. Tenants who need to purge an archive call `DELETE /backup/<name>/remote` on the sidecar (see the sketch after this list).
  - FoundationDB: each `BackupJob` owns a discrete blob-store directory; clean up at the bucket level.
- **One running backup per FoundationDB cluster.** The driver enforces this by stopping any prior `FoundationDBBackup` on the same cluster before starting a new one.
- **ClickHouse depends on the in-chart sidecar.** The Altinity strategy is a thin HTTP client; the backup itself runs inside each `chi-*` Pod via `clickhouse-backup`. Disabling `backup.enabled` on the application also disables the BackupClass flow.
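
For the ClickHouse archive purge mentioned in the retention list, a hypothetical sketch — the Pod name and the sidecar's listen port (7171, the `clickhouse-backup` default) are assumptions; the API path is the one quoted above:

```bash
# Forward the sidecar API from one chi-* Pod, then delete the remote archive.
kubectl -n tenant-user port-forward pod/chi-my-clickhouse-0-0-0 7171:7171 &
curl -X DELETE "http://localhost:7171/backup/my-clickhouse-adhoc/remote"
```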

## Troubleshooting

If a `BackupJob` or `RestoreJob` ends in `phase: Failed`, check the `.status.message` field:

```bash
kubectl -n tenant-user get backupjob my-postgres-adhoc -o jsonpath='{.status.message}'
kubectl -n tenant-user get restorejob my-postgres-restore-inplace -o jsonpath='{.status.message}'
```

Then look at the operator-native CR the driver created:

```bash
# Postgres
kubectl -n tenant-user get backups.cnpg.io
# MariaDB
kubectl -n tenant-user get backups.k8s.mariadb.com,restores.k8s.mariadb.com
# ClickHouse
kubectl -n tenant-user logs -l backups.cozystack.io/owned-by.BackupJobName=my-clickhouse-adhoc
# FoundationDB
kubectl -n tenant-user get foundationdbbackups.apps.foundationdb.org,foundationdbrestores.apps.foundationdb.org \
  -l backups.cozystack.io/owned-by.BackupJobName=my-fdb-adhoc
```

## See also

- [Managed Application Backup Configuration]({{% ref "/docs/next/operations/services/managed-app-backup-configuration" %}}) — how administrators define strategies and `BackupClass` resources.
- [Backup and Recovery (VMs)]({{% ref "/docs/next/virtualization/backup-and-recovery" %}}) — the parallel guide for VMInstance / VMDisk backups (HelmRelease + CRs + PVC snapshots).
- [Velero Backup Configuration]({{% ref "/docs/next/operations/services/velero-backup-configuration" %}}) — administrator setup for the Velero-driven VM backups.
4 changes: 4 additions & 0 deletions content/en/docs/next/applications/clickhouse.md
@@ -20,6 +20,10 @@ It is used for online analytical processing (OLAP).

### How to restore backup from S3

{{% alert color="warning" %}}
**Backups: prefer the `BackupClass` flow.** `backup.enabled` and the S3 fields (`s3Region`, `s3Bucket`, `endpoint`, `s3PathOverride`, `s3AccessKey`/`s3SecretKey` or `s3CredentialsSecret`) are still required — they materialise the in-pod `clickhouse-backup` sidecar that the Altinity backup strategy talks to. However, `backup.schedule`, `backup.cleanupStrategy`, and `backup.resticPassword` (which drive the legacy chart-managed CronJob doing dump + restic, and the matching restic restore flow documented below) are **superseded** by the Cozystack backups framework: define a `BackupClass` + `Altinity` strategy once, then drive scheduled backups via `Plan` and restores via `RestoreJob`. See [Application Backup and Recovery]({{% ref "/docs/next/applications/backup-and-recovery" %}}) (tenant guide) and [Managed Application Backup Configuration]({{% ref "/docs/next/operations/services/managed-app-backup-configuration" %}}) (admin setup).
{{% /alert %}}

1. Find the snapshot:

```bash
# …
```
5 changes: 5 additions & 0 deletions content/en/docs/next/applications/foundationdb.md
@@ -66,6 +66,11 @@ resources:

### Backup (Optional)

{{% alert color="warning" %}}
**The chart-level `backup.*` values documented below are deprecated.** The in-chart `FoundationDBBackup` wiring is superseded by the Cozystack backups framework: define a `BackupClass` + `FoundationDB` strategy once, then drive backups via `BackupJob` / `Plan` and restores via `RestoreJob`. Existing tenants with `backup.enabled=true` continue to render the legacy `FoundationDBBackup` CR unchanged. See [Application Backup and Recovery]({{% ref "/docs/next/applications/backup-and-recovery" %}}) (tenant guide) and [Managed Application Backup Configuration]({{% ref "/docs/next/operations/services/managed-app-backup-configuration" %}}) (admin setup).
{{% /alert %}}


```yaml
backup:
  enabled: true
  # …
```
3 changes: 3 additions & 0 deletions content/en/docs/next/applications/mariadb.md
@@ -110,6 +110,9 @@ more details:


### Backup parameters

{{% alert color="warning" %}}
**The chart-level `backup.*` values documented below are deprecated.** The legacy `mariadb-dump` + `restic` flow is superseded by the Cozystack backups framework: define a `BackupClass` + `MariaDB` strategy once, then drive backups via `BackupJob` / `Plan` and restores via `RestoreJob`. Existing tenants with `backup.enabled=true` continue to render the legacy resources unchanged. See [Application Backup and Recovery]({{% ref "/docs/next/applications/backup-and-recovery" %}}) (tenant guide) and [Managed Application Backup Configuration]({{% ref "/docs/next/operations/services/managed-app-backup-configuration" %}}) (admin setup).
{{% /alert %}}

| Name | Description | Type | Value |
| ------------------------ | ----------------------------------------------- | -------- | ------------------------------------------------------ |
| … | … | … | … |
4 changes: 4 additions & 0 deletions content/en/docs/next/applications/postgres.md
@@ -27,6 +27,10 @@ This managed service is controlled by the CloudNativePG operator, ensuring effic…

## Operations

{{% alert color="warning" %}}
**Backups: prefer the `BackupClass` flow.** The chart-level `backup.*` values documented below still configure the Barman object store and S3 credentials that backups read from, but the chart-emitted `ScheduledBackup` and the `bootstrap`-based recovery flow have been **superseded** by the Cozystack backups framework: define a `BackupClass` + `CNPG` strategy once, then drive scheduled backups via `Plan` and restores via `RestoreJob`. See [Application Backup and Recovery]({{% ref "/docs/next/applications/backup-and-recovery" %}}) (tenant guide) and [Managed Application Backup Configuration]({{% ref "/docs/next/operations/services/managed-app-backup-configuration" %}}) (admin setup).
{{% /alert %}}

### How to enable backups

To back up a PostgreSQL application, an external S3-compatible storage is required.