docs(platform): add guides related managed apps backups #536

Merged: androndo merged 1 commit into main from feat/apps-backups-guides on May 19, 2026

androndo (Contributor) commented on May 12, 2026

Summary by CodeRabbit

  • Documentation
    • Added tenant guide for application backup and recovery for managed Postgres, MariaDB, ClickHouse, and FoundationDB (one-off & scheduled backups, status, in-place or copy restores).
    • Added administrator guide for configuring the backup framework via cluster BackupClass and driver strategies.
    • Marked legacy chart-level backup values deprecated and added migration guidance and per-driver caveats (e.g., ClickHouse sidecar requirements).
    • Clarified VM vs. application backup scopes.

androndo requested review from kvaps and lllamnyp as code owners on May 12, 2026 13:02

netlify Bot commented on May 12, 2026

Deploy Preview for cozystack ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 90c390a |
| 🔍 Latest deploy log | https://app.netlify.com/projects/cozystack/deploys/6a0c2a435f938800084668bb |
| 😎 Deploy Preview | https://deploy-preview-536--cozystack.netlify.app |

coderabbitai Bot (Contributor) commented on May 12, 2026

Warning

Rate limit exceeded

@androndo has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 19 minutes and 13 seconds before requesting another review.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2f6b9b87-4bd0-4a79-8735-68b1dbe19a18

📥 Commits

Reviewing files that changed from the base of the PR and between a3e538b and 90c390a.

📒 Files selected for processing (7)
  • content/en/docs/next/applications/backup-and-recovery.md
  • content/en/docs/next/applications/clickhouse.md
  • content/en/docs/next/applications/mariadb.md
  • content/en/docs/next/applications/postgres.md
  • content/en/docs/next/operations/services/managed-app-backup-configuration.md
  • content/en/docs/next/operations/services/velero-backup-configuration.md
  • content/en/docs/next/virtualization/backup-and-recovery.md
📝 Walkthrough

Adds tenant and admin documentation for BackupClass-driven, data-only backups and restores for managed Postgres, MariaDB, ClickHouse, and FoundationDB, plus deprecation callouts and scope clarifications in related application and VM backup docs.

Changes

Backup and Recovery Framework Documentation

| Layer / File(s) | Summary |
|---|---|
| Tenant-facing backup and recovery guide (`content/en/docs/next/applications/backup-and-recovery.md`) | New tenant guide documents prerequisites, running one-off and scheduled backups (BackupJob, Plan), inspecting Backup/BackupJob status, in-place and to-copy restores (RestoreJob) with per-driver caveats, lifecycle/retention notes, and tenant escalation diagnostics. |
| Admin backup configuration guide (`content/en/docs/next/operations/services/managed-app-backup-configuration.md`) | New admin guide defines cluster-scoped BackupClass setup, per-driver strategy examples (Postgres CNPG/barman, MariaDB dumps, ClickHouse Altinity sidecar, FoundationDB backup_agent), required per-application Secrets, apply/verify commands, tenant onboarding, and driver-side diagnostics. |
| Deprecation notices and scope clarification (`content/en/docs/next/applications/postgres.md`, `content/en/docs/next/applications/mariadb.md`, `content/en/docs/next/applications/foundationdb.md`, `content/en/docs/next/applications/clickhouse.md`, `content/en/docs/next/operations/services/velero-backup-configuration.md`, `content/en/docs/next/virtualization/backup-and-recovery.md`) | Adds warning callouts in application docs deprecating chart-level `backup.*` fields or legacy restore flows in favor of the BackupClass framework, and clarifies Velero/VM backup scope vs. operator-native data-only backups. |

Sequence Diagram(s)

sequenceDiagram
  participant Admin
  participant BackupController
  participant Tenant
  participant DriverOperator
  participant S3
  Admin->>BackupController: create BackupClass (strategy + parameters)
  Tenant->>BackupController: submit BackupJob / Plan referencing BackupClass
  BackupController->>DriverOperator: render and apply driver strategy (CR or sidecar call)
  DriverOperator->>S3: write backup artefacts to bucket
  DriverOperator->>BackupController: create Backup CR / report status
  Tenant->>BackupController: submit RestoreJob (optional targetApplicationRef)
  BackupController->>DriverOperator: trigger restore using Backup artefact

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I hop through docs with a whiskered cheer,
BackupClass and Plans now draw near,
From Postgres dumps to ClickHouse sidecars bright,
Tenants and admins sleep snug through the night.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The title 'docs(platform): add guides related managed apps backups' is somewhat vague and lacks specificity: it uses the generic phrase 'guides related' rather than precisely identifying what was added (two new backup guides and cross-link updates across existing docs). | Consider a more specific title such as 'docs(platform): add application backup and recovery guides' or 'docs(platform): add managed app backup guides and cross-links' to better convey the scope of changes. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request introduces a new data-only backup and recovery framework for managed databases (Postgres, MariaDB, ClickHouse, and FoundationDB) in Cozystack, transitioning from legacy chart-level configurations to a centralized system using BackupClass and strategy-based resources. It provides detailed documentation for both tenants and administrators. Reviewers suggested several improvements, including standardizing FoundationDB account names for better multi-tenancy, enhancing security by using non-root users in backup pods, and clarifying S3 endpoint terminology to prevent configuration errors.

Comment thread content/en/docs/next/applications/backup-and-recovery.md Outdated
compression: gzip
```

The `endpoint` is **path-style without scheme** (e.g. `seaweedfs-s3.<seaweedfs-namespace>.svc:8333` for the default in-cluster SeaweedFS — substitute the namespace where SeaweedFS is deployed in your environment). Drop the `tls` block entirely when the endpoint serves a publicly-trusted certificate.

medium

The term "path-style" in the context of S3 usually refers to the addressing mode (e.g., s3.amazonaws.com/bucket vs bucket.s3.amazonaws.com). Here, it seems to be used to describe the endpoint format (host and port without scheme). To avoid confusion, it might be clearer to explicitly state that the endpoint should contain only the host and port.

Suggested change
The `endpoint` is **path-style without scheme** (e.g. `seaweedfs-s3.<seaweedfs-namespace>.svc:8333` for the default in-cluster SeaweedFS — substitute the namespace where SeaweedFS is deployed in your environment). Drop the `tls` block entirely when the endpoint serves a publicly-trusted certificate.
The `endpoint` should be provided as **host:port without scheme** (e.g. `seaweedfs-s3.<seaweedfs-namespace>.svc:8333` for the default in-cluster SeaweedFS — substitute the namespace where SeaweedFS is deployed in your environment). Drop the `tls` block entirely when the endpoint serves a publicly-trusted certificate.
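
The host:port form this comment asks for can be derived mechanically from a full URL. A minimal sketch, assuming a POSIX shell; `normalize_endpoint` is a hypothetical helper name and `example-namespace` is a placeholder, neither of which comes from the guide:

```shell
# Hypothetical helper: reduce an S3 endpoint URL to bare host:port.
normalize_endpoint() {
  _ep="$1"
  _ep="${_ep#*://}"   # drop a leading scheme such as "https://", if any
  _ep="${_ep%%/*}"    # drop everything from the first "/" (path, trailing slash)
  printf '%s\n' "$_ep"
}

# Placeholder namespace; substitute the one where SeaweedFS runs.
normalize_endpoint "https://seaweedfs-s3.example-namespace.svc:8333/"
```

An input that is already in host:port form passes through unchanged, so the helper is safe to apply to either style.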

Comment thread content/en/docs/next/operations/services/managed-app-backup-configuration.md Outdated
Comment thread content/en/docs/next/operations/services/managed-app-backup-configuration.md Outdated
coderabbitai Bot (Contributor) left a comment

🧹 Nitpick comments (1)
content/en/docs/next/applications/backup-and-recovery.md (1)

36-43: ⚡ Quick win

Consider adding a language identifier to the fenced code block.

The output example is missing a language identifier. Adding text or console would satisfy linters and improve rendering:

-```
+```text
 NAME                      AGE
 postgres-data-backup      14m
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@content/en/docs/next/applications/backup-and-recovery.md` around lines 36 -
43, The fenced code block in the backup-and-recovery.md example lacks a language
identifier; update the triple-backtick fence around the output listing (the
block showing NAME / AGE entries like "postgres-data-backup" and "velero") to
include a language such as "text" or "console" (e.g., change ``` to ```text) so
linters/renderers recognize it as plain output.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f871a89b-ab72-46ba-b44a-b1b590bb3620

📥 Commits

Reviewing files that changed from the base of the PR and between f466d34 and 4efa0e7.

📒 Files selected for processing (8)
  • content/en/docs/next/applications/backup-and-recovery.md
  • content/en/docs/next/applications/clickhouse.md
  • content/en/docs/next/applications/foundationdb.md
  • content/en/docs/next/applications/mariadb.md
  • content/en/docs/next/applications/postgres.md
  • content/en/docs/next/operations/services/managed-app-backup-configuration.md
  • content/en/docs/next/operations/services/velero-backup-configuration.md
  • content/en/docs/next/virtualization/backup-and-recovery.md

androndo force-pushed the feat/apps-backups-guides branch 2 times, most recently from 0b9ead0 to 237d627, on May 13, 2026 18:46

lllamnyp (Member) commented:

NOT LGTM

Reviewing against main. Single commit 237d627 adds two new guides plus cross-link callouts in four existing application docs.

Primary blocker — tenant guide is unrunnable as a tenant

content/en/docs/next/applications/backup-and-recovery.md instructs tenant users to do two things their roles do not permit:

  1. Read the BucketInfo Secret (lines 78–83):
    kubectl -n tenant-user get secret bucket-db-backups-backup -o jsonpath='{.data.BucketInfo}'
  2. Create per-driver credential Secrets (lines 92–95, 100–102, 107–110, 135–137):
    kubectl -n tenant-user create secret generic my-postgres-cnpg-backup-creds …
    kubectl -n tenant-user create secret generic my-mariadb-mariadb-backup-creds …
    kubectl -n tenant-user create secret generic my-mariadb-mariadb-backup-ca …
    kubectl -n tenant-user create secret generic my-fdb-fdb-backup-creds …

Neither is permitted by the aggregated tenant roles. Verified against packages/system/cozystack-basics/templates/clusterroles.yaml in cozystack/cozystack:

  • cozy:tenant:base and cozy:tenant:view:base grant nothing on the core API group's secrets resource.
  • cozy:tenant:use:base adds core.cozystack.io/tenantsecrets: get/list/watch — that is TenantSecret, not bare Secret.
  • cozy:tenant:admin:base escalates write verbs only on apps.cozystack.io/* resources; still nothing on secrets.
  • cozy:tenant:super-admin:base is apps.cozystack.io/*: * and kubevirt.io/virtualmachines: * — again, no secrets.

Meanwhile the COSI flow in packages/apps/bucket/templates/bucketclaim.yaml materialises <release>-<user> as a plain v1.Secret via BucketAccess.credentialsSecretName. That Secret carries BucketInfo — and tenants have no read verb on it.

So the tenant guide is broken at step one for every driver except ClickHouse (which routes through chart values, not a hand-rolled Secret). A user following this guide on a real cluster sees 403 on the very first kubectl get secret and never reaches the BackupJob.

This is not a wording fix. The doc as written exposes a missing piece in the implementation: there is no tenant-visible path from a Bucket application to per-driver backup credentials. The Bucket app is itself a tenant-managed resource, so the most natural shape is one where a tenant who has provisioned a Bucket can point a strategy / BackupClass at it (or at a TenantSecret-shaped projection of its credentials) and never has to materialise or even read a bare Secret themselves. The exact mechanism — strategy/controller bridging BucketInfo to the operator-expected key shape, a tenant-visible projection from the Bucket app, or something else entirely — is your call. Whatever shape you land on, the corresponding text in this PR (the `Read the bucket credentials` block and every `kubectl create secret` snippet) needs to be replaced to match.

Verification

End-to-end test should run the tenant guide as a tenant ServiceAccount (one of `cozy:tenant:admin` / `cozy:tenant:use`), impersonated via `--as`, against a real cluster, and complete a Postgres / MariaDB / FoundationDB backup and restore without ever using a cluster-admin kubeconfig. A reproducible CI variant lives in `cozystack/cozystack/examples/backups//run-all.sh`; those scripts currently assume cluster-admin and would need a parallel `run-all-as-tenant.sh` variant under the chosen ServiceAccount.
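
Before running the full flow, the permission gap itself can be confirmed with `kubectl auth can-i` under impersonation. A rough sketch, not the project's test script: `check_tenant_secret_access` is a hypothetical helper, the identity passed to it is whatever tenant ServiceAccount or user is chosen for the test, and the caller needs impersonation rights on the cluster.

```shell
# Hypothetical helper: report which Secret verbs an impersonated identity holds.
#   $1 = namespace (for example tenant-user)
#   $2 = identity to impersonate (placeholder; not taken from the PR)
check_tenant_secret_access() {
  _ns="$1"
  _identity="$2"
  for _verb in get list create; do
    # `kubectl auth can-i` prints yes/no and exits non-zero when the answer is no.
    if kubectl auth can-i "$_verb" secrets -n "$_ns" --as="$_identity" >/dev/null 2>&1; then
      echo "$_verb secrets: allowed"
    else
      echo "$_verb secrets: denied"
    fi
  done
}
```

If the role analysis above holds, calling the helper with a tenant identity should report `denied` for all three verbs, which is the 403 the guide's first step runs into.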

Secondary issues (correctness)

These are real, but each one sits inside text that will move or vanish when the primary blocker is addressed. Worth folding into the rewrite rather than spot-patching now.

  1. Alert callouts in four autogenerated files will be wiped on next sync. `content/en/docs/next/applications/{postgres,clickhouse,mariadb,foundationdb}.md` each carry the `Autogenerated content. Don't edit this file directly` footer and a matching `_include/.md` stub; bodies are rebuilt from upstream READMEs by `hack/update_apps.sh`. The first `make update-apps` invoked by the upstream `cozystack/cozystack` `tags.yaml` workflow will delete the four warning blocks. The recommended-flow text the callouts add is already present in the upstream READMEs (verified for all four), so the safest move is to drop the in-place edits and rely on the link from the new tenant guide.
  2. FoundationDB strategy template tells admins to run `backup_agent` as root. `managed-app-backup-configuration.md:264` sets `runAsUser: 0`. The canonical example (`cozystack/cozystack/examples/backups/foundationdb/01-create-strategy.sh`) uses `runAsUser: 4059` to match the FoundationDB process UID; `backup_agent` doesn't need root. Change to `4059` (and add `runAsGroup: 4059`).
  3. Bucket-readiness wait is too narrow. `backup-and-recovery.md:65` waits only on the `HelmRelease`; the upstream example (`examples/backups/clickhouse/03-create-bucket.sh`) additionally waits for the `bucketclaim` (`.status.bucketReady=true`) and the `bucketaccess` (`.status.accessGranted=true`). The third is the one that guarantees credentials are populated. Likely moot in any shape where the tenant doesn't hand-read the Secret, but flagging in case the rewrite still surfaces a wait step.
  4. Fabricated validation error string. `backup-and-recovery.md:139` quotes `accountName is required` as the controller's error text; that exact string does not exist anywhere under `internal/backupcontroller/`. Either quote the real message produced by the strategy controller when `parameters.accountName` is empty, or drop the parenthetical.
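
The absence check in item 4 amounts to a fixed-string search over the controller sources. A small sketch, assuming it is run from a checkout of `cozystack/cozystack`; `message_exists` is a hypothetical helper name:

```shell
# Hypothetical helper: succeed only if the exact string occurs under a directory.
message_exists() {
  # $1 = directory to search recursively, $2 = literal string (no regex)
  grep -rqF -- "$2" "$1"
}

# Example call, meant to be run from the repository root:
#   message_exists internal/backupcontroller "accountName is required" \
#     || echo "string not found; do not quote it in the docs"
```

Using `-F` keeps the search literal, so punctuation in an error message is not interpreted as a regular expression.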

Out of scope / fine as-is

  • `velero-backup-configuration.md` and `virtualization/backup-and-recovery.md` cross-link callouts — not autogenerated (no `_include` stub, no autogen footer), safe to edit in place. Link targets resolve.
  • Menu weights (4 in `applications/`, 31 in `operations/services/`) slot in cleanly next to the existing entries.
  • CRD kinds (`CNPG`, `MariaDB`, `Altinity`, `FoundationDB`) and plurals (`cnpgs`, `mariadbs`, `altinities`, `foundationdbs`) match the definitions in `packages/system/backupstrategy-controller/definitions/`.
  • The `backups.cozystack.io/owned-by.BackupJobName` label key in the troubleshooting section matches `OwningJobNameLabel` in `api/backups/v1alpha1/backupjob_types.go` and is applied by every strategy controller.
  • `BackupClass` cluster scope is correct per the CRD.
  • The GitHub link to `examples/backups/clickhouse/01-create-strategy.sh` resolves.

androndo force-pushed the feat/apps-backups-guides branch from 237d627 to a3e538b on May 19, 2026 08:34

coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@content/en/docs/next/operations/services/managed-app-backup-configuration.md`:
- Around line 271-272: The pod spec currently sets securityContext.runAsUser: 0
which runs the backup_agent as root; change the container/pod security context
to run as a non-root UID (use the FoundationDB canonical UID 4059) by replacing
runAsUser: 0 with runAsUser: 4059 (or another non-root UID) and ensure any
filesystem permissions and capabilities for the backup_agent are adjusted
accordingly; update the securityContext section for the backup_agent
container/pod to reflect the non-root runAsUser and keep other securityContext
settings intact.
- Around line 369-372: Update the readiness checks after applying bucket.yaml to
wait not only for the HelmRelease (hr/bucket-db-backups) ready condition but
also for the BucketClaim and BucketAccess conditions that ensure credentials are
populated; add kubectl -n tenant-user wait commands targeting the BucketClaim
resource (check .status.bucketReady=true) and the BucketAccess resource (check
.status.accessGranted=true) using the same logical names as the HelmRelease so
the Secret is guaranteed to be populated before proceeding.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0d813be1-12ca-41ac-a247-4943f81e047c

📥 Commits

Reviewing files that changed from the base of the PR and between 4efa0e7 and a3e538b.

📒 Files selected for processing (8)
  • content/en/docs/next/applications/backup-and-recovery.md
  • content/en/docs/next/applications/clickhouse.md
  • content/en/docs/next/applications/foundationdb.md
  • content/en/docs/next/applications/mariadb.md
  • content/en/docs/next/applications/postgres.md
  • content/en/docs/next/operations/services/managed-app-backup-configuration.md
  • content/en/docs/next/operations/services/velero-backup-configuration.md
  • content/en/docs/next/virtualization/backup-and-recovery.md
✅ Files skipped from review due to trivial changes (7)
  • content/en/docs/next/virtualization/backup-and-recovery.md
  • content/en/docs/next/applications/postgres.md
  • content/en/docs/next/applications/mariadb.md
  • content/en/docs/next/applications/foundationdb.md
  • content/en/docs/next/applications/clickhouse.md
  • content/en/docs/next/operations/services/velero-backup-configuration.md
  • content/en/docs/next/applications/backup-and-recovery.md

Comment thread content/en/docs/next/operations/services/managed-app-backup-configuration.md Outdated
Comment on lines +369 to +372
```bash
kubectl apply -f bucket.yaml
kubectl -n tenant-user wait hr/bucket-db-backups --for=condition=ready --timeout=300s
```

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add bucketclaim and bucketaccess readiness checks.

The current wait only checks HelmRelease ready status. Upstream examples also wait for the bucketclaim .status.bucketReady=true and bucketaccess .status.accessGranted=true conditions to guarantee credential population before proceeding to read the Secret.

🔧 Proposed fix to add complete readiness checks
 kubectl apply -f bucket.yaml
 kubectl -n tenant-user wait hr/bucket-db-backups --for=condition=ready --timeout=300s
+kubectl -n tenant-user wait bucketclaim/bucket-db-backups --for=jsonpath='{.status.bucketReady}'=true --timeout=60s
+kubectl -n tenant-user wait bucketaccess/bucket-db-backups-backup --for=jsonpath='{.status.accessGranted}'=true --timeout=60s
Suggested change
```bash
kubectl apply -f bucket.yaml
kubectl -n tenant-user wait hr/bucket-db-backups --for=condition=ready --timeout=300s
kubectl -n tenant-user wait bucketclaim/bucket-db-backups --for=jsonpath='{.status.bucketReady}'=true --timeout=60s
kubectl -n tenant-user wait bucketaccess/bucket-db-backups-backup --for=jsonpath='{.status.accessGranted}'=true --timeout=60s
```

androndo force-pushed the feat/apps-backups-guides branch from a3e538b to 5eef4d3 on May 19, 2026 09:13
Signed-off-by: Andrey Kolkov <androndo@gmail.com>
androndo force-pushed the feat/apps-backups-guides branch from 5eef4d3 to 90c390a on May 19, 2026 09:15
androndo enabled auto-merge on May 19, 2026 09:15
androndo merged commit ec7b45a into main on May 19, 2026
5 of 6 checks passed
androndo deleted the feat/apps-backups-guides branch on May 19, 2026 09:16
myasnikovdaniil added a commit to cozystack/cozystack that referenced this pull request on May 19, 2026:
Adds the missing entry for #2649 (OpenSearch wired into PaaS bundle, plus
StorageClass dropdown override for the legacy 1.3 openapi-ui dashboard),
corrects six website-PR author attributions against gh pr view, picks up
two website docs PRs merged just before the v1.3.4 cut (cozystack/website#536
and #538), updates the contributors list, and refreshes the date to today.

Signed-off-by: Myasnikov Daniil <myasnikovdaniil2001@gmail.com>