[docs] Add GitOps configuration surfaces design proposal#16
[docs] Add GitOps configuration surfaces design proposal#16myasnikovdaniil wants to merge 2 commits into
Conversation
Proposes a single GitOps primitive (GitRepository + Kustomization + a scoped ServiceAccount) instantiated at two trust tiers: platform GitOps for admins (cluster scope) and tenant GitOps for tenants (namespace scope), plus a packageOverrides admin override layer. Generalizes cozystack/cozystack#2731 as the first increment. Signed-off-by: Myasnikov Daniil <myasnikovdaniil2001@gmail.com>
There was a problem hiding this comment.
Code Review
This pull request introduces a design proposal for GitOps Configuration Surfaces in Cozystack, defining a unified primitive to enable platform GitOps for admins and tenant GitOps for users. The review feedback highlights several critical areas for improvement: first, Helm's default list-merging behavior could inadvertently overwrite default lists (like VM images) when applying packageOverrides; second, a potential security bypass exists where tenants could omit or modify the sharding label on Kustomization resources to run them on the unrestricted global controller; and third, restricting the customizer's write access to cozy-system to protect the supply chain would conflict with the need for admins to manage system-level configurations (like the cozystack-values Secret) in that namespace.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| packageOverrides: | ||
| linstor: | ||
| autoDiskful: | ||
| minutes: 10 | ||
| vm-default-images: | ||
| images: | ||
| - name: ubuntu-24.04 | ||
| url: https://example/custom.img | ||
| ``` |
There was a problem hiding this comment.
When implementing the packageOverrides merge logic in the platform chart, keep in mind that standard Helm merge functions (like merge or mustMerge) overwrite entire lists/arrays (such as the images list under vm-default-images) rather than merging them by key or appending to them. If an admin only wants to override or add a single image, they might inadvertently wipe out all other default images. It would be beneficial to specify in the design how list-type overrides are handled or if custom template merging logic will be used.
|
|
||
| A tenant must not be able to (a) reconcile into another namespace, or (b) borrow a more privileged SA. Two options, recommendation first: | ||
|
|
||
| 1. **Dedicated tenant kustomize-controller shard** (recommended) — mirror the existing `flux-tenants` helm-controller shard. Run a kustomize-controller selecting `sharding.fluxcd.io/key: tenants`, started with `--no-cross-namespace-refs` and a default-service-account lockdown, so the restrictions apply *only* to tenant reconciliation and platform Kustomizations are unaffected. This matches a pattern Cozystack already uses. |
There was a problem hiding this comment.
Using a dedicated tenant kustomize-controller shard that filters by a sharding label (e.g., sharding.fluxcd.io/key: tenants) introduces a potential security bypass. If a tenant can create or modify Kustomization resources directly, they could omit or change this label to have their Kustomization reconciled by the default, unrestricted global kustomize-controller instead. To make this isolation robust, the design should include an admission policy (such as a ValidatingAdmissionPolicy or webhook) that enforces the sharding label on all tenant-created Kustomization resources, or configure the global controller to explicitly ignore resources in tenant namespaces.
| - **Tool 1 (admin):** repo write == cluster-admin, by design and by documentation. The curated ClusterRole in the prototype (patch-not-delete on Packages, no CRDs, etc.) is *defense against accident, not against intent* — anyone with repo write can escalate, so the docs must not over-claim a boundary. New trust surface introduced: an admin git repo + its credentials (protect like a kubeconfig), and a standing, self-healing reconciling credential whose blast radius equals an admin's. | ||
| - **Tool 1 footguns to narrow** (they cost nothing functionally): source access should not let the customizer rewrite the platform's *own* supply chain (the `cozystack-packages` OCIRepository in `cozy-system`), so scope write away from `cozy-system`; and field ownership on downstream Packages should eventually be enforced by a `ValidatingAdmissionPolicy` scoped to the customizer SA that rejects changes to fields currently managed by helm-controller (detected via `managedFields`), leaving the hand-owned bootstrap Package free. Until then, the `packageOverrides` layer keeps admins off chart-owned objects entirely, so the contract is mostly avoided rather than merely documented. | ||
| - **Tool 2 (tenant):** the surface runs as the tenant's own SA, so it can do nothing the tenant cannot already do via the API. The new risk is **cross-namespace / SA-borrowing**, closed by the tenant shard lockdown (`--no-cross-namespace-refs`, forced `serviceAccountName`). Tenant-supplied input = the contents of a tenant-owned repo, bounded by `cozy:tenant:*`. |
There was a problem hiding this comment.
There is a potential conflict between scoping the customizer's write access away from cozy-system (to protect the platform's supply chain like cozystack-packages) and the need for admins to manage system-level configurations. As noted in the Context section, critical admin configurations (such as cert-manager issuers or other platform values) are driven by resources residing in cozy-system (e.g., the cozystack-values Secret). If write access to cozy-system is completely blocked, admins won't be able to manage these via GitOps. Consider refining this to use fine-grained RBAC or a ValidatingAdmissionPolicy that specifically protects the supply chain resources (like OCIRepository or specific HelmReleases) while still allowing the customizer to manage other configurations in cozy-system.
Revising Tool 1: override layer as a
|
PackageValues (proposed) |
packageOverrides (alternative) |
|
|---|---|---|
| API change | new optional CRD + reconciler merge | none (chart-only) |
| Field-ownership race | eliminated (separate object) | sidestepped (funnel through chart) |
| RBAC delegation | per-package (write PackageValues, read Package) |
all-or-nothing (sub-key of one object) |
List/map merge (e.g. vm-default-images.images is a 16-entry list) |
one tested merge in the operator | per-package Go-template logic, untested |
| Override blob | one small reviewable resource per package | one god-object for all packages |
| Schema validation | validatable per package later | free-form JSON |
packageOverrides is a reasonable lighter-weight alternative — no Go, ships purely as chart templating — and is essentially PackageValues crammed into one chart-rendered blob. The tradeoff is that it sidesteps the field-ownership race instead of removing it, can't do per-package RBAC delegation, and pushes list-merge correctness into per-package templates. Since PackageValues is additive, shipping packageOverrides first wouldn't block it — but it would make the packageOverrides key throwaway.
Feedback welcome on the precedence chain (chart defaults < cozystack-values < Package < PackageValues) and the RBAC split before I flesh out the CRD.
…layer Recast Tool 1's override layer from a `packageOverrides` map on the bootstrap Package (merged into downstream Packages by the platform chart) to a dedicated `PackageValues` API kind that the operator merges into each HelmRelease. `packageOverrides` is kept as a lighter-weight chart-only alternative/interim. PackageValues removes the helm-controller force-ownership race structurally (the chart never writes PackageValues; the admin never writes the chart-owned Package), makes ownership enforceable by object-level RBAC, and is fully additive — absent a PackageValues, the operator builds HelmReleases as today. Updates the admin override layer, user-facing / upgrade / security / failure / testing / rollout sections, open question 3 (now PackageValues precedence and binding), and alternatives considered. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Myasnikov Daniil <myasnikovdaniil2001@gmail.com>
📝 WalkthroughWalkthroughAdds ChangesGitOps Configuration Surfaces Design Proposal
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related issues
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@design-proposals/gitops-surfaces/README.md`:
- Around line 130-131: The proposal currently presents RBAC enforcement as a
mechanism that prevents field-level edits, but this is actually a
human-discipline contract with no API-level enforcement. Revise lines 130-131 to
clarify that the split design structurally avoids the contract violation (by
separating concerns into PackageValues and Package) rather than enforcing it via
RBAC alone. Additionally, update the text to explicitly state that the admission
policy is a prerequisite for production use, not an optional enhancement. Make
clear that without field-ownership enforcement at the API level, admins must
manually follow the documented protocol, and any deviation recreates the
helm-controller race conditions the PackageValues design was meant to prevent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d7a0a413-b6cb-49ef-ba12-c1bd9336da65
📒 Files selected for processing (1)
design-proposals/gitops-surfaces/README.md
| The split is **additive**: with no `PackageValues` present, the operator path is byte-identical to today, so existing clusters are unaffected (see Upgrade compatibility). It also makes ownership enforceable by ordinary object-level RBAC — the admin GitOps SA gets write on `PackageValues` and read-only on `Package` — which is the discipline that keeps admins off chart-owned objects entirely, the foundation for whole-cluster reproducibility (UC5). | ||
|
|
There was a problem hiding this comment.
RBAC enforcement strategy relies on discipline until admission policy lands.
Lines 130–131 assert that object-level RBAC (write PackageValues, read-only Package) can enforce the field-level contract and prevent admins from patching chart-owned Package fields. However, this is a human-discipline contract — nothing in the CR syntax or API prevents an admin from editing a chart-owned Package directly.
The security section (line 171) correctly acknowledges this as a footgun: "until the admission policy lands, the PackageValues resource keeps admins off chart-owned objects entirely" — they write PackageValues, never the chart-owned Package — so the contract is structurally avoided rather than merely documented.
Recommendation: Ensure the admission policy (or equivalent validation) is treated as a prerequisite for production use of this surface, not a "nice-to-have." The proposal should explicitly state that without field-ownership enforcement, admins must follow the documented protocol manually, and any deviation creates the same helm-controller race the PackageValues design was meant to avoid.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@design-proposals/gitops-surfaces/README.md` around lines 130 - 131, The
proposal currently presents RBAC enforcement as a mechanism that prevents
field-level edits, but this is actually a human-discipline contract with no
API-level enforcement. Revise lines 130-131 to clarify that the split design
structurally avoids the contract violation (by separating concerns into
PackageValues and Package) rather than enforcing it via RBAC alone.
Additionally, update the text to explicitly state that the admission policy is a
prerequisite for production use, not an optional enhancement. Make clear that
without field-ownership enforcement at the API level, admins must manually
follow the documented protocol, and any deviation recreates the helm-controller
race conditions the PackageValues design was meant to prevent.
Summary
Adds a design proposal — GitOps Configuration Surfaces — under
design-proposals/gitops-surfaces/.It defines a single GitOps primitive (
GitRepository+Kustomization+ a scopedServiceAccount) and instantiates it at two trust tiers:packageOverridesadmin override layer so component-parameter changes don't race helm-controller for field ownership.gitopsapp bound to the tenantServiceAccount, isolated via a dedicated tenant kustomize-controller shard.Covers admin package/parameter/golden-image configuration, tenant self-service GitOps, and arbitrary cluster resources outside the core API, plus cross-cutting secrets-as-code, DR/reproducibility, and policy-as-code.
Status
Draft — opening for community feedback per the design-proposal process. The first increment (admin surface) is already prototyped in cozystack/cozystack#2731.
Summary by CodeRabbit