Sync Waves: Cluster-Complete Bootstrap
What This Covers
How to bootstrap a fully self-contained cluster environment - cert-manager, Traefik,
ArgoCD ingress, and services - using ArgoCD sync waves, with a single kubectl apply
as the only manual step after ArgoCD itself is installed.
What Are Sync Waves?
ArgoCD processes resources in a sync operation in wave order. Each wave must reach
Healthy before the next wave starts.
You set a wave with the argocd.argoproj.io/sync-wave annotation in the resource's metadata.annotations.
Waves are integers. Lower = earlier. Default = wave 0. Negative values are valid (useful for CRD-installing resources that must precede wave 0).
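As a minimal sketch, the annotation sits under metadata.annotations, and the wave number has to be quoted because annotation values are strings. The ConfigMap and its name are placeholders chosen only for illustration:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-config            # hypothetical resource name
  annotations:
    # Quoted on purpose: annotation values must be strings, not integers.
    argocd.argoproj.io/sync-wave: "-1"
data:
  note: "applied before anything in wave 0"
```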
Key rule: ArgoCD waits for all resources in wave N to be Synced + Healthy before
processing wave N+1. For Application CRDs (which are themselves ArgoCD resources), an
Application is Healthy when its child resources are all healthy.
This means sync waves on Application manifests give you cluster-level dependency
ordering - not just resource ordering within a single app.
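One caveat worth checking against your Argo CD version: since Argo CD 1.8 the built-in health check for Application resources is no longer enabled by default, so a parent app may treat a child Application as done as soon as it is applied. The Argo CD documentation describes restoring the check with a resource customization in argocd-cm. The fragment below is a sketch of that customization; merge the key into your existing ConfigMap rather than replacing it:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/name: argocd-cm
    app.kubernetes.io/part-of: argocd
data:
  # Lua health check: report the child Application's own health status,
  # defaulting to Progressing until that status is populated.
  resource.customizations.health.argoproj.io_Application: |
    hs = {}
    hs.status = "Progressing"
    hs.message = ""
    if obj.status ~= nil then
      if obj.status.health ~= nil then
        hs.status = obj.status.health.status
        if obj.status.health.message ~= nil then
          hs.message = obj.status.health.message
        end
      end
    end
    return hs
```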
Folder Structure Philosophy
| |
platform/ = orchestration only. It contains Application and ApplicationSet CRDs -
nothing else. Each Application CRD points to a subfolder (config/cert-manager,
environments/eu-dev-rancher/argocd, etc.) that holds the actual manifests.
This separation means:
- You can read platform/ to understand the full cluster topology at a glance.
- Adding a new component = one new platform/otel.yaml (the CRD) + one new subfolder (the manifests). The two concerns never mix.
Extensibility example: Adding OTel later means adding platform/otel.yaml (wave 5,
points to environments/eu-dev-rancher/observability/) and the actual OTel manifests
in observability/. The bootstrap.yaml never changes.
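A sketch of what such a platform/otel.yaml could look like follows. The wave number and source path come from the text above; the Application name, repository URL, project, destination namespace, and sync policy are assumptions for illustration only:

```yaml
# platform/otel.yaml (sketch)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: otel                      # hypothetical name
  namespace: argocd               # Application resources live in argocd
  annotations:
    argocd.argoproj.io/sync-wave: "5"
spec:
  project: default                # assumption
  source:
    repoURL: https://git.example.com/your-org/your-repo.git   # placeholder
    targetRevision: HEAD
    path: environments/eu-dev-rancher/observability
  destination:
    server: https://kubernetes.default.svc
    namespace: observability      # assumption
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
```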
Why AppSet Lives in platform/
The ApplicationSet is platform team policy, not a dev team concern:
- Ownership: The AppSet defines what Helm chart all services use, what namespace they land in, what sync policy they get, and what labels they carry. Dev teams only add a values file to services/ - they never touch the AppSet.
- Policy enforcement: The AppSet is the contract between platform and dev teams. “Bring us a values file and we deploy it according to this template.” Keeping it in platform/ makes it as governed as cert-manager or Traefik.
- Wave ordering: The AppSet (wave 4) must run after cert-manager (waves 0–1) and Traefik (wave 2) are healthy. Services that boot before their ingress controller or CA issuer exists will fail TLS issuance and endpoint health checks.
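The contract described above could be sketched with a Git file generator, as below. This is an illustration under several assumptions, not the actual platform/appset.yaml: the AppSet name, repository URL, chart path (charts/service), the .yaml extension on values files, and the sync policy are all hypothetical. Only wave 4, the services/ folder, the alpha-dev namespace, and the alpha-<service>-eu-dev-rancher naming come from this document.

```yaml
# platform/appset.yaml (sketch)
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: alpha-services            # hypothetical name
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "4"
spec:
  goTemplate: true                # enables Go templates with sprig helpers
  generators:
    - git:
        repoURL: https://git.example.com/your-org/your-repo.git   # placeholder
        revision: HEAD
        files:
          - path: "services/*.yaml"   # one Application per values file
  template:
    metadata:
      # services/svc3.yaml -> alpha-svc3-eu-dev-rancher
      name: 'alpha-{{ trimSuffix ".yaml" .path.filename }}-eu-dev-rancher'
    spec:
      project: default            # assumption
      source:
        repoURL: https://git.example.com/your-org/your-repo.git   # placeholder
        targetRevision: HEAD
        path: charts/service      # hypothetical chart location in the same repo
        helm:
          valueFiles:
            # Leading slash is intended to resolve from the repository root;
            # verify this behaviour against your Argo CD version.
            - '/{{ .path.path }}/{{ .path.filename }}'
      destination:
        server: https://kubernetes.default.svc
        namespace: alpha-dev
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```

Note that the file generator also exposes the keys of each matched file as template parameters, so a values file with a top-level key named path would collide with the built-in path parameter.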
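Wave Ordering
| Wave | Component |
|---|---|
| 0–1 | cert-manager |
| 2 | Traefik |
| 3 | ArgoCD configuration (environments/eu-dev-rancher/argocd, including the admin password) |
| 4 | ApplicationSet (services) |
| 5 | OTel (future) |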
Namespace Decisions
| Component | Namespace | Rationale |
|---|---|---|
| ArgoCD | argocd | Standard |
| cert-manager | cert-manager | Standard |
| Traefik | ingress | Groups all ingress infra; descriptive |
| Services (svc1/2) | alpha-dev | Team/env scoped |
Why bootstrap.yaml sets destination.namespace: argocd: The resources it deploys
are Application and ApplicationSet CRDs, which must live in the argocd namespace by
ArgoCD convention. The child Applications each deploy their workloads into their own
namespaces (cert-manager, ingress, alpha-dev, etc.).
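A minimal sketch of a bootstrap.yaml with that shape is shown below. The destination.namespace and the platform/ source path follow the text; the Application name, repository URL, project, and sync policy are assumptions:

```yaml
# bootstrap.yaml (sketch)
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: bootstrap                 # hypothetical name
  namespace: argocd
spec:
  project: default                # assumption
  source:
    repoURL: https://git.example.com/your-org/your-repo.git   # placeholder
    targetRevision: HEAD
    path: platform                # Application / ApplicationSet manifests only
  destination:
    server: https://kubernetes.default.svc
    # The children are Argo CD resources, so they are created in argocd;
    # each child then targets its own workload namespace.
    namespace: argocd
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
```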
One-Time Manual Steps
These run once per cluster. After step 5, ArgoCD takes over - no further kubectl apply
is needed for platform components or services.
| |
Why the PAT can’t be GitOps-managed
ArgoCD needs the repo credential to read the repo - but the credential would be stored inside that same repo, and ArgoCD cannot sync a file before it has access to read it. This is a chicken-and-egg problem: the PAT must exist in the cluster before bootstrap.
bootstrap/ is gitignored for exactly this reason. You fill in the PAT locally and
apply it once. It never hits the git history.
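For reference, Argo CD reads repository credentials from Secrets labelled argocd.argoproj.io/secret-type: repository. A sketch of such a file, kept under the gitignored bootstrap/ folder, might look like this; the Secret name, URL, and username are placeholders:

```yaml
# bootstrap/<your-file>.yaml (sketch, never committed)
apiVersion: v1
kind: Secret
metadata:
  name: repo-credentials          # hypothetical name
  namespace: argocd
  labels:
    # This label is what makes Argo CD treat the Secret as a repository.
    argocd.argoproj.io/secret-type: repository
stringData:
  type: git
  url: https://git.example.com/your-org/your-repo.git   # placeholder
  username: git                   # placeholder
  password: "REPLACE_WITH_PAT"    # fill in locally before applying
```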
Why the admin password hash can be committed
A bcrypt hash is one-way - knowing the hash doesn’t reveal the password, although a weak password can still be guessed offline, so choose a strong one. With that caveat, committing
$2a$10$... is safe. Generate it once locally, commit it to
environments/eu-dev-rancher/argocd/argocd-admin-password.yaml, and ArgoCD applies it
on wave 3. No manual patching is needed after that.
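One plausible shape for that file, assuming it sets the admin.password key on the standard argocd-secret, is sketched below. The hash stays elided exactly as above, and the timestamp is a placeholder:

```yaml
# environments/eu-dev-rancher/argocd/argocd-admin-password.yaml (sketch)
apiVersion: v1
kind: Secret
metadata:
  name: argocd-secret
  namespace: argocd
stringData:
  # bcrypt hash only, never the plaintext. It can be generated locally,
  # for example with: argocd account bcrypt --password <password>
  admin.password: "$2a$10$..."
  # Placeholder timestamp marking when the password was last changed.
  admin.passwordMtime: "2024-01-01T00:00:00Z"
```

Because argocd-secret also holds keys that Argo CD generates itself, it is worth confirming on a test cluster that syncing this manifest leaves those keys intact.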
Wave-by-Wave Verification
| |
Adding a New Service
Drop a values file into services/:
| |
Commit and push. The ApplicationSet detects the new file and creates
alpha-svc3-eu-dev-rancher automatically. No changes to platform/appset.yaml, no new
Application manifest, no ArgoCD UI interaction.
Extending to eu-staging
To add a staging cluster (e.g. Minikube), create a parallel folder:
| |
The config/cert-manager/ manifests are reused as-is - ClusterIssuer manifests are
cluster-agnostic (they reference no cluster-specific hostname or secret name).
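To illustrate what "cluster-agnostic" means here, the sketch below shows a self-signed ClusterIssuer that carries no hostname or secret reference. It is an assumed example; the actual issuers in config/cert-manager/ may be of a different type, and the name is hypothetical:

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: selfsigned-issuer         # hypothetical name
spec:
  # No hostnames, secrets, or cluster-specific fields, so the same
  # manifest can be applied unchanged to any cluster.
  selfSigned: {}
```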
Future: Adding Observability
| |
The bootstrap.yaml never changes. ArgoCD picks up the new platform/otel.yaml on the
next sync and adds OTel to the wave sequence.
Gotchas
ArgoCD UI returns 500 Internal Server Error through Traefik ingress
This took two rounds of debugging. Both root causes stem from argocd-server’s default TLS behaviour.
Round 1 - protocol mismatch
argocd-server runs TLS on port 443 by default. Traefik terminates TLS at the ingress, then forwards plain HTTP to the backend - but argocd-server:443 expects TLS. Traefik logs show a generic 500 with no further detail.
Diagnosed by checking kubectl describe ingress argocd-server -n argocd (backend port
443) and kubectl get configmap argocd-cmd-params-cm -n argocd (no server.insecure
set, so TLS mode is active).
Round 2 - x509 IP SAN validation failure
Fixing the protocol mismatch by adding the traefik.ingress.kubernetes.io/service.serversscheme: https
annotation (so Traefik re-encrypts to the backend) exposed the next problem: Traefik validates the
backend TLS cert, and argocd-server’s self-signed cert has no IP SAN for the pod IP. Traefik rejects it.
Diagnosed via kubectl logs -n ingress -l app.kubernetes.io/name=traefik, which reported the x509 IP SAN validation failure.
Root fix - run argocd-server in insecure mode
The clean solution for both: set server.insecure: true so argocd-server serves plain
HTTP on port 80. Traefik connects with HTTP - no cert validation, no protocol mismatch.
External traffic (browser → Traefik) is still TLS-encrypted via the cert-manager cert.
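A sketch of the two pieces involved follows: the server.insecure flag in argocd-cmd-params-cm, and an Ingress whose backend targets port 80. The hostname, TLS secret name, ingress class, and issuer name are placeholders, not values from this setup:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cmd-params-cm
  namespace: argocd
  labels:
    app.kubernetes.io/name: argocd-cmd-params-cm
    app.kubernetes.io/part-of: argocd
data:
  # Serve plain HTTP; TLS is terminated at Traefik instead.
  server.insecure: "true"
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argocd-server
  namespace: argocd
  annotations:
    cert-manager.io/cluster-issuer: example-issuer   # hypothetical issuer name
spec:
  ingressClassName: traefik       # assumption
  tls:
    - hosts:
        - argocd.example.com      # placeholder hostname
      secretName: argocd-server-tls   # placeholder secret name
  rules:
    - host: argocd.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: argocd-server
                port:
                  number: 80      # plain HTTP backend, not 443
```

argocd-server reads this setting at startup, so the pods typically need a restart before the change takes effect.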
Final traffic flow: browser → HTTPS (cert-manager certificate) → Traefik → plain HTTP on port 80 → argocd-server.
Regular services (nginx) are not affected - they already serve plain HTTP on port 80. No config changes needed for svc1, svc2, or any standard HTTP backend.