Szymiserver Infrastructure Runbook

Overview

This server runs a Kubernetes cluster (kind) hosting:

| Service | URL | Notes |
| --- | --- | --- |
| Forgejo | https://git.szymi.ddns.net | Git + CI/CD |
| ArgoCD | https://argocd.szymi.ddns.net | GitOps deployment |
| Registry | https://registry.szymi.ddns.net | Private Docker registry |
| Nextcloud | https://nextcloud.szymi.ddns.net | File storage |
| Draw.io | https://drawio.szymi.ddns.net | Diagrams |

All TLS certificates are automatically issued and renewed via Let's Encrypt (cert-manager).

Apps are deployed via ArgoCD GitOps: push to Forgejo and ArgoCD syncs the change automatically.


Prerequisites

  • Linux machine with Docker installed
  • kind installed (go install sigs.k8s.io/kind@latest)
  • kubectl installed
  • Ports 80, 443, 2222 open on the firewall/router
  • DNS A records pointing to the server's public IP for all subdomains
  • /media/ssd mounted (SSD for persistent data)
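
The prerequisites above can be checked before starting. This is a minimal sketch, assuming a POSIX shell and that `mountpoint` (from util-linux) is available. The function names `missing_tools` and `check_prereqs` are illustrative and not part of the repo.

```shell
# Print each command from the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Report missing tools and whether the data disk is mounted.
# Returns non-zero if anything is missing.
check_prereqs() {
  missing=$(missing_tools docker kind kubectl)
  status=0
  if [ -n "$missing" ]; then
    echo "missing tools: $(echo "$missing" | tr '\n' ' ')" >&2
    status=1
  fi
  if ! mountpoint -q /media/ssd; then
    echo "/media/ssd is not a mount point" >&2
    status=1
  fi
  return "$status"
}
```

Running `check_prereqs && echo "ready"` prints `ready` only when all three tools are found and /media/ssd is mounted. Firewall ports and DNS records still have to be verified by hand.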

Fresh Install From Scratch

Step 1: Clone the repo

git clone https://git.szymi.ddns.net/szymi/szymiserver.git
cd szymiserver

Step 2: Create the kind cluster

kind create cluster --config cluster-config.yml

This creates a cluster named szymicluster with:

  • Ports 80 and 443 exposed (for ingress HTTP/HTTPS)
  • Port 2222 exposed (for Forgejo SSH)
  • /media/ssd mounted into the kind container (persistent volumes)
  • /var/run/docker.sock mounted (for Forgejo runner)
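
For orientation, a kind config with these properties could look roughly like the sketch below. It is not the repo's actual cluster-config.yml: the container-side port for Forgejo SSH (30022) and the container paths of the mounts are assumptions, and the file is written to a scratch location so it cannot overwrite the real config.

```shell
# Write an example kind config to a scratch file (not the real cluster-config.yml).
example="${TMPDIR:-/tmp}/cluster-config.example.yml"

cat > "$example" <<'EOF'
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
name: szymicluster
nodes:
  - role: control-plane
    extraPortMappings:
      - containerPort: 80      # ingress HTTP
        hostPort: 80
        protocol: TCP
      - containerPort: 443     # ingress HTTPS
        hostPort: 443
        protocol: TCP
      - containerPort: 30022   # ASSUMPTION: NodePort of the Forgejo SSH service
        hostPort: 2222
        protocol: TCP
    extraMounts:
      - hostPath: /media/ssd   # ASSUMPTION: same path inside the node
        containerPath: /media/ssd
      - hostPath: /var/run/docker.sock
        containerPath: /var/run/docker.sock
EOF

echo "wrote $example"
```

Port mappings and mounts are fixed when the node container is created, so changing them later means recreating the cluster.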

Step 3: Deploy infrastructure

chmod +x k8s/deploy-infrastructure.sh
./k8s/deploy-infrastructure.sh

What the script does, in order:

  1. Fixes kind container DNS (forces IPv4, which prevents ImagePullBackOff on servers with broken IPv6)
  2. Creates required host directories on /media/ssd
  3. Applies all infrastructure via kustomize (ingress-nginx, cert-manager, CoreDNS, registry, ArgoCD, Forgejo)
  4. Waits for cert-manager to be ready, then applies the ClusterIssuer
  5. Waits for ingress-nginx to be ready
  6. Restarts CoreDNS to pick up internal DNS config
  7. Applies ArgoCD Application definitions (deploys apps via GitOps)
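
The order above can be sketched as a dry-run script. This is not the contents of deploy-infrastructure.sh; it only illustrates the sequence. The `cert-manager` and `ingress-nginx` namespaces are the conventional defaults and are assumed here, the ClusterIssuer file path is hypothetical, and the directory list is taken from the persistent-data paths in this runbook. By default the commands are printed rather than executed.

```shell
# Dry-run by default: commands are printed, not executed. Set DRY_RUN=0 to run them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

deploy() {
  # 1. Point the kind node at an IPv4 resolver.
  run docker exec szymicluster-control-plane \
    bash -c 'echo "nameserver 8.8.8.8" > /etc/resolv.conf'

  # 2. Host directories for persistent data.
  for dir in forgejo/forgejo-data forgejo/runner-data registry nextcloud mariadb; do
    run mkdir -p "/media/ssd/$dir"
  done

  # 3. All infrastructure via kustomize.
  run kubectl apply -k k8s/infrastructure

  # 4. cert-manager first, then the ClusterIssuer (file path is hypothetical).
  run kubectl wait --for=condition=Available deployment --all \
    -n cert-manager --timeout=300s
  run kubectl apply -f k8s/infrastructure/cert-manager/cluster-issuer.yaml

  # 5. Ingress controller.
  run kubectl wait --for=condition=Available deployment --all \
    -n ingress-nginx --timeout=300s

  # 6. Restart CoreDNS so it picks up the internal hosts.
  run kubectl rollout restart deployment coredns -n kube-system

  # 7. ArgoCD Application definitions.
  run kubectl apply -f k8s/argocd-apps/
}

deploy
```

The ClusterIssuer is applied only after cert-manager is available because the ClusterIssuer resource type is provided by cert-manager and its webhook must be running to accept it.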

Step 4: Wait for pods and certificates

# Watch pods come up (takes 2-5 min)
kubectl get pods -A -w

# Check TLS certificates (takes 1-3 min after pods are up)
kubectl get certificates -A

All certificates should show READY = True.
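
To spot stragglers quickly, the listing can be filtered down to certificates that are not ready yet. This is a small sketch, assuming the default column layout of `kubectl get certificates -A` (NAMESPACE, NAME, READY, SECRET, AGE). The helper name `not_ready_certs` is made up for this example.

```shell
# Read `kubectl get certificates -A` output on stdin and print
# namespace/name for every certificate whose READY column is not "True".
not_ready_certs() {
  awk 'NR > 1 && $3 != "True" { print $1 "/" $2 }'
}
```

Usage: `kubectl get certificates -A | not_ready_certs`. Empty output means every certificate is ready.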

Step 5: Get ArgoCD admin password

kubectl -n argocd get secret argocd-initial-admin-secret \
  -o jsonpath="{.data.password}" | base64 -d && echo

Log in at https://argocd.szymi.ddns.net as admin with the password printed above.


Manual Steps After Fresh Install

These cannot be automated and must be done once:

Forgejo: Create admin account

  1. Open https://git.szymi.ddns.net
  2. Complete the setup wizard (first user becomes admin)
  3. Create the repositories: szymiserver, nextcloud, drawio

Forgejo Runner: Register

The runner pod expects /media/ssd/forgejo/runner-data/ to contain a registered config.

# Get a runner registration token from Forgejo:
# Site Administration → Actions → Runners → Create new Runner

# Register the runner (exec into the pod):
kubectl exec -it -n forgejo deployment/forgejo-runner -- \
  forgejo-runner register \
  --instance https://git.szymi.ddns.net \
  --token <TOKEN_FROM_FORGEJO> \
  --name szymiserver \
  --no-interactive

Day-to-Day Operations

Deploying a new app via ArgoCD

  1. Create Kubernetes manifests in your app repo (see drawio or nextcloud repos as examples)
  2. Add newapp.szymi.ddns.net to CoreDNS internal hosts:
    # k8s/infrastructure/coredns/coredns-custom.yaml
    10.96.0.100 newapp.szymi.ddns.net
    
  3. Create an ArgoCD Application in k8s/argocd-apps/newapp.yaml
  4. Push to Forgejo and apply:
    kubectl apply -f k8s/infrastructure/coredns/coredns-custom.yaml
    kubectl rollout restart deployment coredns -n kube-system
    kubectl apply -f k8s/argocd-apps/newapp.yaml
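
For step 3, an ArgoCD Application manifest might be generated along these lines. This is a sketch under stated assumptions, not a copy of the existing drawio.yaml or nextcloud.yaml: it assumes the repo lives at szymi/<name> on Forgejo, that the manifests sit in a directory called `k8s`, and that the app deploys into a namespace named after itself. `render_app` is a hypothetical helper.

```shell
# Print an ArgoCD Application manifest for the app named in $1.
# ASSUMPTIONS: repo path, manifest directory "k8s", namespace = app name.
render_app() {
  name="$1"
  cat <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ${name}
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.szymi.ddns.net/szymi/${name}.git
    targetRevision: HEAD
    path: k8s
  destination:
    server: https://kubernetes.default.svc
    namespace: ${name}
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
EOF
}
```

Usage: `render_app newapp > k8s/argocd-apps/newapp.yaml`. The `automated` sync policy is what makes ArgoCD pick up pushes without a manual sync. Drop `prune` if resources removed from Git should not be deleted from the cluster.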
    

After server restart (if pods have ImagePullBackOff)

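The kind container loses its DNS config on restart. Restore the resolver, then delete the ArgoCD pods so they are recreated and pull their images again: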

docker exec szymicluster-control-plane bash -c 'echo "nameserver 8.8.8.8" > /etc/resolv.conf'
kubectl delete pods -n argocd --all

Checking overall health

# All pods status
kubectl get pods -A | grep -v Running | grep -v Completed

# Certificate status
kubectl get certificates -A

# ArgoCD app sync status
kubectl get applications -n argocd
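
The `grep -v Running` filter hides pods that are in the Running state but not ready (for example READY 0/1 while crash-looping). A stricter filter could look like the sketch below. It assumes the column order that `kubectl get pods -A --no-headers` prints (NAMESPACE, NAME, READY, STATUS, ...), and `unhealthy_pods` is a name invented for this example.

```shell
# Read `kubectl get pods -A --no-headers` on stdin and print pods that are
# neither Completed nor fully ready.
unhealthy_pods() {
  awk '{
    split($3, ready, "/")          # READY column, e.g. "0/1"
    if ($4 == "Completed") next    # finished jobs are fine
    if ($4 != "Running" || ready[1] != ready[2]) print $1 "/" $2 " " $3 " " $4
  }'
}
```

Usage: `kubectl get pods -A --no-headers | unhealthy_pods`. Empty output means every pod is either fully ready or a completed job.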

Forcing ArgoCD to re-sync

kubectl patch application <app-name> -n argocd \
  --type merge -p '{"metadata":{"annotations":{"argocd.argoproj.io/refresh":"hard"}}}'

Architecture

Internet
   │
   ▼
Router (ports 80, 443, 2222 → server)
   │
   ▼
Linux Host (/media/ssd for persistent data)
   │
   ▼
kind container (szymicluster-control-plane)
   │
   ├── ingress-nginx  ← routes all HTTP/HTTPS traffic
   ├── cert-manager   ← issues Let's Encrypt TLS certs
   ├── CoreDNS        ← internal DNS (routes *.szymi.ddns.net to ingress)
   ├── ArgoCD         ← watches Forgejo, deploys apps automatically
   ├── Forgejo        ← git server + CI/CD runner
   ├── Registry       ← private Docker image registry
   ├── Nextcloud      ← deployed via ArgoCD
   └── Draw.io        ← deployed via ArgoCD

Why CoreDNS has internal hosts

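cert-manager verifies domain ownership with an HTTP-01 challenge, and before asking Let's Encrypt to validate it, it first checks from inside the cluster that the challenge URL is reachable. Inside the cluster, *.szymi.ddns.net would normally resolve to the external IP, which doesn't route back in, so that check would fail. CoreDNS overrides these names to point directly at ingress-nginx's fixed ClusterIP (10.96.0.100).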
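
One way to confirm the override is active is to resolve a hostname from a throwaway pod and check that the answer is the ingress ClusterIP. This is a sketch, assuming the busybox image can be pulled. The pod name `dnscheck` and both function names are arbitrary.

```shell
INGRESS_IP="10.96.0.100"

# Succeeds if the resolver output on stdin mentions the ingress ClusterIP.
points_to_ingress() {
  grep -qwF -- "$INGRESS_IP"
}

# Resolve a hostname from a temporary pod inside the cluster and check the answer.
check_internal_dns() {
  kubectl run dnscheck --rm -i --restart=Never --image=busybox:1.36 -- \
    nslookup "$1" | points_to_ingress
}
```

Usage: `check_internal_dns git.szymi.ddns.net && echo "internal override active"`. If the check fails for a newly added app, the hostname is probably missing from the CoreDNS hosts file or CoreDNS has not been restarted since the change.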


File Structure

szymiserver/
├── cluster-config.yml          # kind cluster definition (run once)
├── RUNBOOK.md                  # this file
├── k8s/
│   ├── deploy-infrastructure.sh  # main deployment script
│   ├── infrastructure/
│   │   ├── kustomization.yaml    # applies all infrastructure
│   │   ├── coredns/              # internal DNS overrides
│   │   ├── ingress-nginx/        # ingress controller
│   │   ├── cert-manager/         # TLS certificates
│   │   ├── argocd/               # GitOps controller
│   │   ├── forgejo/              # git server + runner
│   │   └── registry/             # private Docker registry
│   └── argocd-apps/
│       ├── drawio.yaml           # ArgoCD app for draw.io
│       └── nextcloud.yaml        # ArgoCD app for Nextcloud

Persistent Data (on host at /media/ssd)

| Path | Used by |
| --- | --- |
| /media/ssd/forgejo/forgejo-data | Forgejo repos, config, DB |
| /media/ssd/forgejo/runner-data | Runner registration + config |
| /media/ssd/registry | Docker image layers |
| /media/ssd/nextcloud | Nextcloud files |
| /media/ssd/mariadb | Nextcloud database |