Szymiserver Infrastructure Runbook
Overview
This server runs a Kubernetes cluster (kind) hosting:
| Service | URL | Notes |
|---|---|---|
| Forgejo | https://git.szymi.ddns.net | Git + CI/CD |
| ArgoCD | https://argocd.szymi.ddns.net | GitOps deployment |
| Registry | https://registry.szymi.ddns.net | Private Docker registry |
| Nextcloud | https://nextcloud.szymi.ddns.net | File storage |
| Draw.io | https://drawio.szymi.ddns.net | Diagrams |
All TLS certificates are automatically issued and renewed via Let's Encrypt (cert-manager).
Apps are deployed via ArgoCD GitOps: push to Forgejo and ArgoCD syncs automatically.
Prerequisites
- Linux machine with Docker installed
- kind installed (go install sigs.k8s.io/kind@latest)
- kubectl installed
- Ports 80, 443, 2222 open on the firewall/router
- DNS A records pointing to server's public IP for all subdomains
- /media/ssd mounted (SSD for persistent data)
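The prerequisites above can be checked before starting the install. The following is a minimal sketch, not part of the repo: the function names `missing_tools` and `check_prereqs` are hypothetical, and the check only confirms that the data directory exists, not that an SSD is actually mounted there. Firewall and DNS prerequisites are not covered.

```shell
# Print the name of every command in the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Succeed only when docker, kind and kubectl are on PATH and the data
# directory exists. The directory defaults to /media/ssd; this does not
# verify that a filesystem is mounted there.
check_prereqs() {
  data_dir="${1:-/media/ssd}"
  missing="$(missing_tools docker kind kubectl)"
  if [ -n "$missing" ]; then
    printf 'missing tools: %s\n' "$(printf '%s' "$missing" | tr '\n' ' ')" >&2
    return 1
  fi
  if [ ! -d "$data_dir" ]; then
    printf 'data directory not found: %s\n' "$data_dir" >&2
    return 1
  fi
}
```

Running `check_prereqs && echo "ready"` before Step 2 gives a quick yes/no answer.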
Fresh Install From Scratch
Step 1: Clone the repo
git clone https://git.szymi.ddns.net/szymi/szymiserver.git
cd szymiserver
Step 2: Create the kind cluster
kind create cluster --config cluster-config.yml
This creates a cluster named szymicluster with:
- Port 80, 443 exposed (for ingress HTTP/HTTPS)
- Port 2222 exposed (for Forgejo SSH)
- /media/ssd mounted into the kind container (persistent volumes)
- /var/run/docker.sock mounted (for Forgejo runner)
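To confirm the port mappings took effect, the output of `docker port` for the kind node container can be compared against the ports listed above. This is a sketch under assumptions: `missing_ports` is a hypothetical helper, and it expects the usual `docker port` line shape (for example `80/tcp -> 0.0.0.0:80`), taking the host port from the text after the last colon.

```shell
# Read `docker port <container>` output on stdin and print each required
# host port (given as arguments) that is not published.
missing_ports() {
  # Keep only the host port, i.e. the digits after the last colon.
  published="$(sed -n 's/.*:\([0-9][0-9]*\)$/\1/p' | sort -u)"
  for port in "$@"; do
    printf '%s\n' "$published" | grep -qx "$port" || printf '%s\n' "$port"
  done
}
```

Usage: `docker port szymicluster-control-plane | missing_ports 80 443 2222`. Empty output means all three ports are published on the host.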
Step 3: Deploy infrastructure
chmod +x k8s/deploy-infrastructure.sh
./k8s/deploy-infrastructure.sh
What the script does, in order:
- Fixes kind container DNS (forces IPv4 — prevents ImagePullBackOff on servers with broken IPv6)
- Creates required host directories on /media/ssd
- Applies all infrastructure via kustomize (ingress-nginx, cert-manager, CoreDNS, registry, ArgoCD, Forgejo)
- Waits for cert-manager to be ready, then applies the ClusterIssuer
- Waits for ingress-nginx to be ready
- Restarts CoreDNS to pick up internal DNS config
- Applies ArgoCD Application definitions (deploys apps via GitOps)
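The "waits for ... to be ready" steps in the list above could be expressed with `kubectl wait`. This is only a sketch of one way to do it, not the contents of deploy-infrastructure.sh: `wait_for_deployments` is a hypothetical helper, and the namespace names in the usage line are assumed to be the upstream defaults.

```shell
# Block until every Deployment in the given namespace reports the
# Available condition, or fail once the timeout (default 300s) expires.
wait_for_deployments() {
  namespace="$1"
  timeout="${2:-300s}"
  kubectl wait --namespace "$namespace" \
    --for=condition=Available deployment --all \
    --timeout="$timeout"
}
```

Usage, assuming those namespace names: `wait_for_deployments cert-manager && wait_for_deployments ingress-nginx`.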
Step 4: Wait for pods and certificates
# Watch pods come up (takes 2-5 min)
kubectl get pods -A -w
# Check TLS certificates (takes 1-3 min after pods are up)
kubectl get certificates -A
All certificates should show READY = True.
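To spot the certificates that are not ready yet without scanning the whole table, the listing can be filtered. A minimal sketch, assuming the default column order of `kubectl get certificates -A --no-headers` (NAMESPACE, NAME, READY, ...); `not_ready_certs` is a hypothetical helper name.

```shell
# Read `kubectl get certificates -A --no-headers` output on stdin and print
# namespace/name for each certificate whose READY column is not "True".
not_ready_certs() {
  awk '$3 != "True" { print $1 "/" $2 }'
}
```

Usage: `kubectl get certificates -A --no-headers | not_ready_certs`. Empty output means every certificate is ready.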
Step 5: Get ArgoCD admin password
kubectl -n argocd get secret argocd-initial-admin-secret \
-o jsonpath="{.data.password}" | base64 -d && echo
Log in at https://argocd.szymi.ddns.net as admin with the password printed above.
Manual Steps After Fresh Install
These cannot be automated and must be done once:
Forgejo: Create admin account
- Open https://git.szymi.ddns.net
- Complete the setup wizard (first user becomes admin)
- Create the repositories: szymiserver, nextcloud, drawio
Forgejo Runner: Register
The runner pod expects /media/ssd/forgejo/runner-data/ to contain a registered config.
# Get a runner registration token from Forgejo:
# Site Administration → Actions → Runners → Create new Runner
# Register the runner (exec into the pod):
kubectl exec -it -n forgejo deployment/forgejo-runner -- \
forgejo-runner register \
--instance https://git.szymi.ddns.net \
--token <TOKEN_FROM_FORGEJO> \
--name szymiserver \
--no-interactive
Day-to-Day Operations
Deploying a new app via ArgoCD
- Create Kubernetes manifests in your app repo (see the drawio or nextcloud repos as examples)
- Add newapp.szymi.ddns.net to the CoreDNS internal hosts:
  # k8s/infrastructure/coredns/coredns-custom.yaml
  10.96.0.100 newapp.szymi.ddns.net
- Create an ArgoCD Application in k8s/argocd-apps/newapp.yaml
- Push to Forgejo and apply:
  kubectl apply -f k8s/infrastructure/coredns/coredns-custom.yaml
  kubectl rollout restart deployment coredns -n kube-system
  kubectl apply -f k8s/argocd-apps/newapp.yaml
After server restart (if pods have ImagePullBackOff)
This should no longer happen — /etc/docker/daemon.json is configured with "dns": ["8.8.8.8"] to prevent the router's broken IPv6 DNS from being injected into containers. The kind container also has restart=always so it auto-starts with Docker.
If it does happen anyway (e.g. daemon.json was reset), quick fix:
docker exec szymicluster-control-plane bash -c 'echo "nameserver 8.8.8.8" > /etc/resolv.conf'
kubectl delete pods -A --field-selector=status.phase!=Running 2>/dev/null
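Before applying the quick fix, it can help to confirm that an IPv6 nameserver really was injected into the kind container. A small sketch; `has_ipv6_nameserver` is a hypothetical helper that treats any nameserver address containing a colon as IPv6.

```shell
# Read resolv.conf content on stdin. Exit 0 if any nameserver entry is an
# IPv6 address (contains a colon), exit 1 otherwise.
has_ipv6_nameserver() {
  awk '$1 == "nameserver" && $2 ~ /:/ { found = 1 } END { exit !found }'
}
```

Usage: `docker exec szymicluster-control-plane cat /etc/resolv.conf | has_ipv6_nameserver && echo "IPv6 DNS present"`.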
Checking overall health
# All pods status
kubectl get pods -A | grep -v Running | grep -v Completed
# Certificate status
kubectl get certificates -A
# ArgoCD app sync status
kubectl get applications -n argocd
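For unattended checks (for example from cron), the pod filter above can be turned into a command with a meaningful exit code. A sketch, assuming the default column order of `kubectl get pods -A --no-headers` (NAMESPACE, NAME, READY, STATUS, ...); `unhealthy_pods` is a hypothetical helper name.

```shell
# Read `kubectl get pods -A --no-headers` output on stdin, print
# "namespace/name status" for pods whose STATUS is neither Running nor
# Completed, and return non-zero if any such pod exists.
unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 " " $4; bad = 1 }
       END { exit (bad ? 1 : 0) }'
}
```

Usage: `kubectl get pods -A --no-headers | unhealthy_pods || echo "cluster needs attention"`. Like the grep filter, this does not catch pods that are Running but not Ready.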
Forcing ArgoCD to re-sync
kubectl patch application <app-name> -n argocd \
--type merge -p '{"metadata":{"annotations":{"argocd.argoproj.io/refresh":"hard"}}}'
Architecture
Internet
│
▼
Router (ports 80, 443, 2222 → server)
│
▼
Linux Host (/media/ssd for persistent data)
│
▼
kind container (szymicluster-control-plane)
│
├── ingress-nginx ← routes all HTTP/HTTPS traffic
├── cert-manager ← issues Let's Encrypt TLS certs
├── CoreDNS ← internal DNS (routes *.szymi.ddns.net to ingress)
├── ArgoCD ← watches Forgejo, deploys apps automatically
├── Forgejo ← git server + CI/CD runner
├── Registry ← private Docker image registry
├── Nextcloud ← deployed via ArgoCD
└── Draw.io ← deployed via ArgoCD
Why CoreDNS has internal hosts
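cert-manager proves domain ownership via the HTTP-01 challenge, and before asking Let's Encrypt to validate it first checks from inside the cluster that the challenge URL is reachable. Inside the cluster, *.szymi.ddns.net would otherwise resolve to the external IP, which doesn't route back in. CoreDNS overrides these names to point directly to ingress-nginx's fixed ClusterIP (10.96.0.100).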
File Structure
szymiserver/
├── cluster-config.yml # kind cluster definition (run once)
├── RUNBOOK.md # this file
├── k8s/
│ ├── deploy-infrastructure.sh # main deployment script
│ ├── infrastructure/
│ │ ├── kustomization.yaml # applies all infrastructure
│ │ ├── coredns/ # internal DNS overrides
│ │ ├── ingress-nginx/ # ingress controller
│ │ ├── cert-manager/ # TLS certificates
│ │ ├── argocd/ # GitOps controller
│ │ ├── forgejo/ # git server + runner
│ │ └── registry/ # private Docker registry
│ └── argocd-apps/
│ ├── drawio.yaml # ArgoCD app for draw.io
│ └── nextcloud.yaml # ArgoCD app for Nextcloud
Persistent Data (on host at /media/ssd)
| Path | Used by |
|---|---|
| /media/ssd/forgejo/forgejo-data | Forgejo repos, config, DB |
| /media/ssd/forgejo/runner-data | Runner registration + config |
| /media/ssd/registry | Docker image layers |
| /media/ssd/nextcloud | Nextcloud files |
| /media/ssd/mariadb | Nextcloud database |
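The directories in the table can be created in one pass, for example when preparing a replacement SSD. This is a sketch and not the code in deploy-infrastructure.sh: `ensure_data_dirs` is a hypothetical helper, and ownership and permissions are not handled here.

```shell
# Create the persistent data directories from the table under a base path.
# The base defaults to /media/ssd; pass another path to try it out safely.
ensure_data_dirs() {
  base="${1:-/media/ssd}"
  for dir in forgejo/forgejo-data forgejo/runner-data registry nextcloud mariadb; do
    mkdir -p "$base/$dir" || return 1
  done
}
```

Usage: `ensure_data_dirs` for the real mount point, or `ensure_data_dirs /tmp/ssd-test` for a dry run in a scratch location.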