cicd: initial kubernetes infrastructure

Szymon Wałachowski 2026-03-31 23:37:01 +02:00
commit 601a0328b8
28 changed files with 1385 additions and 0 deletions

RUNBOOK.md Normal file

@ -0,0 +1,228 @@
# Szymiserver Infrastructure Runbook
## Overview
This server runs a Kubernetes cluster (kind) hosting:
| Service | URL | Notes |
|---------|-----|-------|
| Forgejo | https://git.szymi.ddns.net | Git + CI/CD |
| ArgoCD | https://argocd.szymi.ddns.net | GitOps deployment |
| Registry | https://registry.szymi.ddns.net | Private Docker registry |
| Nextcloud | https://nextcloud.szymi.ddns.net | File storage |
| Draw.io | https://drawio.szymi.ddns.net | Diagrams |

All TLS certificates are automatically issued and renewed via Let's Encrypt (cert-manager).

Apps are deployed via **ArgoCD GitOps**: push to Forgejo → ArgoCD syncs automatically.

---
## Prerequisites
- Linux machine with Docker installed
- `kind` installed (`go install sigs.k8s.io/kind@latest`)
- `kubectl` installed
- Ports 80, 443, and 2222 open on the firewall and forwarded by the router to the server
- DNS A records for all subdomains pointing to the server's public IP
- `/media/ssd` mounted (SSD for persistent data)
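
The prerequisites above can be checked before starting. The following is a minimal sketch, not part of the repo: the function names `missing_tools` and `ssd_is_mounted` are made up for illustration, and `mountpoint` is assumed to be available (it ships with util-linux on most Linux distributions).

```bash
# Print each required tool that is not on PATH, one per line.
# With no arguments, checks the tools listed in the prerequisites.
missing_tools() {
  [ "$#" -gt 0 ] || set -- docker kind kubectl
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Succeed only if the given path (default: /media/ssd) is a real mount point,
# not just an empty directory on the root filesystem.
ssd_is_mounted() {
  mountpoint -q "${1:-/media/ssd}"
}

# Example:
#   missing_tools                 # prints nothing when docker, kind, kubectl exist
#   ssd_is_mounted || echo "/media/ssd is not mounted"
```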
---
## Fresh Install From Scratch
### Step 1: Clone the repo
```bash
git clone https://git.szymi.ddns.net/szymi/szymiserver.git
cd szymiserver
```
### Step 2: Create the kind cluster
```bash
kind create cluster --config cluster-config.yml
```
This creates a cluster named `szymicluster` with:
- Ports 80 and 443 exposed (for ingress HTTP/HTTPS)
- Port 2222 exposed (for Forgejo SSH)
- `/media/ssd` mounted into the kind container (persistent volumes)
- `/var/run/docker.sock` mounted (for Forgejo runner)
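
To confirm that the port mappings listed above actually took effect, the output of `docker port` for the node container can be inspected. This is a sketch under the assumption that the host-side ports are 80, 443, and 2222 as described; the helper name `missing_host_ports` is hypothetical.

```bash
# Read `docker port <container>` output on stdin and print every expected
# host port (80, 443, 2222) that is NOT published. No output means all are mapped.
missing_host_ports() {
  local mappings port
  mappings=$(cat)
  for port in 80 443 2222; do
    # Each mapping line ends in ":<host port>", e.g. "80/tcp -> 0.0.0.0:80".
    printf '%s\n' "$mappings" | grep -Eq ":${port}\$" || echo "$port"
  done
}
```

Typical use would be `docker port szymicluster-control-plane | missing_host_ports`. Only host-side ports are checked, because the runbook does not state which container ports they map to.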
### Step 3: Deploy infrastructure
```bash
chmod +x k8s/deploy-infrastructure.sh
./k8s/deploy-infrastructure.sh
```
**What the script does, in order:**
1. Fixes kind container DNS (forces IPv4 — prevents ImagePullBackOff on servers with broken IPv6)
2. Creates required host directories on `/media/ssd`
3. Applies all infrastructure via kustomize (ingress-nginx, cert-manager, CoreDNS, registry, ArgoCD, Forgejo)
4. Waits for cert-manager to be ready, then applies the ClusterIssuer
5. Waits for ingress-nginx to be ready
6. Restarts CoreDNS to pick up internal DNS config
7. Applies ArgoCD Application definitions (deploys apps via GitOps)
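
For orientation, steps 3 to 7 of the list above could look roughly like the sketch below. It is not the actual `deploy-infrastructure.sh`: the namespaces `cert-manager` and `ingress-nginx` are the upstream defaults and are assumed here, the ClusterIssuer file path is hypothetical, and the `run` and `deploy_infrastructure` names are made up for illustration.

```bash
# Run a command, or only print it when DRY_RUN=1 (handy for reviewing the sequence).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Steps 3-7: apply infrastructure, wait for controllers, restart CoreDNS, apply apps.
deploy_infrastructure() {
  run kubectl apply -k k8s/infrastructure || return 1
  # Assumed namespace: cert-manager
  run kubectl wait -n cert-manager --for=condition=Available deployment --all --timeout=300s || return 1
  # Hypothetical path; the real ClusterIssuer manifest may live elsewhere
  run kubectl apply -f k8s/infrastructure/cert-manager/cluster-issuer.yaml || return 1
  # Assumed namespace: ingress-nginx
  run kubectl wait -n ingress-nginx --for=condition=Available deployment --all --timeout=300s || return 1
  run kubectl rollout restart deployment coredns -n kube-system || return 1
  run kubectl apply -f k8s/argocd-apps/ || return 1
}
```

Calling `DRY_RUN=1 deploy_infrastructure` prints the six commands without touching the cluster, which makes the ordering easy to compare against the real script.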
### Step 4: Wait for pods and certificates
```bash
# Watch pods come up (takes 2-5 min)
kubectl get pods -A -w
# Check TLS certificates (takes 1-3 min after pods are up)
kubectl get certificates -A
```
All certificates should show `READY = True`.
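
When many certificates are listed, it can help to print only the ones that are not ready yet. The filter below is a small sketch; the function name `not_ready_certs` is hypothetical, and it assumes the default column order of `kubectl get certificates -A --no-headers` (namespace, name, ready, secret, age).

```bash
# Read `kubectl get certificates -A --no-headers` on stdin and print
# "namespace/name" for every certificate whose READY column is not "True".
not_ready_certs() {
  awk '$3 != "True" { print $1 "/" $2 }'
}
```

Typical use would be `kubectl get certificates -A --no-headers | not_ready_certs`. For any certificate it prints, `kubectl describe certificate <name> -n <namespace>` shows why issuance is still pending.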
### Step 5: Get ArgoCD admin password
```bash
kubectl -n argocd get secret argocd-initial-admin-secret \
-o jsonpath="{.data.password}" | base64 -d && echo
```
Log in at https://argocd.szymi.ddns.net as `admin` with the password printed above.

---
## Manual Steps After Fresh Install
These cannot be automated and must be done once:
### Forgejo: Create admin account
1. Open https://git.szymi.ddns.net
2. Complete the setup wizard (first user becomes admin)
3. Create the repositories: `szymiserver`, `nextcloud`, `drawio`
### Forgejo Runner: Register
The runner pod expects `/media/ssd/forgejo/runner-data/` to contain a registered config.
```bash
# Get a runner registration token from Forgejo:
# Site Administration → Actions → Runners → Create new Runner
# Register the runner (exec into the pod):
kubectl exec -it -n forgejo deployment/forgejo-runner -- \
forgejo-runner register \
--instance https://git.szymi.ddns.net \
--token <TOKEN_FROM_FORGEJO> \
--name szymiserver \
--no-interactive
```
---
## Day-to-Day Operations
### Deploying a new app via ArgoCD
1. Create Kubernetes manifests in your app repo (see `drawio` or `nextcloud` repos as examples)
2. Add the new hostname (here `newapp.szymi.ddns.net`) to the CoreDNS internal hosts:
```yaml
# k8s/infrastructure/coredns/coredns-custom.yaml
10.96.0.100 newapp.szymi.ddns.net
```
3. Create an ArgoCD Application in `k8s/argocd-apps/newapp.yaml`
4. Push to Forgejo and apply:
```bash
kubectl apply -f k8s/infrastructure/coredns/coredns-custom.yaml
kubectl rollout restart deployment coredns -n kube-system
kubectl apply -f k8s/argocd-apps/newapp.yaml
```
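Step 3 does not show what the Application definition contains. The sketch below renders a plausible one; it is an assumption, not a copy of `drawio.yaml` or `nextcloud.yaml`. The function name `render_argocd_app`, the `default` project, the automated sync policy, and the use of the app name as target namespace are all illustrative choices.

```bash
# Print an ArgoCD Application manifest for an app repo hosted on Forgejo.
# Usage: render_argocd_app <app-name> <repo-url> [path-to-manifests-in-repo]
render_argocd_app() {
  local name="$1" repo="$2" path="${3:-.}"
  cat <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ${name}
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ${repo}
    targetRevision: HEAD
    path: ${path}
  destination:
    server: https://kubernetes.default.svc
    namespace: ${name}
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
EOF
}
```

For example, `render_argocd_app newapp https://git.szymi.ddns.net/szymi/newapp.git k8s > k8s/argocd-apps/newapp.yaml` would write the file used in step 4. The repository URL and the `k8s` path are placeholders.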
### After server restart (if pods have ImagePullBackOff)
The kind node container loses its DNS configuration when the server restarts, so image pulls fail until a working resolver is restored:
```bash
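# Point the kind node at a public resolver again
docker exec szymicluster-control-plane bash -c 'echo "nameserver 8.8.8.8" > /etc/resolv.conf'
# Recreate the ArgoCD pods so their image pulls are retried
kubectl delete pods -n argocd --all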
```
### Checking overall health
```bash
# Pods that are not Running or Completed
kubectl get pods -A | grep -v Running | grep -v Completed
# Certificate status
kubectl get certificates -A
# ArgoCD app sync status
kubectl get applications -n argocd
```
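The `grep -v Running` filter above hides pods that are `Running` but not ready (for example `0/1`). A slightly stricter filter is sketched below; the function name `unhealthy_pods` is hypothetical, and it assumes the default column order of `kubectl get pods -A --no-headers` (namespace, name, ready, status, ...).

```bash
# Read `kubectl get pods -A --no-headers` on stdin and print pods that need
# attention: any status other than Running/Completed, or Running pods where
# fewer containers are ready than expected.
unhealthy_pods() {
  awk '
    $4 == "Completed" { next }
    {
      split($3, ready, "/")
      if ($4 != "Running" || ready[1] != ready[2])
        print $1 "/" $2 " (" $4 ", ready " $3 ")"
    }
  '
}
```

Typical use would be `kubectl get pods -A --no-headers | unhealthy_pods`; empty output means every pod is either completed or fully ready.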
### Forcing ArgoCD to re-sync
```bash
kubectl patch application <app-name> -n argocd \
--type merge -p '{"metadata":{"annotations":{"argocd.argoproj.io/refresh":"hard"}}}'
```
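The annotation above asks ArgoCD for a hard refresh of one application; with automated sync enabled, ArgoCD then syncs whatever drift it finds. To do the same for every application at once, a loop like the following sketch could be used. The function name `refresh_all_apps` is made up for illustration.

```bash
# Request a hard refresh for every ArgoCD Application in the argocd namespace.
refresh_all_apps() {
  kubectl get applications -n argocd -o name | while read -r app; do
    # "$app" has the form "application.argoproj.io/<name>"
    kubectl annotate "$app" -n argocd \
      argocd.argoproj.io/refresh=hard --overwrite </dev/null
  done
}
```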
---
## Architecture
```
Internet
    ↓
Router (ports 80, 443, 2222 → server)
    ↓
Linux Host (/media/ssd for persistent data)
    ↓
kind container (szymicluster-control-plane)
    ├── ingress-nginx   ← routes all HTTP/HTTPS traffic
    ├── cert-manager    ← issues Let's Encrypt TLS certs
    ├── CoreDNS         ← internal DNS (routes *.szymi.ddns.net to ingress)
    ├── ArgoCD          ← watches Forgejo, deploys apps automatically
    ├── Forgejo         ← git server + CI/CD runner
    ├── Registry        ← private Docker image registry
    ├── Nextcloud       ← deployed via ArgoCD
    └── Draw.io         ← deployed via ArgoCD
```
### Why CoreDNS has internal hosts
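cert-manager verifies domain ownership with the HTTP-01 challenge, and it first checks from inside the cluster that the challenge URL is reachable. Inside the cluster, `*.szymi.ddns.net` would normally resolve to the server's public IP, and traffic sent there from inside the network is not routed back to the server. CoreDNS therefore overrides these names so they point directly at ingress-nginx's fixed ClusterIP (`10.96.0.100`).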
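
Whether the override is active can be checked by resolving a hostname from inside the cluster. The sketch below uses a throwaway pod; the function name `resolves_to_ingress` and the `busybox:1.36` image are assumptions made for illustration.

```bash
# Resolve a hostname from inside the cluster with a temporary busybox pod and
# succeed only if the answer contains the expected ingress ClusterIP.
# Usage: resolves_to_ingress <hostname> [expected-ip]
resolves_to_ingress() {
  local host="$1" ip="${2:-10.96.0.100}"
  kubectl run "dns-check-$$" --rm -i --restart=Never --quiet \
    --image=busybox:1.36 -- nslookup "$host" 2>/dev/null | grep -Fqw "$ip"
}

# Example:
#   resolves_to_ingress git.szymi.ddns.net && echo "internal DNS override active"
```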

---
## File Structure
```
szymiserver/
├── cluster-config.yml                # kind cluster definition (run once)
├── RUNBOOK.md                        # this file
└── k8s/
    ├── deploy-infrastructure.sh      # main deployment script
    ├── infrastructure/
    │   ├── kustomization.yaml        # applies all infrastructure
    │   ├── coredns/                  # internal DNS overrides
    │   ├── ingress-nginx/            # ingress controller
    │   ├── cert-manager/             # TLS certificates
    │   ├── argocd/                   # GitOps controller
    │   ├── forgejo/                  # git server + runner
    │   └── registry/                 # private Docker registry
    └── argocd-apps/
        ├── drawio.yaml               # ArgoCD app for draw.io
        └── nextcloud.yaml            # ArgoCD app for Nextcloud
```
---
## Persistent Data (on host at /media/ssd)
| Path | Used by |
|------|---------|
| `/media/ssd/forgejo/forgejo-data` | Forgejo repos, config, DB |
| `/media/ssd/forgejo/runner-data` | Runner registration + config |
| `/media/ssd/registry` | Docker image layers |
| `/media/ssd/nextcloud` | Nextcloud files |
| `/media/ssd/mariadb` | Nextcloud database |
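
The deployment script creates these directories on the host. If they ever need to be recreated by hand (for example on a replacement SSD), something like the sketch below would do it. The function name `create_data_dirs` is hypothetical, and ownership and permissions are left out because the runbook does not state them.

```bash
# Create the data directories from the table under a given root
# (default: /media/ssd). Existing directories are left untouched.
create_data_dirs() {
  local root="${1:-/media/ssd}"
  mkdir -p \
    "$root/forgejo/forgejo-data" \
    "$root/forgejo/runner-data" \
    "$root/registry" \
    "$root/nextcloud" \
    "$root/mariadb"
}
```

Passing a different root, such as `create_data_dirs /tmp/ssd-test`, allows a dry run without touching the real data disk.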