This is what the cluster looks like:
What it's made of:
- 3 Raspberry Pi 4 (8GB)
- 1 gigabit ethernet switch (5 ports)
- 1 1TB Lexar ES3 USB SSD
- 1 80mm fan
- 3 very short cat6 ethernet cables
- a 3d printed rack
- some m3 threaded inserts and screws
The rack is a remix of this one. I've included the STLs that I remixed/designed, namely the vented sleds for the Pi 4s and the SSD, and the side fan mount.
Here is a top view diagram of the main components:
This is the repo that governs almost all of the cluster. The bootstrapping is done using ansible, against 3 SSH-reachable machines (the Pi 4s in this case).
From here, Flux will create everything that is declared in k8s/, decrypt what's secret using a private key, and keep the stack in sync.
In k8s/ there are 2 main folders:
- infra, which contains what's needed for the cluster to function:
  - an NFS provisioner as a StorageClass,
  - IngressControllers with Traefik: a private one (listening on the local LAN) and a "public" one (routing specific subdomains coming from Cloudflare),
  - cert-manager for certificate management of my domain,
  - a Cloudflare tunnel for exposing part of my services to the outside world,
  - the tailscale-operator, for accessing my private services from anywhere (using a subnet route) and for letting my cluster services reach my offsite backup server,
  - system-upgrade-controller for managing k8s upgrades directly in the cluster using CRDs,
  - a renovate cronjob that creates PRs for component updates (with auto-merging for patch-level updates),
  - a restic cronjob that creates the local backup (in case an app fails and borks its files) and the remote backup (in case the server catches fire).
- apps, the actual services running on the cluster:
  - adguard for DNS/DHCP
  - gitea for local git and CI/CD
  - paperless-ngx for my important files
  - immich for photo backups and sync
  - vaultwarden as my password manager
  - filebrowser for file sharing
  - glance as my internet homepage
  - kromgo for exposing prom stats publicly
  - octoprint for controlling my 3D printer
  - pocketid as an OIDC provider
  - atuin for my centralized shell history
  - grafana + prometheus + loki for monitoring
  - and some other stuff like a blog, static sites, etc.
There is also an appchart folder: a Helm chart that eases the deployment of simple services.
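For reference, Flux reconciles these folders through Kustomization objects. A minimal sketch of what the one for infra could look like (name, path, interval and file location are assumptions, not taken from the repo):
# hypothetical example, not the actual manifest from the repo
cat k8s/flux/infra.yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infra
  namespace: flux-system
spec:
  interval: 10m
  path: ./k8s/infra
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  decryption:
    provider: sops
    secretRef:
      name: sops-age
The decryption block is what lets Flux use the sops-age key (created later during bootstrap) to decrypt secrets on the fly.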
I try to adhere to GitOps/automation principles. Some things aren't automated, but they are mostly toil (one-time tasks during setup). 95% of the infrastructure should be deployable by following these instructions (assuming the data and encryption keys are known).
Requirements and basic stack:
- ansible: infrastructure automation
- flux: cluster state mgmt
- sops + age: encryption
- git: change management
brew install git ansible fluxcd/tap/flux sops age
This assumes you have the decryption key age.agekey, and the env var configured:
SOPS_AGE_KEY_FILE=age.agekey
If you want to encrypt an already created file (e.g. a k8s Secret spec):
sops encrypt -i <file.yaml>
If you want to edit an encrypted file in place (e.g. modify a value in an encrypted Secret/ConfigMap) using $EDITOR:
sops <file.yaml>
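A .sops.yaml at the root of the repo typically tells sops which age key to use and which fields to encrypt, so the commands above need no extra flags. A minimal example (the regexes and key below are placeholders, not this repo's actual rules):
# hypothetical rules, not the repo's actual .sops.yaml
cat .sops.yaml
creation_rules:
  - path_regex: k8s/.*\.yaml
    encrypted_regex: ^(data|stringData)$
    age: <age public key>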
It is assumed that SSH key auth is configured on the nodes (ssh-copy-id), with passwordless sudo (<user> ALL=(ALL) NOPASSWD: ALL in visudo).
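For example, both prerequisites can be set up like this (replace <user> and <node_ip>; a drop-in under /etc/sudoers.d is just an alternative to editing visudo directly):
# copy your ssh public key to each node
ssh-copy-id <user>@<node_ip>
# then, on each node, allow passwordless sudo for <user>
echo '<user> ALL=(ALL) NOPASSWD: ALL' | sudo tee /etc/sudoers.d/<user>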
cd ansible
ansible-playbook -i inventory.yaml -l lampone cluster-install.yaml
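The lampone group passed to -l comes from ansible/inventory.yaml. A sketch of what such an inventory could contain (hostnames and IPs below are made up, not the repo's actual values):
# hypothetical inventory, not the repo's actual file
cat inventory.yaml
all:
  children:
    lampone:
      hosts:
        pi1:
          ansible_host: 192.168.1.11
        pi2:
          ansible_host: 192.168.1.12
        pi3:
          ansible_host: 192.168.1.13
    # a staging group (see the staging section below) is defined the same way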
- Get a GitHub token and set it as an env var:
export GITHUB_TOKEN=xxx
- Enter some commands:
# pre-create the decryption key secret
kubectl create ns flux-system
kubectl create secret generic sops-age --namespace=flux-system --from-file=age.agekey
# bootstrap flux
flux bootstrap github \
--owner=k0rventen \
--repository=lampone \
--branch=main \
--path=./k8s/flux
- Things should start to deploy! :magic:
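To check that the bootstrap went through:
# the GitRepository and Kustomizations should reconcile
flux get sources git
flux get kustomizations
# and workloads should start appearing
kubectl get pods -A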
I try to follow a 3-2-1 backup rule. The 'live' data is on the NFS SSD. It's backed up daily onto the same SSD (mainly for rollbacks and potential local re-deployments). For disaster-recovery situations, it's also backed up daily onto an HDD offsite, which can be accessed through my tailnet.
The backup tool is restic. It's deployed as a cronjob in the cluster. The image used runs a custom script that performs both the local restic backup and the remote one (which requires commands before and after to mount the external disk). Here are the commands used to create the restic repos before deploying the cronjob:
- local repo
cd /nfs
restic init nfs-backups
- remote repo
Create a mnt-backup.mount systemd mount unit on the remote server to mount/umount the backup disk:
coco@remote_server:~ $ cat /etc/systemd/system/mnt-backup.mount
[Unit]
Description=Restic Backup External Disk mount
[Mount]
What=/dev/disk/by-label/backup
Where=/mnt/backup
Type=ext4
Options=defaults
[Install]
WantedBy=multi-user.target
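After creating the unit on the remote server, reload systemd and check that the disk mounts cleanly before letting the backup job drive it:
sudo systemctl daemon-reload
sudo systemctl start mnt-backup.mount
findmnt /mnt/backup
sudo systemctl stop mnt-backup.mount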
Init the repo from the NFS server (this assumes passwordless SSH auth):
restic init -r sftp:<remote_server_ip>:/mnt/backup/nfs-backups
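The script baked into the cronjob image isn't reproduced here, but conceptually it does something like this (a sketch only: the paths, hostnames and data directory are assumptions, and the repo password is expected via RESTIC_PASSWORD):
#!/bin/sh
# sketch of the backup script run by the cronjob (hypothetical paths/hosts)
# RESTIC_PASSWORD is expected in the environment
set -e

# local backup, onto the same ssd
restic -r /nfs/nfs-backups backup /nfs/data

# remote backup: mount the external disk, push over sftp, unmount
ssh coco@remote_server "sudo systemctl start mnt-backup.mount"
restic -r sftp:coco@remote_server:/mnt/backup/nfs-backups backup /nfs/data
ssh coco@remote_server "sudo systemctl stop mnt-backup.mount"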
A staging environment can be deployed using vagrant:
# macOS
brew tap hashicorp/tap
brew install hashicorp/tap/vagrant
# debian/ubuntu
sudo apt install virtualbox vagrant --no-install-recommends
Then create the staging env:
# launch
vagrant up
# add the nodes ssh config
vagrant ssh-config >> $HOME/.ssh/config
# deploy the cluster
cd ansible
ansible-playbook -i inventory.yaml -l staging cluster-install.yaml
# get the kubectl config
cd ..
vagrant ssh -c "kubectl config view --raw" staging-master > $HOME/.kube/configs/staging
# test
kubectl get nodes
Then bootstrap the cluster using flux from this section, ideally using a develop branch.
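For example, reusing the earlier bootstrap command against a develop branch:
flux bootstrap github \
  --owner=k0rventen \
  --repository=lampone \
  --branch=develop \
  --path=./k8s/flux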