Keycloak deployment¶
Runbook for deploying Keycloak on the Hetzner cloud instance designated as auth.golemtrust.am. Written after the sticky note incident. Adora Belle requires this document be kept current. It is not optional.
Prerequisites¶
Hetzner cloud account with the Finland region available (Helsinki, hel1)
A provisioned CX31 instance running Debian 12 (Bookworm)
Root or sudo access via SSH key (no password login; Ponder disabled it after a conversation with Vimes)
A DNS A record pointing auth.golemtrust.am at the instance’s public IP
Certbot installed and configured with the Cloudflare DNS plugin for wildcard certificates
System preparation¶
Update the base system before touching anything else:
apt update && apt upgrade -y
apt install -y curl wget gnupg2 unzip nginx certbot python3-certbot-dns-cloudflare
Create a dedicated system user. Keycloak should not run as root. It has been explained to Mr. Pump that this applies to golems as well, though the conversation was one-sided:
useradd --system --home /opt/keycloak --shell /bin/false keycloak
Java runtime¶
Keycloak 26.x requires Java 21. Install the Temurin distribution from Adoptium:
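mkdir -p /etc/apt/keyrings
wget -qO - https://packages.adoptium.net/artifactory/api/gpg/key/public | gpg --dearmor -o /etc/apt/keyrings/adoptium.gpg
echo "deb [signed-by=/etc/apt/keyrings/adoptium.gpg] https://packages.adoptium.net/artifactory/deb $(awk -F= '/^VERSION_CODENAME/{print$2}' /etc/os-release) main" \
> /etc/apt/sources.list.d/adoptium.list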
apt update && apt install -y temurin-21-jdk
java -version
Keycloak installation¶
Download the latest release. Check the Keycloak releases page for the current version and update the URL accordingly:
cd /opt
wget https://github.com/keycloak/keycloak/releases/download/26.1.4/keycloak-26.1.4.tar.gz
tar xzf keycloak-26.1.4.tar.gz
mv keycloak-26.1.4 keycloak
chown -R keycloak:keycloak /opt/keycloak
rm keycloak-26.1.4.tar.gz
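Before unpacking, it may be worth checking the archive's integrity. The sketch below assumes the releases page publishes a keycloak-26.1.4.tar.gz.sha1 file alongside the tarball (confirm this on the releases page before relying on it). verify_sha1 is a hypothetical helper for this runbook, not something shipped with Keycloak:

```shell
# verify_sha1 FILE EXPECTED_HEX
# Returns 0 when the SHA-1 digest of FILE equals EXPECTED_HEX, non-zero otherwise.
verify_sha1() {
    file="$1"
    # Normalise to lower case; published digests are sometimes upper-case.
    expected="$(printf '%s' "$2" | tr 'A-F' 'a-f')"
    actual="$(sha1sum "$file" | cut -d' ' -f1)"
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Example, run in /opt before the tar step (the .sha1 URL is an assumption):
#   wget https://github.com/keycloak/keycloak/releases/download/26.1.4/keycloak-26.1.4.tar.gz.sha1
#   verify_sha1 keycloak-26.1.4.tar.gz "$(cut -d' ' -f1 keycloak-26.1.4.tar.gz.sha1)" \
#       || echo "checksum mismatch, do not unpack" >&2
```

Only the first field of the .sha1 file is compared, so the check works whether the file holds a bare digest or a digest followed by a file name.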
Database configuration¶
Keycloak connects to the PostgreSQL instance on db.golemtrust.am. The database and user should already exist (see the PostgreSQL backend runbook). Set the connection details in the Keycloak configuration file:
cat > /opt/keycloak/conf/keycloak.conf << 'EOF'
db=postgres
db-url=jdbc:postgresql://db.golemtrust.am:5432/keycloak
db-username=keycloak
db-password=<retrieve from Vaultwarden, collection: Infrastructure>
hostname=auth.golemtrust.am
http-enabled=false
https-certificate-file=/etc/letsencrypt/live/auth.golemtrust.am/fullchain.pem
https-certificate-key-file=/etc/letsencrypt/live/auth.golemtrust.am/privkey.pem
proxy-headers=xforwarded
EOF
The password is not to be written anywhere except Vaultwarden. This runbook exists precisely because of what happens when passwords are written somewhere else.
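Once the real password has been filled in, keycloak.conf is itself a secret. A minimal sketch of tightening its permissions follows; the root:keycloak ownership and mode 640 are assumptions about local policy, and restrict_conf is a hypothetical helper name:

```shell
# restrict_conf FILE [OWNER:GROUP]
# Sets FILE to mode 640 and, if given, changes its ownership first.
restrict_conf() {
    file="$1"
    owner="${2:-}"
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    if [ -n "$owner" ]; then
        chown "$owner" "$file" || return 1
    fi
    # Owner read/write, group read, nothing for anyone else.
    chmod 640 "$file"
}

# Example (as root): the keycloak service user reads the file through its group.
#   restrict_conf /opt/keycloak/conf/keycloak.conf root:keycloak
```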
TLS certificate¶
Obtain a certificate using the Cloudflare DNS challenge. Place the Cloudflare API token in /etc/cloudflare.ini with permissions 600:
certbot certonly \
--dns-cloudflare \
--dns-cloudflare-credentials /etc/cloudflare.ini \
-d auth.golemtrust.am \
--email ponder@golemtrust.am \
--agree-tos \
--non-interactive
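The credentials file passed to --dns-cloudflare-credentials can be created along these lines. This is a sketch: write_cloudflare_ini is a hypothetical helper, and it reads the token from standard input so the token does not end up in the process list or shell history:

```shell
# write_cloudflare_ini PATH
# Reads one line (the API token) from stdin and writes PATH with mode 600.
write_cloudflare_ini() {
    path="$1"
    token=""
    IFS= read -r token || true
    [ -n "$token" ] || { echo "empty token" >&2; return 1; }
    (
        # Create the file without group/other access from the start.
        umask 077
        printf 'dns_cloudflare_api_token = %s\n' "$token" > "$path"
    ) || return 1
    # Also fix the mode if the file already existed with looser permissions.
    chmod 600 "$path"
}

# Example (as root), pasting the token when prompted:
#   write_cloudflare_ini /etc/cloudflare.ini
```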
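Add a cron job to root's crontab for renewal. The deploy hook runs only when a certificate is actually renewed; it reloads Nginx and restarts Keycloak, because the systemd unit in this runbook defines no reload action:
0 3 * * * certbot renew --quiet --deploy-hook "systemctl reload nginx && systemctl restart keycloak"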
Nginx reverse proxy¶
Keycloak handles TLS directly in production mode, but Nginx sits in front to handle rate limiting and to expose port 443. Create /etc/nginx/sites-available/keycloak:
server {
    listen 80;
    server_name auth.golemtrust.am;
    return 301 https://$host$request_uri;
}

server {
    listen 443 ssl;
    server_name auth.golemtrust.am;
    ssl_certificate /etc/letsencrypt/live/auth.golemtrust.am/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/auth.golemtrust.am/privkey.pem;

    location / {
        proxy_pass https://localhost:8443;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;
    }
}
ln -s /etc/nginx/sites-available/keycloak /etc/nginx/sites-enabled/
nginx -t && systemctl reload nginx
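The site file above does not itself contain any rate limiting. One possible way to add it is sketched here; the zone name keycloak_limit, the 10 requests per second rate, and the burst of 20 are assumed values, not figures from this runbook, and write_rate_limit_zone is a hypothetical helper:

```shell
# write_rate_limit_zone PATH
# Writes an http-level limit_req_zone definition to PATH.
write_rate_limit_zone() {
    cat > "$1" << 'EOF'
# 10 MB of shared state keyed by client address, 10 requests/second each.
limit_req_zone $binary_remote_addr zone=keycloak_limit:10m rate=10r/s;
EOF
}

# Example (as root); Debian's nginx.conf includes conf.d/*.conf in the http block:
#   write_rate_limit_zone /etc/nginx/conf.d/keycloak-ratelimit.conf
# Then add this line inside the existing "location /" block:
#   limit_req zone=keycloak_limit burst=20 nodelay;
# and apply with:
#   nginx -t && systemctl reload nginx
```

The zone has to be declared at http level, which is why it goes in a separate file rather than inside the server block.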
Build and start¶
Run the build step once to optimise the Keycloak configuration for production:
sudo -u keycloak /opt/keycloak/bin/kc.sh build
Create the systemd unit at /etc/systemd/system/keycloak.service:
[Unit]
Description=Keycloak identity provider
After=network.target postgresql.service
[Service]
User=keycloak
Group=keycloak
ExecStart=/opt/keycloak/bin/kc.sh start --optimized
Restart=on-failure
RestartSec=10
LimitNOFILE=65536
Environment=KC_BOOTSTRAP_ADMIN_USERNAME=admin
Environment=KC_BOOTSTRAP_ADMIN_PASSWORD=<retrieve from Vaultwarden, collection: Infrastructure>
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
systemctl enable keycloak
systemctl start keycloak
Initial realm configuration¶
Wait approximately two minutes for Keycloak to complete first startup, then log in at https://auth.golemtrust.am with the admin credentials.
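Rather than timing the two minutes by hand, the wait can be polled. The sketch below is a generic retry loop; wait_until is a hypothetical helper, and using the public /realms/master endpoint as the readiness probe is an assumption:

```shell
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND every INTERVAL_SECONDS until it succeeds (returns 0)
# or roughly TIMEOUT_SECONDS of waiting has passed (returns 1).
wait_until() {
    timeout_s="$1"
    interval_s="$2"
    shift 2
    elapsed=0
    while ! "$@" > /dev/null 2>&1; do
        elapsed=$((elapsed + interval_s))
        if [ "$elapsed" -gt "$timeout_s" ]; then
            return 1
        fi
        sleep "$interval_s"
    done
    return 0
}

# Example: give Keycloak up to three minutes, probing every five seconds.
#   wait_until 180 5 curl -fsS https://auth.golemtrust.am/realms/master \
#       || journalctl -u keycloak -n 100
```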
Create two realms:
golemtrust-internal for employees and golems. Configure:
Password policy: minimum 16 characters, at least one symbol, no dictionary words, expiry 90 days
OTP required for all human accounts
Session idle timeout: 8 hours
Session max: 24 hours
golemtrust-customer for the customer portal. Configure:
Password policy: minimum 12 characters, no expiry (customers find mandatory rotation confusing and simply increment a number)
OTP available but not required by default; Adora Belle is considering making it mandatory after the Dolly Sisters incident
Registration flow enabled, with email verification
Session idle timeout: 2 hours
Hardware authentication devices from Überwald are enrolled by Ponder manually per employee. There is no self-service enrolment until the second device policy is in place.
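The realm settings above can also be applied from the command line with the bundled kcadm.sh. The following is a sketch under assumptions: it covers only length, symbol, expiry, registration and session settings; the dictionary-word rule and the OTP requirement still have to be configured separately. The function names and the KCADM variable are hypothetical:

```shell
# Path to the admin CLI; override KCADM to point elsewhere.
KCADM="${KCADM:-/opt/keycloak/bin/kcadm.sh}"

# Realm session settings are expressed in seconds.
hours_to_seconds() {
    echo $(( $1 * 3600 ))
}

create_internal_realm() {
    "$KCADM" create realms \
        -s realm=golemtrust-internal \
        -s enabled=true \
        -s "passwordPolicy=length(16) and specialChars(1) and forceExpiredPasswordChange(90)" \
        -s "ssoSessionIdleTimeout=$(hours_to_seconds 8)" \
        -s "ssoSessionMaxLifespan=$(hours_to_seconds 24)"
}

create_customer_realm() {
    "$KCADM" create realms \
        -s realm=golemtrust-customer \
        -s enabled=true \
        -s "passwordPolicy=length(12)" \
        -s registrationAllowed=true \
        -s verifyEmail=true \
        -s "ssoSessionIdleTimeout=$(hours_to_seconds 2)"
}

# Example: authenticate first (kcadm.sh prompts for the password), then create.
#   "$KCADM" config credentials --server https://auth.golemtrust.am --realm master --user admin
#   create_internal_realm && create_customer_realm
```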
Verification¶
curl -s https://auth.golemtrust.am/realms/golemtrust-internal/.well-known/openid-configuration \
| python3 -m json.tool | head -20
The response should include issuer, authorization endpoint, and token endpoint URIs. If it does not, check journalctl -u keycloak -n 100 and proceed to the troubleshooting section.
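The same check can be made without reading the output by eye. This sketch assumes the standard OpenID Connect discovery key names; check_discovery is a hypothetical helper that only tests for the presence of the keys, not the correctness of their values:

```shell
# check_discovery
# Reads a discovery document on stdin; returns 0 if all three keys are present.
check_discovery() {
    doc="$(cat)"
    for key in issuer authorization_endpoint token_endpoint; do
        # Match the quoted key exactly, so "token_endpoint_auth_..." does not count.
        if ! printf '%s' "$doc" | grep -q "\"$key\""; then
            echo "missing key: $key" >&2
            return 1
        fi
    done
    return 0
}

# Example:
#   curl -s https://auth.golemtrust.am/realms/golemtrust-internal/.well-known/openid-configuration \
#       | check_discovery && echo "discovery document looks complete"
```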
Troubleshooting¶
If Keycloak fails to start, the most common causes in order of frequency are:
The database is unreachable. Check db.golemtrust.am is up and the credentials are correct.
The TLS certificate path is wrong. ls -la /etc/letsencrypt/live/auth.golemtrust.am/ and compare with keycloak.conf.
Java version mismatch. java -version should report 21.
Port 8443 is already in use. ss -tlnp | grep 8443.
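The Java check in the list above can be scripted. A small sketch, with java_major as a hypothetical helper; note that java -version writes to standard error, and that pre-9 version strings such as "1.8.0" would report 1:

```shell
# java_major
# Reads `java -version` output on stdin and prints the major version number.
java_major() {
    sed -n 's/.* version "\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example:
#   [ "$(java -version 2>&1 | java_major)" = "21" ] || echo "Java 21 not active" >&2
```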
Mr. Pump does not experience authentication failures. If the golem authentication provider is failing, see the golem authentication runbook.