Key management
TL;DR: Three signing-key stores: keyfile (default, PEM on disk, single-instance), postgres_key (encrypted at rest in the DB, multi-instance via LISTEN/NOTIFY), and vault_transit (private key never leaves Vault, HSM-grade). Rotation is zero-downtime: the new key becomes current, and the old key stays in the JWKS until the tokens it signed have expired. The SDK JWKS cache picks up new keys within 5 min by default. Losing the private key makes every outstanding JWT unverifiable, so back up keyfile deployments religiously.
Lifecycle
Generate → Active (signing + verification) → Rotated (verification only) → Removed from JWKS
- Generate: first boot creates the key automatically. Nothing to configure.
- Active: the current key signs all new JWTs. Its public key is in the JWKS.
- Rotated: after authserver admin key rotate, a new key takes over signing. The old key stays in the JWKS so existing tokens still verify.
- Removed: after every token signed with the old key has expired, the old key leaves the JWKS.
Default retention of the “previous” key = dcr.default_token_expiry (15 min) + some grace period — the old key stays around long enough for in-flight tokens to complete but not indefinitely.
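To make the retention rule concrete, here is a minimal Python sketch that computes the earliest moment a rotated key could leave the JWKS. The 15-minute expiry comes from the text above; the 5-minute grace period and the function name are illustrative assumptions, not documented AuthPlane defaults.

```python
from datetime import datetime, timedelta, timezone

TOKEN_EXPIRY = timedelta(minutes=15)  # dcr.default_token_expiry, per the text
GRACE_PERIOD = timedelta(minutes=5)   # assumed value; the real grace period is not stated


def earliest_removal(rotated_at: datetime,
                     token_expiry: timedelta = TOKEN_EXPIRY,
                     grace: timedelta = GRACE_PERIOD) -> datetime:
    """Return the first moment the rotated key is no longer needed for verification."""
    # A token issued just before rotation lives for token_expiry; grace is added on top.
    return rotated_at + token_expiry + grace


rotated = datetime(2026, 7, 1, 0, 0, tzinfo=timezone.utc)
removal = earliest_removal(rotated)
print(removal.isoformat())
```

With the assumed grace period, a key rotated at midnight UTC would be safe to drop 20 minutes later.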
Storage backends
keyfile (default) — single-instance
PEM files on disk. Simplest option; works well for single-instance deployments.
signing:
  algorithm: ES256
  key_store: keyfile
  key_path: /var/lib/authserver/keys
Directory contains:
- current.pem: active signing key (private + public)
- rotated-<timestamp>.pem: previous keys kept for verification
Watch out for:
- File permissions: only the authserver process user should be able to read the key files. Use chmod 600 and make the authserver user the owner. Anyone else who can read them can forge tokens.
- Backups: back up the keys/ directory. Losing the private key makes every outstanding JWT unverifiable (MCP servers reject them until users re-authenticate).
- Docker: use a named volume (authserver-data) so keys persist across container restarts. Without a volume, key rotation state resets on restart.
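The permission rule can be checked mechanically. The following is a small Python sketch, not part of AuthPlane, that lists PEM files granting anything beyond owner read/write. The function name is hypothetical, it assumes a POSIX filesystem, and it does not check file ownership.

```python
import os
import stat
from pathlib import Path


def insecure_key_files(key_dir: str) -> list[str]:
    """Return names of PEM files in key_dir whose mode allows more than owner read/write."""
    bad = []
    for pem in sorted(Path(key_dir).glob("*.pem")):
        mode = stat.S_IMODE(pem.stat().st_mode)
        # 0o177 covers owner-execute plus every group and other bit.
        if mode & 0o177:
            bad.append(pem.name)
    return bad
```

Running it against the configured key_path (for example `insecure_key_files("/var/lib/authserver/keys")`) should return an empty list on a correctly locked-down host.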
postgres_key — multi-instance (HA)
Multiple AuthPlane instances behind a load balancer need to share the signing key. postgres_key stores encrypted keys in the DB, so all instances read from the same place, and LISTEN/NOTIFY on the signing_key_change channel propagates rotations across instances in milliseconds.
signing:
  algorithm: ES256
  key_store: postgres_key
Requires storage.driver: postgres AND data_encryption configured (AES-256-GCM or Vault Transit). The signing key is encrypted with the data-encryption driver before being written to signing_keys.
vault_transit — maximum security
Keys are managed by HashiCorp Vault’s Transit engine. Private key never leaves Vault — AuthPlane sends the JWT payload to Vault for signing, gets back the signature.
signing:
  algorithm: ES256
  key_store: vault_transit
  vault_transit:
    address: https://vault:8200
    mount: transit
    key_name: authserver-signing
    approle:
      role_id: "..."
      secret_id: "..."
Full setup + Vault CLI walkthrough in Operate: Vault Transit signing.
Cost: ~2 ms latency per signing operation (Vault round trip). Fine for /oauth/token throughput on any reasonable Vault deployment.
Benefits: keys never touch disk, Vault audit log records every sign, FIPS compliance possible with Vault-backed HSM.
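The remote-signing flow can be sketched in Python as follows. This is an illustration of the general pattern only: the signer is an injected callable, `fake_sign` is a stand-in that produces no valid signature, and none of the names below are AuthPlane or Vault APIs.

```python
import base64
import json
from typing import Callable


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as compact JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_jwt(claims: dict, kid: str, sign: Callable[[bytes], bytes]) -> str:
    """Assemble a compact JWT, delegating the signature to an external signer."""
    header = {"alg": "ES256", "typ": "JWT", "kid": kid}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    # Only the signing input crosses the boundary; the private key stays remote.
    signature = sign(signing_input.encode("ascii"))
    return f"{signing_input}.{b64url(signature)}"


def fake_sign(message: bytes) -> bytes:
    """Stand-in signer for demonstration; a real deployment would call Vault here."""
    return b"\x00" * 64  # placeholder bytes, not a real ES256 signature


token = build_jwt({"sub": "user-1"}, "kid_new_abc", fake_sign)
print(token.count("."))
```

Keeping the signer behind a callable means the token-assembly code is identical whether the key is local or held in a remote service.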
Rotation
When to rotate
- Calendar cadence: every 90 days as part of key-hygiene policy.
- After suspected exposure: backup loss, stolen disk, or the departure of an operator who had access to keys/.
- During algorithm migration: ES256 ↔ RS256. Rotate to switch algorithms gradually; old tokens keep verifying until they expire.
How to rotate
Rotation is a zero-downtime hot operation. The previous key stays in the JWKS document so outstanding tokens remain verifiable; new tokens are signed with the new key from the first /oauth/token call after rotation.
# Via Admin API
curl -X POST http://localhost:9001/admin/keys/rotate \
  -H "Authorization: Bearer $AUTHPLANE_ADMIN_API_KEY"
# Or via CLI
authserver admin key rotate
Response:
{
  "current_kid": "kid_new_abc",
  "previous_kid": "kid_old_xyz",
  "rotated_at": "2026-07-01T00:00:00Z"
}
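For automation, the same Admin API call can be made from Python's standard library. The sketch below mirrors the curl command and the response shape shown above; the helper names are hypothetical, and error handling is left out.

```python
import json
import urllib.request


def build_rotate_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build the POST request for the rotate endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/admin/keys/rotate",
        method="POST",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def parse_rotate_response(body: bytes) -> tuple[str, str]:
    """Extract (current_kid, previous_kid) from the JSON response body."""
    data = json.loads(body)
    return data["current_kid"], data["previous_kid"]


# To actually send it (requires a running server):
#   with urllib.request.urlopen(build_rotate_request(url, key)) as resp:
#       current, previous = parse_rotate_response(resp.read())
```

Logging both key IDs after each rotation gives operators a record of which key signed tokens in any given period.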
What happens under the hood
- A new key pair is generated and persisted (keyfile writes to key_path; vault_transit and postgres_key write to their backends).
- The new key becomes current; the previously current key is demoted to previous. Older keys are removed from the JWKS so the key-compromise blast radius is bounded.
- /.well-known/jwks.json now includes both the current and previous keys.
- Verifiers that cache the JWKS pick up the new key on their next refresh (SDKs default to a 5 min cache TTL, configurable via jwks_refresh_seconds).
- All tokens signed after rotation carry the new kid. In-flight tokens signed by the previous key continue to verify until they expire.
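The current/previous slot behavior described in these steps can be modeled in a few lines of Python. This is a toy model for intuition, not AuthPlane's implementation: key IDs stand in for real key material and the class name is made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeySet:
    """Toy model of the two published slots: current and previous."""
    current: str
    previous: Optional[str] = None

    def rotate(self, new_kid: str) -> None:
        # The old current key is demoted; whatever was previous before is dropped.
        self.previous = self.current
        self.current = new_kid

    def jwks_kids(self) -> list[str]:
        """Key IDs that would appear in the JWKS document."""
        return [k for k in (self.current, self.previous) if k is not None]


keys = KeySet(current="kid_a")
keys.rotate("kid_b")
keys.rotate("kid_c")
print(keys.jwks_kids())  # ['kid_c', 'kid_b']
```

After the second rotation kid_a is gone, which is why rotating twice in quick succession can invalidate tokens that are still unexpired.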
Multi-instance propagation
- postgres_key: LISTEN/NOTIFY on the signing_key_change channel; all instances reload within milliseconds.
- vault_transit: Vault is the source of truth; all instances sign via Vault, so there is no local key to propagate.
- keyfile on a shared PVC: instances see the new file when their filesystem watchers fire (may take up to jwks_refresh_seconds). Use SIGHUP for an immediate reload: kill -HUP $(pidof authserver) or docker kill -s HUP ....
- keyfile on separate hosts: you’re on your own for propagation. Not recommended for multi-instance.
JWKS caching considerations
Every SDK caches the JWKS with a default TTL of 5 min (jwks_refresh_seconds). Right after rotation:
- Tokens signed with the new key verify only once the SDK’s cache has picked up the new kid.
- Tokens signed with the old key continue to verify because the old key is still in the JWKS.
- Worst case: 5 min of the SDK rejecting new tokens with a “kid not found” error.
For coordinated rotations, temporarily lower jwks_refresh_seconds on the SDK side. Otherwise, accept the 5 min window: tokens signed with the old key keep verifying, and only newly issued tokens may be rejected until the cache refreshes.
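The stale-cache window is easier to see in code. Below is a minimal Python sketch of a TTL-based JWKS cache; it is not the SDK's actual implementation. The class name, the injected `fetch` callable, and the injectable clock are assumptions made so the behavior can be exercised without a network.

```python
import time
from typing import Callable, Optional


class JwksCache:
    """Cache a JWKS document and refetch it once the TTL has elapsed."""

    def __init__(self, fetch: Callable[[], dict], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._keys: dict[str, dict] = {}
        self._fetched_at: Optional[float] = None

    def get_key(self, kid: str) -> Optional[dict]:
        """Return the cached JWK for kid, or None if the cache does not know it."""
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._keys = {k["kid"]: k for k in self._fetch()["keys"]}
            self._fetched_at = now
        # While the cache is still fresh, a newly rotated kid is reported as unknown.
        return self._keys.get(kid)
```

A cache built this way returns None for a new kid until the TTL elapses, which is the rejection window described above. Lowering `ttl_seconds` shortens that window at the cost of more JWKS fetches.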
Algorithm selection
AuthPlane rejects HS256 and other symmetric algorithms for JWT signing — the shared secret model doesn’t fit multi-party token verification.
Switching algorithms: rotate with the new algorithm. The old-algorithm key stays in JWKS for its retention window; old tokens verify against it. New tokens use the new algorithm. Zero-downtime migration.
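On the verifier side, a matching guard is to check the token header's alg against an allowlist and against the alg of the selected key before verifying anything. The Python sketch below assumes the allowlist is the two algorithms named on this page; the function name is hypothetical.

```python
ALLOWED_ALGS = frozenset({"ES256", "RS256"})  # the two algorithms this page mentions


def check_alg(header_alg: str, jwk_alg: str) -> None:
    """Raise ValueError unless the token's alg is allowed and matches the key's alg."""
    if header_alg not in ALLOWED_ALGS:
        raise ValueError(f"algorithm not allowed: {header_alg}")
    if header_alg != jwk_alg:
        raise ValueError(f"token alg {header_alg} does not match key alg {jwk_alg}")
```

During an algorithm migration both checks pass for old and new tokens, because each token is matched against the key that shares its kid and alg.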
JWKS endpoint
Served at /.well-known/jwks.json, unauthenticated, cacheable.
Shape:
{
  "keys": [
    {
      "kty": "EC",
      "crv": "P-256",
      "x": "...",
      "y": "...",
      "use": "sig",
      "kid": "current-kid",
      "alg": "ES256"
    },
    {
      "kty": "EC",
      ...
      "kid": "previous-kid",
      "alg": "ES256"
    }
  ]
}
Recommendations for MCP servers
- Cache JWKS: SDKs do this by default. Don’t fetch per request.
- Verify kid: tokens carry kid in the header; use the matching key from the JWKS. Reject the token if its kid isn’t in the current JWKS (either the AS hasn’t propagated a new key yet, or the token was signed with a key that has been rotated out).
- Handle refresh windows: during rotation, expect a small window where new tokens might be rejected while your JWKS cache is stale. The window is bounded by jwks_refresh_seconds.
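The kid lookup recommended above might look like this in Python. It is a sketch with hypothetical helper names: it only selects the key, and signature verification itself would still need a cryptography library, which is not shown.

```python
import base64
import json


def token_kid(token: str) -> str:
    """Read the kid from a compact JWT's header without verifying the signature."""
    header_b64 = token.split(".")[0]
    padded = header_b64 + "=" * (-len(header_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))["kid"]


def select_key(token: str, jwks: dict) -> dict:
    """Return the JWK matching the token's kid, or raise if it is not published."""
    kid = token_kid(token)
    for key in jwks["keys"]:
        if key.get("kid") == kid:
            return key
    raise LookupError(f"kid not found in JWKS: {kid}")
```

Raising on an unknown kid, rather than falling back to another key, keeps the verifier from accepting tokens signed by a key that has been rotated out.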
Backup and disaster recovery
- keyfile: back up the key_path/ directory. Rotation state (which key is current and which is previous) lives only in the file naming, so ordering rotated files by timestamp is safe.
- postgres_key: back up the DB. The data-encryption key must also be preserved (via env var backup).
- vault_transit: follow your Vault backup procedure. Vault is the source of truth; AuthPlane doesn’t need a separate backup.
Recovery from lost private key: you can’t. Every outstanding JWT that was signed with the lost key becomes unverifiable. Clients hit 401 invalid_token until they re-authenticate. Rotate to a new key immediately; users complete a fresh OAuth flow.
Security invariants (T9)
From threat model → T9:
- keyfile files: chmod 600, owned by the AuthPlane user.
- postgres_key: encrypted at rest via data_encryption.
- vault_transit: private key never on the AuthPlane host.
- Rotation on suspicion or every 90 days.
Related
- Operate: Vault Transit — HSM-grade signing walkthrough
- Operate: Backup, upgrade, purge — backup procedures per storage driver
- Reference: Configuration → signing — every knob
- Security: Threat model → T9
- Full source in authserver repo