Key management

TL;DR — Three signing-key stores: keyfile (default, PEM on disk — single-instance), postgres_key (encrypted-at-rest in DB, multi-instance via LISTEN/NOTIFY), vault_transit (private key never leaves Vault — HSM-grade). Rotation is zero-downtime: new key becomes current, old stays in JWKS until it expires. SDK JWKS cache picks up new keys within 5 min by default. Losing the private key = every outstanding JWT unverifiable — back up keyfile deployments religiously.

Lifecycle

Generate → Active (signing + verification) → Rotated (verification only) → Removed from JWKS
  1. Generate — first boot creates the key automatically. Nothing to configure.
  2. Active — the current key signs all new JWTs. Public key in JWKS.
  3. Rotated — after authserver admin key rotate, a new key takes over signing. Old key stays in JWKS so existing tokens still verify.
  4. Removed — after every token signed with the old key has expired, the old key leaves the JWKS.

By default, the “previous” key is retained for dcr.default_token_expiry (15 min) plus a grace period: long enough for in-flight tokens to complete, but not indefinitely.
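
As a rough illustration of the retention rule above, the earliest moment a rotated key can leave the JWKS is the rotation time plus the token lifetime plus the grace period. The sketch below uses the 15-minute dcr.default_token_expiry default and a hypothetical 5-minute grace period (the real grace value is not stated here); the function name is illustrative, not part of AuthPlane.

```python
from datetime import datetime, timedelta, timezone

TOKEN_EXPIRY = timedelta(minutes=15)  # dcr.default_token_expiry default
GRACE_PERIOD = timedelta(minutes=5)   # hypothetical value, chosen only for this example


def earliest_removal(rotated_at: datetime,
                     token_expiry: timedelta = TOKEN_EXPIRY,
                     grace: timedelta = GRACE_PERIOD) -> datetime:
    """Return the earliest time the previous key can safely leave the JWKS.

    A token signed just before rotation lives for at most token_expiry,
    so the old key must stay published at least that long, plus the grace period.
    """
    return rotated_at + token_expiry + grace


rotated_at = datetime(2026, 7, 1, 0, 0, tzinfo=timezone.utc)
print(earliest_removal(rotated_at).isoformat())  # 2026-07-01T00:20:00+00:00
```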

Storage backends

keyfile (default) — single-instance

PEM files on disk. Simplest option; works well for single-instance deployments.

signing:
  algorithm: ES256
  key_store: keyfile
  key_path: /var/lib/authserver/keys

Directory contains:

  • current.pem — active signing key (private + public)
  • rotated-<timestamp>.pem — previous keys kept for verification

Watch out for:

  • File permissions — only the authserver process user should be able to read the key files: chmod 600, owned by the authserver user. Anyone else who can read them can forge tokens.
  • Backups — back up the keys/ directory. Losing the private key = every outstanding JWT unverifiable (MCP servers reject them until users re-authenticate).
  • Docker — use a named volume (authserver-data) so keys persist across container restarts. Without a volume, key rotation state resets on restart.
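
The permission rule above is easy to check from a script. This is a minimal sketch, assuming a POSIX filesystem and the *.pem naming described earlier; insecure_key_files is a hypothetical helper, not an AuthPlane command.

```python
import stat
from pathlib import Path
from typing import List, Optional


def insecure_key_files(key_dir: str, expected_uid: Optional[int] = None) -> List[str]:
    """Return names of *.pem files that are readable beyond their owner.

    If expected_uid is given, files owned by a different user are flagged too.
    """
    problems = []
    for pem in sorted(Path(key_dir).glob("*.pem")):
        info = pem.stat()
        mode = stat.S_IMODE(info.st_mode)
        # Any group/other bit means a user besides the owner can touch the key.
        too_open = bool(mode & (stat.S_IRWXG | stat.S_IRWXO))
        wrong_owner = expected_uid is not None and info.st_uid != expected_uid
        if too_open or wrong_owner:
            problems.append(pem.name)
    return problems


# Example call (path taken from the config above):
#   insecure_key_files("/var/lib/authserver/keys")
```

An empty list means every key file passes the check; anything else names the files to fix with chmod 600 or chown.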

postgres_key — multi-instance (HA)

Multiple AuthPlane instances behind a load balancer need to share the signing key. postgres_key stores encrypted keys in the DB, so all instances read from the same place, and LISTEN/NOTIFY on the signing_key_change channel propagates rotations across instances in milliseconds.

signing:
  algorithm: ES256
  key_store: postgres_key

Requires storage.driver: postgres AND data_encryption configured (AES-256-GCM or Vault Transit). The signing key is encrypted with the data-encryption driver before being written to signing_keys.

vault_transit — maximum security

Keys are managed by HashiCorp Vault’s Transit engine. Private key never leaves Vault — AuthPlane sends the JWT payload to Vault for signing, gets back the signature.

signing:
  algorithm: ES256
  key_store: vault_transit
  vault_transit:
    address: https://vault:8200
    mount: transit
    key_name: authserver-signing
    approle:
      role_id: "..."
      secret_id: "..."

Full setup + Vault CLI walkthrough in Operate: Vault Transit signing.

Cost: ~2 ms latency per signing operation (Vault round trip). Fine for /oauth/token throughput on any reasonable Vault deployment.

Benefits: keys never touch disk, Vault audit log records every sign, FIPS compliance possible with Vault-backed HSM.

Rotation

When to rotate

  • Calendar cadence — every 90 days as part of key-hygiene policy.
  • After suspected exposure — backup loss, stolen disk, operator departure with past keys/ access.
  • During algorithm migration — ES256 ↔ RS256. Rotate to switch algorithms gradually; old tokens keep verifying until they expire.

How to rotate

Zero-downtime hot operation. Previous key stays in the JWKS document so outstanding tokens remain verifiable; new tokens are signed with the new key from the first /oauth/token call after rotation.

# Via Admin API
curl -X POST http://localhost:9001/admin/keys/rotate \
    -H "Authorization: Bearer $AUTHPLANE_ADMIN_API_KEY"

# Or via CLI
authserver admin key rotate

Response:

{
  "current_kid": "kid_new_abc",
  "previous_kid": "kid_old_xyz",
  "rotated_at": "2026-07-01T00:00:00Z"
}
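
For automation, the same call can be made from a script. The sketch below mirrors the curl example (same URL and AUTHPLANE_ADMIN_API_KEY variable) and checks the response fields shown above; rotate_key and check_rotation are hypothetical helper names, and the error handling is deliberately minimal.

```python
import json
import os
import urllib.request

ADMIN_URL = "http://localhost:9001/admin/keys/rotate"  # endpoint from the curl example


def rotate_key(url: str = ADMIN_URL) -> dict:
    """POST to the rotate endpoint and return the parsed JSON response."""
    req = urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {os.environ['AUTHPLANE_ADMIN_API_KEY']}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def check_rotation(body: dict) -> str:
    """Sanity-check the response shape and return the new kid."""
    current, previous = body["current_kid"], body["previous_kid"]
    if current == previous:
        raise ValueError("rotation did not produce a new kid")
    return current


# Offline demonstration using the sample response from this page.
sample = json.loads(
    '{"current_kid": "kid_new_abc", "previous_kid": "kid_old_xyz",'
    ' "rotated_at": "2026-07-01T00:00:00Z"}'
)
print(check_rotation(sample))  # kid_new_abc
```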

What happens under the hood

  1. New key pair generated + persisted (keyfile writes to key_path; Vault Transit + postgres_key write to their backends).
  2. New key becomes current; the previously current key is demoted to previous. Older keys are removed from JWKS so the key-compromise blast radius is bounded.
  3. /.well-known/jwks.json now includes BOTH current and previous keys.
  4. Verifiers that cache the JWKS pick up the new key on their next refresh (SDKs default to 5 min cache TTL — configurable via jwks_refresh_seconds).
  5. All tokens signed after rotation carry the new kid. In-flight tokens signed by the previous key continue to verify until they expire.
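
The current/previous bookkeeping in the steps above can be modeled in a few lines. This is a toy in-memory sketch that tracks kids only (no real key material); the KeyRing class and the kid format are assumptions for illustration, not AuthPlane internals.

```python
import secrets
from dataclasses import dataclass, field
from typing import List, Optional


def _new_kid() -> str:
    # Hypothetical kid format; real kids are assigned by the server.
    return "kid_" + secrets.token_hex(4)


@dataclass
class KeyRing:
    current: str = field(default_factory=_new_kid)
    previous: Optional[str] = None

    def rotate(self) -> None:
        # The current key is demoted to previous; the older previous key is dropped.
        self.previous = self.current
        self.current = _new_kid()

    def jwks_kids(self) -> List[str]:
        # The published JWKS holds at most two keys: current and previous.
        return [k for k in (self.current, self.previous) if k is not None]


ring = KeyRing()
first = ring.current
ring.rotate()
print(first in ring.jwks_kids())   # True: still published as "previous"
ring.rotate()
print(first in ring.jwks_kids())   # False: dropped after the second rotation
```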

Multi-instance propagation

  • postgres_key — LISTEN/NOTIFY on signing_key_change channel; all instances reload within milliseconds.
  • vault_transit — Vault is the source of truth; instances all sign via Vault, no local key to propagate.
  • keyfile on shared PVC — instances see the new file when their filesystem watchers fire (may take up to jwks_refresh_seconds). Use SIGHUP for immediate reload: kill -HUP $(pidof authserver) or docker kill -s HUP ....
  • keyfile on separate hosts — you’re on your own for propagation. Not recommended for multi-instance.

JWKS caching considerations

Every SDK caches the JWKS with a default TTL of 5 min (jwks_refresh_seconds). Right after rotation:

  • Tokens signed with the new key verify only once the SDK’s cache has picked up the new kid.
  • Tokens signed with the old key continue to verify because the old key is still in the JWKS.
  • Worst case: 5 min of the SDK rejecting new tokens with a “kid not found” error.

For coordinated rotations, temporarily lower jwks_refresh_seconds on the SDK side. Alternatively, accept the 5-min window: tokens signed with the old key keep verifying, and only newly issued tokens may be rejected until the cache refreshes.
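
The stale-cache window described above can be reproduced with a small TTL cache. This is a sketch under stated assumptions, not the SDK's implementation: JwksCache, the injected fetch function, and the fake clock are all hypothetical, and the 300-second default stands in for jwks_refresh_seconds.

```python
import time
from typing import Callable, Optional


class JwksCache:
    """Minimal TTL cache around a JWKS fetch function."""

    def __init__(self, fetch: Callable[[], dict], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._keys = {}
        self._fetched_at = None

    def get(self, kid: str) -> Optional[dict]:
        now = self._clock()
        # Refetch only when the cache is empty or older than the TTL.
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._keys = {k["kid"]: k for k in self._fetch()["keys"]}
            self._fetched_at = now
        return self._keys.get(kid)  # None corresponds to "kid not found"


clock = [0.0]
published = {"keys": [{"kid": "kid_old_xyz"}]}
cache = JwksCache(lambda: published, ttl_seconds=300, clock=lambda: clock[0])

cache.get("kid_old_xyz")                             # first call fills the cache
published["keys"].insert(0, {"kid": "kid_new_abc"})  # rotation on the server
clock[0] = 299
print(cache.get("kid_new_abc"))  # None: cache is still stale
clock[0] = 300
print(cache.get("kid_new_abc"))  # {'kid': 'kid_new_abc'}: refreshed after the TTL
```

Passing a smaller ttl_seconds shortens the window in which the new kid is unknown, which is the effect of lowering jwks_refresh_seconds.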

Algorithm selection

Algorithm | Key type    | Recommendation
ES256     | ECDSA P-256 | Default and recommended: compact tokens (~500 bytes), fast verification, wide support
RS256     | RSA 2048+   | Use only when the environment requires RSA (legacy IdPs, HSM constraints)

AuthPlane rejects HS256 and other symmetric algorithms for JWT signing — the shared secret model doesn’t fit multi-party token verification.

Switching algorithms: rotate with the new algorithm. The old-algorithm key stays in JWKS for its retention window; old tokens verify against it. New tokens use the new algorithm. Zero-downtime migration.
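
A verifier can mirror the same restriction by allowing only the asymmetric algorithms listed above. The sketch below reads the alg from an unverified JWT header and rejects anything else; header_alg and the fake-token helper are hypothetical names, and signature verification would still have to follow.

```python
import base64
import json

ALLOWED_ALGS = {"ES256", "RS256"}  # the two algorithms from the table above


def header_alg(token: str) -> str:
    """Decode the (unverified) JWT header and return its alg if it is allowed."""
    header_b64 = token.split(".", 1)[0]
    padded = header_b64 + "=" * (-len(header_b64) % 4)  # restore base64url padding
    header = json.loads(base64.urlsafe_b64decode(padded))
    alg = header.get("alg")
    if alg not in ALLOWED_ALGS:
        raise ValueError(f"unsupported alg: {alg!r}")
    return alg


def fake_token(header: dict) -> str:
    """Build a token-shaped string for the demo; payload and signature are dummies."""
    raw = base64.urlsafe_b64encode(json.dumps(header).encode()).rstrip(b"=")
    return raw.decode() + ".payload.signature"


print(header_alg(fake_token({"alg": "ES256", "kid": "current-kid"})))  # ES256
```

During an ES256 to RS256 migration both values stay in the allowlist, so tokens signed before and after the switch pass this check.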

JWKS endpoint

Served at /.well-known/jwks.json, unauthenticated, cacheable.

Shape:

{
  "keys": [
    {
      "kty": "EC",
      "crv": "P-256",
      "x": "...",
      "y": "...",
      "use": "sig",
      "kid": "current-kid",
      "alg": "ES256"
    },
    {
      "kty": "EC",
      ...
      "kid": "previous-kid",
      "alg": "ES256"
    }
  ]
}

Recommendations for MCP servers

  1. Cache JWKS — SDKs do this by default. Don’t fetch per request.
  2. Verify kid — tokens carry kid in the header; use the matching key from JWKS. Reject if kid isn’t in the current JWKS (either the AS hasn’t propagated a new key yet, or the token was signed with a key that’s been rotated out).
  3. Handle refresh windows — during rotation, expect a small window where new tokens might be rejected while your JWKS cache is stale. Bounded by jwks_refresh_seconds.
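
Recommendation 2 amounts to a lookup by kid against the cached JWKS. A minimal sketch, assuming the token header has already been decoded into a dict; select_key is a hypothetical helper, and the returned key must still be used to verify the signature.

```python
def select_key(header: dict, jwks: dict) -> dict:
    """Return the JWK whose kid matches the token header, or raise KeyError."""
    kid = header.get("kid")
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    # Either the cache is stale or the key has been rotated out.
    raise KeyError(f"kid not found in JWKS: {kid!r}")


# JWKS trimmed to the fields relevant for the lookup.
jwks = {
    "keys": [
        {"kty": "EC", "kid": "current-kid", "alg": "ES256"},
        {"kty": "EC", "kid": "previous-kid", "alg": "ES256"},
    ]
}
print(select_key({"alg": "ES256", "kid": "previous-kid"}, jwks)["kid"])  # previous-kid
```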

Backup and disaster recovery

  • keyfile — back up the key_path/ directory. Rotation state (which key is current and which is previous) is recorded only in the file naming; ordering by timestamp is safe.
  • postgres_key — back up the DB. Data-encryption key must also be preserved (via env var backup).
  • vault_transit — follow your Vault backup procedure. Vault is the source of truth; AuthPlane doesn’t need a separate backup.
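
For keyfile deployments, a backup can be as simple as archiving the *.pem files. The sketch below is one possible approach, not a prescribed procedure: backup_keys is a hypothetical helper, the archive is created with mode 0600 on POSIX systems, and it is not encrypted, so it should be stored as carefully as the keys themselves.

```python
import os
import tarfile
from pathlib import Path
from typing import List


def backup_keys(key_dir: str, archive_path: str) -> List[str]:
    """Write all *.pem files from key_dir into a gzip tarball; return their names."""
    names = []
    # Create the archive with 0600 so the backup is no more readable than the keys.
    fd = os.open(archive_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as raw, tarfile.open(fileobj=raw, mode="w:gz") as tar:
        for pem in sorted(Path(key_dir).glob("*.pem")):
            # File names are preserved, so current vs rotated state survives a restore.
            tar.add(str(pem), arcname=pem.name)
            names.append(pem.name)
    return names


# Example call (source path taken from the config above; destination is illustrative):
#   backup_keys("/var/lib/authserver/keys", "/secure/backups/authserver-keys.tar.gz")
```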

Recovery from lost private key: you can’t. Every outstanding JWT that was signed with the lost key becomes unverifiable. Clients hit 401 invalid_token until they re-authenticate. Rotate to a new key immediately; users complete a fresh OAuth flow.

Security invariants (T9)

From threat model → T9:

  • keyfile files: chmod 600, owned by AuthPlane user.
  • postgres_key: encrypted at rest via data_encryption.
  • vault_transit: private key never on AuthPlane host.
  • Rotation on suspicion or every 90 days.