03 - Authentication Service (lattice-auth)
Part of the Lattice networking suite design set. Read 00-overview.md and 01-high-level-design.md first for vocabulary and the data-plane/control-plane split. This document specifies
lattice-auth, the control-plane service that proves who a player is and mints the tokens every other component trusts. The on-the-wire token validation at connection time is the bridge to 02-netcode-lld.md; identity is consumed by 04-social-library.md; delivery sequencing lives in 06-implementation-roadmap.md.
1. Goals & Responsibilities
lattice-auth is the single logical source of identity for the suite. It is implemented as N ≥ 2 stateless, load-balanced nodes that behave as one auth source (see §5 — this is the core requirement driving the whole design).
In scope
- Authentication: verify a caller's claim to an identity (guest, email+password, or an external platform ticket such as Steam/Epic/Google/Apple).
- Identity & account management: one durable Account per human; multiple linked Identity/Credential records (account linking and upgrade, e.g. guest → email → Steam-linked).
- Session & token issuance: mint short-lived, stateless, Ed25519-signed access tokens (verifiable offline by game servers) and long-lived, rotating, server-tracked refresh tokens.
- Public-key distribution: publish the current signing public keys as a JWKS-style key set at /.well-known/jwks.json so any verifier (other auth nodes, lattice-director, lattice-gameserver) can validate tokens without contacting auth.
- Security & anti-abuse: rate limiting, brute-force / credential-stuffing defence, account lockout, audit logging, key rotation.
- Data-subject operations: account deletion / export (GDPR / CCPA).
Explicitly NOT in scope
| Concern | Owner | Note |
|---|---|---|
| Matchmaking, session directory, fleet orchestration, server allocation | lattice-director (8444) | Director consumes an access token to authorize a player and reads sub/roles/region claims. |
| Friends, presence, parties, invites, messaging | lattice-social (9443) | Social consumes auth identity (sub) as the stable user key; it never mints tokens. |
| Gameplay authority, replication, prediction | lattice-gameserver / lattice-core | Game server only validates the access token at handshake (§7). |
| Payments / entitlements / inventory | Out of suite | A title's commerce backend can read sub; not provided here. |
Design principle: auth is on the control plane and is off the hot path. A player authenticates rarely (login, refresh) and then carries a self-describing token. The data plane (game servers) must never make a synchronous call to auth to admit a player; it validates the token's signature locally. This keeps connection setup fast and keeps auth availability decoupled from match availability.
2. Identity Model
Lattice separates the durable Account (the person / save anchor) from the Identities that can authenticate into it. This is what makes account linking and the guest→full upgrade clean.
Identity kinds
| Kind | provider | How it authenticates | Notes |
|---|---|---|---|
| Guest / anonymous | guest | Device-bound secret issued at first launch; no PII | Upgradable; rate-limited & abuse-scored (§10). |
| Email + password | email | Email + Argon2id-hashed password | Email verification, password reset, lockout. |
| Steam | steam | Steam auth session ticket | Verified via Steamworks Web API (ISteamUserAuth/AuthenticateUserTicket). |
| Epic / EOS | epic | EOS ID token (OIDC) | Verified via Epic JWKS. |
| Google | google | Google OIDC ID token | Verified via Google JWKS, aud check. |
| Apple | apple | Sign in with Apple identity token | Verified via Apple JWKS, nonce check. |
| Console (future) | psn / xbl / nintendo | Platform attestation token | Same exchange pattern as §6.3. |
A single Account may own many Identity rows (one per provider), enabling cross-platform play under one save. Account linking attaches a new verified identity to an existing account; the inverse unlink is supported with a guard that an account must always retain at least one usable credential.
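The unlink guard described above can be sketched as a small predicate. This is a minimal illustration rather than the service's implementation: the Identity dataclass and can_unlink function are hypothetical names, and "usable" is assumed here to mean a verified identity.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Identity:
    provider: str           # e.g. "guest", "email", "steam"
    provider_user_id: str   # external subject id
    verified: bool


def can_unlink(identities: List[Identity], target: Identity) -> bool:
    """Allow an unlink only if another usable identity would remain."""
    if target not in identities:
        return False
    remaining = [i for i in identities if i != target]
    # Assumption: an identity counts as "usable" when it is verified.
    return any(i.verified for i in remaining)


if __name__ == "__main__":
    email = Identity("email", "player@example.com", True)
    steam = Identity("steam", "7656119", True)
    print(can_unlink([email, steam], email))  # another identity remains
    print(can_unlink([email], email))         # would leave the account empty
```

A real implementation would presumably run this check inside the same PostgreSQL transaction as the delete, so two concurrent unlinks cannot both pass the guard.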
2.1 Data Model
classDiagram
class Account {
+uuid id "PK — the stable sub claim"
+string display_name
+enum status "active | locked | banned | deleted"
+string[] roles "player, moderator, admin, server"
+string home_region "eu, na, ap..."
+bool is_guest "true until upgraded"
+timestamptz created_at
+timestamptz updated_at
+timestamptz deleted_at "soft delete for GDPR grace"
}
class Identity {
+uuid id "PK"
+uuid account_id "FK -> Account"
+enum provider "guest|email|steam|epic|google|apple"
+string provider_user_id "external subject id"
+bool verified
+timestamptz linked_at
+timestamptz last_login_at
}
class Credential {
+uuid id "PK"
+uuid identity_id "FK -> Identity (email/guest only)"
+enum type "password | guest_secret"
+string secret_hash "Argon2id (never plaintext)"
+int failed_attempts
+timestamptz locked_until
+timestamptz rotated_at
}
class Session {
+uuid id "PK — the sid claim"
+uuid account_id "FK -> Account"
+uuid device_id "FK -> Device"
+enum auth_method "guest|email|steam|epic|google|apple"
+string ip_cidr
+string region
+timestamptz created_at
+timestamptz expires_at
+timestamptz revoked_at
}
class RefreshToken {
+uuid id "PK — opaque token id (jti)"
+uuid session_id "FK -> Session"
+string token_hash "SHA-256 of the opaque secret"
+uuid prev_token_id "rotation chain (reuse detection)"
+bool used
+timestamptz issued_at
+timestamptz expires_at
}
class Device {
+uuid id "PK"
+uuid account_id "FK -> Account (nullable for guest pre-link)"
+string platform "ios|android|win|mac|linux|console|web"
+string fingerprint_hash
+string push_handle "nullable"
+timestamptz first_seen_at
+timestamptz last_seen_at
}
class AuditLog {
+uuid id "PK"
+uuid account_id "FK (nullable)"
+enum event "login|logout|refresh|link|unlink|lockout|delete|key_rotation"
+string ip
+string node_id "which auth node served it"
+jsonb detail
+timestamptz at
}
Account "1" o-- "many" Identity : owns
Identity "1" o-- "0..1" Credential : secured by
Account "1" o-- "many" Session : has
Account "1" o-- "many" Device : registers
Session "1" o-- "many" RefreshToken : rotates
Session "1" --> "1" Device : bound to
Account "1" o-- "many" AuditLog : records
Storage placement. Account, Identity, Credential, Device, AuditLog live in PostgreSQL (durable, transactional, the source of truth). Session and RefreshToken live in Redis (fast, TTL-driven) with an optional async write-through to Postgres for long-term audit/forensics. Rate-limit counters and lockout state are Redis-only. See §9 for HA.
3. Token Design
Two token types with deliberately different lifetimes and trust models.
3.1 Access token: short-lived, stateless, offline-verifiable
- Format: PASETO v4.public (default), an Ed25519-signed, versioned, hard-to-misuse token. JWT with EdDSA (Ed25519) is the interoperable alternative when third-party tooling demands JWT; both carry the same claim set below. (Crypto baseline from the shared brief: Ed25519 for signing; X25519 + ChaCha20-Poly1305 is the separate transport crypto in lattice-core.)
- Lifetime: 5–15 minutes (default 10 min). Short on purpose: revocation is mostly handled by expiry, so verifiers can stay fully offline.
- Verification: signature checked against the public key for the token's kid, fetched from /.well-known/jwks.json and cached (re-fetched when the cache TTL lapses or keys rotate, §5.3). No round-trip to auth is required to validate a token; this is what lets game servers admit players at line rate (§7).
- Stateless: the access token is never stored server-side. Any node can issue it; any verifier can validate it.
Access-token claims
| Claim | Type | Example | Meaning / who reads it |
|---|---|---|---|
| sub | uuid string | "7c9e..." (Account.id) | Stable user identity. Read by game server, director, social. |
| sid | uuid string | Session.id | Session this token belongs to; key for optional revocation check & refresh linkage. |
| platform | string | "steam" | Auth method / platform the player came in on. |
| roles | string[] | ["player"] | Authorization roles (player, moderator, admin, server). |
| region | string | "eu" | Home/affinity region; director uses it for placement, game server for logging. |
| iss | string | "lattice-auth" | Logical issuer (not per-node); all nodes share it. |
| aud | string | "lattice" | Intended audience; verifiers reject mismatches. |
| iat | unix ts | 1718900000 | Issued-at. |
| exp | unix ts | 1718900600 | Expiry (iat + ~10 min). Primary revocation mechanism. |
| jti | uuid string | per-token | Unique token id; enables targeted deny-listing if ever needed. |
| kid | string (header) | "2026-06-key-a" | Signing key id; selects the public key from JWKS (§5.3). |
Cross-doc contract (load-bearing).
lattice-gameserver and the 02-netcode-lld.md handshake rely on exactly these claims: they verify the Ed25519 signature for kid, then check exp/iat, iss/aud, and read sub, sid, roles, region. The handshake auth field carries the raw PASETO/JWT string. Adding or renaming a claim is a coordinated change across auth, director, social, and the netcode handshake.
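The claim checks in this contract can be sketched as a pure function over an already signature-verified claim set. Everything here is illustrative: validate_claims is a hypothetical name, the 30-second clock-skew leeway is an assumption not stated in this document, and Ed25519 verification is deliberately left out because it needs a crypto library.

```python
import time
from typing import Optional

EXPECTED_ISS = "lattice-auth"
EXPECTED_AUD = "lattice"
REQUIRED = ("sub", "sid", "roles", "region", "iss", "aud", "iat", "exp")


def validate_claims(claims: dict, now: Optional[float] = None, leeway: int = 30) -> dict:
    """Check claims whose signature was ALREADY verified; return the bound fields.

    Assumption: `leeway` (seconds of tolerated clock skew) is illustrative only.
    """
    now = time.time() if now is None else now
    missing = [name for name in REQUIRED if name not in claims]
    if missing:
        raise ValueError(f"missing claims: {missing}")
    if claims["iss"] != EXPECTED_ISS or claims["aud"] != EXPECTED_AUD:
        raise ValueError("issuer/audience mismatch")
    if claims["exp"] <= now - leeway:
        raise ValueError("token expired")
    if claims["iat"] > now + leeway:
        raise ValueError("token issued in the future")
    # These are the fields the game server binds the connection to.
    return {name: claims[name] for name in ("sub", "sid", "roles", "region")}
```

Keeping this step separate from signature verification makes it easy to unit-test the contract without key material.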
3.2 Refresh token: long-lived, rotating, server-tracked
- Format: an opaque high-entropy secret (≥ 256 bits, e.g. base64url(32 bytes)). It is not a JWT: it carries no claims and is meaningless without the server record.
- Lifetime: days to weeks (default 30 days, sliding), bounded by the parent Session.expires_at.
- Storage: server-side in Redis (hash of the secret, never the secret itself), keyed by RefreshToken.id, with the Session/Device linkage. This is the stateful half of the system and is the lever for true logout/revocation.
- Rotation: single-use, rotating (RFC 6819 §5.2.2.3). Every successful refresh issues a new refresh token and marks the old one used. Presenting an already-used token is treated as token theft: the entire Session (and its token chain) is revoked and an AuditLog refresh/lockout event is written. This makes refresh-token replay detectable.
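The rotation and reuse-detection rule can be sketched with an in-memory dictionary standing in for the shared Redis store. RefreshStore and its methods are hypothetical names, and the sketch omits expiry, the prev_token_id chain, and audit logging to stay short.

```python
import hashlib
import secrets


class RefreshStore:
    """In-memory stand-in for the shared Redis store (assumption for illustration)."""

    def __init__(self):
        self.tokens = {}               # token_hash -> {"session_id", "used"}
        self.revoked_sessions = set()

    @staticmethod
    def _hash(secret: str) -> str:
        # Only the SHA-256 of the opaque secret is stored, never the secret.
        return hashlib.sha256(secret.encode("utf-8")).hexdigest()

    def issue(self, session_id: str) -> str:
        secret = secrets.token_urlsafe(32)  # 32 random bytes, base64url-encoded
        self.tokens[self._hash(secret)] = {"session_id": session_id, "used": False}
        return secret

    def rotate(self, presented: str) -> str:
        record = self.tokens.get(self._hash(presented))
        if record is None or record["session_id"] in self.revoked_sessions:
            raise PermissionError("unknown token or revoked session: re-login")
        if record["used"]:
            # Replay of a spent token: treat as theft and revoke the session.
            self.revoked_sessions.add(record["session_id"])
            raise PermissionError("refresh token reuse: session revoked")
        record["used"] = True
        return self.issue(record["session_id"])
```

Against a real shared store, the read of used and the write that sets it would need to be atomic (for example a transaction or server-side script), otherwise two nodes could both accept the same token.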
4. Why two token types?
| Property | Access token | Refresh token |
|---|---|---|
| Verifiable offline by game servers | ✅ (signature) | ❌ (server lookup) |
| Stored server-side | ❌ stateless | ✅ Redis |
| Lifetime | minutes | days/weeks |
| Carries claims | ✅ | ❌ opaque |
| Revocable instantly | only via short expiry / optional deny-list | ✅ delete the record |
| Used at game-server handshake | ✅ | ❌ never leaves the client↔auth path |
The access token optimizes the hot path (cheap, local, fast). The refresh token optimizes control (revocable, theft-detecting). Together: players stay logged in for weeks while game servers never call auth.
5. The Dual-Node "Act As One" Design (core requirement)
User requirement: "an auth server of sorts before the main game servers, all with balanced capabilities, so 2 auth servers but both act as one auth source."
Lattice realizes this as N ≥ 2 stateless auth nodes behind a load balancer, all sharing the same backing stores and the same signing key. To a client, the director, or a game server there is one auth source; internally, requests fan out across interchangeable nodes.
5.1 The three properties that make N nodes "one source"
- Stateless nodes. A node holds no per-request session in local memory. Everything durable goes to the shared stores. Any node can serve any request for any user; nodes are cattle, not pets. Adding/removing a node changes throughput, never correctness.
- Shared state stores. All nodes read/write the same PostgreSQL (accounts/identities — the source of truth) and the same Redis (sessions, refresh tokens, rate-limit counters). So a refresh issued on Node A is immediately visible to Node B; a lockout counter incremented on Node B is enforced by Node A.
- Shared signing key. All nodes sign access tokens with the same Ed25519 private key (selected by kid). A token minted by Node A is therefore valid to any verifier holding the shared public key, including the other auth nodes and every game server. There is no "Node A's tokens" vs "Node B's tokens"; there is one issuer (iss: lattice-auth) with one key set.
Statelessness lets any node serve; the shared store lets any node see the same data; the shared signing key lets any node's tokens be trusted everywhere. That triad is precisely "balanced capabilities, both act as one auth source."
5.2 Deployment
flowchart TB
subgraph clients["Clients / Callers"]
C["Game client"]
D["lattice-director (8444)"]
S["lattice-social (9443)"]
end
LB["Load Balancer (L4/L7)\nTLS terminate, health checks\nVIP :8443"]
subgraph authtier["lattice-auth tier (stateless, N>=2)"]
A1["auth-node-1"]
A2["auth-node-2"]
A3["auth-node-N (scale out)"]
end
subgraph signing["Signing / Secrets"]
KMS["Secret Manager / KMS\n(private signing key, by kid)"]
SS["(optional) dedicated signing service\nholds private key, returns signatures"]
end
subgraph stores["Shared state (HA)"]
PG[("PostgreSQL primary\n+ replicas — accounts, identities")]
RDS[("Redis cluster\nsessions, refresh, rate-limit")]
NATS[("optional NATS\nkey-rotation / revocation pub-sub")]
end
GS["lattice-gameserver (UDP 27015+)\nverifies tokens OFFLINE via JWKS"]
C -->|"HTTPS :8443"| LB
D -->|"HTTPS :8443"| LB
S -->|"HTTPS :8443"| LB
LB --> A1
LB --> A2
LB --> A3
A1 -. "fetch/cache private key" .-> KMS
A2 -. "fetch/cache private key" .-> KMS
A3 -. "fetch/cache private key" .-> KMS
A1 -. "or sign via" .-> SS
A2 -. "or sign via" .-> SS
A1 --> PG
A2 --> PG
A3 --> PG
A1 --> RDS
A2 --> RDS
A3 --> RDS
A1 -. "rotation events" .- NATS
A2 -. "rotation events" .- NATS
GS -->|"GET /.well-known/jwks.json (cached)"| LB
5.3 Signing key distribution & rotation (JWKS + key IDs)
The private signing key is never baked into an image or committed. Two equivalent strategies (the brief lists both):
- Distributed private key (default). The Ed25519 private key is stored in a secret manager / KMS (e.g. cloud KMS, Vault). Each node fetches it at boot (and on rotation), caches it in memory, and signs locally — fast, no extra hop.
- Dedicated signing service. A single small service holds the private key and exposes a "sign these bytes" RPC; auth nodes never see the private key. Stronger blast-radius control, one extra network hop. Choose this for high-compliance titles.
Public keys are openly published as a JWKS-style key set at GET /.well-known/jwks.json, each entry tagged with a kid. Verifiers (other auth nodes, director, every game server) fetch and cache the set; the token header's kid selects the right public key.
Rotation is overlap-based and zero-downtime:
flowchart LR
A["Generate key B in KMS\n(kid = 2026-09-key-b)"] --> B["Publish B's PUBLIC key in JWKS\nalongside A (both keys served)"]
B --> C["Wait propagate window\n>= verifier JWKS cache TTL\n(e.g. 10-15 min)"]
C --> D["Flip nodes to SIGN with B\n(A still verifiable)"]
D --> E["Wait > max access-token lifetime\n(all A-signed tokens expired)"]
E --> F["Remove A's public key from JWKS\nretire kid A"]
Because both public keys are in JWKS during the overlap, tokens signed by the old key keep validating until they expire, and tokens signed by the new key validate as soon as the JWKS cache refreshes. Rotation never invalidates a live session. An emergency rotation (suspected key compromise) shortens the windows and additionally pushes a key_rotation event over NATS (or relies on a short JWKS TTL) so verifiers refresh immediately.
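The two waits in the rotation flow reduce to simple date arithmetic. The sketch below assumes a hypothetical rotation_schedule helper and a one-minute safety margin that this document does not specify.

```python
from datetime import datetime, timedelta, timezone


def rotation_schedule(publish_at, jwks_cache_ttl, max_token_lifetime,
                      margin=timedelta(minutes=1)):
    """Return (earliest time to sign with key B, earliest time to retire key A).

    Assumption: `margin` is an illustrative buffer for clock skew.
    """
    # Every verifier must have refreshed its JWKS cache before B signs anything.
    sign_with_b_at = publish_at + jwks_cache_ttl + margin
    # Every A-signed token must have expired before A's public key is removed.
    retire_a_at = sign_with_b_at + max_token_lifetime + margin
    return sign_with_b_at, retire_a_at


if __name__ == "__main__":
    published = datetime(2026, 9, 1, 12, 0, tzinfo=timezone.utc)
    sign_at, retire_at = rotation_schedule(
        published,
        jwks_cache_ttl=timedelta(minutes=15),
        max_token_lifetime=timedelta(minutes=15),
    )
    print(sign_at.isoformat(), retire_at.isoformat())
```

With a 15-minute cache TTL and a 15-minute maximum token lifetime, the whole overlap completes in roughly half an hour after key B is published.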
6. Login Flows
All flows hit the LB VIP on :8443; the LB picks any healthy node. The "auth node" lane below is whichever node was chosen — it does not matter which.
6.1 Guest / anonymous login
sequenceDiagram
autonumber
participant Cli as Client
participant LB as Load Balancer (:8443)
participant N as auth-node (any)
participant PG as PostgreSQL
participant R as Redis
Cli->>LB: POST /guest { device_fingerprint }
LB->>N: forward (any healthy node)
N->>R: check guest-create rate limit (per IP/device)
alt limit exceeded
N-->>Cli: 429 Too Many Requests
else allowed
N->>PG: upsert Device; create Account(is_guest=true) + Identity(provider=guest) + guest_secret Credential
N->>R: create Session + RefreshToken (hashed)
N->>N: sign access token (Ed25519, current kid)
N-->>Cli: 200 { access_token, refresh_token, expires_in, account_id }
end
The returned guest_secret (delivered once) lets the same device re-authenticate later; the guest can be upgraded by linking an email/platform identity to the same Account (§6.3 linking variant).
6.2 Email + password login
sequenceDiagram
autonumber
participant Cli as Client
participant LB as Load Balancer (:8443)
participant N as auth-node (any)
participant PG as PostgreSQL
participant R as Redis
Cli->>LB: POST /login { email, password, device }
LB->>N: forward
N->>R: check login rate limit + lockout (email + IP)
alt locked or rate-limited
N-->>Cli: 429 / 423 Locked
else allowed
N->>PG: load Identity(email) + Credential
N->>N: Argon2id verify(password, secret_hash)
alt password wrong
N->>R: increment failed_attempts; maybe set locked_until
N->>PG: write AuditLog(login, failure)
N-->>Cli: 401 Unauthorized
else password ok
N->>R: reset failed_attempts; create Session + RefreshToken
N->>PG: write AuditLog(login, success); update last_login_at
N->>N: sign access token
N-->>Cli: 200 { access_token, refresh_token, expires_in }
end
end
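The "increment failed_attempts; maybe set locked_until" step in the flow above could follow a threshold-plus-backoff rule like the one sketched here. The function name and every number (threshold of 5, 30-second base, one-hour cap) are assumptions for illustration; this document does not fix them.

```python
def lockout_seconds(failed_attempts: int, threshold: int = 5,
                    base: int = 30, cap: int = 3600) -> int:
    """Return how long to lock a credential after `failed_attempts` failures.

    Assumption: all defaults are illustrative, not values from the design.
    """
    if failed_attempts < threshold:
        return 0
    # Double the lock duration for each failure past the threshold, up to a cap.
    return min(cap, base * 2 ** (failed_attempts - threshold))


if __name__ == "__main__":
    for attempts in (4, 5, 6, 7, 20):
        print(attempts, lockout_seconds(attempts))
```

A node would add the returned duration to the current time to produce locked_until, and reset the counter on the next successful login.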
6.3 Platform-token exchange (e.g. Steam ticket → Lattice token)
sequenceDiagram
autonumber
participant Cli as Client
participant LB as Load Balancer (:8443)
participant N as auth-node (any)
participant P as Platform API (Steam/Epic/Google/Apple)
participant PG as PostgreSQL
participant R as Redis
Cli->>LB: POST /platform { provider:"steam", ticket, device }
LB->>N: forward
N->>R: rate-limit check (per IP/provider)
N->>P: verify ticket (AuthenticateUserTicket / OIDC + JWKS)
alt ticket invalid
N-->>Cli: 401 Unauthorized
else valid -> provider_user_id
N->>PG: find Identity(provider, provider_user_id)
alt identity exists
PG-->>N: existing Account
else first time
N->>PG: create Account + Identity (verified=true)
end
opt account linking (attach to existing logged-in account)
N->>PG: link new Identity to current Account (guard >=1 credential)
N->>PG: AuditLog(link)
end
N->>R: create Session + RefreshToken
N->>N: sign access token (platform claim = "steam")
N-->>Cli: 200 { access_token, refresh_token, expires_in, account_id }
end
This is the canonical "auth server before the game servers" step: the client trades a platform-native credential for a Lattice access token that the rest of the suite understands.
6.4 Token refresh / rotation
sequenceDiagram
autonumber
participant Cli as Client
participant LB as Load Balancer (:8443)
participant N as auth-node (any)
participant R as Redis
Note over Cli: access token near expiry (or expired)
Cli->>LB: POST /refresh { refresh_token }
LB->>N: forward (possibly a DIFFERENT node than issued it)
N->>R: look up RefreshToken by hash
alt not found / expired / session revoked
N-->>Cli: 401 — must re-login
else used == true (REPLAY!)
N->>R: revoke entire Session + token chain
N-->>Cli: 401 — session terminated (theft suspected)
else valid & unused
N->>R: mark old token used; create new RefreshToken (prev_token_id = old)
N->>N: sign fresh access token (current kid)
N-->>Cli: 200 { access_token, refresh_token, expires_in }
end
Because Redis is shared, the node serving the refresh need not be the node that issued the original token — proof of the "act as one" property end to end.
7. Game-Server Token Validation (handshake bridge to 02-netcode-lld)
When a client connects to a lattice-gameserver (UDP 27015+), it presents its access token inside the 02-netcode-lld.md connection handshake (the encrypted handshake establishes X25519 + ChaCha20-Poly1305 transport keys; the access token rides in the authenticated handshake payload). The game server validates it offline:
sequenceDiagram
autonumber
participant Cli as Client
participant GS as lattice-gameserver (UDP 27015+)
participant JWKS as Auth JWKS (cached, :8443)
participant R as Redis (optional)
Note over GS,JWKS: at boot/periodically GS caches JWKS public keys (by kid)
Cli->>GS: connection handshake (X25519) + access_token (PASETO/JWT)
GS->>GS: select public key by token.kid (from cache)
GS->>GS: verify Ed25519 signature
GS->>GS: check exp/iat, iss="lattice-auth", aud="lattice"
alt signature/claims invalid or expired
GS-->>Cli: reject handshake (auth failed)
else valid
opt high-security titles only
GS->>R: SISMEMBER revoked sid/jti ?
alt revoked
GS-->>Cli: reject handshake (revoked)
end
end
GS->>GS: bind connection to sub, roles, region
GS-->>Cli: accept — proceed to sim join (see 02-netcode-lld)
end
Key points:
- The default path is a pure local cryptographic check with no network call to auth. This is essential: thousands of players can connect without ever loading the auth tier, and a full auth outage does not stop matches that are already placed.
- Optional revocation check. High-security or competitive titles can add a single Redis lookup against a small deny-set of revoked sid/jti values (auth writes to this set on logout/ban). This trades a touch of latency and a Redis dependency for near-instant revocation, instead of waiting out the remaining access-token lifetime (10 minutes by default, 15 at most).
- The game server trusts the token because it was signed by the shared key it already has the public half of; it does not care which auth node minted it.
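The "select public key by token.kid (from cache)" step can be sketched as a small TTL cache in front of the JWKS endpoint. JwksCache is a hypothetical name, the fetch callable stands in for the HTTP GET, and refetching on an unknown kid is an assumption of this sketch rather than documented behaviour.

```python
import time


class JwksCache:
    """Caches {kid: public_key_bytes}; `fetch` is a caller-supplied callable."""

    def __init__(self, fetch, ttl_seconds=600, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._keys = {}
        self._fetched_at = None

    def _refresh(self):
        self._keys = dict(self._fetch())
        self._fetched_at = self._clock()

    def get(self, kid):
        stale = (self._fetched_at is None
                 or self._clock() - self._fetched_at >= self._ttl)
        if stale or kid not in self._keys:
            # Assumption: an unknown kid may mean a rotation happened, so refetch.
            self._refresh()
        if kid not in self._keys:
            raise KeyError(f"unknown kid: {kid}")
        return self._keys[kid]
```

A production verifier would likely rate-limit the unknown-kid refetch, since otherwise forged tokens with random kid values could be used to hammer the JWKS endpoint.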
8. Node Failover
Because nodes are stateless and the access/refresh model is store-backed, losing a node is a throughput event, not a correctness event.
sequenceDiagram
autonumber
participant Cli as Client
participant LB as Load Balancer
participant A1 as auth-node-1
participant A2 as auth-node-2
participant R as Redis
LB->>A1: GET /healthz (every few seconds)
A1-->>LB: 200 OK
Note over A1: node-1 crashes / fails health check
LB->>A1: GET /healthz
A1--xLB: timeout / 5xx
LB->>LB: mark node-1 UNHEALTHY, drain from pool
Cli->>LB: POST /refresh { refresh_token }
LB->>A2: route to healthy node-2
A2->>R: read same session/refresh state (shared store)
A2-->>Cli: 200 { new tokens }
Note over Cli,A2: no re-login, no session loss
flowchart LR
F["auth-node-1 fails"] --> H["LB health check fails"]
H --> P["LB removes node from pool (connection draining)"]
P --> Rt["In-flight idempotent requests retried on another node"]
Rt --> OK["Survivors serve all traffic\n(stateless + shared store + shared key)"]
OK --> SC["Auto-scaler / orchestrator replaces node\n-> rejoins pool, no special bootstrap"]
- Health checks: LB probes GET /healthz (liveness) and GET /readyz (DB + Redis + key-cache reachable). A node that fails readiness is taken out of rotation and receives no new traffic until it passes again.
- In-flight requests: all write endpoints are designed to be idempotent or safe to retry (refresh rotation tolerates retry via the chain; account creation upserts on provider id). The client SDK retries on 502/503/timeouts.
- No session loss: sessions/refresh tokens are in shared Redis and access tokens are self-contained, so a surviving node continues seamlessly. Tokens already issued by the dead node keep validating everywhere because they were signed with the shared key.
- Replacement: a new node needs only DB/Redis credentials and KMS access; it fetches the signing key and JWKS at boot and joins the pool. No state migration.
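The client-side retry on 502/503/timeouts might look like the sketch below. call_with_retry is a hypothetical helper, send stands in for whatever HTTP call the SDK makes, and the attempt count, base delay, and jitter strategy are assumptions.

```python
import random
import time

RETRYABLE_STATUSES = {502, 503}


def call_with_retry(send, max_attempts=4, base_delay=0.2, sleep=time.sleep):
    """Call send() -> (status, body), retrying on 502/503 and TimeoutError.

    Assumption: attempt count and delays are illustrative defaults.
    """
    for attempt in range(max_attempts):
        try:
            status, body = send()
        except TimeoutError:
            status, body = None, None
        if status is not None and status not in RETRYABLE_STATUSES:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter to avoid synchronized retries.
            sleep(random.uniform(0, base_delay * (2 ** attempt)))
    raise RuntimeError("auth request failed after retries")
```

Because the load balancer has already drained the failed node, a retried request normally lands on a healthy one; the retry is only safe because the endpoints are designed to be idempotent, as noted above.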
9. Scaling & High Availability
| Layer | Strategy |
|---|---|
| Auth nodes | Horizontal: add stateless replicas behind the LB. CPU-bound work is Argon2id hashing and Ed25519 signing; scale on CPU. N is independent per region. |
| Load balancer | Redundant L4/L7 LB (cloud LB or HAProxy/Envoy pair) with health checks; terminates TLS, or passes through to nodes for mTLS-internal setups. |
| PostgreSQL | Primary + streaming read replicas; reads (identity lookups) can hit replicas, writes go to primary. Automated failover (Patroni / managed Postgres). Account data is the source of truth — protected by PITR backups. |
| Redis | Redis Cluster (or primary/replica + Sentinel) for sessions/refresh/rate-limit. Tolerates node loss; AOF persistence for durability of refresh tokens. |
| NATS (optional) | Clustered; carries key-rotation and revocation fan-out. Non-critical — JWKS TTL is the fallback. |
| Regions | Deploy an auth tier per region (eu/na/ap). Postgres can be globally replicated (writes home-region or a global primary); Redis is regional (sessions are region-local). The signing key set is global so a token minted in EU validates on an NA game server — important for cross-region/social. region claim records affinity for director placement. |
Capacity intuition: refresh is a cheap Redis op plus one signature, so a single node can serve tens of thousands of refreshes/sec. Logins are dominated by the deliberately expensive Argon2id hash, so per-node login throughput is bounded by CPU cores divided by the per-hash time and sits well below the refresh rate. Sizing follows login spike, not concurrent players (those are on the data plane).
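As a worked example of sizing on login spike, the arithmetic can be written out as below. All inputs are hypothetical (8 cores, 50 ms per Argon2id hash, 50% headroom), and the model assumes each hash occupies one core for its full duration.

```python
import math


def logins_per_second(cores: int, argon2_seconds_per_hash: float) -> float:
    """Upper bound on password logins/sec for one node (one hash per core)."""
    return cores / argon2_seconds_per_hash


def nodes_needed(peak_logins_per_sec: float, cores: int,
                 argon2_seconds_per_hash: float, headroom: float = 0.5) -> int:
    """Nodes required for a login spike, never fewer than the N >= 2 minimum.

    Assumption: `headroom` is the fraction of CPU budgeted for hashing.
    """
    per_node = logins_per_second(cores, argon2_seconds_per_hash) * headroom
    return max(2, math.ceil(peak_logins_per_sec / per_node))


if __name__ == "__main__":
    print(logins_per_second(8, 0.05))
    print(nodes_needed(390, cores=8, argon2_seconds_per_hash=0.05))
```

The useful property of the formula is that raising the Argon2id cost parameters for security reduces per-node login capacity proportionally, which is a trade-off to revisit whenever those parameters are retuned.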
10. Security
| Area | Control |
|---|---|
| Transport | TLS 1.3 only on :8443; HSTS; modern cipher suites. Optional mTLS for director/social→auth internal calls. |
| Password storage | Argon2id (tuned memory/time cost), unique salt per credential; never plaintext, never reversible. Pepper held in KMS optional. |
| Rate limiting | Per-IP, per-account, per-endpoint sliding-window counters in shared Redis (so limits hold across all nodes — a key benefit of "act as one"). Stricter limits on /login, /guest, /platform. |
| Brute-force / credential stuffing | Failed-attempt counters with exponential backoff; account lockout (Credential.locked_until) after a threshold; IP reputation / velocity checks; optional CAPTCHA / proof-of-work challenge on suspicious bursts; breach-password screening (k-anonymity HIBP-style) at set/reset. |
| Audit logging | Append-only AuditLog (login, logout, refresh, link/unlink, lockout, delete, key_rotation) with node_id, IP, and detail — for forensics and compliance. |
| Token theft | Rotating single-use refresh tokens with reuse detection → whole-session revocation (§3.2, §6.4). Short access-token TTL caps stolen-access-token value. |
| Key management | Private signing key only in KMS / signing service; overlap-based rotation with JWKS + kid (§5.3); emergency rotation path; keys never in source/images/logs. |
| Guest abuse | Guests are device-bound and abuse-scored: rate-limited creation per IP/device, lower default trust, restricted from sensitive features until upgraded, and prunable (idle guest accounts garbage-collected). Prevents farming throwaway identities. |
| Authorization | roles claim drives RBAC across the suite; the server role authorizes server-to-server tokens (e.g. director/gameserver service identities). |
| GDPR / data deletion | DELETE /account triggers soft-delete (status=deleted, grace window) then hard purge of PII (Identity, Credential, Device, PII in AuditLog); data export endpoint returns the subject's data. sub (UUID) is pseudonymous and can be retained where lawful for integrity/ban enforcement after PII removal. Deletion fans out a notice to director/social to drop derived identity data. |
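The sliding-window counters in the rate-limiting row can be sketched in memory as follows. SlidingWindowLimiter is a hypothetical name; in the design the counters live in shared Redis so limits hold across nodes, which this single-process sketch does not reproduce.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` hits per key within any `window_seconds` span."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # key -> timestamps of accepted hits

    def allow(self, key) -> bool:
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True
```

A key would typically combine the dimension and endpoint, for example an IP address plus "/login", so that stricter limits can apply to the sensitive endpoints.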
11. API Surface
All endpoints are served on the LB VIP https://auth.<env>.lattice:8443. JSON over HTTPS. Access token passed as Authorization: Bearer <token> where required.
| Method & Path | Auth | Purpose | Returns |
|---|---|---|---|
| POST /guest | none | Create/restore a guest identity (device-bound). | access_token, refresh_token, expires_in, account_id |
| POST /login | none | Email + password authentication. | tokens + account_id |
| POST /platform | none (or Bearer to link) | Exchange a platform ticket (Steam/Epic/Google/Apple) for Lattice tokens; with Bearer, links to the current account. | tokens + account_id |
| POST /refresh | refresh token | Rotate refresh token, mint new access token. | new tokens |
| POST /logout | Bearer / refresh | Revoke the current session (and add sid/jti to deny-set). | 204 |
| GET /.well-known/jwks.json | none (public) | Current signing public keys by kid for offline verification. | JWKS document |
| GET /account | Bearer | Fetch the caller's account + linked identities + devices. | account profile |
| PATCH /account | Bearer | Update profile (display name, etc.). | updated profile |
| POST /account/link | Bearer | Link a new verified identity to the account. | updated identities |
| POST /account/unlink | Bearer | Unlink an identity (guard: ≥1 credential remains). | updated identities |
| POST /account/export | Bearer | GDPR data export. | export job / payload |
| DELETE /account | Bearer | GDPR deletion (soft-delete → purge). | 202 |
| GET /healthz | none (internal) | Liveness for LB. | 200 |
| GET /readyz | none (internal) | Readiness (DB/Redis/key-cache). | 200 / 503 |
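As an illustration of calling this surface, the sketch below builds (but does not send) a POST /refresh request with the standard library. build_refresh_request is a hypothetical helper and the host in the usage example, with "dev" as the environment, is an assumption; the refresh_token body field follows the flow in §6.4.

```python
import json
import urllib.request


def build_refresh_request(base_url: str, refresh_token: str) -> urllib.request.Request:
    """Build a JSON POST to <base_url>/refresh; the caller decides how to send it."""
    body = json.dumps({"refresh_token": refresh_token}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/refresh",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical environment name; substitute the real <env>.
    request = build_refresh_request("https://auth.dev.lattice:8443", "opaque-secret")
    print(request.get_method(), request.full_url)
```

Endpoints marked Bearer would add an Authorization header carrying the access token in the same way.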
12. Cross-Doc Assumptions & Contracts
- Access-token claim set is a shared contract (§3.1). lattice-gameserver (the 02-netcode-lld.md handshake), lattice-director, and lattice-social all depend on sub, sid, platform, roles, region, iss="lattice-auth", aud="lattice", exp, iat, jti, and the header kid. Changing these is a coordinated suite-wide change.
- Offline verification is the default (§7): game servers validate via cached JWKS, never a synchronous auth call. Only high-security titles add the optional Redis revocation lookup.
- sub is the universal user key consumed by lattice-social (04-social-library.md) and lattice-director. It is a stable, pseudonymous UUID that survives platform linking.
- Crypto split: Ed25519 here is for token signing; X25519 + ChaCha20-Poly1305 in 02-netcode-lld.md is the transport layer, with independent key material.
- Delivery: node count N, KMS choice (distributed key vs signing service), and the optional NATS revocation bus are sequenced in 06-implementation-roadmap.md; the minimum shippable config is N=2 nodes + Postgres + Redis + distributed KMS key.