Authentication & keys

Every request to the gateway is authenticated with a gateway API key (gw_live_…).

| Surface | Header |
| --- | --- |
| Provider proxy (/openai/…, /anthropic/…, …) | The provider’s native auth header (Authorization: Bearer gw_live_… for OpenAI-style APIs, x-api-key: gw_live_… for Anthropic) |
| Self-service API (/gw/…) | Authorization: Bearer gw_live_… |
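
As a rough illustration of the header rules above, the helper below picks the header for a given surface. The function name `auth_headers` and the surface labels ("gw", "openai", "anthropic") are hypothetical, chosen only for this sketch. Providers other than the two named here use their own native header, which this page does not list, so the sketch refuses to guess.

```python
def auth_headers(surface: str, api_key: str) -> dict[str, str]:
    """Return the auth header for a gateway surface (illustrative only).

    `surface` is "gw" for the self-service API, or the provider prefix
    ("openai" or "anthropic") for the provider proxy. These labels are
    assumptions made for this sketch, not gateway identifiers.
    """
    if surface in ("gw", "openai"):
        # Self-service API and OpenAI-style APIs both use a Bearer token.
        return {"Authorization": f"Bearer {api_key}"}
    if surface == "anthropic":
        # Anthropic's native header carries the key directly.
        return {"x-api-key": api_key}
    # Other providers use their own native header; do not guess it.
    raise ValueError(f"no known auth header for surface {surface!r}")
```

In both cases the value sent is the gateway key, never the provider's own credential.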

The gateway replaces your key with the real provider credential server-side; your key never reaches the provider.

Each key carries a set of scopes:

| Scope | Allows |
| --- | --- |
| inference:use | Calling providers through the proxy |
| stats:read | GET /gw/stats, GET /gw/usage, GET /gw/async |
| keys:manage | GET /gw/ceiling, GET/POST /gw/keys, DELETE /gw/keys/{id} |

A request without the required scope gets 403.

Besides scopes, each key has entitlements: (provider, model_pattern) glob rules that control which models it may call. Policy is default-deny and deny-wins: a model is allowed only if some allow rule matches and no deny rule matches. Calling a non-entitled model returns 403 before the request ever reaches the provider.
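
The default-deny, deny-wins evaluation can be sketched in a few lines. This is not the gateway's implementation. It assumes that the provider is compared exactly, that model patterns follow shell-style glob semantics as implemented by Python's `fnmatch.fnmatchcase`, and that deny rules carry `"effect": "deny"` (only `"allow"` appears in the sample response on this page). The function name `is_entitled` is hypothetical.

```python
from fnmatch import fnmatchcase


def is_entitled(rules: list[dict], provider: str, model: str) -> bool:
    """Evaluate entitlement rules with default-deny, deny-wins semantics."""
    # Keep only the rules whose provider and glob pattern match this call.
    matching = [
        r for r in rules
        if r["provider"] == provider and fnmatchcase(model, r["model_pattern"])
    ]
    # Deny wins: a single matching deny rule blocks the model.
    if any(r["effect"] == "deny" for r in matching):
        return False
    # Default deny: at least one matching allow rule is required.
    return any(r["effect"] == "allow" for r in matching)


rules = [
    {"provider": "openai", "model_pattern": "gpt-4o*", "effect": "allow"},
    {"provider": "openai", "model_pattern": "gpt-4o-mini*", "effect": "deny"},
]
print(is_entitled(rules, "openai", "gpt-4o"))       # True
print(is_entitled(rules, "openai", "gpt-4o-mini"))  # False (deny wins)
print(is_entitled(rules, "anthropic", "gpt-4o"))    # False (no rule matches)
```

The order of the rules does not matter here: a deny match always overrides an allow match, and an empty rule list allows nothing.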

Check what your key can do:

curl https://api.sociaro.com/gw/me -H "Authorization: Bearer gw_live_..."
# {"scopes":["inference:use"],"entitlements":[{"provider":"openai","model_pattern":"gpt-4o*","effect":"allow"}]}
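
A service could run this check at startup and fail fast if its key lacks a scope it needs. The sketch below parses the sample response shown above. The helper name `missing_scopes` is hypothetical, and the response is assumed to have exactly the shape of that sample.

```python
import json

# Sample /gw/me response body, copied from the example above.
SAMPLE = (
    '{"scopes":["inference:use"],"entitlements":'
    '[{"provider":"openai","model_pattern":"gpt-4o*","effect":"allow"}]}'
)


def missing_scopes(me: dict, required: set[str]) -> set[str]:
    """Return the required scopes that the key does not carry."""
    return required - set(me.get("scopes", []))


me = json.loads(SAMPLE)
print(missing_scopes(me, {"inference:use"}))                # set()
print(missing_scopes(me, {"inference:use", "stats:read"}))  # {'stats:read'}
```

An empty result means the key carries every scope the service asked for. Any other result names the scopes that would lead to a 403.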

With a keys:manage key you can issue child keys scoped down for each service or teammate — see POST /gw/keys. Rules:

  • A child key’s scopes and entitlements must fit inside your organization’s ceiling (GET /gw/ceiling); anything beyond it is rejected with 403 and no key is created.
  • The plaintext key is returned exactly once — store it immediately.
  • Keys are immutable: to change permissions, revoke and issue a new one.
  • Revocation (DELETE /gw/keys/{id}) is a soft delete — usage history is preserved. A revoked key may keep working for up to ~30 seconds on a given gateway instance (auth cache TTL).

Good practice when handling keys:

  • Treat gw_live_… like any production secret: environment variables or a secret manager, never source control.
  • Issue one key per service/environment so revocation is surgical.
  • Use the least scopes possible — a server that only does inference needs nothing beyond inference:use.
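
To illustrate the advice about keeping keys out of source control, here is a minimal sketch that reads the key from an environment variable. The variable name `GATEWAY_API_KEY` and the helper `load_gateway_key` are assumptions made for this example. The prefix check reflects the only key format shown on this page (gw_live_…) and should be relaxed if other prefixes exist.

```python
import os
from typing import Mapping, Optional


def load_gateway_key(
    env_var: str = "GATEWAY_API_KEY",  # hypothetical variable name
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Read the gateway key from the environment instead of source code."""
    env = os.environ if environ is None else environ
    key = env.get(env_var, "")
    # Assumption: gateway keys start with "gw_live_", the only format
    # documented here. This catches an unset variable or a pasted provider key.
    if not key.startswith("gw_live_"):
        raise RuntimeError(f"{env_var} is unset or is not a gateway key")
    return key
```

Giving each service and environment its own variable, each holding its own key, keeps revocation limited to the one deployment that is affected.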