Authentication & keys
Every request to the gateway is authenticated with a gateway API key
(gw_live_…).
Where the key goes
| Surface | Header |
|---|---|
| Provider proxy (/openai/…, /anthropic/…, …) | The provider’s native auth header (Authorization: Bearer gw_live_… for OpenAI-style APIs, x-api-key: gw_live_… for Anthropic) |
| Self-service API (/gw/…) | Authorization: Bearer gw_live_… |
The gateway replaces your key with the real provider credential server-side; your key never reaches the provider.
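As a rough illustration of the table above, a client could pick the header from the request path. This is only a sketch: the function name `auth_headers` is hypothetical, and it covers just the three prefixes named on this page. Other provider prefixes would need their own rule.

```python
def auth_headers(path: str, gateway_key: str) -> dict[str, str]:
    """Return the auth header to send for a given gateway request path."""
    if path.startswith("/anthropic/"):
        # Anthropic's native header carries the key directly.
        return {"x-api-key": gateway_key}
    if path.startswith(("/openai/", "/gw/")):
        # OpenAI-style proxy routes and the self-service API use a bearer token.
        return {"Authorization": f"Bearer {gateway_key}"}
    # Not covered by this sketch: use that provider's native auth header.
    raise ValueError(f"no header rule for path: {path}")
```

For example, `auth_headers("/gw/stats", key)` yields an `Authorization: Bearer …` header, while any path under `/anthropic/` yields `x-api-key`.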
Scopes
Each key carries a set of scopes:
| Scope | Allows |
|---|---|
| inference:use | Calling providers through the proxy |
| stats:read | GET /gw/stats, GET /gw/usage, GET /gw/async |
| keys:manage | GET /gw/ceiling, GET/POST /gw/keys, DELETE /gw/keys/{id} |
A request made with a key that lacks the required scope is rejected with 403.
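A client can anticipate a 403 by checking the scope table locally before sending a request. The sketch below transcribes the table; the names `REQUIRED_SCOPE`, `required_scope`, and `expect_403` are hypothetical, and it assumes that any path outside /gw/ is a provider proxy route. Endpoints the table does not list are treated as unknown rather than guessed.

```python
from typing import Iterable, Optional

# Scope table transcribed from the section above.
REQUIRED_SCOPE = {
    ("GET", "/gw/stats"): "stats:read",
    ("GET", "/gw/usage"): "stats:read",
    ("GET", "/gw/async"): "stats:read",
    ("GET", "/gw/ceiling"): "keys:manage",
    ("GET", "/gw/keys"): "keys:manage",
    ("POST", "/gw/keys"): "keys:manage",
}


def required_scope(method: str, path: str) -> Optional[str]:
    """Return the scope a request needs, or None if the table does not say."""
    method = method.upper()
    if (method, path) in REQUIRED_SCOPE:
        return REQUIRED_SCOPE[(method, path)]
    if method == "DELETE" and path.startswith("/gw/keys/") and path != "/gw/keys/":
        return "keys:manage"  # DELETE /gw/keys/{id}
    if not path.startswith("/gw/"):
        # Assumption: anything outside /gw/ is a provider proxy route.
        return "inference:use"
    return None


def expect_403(key_scopes: Iterable[str], method: str, path: str) -> bool:
    """True when a known required scope is missing from the key's scopes."""
    needed = required_scope(method, path)
    return needed is not None and needed not in set(key_scopes)
```

With a key that holds only inference:use, `expect_403(["inference:use"], "GET", "/gw/usage")` is true, so the call can be skipped or routed to a key that has stats:read.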
Model entitlements
Besides scopes, each key has entitlements: (provider, model_pattern)
glob rules that control which models it may call. Policy is default-deny and
deny-wins: a model is allowed only if some allow rule matches and no deny
rule matches. Calling a non-entitled model returns 403 before the request
ever reaches the provider.
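The default-deny, deny-wins rule can be sketched in a few lines. This is an illustration, not the gateway's implementation: the function name `is_entitled` is hypothetical, the rule shape follows the /gw/me response on this page, and it assumes that the provider is compared exactly, that deny rules use `"effect": "deny"`, and that patterns behave like Python's `fnmatch` globs.

```python
from fnmatch import fnmatchcase


def is_entitled(entitlements: list[dict], provider: str, model: str) -> bool:
    """Default-deny, deny-wins evaluation over (provider, model_pattern) rules."""
    allowed = False
    for rule in entitlements:
        if rule["provider"] != provider:
            continue  # assumption: provider must match exactly
        if not fnmatchcase(model, rule["model_pattern"]):
            continue
        if rule["effect"] == "deny":
            return False  # deny-wins: one matching deny rule settles it
        if rule["effect"] == "allow":
            allowed = True
    return allowed  # default-deny: False unless some allow rule matched


rules = [
    {"provider": "openai", "model_pattern": "gpt-4o*", "effect": "allow"},
    {"provider": "openai", "model_pattern": "gpt-4o-mini*", "effect": "deny"},
]
```

With these sample rules, `gpt-4o` is allowed, `gpt-4o-mini` is denied because the deny rule also matches, and any model with no matching allow rule is denied by default.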
Check what your key can do:

    curl https://api.sociaro.com/gw/me -H "Authorization: Bearer gw_live_..."
    # {"scopes":["inference:use"],"entitlements":[{"provider":"openai","model_pattern":"gpt-4o*","effect":"allow"}]}

Issuing keys for your team
With a keys:manage key you can issue child keys scoped down for each
service or teammate; see POST /gw/keys.
Rules:
- A child key’s scopes and entitlements must fit inside your organization’s ceiling (GET /gw/ceiling); anything beyond it is rejected with 403 and no key is created.
- The plaintext key is returned exactly once; store it immediately.
- Keys are immutable: to change permissions, revoke and issue a new one.
- Revocation (DELETE /gw/keys/{id}) is a soft delete: usage history is preserved. A revoked key may keep working for up to ~30 seconds on a given gateway instance (auth cache TTL).
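Because a request that exceeds the ceiling fails with 403 and creates nothing, it can help to compare scopes locally before issuing a child key. The sketch below covers scopes only and treats them as plain sets. The response shape of GET /gw/ceiling is not shown on this page, so the inputs and the function name `scopes_exceeding_ceiling` are assumptions. Entitlement containment is not attempted here.

```python
def scopes_exceeding_ceiling(requested: set[str], ceiling: set[str]) -> set[str]:
    """Return the requested scopes that fall outside the ceiling (empty if it fits)."""
    return set(requested) - set(ceiling)


# Hypothetical ceiling and child-key request, for illustration only.
ceiling_scopes = {"inference:use", "stats:read"}
excess = scopes_exceeding_ceiling({"inference:use", "keys:manage"}, ceiling_scopes)
if excess:
    print("request would be rejected; scopes beyond the ceiling:", sorted(excess))
```

An empty result means the scopes fit; the gateway remains the authority and may still reject the request on entitlements.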
Key safety
- Treat gw_live_… like any production secret: environment variables or a secret manager, never source control.
- Issue one key per service/environment so revocation is surgical.
- Use the fewest scopes possible: a server that only does inference needs nothing beyond inference:use.
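One way to keep the key out of source control is to read it from the environment at startup and fail fast if it is missing. This is a minimal sketch; the variable name `GATEWAY_API_KEY` and the function `load_gateway_key` are hypothetical choices, not names defined by the gateway.

```python
import os


def load_gateway_key(env_var: str = "GATEWAY_API_KEY") -> str:
    """Read the gateway key from the environment; never hard-code it."""
    key = os.environ.get(env_var, "")
    if not key.startswith("gw_live_"):
        # The key value is deliberately left out of the error message.
        raise RuntimeError(f"{env_var} is unset or does not look like a gateway key")
    return key
```

Using a separate variable, or a separate secret-manager entry, per service and environment keeps revocation limited to the one key that leaked.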