Error contract

Two kinds of errors

Provider errors — forwarded to you as-is (status code and body unchanged). Your existing provider-specific error handling keeps working.
Gateway errors — returned before the request reaches the provider, with a JSON envelope:

{"error": {"type": "<machine-readable>", "message": "<human-readable>"}}

The JSON-envelope guarantee covers only errors the gateway generates itself. Provider errors are forwarded as-is, and some are routed through internal aggregators that may return non-JSON error bodies (plain text, HTML, or an upstream-specific shape). Do not assume every error response is JSON: check the status code first and parse the body defensively.
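A minimal sketch of defensive parsing, assuming you already have the raw response body as bytes. The function name `parse_gateway_envelope` is hypothetical and not part of any SDK. Note that a body matching the envelope shape is not proof of gateway origin, since a provider body may happen to use the same keys.

```python
import json
from typing import Optional, Tuple


def parse_gateway_envelope(body: bytes) -> Optional[Tuple[str, str]]:
    """Return (type, message) if the body matches the gateway envelope, else None.

    Hypothetical helper: it never raises on plain-text, HTML, or empty bodies.
    """
    try:
        doc = json.loads(body)
    except ValueError:
        # Covers invalid JSON and undecodable bytes (both are ValueError subclasses).
        return None
    if not isinstance(doc, dict):
        return None
    err = doc.get("error")
    if not isinstance(err, dict):
        return None
    error_type, message = err.get("type"), err.get("message")
    if isinstance(error_type, str) and isinstance(message, str):
        return error_type, message
    return None
```

When this returns `None`, fall back to the status code and to your provider-specific error handling.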

Gateway status codes

Every gateway error carries a stable type literal. Match on it first, rather than on the human-readable message. The message strings for the auth, scope, entitlement and region cases are also fixed and listed below, so you can match on them when the type alone is ambiguous (for example, to tell the several forbidden cases apart).

Code	Type	When	What to do
`401`	`unauthorized`	Key missing, unknown, revoked, or belongs to a suspended sub-account. Message is always `invalid gateway key` (the gateway never reveals why a key failed)	Check the key and the auth header for this provider
`403`	`forbidden`	Missing scope — message `missing required scope: <scope>` (e.g. `missing required scope: keys:manage`); or model not entitled — message `model not permitted for this client`; or active key limit reached on `POST /gw/keys` — message `active key limit reached for this client: revoke unused keys or ask the operator to raise max_keys`	Check `GET /gw/me`; ask your admin for access. For the key limit: revoke keys you no longer use, or ask your operator to raise the cap
`403`	`region_policy`	Your organization has an `eu_only` routing policy and this provider does not run in the EU. Message: `provider region "<region>" is blocked by the client's eu_only routing policy`	Use an EU provider route — see eu-compliance
`400`	`bad_request`	A request the gateway must reject before forwarding: a model door called with a bare model name or a model of the wrong kind (see Door errors), async CREATE without `model`, an out-of-range `limit`, malformed JSON, an unknown entitlement axis	Fix the request
`404`	`not_found`	A catalog slug is absent from the catalog (model doors); an async task does not exist or belongs to another organization; a key, sub-account, budget target, branding row or Stripe mapping was not found on a `/gw/…` call	Recreate the missing object or fix the slug — see Door errors
`409`	`conflict`	A `/gw/…` create collides with an existing unique value: `sub_account slug already exists`, or `portal_slug already exists`	Choose a different slug
`413`	`request_too_large`	The request body exceeds `server.max_request_bytes` (10 MiB in production). Message: `request body too large`. The model doors buffer the whole body to rewrite the `model` field, so the cap applies there too	Split the request or reduce the payload
`429`	`rate_limited`	Per-key RPS or concurrency limit exceeded. Message `rate limit exceeded`; `Retry-After: 1` is set	Back off and retry
`402`	`budget_exceeded`	A budget at any applicable level (organization, sub-account or key) is exhausted while budget enforcement is on. Message: `budget limit reached`	Contact your admin
`402`	`insufficient_balance`	The client’s cumulative prepay/postpay balance is exhausted (manual-billing clients, when balance enforcement is on). Distinct from `budget_exceeded`, which is a per-period limit. Message: `account balance exhausted; top up to continue`	Top up the balance (your operator credits it); see billing
`410`	`gone`	An async task can no longer be finalized because the upstream credential rotated after the task was created. Message: `credential rotated after task creation; re-issue the task`	Re-create the job
`503`	`unavailable`	Transient failure finalizing an async task, or an upstream billing call (Stripe) was unreachable	Retry the request (for an async task, retry the poll)
`502`	`bad_gateway`	Gateway-originated, with envelope. A catalog slug routes to a provider that is not in the running gateway config (a catalog↔provider desync on the operator side). Message: `inference route for model "<slug>" is not available`	Not a client error — quote the `X-Gateway-Request-Id` and contact support
`500`	`internal` / `internal_error`	An unexpected gateway-side fault (details stay server-side). The model doors and provider proxy emit type `internal` (e.g. message `internal error` or `credential resolution failed`); the `/gw/*` self-service API emits `internal_error`	Retry; if it persists, quote the request id
`502` / `504`	— (no envelope)	Transport-level failure between the gateway and the provider: the upstream was unreachable, the connection or TLS handshake failed, request signing failed, or the provider timed out. The reverse proxy emits a bare `502`/`504` with no JSON body	Retry with backoff; consider SDK fallback chains

The two 502 rows are distinct. A gateway-originated 502 carries the standard envelope {"error":{"type":"bad_gateway","message":…}} and means the catalog and the running provider config disagree (an operator-side desync), not a network problem. A transport-level 502/504 comes from the reverse proxy when the request never completed against the upstream; it has no JSON body, so do not try to parse one.
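The status table and the 502 distinction above can be sketched as a single retry decision. This is an illustration under stated assumptions, not the SDK's retry logic: the function name, the `base` and `cap` defaults, and the choice to treat only envelope-less `502`/`504` as transport failures are all hypothetical. `error_type` is the envelope's `type`, or `None` when the body carried no envelope.

```python
from typing import Mapping, Optional

# Gateway error types whose "What to do" column says to retry.
RETRYABLE_TYPES = frozenset({"rate_limited", "unavailable", "internal", "internal_error"})
# Bare statuses the reverse proxy emits on transport failure (no JSON body).
TRANSPORT_STATUSES = frozenset({502, 504})


def retry_delay_seconds(
    status: int,
    error_type: Optional[str],
    headers: Mapping[str, str],
    attempt: int,
    base: float = 0.5,   # assumed starting delay, not a documented value
    cap: float = 30.0,   # assumed upper bound, not a documented value
) -> Optional[float]:
    """Return how long to wait before retrying, or None if retrying will not help."""
    backoff = min(cap, base * 2 ** attempt)
    if error_type is None:
        # No envelope: retry only transport-level 502/504; leave other
        # envelope-less errors to provider-specific handling.
        return backoff if status in TRANSPORT_STATUSES else None
    if error_type not in RETRYABLE_TYPES:
        # Includes bad_gateway: an operator-side desync that a retry will not fix.
        return None
    if error_type == "rate_limited":
        for name, value in headers.items():
            if name.lower() == "retry-after":
                try:
                    return max(float(value), 0.0)
                except ValueError:
                    break  # not a number of seconds; fall back to backoff
    return backoff
```

A production client would usually add random jitter to the backoff so that many clients do not retry in lockstep.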

Door errors

The OpenAI- and Anthropic-shaped endpoints — POST /v1/chat/completions, POST /v1/messages, POST /v1/images/generations, POST /v1/videos (+ GET /v1/videos/{id} to poll), POST /v1/embeddings, POST /v1/reranks, POST /v1/audio/speech — are the model doors. They resolve the model field as a catalog slug (author/model) before dispatching. The model must be a slug from GET /v1/models, not a provider-native name. These are the validation errors a door returns, with their exact messages:

Code	Type	Message (with `model` interpolated)	Cause
`400`	`bad_request`	`model "gpt-4o" is not a catalog slug; use 'author/model' (see GET /v1/models)`	You sent a bare model name. Use the catalog slug, e.g. `openai/gpt-4o`
`404`	`not_found`	`model "openai/nope" not found in catalog (see GET /v1/models)`	The slug is well-formed but not in the catalog
`400`	`bad_request`	`model "alibaba/qwen-image-2.0" has kind "image"; this endpoint serves only llm models`	The slug is real but its kind does not match this door (e.g. an image model sent to `/v1/chat/completions`). Each door serves exactly one kind: `llm`, `image`, `video`, `embedding`, `rerank`, or `audio`
`400`	`bad_request`	`model openai/gpt-4o is not served by this endpoint; use /v1/messages or the sociaro SDK`	The model’s provider has no compatible path for this door (e.g. an Anthropic-only model in the OpenAI-shaped `/v1/chat/completions`). The hint names the right door; cross-format translation lives in the Python SDK
`413`	`request_too_large`	`request body too large`	Body over `server.max_request_bytes` (10 MiB) — the door buffers the whole body to rewrite `model`
`502`	`bad_gateway`	`inference route for model "openai/gpt-4o" is not available`	The slug resolves to a provider that is not in the running gateway config — an operator-side desync, not your error

The recovery for every door 400/404 case starts the same way: list valid slugs with GET /v1/models, then send one that the door you are calling serves.

# Wrong: bare provider model name → 400 bad_request
curl https://api.sociaro.com/v1/chat/completions \
  -H "Authorization: Bearer $GW_KEY" -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"hi"}]}'
# {"error":{"type":"bad_request","message":"model \"gpt-4o\" is not a catalog slug; use 'author/model' (see GET /v1/models)"}}

# Right: catalog slug
curl https://api.sociaro.com/v1/chat/completions \
  -H "Authorization: Bearer $GW_KEY" -H "Content-Type: application/json" \
  -d '{"model":"openai/gpt-4o","messages":[{"role":"user","content":"hi"}]}'
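A client can catch the bare-name mistake before sending the request. The sketch below checks only the `author/model` shape. The function name is hypothetical, and the gateway's exact slug grammar is not specified here, so treat the rule (a non-empty author and a non-empty model name separated by `/`) as an assumption. A slug that passes can still be absent from the catalog or be of the wrong kind for the door.

```python
def looks_like_catalog_slug(model: str) -> bool:
    """Cheap pre-flight check for the 'author/model' shape (assumed grammar)."""
    author, separator, name = model.partition("/")
    # A bare name has no separator; empty halves are rejected as well.
    return bool(separator) and bool(author) and bool(name)
```

For example, `looks_like_catalog_slug("gpt-4o")` is false, while `looks_like_catalog_slug("openai/gpt-4o")` is true.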

The native provider-prefixed routes (/openai/…, /anthropic/…) do not go through the catalog; they pass model through unchanged, so the door errors above do not apply there.

Debugging

Every response carries X-Gateway-Request-Id. Quote it in support requests — it pins down the exact request in logs and usage records.
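One way to make sure the request id always reaches your logs is to pull it out of the response headers in a single place. The helper names below are hypothetical. The lookup is case-insensitive because HTTP header names are, and some clients normalize their casing.

```python
from typing import Mapping, Optional


def gateway_request_id(headers: Mapping[str, str]) -> Optional[str]:
    """Return the X-Gateway-Request-Id header value, or None if absent."""
    for name, value in headers.items():
        if name.lower() == "x-gateway-request-id":
            return value
    return None


def support_line(status: int, headers: Mapping[str, str]) -> str:
    """Format a one-line summary suitable for a log entry or support ticket."""
    request_id = gateway_request_id(headers) or "<missing>"
    return f"status={status} X-Gateway-Request-Id={request_id}"
```

Logging this line for every failed call means the id is already at hand when you contact support.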

The Python SDK maps this contract onto a typed exception hierarchy (AuthError, RateLimitError, ProviderUnavailableError, …) and implements automatic retries and provider fallback.