Gateway API

The Gateway API is the public surface of ai_gateway projects. Two routes:

| Route | Use |
|---|---|
| POST /api/{uuid}/{slug} | Direct call to a single endpoint by slug. |
| POST /api/{uuid}/chat/completions | Generic dispatcher; the request body picks the endpoint by slug. |

Both require chat scope.

The two routes are functionally equivalent; the second is more familiar to OpenAI users (one URL, multiple endpoints) and is what you’d use if a downstream tool expects a single OpenAI-style URL but you want endpoint routing.

POST /api/{uuid}/{slug}
Content-Type: application/json
Authorization: Bearer pg_live_...

Body — single message:

{ "message": "Explain quantum computing." }

Or full conversation:

{
  "messages": [
    { "role": "system", "content": "You are a teacher." },
    { "role": "user", "content": "What's a qubit?" }
  ]
}

Optional fields:

| Field | Type | Default | Notes |
|---|---|---|---|
| stream | bool | false | Honoured only if the endpoint has streaming enabled. |
| session_id | string (uuid) | absent | Resume an existing session. |
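For example, a request that resumes an earlier session and asks for a streamed reply (the message text and session ID below are illustrative):

{
  "message": "And how does that differ from a classical bit?",
  "stream": true,
  "session_id": "0b6f2c1e-7d4a-4b9f-8c3e-5a1d2e3f4b6c"
}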
A successful (200) response:

{
  "ok": true,
  "id": "chatcmpl-abcd1234",
  "model": "gpt-4o-mini",
  "content": "A qubit is the quantum analogue of a classical bit...",
  "finish_reason": "stop",
  "usage": {
    "prompt_tokens": 38,
    "completion_tokens": 87,
    "total_tokens": 125
  },
  "meta": {
    "request_id": "0e2f...c4",
    "session_id": "..." // only when a session is created/resumed
  }
}
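Reading the fields above from the decoded JSON, a minimal sketch (the dict mirrors the example response; session_id is simply absent when no session is involved, hence the .get):

```python
response = {
    "ok": True,
    "id": "chatcmpl-abcd1234",
    "model": "gpt-4o-mini",
    "content": "A qubit is the quantum analogue of a classical bit...",
    "finish_reason": "stop",
    "usage": {"prompt_tokens": 38, "completion_tokens": 87, "total_tokens": 125},
    "meta": {"request_id": "0e2f...c4"},  # no session in this call
}

if response["ok"]:
    answer = response["content"]
    total = response["usage"]["total_tokens"]     # handy for cost tracking
    session = response["meta"].get("session_id")  # None unless created/resumed
```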

With "stream": true, the response is OpenAI-style SSE chunks instead. See Streaming.

curl:

curl -X POST $PG_URL/api/$PG_UUID/$PG_SLUG \
  -H "Authorization: Bearer $PG_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"message": "Explain quantum computing."}'

Python:

import os
import requests

resp = requests.post(
    f"{os.environ['PG_URL']}/api/{os.environ['PG_UUID']}/{os.environ['PG_SLUG']}",
    headers={"Authorization": f"Bearer {os.environ['PG_TOKEN']}"},
    json={"message": "Explain quantum computing."},
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["content"])

JavaScript:

const r = await fetch(`${process.env.PG_URL}/api/${process.env.PG_UUID}/${process.env.PG_SLUG}`, {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${process.env.PG_TOKEN}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({ message: 'Explain quantum computing.' }),
});
if (!r.ok) throw new Error(`Gateway error ${r.status}`);
console.log((await r.json()).content);

Same as Route 1, but the endpoint slug goes in the body:

{
  "endpoint": "summarize",
  "messages": [
    { "role": "user", "content": "Long text..." }
  ]
}

Useful when you have a tool that talks to a single URL (POST .../chat/completions) but you want to route across multiple endpoints in the project.

curl -X POST $PG_URL/api/$PG_UUID/chat/completions \
  -H "Authorization: Bearer $PG_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "endpoint": "summarize",
    "messages": [{"role":"user","content":"Long text..."}]
  }'

The response shape is identical to Route 1.
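The same call from Python, mirroring the Route 1 example (a sketch; the env var names follow the earlier examples, and the payload-builder helper is illustrative):

```python
def build_dispatch_payload(endpoint: str, messages: list) -> dict:
    """Build the dispatcher body: the endpoint slug rides inside the payload."""
    return {"endpoint": endpoint, "messages": messages}

payload = build_dispatch_payload(
    "summarize", [{"role": "user", "content": "Long text..."}]
)

if __name__ == "__main__":
    import os
    import requests  # same dependency as the Route 1 example

    resp = requests.post(
        f"{os.environ['PG_URL']}/api/{os.environ['PG_UUID']}/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['PG_TOKEN']}"},
        json=payload,
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["content"])
```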

| Situation | Status | Body |
|---|---|---|
| Success | 200 | {ok: true, ...} |
| No bearer token | 401 | {ok: false, error: "Missing Bearer token."} |
| Token from different project | 403 | {ok: false, error: "Token does not belong to this project."} |
| Wrong scope | 403 | scope mismatch message |
| Endpoint slug not found / inactive | 404 | endpoint-not-found message |
| Per-minute rate limit hit | 429 | + Retry-After header |
| Per-request token cap exceeded | 422 | budget message |
| Monthly budget exhausted | 422 | budget message |
| Guardrail blocked | 422 | which guardrail |
| Input schema invalid | 422 | validator detail |
| Provider error after failover | 502 | provider error |
| Output schema invalid | 502 | validator detail |
| Wrong project type for this surface | 400 | type mismatch |
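On the client side, these statuses split into a few coarse actions. A minimal sketch (the retry policy here is an assumption of this example, not something the gateway mandates):

```python
RETRYABLE = {429, 502}  # rate limit (honour Retry-After) and provider/output errors

def classify(status: int) -> str:
    """Map a gateway status code to a coarse client-side action."""
    if status == 200:
        return "ok"
    if status in (401, 403):
        return "fix-credentials"  # missing token, wrong project, or wrong scope
    if status in (400, 404, 422):
        return "fix-request"      # bad route/type, schema failure, budget or guardrail block
    if status in RETRYABLE:
        return "retry"
    return "unknown"
```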

Request → auth → rate limit → budget → guardrails → input schema → prompt apply → provider call (with failover) → output schema → log → response.

So a request that fails an early step never pays the cost of later steps.
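The early-exit behaviour can be sketched as a chain of checks that short-circuits on the first failure (stage names follow the pipeline above; the checks themselves are illustrative):

```python
def run_pipeline(request: dict, stages: list) -> dict:
    """Run (name, check) stages in order; the first failure stops the chain.

    Each check(request) returns an error string, or None to continue.
    Later, more expensive stages never run once an earlier stage fails.
    """
    for name, check in stages:
        error = check(request)
        if error is not None:
            return {"ok": False, "stage": name, "error": error}
    return {"ok": True}

# Illustrative stages: with no token, auth fails and nothing later runs.
stages = [
    ("auth", lambda r: None if r.get("token") else "Missing Bearer token."),
    ("rate_limit", lambda r: None),  # placeholder for a real counter check
]
```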

Single message: just pass message. PromptGate wraps it as [{role: "user", content: ...}].

Client-managed history: pass the full messages array. The client manages the transcript.

Sessions: pass session_id plus the new user message. The gateway stores the history and prepends it on the next call.

Streaming: pass "stream": true. The endpoint must have streaming_enabled=true; the response is SSE.
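A sketch of consuming the stream from Python with requests, assuming OpenAI-style SSE lines (data: {...} chunks with a choices[0].delta shape, ending in data: [DONE]); the parsing helper is illustrative:

```python
import json

def parse_sse_line(line: str):
    """Return the JSON payload of one SSE data line, or None for anything else."""
    if not line.startswith("data: "):
        return None  # blank keep-alives and comments carry no payload
    data = line[len("data: "):]
    if data == "[DONE]":
        return None  # end-of-stream sentinel
    return json.loads(data)

if __name__ == "__main__":
    import os
    import requests

    with requests.post(
        f"{os.environ['PG_URL']}/api/{os.environ['PG_UUID']}/{os.environ['PG_SLUG']}",
        headers={"Authorization": f"Bearer {os.environ['PG_TOKEN']}"},
        json={"message": "Explain quantum computing.", "stream": True},
        stream=True,  # let requests yield the body incrementally
        timeout=120,
    ) as resp:
        resp.raise_for_status()
        for raw in resp.iter_lines(decode_unicode=True):
            chunk = parse_sse_line(raw or "")
            if chunk is not None:
                print(chunk["choices"][0]["delta"].get("content", ""), end="")
```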


Next: Wrapper API.


© Akyros Labs LLC. All rights reserved.