# Wrapper API
The Wrapper API is the public surface of `ai_wrapper` projects. It mimics OpenAI's REST shape, so any OpenAI-compatible SDK works without modification.
| Route | Use |
|---|---|
| `POST /api/{uuid}/v1/chat/completions` | Chat completion; the model identifier picks the upstream. |
| `GET /api/{uuid}/v1/models` | Discover aliases + `provider:*` placeholders. |

Both routes require the `chat` scope.
## POST /v1/chat/completions

### Request

```
POST /api/{uuid}/v1/chat/completions
Content-Type: application/json
Authorization: Bearer pg_live_...
```

Body, same shape as OpenAI:

```json
{
  "model": "fast",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 500,
  "stream": false
}
```

Where `model` is one of:
- A namespaced alias: `promptgate:<name>` (preferred), e.g. `promptgate:fast`, `promptgate:summarizer`.
- A bare alias: `<name>` (backward compatible), e.g. `fast`, `summarizer`.
- A preset: any alias that also bakes in a `system_prompt` plus sampling defaults. Same `promptgate:<name>` form.
- A `provider:model` pair, e.g. `openai:gpt-4o-mini`, `anthropic:claude-3-5-sonnet-20241022`.
Aliases resolve via `wrapper_aliases`. A `provider:model` pair uses the project's `wrapper_provider_settings.credential_id` for that provider.

When the alias has preset config (`system_prompt` / `temperature` / `top_p` / `max_tokens`), the wrapper merges those defaults into the request. Client-supplied values win; see AI Wrapper → Aliases as presets.
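The precedence rule can be sketched in a few lines. This is an illustrative model of the merge, not the wrapper's actual code; the `apply_preset` helper and its exact mechanics are assumptions based on the description above.

```python
# Illustrative sketch of preset merging; names and mechanics are assumptions.
SAMPLING_KEYS = ("temperature", "top_p", "max_tokens")

def apply_preset(preset: dict, body: dict) -> dict:
    merged = dict(body)  # shallow copy; the client's request is left untouched
    # Preset sampling defaults fill only the keys the client omitted.
    for key in SAMPLING_KEYS:
        if key in preset and key not in merged:
            merged[key] = preset[key]
    # Prepend the preset system prompt when the request carries no system
    # message (assumed behaviour; see AI Wrapper → Aliases as presets).
    messages = list(merged.get("messages", []))
    if preset.get("system_prompt") and not any(
        m.get("role") == "system" for m in messages
    ):
        messages.insert(0, {"role": "system", "content": preset["system_prompt"]})
    merged["messages"] = messages
    return merged
```

Under this rule, a request that sets `temperature` keeps its own value while still picking up the preset's `max_tokens`.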
### Response

OpenAI-compatible:

```json
{
  "id": "chatcmpl-abcd",
  "object": "chat.completion",
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 9,
    "total_tokens": 29
  }
}
```

Note: `model` in the response is the resolved identifier, not the alias. So requests with `"model": "fast"` come back with `"model": "gpt-4o-mini"`.
### Streaming

Pass `"stream": true`. Returns OpenAI-style SSE. See Streaming.
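For orientation, OpenAI-style SSE chunks look roughly like the following; the id and token boundaries here are illustrative, not captured output:

```
data: {"id":"chatcmpl-abcd","object":"chat.completion.chunk","model":"gpt-4o-mini","choices":[{"index":0,"delta":{"role":"assistant","content":"Hel"},"finish_reason":null}]}

data: {"id":"chatcmpl-abcd","object":"chat.completion.chunk","model":"gpt-4o-mini","choices":[{"index":0,"delta":{"content":"lo!"},"finish_reason":null}]}

data: {"id":"chatcmpl-abcd","object":"chat.completion.chunk","model":"gpt-4o-mini","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```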
## GET /v1/models

### Request

```
GET /api/{uuid}/v1/models
Authorization: Bearer pg_live_...
```

### Response

```json
{
  "object": "list",
  "data": [
    {
      "id": "fast",
      "object": "model",
      "owned_by": "promptgate",
      "is_alias": true,
      "resolved": { "provider": "openai", "model": "gpt-4o-mini" }
    },
    {
      "id": "smart",
      "object": "model",
      "owned_by": "promptgate",
      "is_alias": true,
      "resolved": { "provider": "anthropic", "model": "claude-3-5-sonnet-20241022" }
    },
    { "id": "openai:*", "object": "model", "owned_by": "promptgate", "is_alias": false },
    { "id": "anthropic:*", "object": "model", "owned_by": "promptgate", "is_alias": false }
  ]
}
```

`is_alias=true` rows are aliases the project has defined. `is_alias=false` rows are placeholders for enabled providers; clients can use any of that provider's models with the `provider:model` syntax.
The custom `resolved` field on alias rows tells you what the alias points to, which is useful for client UIs that want to show "fast = gpt-4o-mini".
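A client UI can build those labels directly from the list payload. A minimal sketch; the `alias_labels` helper is ours, not part of any SDK, and the sample payload is trimmed from the response above:

```python
def alias_labels(models_payload: dict) -> dict:
    """Map each alias id to a 'provider:model' label using the `resolved` field."""
    return {
        row["id"]: f"{row['resolved']['provider']}:{row['resolved']['model']}"
        for row in models_payload.get("data", [])
        if row.get("is_alias") and "resolved" in row
    }

# Trimmed example payload from GET /v1/models:
payload = {
    "object": "list",
    "data": [
        {"id": "fast", "object": "model", "owned_by": "promptgate",
         "is_alias": True,
         "resolved": {"provider": "openai", "model": "gpt-4o-mini"}},
        {"id": "openai:*", "object": "model", "owned_by": "promptgate",
         "is_alias": False},  # placeholder rows carry no `resolved` field
    ],
}
print(alias_labels(payload))  # {'fast': 'openai:gpt-4o-mini'}
```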
## OpenAI SDK compatibility

The wrapper's request and response shapes are wire-compatible with OpenAI's, so you can use the OpenAI SDKs unchanged:
### Python

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url=f"{os.environ['PG_URL']}/api/{os.environ['PG_UUID']}/v1",
    api_key=os.environ['PG_TOKEN'],
)

# Non-streaming
resp = client.chat.completions.create(
    model="fast",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="smart",
    messages=[{"role": "user", "content": "Write a haiku about Mondays."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

### Node.js
```js
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: `${process.env.PG_URL}/api/${process.env.PG_UUID}/v1`,
  apiKey: process.env.PG_TOKEN,
});

const resp = await client.chat.completions.create({
  model: 'fast',
  messages: [{ role: 'user', content: 'Hello!' }],
});
console.log(resp.choices[0].message.content);
```

### LangChain (Python)
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url=f"{PG_URL}/api/{PG_UUID}/v1",
    api_key=PG_TOKEN,
    model="fast",
)

print(llm.invoke("Hello!").content)
```

### Vercel AI SDK (Node)
```js
import { createOpenAI } from '@ai-sdk/openai';
import { generateText } from 'ai';

const promptgate = createOpenAI({
  baseURL: `${PG_URL}/api/${PG_UUID}/v1`,
  apiKey: PG_TOKEN,
});

const { text } = await generateText({
  model: promptgate('fast'),
  prompt: 'Hello!',
});
```

## Behaviour reference
| Situation | Status |
|---|---|
| Success | 200 |
| Unknown alias | 404 |
| `provider:model` with provider not enabled in this wrapper | 400 |
| Wrong project type | 400 |
| No bearer token | 401 |
| Wrong scope | 403 |
| Token from different project | 403 |
| Provider call fails after failover (if any) | 502 |
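For client-side error handling, the table collapses into a small helper. The category names here are illustrative, not part of the API; note that 400 and 403 each cover two situations, so the response body is needed to tell them apart.

```python
def classify_status(code: int) -> str:
    """Coarse client-side category for the wrapper's documented status codes."""
    return {
        200: "ok",
        400: "bad_request",       # provider not enabled, or wrong project type
        401: "unauthenticated",   # no bearer token
        403: "forbidden",         # wrong scope, or token from a different project
        404: "unknown_alias",
        502: "upstream_failure",  # provider call failed after failover (if any)
    }.get(code, "unexpected")
```

Of these, only `upstream_failure` (502) is plausibly transient and worth retrying; the 4xx categories indicate a configuration or credential problem.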
## Differences from real OpenAI

- No `n > 1`: only one choice is returned per request.
- No `function_call` / `tools`: function calling is on the roadmap.
- No `logprobs`: not exposed.
- `model` in the response is the resolved identifier, not the alias the client sent.
- Token counts come from the provider's response and may differ slightly from OpenAI's tokenizer (e.g. Anthropic returns its own counts).
For most production usage these don’t matter; for advanced features that aren’t yet supported, check back or use the AI Gateway directly with provider-specific endpoints.
Next: Proxy API.
© Akyros Labs LLC. All rights reserved.