Agent Proxy
The Agent Proxy project type is PromptGate framed for one job: every AI request leaving your laptop, your team, or your company flows through a single point. From there you get cost dashboards, audit logs, secret scanning, reversible PII redaction, and per-user tokens — without changing a line of code in the agents themselves.
When you want this
- Your team uses Claude Code + Cursor + Aider + an internal Python script built on OpenAI’s SDK, and you want one cost dashboard for all of them.
- Compliance / legal asks “what data has left our network for an LLM provider in the last 30 days?” and “we don’t know” is no longer an acceptable answer.
- You want each developer to have their own API token with their own budget so a runaway script doesn’t burn the team budget.
- You want secrets (AWS keys, GitHub PATs, JWTs) to be caught before a copy-paste accident ships them to OpenAI.
- You want PII tokenization so the LLM never sees your customer emails, but your developers do.
The four API shapes
The Agent Proxy speaks every shape today’s coding agents speak. Pick the shape your tool uses and point it at PromptGate via that tool’s environment variable:
| Shape | Endpoint | Used by |
|---|---|---|
| OpenAI Chat Completions | `POST /api/{uuid}/v1/chat/completions` | Cursor, Aider, Continue, Cline, OpenAI SDK |
| OpenAI Responses | `POST /api/{uuid}/v1/responses` | Codex CLI, agentic OpenAI clients |
| OpenAI Embeddings | `POST /api/{uuid}/v1/embeddings` | RAG pipelines (LangChain, LlamaIndex, etc.) |
| Anthropic Messages | `POST /api/{uuid}/v1/messages` | Claude Code, Anthropic SDK, claudette |
Internally everything maps to PromptGate’s canonical OpenAI-Chat shape; the controllers translate to and from each public shape, so the wrapper resolver, providers, guardrails, response cache, and gateway logs stay unified.
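To make the canonical-shape idea concrete, here is a minimal sketch of what the Anthropic-to-OpenAI translation step might look like. The function name and field handling are assumptions for illustration, not PromptGate's actual controller code:

```python
# Hypothetical sketch: mapping an Anthropic Messages payload onto the
# canonical OpenAI Chat shape. Field names on the PromptGate side are assumed.

def anthropic_to_openai_chat(body: dict) -> dict:
    """Translate an Anthropic /v1/messages body to OpenAI chat.completions form."""
    messages = []
    # Anthropic carries the system prompt as a top-level field;
    # the OpenAI shape expects it as the first message.
    if "system" in body:
        messages.append({"role": "system", "content": body["system"]})
    messages.extend(body.get("messages", []))
    return {
        "model": body["model"],
        "messages": messages,
        "max_tokens": body.get("max_tokens"),
    }

canonical = anthropic_to_openai_chat({
    "model": "claude-sonnet-4",
    "max_tokens": 1024,
    "system": "You are terse.",
    "messages": [{"role": "user", "content": "Hi"}],
})
print(canonical["messages"][0]["role"])  # system
```

The reverse translation happens on the response path, which is why guardrails and gateway logs only ever have to understand one shape.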
Cross-provider routing
The `model` field in any request is resolved through the same wrapper resolver used by `ai_wrapper` projects. That means you can:
- Send an Anthropic-shaped `POST /v1/messages` with `model: "openai:gpt-4o-mini"` → the request gets translated to OpenAI shape, the response gets translated back to Anthropic shape, and Claude Code never knows.
- Define a wrapper alias `cheap-fast` that points at `openai:gpt-4o-mini` today and `mistral:mistral-small` tomorrow. The agents keep calling `cheap-fast`; you swap the backing model without redeploying anything.
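The resolution logic behind those two behaviors can be sketched in a few lines. The alias table and default-provider fallback here are illustrative assumptions; only the `provider:model` string convention comes from the doc:

```python
# Hypothetical sketch of wrapper-alias resolution. The alias table would live
# in PromptGate's config; "provider:model" is the convention described above.

ALIASES = {"cheap-fast": "openai:gpt-4o-mini"}  # swap the target, not the agents

def resolve_model(requested: str, default_provider: str = "openai") -> tuple[str, str]:
    """Resolve a model string (or alias) to a (provider, model) pair."""
    target = ALIASES.get(requested, requested)
    if ":" in target:
        provider, _, model = target.partition(":")
        return provider, model
    return default_provider, target

print(resolve_model("cheap-fast"))          # ('openai', 'gpt-4o-mini')
print(resolve_model("openai:gpt-4o-mini"))  # ('openai', 'gpt-4o-mini')
```

Repointing `cheap-fast` at another provider is then a one-entry change in the table, with no agent-side redeploy.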
Default-secure guardrails
Two guardrails are intended for Agent Proxy projects in particular:
- Reversible Redaction — emails / phones / IBANs / SSNs / IPs / custom-regex matches are tokenized before the LLM call (the LLM sees `[[EMAIL_001]]`, never `john@acme.com`) and the substitution is reversed in the response so the user gets their real data back.
- Secret Scanner — AWS keys, GitHub PATs, Slack tokens, OpenAI keys, JWTs, private keys, npm/PyPI tokens. Block mode rejects with 422; redact mode tokenizes via Reversible Redaction.
Both layer on top of the existing PII Filter / Prompt Injection / Keyword Blocklist / Content Length guardrails — they’re additive, not replacements.
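The tokenize-then-reverse mechanic is easiest to see in miniature. This is a minimal sketch covering only emails, assuming a simple token format like the `[[EMAIL_001]]` example above; the real guardrail covers many more PII types:

```python
import re

# Minimal sketch of reversible redaction: emails are tokenized before the
# LLM call and restored in the response. Token format mirrors the doc's example.

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact(text: str) -> tuple[str, dict[str, str]]:
    """Replace each email with a token; return redacted text and the mapping."""
    mapping: dict[str, str] = {}
    def _sub(m: re.Match) -> str:
        token = f"[[EMAIL_{len(mapping) + 1:03d}]]"
        mapping[token] = m.group(0)
        return token
    return EMAIL_RE.sub(_sub, text), mapping

def restore(text: str, mapping: dict[str, str]) -> str:
    """Reverse the substitution on the LLM's response."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

prompt, mapping = redact("Contact john@acme.com about the invoice.")
print(prompt)  # Contact [[EMAIL_001]] about the invoice.
assert restore(prompt, mapping) == "Contact john@acme.com about the invoice."
```

The key property is that the mapping never leaves the proxy: the provider only ever sees tokens, while developers see real data.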
- Create a project with type Agent Proxy under `/projects`.
- Click Setup in the sidebar — that page lists every supported tool with the exact env var to set.
- Issue an API token with the `chat` scope under API Tokens.
- Drop the env var on each developer’s machine; their next request flows through PromptGate.
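For the last step, the env vars look roughly like this. The host and placeholder values are illustrative (the Setup page shows the exact URL for your project); `OPENAI_BASE_URL`/`OPENAI_API_KEY` are the variables the OpenAI SDK reads, and `ANTHROPIC_BASE_URL`/`ANTHROPIC_AUTH_TOKEN` are the ones Claude Code reads:

```shell
# Illustrative values — copy the real ones from your project's Setup page.
export OPENAI_BASE_URL="https://promptgate.example.com/api/<project-uuid>/v1"
export OPENAI_API_KEY="<your-promptgate-token>"
export ANTHROPIC_BASE_URL="https://promptgate.example.com/api/<project-uuid>/v1"
export ANTHROPIC_AUTH_TOKEN="<your-promptgate-token>"
```

Each developer gets their own token, so per-user budgets and audit trails work without any changes to the tools themselves.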
The Setup page also shows a “Connected Agents · last 7 days” panel inferred from gateway_logs — once traffic is flowing you’ll see “12 Cursor / 8 Claude Code / 4 Aider” without manual instrumentation.
© Akyros Labs LLC. All rights reserved.