Policy Gateway Integration: request/response contract
This is the contract for integrating Policy Gateway. If you already send OpenAI-style chat completions, you can send Policy Gateway requests with one endpoint swap and a policy_id.
Use the optional headers to unlock quotas, budgets, and rollout targeting for production traffic.
Quick start
Base URL
Example request
curl https://api.abliteration.ai/policy/chat/completions \
-H "Authorization: Bearer $POLICY_KEY" \
-H "Content-Type: application/json" \
-H "X-Policy-User: user-12345" \
-H "X-Policy-Project: support-bot" \
-H "X-Policy-Target: support-bot" \
-d '{
"model": "abliterated-model",
"messages": [{ "role": "user", "content": "Summarize our refund policy." }],
"policy_id": "policy-gateway"
}'
Service notes
- Pricing model: Usage-based pricing (~$5 per 1M tokens) billed on total tokens (input + output). See the API pricing page for current plans.
- Data retention: No prompt/output retention by default. Operational telemetry (token counts, timestamps, error codes) is retained for billing and reliability.
- Compatibility: OpenAI-style /v1/chat/completions request and response format with a base URL switch.
- Latency: Depends on model size, prompt length, and load. Streaming reduces time-to-first-token.
- Throughput: Team plans include priority throughput. Actual throughput varies with demand.
- Rate limits: Limits vary by plan and load. Handle 429s with backoff and respect any Retry-After header.
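The pricing and rate-limit notes above can be sketched as two small helpers. This is a minimal illustration, not an official client: `estimate_cost_usd` and `retry_delay` are hypothetical names, the flat rate is the approximate figure quoted above, and the exponential fallback (base delay, cap) is an assumed pacing choice rather than documented service behavior.

```python
from typing import Optional

# Assumption: the approximate flat rate quoted in the notes; real plans may differ.
APPROX_USD_PER_MILLION_TOKENS = 5.0


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate cost from total tokens (input + output) at the approximate rate."""
    total_tokens = prompt_tokens + completion_tokens
    return total_tokens * APPROX_USD_PER_MILLION_TOKENS / 1_000_000


def retry_delay(attempt: int, retry_after: Optional[str] = None,
                base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before retrying after a 429 (attempt starts at 0).

    Prefers a numeric Retry-After value; otherwise falls back to capped
    exponential backoff: min(cap, base * 2**attempt).
    """
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            # Retry-After can also be an HTTP date; this sketch ignores that form.
            pass
    return min(cap, base * (2 ** attempt))
```

In a retry loop, `time.sleep(retry_delay(attempt, response_headers.get("Retry-After")))` would pace each attempt; adding random jitter is a common refinement that is left out here for brevity.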
Request contract
Send a standard chat completion payload with a policy_id. Optional headers allow per-user quotas, per-project budgets, and rollout targeting.
| Field | Details |
|---|---|
| Endpoint | POST /policy/chat/completions |
| Auth | Authorization: Bearer <policy key> |
| Required fields | model, messages, policy_id |
| Optional but recommended | X-Policy-User or policy_user (quotas + per-user audit); X-Policy-Project or policy_project_id (budgets + per-project audit); X-Policy-Target or policy_target (rollout targeting) |
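The table above can be turned into a small request builder. This is a sketch under stated assumptions: `build_policy_request` is a hypothetical helper, the URL is copied from the quick-start curl example, and the optional metadata is sent as headers only (the body-field alternatives such as policy_user are not used here).

```python
import json
from typing import Dict, List, Optional, Tuple

# Endpoint taken from the quick-start curl example.
POLICY_URL = "https://api.abliteration.ai/policy/chat/completions"


def build_policy_request(
    api_key: str,
    model: str,
    messages: List[Dict[str, str]],
    policy_id: str,
    user: Optional[str] = None,
    project: Optional[str] = None,
    target: Optional[str] = None,
) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for POST /policy/chat/completions."""
    if not (model and messages and policy_id):
        raise ValueError("model, messages, and policy_id are required")

    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    # Optional headers are only sent when a value is supplied.
    optional = {
        "X-Policy-User": user,
        "X-Policy-Project": project,
        "X-Policy-Target": target,
    }
    headers.update({name: value for name, value in optional.items() if value})

    body = {"model": model, "messages": messages, "policy_id": policy_id}
    return headers, json.dumps(body).encode("utf-8")
```

The returned pair can be handed to any HTTP client, for example `urllib.request.Request(POLICY_URL, data=body, headers=headers, method="POST")`.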
Response contract
Responses include the model output plus policy metadata for audits and debugging. The policy block mirrors the Policy Gateway console fields.
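One way a client might pull the audit-relevant fields out of a parsed response is sketched here. `summarize_policy` is a hypothetical helper; the field names follow the example response in this section, and because it is an assumption that every field is always present, the code reads them defensively with `.get`.

```python
from typing import Any, Dict


def summarize_policy(response: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten the policy block and token usage into one audit record."""
    policy = response.get("policy", {})
    usage = response.get("usage", {})
    return {
        "request_id": response.get("id"),
        "policy_id": policy.get("policy_id"),
        "decision": policy.get("decision"),
        "effective_decision": policy.get("effective_decision"),
        "reason_code": policy.get("reason_code"),
        "triggered_categories": list(policy.get("triggered_categories", [])),
        "enforced": bool(policy.get("enforced", False)),
        "total_tokens": usage.get("total_tokens"),
    }
```

Passing the decoded JSON body (for instance the result of `json.loads`) yields a flat dictionary that is easy to log or store alongside your own request identifiers.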
Example response
{
"id": "chatcmpl-policy-123",
"object": "chat.completion",
"created": 1735958400,
"model": "abliterated-model",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Here is a brief summary of your refund policy."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 18,
"completion_tokens": 12,
"total_tokens": 30
},
"policy": {
"policy_id": "policy-gateway",
"decision": "rewrite",
"effective_decision": "rewrite",
"reason_code": "REWRITE",
"triggered_categories": ["self-harm/intent"],
"allowlist_hits": ["refund policy"],
"denylist_hits": [],
"rollout_mode": "enforced",
"enforced": true,
"policy_target": "support-bot",
"policy_user": "user-12345",
"project_id": "support-bot"
}
}
Policy decision flow (deterministic)
Policy Gateway follows a fixed evaluation order so outcomes are predictable and testable.
- Parse request metadata (policy_id, policy_user, policy_project_id, policy_target).
- Apply allowlist and denylist topic rules to the user request.
- Apply flagged category triggers from moderation.
- Determine decision from the response pattern (rewrite/summary/escalate/refuse).
- Apply sensitive-span redaction to the response (if enabled).
- Emit decision, reason_code, triggered_categories, and audit tags.
- Allowlist vs flagged categories: if an allowlist is configured and nothing matches, the request is refused. Otherwise flagged categories + denylist hits drive the response pattern.
- Denylist vs rewrite: denylist hits are treated as triggers and follow your configured response pattern (rewrite/summary/escalate/refuse).
- Reason codes: Policy Gateway emits a reason_code based on your configured reason codes or a default mapping.
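The evaluation order and conflict rules above can be modeled locally for tests. This is a simplified sketch, not the gateway's implementation: `evaluate` is a hypothetical function, topic rules are assumed to match as case-insensitive substrings, flagged categories are passed in rather than computed by moderation, redaction is omitted, the "allow" outcome for requests with no triggers is an assumption, and the default reason code (the upper-cased decision) is inferred from the example response only.

```python
from typing import Dict, Sequence

# Response patterns named in the flow above.
RESPONSE_PATTERNS = {"rewrite", "summary", "escalate", "refuse"}


def evaluate(
    text: str,
    allowlist: Sequence[str],
    denylist: Sequence[str],
    flagged_categories: Sequence[str],
    response_pattern: str = "refuse",
) -> Dict[str, object]:
    """Model the documented precedence: allowlist first, then triggers."""
    if response_pattern not in RESPONSE_PATTERNS:
        raise ValueError(f"unknown response pattern: {response_pattern}")

    lowered = text.lower()
    # Assumption: topic rules match as case-insensitive substrings.
    allow_hits = [topic for topic in allowlist if topic.lower() in lowered]
    deny_hits = [topic for topic in denylist if topic.lower() in lowered]

    if allowlist and not allow_hits:
        # An allowlist is configured and nothing matched: refuse.
        decision = "refuse"
    elif deny_hits or flagged_categories:
        # Denylist hits and flagged categories both act as triggers.
        decision = response_pattern
    else:
        # Assumption: with no triggers the request passes through unchanged.
        decision = "allow"

    return {
        "decision": decision,
        "reason_code": decision.upper(),  # assumed default mapping
        "triggered_categories": list(flagged_categories),
        "allowlist_hits": allow_hits,
        "denylist_hits": deny_hits,
    }
```

Because the function is pure, the same inputs always give the same outcome, which is the property the fixed evaluation order is meant to provide.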
Common errors & fixes
- 401 Unauthorized: Check that your API key is set and sent as a Bearer token.
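- 404 Not Found: Check the path. Policy Gateway requests go to POST /policy/chat/completions; for the OpenAI-compatible endpoint, make sure the base URL ends with /v1 and you call /chat/completions.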
- 400 Bad Request: Verify the model id and that messages are an array of { role, content } objects.
- 429 Rate limit: Back off and retry. Use the Retry-After header for pacing.
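The list above can be encoded as a lookup so calling code can decide whether a retry makes sense. This is an illustrative sketch: `ErrorAdvice` and `advise` are hypothetical names, and treating only 429 as retryable reflects the list above rather than any documented guarantee.

```python
from typing import NamedTuple


class ErrorAdvice(NamedTuple):
    retryable: bool
    hint: str


# Status codes and hints paraphrased from the list above.
ADVICE = {
    400: ErrorAdvice(False, "Verify the model id and the shape of messages."),
    401: ErrorAdvice(False, "Check that the API key is sent as a Bearer token."),
    404: ErrorAdvice(False, "Check the base URL and request path."),
    429: ErrorAdvice(True, "Back off and retry, pacing with Retry-After."),
}


def advise(status: int) -> ErrorAdvice:
    """Return handling advice for an HTTP status code."""
    fallback = ErrorAdvice(False, "Not covered by the list above; inspect the response body.")
    return ADVICE.get(status, fallback)
```

A caller might log `advise(status).hint` and only re-enter its retry loop when `advise(status).retryable` is true.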