Policy Gateway backend guide
This guide covers the backend APIs used to configure Policy Gateway and enforce policy on live traffic.
Management endpoints live under /api/policy-gateway and expect a user bearer token (the JWT issued by the dashboard).
Enforcement is typically done through /policy/chat/completions with a policy API key (ak_...) plus optional policy_user, policy_project_id, and policy_target metadata.
Quick start
{
  "policy_id": "policy-gateway",
  "name": "Support Policy",
  "owner": "Platform team",
  "description": "Ensure support replies follow approved topics.",
  "rules": {
    "allowlist": ["refund policy", "account support"],
    "denylist": ["illegal instructions"],
    "redact": true,
    "rewrite_instead_of_refuse": true,
    "response_pattern": "rewrite",
    "reason_codes": ["ALLOW", "REWRITE", "SUMMARY", "ESCALATE", "REFUSE"],
    "flagged_categories": ["self-harm", "self-harm/intent", "sexual/minors"]
  },
  "org_controls": {
    "project_keys": true,
    "user_quotas": true,
    "audit_logs": true,
    "data_classification": "confidential",
    "user_quota": { "requests": 5000, "tokens": 2000000, "window": "daily" },
    "project_quota": { "requests": 20000, "tokens": 10000000, "window": "monthly" }
  },
  "rollout": {
    "shadow": { "enabled": false, "sample_percent": 20, "targets": [] },
    "canary": { "enabled": false, "sample_percent": 5, "targets": [] },
    "rollback_on_spike": true,
    "rollback_threshold": 0.25,
    "rollback_min_requests": 20,
    "rollback_window_minutes": 15,
    "rollback_cooldown_minutes": 30,
    "rollback_decisions": ["refuse", "escalate"]
  },
  "refusal_replacement": {
    "mode": "rewrite",
    "escalation_path": "policy-review@example.com"
  }
}
Service notes
- Pricing model: Usage-based pricing (~$5 per 1M tokens) billed on total tokens (input + output). See the API pricing page for current plans.
- Data retention: No prompt/output retention by default. Operational telemetry (token counts, timestamps, error codes) is retained for billing and reliability.
- Compatibility: OpenAI-style /v1/chat/completions request and response format with a base URL switch.
- Latency: Depends on model size, prompt length, and load. Streaming reduces time-to-first-token.
- Throughput: Team plans include priority throughput. Actual throughput varies with demand.
- Rate limits: Limits vary by plan and load. Handle 429s with backoff and respect any Retry-After header.
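The rate-limit advice above can be sketched in Python as follows. This is a minimal illustration, not the service's client library: the function names, the (status, headers, body) tuple shape, and the base/cap values are assumptions made for the example, and only the delta-seconds form of Retry-After is handled.

```python
import random
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def retry_delay(attempt: int, retry_after: Optional[str],
                base: float = 0.5, cap: float = 30.0) -> float:
    """Seconds to wait before the next attempt (attempt starts at 0)."""
    if retry_after is not None:
        try:
            # Retry-After in delta-seconds form, e.g. "2".
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # The HTTP-date form is not handled in this sketch.
    # Exponential backoff with full jitter, capped at `cap` seconds.
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_backoff(send: Callable[[], Response], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    response = send()
    for attempt in range(max_attempts - 1):
        status, headers, _ = response
        if status != 429:
            break
        # Header lookup is case-sensitive here; normalize names if needed.
        sleep(retry_delay(attempt, headers.get("Retry-After")))
        response = send()
    return response
```

Passing sleep as a parameter keeps the helper easy to test without real delays; in production the default time.sleep applies.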
API endpoints and auth
Policy Gateway splits management APIs (config, projects, history) from the enforcement endpoint.
All endpoints require an Authorization header; use a user JWT for management and a policy API key (ak_...) for enforcement.
- GET/POST /api/policy-gateway/config - read/write the saved policy config.
- POST /api/policy-gateway/simulate - compute decisions for category scenarios.
- GET /api/policy-gateway/history - list revision/simulation/enforcement entries.
- GET/POST /api/policy-gateway/projects - manage projects and monthly limits.
- GET/POST /api/policy-gateway/projects/{project_id}/keys - create scoped policy keys.
- DELETE /api/policy-gateway/keys/{key_id} - revoke a key.
- POST /policy/chat/completions - enforce policy on live traffic.
Save policy configuration
POST a config object to store the policy and create a revision entry in history.
- You can send the config directly or wrap it under a top-level config key.
- Missing fields are filled with defaults; list values are trimmed and capped.
- Set org_controls.audit_logs to false to disable enforcement history writes.
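As a rough illustration of the behavior in the list above (bare or wrapped config, default filling, list trimming and capping), the following Python sketch may help. The default values and the cap of 50 items are hypothetical; the actual defaults and limits are not stated in this guide.

```python
from copy import deepcopy
from typing import Any, Dict, List

# Hypothetical defaults and cap; the real values are not documented here.
DEFAULT_RULES: Dict[str, Any] = {
    "allowlist": [],
    "denylist": [],
    "response_pattern": "refuse",
}
MAX_LIST_ITEMS = 50


def clean_list(values: List[Any], limit: int = MAX_LIST_ITEMS) -> List[str]:
    """Trim whitespace, drop empty entries, and cap the list length."""
    trimmed = [str(value).strip() for value in values]
    return [value for value in trimmed if value][:limit]


def normalize_config(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Accept a bare config or one wrapped under a top-level "config" key."""
    config = deepcopy(payload.get("config", payload))
    # Fields missing from "rules" fall back to the defaults above.
    rules = {**DEFAULT_RULES, **config.get("rules", {})}
    for key in ("allowlist", "denylist"):
        rules[key] = clean_list(rules[key])
    config["rules"] = rules
    return config
```

With this sketch, a wrapped payload and the same config sent bare normalize to the same result.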
curl https://api.abliteration.ai/api/policy-gateway/config \
  -H "Authorization: Bearer $ABLIT_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "config": {
      "policy_id": "policy-gateway",
      "name": "Support Policy",
      "rules": {
        "allowlist": ["refund policy", "account support"],
        "denylist": ["illegal instructions"],
        "response_pattern": "rewrite"
      }
    }
  }'
Decision logic (what is evaluated)
Policy Gateway applies a consistent evaluation order so outcomes are predictable and testable.
- Parse metadata (policy_id, policy_user, policy_project_id, policy_target).
- Run moderation on the last user message if configured; some categories are hard-blocked.
- Match allowlist/denylist terms against the last user message (substring match).
- Compute triggered categories from moderation labels in rules.flagged_categories.
- Choose a decision from response_pattern (rewrite/summary/escalate/refuse).
- Resolve rollout mode (shadow/canary/enforced) and apply redaction if enforced.
- Emit policy metadata and write history if audit logs are enabled.
- Allowlist is strict: if an allowlist is configured and nothing matches, the decision becomes refuse.
- Denylist + flagged categories: either can trigger a non-allow decision.
- Rewrite flag: if response_pattern is rewrite and rewrite_instead_of_refuse is false, the decision becomes refuse.
- Reason codes: the first code containing the decision string is used, otherwise defaults apply.
- Moderation gate: when enabled, sexual/minors and self-harm categories are rejected before policy evaluation.
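The evaluation order and rules above can be sketched in Python as follows. This is an illustrative reading, not the gateway's implementation. Several details are assumptions: matching ignores case, hard-blocked labels are detected by prefix, the allowlist check runs before the denylist check, the HARD_BLOCK code is made up for the example, and the fallback reason code is the upper-cased decision. Rollout resolution and redaction are left out.

```python
from typing import Any, Dict, List, Sequence

# Assumption: any label starting with one of these prefixes is hard-blocked.
HARD_BLOCKED_PREFIXES = ("sexual/minors", "self-harm")


def last_user_message(messages: Sequence[Dict[str, str]]) -> str:
    """Return the content of the most recent user message, or ""."""
    for message in reversed(messages):
        if message.get("role") == "user":
            return message.get("content", "")
    return ""


def matched_terms(terms: Sequence[str], text: str) -> List[str]:
    """Substring match; ignoring case is an assumption of this sketch."""
    lowered = text.lower()
    return [term for term in terms if term.lower() in lowered]


def decide(rules: Dict[str, Any], messages: Sequence[Dict[str, str]],
           moderation_labels: Sequence[str],
           moderation_gate: bool = True) -> Dict[str, Any]:
    if moderation_gate and any(label.startswith(HARD_BLOCKED_PREFIXES)
                               for label in moderation_labels):
        # Rejected before policy evaluation; "HARD_BLOCK" is a made-up code.
        return {"decision": "refuse", "reason_code": "HARD_BLOCK", "triggered": []}

    text = last_user_message(messages)
    allowlist = rules.get("allowlist", [])
    allowed = matched_terms(allowlist, text)
    denied = matched_terms(rules.get("denylist", []), text)
    flagged = set(rules.get("flagged_categories", []))
    triggered = [label for label in moderation_labels if label in flagged]

    if allowlist and not allowed:
        decision = "refuse"  # strict allowlist: nothing matched
    elif denied or triggered:
        decision = rules.get("response_pattern", "refuse")
        if decision == "rewrite" and not rules.get("rewrite_instead_of_refuse", True):
            decision = "refuse"
    else:
        decision = "allow"

    # First configured code containing the decision string, else a fallback.
    codes = rules.get("reason_codes", [])
    reason = next((code for code in codes if decision in code.lower()),
                  decision.upper())
    return {"decision": decision, "reason_code": reason, "triggered": triggered}
```

Keeping the decision step a pure function of the rules, messages, and moderation labels makes each rule easy to cover with a unit test.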
Enforce policy on live traffic
Send OpenAI-compatible chat completions through the policy endpoint.
- policy_id is optional; if provided it must match the saved (or inline) policy id.
- Provide policy_user and policy_project_id (or headers) for quotas and audits.
- Inline policy overrides are supported via config or policy in the request body.
- Responses include a policy object with decision metadata and rollout state.
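For callers that prefer Python over curl, a request can be assembled with the standard library as sketched below. The helper name build_policy_request is hypothetical; the host comes from the curl examples in this guide, and the metadata is sent as body fields rather than X-Policy-* headers. The sketch only builds the request and does not send it.

```python
import json
import urllib.request
from typing import Dict, List, Optional

BASE_URL = "https://api.abliteration.ai"  # host taken from the curl examples


def build_policy_request(api_key: str, messages: List[Dict[str, str]], model: str, *,
                         policy_id: Optional[str] = None,
                         policy_user: Optional[str] = None,
                         policy_project_id: Optional[str] = None,
                         policy_target: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST to the enforcement endpoint."""
    body = {"model": model, "messages": messages}
    optional = {
        "policy_id": policy_id,
        "policy_user": policy_user,
        "policy_project_id": policy_project_id,
        "policy_target": policy_target,
    }
    # Only include metadata fields the caller actually supplied.
    body.update({key: value for key, value in optional.items() if value is not None})
    return urllib.request.Request(
        f"{BASE_URL}/policy/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the result to urllib.request.urlopen and parse the JSON body; the policy object in the response carries the decision metadata.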
curl https://api.abliteration.ai/policy/chat/completions \
  -H "Authorization: Bearer $POLICY_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Policy-User: user-12345" \
  -H "X-Policy-Project: support-bot" \
  -H "X-Policy-Target: support-bot" \
  -d '{
    "model": "abliterated-model",
    "messages": [
      { "role": "user", "content": "Summarize our refund policy." }
    ],
    "policy_id": "policy-gateway"
  }'
Rollouts and targeting
Rollouts are configured in the policy config and targeted with X-Policy-Target or policy_target.
- Shadow rollouts mark sampled requests as rollout_mode=shadow and do not enforce.
- Canary rollouts enforce only the sampled percentage; unsampled requests are allowed.
- Shadow takes precedence over canary when both are enabled.
- If automatic rollback is enabled and enforcement spikes, Policy Gateway temporarily disables enforcement.
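One way to read the rollout rules above is sketched below in Python. It is an interpretation under stated assumptions, not the gateway's code: an empty targets list is taken to mean every target, unsampled requests under a shadow rollout are treated as not enforced, and a spike is defined as the share of rollback_decisions outcomes reaching rollback_threshold once at least rollback_min_requests recent decisions exist. The window and cooldown timers are omitted.

```python
from typing import Any, Dict, NamedTuple, Optional, Sequence


class RolloutState(NamedTuple):
    mode: str      # "shadow", "canary", or "enforced"
    sampled: bool  # whether this request fell inside sample_percent
    enforce: bool  # whether the policy decision is applied


def stage_applies(stage: Dict[str, Any], target: Optional[str]) -> bool:
    """Assumption: an empty targets list matches every target."""
    targets = stage.get("targets", [])
    return not targets or target in targets


def resolve_rollout(rollout: Dict[str, Any], target: Optional[str],
                    roll: float) -> RolloutState:
    """roll is a number in [0, 100), e.g. random.uniform(0, 100)."""
    for mode in ("shadow", "canary"):  # shadow takes precedence over canary
        stage = rollout.get(mode, {})
        if stage.get("enabled") and stage_applies(stage, target):
            sampled = roll < stage.get("sample_percent", 0)
            # Shadow never enforces; canary enforces only sampled requests.
            return RolloutState(mode, sampled, mode == "canary" and sampled)
    return RolloutState("enforced", True, True)


def should_roll_back(recent_decisions: Sequence[str], rollout: Dict[str, Any]) -> bool:
    """Spike check over decisions already limited to the rollback window."""
    if not rollout.get("rollback_on_spike") or not recent_decisions:
        return False
    if len(recent_decisions) < rollout.get("rollback_min_requests", 0):
        return False
    watched = set(rollout.get("rollback_decisions", []))
    hits = sum(1 for decision in recent_decisions if decision in watched)
    return hits / len(recent_decisions) >= rollout.get("rollback_threshold", 1.0)
```

Passing the random roll in as an argument keeps sampling deterministic in tests.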
Projects, keys, and quotas
Create a project per app or agent, issue a scoped key, and attach policy_user for per-user quotas.
- Project ids are normalized to lowercase slugs; the response contains the canonical id.
- If org_controls.project_keys is true, API-key requests must include a project id (or use a scoped key).
- Project monthly limits override config project_quota and always use a monthly window.
- User quotas use policy_user (or API key id when omitted) and honor the configured window.
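The quota rules above can be illustrated with a few small Python helpers. The helper names are hypothetical and the exact slug rule is an assumption (lowercase, with runs of other characters collapsed to a hyphen); the response from the projects endpoint remains the source of truth for the canonical id.

```python
import re
from typing import Any, Dict, Optional


def normalize_project_id(name: str) -> str:
    """Assumed slug rule: lowercase, non-alphanumeric runs become "-"."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def quota_subject(policy_user: Optional[str], api_key_id: str) -> str:
    """User quotas are keyed by policy_user, or by the API key id if omitted."""
    return policy_user if policy_user else api_key_id


def effective_project_quota(org_controls: Dict[str, Any],
                            project: Dict[str, Any]) -> Dict[str, Any]:
    """Project monthly limits override config project_quota (monthly window)."""
    quota = dict(org_controls.get("project_quota", {}))
    overridden = False
    if project.get("monthly_request_limit") is not None:
        quota["requests"] = project["monthly_request_limit"]
        overridden = True
    if project.get("monthly_token_limit") is not None:
        quota["tokens"] = project["monthly_token_limit"]
        overridden = True
    if overridden:
        quota["window"] = "monthly"
    return quota
```

Under the assumed slug rule, the name "Support bot" maps to support-bot, which matches the project id used in the key-creation example in this guide.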
curl https://api.abliteration.ai/api/policy-gateway/projects \
  -H "Authorization: Bearer $ABLIT_JWT" \
  -H "Content-Type: application/json" \
  -d '{ "name": "Support bot", "monthly_token_limit": 10000000, "monthly_request_limit": 20000 }'
curl https://api.abliteration.ai/api/policy-gateway/projects/support-bot/keys \
  -H "Authorization: Bearer $ABLIT_JWT" \
  -H "Content-Type: application/json" \
  -d '{ "label": "Support bot prod" }'
Simulate and audit
Use simulation for dry runs and history for audits.
- Simulation compares categories against rules.flagged_categories only.
- Set persist: false to skip writing a simulation entry.
- History results are newest-first and capped by POLICY_HISTORY_LIMIT (default 50).
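A small in-memory Python model of the simulation and history behavior described above is sketched here. The entry shape (type, decision, triggered) and the function names are hypothetical, and the sketch assumes that a triggered category yields the configured response_pattern while anything else yields allow.

```python
from typing import Any, Dict, List, Optional

POLICY_HISTORY_LIMIT = 50  # documented default cap


def simulate(rules: Dict[str, Any], categories: List[str],
             history: List[Dict[str, Any]], persist: bool = True) -> Dict[str, Any]:
    """Compare categories against rules.flagged_categories only."""
    flagged = set(rules.get("flagged_categories", []))
    triggered = [category for category in categories if category in flagged]
    decision = rules.get("response_pattern", "refuse") if triggered else "allow"
    entry = {"type": "simulation", "decision": decision, "triggered": triggered}
    if persist:
        history.append(entry)  # persist=False skips the history write
    return entry


def list_history(history: List[Dict[str, Any]], entry_type: Optional[str] = None,
                 limit: int = POLICY_HISTORY_LIMIT) -> List[Dict[str, Any]]:
    """Return entries newest-first, filtered by type and capped by the limit."""
    newest_first = list(reversed(history))
    if entry_type is not None:
        newest_first = [e for e in newest_first if e["type"] == entry_type]
    return newest_first[: min(limit, POLICY_HISTORY_LIMIT)]
```

Because allowlist and denylist terms are not consulted in simulation, a simulated decision can differ from what live enforcement would return for a real message.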
curl https://api.abliteration.ai/api/policy-gateway/simulate \
  -H "Authorization: Bearer $ABLIT_JWT" \
  -H "Content-Type: application/json" \
  -d '{ "categories": ["self-harm/intent"] }'
curl "https://api.abliteration.ai/api/policy-gateway/history?type=enforcement&limit=20" \
  -H "Authorization: Bearer $ABLIT_JWT"
Common errors & fixes
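- 401 Unauthorized: Check that the credential is set and sent as a Bearer token: a user JWT for management endpoints, a policy API key (ak_...) for enforcement.
- 404 Not Found: Check the path. OpenAI-style calls need a base URL ending in /v1 followed by /chat/completions; policy enforcement uses /policy/chat/completions.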
- 400 Bad Request: Verify the model id and that messages are an array of { role, content } objects.
- 429 Rate limit: Back off and retry. Use the Retry-After header for pacing.