Policy gateway backend guide (bring your own policy)
Stop selling "uncensored." Sell controllable behavior with policy rules your team owns.
The buyers are product teams building assistants, internal tools, or agent workflows that need predictable outcomes across apps, models, and agents.
End users get fewer surprise refusals, clearer policy outcomes, and auditable enforcement that still aligns with lawful use.
Quick start
{
  "policy_id": "byop-gateway",
  "rules": {
    "allowlist": ["product docs", "account support"],
    "denylist": ["illegal instructions", "high-risk requests"],
    "redact": true,
    "rewrite_instead_of_refuse": true,
    "reason_codes": ["ALLOW", "REWRITE", "ESCALATE"],
    "flagged_categories": ["hate/threatening", "self-harm/intent"]
  },
  "org_controls": {
    "project_keys": true,
    "user_quotas": true,
    "audit_logs": true,
    "data_classification": "confidential",
    "user_quota": { "requests": 5000, "tokens": 2000000, "window": "daily" },
    "project_quota": { "requests": 20000, "tokens": 10000000, "window": "monthly" }
  },
  "refusal_replacement": {
    "mode": "rewrite",
    "escalation_path": "trust@company.com"
  }
}
Service notes
- Pricing model: Usage-based pricing (~$5 per 1M tokens) billed on total tokens (input + output). See the API pricing page for current plans.
- Data retention: No prompt/output retention by default. Operational telemetry (token counts, timestamps, error codes) is retained for billing and reliability.
- Compatibility: OpenAI-style /v1/chat/completions request and response format with a base URL switch.
- Latency: Depends on model size, prompt length, and load. Streaming reduces time-to-first-token.
- Throughput: Team plans include priority throughput. Actual throughput varies with demand.
- Rate limits: Limits vary by plan and load. Handle 429s with backoff and respect any Retry-After header.
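As a worked example of the pricing note, here is a minimal sketch that estimates cost from total tokens. It assumes a flat rate of about $5 per 1M tokens as quoted above; the function and constant names are hypothetical, and actual plans may price differently.

```python
# Assumption: flat rate of ~$5 per 1M total tokens, taken from the note above.
# Check the pricing page for the rate that applies to your plan.
ASSUMED_USD_PER_MILLION_TOKENS = 5.0


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      usd_per_million: float = ASSUMED_USD_PER_MILLION_TOKENS) -> float:
    """Estimate cost when billing is on total tokens (input + output)."""
    total_tokens = input_tokens + output_tokens
    return total_tokens / 1_000_000 * usd_per_million


# 1.5M input tokens + 0.5M output tokens = 2M total tokens
print(estimate_cost_usd(1_500_000, 500_000))  # 10.0
```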
What to build (the wedge)
A control plane and change-management layer that makes LLM behavior predictable and auditable for enterprise teams.
- Policy-as-code rules: allow/deny lists, redaction, rewrite instead of refuse, and structured reason codes.
- Change management: simulations, shadow/canary rollouts, and rollback-ready history.
- Org controls: per-project keys, per-user quotas, audit logs, and data classification tags.
- Refusal replacement patterns: safe alternatives, escalation paths, and compliant summaries instead of hard blocks.
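One way to picture a canary rollout from the list above: route a fixed percentage of users to the new policy version and keep everyone else on the stable one. The sketch below is an illustration only, not the gateway's implementation; `in_canary`, `pick_policy_version`, and the version strings are hypothetical names.

```python
import hashlib


def in_canary(user_id: str, canary_percent: int) -> bool:
    """Deterministically place a user in the canary group.

    Hashing the user id keeps each user in the same group across requests.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return bucket < canary_percent


def pick_policy_version(user_id: str, stable: str, canary: str,
                        canary_percent: int) -> str:
    """Return the canary version for users in the canary group, else stable."""
    return canary if in_canary(user_id, canary_percent) else stable


# Hypothetical version labels; 10% of users would see the canary policy.
print(pick_policy_version("user-12345", "policy-v1", "policy-v2", 10))
```

Rolling back in this sketch means setting `canary_percent` to 0, which sends every user to the stable version again.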
Policy-as-code layer
Treat policy like a versioned config: easy to diff, test, and roll back.
- Define rule precedence and defaults in a single JSON or YAML file.
- Return machine-readable reason codes for every decision.
- Attach the policy id to each request across apps, models, and agents.
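The points above can be sketched as a small evaluator that applies rules in a fixed order and returns a reason code with the policy id. This is a minimal sketch under assumptions: the precedence (denylist before allowlist), the default for unlisted topics, and the idea that an upstream step has already labeled the request with a topic are all hypothetical, as are the names `Decision` and `evaluate`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    policy_id: str
    reason_code: str  # one of "ALLOW", "REWRITE", "ESCALATE"


def evaluate(policy: dict, topic: str) -> Decision:
    """Apply rules in an assumed order: denylist, then allowlist, then default."""
    rules = policy["rules"]
    policy_id = policy["policy_id"]
    # Assumed precedence: a denylist match wins over an allowlist match.
    if topic in rules.get("denylist", []):
        code = "REWRITE" if rules.get("rewrite_instead_of_refuse") else "ESCALATE"
        return Decision(policy_id, code)
    if topic in rules.get("allowlist", []):
        return Decision(policy_id, "ALLOW")
    # Assumed default for topics on neither list: route to review.
    return Decision(policy_id, "ESCALATE")


policy = {
    "policy_id": "byop-gateway",
    "rules": {
        "allowlist": ["product docs", "account support"],
        "denylist": ["illegal instructions", "high-risk requests"],
        "rewrite_instead_of_refuse": True,
    },
}
print(evaluate(policy, "product docs"))
print(evaluate(policy, "high-risk requests"))
```

Because the policy is plain data, the same function can run against an old and a new policy file in a test suite before either goes live.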
Enforce policy on live traffic
Use the policy-enforced endpoint to apply your saved policy to production requests.
- Call /policy/chat/completions with your API key to enforce your saved policy.
- Send X-Policy-Target or policy_target to target rollouts per app or agent.
- Include policy_user and policy_project_id (or X-Policy-User/X-Policy-Project) for quotas and audit trails.
- Policy metadata is returned with each response for audit trails.
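For teams calling the endpoint from Python rather than curl, the request described above could be assembled as follows. This is a sketch using only the standard library; the helper name `build_policy_request` is hypothetical, the field and header names come from this guide, and the request is built but not sent.

```python
import json
import urllib.request
from typing import Optional

POLICY_URL = "https://api.abliteration.ai/policy/chat/completions"


def build_policy_request(api_key: str, model: str, user_text: str,
                         policy_id: str, policy_user: str,
                         policy_project_id: str,
                         policy_target: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a policy-enforced chat completion request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
        "policy_id": policy_id,
        "policy_user": policy_user,              # used for per-user quotas
        "policy_project_id": policy_project_id,  # used for project quotas and audit
    }
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if policy_target is not None:
        headers["X-Policy-Target"] = policy_target  # target rollouts per app or agent
    return urllib.request.Request(
        POLICY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


req = build_policy_request("YOUR_API_KEY", "abliterated-model",
                           "Summarize our refund policy.",
                           "policy-gateway", "user-12345", "support-bot",
                           policy_target="support-bot")
print(req.get_method(), req.full_url)
```

To send it, pass `req` to `urllib.request.urlopen` and parse the JSON body. The shape of the returned policy metadata is not specified here, so inspect a real response before depending on particular fields.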
curl https://api.abliteration.ai/policy/chat/completions \
  -H "Authorization: Bearer $ABLIT_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Policy-Target: support-bot" \
  -d '{
    "model": "abliterated-model",
    "messages": [
      { "role": "user", "content": "Summarize our refund policy." }
    ],
    "policy_id": "policy-gateway",
    "policy_user": "user-12345",
    "policy_project_id": "support-bot"
  }'
Getting started (enterprise rollout)
Create a project per app or agent, issue scoped keys, then send policy_user and policy_project_id with every request.
- Create projects with monthly budgets and generate scoped keys.
- Attach policy_user and policy_project_id to every request for quotas and audit visibility.
- Review enforcement history to detect spikes and validate rollbacks.
curl https://api.abliteration.ai/api/policy-gateway/projects \
  -H "Authorization: Bearer $ABLIT_JWT" \
  -H "Content-Type: application/json" \
  -d '{ "name": "Support bot", "monthly_token_limit": 10000000, "monthly_request_limit": 20000 }'
curl https://api.abliteration.ai/api/policy-gateway/projects/support-bot/keys \
  -H "Authorization: Bearer $ABLIT_JWT" \
  -H "Content-Type: application/json" \
  -d '{ "label": "Support bot prod" }'
Org controls & auditability
Enterprise buyers want deterministic controls, not a black box.
- Per-project keys with budgets to isolate workloads.
- Per-user quotas using policy_user or X-Policy-User.
- Audit logs with policy id, decision, project, and reason code per request.
- Data classification tags for compliance reporting.
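To make the controls above concrete, here is a minimal in-memory sketch of a per-user request quota that writes one audit record per decision. It is an illustration under assumptions, not the service's storage model: `UserQuota`, `AuditRecord`, `handle`, the decision strings, and the `QUOTA_EXCEEDED` reason code are hypothetical. The four audit fields mirror the list above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AuditRecord:
    policy_id: str
    decision: str
    project: str
    reason_code: str


class UserQuota:
    """Counts requests per policy_user within a single quota window."""

    def __init__(self, max_requests: int):
        self.max_requests = max_requests
        self.used = defaultdict(int)

    def try_consume(self, policy_user: str) -> bool:
        """Return True and count the request if the user is under quota."""
        if self.used[policy_user] >= self.max_requests:
            return False
        self.used[policy_user] += 1
        return True


def handle(quota: UserQuota, audit_log: list, policy_id: str,
           project: str, policy_user: str) -> bool:
    """Check the quota and append an audit record either way."""
    allowed = quota.try_consume(policy_user)
    audit_log.append(AuditRecord(
        policy_id=policy_id,
        decision="allowed" if allowed else "blocked",          # hypothetical values
        project=project,
        reason_code="ALLOW" if allowed else "QUOTA_EXCEEDED",  # second code is hypothetical
    ))
    return allowed


quota = UserQuota(max_requests=2)
log = []
results = [handle(quota, log, "byop-gateway", "support-bot", "user-12345")
           for _ in range(3)]
print(results)  # [True, True, False]
```

A production version would reset counters at the window boundary and persist records somewhere durable; both are left out to keep the sketch short.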
Refusal replacement patterns
When requests cross a boundary, return something useful instead of a hard refusal.
- Rewrite with safe alternatives that preserve user intent.
- Provide compliant summaries that avoid actionable steps.
- Escalate to a human reviewer for high-risk workflows.
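The three patterns above can be sketched as a single function that maps a reason code to what the caller receives. This assumes the reason codes from the quick start (ALLOW, REWRITE, ESCALATE); the function name, the response keys, and the review message are hypothetical.

```python
def replace_refusal(reason_code: str, model_output: str,
                    safe_rewrite: str, escalation_path: str) -> dict:
    """Return a useful response for each decision instead of a hard refusal."""
    if reason_code == "ALLOW":
        return {"reason_code": reason_code, "content": model_output}
    if reason_code == "REWRITE":
        # Serve the safe alternative or compliant summary prepared upstream.
        return {"reason_code": reason_code, "content": safe_rewrite}
    if reason_code == "ESCALATE":
        return {
            "reason_code": reason_code,
            "content": "This request needs review by a person before we can continue.",
            "escalation_path": escalation_path,
        }
    raise ValueError(f"unknown reason code: {reason_code}")


print(replace_refusal("ESCALATE", "", "", "trust@company.com"))
```

Raising on an unknown code keeps a typo in a policy file from silently turning into an allow.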
Packaging & pricing
Charge for the control plane, not just tokens.
- Offer premium tiers priced in credits (e.g., 6x, 20x, 60x Scale packs) plus usage.
- Sell annual contracts for compliance features and audit retention.
- Position it as developer-controlled and enterprise-ready.
Common errors & fixes
- 401 Unauthorized: Check that your API key is set and sent as a Bearer token.
- 404 Not Found: For the OpenAI-style endpoint, make sure the base URL ends with /v1 and you call /chat/completions. For policy enforcement, call /policy/chat/completions.
- 400 Bad Request: Verify the model id and that messages is an array of { role, content } objects.
- 429 Too Many Requests: You hit a rate limit. Back off and retry, using the Retry-After header for pacing.
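The 429 advice can be sketched as a retry loop that prefers the server's Retry-After value and falls back to exponential backoff with jitter. This is a minimal sketch, not an official client: `send` is a hypothetical callable returning `(status, headers, body)`, headers are assumed to be a plain dict, and the base and cap values are arbitrary.

```python
import random
import time


def retry_delay_seconds(attempt: int, retry_after=None,
                        base: float = 1.0, cap: float = 30.0) -> float:
    """Use Retry-After (in seconds) if present, else exponential backoff with jitter."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Retry-After can also be an HTTP date; fall back to backoff
    return random.uniform(0, min(cap, base * 2 ** attempt))


def call_with_backoff(send, max_attempts: int = 5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(retry_delay_seconds(attempt, headers.get("Retry-After")))
    return status, body  # still 429 after the final attempt


# Simulated server: one 429 with Retry-After, then success.
responses = iter([(429, {"Retry-After": "2"}, ""), (200, {}, "ok")])
slept = []
print(call_with_backoff(lambda: next(responses), sleep=slept.append))  # (200, 'ok')
```

Passing `sleep` as a parameter keeps the loop testable without real waiting, as the simulated run shows.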