Policy Gateway onboarding checklist
Use this checklist to launch Policy Gateway in production with scoped keys, quota enforcement, and auditable logs.
Each step includes a UI reference and a minimal API call so teams can ship quickly.
Requires a Policy Gateway subscription and a signed-in user. Replace https://api.abliteration.ai with your own API base URL for staging or test environments.
Service notes
- Pricing model: Usage-based pricing (~$5 per 1M tokens) billed on total tokens (input + output). See the API pricing page for current plans.
- Data retention: No prompt/output retention by default. Operational telemetry (token counts, timestamps, error codes) is retained for billing and reliability.
- Compatibility: OpenAI-style /v1/chat/completions request and response format with a base URL switch.
- Latency: Depends on model size, prompt length, and load. Streaming reduces time-to-first-token.
- Throughput: Team plans include priority throughput. Actual throughput varies with demand.
- Rate limits: Limits vary by plan and load. Handle 429s with backoff and respect any Retry-After header.
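The rate-limit guidance above can be sketched as a few shell helpers. This is a minimal sketch, not part of the service: the function names (retry_delay, parse_retry_after, post_with_retry), the 60-second cap, and the five-attempt limit are assumptions chosen for illustration. It honors Retry-After only when the value is a plain number of seconds and otherwise falls back to exponential backoff.

```shell
# Illustrative helpers; names, the 60 s cap, and the retry limit are assumptions.

# Print the seconds to wait before retry number $1 (1-based).
# $2 is the Retry-After value, if any. Only the integer-seconds form is honored.
retry_delay() (
  attempt="$1"
  retry_after="${2:-}"
  cap=60
  case "$retry_after" in
    ''|*[!0-9]*)
      # No usable header: exponential backoff 1, 2, 4, ... capped at $cap.
      delay=1
      i=1
      while [ "$i" -lt "$attempt" ] && [ "$delay" -lt "$cap" ]; do
        delay=$(( delay * 2 ))
        i=$(( i + 1 ))
      done
      if [ "$delay" -gt "$cap" ]; then delay=$cap; fi
      echo "$delay"
      ;;
    *)
      # The server told us how long to wait.
      echo "$retry_after"
      ;;
  esac
)

# Read raw response headers on stdin; print the first Retry-After value, if present.
parse_retry_after() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "retry-after" && !seen { print $2; seen = 1 }'
}

# POST a JSON payload ($2) to a URL ($1), retrying on HTTP 429 up to 5 times.
# Expects POLICY_KEY in the environment. Prints the response body.
post_with_retry() {
  url="$1"
  payload="$2"
  hdr_file="$(mktemp)"
  out_file="$(mktemp)"
  attempt=1
  while [ "$attempt" -le 5 ]; do
    status="$(curl -sS -o "$out_file" -D "$hdr_file" -w '%{http_code}' \
      -H "Authorization: Bearer $POLICY_KEY" \
      -H "Content-Type: application/json" \
      -d "$payload" "$url")" || break
    if [ "$status" != "429" ]; then
      cat "$out_file"
      rm -f "$hdr_file" "$out_file"
      if [ "$status" -lt 400 ]; then return 0; else return 1; fi
    fi
    sleep "$(retry_delay "$attempt" "$(parse_retry_after < "$hdr_file")")"
    attempt=$(( attempt + 1 ))
  done
  rm -f "$hdr_file" "$out_file"
  return 1
}
```

Only 429 responses are retried here; other errors are returned immediately so that a bad request is not repeated.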
Checklist at a glance
Follow these steps in order to go live safely.
- Define the policy rules, reason codes, and refusal replacement strategy.
- Set user and project quotas to avoid runaway agents.
- Create a project per app/agent and issue scoped API keys.
- Route production traffic through /policy/chat/completions with a policy_user and project ID attached to every request.
- Review enforcement history and roll back if enforcement spikes.
Step 1 — Policy rules & quotas
Start by defining allow/deny lists, refusal replacement, and quota windows. These values are enforced by the Policy Gateway.
- Set user quotas (requests/tokens) for the policy_user you attach to each request.
- Set project quotas to cap app-level usage and budgets.
- Enable shadow or canary rollout before enforcing globally.
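To sanity-check a project quota against a budget, a token limit can be converted into an approximate monthly cost ceiling. The sketch below assumes the ~$5 per 1M tokens figure from the service notes as its default rate; the function names are hypothetical, and actual pricing depends on your plan.

```shell
# Approximate cost ceiling in USD for a token quota.
# $1 = token limit, $2 = price per 1M tokens (default 5, an assumption from the ~$5 figure).
quota_cost_usd() {
  awk -v tokens="$1" -v price="${2:-5}" 'BEGIN { printf "%.2f\n", tokens / 1000000 * price }'
}

# Largest token quota that fits a monthly budget.
# $1 = budget in USD, $2 = price per 1M tokens (same default as above).
tokens_for_budget() {
  awk -v budget="$1" -v price="${2:-5}" 'BEGIN { printf "%d\n", budget / price * 1000000 }'
}
```

For example, quota_cost_usd 10000000 prints 50.00, so a monthly_token_limit of 10,000,000 (the value used in Step 2) corresponds to roughly $50 per month at that rate.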
Step 2 — Projects & scoped keys
Create a project per app or agent, then generate a scoped key for each environment.
curl https://api.abliteration.ai/api/policy-gateway/projects \
-H "Authorization: Bearer $ABLIT_JWT" \
-H "Content-Type: application/json" \
-d '{ "name": "Support bot", "monthly_token_limit": 10000000, "monthly_request_limit": 20000 }'
curl https://api.abliteration.ai/api/policy-gateway/projects/support-bot/keys \
-H "Authorization: Bearer $ABLIT_JWT" \
-H "Content-Type: application/json" \
-d '{ "label": "Support bot prod" }'
Step 3 — Enforce policy + audit
Send production requests through the policy gateway endpoint, attaching the policy_user and project ID as the X-Policy-User and X-Policy-Project headers.
curl https://api.abliteration.ai/policy/chat/completions \
-H "Authorization: Bearer $POLICY_KEY" \
-H "Content-Type: application/json" \
-H "X-Policy-User: user-12345" \
-H "X-Policy-Project: support-bot" \
-d '{
"model": "abliterated-model",
"messages": [{ "role": "user", "content": "Summarize our refund policy." }],
"policy_id": "policy-gateway"
}'
Go-live guardrails
These operational checks keep launches stable and auditable.
- Use shadow mode to observe policy outcomes without enforcement.
- Switch to canary with small sample sizes before full rollout.
- Monitor enforcement spikes and rely on auto-rollback if thresholds are exceeded.
- Export policy JSON to version control after each change.
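The spike check behind the monitoring guardrail amounts to comparing an enforcement rate with a threshold. The following is a minimal sketch, assuming you already have the count of enforced requests and the total request count for a time window. The function name, its inputs, and the 5% default threshold are illustrative assumptions, not gateway settings.

```shell
# Print "spike" if the enforcement rate exceeds the threshold, otherwise "ok".
# $1 = enforced requests in the window
# $2 = total requests in the window
# $3 = threshold in percent (default 5, an arbitrary illustrative value)
enforcement_status() {
  awk -v enforced="$1" -v total="$2" -v limit="${3:-5}" 'BEGIN {
    if (total > 0 && enforced * 100 / total > limit) { print "spike" }
    else { print "ok" }
  }'
}
```

A window with no traffic reports ok, since there is no rate to compare.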
Common errors & fixes
- 401 Unauthorized: Check that your API key is set and sent as a Bearer token.
- 404 Not Found: Check the request path. The OpenAI-style endpoint is /v1/chat/completions (base URL ending in /v1); gateway traffic from Step 3 goes to /policy/chat/completions.
- 400 Bad Request: Verify the model id and that messages is an array of { role, content } objects.
- 429 Rate limit: Back off and retry. Use the Retry-After header for pacing.
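The fixes above can be surfaced directly in scripts that call the API. This is a small sketch with a hypothetical helper name (status_hint) that maps a status code to the matching hint from this list.

```shell
# Print a troubleshooting hint for an HTTP status code ($1).
# The mapping mirrors the list above; status_hint is an illustrative name.
status_hint() {
  case "$1" in
    401) echo "Check that the API key is set and sent as a Bearer token." ;;
    404) echo "Check the base URL and the request path." ;;
    400) echo "Verify the model id and that messages is an array of { role, content } objects." ;;
    429) echo "Back off and retry; pace retries with the Retry-After header." ;;
    *)   echo "Unhandled status $1; inspect the response body." ;;
  esac
}

# Example (commented out so that sourcing this file makes no network call):
# status="$(curl -sS -o response.json -w '%{http_code}' \
#   -H "Authorization: Bearer $POLICY_KEY" \
#   -H "Content-Type: application/json" \
#   -d @request.json https://api.abliteration.ai/policy/chat/completions)"
# [ "$status" = "200" ] || status_hint "$status" >&2
```

The commented example reads the request body from a local request.json file, which is an assumed file name.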