Policy Gateway onboarding checklist
Step-by-step rollout checklist for enterprise Policy Gateway deployment, including projects, keys, quotas, and audits.
Use this checklist to launch Policy Gateway in production with scoped keys, quota enforcement, and auditable logs.
Each step includes a UI reference and a minimal API call so teams can ship quickly.
Requires a Policy Gateway subscription and a signed-in user. Replace https://api.abliteration.ai with your own API base URL for staging or test environments.
Service notes
- Pricing model: Usage-based pricing (~$5 per 1M tokens) billed on total tokens (input + output). See the API pricing page for current plans.
- Data retention: No prompt/output retention by default. Operational telemetry (token counts, timestamps, error codes) is retained for billing and reliability.
- Compatibility: OpenAI-style /v1/chat/completions request and response format with a base URL switch.
- Latency: Depends on model size, prompt length, and load. Streaming reduces time-to-first-token.
- Throughput: Team plans include priority throughput. Actual throughput varies with demand.
- Rate limits: Limits vary by plan and load. Handle 429s with backoff and respect any Retry-After header.
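The rate-limit note above can be sketched as a small delay helper. This is a minimal illustration, not part of any official SDK: the function name `retry_delay` and the defaults `base` and `cap` are assumptions. It honors a numeric Retry-After value when one is present and otherwise falls back to exponential backoff with jitter.

```python
import random


def retry_delay(attempt, retry_after=None, base=1.0, cap=60.0):
    """Return the number of seconds to wait before retry `attempt` (0-based).

    `retry_after` is the raw Retry-After header value, if the 429 response had one.
    `base` and `cap` are illustrative defaults, not documented gateway values.
    """
    if retry_after is not None:
        try:
            # Retry-After expressed as a number of seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            # Retry-After can also be an HTTP date; this sketch does not parse
            # that form and falls through to computed backoff instead.
            pass
    # Exponential backoff with full jitter, capped at `cap` seconds.
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

A caller would sleep for `retry_delay(attempt, response_headers.get("Retry-After"))` seconds after each 429 and give up after a fixed number of attempts.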
Checklist at a glance
Follow these steps in order to go live safely.
- Define the policy rules, reason codes, and refusal replacement strategy.
- Set user and project quotas to avoid runaway agents.
- Create a project per app/agent and issue scoped API keys.
- Route production traffic through /policy/chat/completions with the policy_user and project ID attached to each request.
- Review enforcement history and roll back if enforcement spikes.
Step 1 — Policy rules & quotas
Start by defining allow/deny lists, refusal replacement, and quota windows. The Policy Gateway enforces these values.
- Set user quotas (requests/tokens) for the policy_user you attach to each request.
- Set project quotas to cap app-level usage and budgets.
- Enable shadow or canary rollout before enforcing globally.
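To make the request and token quotas above concrete, here is a small client-side model of a fixed quota window. It is only an illustration for reasoning about limits before you configure them. The class name `QuotaWindow`, its fields, and the fixed-window behavior are all assumptions; the gateway's actual windowing is not described here.

```python
from dataclasses import dataclass


@dataclass
class QuotaWindow:
    """Hypothetical fixed-window quota counting requests and tokens."""
    max_requests: int
    max_tokens: int
    window_seconds: int
    window_start: float = 0.0
    requests: int = 0
    tokens: int = 0

    def allow(self, now: float, tokens: int) -> bool:
        """Record one request of `tokens` tokens at time `now` if it fits."""
        # Start a fresh window once the current one has elapsed.
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.requests = 0
            self.tokens = 0
        over_requests = self.requests + 1 > self.max_requests
        over_tokens = self.tokens + tokens > self.max_tokens
        if over_requests or over_tokens:
            return False
        self.requests += 1
        self.tokens += tokens
        return True
```

For example, `QuotaWindow(max_requests=20000, max_tokens=10_000_000, window_seconds=30 * 24 * 3600)` roughly models the monthly limits used in the project example in Step 2.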
Step 2 — Projects & scoped keys
Create a project per app or agent, then generate a scoped key for each environment.
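For teams that script this step, the two curl calls in this step can be built with Python's standard library. This is a sketch under assumptions: the helper names are hypothetical, the paths and JSON fields are copied from the curl examples, and `project_id` is assumed to be the identifier that appears in the keys path (`support-bot` in the example).

```python
import json
import urllib.request

API_BASE = "https://api.abliteration.ai"  # replace for staging or test


def build_post(path, token, payload):
    """Build (but do not send) a JSON POST request with a Bearer token."""
    return urllib.request.Request(
        API_BASE + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_project_request(jwt, name, monthly_token_limit, monthly_request_limit):
    """Request that creates a project with monthly token and request limits."""
    return build_post(
        "/api/policy-gateway/projects",
        jwt,
        {
            "name": name,
            "monthly_token_limit": monthly_token_limit,
            "monthly_request_limit": monthly_request_limit,
        },
    )


def create_key_request(jwt, project_id, label):
    """Request that issues a scoped key for an existing project."""
    return build_post(
        f"/api/policy-gateway/projects/{project_id}/keys",
        jwt,
        {"label": label},
    )
```

Pass either request to `urllib.request.urlopen(...)` to send it. The response shape is not documented in this checklist, so inspect it before relying on specific fields.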
curl https://api.abliteration.ai/api/policy-gateway/projects \
-H "Authorization: Bearer $ABLIT_JWT" \
-H "Content-Type: application/json" \
-d '{ "name": "Support bot", "monthly_token_limit": 10000000, "monthly_request_limit": 20000 }'
curl https://api.abliteration.ai/api/policy-gateway/projects/support-bot/keys \
-H "Authorization: Bearer $ABLIT_JWT" \
-H "Content-Type: application/json" \
-d '{ "label": "Support bot prod" }'
Step 3 — Enforce policy + audit
Send production requests through the policy gateway endpoint with policy_user and project IDs attached.
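The same request can be assembled in Python, which keeps the policy headers in one place for every call. This is a minimal sketch: the function name `build_policy_chat_request` is hypothetical, while the endpoint, headers, and body fields mirror the curl example in this step.

```python
import json
import urllib.request


def build_policy_chat_request(policy_key, policy_user, project_id, messages,
                              model="abliterated-model",
                              policy_id="policy-gateway",
                              api_base="https://api.abliteration.ai"):
    """Build (but do not send) a policy-enforced chat completion request."""
    body = {"model": model, "messages": messages, "policy_id": policy_id}
    return urllib.request.Request(
        f"{api_base}/policy/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {policy_key}",
            "Content-Type": "application/json",
            # Attribution headers used for quotas and audit history.
            "X-Policy-User": policy_user,
            "X-Policy-Project": project_id,
        },
        method="POST",
    )
```

Send the result with `urllib.request.urlopen(...)`. Wrapping this in one function makes it harder for a caller to forget the user or project attribution.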
curl https://api.abliteration.ai/policy/chat/completions \
-H "Authorization: Bearer $POLICY_KEY" \
-H "Content-Type: application/json" \
-H "X-Policy-User: user-12345" \
-H "X-Policy-Project: support-bot" \
-d '{
"model": "abliterated-model",
"messages": [{ "role": "user", "content": "Summarize our refund policy." }],
"policy_id": "policy-gateway"
}'
Go-live guardrails
These operational checks keep launches stable and auditable.
- Use shadow mode to observe policy outcomes without enforcement.
- Switch to canary with small sample sizes before full rollout.
- Monitor enforcement spikes and rely on auto-rollback if thresholds are exceeded.
- Export policy JSON to version control after each change.
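If you need to decide on your side which users fall into a canary sample, a deterministic hash keeps each user in the same group across requests. The sketch below is an assumption about how you might do this in your own application code. It does not describe the gateway's built-in canary mechanism, and the name `in_canary` and the `salt` parameter are hypothetical.

```python
import hashlib


def in_canary(policy_user: str, percent: float, salt: str = "canary-v1") -> bool:
    """Return True if `policy_user` falls inside a `percent`-sized canary group.

    The same user always gets the same answer for a given salt, so a user
    does not flip between groups from one request to the next.
    """
    digest = hashlib.sha256(f"{salt}:{policy_user}".encode("utf-8")).digest()
    # Map the hash to one of 10,000 buckets (0.01% resolution).
    bucket = int.from_bytes(digest[:8], "big") % 10000
    return bucket < percent * 100
```

Changing the salt reshuffles the groups, which can be useful when a new rollout should not reuse the previous canary population.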
Common errors & fixes
- 401 Unauthorized: Check that your API key is set and sent as a Bearer token.
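- 404 Not Found: Check the path. For OpenAI-style calls, make sure the base URL ends with /v1 and you call /chat/completions. For policy-enforced traffic, call /policy/chat/completions.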
- 400 Bad Request: Verify the model id and that messages are an array of { role, content } objects.
- 429 Rate limit: Back off and retry. Use the Retry-After header for pacing.
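Many 400 responses can be caught before a request leaves your service. The check below is a small sketch of the validation described in the 400 entry above. The function name `validate_chat_payload` is hypothetical, and treating an empty messages array as invalid is an assumption rather than a documented rule.

```python
def validate_chat_payload(payload):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    model = payload.get("model")
    if not isinstance(model, str) or not model:
        problems.append("model must be a non-empty string")
    messages = payload.get("messages")
    if not isinstance(messages, list) or not messages:
        problems.append("messages must be a non-empty array")
        return problems
    for i, msg in enumerate(messages):
        # Each message needs both a role and a content field.
        if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
            problems.append(f"messages[{i}] must be an object with role and content")
    return problems
```

This only checks shape. Whether a given model id exists can only be confirmed by the API itself.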