abliteration.ai - Uncensored LLM API Platform

Policy Gateway backend guide

This guide covers the backend APIs used to configure Policy Gateway and enforce policy on live traffic.

Management endpoints live under /api/policy-gateway and are authenticated with a user bearer token (the JWT issued by the dashboard).

Enforcement is typically done through /policy/chat/completions with a policy API key (ak_...) plus optional policy_user, policy_project_id, and policy_target metadata.

Quick start

Example request body (a policy config, as sent to POST /api/policy-gateway/config)
{
  "policy_id": "policy-gateway",
  "name": "Support Policy",
  "owner": "Platform team",
  "description": "Ensure support replies follow approved topics.",
  "rules": {
    "allowlist": ["refund policy", "account support"],
    "denylist": ["illegal instructions"],
    "redact": true,
    "rewrite_instead_of_refuse": true,
    "response_pattern": "rewrite",
    "reason_codes": ["ALLOW", "REWRITE", "SUMMARY", "ESCALATE", "REFUSE"],
    "flagged_categories": ["self-harm", "self-harm/intent", "sexual/minors"]
  },
  "org_controls": {
    "project_keys": true,
    "user_quotas": true,
    "audit_logs": true,
    "data_classification": "confidential",
    "user_quota": { "requests": 5000, "tokens": 2000000, "window": "daily" },
    "project_quota": { "requests": 20000, "tokens": 10000000, "window": "monthly" }
  },
  "rollout": {
    "shadow": { "enabled": false, "sample_percent": 20, "targets": [] },
    "canary": { "enabled": false, "sample_percent": 5, "targets": [] },
    "rollback_on_spike": true,
    "rollback_threshold": 0.25,
    "rollback_min_requests": 20,
    "rollback_window_minutes": 15,
    "rollback_cooldown_minutes": 30,
    "rollback_decisions": ["refuse", "escalate"]
  },
  "refusal_replacement": {
    "mode": "rewrite",
    "escalation_path": "policy-review@example.com"
  }
}

Service notes

  • Pricing model: Usage-based pricing (~$5 per 1M tokens) billed on total tokens (input + output). See the API pricing page for current plans.
  • Data retention: No prompt/output retention by default. Operational telemetry (token counts, timestamps, error codes) is retained for billing and reliability.
  • Compatibility: OpenAI-style /v1/chat/completions request and response format with a base URL switch.
  • Latency: Depends on model size, prompt length, and load. Streaming reduces time-to-first-token.
  • Throughput: Team plans include priority throughput. Actual throughput varies with demand.
  • Rate limits: Limits vary by plan and load. Handle 429s with backoff and respect any Retry-After header.

On this page

  • API endpoints and auth
  • Save policy configuration
  • Decision logic (what is evaluated)
  • Enforce policy on live traffic
  • Rollouts and targeting
  • Projects, keys, and quotas
  • Simulate and audit

API endpoints and auth

Policy Gateway splits management APIs (config, projects, history) from the enforcement endpoint.

All endpoints require an Authorization header. Send a user JWT as the bearer token for management endpoints and a policy API key (ak_...) for the enforcement endpoint.

  • GET/POST /api/policy-gateway/config - read/write the saved policy config.
  • POST /api/policy-gateway/simulate - compute decisions for category scenarios.
  • GET /api/policy-gateway/history - list revision/simulation/enforcement entries.
  • GET/POST /api/policy-gateway/projects - manage projects and monthly limits.
  • GET/POST /api/policy-gateway/projects/{project_id}/keys - create scoped policy keys.
  • DELETE /api/policy-gateway/keys/{key_id} - revoke a key.
  • POST /policy/chat/completions - enforce policy on live traffic.
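
The credential split above can be captured in a small client-side helper. This is a minimal Python sketch, not part of the API: `auth_header_for` and its parameter names are hypothetical, and it assumes both credential types are sent as `Authorization: Bearer ...`, as the curl examples in this guide do.

```python
MANAGEMENT_PREFIX = "/api/policy-gateway"
ENFORCEMENT_PATH = "/policy/chat/completions"


def auth_header_for(path: str, user_jwt: str, policy_key: str) -> dict:
    """Return the Authorization header for a Policy Gateway path.

    Hypothetical helper: management paths get the dashboard JWT,
    the enforcement path gets the policy API key (ak_...).
    """
    if path == MANAGEMENT_PREFIX or path.startswith(MANAGEMENT_PREFIX + "/"):
        token = user_jwt
    elif path == ENFORCEMENT_PATH:
        token = policy_key
    else:
        raise ValueError(f"not a Policy Gateway path: {path}")
    return {"Authorization": f"Bearer {token}"}


# Management call: uses the JWT.
print(auth_header_for("/api/policy-gateway/config", "jwt-token", "ak_example"))
# Enforcement call: uses the policy key.
print(auth_header_for("/policy/chat/completions", "jwt-token", "ak_example"))
```

Raising on unknown paths keeps a policy key from being sent to an endpoint that does not expect it.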

Save policy configuration

POST a config object to store the policy and create a revision entry in history.

  • You can send the config directly or wrap it under a top-level config key.
  • Missing fields are filled with defaults; list values are trimmed and capped.
  • Set org_controls.audit_logs to false to disable enforcement history writes.
Example request
curl https://api.abliteration.ai/api/policy-gateway/config \
  -H "Authorization: Bearer $ABLIT_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "config": {
      "policy_id": "policy-gateway",
      "name": "Support Policy",
      "rules": {
        "allowlist": ["refund policy", "account support"],
        "denylist": ["illegal instructions"],
        "response_pattern": "rewrite"
      }
    }
  }'
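
For clients that do not shell out to curl, the same request can be assembled in Python. The sketch below only builds the request; `build_save_config_request` and its `wrap` flag are hypothetical names, and `wrap` illustrates the two accepted body shapes (bare config or config under a top-level config key).

```python
import json
import urllib.request

BASE_URL = "https://api.abliteration.ai"


def build_save_config_request(config: dict, user_jwt: str, wrap: bool = True) -> urllib.request.Request:
    """Build (but do not send) the POST that saves a policy config."""
    # Either body shape is accepted: {"config": {...}} or the bare config.
    body = {"config": config} if wrap else config
    return urllib.request.Request(
        url=f"{BASE_URL}/api/policy-gateway/config",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {user_jwt}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


config = {
    "policy_id": "policy-gateway",
    "name": "Support Policy",
    "rules": {
        "allowlist": ["refund policy", "account support"],
        "denylist": ["illegal instructions"],
        "response_pattern": "rewrite",
    },
}
request = build_save_config_request(config, user_jwt="<dashboard JWT>")
# To send it: urllib.request.urlopen(request)
```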

Decision logic (what is evaluated)

Policy Gateway applies a consistent evaluation order so outcomes are predictable and testable.

  1. Parse metadata (policy_id, policy_user, policy_project_id, policy_target).
  2. Run moderation on the last user message if configured; some categories are hard-blocked.
  3. Match allowlist/denylist terms against the last user message (substring match).
  4. Compute triggered categories: the moderation labels that also appear in rules.flagged_categories.
  5. Choose a decision from response_pattern (rewrite/summary/escalate/refuse).
  6. Resolve rollout mode (shadow/canary/enforced) and apply redaction if enforced.
  7. Emit policy metadata and write history if audit logs are enabled.

  • Allowlist is strict: if an allowlist is configured and nothing matches, the decision becomes refuse.
  • Denylist + flagged categories: either can trigger a non-allow decision.
  • Rewrite flag: if response_pattern is rewrite and rewrite_instead_of_refuse is false, the decision becomes refuse.
  • Reason codes: the first code containing the decision string is used, otherwise defaults apply.
  • Moderation gate: when enabled, sexual/minors and self-harm categories are rejected before policy evaluation.
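
The evaluation rules above can be sketched as a pure function for local testing. This is a sketch under stated assumptions, not the gateway's implementation: the names `evaluate`, `Decision`, and `pick_reason_code` are hypothetical; substring matching is assumed to be case-insensitive; a strict-allowlist miss is assumed to take priority over denylist and category checks; hard-blocked categories are matched by prefix; and the default reason code is assumed to be the upper-cased decision.

```python
from dataclasses import dataclass

# Assumption: categories rejected by the moderation gate, matched by prefix
# so that "self-harm/intent" counts as a self-harm category.
HARD_BLOCK_PREFIXES = ("sexual/minors", "self-harm")


@dataclass
class Decision:
    decision: str              # "allow", "rewrite", "summary", "escalate", or "refuse"
    reason_code: str
    hard_blocked: bool = False


def pick_reason_code(decision: str, reason_codes: list) -> str:
    """Return the first configured code that contains the decision string."""
    for code in reason_codes:
        if decision.lower() in code.lower():
            return code
    return decision.upper()  # assumed default when nothing matches


def evaluate(rules: dict, last_user_message: str, moderation_labels: list,
             moderation_gate: bool = True) -> Decision:
    codes = rules.get("reason_codes", [])
    # Moderation gate: rejected before any policy evaluation.
    if moderation_gate and any(l.startswith(HARD_BLOCK_PREFIXES) for l in moderation_labels):
        return Decision("refuse", pick_reason_code("refuse", codes), hard_blocked=True)

    text = last_user_message.lower()  # assumption: case-insensitive matching
    allowlist = rules.get("allowlist", [])
    allow_hit = any(term.lower() in text for term in allowlist)
    deny_hit = any(term.lower() in text for term in rules.get("denylist", []))
    flagged = set(rules.get("flagged_categories", []))
    triggered = [l for l in moderation_labels if l in flagged]

    if allowlist and not allow_hit:
        decision = "refuse"                      # strict allowlist
    elif deny_hit or triggered:
        decision = rules.get("response_pattern", "refuse")
        if decision == "rewrite" and not rules.get("rewrite_instead_of_refuse", True):
            decision = "refuse"                  # rewrite flag disabled
    else:
        decision = "allow"
    return Decision(decision, pick_reason_code(decision, codes))


rules = {
    "allowlist": ["refund policy", "account support"],
    "denylist": ["illegal instructions"],
    "rewrite_instead_of_refuse": True,
    "response_pattern": "rewrite",
    "reason_codes": ["ALLOW", "REWRITE", "SUMMARY", "ESCALATE", "REFUSE"],
    "flagged_categories": ["self-harm", "self-harm/intent", "sexual/minors"],
}
print(evaluate(rules, "Summarize our refund policy.", []))
print(evaluate(rules, "Tell me a joke.", []))
```

With the quick-start rules, the first call yields an allow decision and the second a refuse, because the allowlist is configured and no term matches.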

Enforce policy on live traffic

Send OpenAI-compatible chat completions through the policy endpoint.

  • policy_id is optional; if provided it must match the saved (or inline) policy id.
  • Provide policy_user and policy_project_id (or headers) for quotas and audits.
  • Inline policy overrides are supported via config or policy in the request body.
  • Responses include a policy object with decision metadata and rollout state.
Example request
curl https://api.abliteration.ai/policy/chat/completions \
  -H "Authorization: Bearer $POLICY_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Policy-User: user-12345" \
  -H "X-Policy-Project: support-bot" \
  -H "X-Policy-Target: support-bot" \
  -d '{
    "model": "abliterated-model",
    "messages": [
      { "role": "user", "content": "Summarize our refund policy." }
    ],
    "policy_id": "policy-gateway"
  }'
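
The curl example passes policy metadata as X-Policy-* headers. The body fields policy_user, policy_project_id, and policy_target carry the same metadata, and a payload builder for that variant might look like the following sketch. `build_enforcement_payload` is a hypothetical name, and the sketch assumes the headers and body fields are interchangeable, as this guide suggests.

```python
def build_enforcement_payload(model, messages, policy_id=None, policy_user=None,
                              policy_project_id=None, policy_target=None) -> dict:
    """Build an OpenAI-style chat payload with optional policy metadata."""
    payload = {"model": model, "messages": messages}
    optional = {
        "policy_id": policy_id,
        "policy_user": policy_user,
        "policy_project_id": policy_project_id,
        "policy_target": policy_target,
    }
    # Only include metadata the caller actually supplied.
    payload.update({key: value for key, value in optional.items() if value is not None})
    return payload


payload = build_enforcement_payload(
    model="abliterated-model",
    messages=[{"role": "user", "content": "Summarize our refund policy."}],
    policy_id="policy-gateway",
    policy_user="user-12345",
    policy_project_id="support-bot",
    policy_target="support-bot",
)
# POST this as JSON to /policy/chat/completions with the policy API key as bearer token.
```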

Rollouts and targeting

Rollouts are configured in the policy config and targeted with X-Policy-Target or policy_target.

  • Shadow rollouts mark sampled requests as rollout_mode=shadow and do not enforce.
  • Canary rollouts enforce only the sampled percentage; unsampled requests are allowed.
  • Shadow takes precedence over canary when both are enabled.
  • If automatic rollback is enabled and enforcement spikes, Policy Gateway temporarily disables enforcement.
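
The rollout rules above can be sketched as two small functions using the rollout fields from the quick-start config. Several details here are assumptions rather than documented behavior: an empty targets list is treated as "all targets"; unsampled requests under a shadow rollout are treated as normally enforced; unsampled canary requests keep the canary mode label; and a "spike" is read as the fraction of recent decisions that fall in rollback_decisions reaching rollback_threshold once at least rollback_min_requests were seen. All function names are hypothetical.

```python
def _in_scope(stage: dict, target: str) -> bool:
    """True when the stage is enabled and applies to this target."""
    targets = stage.get("targets") or []
    # Assumption: an empty targets list means the stage applies to every target.
    return bool(stage.get("enabled")) and (not targets or target in targets)


def resolve_rollout(rollout: dict, target: str, sample_roll: float) -> tuple:
    """Return (rollout_mode, enforce). sample_roll is a number in [0, 100)."""
    shadow = rollout.get("shadow", {})
    canary = rollout.get("canary", {})
    if _in_scope(shadow, target):  # shadow takes precedence over canary
        if sample_roll < shadow.get("sample_percent", 0):
            return ("shadow", False)   # marked as shadow, not enforced
        return ("enforced", True)      # assumption: unsampled requests are enforced
    if _in_scope(canary, target):
        # Only the sampled percentage is enforced; the rest are allowed.
        return ("canary", sample_roll < canary.get("sample_percent", 0))
    return ("enforced", True)


def should_rollback(recent_decisions: list, rollout: dict) -> bool:
    """Assumed spike check over the decisions seen in the rollback window."""
    total = len(recent_decisions)
    if not rollout.get("rollback_on_spike") or total == 0:
        return False
    if total < rollout.get("rollback_min_requests", 0):
        return False
    watched = set(rollout.get("rollback_decisions", []))
    hits = sum(1 for decision in recent_decisions if decision in watched)
    return hits / total >= rollout.get("rollback_threshold", 1.0)


rollout = {
    "shadow": {"enabled": False, "sample_percent": 20, "targets": []},
    "canary": {"enabled": True, "sample_percent": 5, "targets": []},
    "rollback_on_spike": True,
    "rollback_threshold": 0.25,
    "rollback_min_requests": 20,
    "rollback_decisions": ["refuse", "escalate"],
}
print(resolve_rollout(rollout, "support-bot", sample_roll=3.0))
print(should_rollback(["refuse"] * 6 + ["allow"] * 14, rollout))
```

Passing sample_roll in explicitly (for example random.random() * 100 in production) keeps the function deterministic in tests.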

Projects, keys, and quotas

Create a project per app or agent, issue a scoped key, and attach policy_user for per-user quotas.

  • Project ids are normalized to lowercase slugs; the response contains the canonical id.
  • If org_controls.project_keys is true, API-key requests must include a project id (or use a scoped key).
  • Project monthly limits override config project_quota and always use a monthly window.
  • User quotas use policy_user (or API key id when omitted) and honor the configured window.
Example requests
curl https://api.abliteration.ai/api/policy-gateway/projects \
  -H "Authorization: Bearer $ABLIT_JWT" \
  -H "Content-Type: application/json" \
  -d '{ "name": "Support bot", "monthly_token_limit": 10000000, "monthly_request_limit": 20000 }'

curl https://api.abliteration.ai/api/policy-gateway/projects/support-bot/keys \
  -H "Authorization: Bearer $ABLIT_JWT" \
  -H "Content-Type: application/json" \
  -d '{ "label": "Support bot prod" }'
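
The normalization and quota rules in this section can be approximated client-side. The sketch below is illustrative only: the exact slug rules are not documented here, so `slugify_project_id` assumes runs of non-alphanumeric characters become single hyphens (the canonical id in the API response remains authoritative), and `effective_project_quota` and `quota_subject` are hypothetical helpers that restate the override rules.

```python
import re


def slugify_project_id(name: str) -> str:
    """Approximate the lowercase-slug normalization (assumed rules)."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.strip().lower())
    return slug.strip("-")


def effective_project_quota(config_quota: dict, project_limits: dict = None) -> dict:
    """Project monthly limits override config project_quota and force a monthly window."""
    if project_limits:
        return {
            "requests": project_limits.get("monthly_request_limit"),
            "tokens": project_limits.get("monthly_token_limit"),
            "window": "monthly",
        }
    return dict(config_quota)


def quota_subject(policy_user: str = None, api_key_id: str = None) -> str:
    """User quotas are keyed by policy_user, falling back to the API key id."""
    return policy_user or api_key_id


print(slugify_project_id("Support bot"))  # support-bot
print(effective_project_quota(
    {"requests": 20000, "tokens": 10000000, "window": "daily"},
    {"monthly_token_limit": 5000000, "monthly_request_limit": 1000},
))
```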

Simulate and audit

Use simulation for dry runs and history for audits.

  • Simulation compares categories against rules.flagged_categories only.
  • Set persist: false to skip writing a simulation entry.
  • History results are newest-first and capped by POLICY_HISTORY_LIMIT (default 50).
Example requests
curl https://api.abliteration.ai/api/policy-gateway/simulate \
  -H "Authorization: Bearer $ABLIT_JWT" \
  -H "Content-Type: application/json" \
  -d '{ "categories": ["self-harm/intent"] }'

curl "https://api.abliteration.ai/api/policy-gateway/history?type=enforcement&limit=20" \
  -H "Authorization: Bearer $ABLIT_JWT"
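
The simulation rule and the two requests above can be mirrored in a few helper functions. This is a sketch with hypothetical function names: `triggered_categories` assumes exact string matching against rules.flagged_categories, and the payload and URL builders use only the fields shown in this section (categories, persist, type, limit).

```python
from urllib.parse import urlencode

BASE_URL = "https://api.abliteration.ai"


def triggered_categories(categories: list, flagged_categories: list) -> list:
    """Local approximation: simulation only compares against flagged_categories."""
    flagged = set(flagged_categories)
    return [category for category in categories if category in flagged]


def build_simulate_payload(categories: list, persist: bool = True) -> dict:
    """Body for POST /api/policy-gateway/simulate."""
    payload = {"categories": list(categories)}
    if not persist:
        payload["persist"] = False  # skip writing a simulation entry to history
    return payload


def history_url(entry_type: str, limit: int) -> str:
    """URL for GET /api/policy-gateway/history with the query used above."""
    query = urlencode({"type": entry_type, "limit": limit})
    return f"{BASE_URL}/api/policy-gateway/history?{query}"


flagged = ["self-harm", "self-harm/intent", "sexual/minors"]
print(triggered_categories(["self-harm/intent", "violence"], flagged))
print(build_simulate_payload(["self-harm/intent"], persist=False))
print(history_url("enforcement", 20))
```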

Common errors & fixes

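  • 401 Unauthorized: Check that the credential is sent as a Bearer token: a user JWT for management endpoints, a policy API key (ak_...) for enforcement.
  • 404 Not Found: Check the path. The OpenAI-compatible endpoint expects a base URL ending in /v1 followed by /chat/completions; Policy Gateway enforcement uses /policy/chat/completions, and management endpoints live under /api/policy-gateway.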
  • 400 Bad Request: Verify the model id and that messages are an array of { role, content } objects.
  • 429 Rate limit: Back off and retry. Use the Retry-After header for pacing.
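
The 429 guidance can be sketched as a delay calculator. This is one common approach (exponential backoff with full jitter), not a documented requirement of the service: `retry_delay_seconds` and its defaults are hypothetical, and only the delay-in-seconds form of Retry-After is parsed, with the HTTP-date form falling back to backoff.

```python
import random


def retry_delay_seconds(attempt: int, retry_after: str = None,
                        base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based) after a 429."""
    if retry_after is not None:
        try:
            # Retry-After given as a number of seconds.
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # HTTP-date form is not handled in this sketch
    # Exponential backoff capped at `cap`, with full jitter.
    backoff = min(cap, base * (2 ** attempt))
    return random.uniform(0, backoff)


print(retry_delay_seconds(0, retry_after="5"))  # 5.0
print(retry_delay_seconds(3))                   # somewhere between 0 and 8 seconds
```

Honoring Retry-After first and only then falling back to jittered backoff avoids synchronized retries from many clients.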

Related links

  • Policy Gateway integration contract
  • Policy Gateway connectors
  • Policy Gateway security & privacy
  • Policy Gateway onboarding checklist
  • Policy gateway feature page
  • OpenAI compatibility guide
  • Rate limits and retries
  • Anthropic Pentagon case explainer
  • API pricing
  • Privacy policy

© 2025 Social Keyboard, Inc. All rights reserved.