abliteration.ai - Uncensored LLM API Platform

Policy Gateway for Azure OpenAI (vendor-agnostic)

Policy Gateway gives Azure OpenAI teams a vendor-agnostic control plane for quotas, audit trails, and rollout safety.

It sits between your apps and any model provider as a monitoring and governance gateway, so the same policy applies to Azure OpenAI, OpenAI, Anthropic, or local models.

Quick start

Example request
curl https://api.abliteration.ai/policy/chat/completions \
  -H "Authorization: Bearer $POLICY_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Policy-User: user-12345" \
  -H "X-Policy-Project: finance-assistant" \
  -d '{
    "model": "abliterated-model",
    "messages": [{"role": "user", "content": "Summarize Q4 usage trends."}],
    "policy_id": "azure-openai-governance"
  }'
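
For teams calling the gateway from application code, the same request can be sketched in Python with only the standard library. The endpoint, header names, model id, and policy_id are copied from the curl example above; the function name build_policy_request is hypothetical, and the call that actually sends the request is left commented out because it needs a valid key and network access.

```python
import json
import os
import urllib.request

# Assumption: endpoint and header names are taken from the curl example above.
POLICY_URL = "https://api.abliteration.ai/policy/chat/completions"


def build_policy_request(api_key, user, project, prompt,
                         policy_id="azure-openai-governance"):
    """Build (but do not send) the POST request shown in the curl example."""
    body = {
        "model": "abliterated-model",
        "messages": [{"role": "user", "content": prompt}],
        "policy_id": policy_id,
    }
    return urllib.request.Request(
        POLICY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "X-Policy-User": user,
            "X-Policy-Project": project,
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_policy_request(
        os.environ.get("POLICY_KEY", ""),
        "user-12345",
        "finance-assistant",
        "Summarize Q4 usage trends.",
    )
    # Sending requires a valid key and network access:
    # with urllib.request.urlopen(req, timeout=30) as resp:
    #     print(json.load(resp))
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes it easy to check the tags on every request in unit tests before any traffic reaches the gateway.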

Service notes

  • Pricing model: Usage-based pricing (~$5 per 1M tokens) billed on total tokens (input + output). See the API pricing page for current plans.
  • Data retention: No prompt/output retention by default. Operational telemetry (token counts, timestamps, error codes) is retained for billing and reliability.
  • Compatibility: OpenAI-style /v1/chat/completions request and response format with a base URL switch.
  • Latency: Depends on model size, prompt length, and load. Streaming reduces time-to-first-token.
  • Throughput: Team plans include priority throughput. Actual throughput varies with demand.
  • Rate limits: Limits vary by plan and load. Handle 429s with backoff and respect any Retry-After header.
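
As a rough illustration of billing on total tokens, the sketch below estimates cost from input and output token counts. It assumes a flat rate of $5 per 1M tokens, which is only the approximate figure quoted above; actual plan pricing may differ, and the function name is hypothetical.

```python
# Assumption: flat rate of $5 per 1M tokens (the approximate figure quoted above).
ASSUMED_USD_PER_MILLION_TOKENS = 5.0


def estimate_cost_usd(input_tokens, output_tokens,
                      usd_per_million=ASSUMED_USD_PER_MILLION_TOKENS):
    """Estimate cost when billing is on total tokens (input + output)."""
    total_tokens = input_tokens + output_tokens
    return total_tokens * usd_per_million / 1_000_000


if __name__ == "__main__":
    # 1,200 prompt tokens plus 300 completion tokens = 1,500 billed tokens.
    print(f"${estimate_cost_usd(1200, 300):.4f}")
```

Under that assumed rate, a request with 1,200 input tokens and 300 output tokens comes to about $0.0075.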

Architecture: gateway for monitoring and governance

Route all LLM traffic through Policy Gateway to standardize quotas, audit logs, and policy enforcement.

  • Apps send requests to /policy/chat/completions.
  • Policy Gateway enforces policy-as-code and attaches audit metadata.
  • Upstream routing stays vendor-agnostic (Azure OpenAI, OpenAI, local, or multi-provider).

Enforce per-user and per-project quotas

Quotas are driven by policy_user and policy_project_id tags.

  • Tag every request with policy_user for chargeback and usage tracking.
  • Use per-project keys for app-level budgets and isolation.
  • Quota violations return a policy decision with reason codes.
Example quota policy
{
  "policy_id": "azure-openai-governance",
  "org_controls": {
    "user_quotas": true,
    "project_keys": true,
    "user_quota": { "requests": 500, "tokens": 100000, "window": "daily" },
    "project_quota": { "requests": 20000, "tokens": 5000000, "window": "monthly" }
  }
}
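
To make the quota fields concrete, here is a minimal sketch of how request and token limits like those in the config could be evaluated for one window. It is not the gateway's implementation: the class names, the check_quota and record functions, and the reason code strings are all hypothetical, and resetting counters at the end of each daily or monthly window is left out.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    requests: int
    tokens: int
    window: str  # "daily" or "monthly", as in the config above


@dataclass
class Usage:
    requests: int = 0
    tokens: int = 0


def check_quota(quota, usage, request_tokens):
    """Return (allowed, reason_code) for one incoming request."""
    if usage.requests + 1 > quota.requests:
        return False, "request_quota_exceeded"  # hypothetical reason code
    if usage.tokens + request_tokens > quota.tokens:
        return False, "token_quota_exceeded"  # hypothetical reason code
    return True, None


def record(usage, request_tokens):
    """Count an allowed request against the current window."""
    usage.requests += 1
    usage.tokens += request_tokens


if __name__ == "__main__":
    user_quota = Quota(requests=500, tokens=100000, window="daily")
    usage = Usage(requests=499, tokens=99000)
    print(check_quota(user_quota, usage, 500))  # last request that still fits
    record(usage, 500)
    print(check_quota(user_quota, usage, 100))  # request limit is now reached
```

Checking before recording means a denied request does not consume quota, which is one reasonable design choice among several.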

Tag and export audit events

Audit logs include decision metadata plus the tags you send with each request.

Export logs to the SIEM or log platform your security team already uses. If your audits also require prompt and output content, log inputs and outputs in your own systems.

  • policy_user, policy_project_id, and policy_target appear in every audit event.
  • Export destinations: Splunk HEC, Datadog Logs, Elastic, Amazon S3, Azure Monitor / Log Analytics.
  • Policy Gateway security & privacy explains what is stored.
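
The sketch below shows one way an audit event carrying these tags could be represented and serialized as JSON Lines, a format most log platforms can ingest. Only policy_user, policy_project_id, and policy_target come from this page; the timestamp, decision, and reason_code fields, the function names, and the sample values are assumptions for illustration, not the gateway's actual event schema.

```python
import json
from datetime import datetime, timezone


def make_audit_event(policy_user, policy_project_id, policy_target,
                     decision, reason_code=None, timestamp=None):
    """Build a dict for one audit event (hypothetical schema)."""
    ts = timestamp or datetime.now(timezone.utc)
    return {
        "timestamp": ts.isoformat(),
        "policy_user": policy_user,
        "policy_project_id": policy_project_id,
        "policy_target": policy_target,
        "decision": decision,        # assumed field
        "reason_code": reason_code,  # assumed field
    }


def to_json_lines(events):
    """Serialize events as one JSON object per line."""
    return "\n".join(json.dumps(event, sort_keys=True) for event in events)


if __name__ == "__main__":
    event = make_audit_event(
        "user-12345", "finance-assistant", "azure-openai",  # sample values
        decision="allow",
    )
    print(to_json_lines([event]))
```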

Shadow mode and canary rollouts

Shadow and canary let you validate policy changes before enforcing them on all traffic.

  • Shadow mode logs decisions without enforcement.
  • Canary mode enforces on a sample before full rollout.
  • Auto-rollback (rollback_on_spike) protects against spikes in policy decisions.
Example rollout configuration
{
  "rollout": {
    "shadow": { "enabled": true, "sample_percent": 20, "targets": ["finance-assistant"] },
    "canary": { "enabled": true, "sample_percent": 5, "targets": ["finance-assistant"] },
    "rollback_on_spike": true
  }
}
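
One common way to apply a sample_percent is deterministic hashing, so that a given request id always lands in the same bucket. The sketch below illustrates that technique against the rollout config above. How the gateway actually samples, and how shadow and canary interact when both are enabled, is not specified here; the function names, the returned mode strings, and the rule that canary is checked before shadow are all assumptions.

```python
import hashlib


def in_sample(request_id, sample_percent, salt):
    """Deterministically place a request id in one of 100 buckets."""
    digest = hashlib.sha256(f"{salt}:{request_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < sample_percent


def rollout_mode(request_id, project, rollout):
    """Return "enforce", "log_only", or "skip" for the new policy (assumed semantics)."""
    canary = rollout["canary"]
    shadow = rollout["shadow"]
    if (canary["enabled"] and project in canary["targets"]
            and in_sample(request_id, canary["sample_percent"], "canary")):
        return "enforce"
    if (shadow["enabled"] and project in shadow["targets"]
            and in_sample(request_id, shadow["sample_percent"], "shadow")):
        return "log_only"
    return "skip"


if __name__ == "__main__":
    rollout = {
        "shadow": {"enabled": True, "sample_percent": 20, "targets": ["finance-assistant"]},
        "canary": {"enabled": True, "sample_percent": 5, "targets": ["finance-assistant"]},
        "rollback_on_spike": True,
    }
    print(rollout_mode("req-001", "finance-assistant", rollout))
```

Using a different salt for shadow and canary keeps the two samples independent, so the canary set is not simply a subset of the shadow set.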

Deployment checklist

  • Define policy rules, quotas, and reason codes.
  • Create projects and scoped keys per app or tenant.
  • Export audit logs to your SIEM for compliance reviews.
  • Start in shadow mode, then canary, then enforce.

Common errors & fixes

  • 401 Unauthorized: Check that your API key is set and sent as a Bearer token.
  • 404 Not Found: Check the request path. Policy Gateway requests go to /policy/chat/completions; the standard OpenAI-style endpoint is /v1/chat/completions.
  • 400 Bad Request: Verify the model id and that messages are an array of { role, content } objects.
  • 429 Rate limit: Back off and retry. Use the Retry-After header for pacing.
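
The 429 guidance can be sketched as a small retry helper: use Retry-After when the server sends a numeric value, otherwise fall back to exponential backoff with jitter. The function names, the default base and cap, and the attempt limit are assumptions; the opener and sleep parameters exist only so the helper can be tested without network access.

```python
import random
import time
import urllib.error
import urllib.request


def retry_delay(attempt, retry_after=None, base=1.0, cap=30.0):
    """Seconds to wait before retrying after failed attempt number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # Retry-After can also be an HTTP date; fall back to backoff
    # Exponential backoff with full jitter, capped at `cap` seconds.
    return random.uniform(0, min(cap, base * 2 ** attempt))


def send_with_retries(request, max_attempts=5,
                      opener=urllib.request.urlopen, sleep=time.sleep):
    """Send a request, retrying only on HTTP 429."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            with opener(request, timeout=30) as resp:
                return resp.read()
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            sleep(retry_delay(attempt, err.headers.get("Retry-After")))
```

Other errors such as 400 or 401 are raised immediately, since retrying them without changing the request would not help.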

Related links

  • Azure APIM AI Gateway vs Policy Gateway
  • Policy Gateway onboarding checklist
  • Policy Gateway security & privacy
  • Splunk HEC export
  • Datadog Logs export
  • Elastic audit log export
  • Amazon S3 export
  • Azure Monitor / Log Analytics export
  • Rate limits and retries
  • API pricing
  • Privacy policy

© 2025 Social Keyboard, Inc. All rights reserved.