Can I apply quotas without changing prompts?
Yes. Quotas use policy_user and policy_project_id metadata, not prompt content.
AI Gateway
AI gateways are expected to enforce quotas. Policy Gateway adds per-user and per-project token limits to LLM traffic.
Attach policy_user and policy_project_id to enforce budgets and keep audit trails clean.
Token quotas for LLM APIs are AI gateway controls that cap usage per user, per project, or per tenant to protect spend and prevent abuse.
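As a rough illustration of what such a cap involves, the sketch below counts tokens per key (a user, project, or tenant id) inside a fixed time window and rejects a request that would push the total over the limit. It is not Policy Gateway's implementation: the class name `FixedWindowQuota`, the method `try_consume`, and the fixed-window approach are assumptions made for this example. The 5000-token daily figure mirrors the `user_quota` in the sample policy.

```python
from dataclasses import dataclass, field


@dataclass
class FixedWindowQuota:
    """Hypothetical fixed-window token cap keyed by user, project, or tenant id."""
    max_tokens: int
    window_seconds: int
    # key -> (window_index, tokens_used_in_that_window)
    _usage: dict = field(default_factory=dict)

    def try_consume(self, key: str, tokens: int, now: float) -> bool:
        """Record the usage and return True if it fits, otherwise return False."""
        window = int(now // self.window_seconds)
        seen_window, used = self._usage.get(key, (window, 0))
        if seen_window != window:
            used = 0  # a new window started, so the counter resets
        if used + tokens > self.max_tokens:
            self._usage[key] = (window, used)
            return False
        self._usage[key] = (window, used + tokens)
        return True


DAY = 24 * 60 * 60
user_quota = FixedWindowQuota(max_tokens=5000, window_seconds=DAY)

print(user_quota.try_consume("user-8932", 4000, now=0))        # True: under the cap
print(user_quota.try_consume("user-8932", 2000, now=3600))     # False: 6000 > 5000
print(user_quota.try_consume("user-8932", 2000, now=DAY + 1))  # True: next window
```

A fixed window is the simplest scheme to reason about. A production gateway might instead use sliding windows or token buckets to smooth bursts at window boundaries.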
curl https://api.abliteration.ai/policy/chat/completions \
  -H "Authorization: Bearer $POLICY_KEY" \
  -H "Content-Type: application/json" \
  -H "X-Policy-User: user-8932" \
  -H "X-Policy-Project: pro-plan" \
  -d '{
    "model": "abliterated-model",
    "messages": [{"role":"user","content":"Summarize the latest invoice."}],
    "policy_id": "quota-control"
  }'

{
  "policy_id": "quota-control",
  "name": "Quota control",
  "owner": "Platform team",
  "description": "Per-user and per-project token caps.",
  "rules": {
    "allowlist": ["billing", "support", "account"],
    "denylist": ["credential theft"],
    "flagged_categories": ["self-harm/intent", "sexual/minors"],
    "response_pattern": "refuse",
    "rewrite_instead_of_refuse": false,
    "redact": true,
    "reason_codes": ["ALLOW", "REFUSE", "REDACT"]
  },
  "org_controls": {
    "project_keys": true,
    "user_quotas": true,
    "audit_logs": true,
    "data_classification": "restricted",
    "user_quota": { "requests": 60, "tokens": 5000, "window": "daily" },
    "project_quota": { "requests": 30000, "tokens": 3000000, "window": "monthly" }
  },
  "rollout": {
    "shadow": { "enabled": false, "sample_percent": 0, "targets": [] },
    "canary": { "enabled": false, "sample_percent": 0, "targets": [] },
    "rollback_on_spike": true
  },
  "refusal_replacement": { "mode": "refuse", "escalation_path": "policy-review@company.com" }
}

User 8932 consumes 50k tokens in one hour with no limits.
decision: refuse
reason_code: USER_QUOTA_EXCEEDED
policy_user: user-8932
policy_project_id: pro-plan
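To connect the scenario to the sample policy: the `user_quota` allows 5000 tokens per day, so a user who has already used that allowance would be refused on the next request. The sketch below shows one way a decision record with the same four fields might be produced. The function `decide` is hypothetical, and the `allow` / `ALLOW` values for the non-refusal case are an assumption based on the `reason_codes` list in the policy.

```python
def decide(policy_user, policy_project_id, tokens_used, tokens_requested, user_cap):
    """Return a decision record; refuse when the request would exceed the user cap."""
    over = tokens_used + tokens_requested > user_cap
    return {
        "decision": "refuse" if over else "allow",
        "reason_code": "USER_QUOTA_EXCEEDED" if over else "ALLOW",
        "policy_user": policy_user,
        "policy_project_id": policy_project_id,
    }


# user-8932 has used the full daily allowance and asks for 200 more tokens.
record = decide("user-8932", "pro-plan",
                tokens_used=5000, tokens_requested=200, user_cap=5000)
for name, value in record.items():
    print(f"{name}: {value}")
```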
Verify quota behavior and audit tags before enforcing limits.
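One way to do that verification is a dry run: replay recent usage against the proposed cap and list which requests would have been refused and which are missing their audit tags, without blocking anything. The sketch below assumes a hypothetical event format (dicts with `policy_user`, `policy_project_id`, and `tokens`) covering a single quota window. The function `dry_run` is illustrative, not part of any gateway API.

```python
from collections import defaultdict


def dry_run(events, user_cap):
    """Replay one window of events; return (would_refuse, untagged) index lists."""
    used = defaultdict(int)
    would_refuse, untagged = [], []
    for i, event in enumerate(events):
        user = event.get("policy_user")
        project = event.get("policy_project_id")
        if not user or not project:
            untagged.append(i)  # cannot be attributed, so flag it for follow-up
            continue
        if used[user] + event["tokens"] > user_cap:
            would_refuse.append(i)
        else:
            used[user] += event["tokens"]
    return would_refuse, untagged


events = [
    {"policy_user": "user-8932", "policy_project_id": "pro-plan", "tokens": 3000},
    {"policy_user": "user-8932", "policy_project_id": "pro-plan", "tokens": 2500},
    {"policy_user": "user-8932", "tokens": 100},  # missing project tag
]
print(dry_run(events, user_cap=5000))  # ([1], [2])
```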
FAQ
Project-scoped keys map requests to projects, which makes quota enforcement automatic.
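A minimal sketch of that mapping, assuming a simple lookup table: the key itself identifies the project, so the caller only has to supply the user. The key strings, the second project id, and the helper names `resolve_project` and `tag_request` are all hypothetical.

```python
# Hypothetical key-to-project table; real keys would live in the gateway's key store.
PROJECT_BY_KEY = {
    "pk_example_billing": "pro-plan",
    "pk_example_support": "support-desk",
}


def resolve_project(api_key: str) -> str:
    """Return the project id bound to a project-scoped key."""
    try:
        return PROJECT_BY_KEY[api_key]
    except KeyError:
        raise PermissionError("unknown or unscoped key") from None


def tag_request(api_key: str, policy_user: str) -> dict:
    """Build quota metadata for a request without the caller naming a project."""
    return {
        "policy_user": policy_user,
        "policy_project_id": resolve_project(api_key),
    }


print(tag_request("pk_example_billing", "user-8932"))
```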