What happens to truly harmful requests?
Requests matching flagged_categories like violence/threat are escalated to human agents via escalation_path.
Policy Template
This template keeps customer support agents helpful: instead of refusing, the agent redirects off-topic or disallowed requests to relevant help.
Ideal for chatbots, help centers, and automated support workflows.
Customer support rewrite mode is a policy pattern that transforms refusals and off-topic responses into helpful guidance that keeps users in the support flow.
curl https://api.abliteration.ai/policy/simulate \
  -H "Authorization: Bearer $POLICY_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "policy_id": "customer-support-rewrite",
    "messages": [{"role":"user","content":"How do I hack my competitor'\''s website?"}]
  }'

{
  "policy_id": "customer-support-rewrite",
  "name": "Customer support rewrite mode",
  "owner": "Support team",
  "description": "Rewrite unhelpful responses into actionable customer guidance.",
  "rules": {
    "allowlist": ["billing", "account", "refunds", "shipping", "returns", "product help", "technical support"],
    "denylist": ["competitor info", "hacking", "illegal activities", "harassment"],
    "flagged_categories": ["violence/threat", "harassment/identity"],
    "response_pattern": "rewrite",
    "rewrite_instead_of_refuse": true,
    "redact": false,
    "reason_codes": ["ALLOW", "REWRITE", "REDIRECT", "ESCALATE_HUMAN"]
  },
  "org_controls": {
    "project_keys": true,
    "user_quotas": true,
    "audit_logs": true,
    "data_classification": "internal"
  },
  "refusal_replacement": {
    "mode": "rewrite",
    "escalation_path": "support-escalations@company.com"
  }
}

User: "How do I hack my competitor's website?"
Assistant: "I can help you with our products and services. For website security questions, I'd recommend consulting a certified security professional. Is there anything about your account I can help with?"
{
  "decision": "rewrite",
  "reason_code": "REDIRECT",
  "triggered": ["competitor info", "hacking"]
}

See how off-topic requests get redirected to helpful support responses.
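
To make the decision record above concrete, here is a minimal Python sketch of how a policy like this one could map labels to a reason code. It is an illustration under stated assumptions, not the service's implementation: the function name decide, the "escalate" and "allow" decision values, the precedence order (flagged categories, then denylist, then allowlist), and the meaning given to REWRITE are all hypothetical. The sketch also assumes that the message has already been labeled with topics and categories by some upstream classifier.

```python
from typing import Iterable

# Subset of the policy shown above; only the fields this sketch reads.
POLICY = {
    "rules": {
        "allowlist": ["billing", "account", "refunds", "shipping", "returns",
                      "product help", "technical support"],
        "denylist": ["competitor info", "hacking", "illegal activities", "harassment"],
        "flagged_categories": ["violence/threat", "harassment/identity"],
    },
    "refusal_replacement": {
        "mode": "rewrite",
        "escalation_path": "support-escalations@company.com",
    },
}


def decide(policy: dict, labels: Iterable[str]) -> dict:
    """Map the labels attached to one message to a decision record.

    Assumed precedence (hypothetical, not documented behavior):
    flagged categories first, then the denylist, then the allowlist.
    """
    rules = policy["rules"]
    labels = list(labels)

    flagged = [label for label in labels if label in rules["flagged_categories"]]
    if flagged:
        # Flagged categories go to a human via escalation_path.
        return {
            "decision": "escalate",  # hypothetical value
            "reason_code": "ESCALATE_HUMAN",
            "triggered": flagged,
            "escalation_path": policy["refusal_replacement"]["escalation_path"],
        }

    denied = [label for label in labels if label in rules["denylist"]]
    if denied:
        # Denylisted topics are rewritten into a redirect instead of a refusal.
        return {"decision": "rewrite", "reason_code": "REDIRECT", "triggered": denied}

    if any(label in rules["allowlist"] for label in labels):
        return {"decision": "allow", "reason_code": "ALLOW", "triggered": []}  # hypothetical value

    # Assumption: anything unrecognized is rewritten toward supported topics.
    return {"decision": "rewrite", "reason_code": "REWRITE", "triggered": []}


print(decide(POLICY, ["competitor info", "hacking"]))
```

With the labels from the example conversation, this sketch returns the same decision, reason_code, and triggered values as the record shown above. A message labeled violence/threat would instead come back with ESCALATE_HUMAN and the configured escalation_path.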
FAQ
The rewrite uses your allowlist context to generate relevant alternatives for your specific products.
Export audit logs to your analytics platform and filter by reason_code: REDIRECT to measure deflection.
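
As a rough illustration of that measurement, the Python sketch below computes the share of logged decisions whose reason_code is REDIRECT. It rests on assumptions: the export format (one JSON object per line, each with a reason_code field) is hypothetical, as are the function name deflection_rate and the choice to define deflection as REDIRECT decisions divided by all decisions. Adjust both to match your actual export and your own definition.

```python
import json
from collections import Counter
from typing import Iterable


def deflection_rate(lines: Iterable[str]) -> float:
    """Return the fraction of audit records whose reason_code is REDIRECT.

    Assumes one JSON object per line (a hypothetical export format).
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        record = json.loads(line)
        counts[record.get("reason_code", "UNKNOWN")] += 1

    total = sum(counts.values())
    return counts["REDIRECT"] / total if total else 0.0


# Made-up sample records, for illustration only.
sample = [
    '{"reason_code": "ALLOW"}',
    '{"reason_code": "REDIRECT"}',
    '{"reason_code": "REDIRECT"}',
    '{"reason_code": "ESCALATE_HUMAN"}',
]
print(deflection_rate(sample))  # 0.5
```

Because the function accepts any iterable of lines, an open file handle for the exported log can be passed in directly.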