Authorized penetration testing with governed AI: full workflow and controls
End-to-end workflow for using governed AI in authorized penetration testing. Covers scoped keys, audit logs, policy enforcement, and client-engagement isolation.
Security firms need AI that answers technical questions without triggering generic hacking filters — but they also need governance controls to satisfy enterprise clients. This page documents the full workflow.
abliteration.ai combines an abliterated model (reduced refusals for authorized technical work) with Policy Gateway (audit logs, scoped keys, per-engagement isolation) to solve both problems.
Authorized penetration testing with governed AI is a workflow where security teams use abliterated models for technical research while enforcing audit trails, scoped access, and policy controls through Policy Gateway.
- Mainstream LLMs refuse legitimate security prompts — writing exploit code, analyzing malware, generating test payloads — even when the user is an authorized pentester.
- Enterprise clients require audit logs and governance proof before approving AI use in engagements.
- Generic "uncensored" models lack the governance layer. You need both reduced refusals and governance controls.
- 01. Create a Policy Gateway policy scoped to security testing with appropriate allow/deny rules.
- 02. Create a project per client engagement and issue scoped API keys.
- 03. Configure per-user and per-project quotas to enforce budget and prevent misuse.
- 04. Route all pentest AI requests through the policy endpoint to get audit-logged decisions.
- 05. Export audit logs to your SIEM (Splunk, Datadog, Elastic, S3) for client reporting.
# Content-Type is set explicitly because curl -d otherwise sends form-encoded data.
# 1. Create a project for the engagement
curl -X POST https://api.abliteration.ai/v1/policy/projects \
  -H "Authorization: Bearer $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name": "acme-pentest-q2", "policy_id": "security-testing"}'
# 2. Issue a scoped key for the engagement
curl -X POST https://api.abliteration.ai/v1/policy/keys \
  -H "Authorization: Bearer $ADMIN_KEY" \
  -H "Content-Type: application/json" \
  -d '{"project_id": "acme-pentest-q2", "quota": {"tokens": 5000000, "window": "monthly"}}'
# 3. Use the scoped key for pentest queries
curl https://api.abliteration.ai/policy/chat/completions \
  -H "Authorization: Bearer $SCOPED_KEY" \
  -H "X-Policy-User: analyst-jane" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "abliterated-model",
    "messages": [{"role":"user","content":"Generate a test SQL injection payload for the OWASP Juice Shop login form."}],
    "policy_id": "security-testing"
  }'
Workflow overview
- Step 1 — Policy setup: Define what your pentesters can and cannot ask. Allow exploit generation, payload crafting, and vulnerability analysis. Deny social engineering against real targets.
- Step 2 — Engagement isolation: Create a project per client engagement with its own scoped key, budget, and audit trail.
- Step 3 — Governed inference: All requests route through Policy Gateway. Every decision is logged with policy_id, user, project, and reason code.
- Step 4 — Client reporting: Export audit logs to your SIEM. Show clients exactly what AI was used for, what policies were enforced, and what was refused.
- Step 5 — Engagement teardown: Revoke the scoped key. Audit logs persist for compliance. No residual access.
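The governed-inference step can be scripted from any HTTP client. The following is a minimal Python sketch that builds, but does not send, the same request as the third curl call. It assumes the endpoint URL, header names, and JSON fields are exactly those shown in the curl example. The helper name `chat_request` is illustrative and not part of any SDK, and the `Content-Type` header is an assumption about what the API expects.

```python
import json
import urllib.request

# Endpoint copied from the curl example on this page.
CHAT_URL = "https://api.abliteration.ai/policy/chat/completions"


def chat_request(scoped_key: str, analyst: str, prompt: str,
                 policy_id: str = "security-testing") -> urllib.request.Request:
    """Build (but do not send) a governed chat request for one analyst."""
    body = {
        "model": "abliterated-model",
        "messages": [{"role": "user", "content": prompt}],
        "policy_id": policy_id,
    }
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # The scoped key ties the request to one engagement.
            "Authorization": f"Bearer {scoped_key}",
            # The analyst tag attributes the request to a user.
            "X-Policy-User": analyst,
            # Assumption: the API expects a JSON body.
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen`. Keeping construction separate from sending makes it easy to check in review that every call carries the engagement's scoped key and an analyst tag.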
Frequently asked questions
Why do legitimate pentest prompts get blocked on other APIs?
Mainstream LLMs use generic hacking filters that cannot distinguish authorized security research from malicious intent. They block based on keywords and patterns, not authorization context.
How do I prove AI governance to enterprise clients?
Policy Gateway logs every decision with structured metadata: policy ID, user, project, reason code, and triggered categories. Export to your SIEM and include in engagement reports.
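Exported records can be rolled up before they go into an engagement report. The sketch below assumes the export is JSON Lines and that each record carries the fields named above under the hypothetical keys `policy_id`, `user`, `project`, and `reason_code`, plus a hypothetical `decision` key with values such as `allow` or `deny`. The real export schema may differ, so treat every field name here as a placeholder.

```python
import json
from collections import Counter
from typing import Iterable


def summarize_audit_log(lines: Iterable[str], project: str) -> dict:
    """Count decisions, reason codes, and per-user activity for one project.

    `lines` is JSON Lines text. All field names are assumptions, not a
    documented schema.
    """
    decisions, reasons, users = Counter(), Counter(), Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        record = json.loads(line)
        if record.get("project") != project:
            continue  # report on one engagement only
        decisions[record.get("decision", "unknown")] += 1
        reasons[record.get("reason_code", "unknown")] += 1
        users[record.get("user", "unknown")] += 1
    return {
        "project": project,
        "total": sum(decisions.values()),
        "decisions": dict(decisions),
        "reason_codes": dict(reasons),
        "users": dict(users),
    }
```

Filtering by project first mirrors the per-engagement isolation described in the workflow: a report for one client never includes counts from another client's records.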
Can I restrict what individual analysts can ask?
Yes. Use policy_user tags (the request example sends the analyst identity in the X-Policy-User header) and per-user quotas to enforce different access levels within the same engagement.
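To make the idea of per-user limits concrete, here is a small client-side model of token budgets inside one engagement. It only illustrates the concept. Enforcement happens in Policy Gateway, whose configuration syntax for per-user quotas is not shown on this page, and the class name `EngagementBudget` and its methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class EngagementBudget:
    """Client-side model of per-user token limits inside one engagement."""
    limits: dict[str, int]  # analyst -> tokens allowed per quota window
    used: dict[str, int] = field(default_factory=dict)

    def try_spend(self, analyst: str, tokens: int) -> bool:
        """Record usage and return True if the analyst stays within the limit."""
        limit = self.limits.get(analyst, 0)  # unknown analysts get no budget
        spent = self.used.get(analyst, 0)
        if tokens < 0 or spent + tokens > limit:
            return False  # over budget: nothing is recorded
        self.used[analyst] = spent + tokens
        return True
```

A senior analyst might be given a larger limit than a junior one under the same scoped key. Analysts missing from `limits` are rejected outright, which matches a deny-by-default posture.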
Is the AI output stored?
No. Prompts and completions are processed ephemerally. Only policy decision metadata is logged for audits.