Model
Uncensored Gemma 2 API - Developer Controlled
Looking for an Uncensored Gemma 2 API? abliteration.ai offers a developer-controlled, less-censored API you can call today.
Use model abliterated-model on the OpenAI-compatible endpoint at https://api.abliteration.ai/v1.
Quick start
Base URL

`https://api.abliteration.ai/v1`
Chat completions payload
```json
{
  "model": "abliterated-model",
  "messages": [
    { "role": "user", "content": "Generate a short FAQ response for a billing question." }
  ],
  "temperature": 0.2
}
```
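The payload above can be sent with nothing beyond the Python standard library. This is a sketch, not an official client; it assumes your API key is in the `ABLIT_KEY` environment variable, as noted in the implementation checklist below.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.abliteration.ai/v1"

def build_chat_request(prompt: str, temperature: float = 0.2) -> urllib.request.Request:
    """Build an OpenAI-compatible /chat/completions request (not yet sent)."""
    payload = {
        "model": "abliterated-model",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Bearer auth with your abliteration.ai key (ABLIT_KEY).
            "Authorization": f"Bearer {os.environ.get('ABLIT_KEY', '')}",
        },
        method="POST",
    )

# To send: resp = urllib.request.urlopen(build_chat_request("Hello"))
# then json.loads(resp.read()); the reply text is under
# choices[0]["message"]["content"] in the OpenAI-style response.
```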
Service notes
- Pricing model: Usage-based pricing (~$5 per 1M tokens) billed on total tokens (input + output). See the API pricing page for current plans.
- Data retention: No prompt/output retention by default. Operational telemetry (token counts, timestamps, error codes) is retained for billing and reliability.
- Compatibility: OpenAI-style /v1/chat/completions request and response format with a base URL switch.
- Latency: Depends on model size, prompt length, and load. Streaming reduces time-to-first-token.
- Throughput: Team plans include priority throughput. Actual throughput varies with demand.
- Rate limits: Limits vary by plan and load. Handle 429s with backoff and respect any Retry-After header.
Why developers choose abliteration.ai for Uncensored Gemma 2 API
- OpenAI-compatible /v1/chat/completions endpoint (swap the base URL).
- Works well for lightweight app integration workflows, with streaming support.
- Single production model id: abliterated-model.
- No prompt/output retention by default.
- Usage-based pricing (~$5 per 1M tokens).
- Optional Policy Gateway for policy-as-code, quotas, and audit logging.
- Need the exact Gemma 2 weights? Email help@abliteration.ai for roadmap updates.
Model specs (abliterated-model)
Latest published benchmark scores and refusal rate for abliterated-model.
| Metric | Score |
|---|---|
| mmlu_pro | 82.1 |
| gpqa | 73.1 |
| aime_2025 | 83.7 |
| mmmu_pro | 68.1 |
| refusals (mlabonne/harmful_behaviors) | 3/100 |
Implementation checklist
- Set base_url to https://api.abliteration.ai/v1.
- Provide your abliteration.ai API key as a Bearer token (ABLIT_KEY).
- Use model "abliterated-model" in the request payload.
- Keep the OpenAI-compatible messages schema unchanged.
- Enable stream: true if you want incremental tokens.
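The payload side of the checklist can be spot-checked before sending. `validate_payload` below is a hypothetical helper written for this page, not part of the API:

```python
def validate_payload(payload: dict) -> list:
    """Return a list of problems with a chat-completions payload (empty = looks good)."""
    problems = []
    # The single production model id from the docs above.
    if payload.get("model") != "abliterated-model":
        problems.append('model should be "abliterated-model"')
    # Keep the OpenAI-compatible messages schema: an array of {role, content}.
    msgs = payload.get("messages")
    if not isinstance(msgs, list) or not msgs or not all(
        isinstance(m, dict) and {"role", "content"} <= set(m.keys()) for m in msgs
    ):
        problems.append("messages must be a non-empty array of {role, content} objects")
    return problems
```

For example, a payload with a wrong model id and a string instead of a messages array would come back with two problems listed.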
Common errors & fixes
- 401 Unauthorized: Check that your API key is set and sent as a Bearer token.
- 404 Not Found: Make sure the base URL ends with /v1 and you call /chat/completions.
- 400 Bad Request: Verify the model id and that messages are an array of { role, content } objects.
- 429 Rate limit: Back off and retry. Use the Retry-After header for pacing.
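One way to pace 429 retries is exponential backoff with jitter, honoring Retry-After when present. This sketch assumes Retry-After carries a delay in seconds (the header can also be an HTTP-date, which this helper skips):

```python
import random

def retry_delay(attempt: int, retry_after: str = None,
                base: float = 0.5, cap: float = 30.0) -> float:
    """Seconds to wait before retrying a 429 response."""
    if retry_after is not None:
        try:
            # Honor a numeric Retry-After header directly.
            return float(retry_after)
        except ValueError:
            pass  # HTTP-date form: fall through to backoff.
    # Exponential backoff, capped, with jitter to avoid thundering herds.
    return min(cap, base * (2 ** attempt)) * random.uniform(0.5, 1.0)
```

Call it in a retry loop with the attempt counter and the response's Retry-After header (if any), sleeping for the returned duration before the next request.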