Docs
OpenAI compatibility (base URL switch)
abliteration.ai exposes an OpenAI-compatible /v1/chat/completions endpoint, so most OpenAI SDKs work without code rewrites.
To migrate, change the base URL, supply your abliteration.ai API key, and set the model id (for example, abliterated-model).
Your message schema, parameters, and streaming flags stay the same. The main differences are the base URL and model naming.
Quick start
from openai import OpenAI
client = OpenAI(
base_url="https://api.abliteration.ai/v1",
api_key="YOUR_ABLIT_KEY",
)
resp = client.chat.completions.create(
model="abliterated-model",
messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
Service notes
- Pricing: usage-based (~$5 per 1M tokens), billed on total tokens (input + output). See the API pricing page for current plans.
- Data retention: No prompt/output retention by default. Operational telemetry (token counts, timestamps, error codes) is retained for billing and reliability.
- Compatibility: OpenAI-style /v1/chat/completions request and response format with a base URL switch.
- Latency: Depends on model size, prompt length, and load. Streaming reduces time-to-first-token.
- Throughput: Team plans include priority throughput. Actual throughput varies with demand.
- Rate limits: Limits vary by plan and load. Handle 429s with backoff and respect any Retry-After header.
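The 429 guidance above can be sketched as a small retry helper. This is a minimal, illustrative example: `RateLimitError` here is a local stand-in for whatever 429 exception your HTTP client or SDK raises, and `with_backoff` is a hypothetical helper name, not part of the API.

```python
import time

class RateLimitError(Exception):
    """Local stand-in for an HTTP 429 error; real clients raise their own
    rate-limit exception, often carrying a Retry-After hint."""
    def __init__(self, retry_after=None):
        super().__init__("429 Too Many Requests")
        self.retry_after = retry_after

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` on rate limits, preferring the server's Retry-After
    hint over exponential backoff when one is provided."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError as err:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the 429 to the caller
            if err.retry_after is not None:
                delay = err.retry_after          # respect Retry-After
            else:
                delay = base_delay * (2 ** attempt)  # exponential backoff
            time.sleep(delay)
```

Wrap any request in it, for example `with_backoff(lambda: client.chat.completions.create(...))`.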
Compatibility checklist
Use this checklist to switch providers in minutes.
- Set base_url to https://api.abliteration.ai/v1.
- Use your abliteration.ai API key as a Bearer token.
- Keep the same role/content message format.
- Swap in an abliteration.ai model id.
- Enable stream: true if you want partial tokens.
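With stream: true, the SDK yields chunks whose deltas you print as they arrive. A minimal sketch: `print_stream` is a hypothetical helper (not part of any SDK) that consumes an iterable of OpenAI-style stream chunks.

```python
def print_stream(chunks):
    """Consume a chat-completions stream, printing each partial token as
    it arrives and returning the assembled text."""
    parts = []
    for chunk in chunks:
        delta = chunk.choices[0].delta.content
        if delta:  # the final chunk's delta.content is typically None
            print(delta, end="", flush=True)
            parts.append(delta)
    return "".join(parts)

# With the OpenAI SDK, the chunks come straight from the client:
#   stream = client.chat.completions.create(
#       model="abliterated-model", messages=messages, stream=True)
#   text = print_stream(stream)
```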
Request shape and common fields
The request and response mirror OpenAI Chat Completions, so you can reuse the same SDK helpers and typed schemas.
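If you prefer no SDK at all, the same body can be POSTed with Python's standard library. A sketch, assuming the endpoint and Bearer auth described above; the final send is left commented out so you can adapt it first.

```python
import json
import urllib.request

# Build the request body with the common fields, then POST it directly.
payload = {
    "model": "abliterated-model",
    "messages": [{"role": "user", "content": "Summarize this in one sentence."}],
    "temperature": 0.7,
    "max_tokens": 256,
    "stream": False,
}
req = urllib.request.Request(
    "https://api.abliteration.ai/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": "Bearer YOUR_ABLIT_KEY",
        "Content-Type": "application/json",
    },
    method="POST",
)
# Uncomment to send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```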
{
"model": "abliterated-model",
"messages": [
{ "role": "user", "content": "Summarize this in one sentence." }
],
"temperature": 0.7,
"max_tokens": 256,
"stream": false
}
Structured JSON output
Structured output is available through response_format with type: "json_object".
The model will return a JSON object you can parse directly.
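On the client side, parsing the reply is one `json.loads` call. A minimal sketch: `parse_json_reply` is an illustrative helper name, and the commented SDK call assumes the request shape shown in the JSON example below.

```python
import json

# With the OpenAI SDK, add response_format and parse the reply directly:
#   resp = client.chat.completions.create(
#       model="abliterated-model",
#       messages=[{"role": "user",
#                  "content": "Return a JSON object with title and summary."}],
#       response_format={"type": "json_object"},
#   )
#   data = parse_json_reply(resp.choices[0].message.content)

def parse_json_reply(content):
    """Parse the model's JSON reply, raising a clear error if the
    payload is not valid JSON."""
    try:
        return json.loads(content)
    except json.JSONDecodeError as err:
        raise ValueError(f"model did not return valid JSON: {err}") from err
```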
{
"model": "abliterated-model",
"messages": [
{ "role": "user", "content": "Return a JSON object with title and summary." }
],
"response_format": { "type": "json_object" }
}
Function calling
Function calling uses the OpenAI-compatible functions and function_call fields.
Provide a JSON Schema in parameters and optionally force a call by name.
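The model returns the arguments as a JSON-encoded string, so parse them before invoking your own code. A sketch: `dispatch_function_call` and `handlers` are illustrative names, not part of the API; the commented lines assume the OpenAI SDK's legacy `function_call` field on the reply message.

```python
import json

def dispatch_function_call(name, arguments_json, handlers):
    """Parse the JSON-encoded arguments string the model returns and
    route the call to a registered handler."""
    if name not in handlers:
        raise KeyError(f"no handler registered for {name!r}")
    args = json.loads(arguments_json)
    return handlers[name](**args)

# With the OpenAI SDK, the reply message carries the call to execute:
#   call = resp.choices[0].message.function_call
#   result = dispatch_function_call(call.name, call.arguments, handlers)
```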
{
"model": "abliterated-model",
"messages": [
{ "role": "user", "content": "Schedule a meeting on 2026-02-03 at 15:00." }
],
"functions": [
{
"name": "create_calendar_event",
"description": "Create a calendar event from a natural language request",
"parameters": {
"type": "object",
"properties": {
"title": { "type": "string" },
"date": { "type": "string" },
"time": { "type": "string" }
},
"required": ["title", "date", "time"]
}
}
],
"function_call": { "name": "create_calendar_event" }
}
Migration validation
Start with a small prompt, then compare latency and output quality before sending production traffic.
- Confirm 200 responses and a populated choices[0].message.
- Log response headers to capture request ids and timing.
- Watch for 401/404 errors that indicate the wrong key or base path.
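The checks above can be folded into a small smoke test you run against the new endpoint before cutting over. `validate_chat_response` is an illustrative helper, assuming an OpenAI-style response body.

```python
def validate_chat_response(status_code, body):
    """Smoke-test a migrated endpoint: expect HTTP 200 and a populated
    choices[0].message. Returns a list of problems (empty means OK)."""
    problems = []
    if status_code != 200:
        problems.append(f"expected HTTP 200, got {status_code}")
    choices = body.get("choices") or []
    message = choices[0].get("message") if choices else None
    if not (message and message.get("content")):
        problems.append("choices[0].message.content is missing or empty")
    return problems
```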
Common errors & fixes
- 401 Unauthorized: Check that your API key is set and sent as a Bearer token.
- 404 Not Found: Make sure the base URL ends with /v1 and you call /chat/completions.
- 400 Bad Request: Verify the model id and that messages are an array of { role, content } objects.
- 429 Rate limit: Back off and retry. Use the Retry-After header for pacing.