abliteration.ai - Uncensored LLM API Platform

Compatibility

OpenAI-compatible rate limit matrix

Rate limits vary by plan and traffic conditions.

Use response headers and Retry-After to pace requests.

Quick start

Base URL
Backoff-ready payload
{
  "model": "abliterated-model",
  "messages": [
    { "role": "user", "content": "Reply with: rate limit check." }
  ],
  "temperature": 0.2
}
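The payload above can be sent with nothing beyond the Python standard library. This is a minimal sketch: the base URL and API key below are placeholders, not values confirmed by these docs — substitute your own from the dashboard.

```python
import json
import urllib.request

# Placeholder values -- substitute your real base URL and API key.
BASE_URL = "https://api.abliteration.ai/v1"  # assumption: check your dashboard
API_KEY = "YOUR_API_KEY"

def build_request(payload: dict) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request with a Bearer token."""
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

payload = {
    "model": "abliterated-model",
    "messages": [{"role": "user", "content": "Reply with: rate limit check."}],
    "temperature": 0.2,
}
req = build_request(payload)
# Send with urllib.request.urlopen(req) once BASE_URL and API_KEY are real.
```

Keeping the request construction in one helper makes it easy to reuse the same Bearer-token headers for every call.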


Service notes

  • Pricing model: Usage-based pricing (~$5 per 1M tokens) billed on total tokens (input + output). See the API pricing page for current plans.
  • Data retention: No prompt/output retention by default. Operational telemetry (token counts, timestamps, error codes) is retained for billing and reliability.
  • Compatibility: OpenAI-style /v1/chat/completions request and response format with a base URL switch.
  • Latency: Depends on model size, prompt length, and load. Streaming reduces time-to-first-token.
  • Throughput: Team plans include priority throughput. Actual throughput varies with demand.
  • Rate limits: Limits vary by plan and load. Handle 429s with backoff and respect any Retry-After header.

What to monitor

  • A 429 response signals that you have exceeded the current rate-limit window.
  • Retry-After headers provide recommended wait times.
  • Track token usage per request to avoid spikes.
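The monitoring checks above reduce to a small helper: given a response status and headers, decide whether to wait and for how long. This sketch assumes Retry-After is sent as a number of seconds (the HTTP-date form is not handled here).

```python
def retry_after_seconds(status: int, headers: dict, default: float = 1.0):
    """Return seconds to wait before retrying, or None if no retry is needed.

    Assumes Retry-After carries a number of seconds; falls back to `default`
    when the header is missing or unparseable.
    """
    if status != 429:
        return None
    value = headers.get("Retry-After") or headers.get("retry-after")
    try:
        return max(float(value), 0.0)
    except (TypeError, ValueError):
        return default
```

Checking both header casings is deliberate: HTTP headers are case-insensitive, but plain dicts are not.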

Retry strategy

  • Use exponential backoff with jitter.
  • Cap maximum retries to avoid request storms.
  • Log request IDs so support can trace failed requests.
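The three bullets above can be sketched as a full-jitter backoff schedule with a retry cap. The base delay and cap below are illustrative defaults, not values prescribed by the platform.

```python
import random

MAX_RETRIES = 5  # cap retries to avoid request storms

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter exponential backoff.

    Returns a random delay in [0, min(cap, base * 2**attempt)], so repeated
    clients do not retry in lockstep.
    """
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))
```

A caller would loop `for attempt in range(MAX_RETRIES)`, sleep for `backoff_delay(attempt)` after each 429 (or the server's Retry-After value, if present), and give up once the cap on attempts is reached.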


Test vector

Expected output should include rate limit check.

Request payload
{
  "model": "abliterated-model",
  "messages": [
    {
      "role": "user",
      "content": "Reply with: rate limit check."
    }
  ],
  "temperature": 0.2,
  "stream": false
}
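Since the platform returns the OpenAI-style chat completion format, the test vector can be checked mechanically. The helper and the sample response below are illustrative; only the `choices[0].message.content` shape is assumed from the compatibility notes above.

```python
def check_test_vector(response: dict) -> bool:
    """Verify the assistant's reply contains the expected marker string."""
    content = response["choices"][0]["message"]["content"]
    return "rate limit check" in content.lower()

# Hypothetical response in the OpenAI chat completion shape.
sample = {
    "choices": [
        {"message": {"role": "assistant", "content": "Rate limit check."}}
    ]
}
```

Matching case-insensitively keeps the check robust to minor wording differences in the model's reply.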

Common errors & fixes

  • 401 Unauthorized: Check that your API key is set and sent as a Bearer token.
  • 404 Not Found: Make sure the base URL ends with /v1 and you call /chat/completions.
  • 400 Bad Request: Verify the model id and that messages are an array of { role, content } objects.
  • 429 Rate limit: Back off and retry. Use the Retry-After header for pacing.
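The error table above maps naturally to a small lookup for client-side logging. The wording mirrors the fixes listed; the fallback message for unlisted statuses is an assumption.

```python
FIXES = {
    401: "Check that your API key is set and sent as a Bearer token.",
    404: "Make sure the base URL ends with /v1 and the path is /chat/completions.",
    400: "Verify the model id and that messages is an array of {role, content} objects.",
    429: "Back off and retry; use the Retry-After header for pacing.",
}

def diagnose(status: int) -> str:
    """Return the documented fix for a status code, or a generic fallback."""
    return FIXES.get(
        status,
        f"Unexpected status {status}; log the request id and contact support.",
    )
```

Logging `diagnose(status)` alongside the request ID gives support everything listed in the retry-strategy notes.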

Related links

  • OpenAI compatibility guide
  • Instant migration tool
  • Compatibility matrix
  • Streaming chat completions
  • See API Pricing
  • View Uncensored Models
  • Rate limits
  • Privacy policy

© 2025 Social Keyboard, Inc. All rights reserved.