Integration guide
How to use AWS Lambda with an OpenAI-compatible endpoint (Python)
An AWS Lambda function can call any OpenAI-compatible API by switching only the base URL and the API key.
This guide shows a Python example plus a test vector you can run to validate responses.
Quick start
Base URL
https://api.abliteration.ai/v1
Python request example
import os

import requests

payload = {
    "model": "abliterated-model",
    "messages": [{"role": "user", "content": "Respond with: AWS Lambda Python ready."}],
    "temperature": 0.2,
}

# POST to the OpenAI-compatible chat completions endpoint.
resp = requests.post(
    "https://api.abliteration.ai/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {os.getenv('ABLIT_KEY')}",
        "Content-Type": "application/json",
    },
    json=payload,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
Service notes
- Pricing: usage-based (~$5 per 1M tokens), billed on total tokens (input + output). See the API pricing page for current plans.
- Data retention: No prompt/output retention by default. Operational telemetry (token counts, timestamps, error codes) is retained for billing and reliability.
- Compatibility: OpenAI-style /v1/chat/completions request and response format with a base URL switch.
- Latency: Depends on model size, prompt length, and load. Streaming reduces time-to-first-token.
- Throughput: Team plans include priority throughput. Actual throughput varies with demand.
- Rate limits: Limits vary by plan and load. Handle 429s with backoff and respect any Retry-After header.
Configure AWS Lambda
Follow this checklist to point your integration at the OpenAI-compatible endpoint; a minimal handler sketch follows the list.
- Add ABLIT_KEY to your function environment and keep requests on the server.
- Set the base URL to https://api.abliteration.ai/v1.
- Provide your abliteration.ai API key as a Bearer token (ABLIT_KEY).
- Use model "abliterated-model" to match the provider naming.
- Keep the messages schema unchanged (role/content).
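The handler below is a minimal sketch of these steps, not a definitive implementation: the event shape ({"prompt": ...}) and the response format are illustrative assumptions, and requests is not included in the managed Lambda Python runtime, so it must ship in your deployment package or a layer.

import json
import os

import requests  # not in the managed Lambda runtime; bundle it or use a layer

BASE_URL = "https://api.abliteration.ai/v1"

def lambda_handler(event, context):
    # Illustrative event shape: {"prompt": "..."}.
    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        headers={
            "Authorization": f"Bearer {os.environ['ABLIT_KEY']}",
            "Content-Type": "application/json",
        },
        json={
            "model": "abliterated-model",
            "messages": [{"role": "user", "content": event.get("prompt", "")}],
            "temperature": 0.2,
        },
        timeout=30,
    )
    resp.raise_for_status()
    content = resp.json()["choices"][0]["message"]["content"]
    return {"statusCode": 200, "body": json.dumps({"content": content})}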
OpenAI-compatible payload
Use this request body as a known-good payload before customizing parameters.
Chat completions payload
{
  "model": "abliterated-model",
  "messages": [
    { "role": "user", "content": "Respond with: AWS Lambda Python ready." }
  ],
  "temperature": 0.2
}
Streaming and tool calling readiness
If you stream responses or send tool definitions, keep the OpenAI-compatible schema and validate against the OpenAPI spec; a streaming sketch follows the list.
- Set stream: true to receive chunks as they arrive.
- Parse SSE data lines and ignore keep-alives.
- Validate tool schemas before sending production traffic.
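A minimal streaming sketch, assuming the endpoint uses OpenAI-style SSE framing (data: lines terminated by data: [DONE]); the chunk shape below mirrors OpenAI's delta format and should be checked against the provider's OpenAPI spec.

import json
import os

import requests

payload = {
    "model": "abliterated-model",
    "messages": [{"role": "user", "content": "Respond with: AWS Lambda Python ready."}],
    "stream": True,
}
with requests.post(
    "https://api.abliteration.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.getenv('ABLIT_KEY')}"},
    json=payload,
    stream=True,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        # SSE frames arrive as "data: {...}"; blank lines are keep-alives.
        if not line or not line.startswith(b"data: "):
            continue
        data = line[len(b"data: "):]
        if data == b"[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"].get("content")
        if delta:
            print(delta, end="", flush=True)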
Test vector
Expected output should include "AWS Lambda Python ready."
Request payload
{
  "model": "abliterated-model",
  "messages": [
    {
      "role": "user",
      "content": "Respond with: AWS Lambda Python ready."
    }
  ],
  "temperature": 0.2,
  "stream": false
}
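Once you have a response to this payload, a quick programmatic check might look like the sketch below; check_test_vector is a hypothetical helper, and resp is assumed to be the requests.Response returned by the example above.

# A quick check against the test vector; `resp` is assumed to be the
# requests.Response returned by posting the payload above.
def check_test_vector(resp):
    content = resp.json()["choices"][0]["message"]["content"]
    assert "AWS Lambda Python ready." in content, content
    return content

Common errors & fixes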
- 401 Unauthorized: Check that your API key is set and sent as a Bearer token.
- 404 Not Found: Make sure the base URL ends with /v1 and you call /chat/completions.
- 400 Bad Request: Verify the model id and that messages are an array of { role, content } objects.
- 429 Rate limit: Back off and retry, using the Retry-After header for pacing; see the sketch below.
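A simple retry sketch for 429s; post_with_backoff and its send_request callable are hypothetical helpers wrapping the POST shown earlier, and the code assumes Retry-After carries seconds rather than an HTTP date.

import time

def post_with_backoff(send_request, max_retries=5):
    # `send_request` is a zero-argument callable that performs the POST
    # shown earlier and returns a requests.Response.
    for attempt in range(max_retries):
        resp = send_request()
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp
        # Respect Retry-After when present (assumed to be seconds);
        # otherwise back off exponentially.
        retry_after = resp.headers.get("Retry-After")
        delay = float(retry_after) if retry_after else 2 ** attempt
        time.sleep(delay)
    raise RuntimeError("rate limited: retries exhausted")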