# Integrations

## Cloud Run integration

Cloud Run accepts HTTP requests and forwards them to your container. Use a lightweight Express handler to relay prompts to abliteration.ai.
### Quick start

#### Base URL

`https://api.abliteration.ai/v1`

#### Example request
```javascript
import express from "express";

const app = express();
app.use(express.json());

// Relay the incoming prompt to abliteration.ai's OpenAI-style endpoint.
app.post("/chat", async (req, res) => {
  const prompt = req.body?.prompt || "";
  try {
    const apiRes = await fetch("https://api.abliteration.ai/v1/chat/completions", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${process.env.ABLIT_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: "abliterated-model",
        messages: [{ role: "user", content: prompt }],
      }),
    });
    // Pass the upstream status and JSON body straight through to the caller.
    res.status(apiRes.status).json(await apiRes.json());
  } catch (err) {
    // Network failure or invalid upstream response.
    res.status(502).json({ error: "Upstream request failed" });
  }
});

// Cloud Run injects the listening port via the PORT environment variable.
const port = process.env.PORT || 8080;
app.listen(port);
```
### Service notes
- Pricing model: Usage-based pricing (~$5 per 1M tokens) billed on total tokens (input + output). See the API pricing page for current plans.
- Data retention: No prompt/output retention by default. Operational telemetry (token counts, timestamps, error codes) is retained for billing and reliability.
- Compatibility: OpenAI-style /v1/chat/completions request and response format with a base URL switch.
- Latency: Depends on model size, prompt length, and load. Streaming reduces time-to-first-token.
- Throughput: Team plans include priority throughput. Actual throughput varies with demand.
- Rate limits: Limits vary by plan and load. Handle 429s with backoff and respect any Retry-After header.
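Since the API is OpenAI-compatible, streaming responses presumably arrive as server-sent events with `data:` lines carrying JSON deltas, terminated by `data: [DONE]`. The sketch below shows a request body with a `stream: true` flag and a helper that reassembles the streamed content; both the flag and the exact SSE shape are assumptions based on the OpenAI format, not documented behavior of abliteration.ai.

```javascript
// Assumed OpenAI-style streaming request body; `stream: true` is an
// assumption carried over from the OpenAI chat completions API.
const streamingBody = {
  model: "abliterated-model",
  messages: [{ role: "user", content: "Hello" }],
  stream: true,
};

// Reassemble content from raw SSE text. Each event is a line of the form
// `data: {...}` whose JSON holds a delta; `data: [DONE]` ends the stream.
function extractDeltas(sseText) {
  let out = "";
  for (const line of sseText.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice(6).trim();
    if (payload === "[DONE]") break;
    const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (delta) out += delta;
  }
  return out;
}
```

In a handler you would read the response body chunk by chunk and feed each chunk's text through this kind of parser, forwarding deltas to the client as they arrive to cut time-to-first-token.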
### Common errors & fixes
- 401 Unauthorized: Check that your API key is set and sent as a Bearer token.
- 404 Not Found: Make sure the base URL ends with /v1 and you call /chat/completions.
- 400 Bad Request: Verify the model id and that messages are an array of { role, content } objects.
- 429 Rate limit: Back off and retry. Use the Retry-After header for pacing.
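The 429 advice above can be sketched as a small pacing helper: honor `Retry-After` (in seconds) when the server sends it, otherwise fall back to exponential backoff with jitter. The 30-second cap and 250 ms jitter window are illustrative choices, not values documented by the service.

```javascript
// Delay (ms) before retry attempt `attempt` (0-based), given the raw
// Retry-After header value (or null). Prefers the server's pacing hint.
function retryDelayMs(attempt, retryAfterHeader) {
  if (retryAfterHeader != null) {
    const secs = Number(retryAfterHeader);
    if (!Number.isNaN(secs)) return secs * 1000; // server-specified pacing
  }
  const base = Math.min(1000 * 2 ** attempt, 30000); // exponential, capped (assumption)
  return base + Math.floor(Math.random() * 250);     // jitter to desynchronize retries
}
```

A retry loop would call this after each 429, e.g. `await new Promise(r => setTimeout(r, retryDelayMs(attempt, apiRes.headers.get("Retry-After"))))`, giving up after a fixed number of attempts.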