Spring Boot integration
Use Spring WebClient to post a minimal chat completion payload.
Store your API key in an environment variable (the example reads ABLIT_KEY) or a secret manager, not in source code.
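As a minimal sketch of the advice above, the key can be checked once at startup so that a missing variable fails with a clear message instead of sending "Bearer null" upstream. The class and method names here (ApiKeyLoader, requireKey) are hypothetical and not part of any SDK; only the ABLIT_KEY variable name comes from this page.

```java
import java.util.Map;

final class ApiKeyLoader {

    /** Returns the trimmed key from the given environment, or throws if it is missing or blank. */
    static String requireKey(Map<String, String> env) {
        String key = env.get("ABLIT_KEY");
        if (key == null || key.isBlank()) {
            throw new IllegalStateException("ABLIT_KEY is not set");
        }
        return key.trim();
    }

    public static void main(String[] args) {
        try {
            String key = requireKey(System.getenv());
            // Never log the key itself; its length is enough to confirm it loaded.
            System.out.println("Loaded API key (" + key.length() + " characters)");
        } catch (IllegalStateException e) {
            System.err.println("Configuration error: " + e.getMessage());
        }
    }
}
```

Passing the environment in as a Map keeps the check easy to unit test; in a Spring application the same validation could live wherever the WebClient is constructed.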
Quick start
Base URL
https://api.abliteration.ai/v1
Example request
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.*;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Mono;

import java.util.List;
import java.util.Map;
import java.util.Objects;

@RestController
public class ChatController {

    // Shared client: the base URL ends with /v1 and the key is read from the environment.
    private final WebClient client = WebClient.builder()
            .baseUrl("https://api.abliteration.ai/v1")
            .defaultHeader("Authorization", "Bearer " + System.getenv("ABLIT_KEY"))
            .build();

    // Accepts {"prompt": "..."} and returns the upstream JSON response body unchanged.
    @PostMapping("/chat")
    public Mono<String> chat(@RequestBody Map<String, String> body) {
        // Map.of rejects null values, so a missing prompt is sent as an empty string.
        String prompt = Objects.requireNonNullElse(body.get("prompt"), "");
        return client.post()
                .uri("/chat/completions")
                .contentType(MediaType.APPLICATION_JSON)
                .bodyValue(Map.of(
                        "model", "abliterated-model",
                        "messages", List.of(Map.of("role", "user", "content", prompt))
                ))
                .retrieve()
                .bodyToMono(String.class);
    }
}
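To try the controller above, something like the following sketch could post a prompt to it using only the JDK's built-in HTTP client. It assumes the application runs locally on port 8080 (Spring Boot's default) and that the request body is a JSON object with a "prompt" field, as the controller expects. The ChatSmokeTest class name is hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class ChatSmokeTest {

    /** Builds a POST to the controller's /chat route with a {"prompt": ...} body. */
    static HttpRequest buildRequest(String baseUrl, String prompt) {
        // Minimal escaping of backslashes and quotes only; use a JSON library for arbitrary input.
        String escaped = prompt.replace("\\", "\\\\").replace("\"", "\\\"");
        return HttpRequest.newBuilder(URI.create(baseUrl + "/chat"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("{\"prompt\":\"" + escaped + "\"}"))
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed local address; change it if the app runs elsewhere.
        HttpRequest request = buildRequest("http://localhost:8080", "Say hello");
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println("Status: " + response.statusCode());
            System.out.println(response.body());
        } catch (IOException e) {
            System.err.println("Request failed (is the app running?): " + e.getMessage());
        }
    }
}
```

The response body is printed as raw JSON because the controller returns the upstream payload as a String without parsing it.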
Service notes
- Pricing model: Usage-based pricing (~$5 per 1M tokens) billed on total tokens (input + output). See the API pricing page for current plans.
- Data retention: No prompt/output retention by default. Operational telemetry (token counts, timestamps, error codes) is retained for billing and reliability.
- Compatibility: OpenAI-style /v1/chat/completions request and response format. Existing OpenAI-style clients can be pointed at this API by switching the base URL.
- Latency: Depends on model size, prompt length, and load. Streaming reduces time-to-first-token.
- Throughput: Team plans include priority throughput. Actual throughput varies with demand.
- Rate limits: Limits vary by plan and load. Handle 429s with backoff and respect any Retry-After header.
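The pricing note above bills on total tokens, so a rough per-request estimate is (input + output) tokens multiplied by the per-token rate. The sketch below uses the approximate $5 per 1M tokens figure quoted in the notes; treat the constant as an assumption and check the pricing page for the current rate. CostEstimate and estimateUsd are hypothetical names.

```java
final class CostEstimate {

    // Approximate rate quoted in the service notes; not an authoritative price.
    static final double USD_PER_MILLION_TOKENS = 5.0;

    /** Estimates the cost of one request billed on input plus output tokens. */
    static double estimateUsd(long inputTokens, long outputTokens) {
        long totalTokens = inputTokens + outputTokens;
        return totalTokens * USD_PER_MILLION_TOKENS / 1_000_000.0;
    }

    public static void main(String[] args) {
        // 1,200 prompt tokens + 800 completion tokens = 2,000 billed tokens.
        System.out.println("Estimated USD: " + estimateUsd(1_200, 800));
    }
}
```

At the assumed rate, the 2,000-token request in this example comes to about $0.01.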
Common errors & fixes
- 401 Unauthorized: Check that your API key is set and sent as a Bearer token.
- 404 Not Found: Make sure the base URL ends with /v1 and you call /chat/completions.
- 400 Bad Request: Verify the model id and that messages are an array of { role, content } objects.
- 429 Too Many Requests: You hit a rate limit. Back off and retry, using the Retry-After header for pacing when it is present.
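One way to pace retries after a 429 is sketched below: honor Retry-After when the server sends it (HTTP allows either a number of seconds or an HTTP-date), and otherwise fall back to capped exponential backoff. The base delay, the cap, and the RetryDelay class are illustrative assumptions, not values published by the service.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

final class RetryDelay {

    static final Duration BASE = Duration.ofMillis(500); // assumed starting delay
    static final Duration MAX = Duration.ofSeconds(30);  // assumed upper bound

    /** Exponential backoff: BASE * 2^attempt, capped at MAX (attempt starts at 0). */
    static Duration backoff(int attempt) {
        int shift = Math.min(Math.max(attempt, 0), 16); // bound the shift so it cannot overflow
        Duration delay = BASE.multipliedBy(1L << shift);
        return delay.compareTo(MAX) > 0 ? MAX : delay;
    }

    /** Prefers Retry-After (seconds or HTTP-date); otherwise falls back to backoff(attempt). */
    static Duration delayFor(String retryAfter, int attempt, Instant now) {
        if (retryAfter != null && !retryAfter.isBlank()) {
            String value = retryAfter.trim();
            try {
                return Duration.ofSeconds(Math.max(0, Long.parseLong(value)));
            } catch (NumberFormatException notSeconds) {
                // Not a number of seconds; try the HTTP-date form next.
            }
            try {
                Instant when = ZonedDateTime
                        .parse(value, DateTimeFormatter.RFC_1123_DATE_TIME)
                        .toInstant();
                Duration until = Duration.between(now, when);
                return until.isNegative() ? Duration.ZERO : until;
            } catch (DateTimeParseException notDate) {
                // Unparseable header; fall back to exponential backoff.
            }
        }
        return backoff(attempt);
    }

    public static void main(String[] args) {
        System.out.println(delayFor("5", 0, Instant.now()));  // header given in seconds
        System.out.println(delayFor(null, 3, Instant.now())); // no header, fourth attempt
    }
}
```

The helper only computes the wait time, so it can be used from a WebClient retry strategy or a plain loop. Adding random jitter to the fallback delay is a common refinement when many clients retry at once.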