Reference
Updated 2026-05-01

Video-capable LLM API

Send short videos to OpenAI-compatible chat completions. MP4/WebM/MOV up to 25 MB / 30 seconds.

A video-capable LLM API accepts video bytes in the same request as text and returns natural-language descriptions or structured answers.

On abliteration.ai, video is supported only on the chat completions endpoints (/v1/chat/completions and /policy/chat/completions); the Anthropic Messages and OpenAI Responses endpoints do not accept video natively.

Definition

Video-capable LLM API

A video-capable LLM API lets you include short videos as inputs and receive descriptions, structured answers, or grounded reasoning from the model.

Why it matters
  • Describe scenes, actions, or interactions captured on video.
  • Summarize short screen recordings or product demos.
  • Extract structured data — counts, labels, timestamps — from short clips.
  • Combine video with text instructions for grounded reasoning.
How it works
  1. Call /v1/chat/completions with a model that supports video inputs.
  2. Send message.content as an array mixing text parts and video_url parts.
  3. Use a public HTTPS URL or an inline data:video/mp4;base64,... URL.
  4. Authenticate with a JWT or API key; anonymous free-tier callers are blocked from video.
  5. Stream responses by setting stream: true and consuming delta chunks.
Example request
curl https://api.abliteration.ai/v1/chat/completions \
  -H "Authorization: Bearer $ABLIT_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "abliterated-model",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "Describe this clip." },
          { "type": "video_url", "video_url": { "url": "https://example.com/clip.mp4" } }
        ]
      }
    ]
  }'
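Example request (inline data URL)

For the inline form from step 3, the same request can carry the clip as a data: URL instead of a public link. This is a sketch, assuming GNU coreutils base64 and a clip already within the 25 MB / 30 second limits; everything else matches the request above.

# Encode the clip (on macOS use: base64 -i clip.mp4 | tr -d '\n')
VIDEO_B64=$(base64 -w0 clip.mp4)

curl https://api.abliteration.ai/v1/chat/completions \
  -H "Authorization: Bearer $ABLIT_KEY" \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
  "model": "abliterated-model",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Describe this clip." },
        { "type": "video_url", "video_url": { "url": "data:video/mp4;base64,${VIDEO_B64}" } }
      ]
    }
  ]
}
EOF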
FAQ

Frequently asked questions.

Which endpoints accept video?

Only /v1/chat/completions and /policy/chat/completions. /v1/messages and /v1/responses return 400 with a neutral pointer to https://docs.abliteration.ai.

What video formats are supported?

MP4 (video/mp4), WebM (video/webm), and QuickTime/MOV (video/quicktime). Convert other containers before sending.
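For example, a remux/transcode sketch using standard ffmpeg options (filenames are placeholders; drop the video re-encode if the source is already H.264):

# Convert an unsupported container to MP4 before uploading
ffmpeg -i input.avi -c:v libx264 -c:a aac -movflags +faststart clip.mp4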

How long can a video be?

Up to 30 seconds and up to 25 MB raw. Longer or larger videos return HTTP 413 with the body-cap message.

Can I send video as a public URL?

Yes. The backend fetches the URL server-side. The same SSRF guard that blocks private IPs for image URLs applies to video URLs — rejection code is unsafe_video_url.
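For illustration, a rejected private-address fetch might return a body like the one below. The exact shape and message are assumptions modeled on the common OpenAI-style error object; only the unsafe_video_url code comes from this page.

{
  "error": {
    "message": "video URL resolves to a private address",
    "type": "invalid_request_error",
    "code": "unsafe_video_url"
  }
}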

Can anonymous (free-tier) users send video?

No. Anonymous X-Free-Tier callers are blocked from video specifically. Text and image stay free. The rejection code is video_anon_blocked.

How many tokens does a video use?

Roughly proportional to the number of frames sampled times their resolution. A 2-second 128x256 clip is about 90 tokens; a 5-second 480p clip is around 2,000 tokens; a 10-second 360p clip is around 3,700 tokens. Downscale to keep latency and cost predictable.
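A quick way to stay under the caps is to trim and downscale before upload. A sketch with standard ffmpeg options (the 10-second and 360p targets are illustrative; -an drops the audio track, which the model does not use):

ffmpeg -i clip.mp4 -t 10 -vf scale=-2:360 -an clip_small.mp4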

Is video moderated?

The accompanying prompt text is moderated. Per-frame video moderation is planned before public launch — tracked as TODO(VIDEO-MODERATION) in the moderation pipeline. Avoid uploading content that violates the abliteration.ai usage policy.

Does streaming work with video?

Yes. Set stream: true and consume delta chunks the same way as text-only completions. Time to first token is higher for video because vLLM samples frames first.
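A minimal streaming sketch: the request matches the example above with stream: true added, and curl -N keeps output unbuffered so server-sent chunks print as they arrive.

curl -N https://api.abliteration.ai/v1/chat/completions \
  -H "Authorization: Bearer $ABLIT_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "abliterated-model",
    "stream": true,
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "Describe this clip." },
          { "type": "video_url", "video_url": { "url": "https://example.com/clip.mp4" } }
        ]
      }
    ]
  }'

Each data: line carries a chunk whose choices[0].delta.content holds the next text fragment, and the stream typically ends with data: [DONE], the same as text-only streaming.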

Why isn't video supported on /v1/messages or /v1/responses?

The Anthropic Messages spec has 17 content types and none are video. The OpenAI Responses spec accepts input_text/input_image/input_file but not video. We match the canonical specs rather than adding a non-standard translation layer. When upstream specs add video, we will too.