Generate the labels other models refuse to write.
Generate labelled datasets at scale: preference pairs, eval rows, classifier examples, and the edge cases other LLMs refuse to write. Reasoning traces and live web-search grounding on every row; one-click export to Hugging Face, Kaggle, S3, GCS, or Azure. Schema-validated outputs, ready to drop into any training pipeline.
Generate 10,000 realistic customer support conversations about refunds and billing edge cases.
billing-support-training.jsonl
10,000 rows · 4.2 MB · ready
The categories that block your training pipeline.
Coded harassment
Multi-lingual dog-whistles and ingroup-coded slurs for moderation classifier training.
Phishing emails
Synthetic phishing across urgency, authority, and pretexting tactics for detection eval.
Prompt injections
Direct, indirect, ASCII-smuggling, and tool-confusion vectors for AppSec red-team training.
Adversarial edge cases
Refusal-eval prompts, policy edge cases, and known-jailbreak corpora for safety regression sets.
Reasoning included.
Most frontier APIs hide chain-of-thought out of distillation paranoia. We don't. Toggle Thinking in the console and the model's reasoning lands on every row as a schema-aware sidecar, ready to feed your next reasoning model.
Grounded in the live web.
Generate against today's web, not the model's frozen pre-training cutoff. Toggle Web search in the console and the sources field on every row carries the URLs the model actually fetched, with titles and fetched-at timestamps. Citations stay with the row through export.
Preview before you commit.
Every job starts with a paid 3-row preview generated against your spec, schema, and toggles. We charge real preview credits, then estimate the full run as the highest-cost preview row × your row target plus a reserve buffer. Top up if short. Promote to full job if it looks right.
| # | sample prompt | tokens in/out | cost |
|---|---|---|---|
| 1 | Multi-lingual coded harassment, tier-2 severity. | 1,402 / 812 | $0.0052 |
| 2 | RAG-grounded answer about Article 53 timeline. | 1,538 / 921 | $0.0064 (max) |
| 3 | Deepfake-script detector seed, voice-clone variant. | 1,318 / 784 | $0.0049 |
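To make the estimate concrete, here is a rough sketch of the arithmetic using the three preview costs in the table. The page does not state how large the reserve buffer is or how it is applied, so the 10% multiplicative `reserve_fraction` and the function name `estimate_full_run` are assumptions for illustration only.

```python
from decimal import Decimal


def estimate_full_run(preview_costs, row_target, reserve_fraction=Decimal("0.10")):
    """Highest-cost preview row x row target, plus a reserve buffer.

    reserve_fraction is an assumed value; the real buffer is not documented here.
    """
    base = max(preview_costs) * row_target
    return base + base * reserve_fraction


# Per-row costs taken from the preview table above.
preview_costs = [Decimal("0.0052"), Decimal("0.0064"), Decimal("0.0049")]

estimate = estimate_full_run(preview_costs, row_target=10_000)
print(f"${estimate:.2f}")  # $70.40 under the assumed 10% buffer
```

Without the buffer, the same 10,000-row target comes to $64.00, because the $0.0064 row sets the ceiling for every row in the estimate.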
In the format your pipeline already speaks.
Push to where your training pipeline lives.
Stream a run straight to the bucket, dataset hub, or notebook you already use. No middleman, no manual upload step.
Hugging Face Hub
One-click push to a private or public dataset card.
Kaggle
Push to a private Kaggle dataset for competitions and notebooks.
Amazon S3
Stream directly to your bucket. SSE-KMS supported.
Google Cloud Storage
Stream to GCS with service-account auth and CMEK.
Azure Blob
Write straight to a container with managed identity.
Direct download
Grab a signed URL and pull the run from your training box.
Typed once. Validated every row.
Define the dataset shape once. Every generated row is validated before it leaves the workspace, with provenance, tier, and policy label attached. No bespoke clean-up step downstream.
{
  "schema": "preference-pair-v3",
  "fields": [
    { "name": "prompt", "type": "string" },
    { "name": "chosen", "type": "string" },
    { "name": "rejected", "type": "string" },
    { "name": "policy", "type": "string" },
    { "name": "tier", "type": "int" }
  ]
}
{ "prompt": "Rewrite this support ticket in a civil tone.", "chosen": "I understand your frustration. Let's solve the refund...", "rejected": "Stop being so rude. Read the FAQ.", "policy": "civil-tone-v2", "tier": 1 }
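If you want to re-check rows on your own side after export, a minimal sketch might look like the following. It is not the validator the workspace runs. The mapping of `"string"` and `"int"` to Python types, and the helper name `validate_row`, are assumptions made for illustration.

```python
import json

# Schema copied from the example above.
SCHEMA = json.loads("""
{ "schema": "preference-pair-v3", "fields": [
  { "name": "prompt", "type": "string" },
  { "name": "chosen", "type": "string" },
  { "name": "rejected", "type": "string" },
  { "name": "policy", "type": "string" },
  { "name": "tier", "type": "int" } ] }
""")

# Assumed mapping from the schema's type names to Python types.
TYPE_MAP = {"string": str, "int": int}


def validate_row(row, schema):
    """Return a list of problems; an empty list means every declared field checks out."""
    errors = []
    for field in schema["fields"]:
        name, py_type = field["name"], TYPE_MAP[field["type"]]
        if name not in row:
            errors.append(f"missing field: {name}")
            continue
        value = row[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, py_type):
            errors.append(
                f"{name}: expected {py_type.__name__}, got {type(value).__name__}"
            )
    return errors


good = {
    "prompt": "Rewrite this support ticket in a civil tone.",
    "chosen": "I understand your frustration. Let's solve the refund...",
    "rejected": "Stop being so rude. Read the FAQ.",
    "policy": "civil-tone-v2",
    "tier": 1,
}
bad = dict(good, tier="1")  # wrong type for tier
del bad["policy"]           # and a missing field

print(validate_row(good, SCHEMA))  # []
print(validate_row(bad, SCHEMA))   # ['missing field: policy', 'tier: expected int, got str']
```

The sketch only checks the fields the schema declares, so extra metadata attached to a row passes through untouched.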
One line in your training script.
Once a run is exported, load it the same way you load every other dataset: Hugging Face datasets, the OpenAI fine-tune API, or your own RLHF stack.
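from datasets import load_dataset

# Load the exported run from the Hugging Face Hub.
ds = load_dataset(
    "abliteration/runs/billing-support-training",
    split="train",
)

# `trainer` is your existing training object, defined elsewhere in your script.
trainer.train(dataset=ds)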
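If you pulled the run as a JSONL file instead of pushing it to a hub, a standard-library loader is enough to get started. This is a sketch under assumptions: `read_jsonl` is a hypothetical helper, and the demo writes a tiny stand-in file named after the `billing-support-training.jsonl` example so the snippet runs on its own.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path):
    """Yield one dict per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_no}: invalid JSON ({exc.msg})") from exc


# Demo with a two-row stand-in file; point `path` at your real download instead.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "billing-support-training.jsonl"
    path.write_text(
        '{"prompt": "a", "chosen": "b", "rejected": "c"}\n'
        "\n"
        '{"prompt": "d", "chosen": "e", "rejected": "f"}\n',
        encoding="utf-8",
    )
    rows = list(read_jsonl(path))

print(len(rows))  # 2
```

Reading line by line keeps memory use flat, which matters more as row counts grow.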
Free tier. Pay-as-you-go. Enterprise.
Generate up to 1M records per job. Tiered by records-per-month, with usage-based scaling for enterprise.
Try the model that doesn’t say no.
Free tier. OpenAI-compatible. Policy Gateway when you scale.