Transparent per-unit pricing
Every model has a published unit price (per image, per second of video, per million tokens). No "contact sales", no surprise per-second compute bills. The full pricing table is on /pricing and on every model page.
Browse 307 curated SOTA models across video, image, music and chat. Test any of them in the playground without writing code. Pay per unit (per image, per second of video, per million tokens) from one credit pool — no per-provider account, no negotiated contract, no wasted setup time.
No card required · Test any model in the playground before you build
Hugging Face lists 45,000+ model cards. Most are forks, dead branches or research artifacts that never reach production. AI Generate lists 307 — every one is a current SOTA model from a named provider, priced per unit, latency-tested, and shipped behind a Bearer-token API. The catalog is small on purpose; everything in it is meant to be deployed.
GPT-5.5, Claude Opus 4.6, Gemini 3 Pro, DeepSeek v3, Mistral Large, GPT-5-Codex
Flux 2 Pro, Nano Banana Pro, GPT-Image, Ideogram v3, Recraft, Qwen Image
Veo 3.1, Runway Aleph, Seedance 2, Wan 2.7, Kling 3.0, Sora 2 Pro
Suno v4.5 Plus, MusicGen, Mureka, ElevenLabs voices
A rotating shortlist for each modality, refreshed automatically when a new SOTA ships. Click through for the model page, the per-unit price and a one-click playground.
Buy $10, $50 or $200 of credits and spend them on whatever you need this week. Generate a Suno song in the morning, a Veo clip in the afternoon, a Claude analysis in the evening — all from the same balance.
Every chat, image, video and music model has a UI runner at /playground. Tweak prompts, switch models, copy the resulting curl command into your code. Onboarding stops at "log in".
Markup over upstream cost drops from 40% to 10% as your rolling 30-day spend climbs. Recomputed every six hours. No negotiation, no contract, no minimum commit.
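The tier ladder above can be sketched as a lookup over rolling 30-day spend. The thresholds below are hypothetical placeholders; only the 40%-to-10% range, the five-tier count mentioned in the comparison table, and the ~$200 figure quoted elsewhere on this page come from the text:

```python
# Illustrative sketch of the auto-tier discount. Thresholds are
# hypothetical examples, NOT the published ladder; only the
# 40% -> 10% endpoints and the five-tier count come from the page.
TIERS = [  # (rolling 30-day spend in USD, markup over upstream cost)
    (0, 0.40),
    (200, 0.30),
    (1_000, 0.20),
    (5_000, 0.15),
    (20_000, 0.10),
]

def markup_for(rolling_30d_spend_usd: float) -> float:
    """Return the markup rate for the highest tier reached.

    In production this would be recomputed every six hours from
    actual spend, per the description above.
    """
    rate = TIERS[0][1]
    for threshold, tier_rate in TIERS:
        if rolling_30d_spend_usd >= threshold:
            rate = tier_rate
    return rate
```

The ladder is monotonic, so a simple forward scan picks the last threshold the spend has crossed.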
Three marketplace categories, plus going direct, each with a different trade-off. The choice depends on whether you want curation, breadth or vertical depth.
| | AI Generate | Hugging Face | Replicate | Direct providers |
|---|---|---|---|---|
| Catalog size | 307 curated SOTA | 45,000+ (mostly forks) | Thousands (community-packaged) | 1 provider per account |
| Pricing surface | Per-unit, published | Per-token via Inference API | Per-second of GPU compute | Per-token, per-provider invoice |
| Modalities under one key | Chat + image + video + music | Chat + image (limited video) | Image + video + ML, limited chat | Single modality per provider |
| Try without code | Yes — playground for every model | Yes — Spaces (UI varies per model) | Yes — model page demos | No (or per-provider Playground) |
| SDK story | OpenAI-compatible (drop-in) | huggingface_hub (proprietary) | replicate-python (proprietary) | OpenAI / Anthropic / Google direct |
| Volume discount | Auto, 5 tiers, no negotiation | No public tier ladder | No public tier ladder | Negotiated separately |
| Spend caps | Daily + monthly, returns HTTP 402 | Per-key budget | Per-key spend limits | Per-account billing alerts |
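The HTTP 402 behaviour in the table suggests a specific client pattern: treat a cap hit as non-retryable instead of hammering the API. A minimal sketch, assuming a hypothetical endpoint URL and payload; only the Bearer-token auth and the 402-on-cap response come from this page:

```python
import json
import urllib.request
import urllib.error

# Hypothetical endpoint -- the real path is not stated on this page.
API_URL = "https://api.example-aigenerate.com/v1/generate"

def should_retry(status_code: int) -> bool:
    """402 signals a daily or monthly spend cap, not a transient fault:
    retrying cannot succeed until the cap resets or is raised.
    Transient 5xx errors, by contrast, are worth retrying."""
    if status_code == 402:
        return False
    return 500 <= status_code < 600

def generate(prompt: str, api_key: str) -> dict:
    req = urllib.request.Request(
        API_URL,
        data=json.dumps({"prompt": prompt}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read())
    except urllib.error.HTTPError as err:
        if not should_retry(err.code):
            # Spend cap hit (or other permanent failure): surface it
            # rather than looping on a request that cannot succeed.
            raise RuntimeError(f"Not retryable (HTTP {err.code})") from err
        raise  # let the caller's retry loop handle transient errors
```

Keeping the retry decision in a pure function makes the cap-versus-outage distinction easy to unit-test without network access.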
The word marketplace is overloaded. Hugging Face is technically the largest, with 45,000+ model cards uploaded by anyone — research labs, forks, fine-tunes, community experiments. That breadth works for researchers and against anyone shipping a SaaS. Replicate is a model marketplace in a different sense: it gives you a per-second GPU runtime to host any model someone has packaged in Cog. AI Generate is a marketplace in a third sense: a curated catalog of 307 production-grade models, priced per unit, behind a single Bearer token, with a playground UI for testing and a webhook surface for async generation. Every model in our catalog has a named upstream provider (OpenAI, Anthropic, Google, Black Forest Labs, ByteDance, Suno, ElevenLabs, fal, Replicate-hosted) and a per-unit price you can quote to your own customers.
When your stack reaches the third or fourth provider, the operational tax of running them in parallel becomes the dominant cost. You have separate invoices to reconcile, separate spend caps to monitor, separate dashboards to log into, separate Stripe webhooks to verify if you bill customers downstream. AI Generate folds all of that into one credit pool with one API key. Your finance team gets one line item; your engineers stop branching error-handling on which provider died this morning; your operators see the same dashboard whether the call routed to Claude, Gemini or Veo. The 10–40% markup pays for the consolidation, and the auto-tier discount makes the unit-price math competitive once your spend crosses ~$200 a month.
Every model in the marketplace has a dedicated page at /model/<slug>. The page shows the per-unit price (in credits and USD), the average latency, the supported parameters, a sample request, and a one-click "open in playground" button. The playground at /playground runs the same call you will run from your backend — same model, same prompt format, same output. You can copy the resulting curl request straight into your code, or switch the SDK tab between Python and JavaScript. Onboarding becomes: log in, claim $10, open a model page, hit playground, copy curl, ship. The interactive surface is why developers pick a marketplace over wiring three to five provider SDKs into their own backend before they have shipped anything.
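"OpenAI-compatible (drop-in)" in the comparison table means the request shape matches OpenAI's chat-completions format. A sketch that builds (but does not send) such a request; the base URL, API key and model slug below are hypothetical placeholders, not values from this page:

```python
import json

# Hypothetical values -- the real base URL and model slugs live on
# each model page; only "OpenAI-compatible" and Bearer-token auth
# are stated in the text.
BASE_URL = "https://api.example-aigenerate.com/v1"
API_KEY = "sk-placeholder"

def chat_request(model: str, prompt: str) -> tuple[str, dict, bytes]:
    """Build an OpenAI-style chat-completions request: URL, headers, body."""
    url = f"{BASE_URL}/chat/completions"
    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return url, headers, body

url, headers, body = chat_request("claude-opus-4.6", "Summarize this ticket.")
```

Because the shape matches, an existing OpenAI SDK client can usually be repointed by swapping its base URL and API key, with no changes at the call sites.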
When a new flagship model ships — Veo 3.2, Claude Opus 4.7, GPT-5.6, the next Suno — we benchmark it against the current category leader on latency, output quality and unit price, and add it to the catalog within days. Old or replaced models stay listed for backward compatibility (so existing integrations do not break) but are marked "superseded" on the model page. The curation policy is explicit: every active model in the catalog is one we would recommend to a customer building today. There is no SEO-padding with abandoned forks or weak rebrands. Coverage breadth is a side effect of provider quality, not a vanity metric.
Sign up, verify your email, and $10 in credits land in your account — enough to test 80 Flux Kontext Pro renders or 600 Nano Banana images.
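The free-credit counts imply rough per-unit prices. These figures are derived from the quoted numbers above, not read from the published pricing table:

```python
# Back-of-envelope unit prices implied by the quoted free-credit
# counts. Derived figures, NOT the published price sheet.
FREE_CREDITS_USD = 10.00

flux_kontext_pro = FREE_CREDITS_USD / 80   # ~$0.125 per render
nano_banana = FREE_CREDITS_USD / 600       # ~$0.0167 per image
```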