PRICING
Pay only for what you actually run
No plans, no subscriptions, no minimum commitments. Two billing modes, per-token and per-request, cover every workload. Rates track the upstream provider in real time, and TokenByte takes no cut.
Per-token billing
Chat completions, text completions, embeddings — input and output tokens metered separately at live upstream rates.
How it works
Every API request is metered by token count. Input and output tokens are tracked independently at their own rate. TokenByte charges no platform fee — the rate on your invoice is the upstream provider's rate.
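As a rough illustration of the arithmetic described above, the sketch below computes the charge for a single request from separate input and output rates. The rate values, the per-million-token unit, and the `per_token_cost` function name are assumptions made for this example; they are not published TokenByte figures or part of any TokenByte API.

```python
from decimal import Decimal

# Hypothetical upstream rates in USD per 1,000,000 tokens (illustration only).
INPUT_RATE_PER_MILLION = Decimal("0.50")
OUTPUT_RATE_PER_MILLION = Decimal("1.50")
TOKENS_PER_MILLION = Decimal(1_000_000)


def per_token_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate: Decimal = INPUT_RATE_PER_MILLION,
    output_rate: Decimal = OUTPUT_RATE_PER_MILLION,
) -> Decimal:
    """Return the charge for one request, metering input and output separately."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Each direction is priced at its own rate; no platform fee is added on top.
    input_cost = Decimal(input_tokens) * input_rate / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * output_rate / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # 1,200 prompt tokens and 300 completion tokens at the assumed rates.
    print(per_token_cost(1200, 300))
```

`Decimal` is used instead of `float` so that small per-token amounts add up without binary rounding drift when many requests are reconciled against an invoice.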
Endpoints
- Input and output metered separately
- Provider-equivalent live rates
- Usage tracked to the second
- Per-key spending caps
Per-request billing
Image generation, speech-to-text, text-to-speech, audio processing — a flat fee per task, independent of input length.
How it works
Each call is billed at a fixed rate for its task type. The price depends on the task and the model, not the size of the prompt. One request, one charge — easy to reconcile.
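A minimal sketch of the flat-fee model described above: the charge is looked up by task type and model, and prompt size never enters the calculation. The task names, model names, fee values, and the `per_request_cost` helper are all hypothetical placeholders, not actual TokenByte prices or identifiers.

```python
from decimal import Decimal

# Hypothetical flat fees in USD, keyed by (task type, model). Illustration only.
FLAT_FEES = {
    ("image-generation", "example-image-model"): Decimal("0.040"),
    ("speech-to-text", "example-stt-model"): Decimal("0.006"),
}


def per_request_cost(task: str, model: str, request_count: int = 1) -> Decimal:
    """Return the total charge for `request_count` calls of one task type.

    The fee depends only on the task and the model, so the function takes
    no prompt or input-length argument at all.
    """
    if request_count < 0:
        raise ValueError("request_count must be non-negative")
    try:
        fee = FLAT_FEES[(task, model)]
    except KeyError:
        raise KeyError(f"no flat fee configured for {task!r} on {model!r}") from None
    # One request, one charge: the total is simply fee times number of calls.
    return fee * request_count


if __name__ == "__main__":
    print(per_request_cost("image-generation", "example-image-model", 3))
```

Because every call of a given task and model costs the same, an invoice line can be reconciled by multiplying the request count by a single fee.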
Endpoints
- Fixed cost per request
- Tiered by task type
- Covers images, audio, video
- No hidden platform fees
Billing FAQ
Start building in minutes
Create an account, generate an API key, and send your first request. You pay only for what you actually use.
Get your API key