PRICING

Pay only for what you actually run

No plans, no subscriptions, no minimum commitments. Two billing modes cover every workload. Rates track the upstream provider in real time, and TokenByte takes no cut.

01 TEXT WORKLOADS

Per-token billing

Chat completions, text completions, embeddings — input and output tokens metered separately at live upstream rates.

How it works

Every API request is metered by token count. Input and output tokens are tracked independently at their own rate. TokenByte charges no platform fee — the rate on your invoice is the upstream provider's rate.
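As a rough illustration of the arithmetic, the sketch below prices one request from its input and output token counts. The rates, the per-million-token unit, and the function name `token_cost` are assumptions made for this example only; actual rates are whatever the upstream provider charges at request time.

```python
from decimal import Decimal

# Hypothetical rates in USD per 1,000,000 tokens (illustrative only;
# real rates follow the upstream provider and are not listed here).
INPUT_RATE_PER_MILLION = Decimal("0.50")
OUTPUT_RATE_PER_MILLION = Decimal("1.50")
TOKENS_PER_MILLION = Decimal(1_000_000)


def token_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Price one request: input and output tokens are each billed at their own rate."""
    input_cost = Decimal(input_tokens) * INPUT_RATE_PER_MILLION / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_RATE_PER_MILLION / TOKENS_PER_MILLION
    # No platform fee is added on top of the two metered amounts.
    return input_cost + output_cost


# Example: a request with 1,200 input tokens and 300 output tokens.
cost = token_cost(input_tokens=1_200, output_tokens=300)
print(f"Request cost: ${cost}")
```

`Decimal` is used instead of floats so that very small per-request amounts add up exactly when many requests are summed.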

Endpoints

Chat completions, Text completions, Embeddings
  • Input and output metered separately
  • Provider-equivalent live rates
  • Usage tracked to the second
  • Per-key spending caps
02 MEDIA WORKLOADS

Per-request billing

Image generation, speech-to-text, text-to-speech, audio processing — a flat fee per task, independent of input length.

How it works

Each call is billed at a fixed rate for its task type. The price depends on the task and the model, not the size of the prompt. One request, one charge — easy to reconcile.
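To make reconciliation concrete, the sketch below looks up a flat fee by task type and model and totals a batch of requests. The fee table, the task and model names, and the helpers `request_cost` and `reconcile` are hypothetical and serve only to illustrate the one-request, one-charge model.

```python
from decimal import Decimal

# Hypothetical flat fees in USD, keyed by (task type, model).
# These names and prices are placeholders, not published rates.
FLAT_FEES = {
    ("image-generation", "example-image-model"): Decimal("0.04"),
    ("speech-to-text", "example-stt-model"): Decimal("0.006"),
}


def request_cost(task: str, model: str) -> Decimal:
    """Return the fixed charge for one request; prompt size plays no part."""
    return FLAT_FEES[(task, model)]


def reconcile(requests: list[tuple[str, str]]) -> Decimal:
    """Total a list of (task, model) requests: one charge per request."""
    return sum((request_cost(task, model) for task, model in requests), Decimal("0"))


# Example: three image generations and one transcription.
batch = [("image-generation", "example-image-model")] * 3 + [
    ("speech-to-text", "example-stt-model")
]
print(f"Batch total: ${reconcile(batch)}")
```

Because each line item is a fixed amount, the expected total is simply the request count per task multiplied by that task's fee.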

Endpoints

Image generation, Speech-to-text, Text-to-speech, Audio processing
  • Fixed cost per request
  • Tiered by task type
  • Covers images, audio, video
  • No hidden platform fees
FAQ

Billing FAQ

Start building in minutes

Create an account, generate an API key, and send your first request. You pay only for what you actually use.

Get your API key