Limits & quotas

Account-level ceilings that are separate from per-request rate limits — how many keys, how many parallel deploys, how big a request body.

Account limits

Limit                  Free / PAYG    Scale
API keys               10             100
Webhook endpoints      5              50
Audit log retention    90 days        1 year
Spend cap              Self-set       Self-set

Per-request limits

Limit                     Value
Request body              10 MB
Embedding batch size      2048 inputs
Chat messages array       256 messages
File upload               25 MB (audio), 100 MB (batch)
Streaming idle timeout    60 s

These are hard limits, enforced at the edge. Exceeding one returns HTTP 422 (not 400) with a precise error message.
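Because these limits are fixed, a client can check them locally before sending a request and so avoid a round trip that ends in a 422. The following is a minimal sketch, not an official client: the function names `check_chat_request` and `chunk_embedding_inputs` are hypothetical, the `messages` payload key is an assumption based on the "Chat messages array" row, and "MB" is read as 10**6 bytes because the page does not define it (the stricter reading, so the check never passes a body the edge would reject under either definition). The body size is only an estimate, since an HTTP client may serialize the payload slightly differently.

```python
import json

# Values copied from the per-request limits table above.
# Assumption: "MB" means 10**6 bytes; the page does not say.
MB = 1_000_000
MAX_BODY_BYTES = 10 * MB
MAX_EMBEDDING_BATCH = 2048
MAX_CHAT_MESSAGES = 256


def check_chat_request(payload: dict) -> list[str]:
    """Return a list of limit violations for a chat payload (empty if none).

    Hypothetical helper: the "messages" key is an assumed payload shape.
    """
    problems = []

    # Estimate the encoded body size from a JSON serialization of the payload.
    body = json.dumps(payload).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        problems.append(
            f"request body is {len(body)} bytes, limit is {MAX_BODY_BYTES}"
        )

    messages = payload.get("messages", [])
    if len(messages) > MAX_CHAT_MESSAGES:
        problems.append(
            f"{len(messages)} messages, limit is {MAX_CHAT_MESSAGES}"
        )

    return problems


def chunk_embedding_inputs(inputs: list[str], size: int = MAX_EMBEDDING_BATCH):
    """Yield consecutive slices of at most `size` inputs each."""
    for start in range(0, len(inputs), size):
        yield inputs[start:start + size]
```

With this sketch, 5,000 embedding inputs would be split into three batches of 2048, 2048, and 904 inputs, each of which fits under the batch limit.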

Raising a limit

Contact us with:

  • The limit you want lifted.
  • The workload that's hitting it.
  • The projected scale (peak, sustained, retention period).

Most reasonable requests are approved the same day they're made. The exceptions are limits tied to abuse vectors (key count, webhook fan-out), where we may want to talk first.

For per-request rate limits (RPM / TPM), see Rate limits.

Last updated 13 May 2026