Rate Limits & Quotas
Relay protects the platform with two independent controls: a per-API-key rate limit that smooths out bursts of requests, and a monthly usage quota that caps participant-minutes by plan tier.
Rate limits
Every API key is metered with a token bucket. Each key gets a bucket that holds a burst of up to 120 requests and refills at roughly 2 requests per second. So you can spend a short burst quickly, but sustained throughput settles at about 2 requests/second per key. (Exact numbers may change as we tune the platform — treat them as the current defaults, not a contract, and rely on the response headers below.)
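To make the burst-versus-sustained behaviour concrete, here is a minimal client-side model of a token bucket using the default numbers quoted above. It is only a sketch: Relay's server-side implementation is not described here, and the TokenBucket class, its method names, and the assumption that a fresh bucket starts full are all illustrative.

```javascript
// Minimal token-bucket model. Capacity and refill rate mirror the current
// defaults quoted above; the class and method names are illustrative only.
class TokenBucket {
  constructor(capacity = 120, refillPerSecond = 2) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // assumption: a fresh bucket starts full
    this.lastRefillMs = 0;
  }

  // Add the tokens earned since the last call, never exceeding capacity.
  refill(nowMs) {
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefillMs = nowMs;
  }

  // Returns true if a request is admitted, false if it would be limited.
  tryRemove(nowMs) {
    this.refill(nowMs);
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Fire 150 requests at the same instant: only the burst capacity is admitted.
const bucket = new TokenBucket();
let admitted = 0;
for (let i = 0; i < 150; i++) {
  if (bucket.tryRemove(0)) admitted++;
}
console.log(admitted); // 120

// Five seconds later, 2 * 5 = 10 tokens have come back.
let admittedLater = 0;
for (let i = 0; i < 150; i++) {
  if (bucket.tryRemove(5000)) admittedLater++;
}
console.log(admittedLater); // 10
```

In this model an empty bucket takes 120 / 2 = 60 seconds to refill completely, which is why bursts are best kept for genuine spikes.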
When a key runs out of tokens, the request is rejected with HTTP 429 and a rate_limited error envelope. Every limited response carries headers describing the limit and when to retry:
- RateLimit-Limit: the bucket capacity (max burst).
- RateLimit-Remaining: tokens left right now (0 on a 429).
- RateLimit-Reset: seconds until a token replenishes.
- Retry-After: seconds to wait before retrying (mirrors RateLimit-Reset).
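A client can read these headers into numbers before deciding how long to wait. The helper below is a sketch under stated assumptions: parseRateLimit is a hypothetical name, not part of any Relay SDK, and it accepts any object with a get(name) method, such as the Headers object returned by fetch.

```javascript
// Reads the rate-limit headers into plain numbers (null when a header is
// missing or not numeric). parseRateLimit is a hypothetical helper.
function parseRateLimit(headers) {
  const toNumber = (name) => {
    const raw = headers.get(name);
    if (raw === null || raw === undefined || raw === "") return null;
    const n = Number(raw);
    return Number.isFinite(n) ? n : null;
  };
  return {
    limit: toNumber("RateLimit-Limit"),
    remaining: toNumber("RateLimit-Remaining"),
    resetSeconds: toNumber("RateLimit-Reset"),
    retryAfterSeconds: toNumber("Retry-After"),
  };
}

// A Map stands in for a fetch Headers object here; the values are the ones
// from the example 429 response on this page.
const sampleHeaders = new Map([
  ["RateLimit-Limit", "120"],
  ["RateLimit-Remaining", "0"],
  ["RateLimit-Reset", "3"],
  ["Retry-After", "3"],
]);
const parsed = parseRateLimit(sampleHeaders);
console.log(parsed.remaining, parsed.retryAfterSeconds); // 0 3
```

Returning null instead of 0 for a missing header keeps "no information" distinct from "no tokens left".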
HTTP/1.1 429 Too Many Requests
RateLimit-Limit: 120
RateLimit-Remaining: 0
RateLimit-Reset: 3
Retry-After: 3
Content-Type: application/json

{
  "error": {
    "type": "rate_limited",
    "code": "rate_limited",
    "message": "Too many requests"
  }
}

Handle 429 by waiting the number of seconds in Retry-After before retrying, and back off exponentially if you hit it repeatedly. Keep your sustained request rate under the refill rate and reserve bursts for genuine spikes.
async function relayFetch(url, init, attempt = 0) {
  const res = await fetch(url, init);
  if (res.status !== 429 || attempt >= 5) return res;
  // Respect Retry-After (seconds until tokens replenish); back off if absent.
  const retryAfter = Number(res.headers.get("Retry-After")) || 2 ** attempt;
  await new Promise((r) => setTimeout(r, retryAfter * 1000));
  return relayFetch(url, init, attempt + 1);
}

Quotas
Each organization has a monthly cap on participant-minutes, set by its plan tier:
- Free — 10,000 participant-minutes / month.
- Pro — 200,000 participant-minutes / month.
- Business — 1,000,000 participant-minutes / month.
Usage accumulates across all projects in your organization and resets at the start of each UTC month. Once you reach the cap, the actions that drive new usage — POST /v1/tokens and POST /v1/rooms — return HTTP 402 with a quota_exceeded error envelope:
HTTP/1.1 402 Payment Required
Content-Type: application/json

{
  "error": {
    "type": "quota_exceeded",
    "code": "quota_exceeded",
    "message": "Monthly usage quota exceeded for this plan"
  }
}

The quota is a soft gate: it only blocks the calls that create new usage. Read endpoints (such as GET /v1/usage and the room GET endpoints) and room deletes are never quota-blocked, so a session already in progress is not cut off. You can check your current month's usage and remaining allowance on the Settings page in the dashboard.
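Because a 402 does not clear after a few seconds the way a 429 does, client code may want to treat the two differently. The sketch below shows one way to do that, together with a calculation of when the monthly counter next resets. The function names and the returned action strings are hypothetical, chosen for this example; only the status codes and error types come from this page.

```javascript
// Decide how to react to a Relay response. The action strings are
// hypothetical labels, not values returned by the API.
function classifyResponse(status, body) {
  const type = body && body.error ? body.error.type : undefined;
  if (status === 429 && type === "rate_limited") return "retry_after_delay";
  if (status === 402 && type === "quota_exceeded") return "do_not_retry";
  if (status >= 200 && status < 300) return "ok";
  return "other_error";
}

// Start of the next UTC month, which is when usage resets.
// Date.UTC rolls month 12 over into January of the following year.
function nextQuotaReset(now = new Date()) {
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth() + 1, 1));
}

const quotaBody = {
  error: {
    type: "quota_exceeded",
    code: "quota_exceeded",
    message: "Monthly usage quota exceeded for this plan",
  },
};
console.log(classifyResponse(402, quotaBody)); // do_not_retry
console.log(nextQuotaReset(new Date("2024-12-15T10:00:00Z")).toISOString());
// 2025-01-01T00:00:00.000Z
```

Retrying a quota_exceeded response in a loop only adds load, so a wrapper like relayFetch should return it to the caller rather than wait and retry.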