Rate Limits

Routify applies three layers of rate limits:

  1. Global: protects the gateway from abuse (100,000 QPS aggregate)
  2. Per-user: set by account tier
  3. Per-model: protects upstream provider quotas

User tier limits

Tier        RPM        RPH     Tokens/day  Concurrent
Free        60         600     100k        2
Pro         300        10k     unlimited   8
VIP         1500-3000  100k    unlimited   32-64
Enterprise  Custom     Custom  unlimited   256
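
The RPM and RPH columns interact: if both windows are enforced at once, the hourly cap bounds the rate a client can sustain. A minimal sketch of that arithmetic, assuming simultaneous enforcement (the table does not say so explicitly). The `TIERS` dict and `sustained_rpm` helper are illustrative names, not part of the Routify API:

```python
# Fixed-value tiers copied from the table above (VIP and Enterprise vary).
TIERS = {
    "free": {"rpm": 60, "rph": 600},
    "pro": {"rpm": 300, "rph": 10_000},
}


def sustained_rpm(tier: str) -> float:
    """Highest per-minute rate that can be held for a full hour.

    Assumption: the RPM and RPH limits are both enforced at once.
    """
    limits = TIERS[tier]
    return min(limits["rpm"], limits["rph"] / 60)


if __name__ == "__main__":
    for name in TIERS:
        print(name, round(sustained_rpm(name), 1))
```

Under that assumption a Free account can burst to 60 requests in one minute but averages at most 10 per minute over an hour.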

Model-specific limits (Free and Pro tiers)

Model              RPM cap  Concurrent
deepseek-v3.2      1000     50
kimi-k2.5          800      30
claude-opus-4-7    50       5
claude-sonnet-4-6  200      15
gpt-4o             200      15
o1                 50       5
kling-1.6          30       3 (1k/day platform-wide)
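
When a per-user limit and a per-model cap both apply, a reasonable reading is that the stricter one wins. The sketch below assumes exactly that (the source does not state how the layers combine) and uses hypothetical names `Limits`, `TIER`, `MODEL`, and `effective_limits`:

```python
from typing import NamedTuple


class Limits(NamedTuple):
    rpm: int
    concurrent: int


# Values copied from the tier and model tables above.
TIER = {"free": Limits(60, 2), "pro": Limits(300, 8)}
MODEL = {
    "deepseek-v3.2": Limits(1000, 50),
    "claude-opus-4-7": Limits(50, 5),
    "gpt-4o": Limits(200, 15),
}


def effective_limits(tier: str, model: str) -> Limits:
    """Assumed rule: each dimension is capped by the stricter layer."""
    t, m = TIER[tier], MODEL[model]
    return Limits(min(t.rpm, m.rpm), min(t.concurrent, m.concurrent))
```

For example, a Pro account calling claude-opus-4-7 would be held to the model cap (50 RPM, 5 concurrent), while a Free account calling deepseek-v3.2 would be held to its tier limit (60 RPM, 2 concurrent).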

Hitting a limit

You'll get a 429 response with a Retry-After header (value in seconds):

HTTP/1.1 429 Too Many Requests
Retry-After: 30
 
{"error": {"type": "rate_limit", "message": "RPM exceeded: 300 / 60s"}}

The official OpenAI Python and Node SDKs retry 429 responses automatically (twice by default).
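
For clients that do not use an SDK with built-in retries, the 429 can be handled by hand. A minimal sketch, assuming Retry-After always carries a number of seconds as in the example above. Here `send` is a hypothetical callable returning `(status, headers, body)`, not a Routify or OpenAI function:

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]


def retry_delay(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Seconds to wait, read from Retry-After; fall back to `default`.

    Note: a plain dict lookup is case-sensitive; most HTTP client
    libraries expose case-insensitive header mappings instead.
    """
    value = headers.get("Retry-After", "")
    try:
        return max(0.0, float(value))
    except ValueError:
        return default


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, waiting out 429s, for up to `max_attempts` total tries."""
    for attempt in range(1, max_attempts + 1):
        response = send()
        status, headers, _ = response
        # Return on success, on any non-429 error, or when out of attempts.
        if status != 429 or attempt == max_attempts:
            return response
        sleep(retry_delay(headers))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; in production the default `time.sleep` is used.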

Increasing limits

  • Upgrade your tier (accounts are promoted automatically based on monthly spend)
  • Contact us for Enterprise quotas

Endpoint-specific limits

Endpoint                    Limit
POST /v1/chat/completions   per-tier above
POST /v1/embeddings         1000 RPM, 100 concurrent
POST /api/auth/login        10 / IP / min
POST /api/auth/register     3 / IP / min
POST /api/payment/recharge  5 / user / min