Rate Limits

Routify applies three layers of rate limits:

Global: protects the gateway from abuse (100k QPS aggregate)
Per-user: set by account tier
Per-model: protects upstream provider quotas

User tier limits

Tier	RPM	RPH	Tokens/day	Concurrent
Free	60	600	100k	2
Pro	300	10k	unlimited	8
VIP	1500-3000	100k	unlimited	32-64
Enterprise	Custom	Custom	unlimited	256
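
To stay under a tier's RPM limit instead of reacting to 429s, a client can throttle itself. The following is a minimal sketch of a sliding-window limiter, not part of any Routify SDK; the class name `SlidingWindowLimiter` and its methods are hypothetical, and it assumes the gateway counts requests over a rolling 60-second window, which this page does not confirm.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side throttle: at most `limit` requests per `window` seconds.

    Hypothetical helper; assumes a rolling window on the server side.
    """

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.sent = deque()  # timestamps of requests still inside the window

    def wait_time(self):
        """Seconds to wait before the next request (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        # The oldest request must leave the window before another is allowed.
        return self.window - (now - self.sent[0])

    def record(self):
        """Note that a request was just sent."""
        self.sent.append(self.clock())

    def acquire(self, sleep=time.sleep):
        """Block until a request is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            sleep(delay)
        self.record()
```

For a Free account this would be created as `SlidingWindowLimiter(limit=60)`, with `acquire()` called before each request.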

Model-specific limits (Free and Pro tiers)

Model	RPM cap	Concurrent
`deepseek-v3.2`	1000	50
`kimi-k2.5`	800	30
`claude-opus-4-7`	50	5
`claude-sonnet-4-6`	200	15
`gpt-4o`	200	15
`o1`	50	5
`kling-1.6`	30	3 (1k/day platform-wide)
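
When both a tier limit and a model cap apply, a client needs to know which one it will hit first. The sketch below assumes the stricter of the two applies per user; this page does not state how the layers combine, so treat that as an assumption. The function name `effective_limits` is hypothetical, and the numbers are copied from the two tables above.

```python
# Free and Pro tier limits, copied from the user tier table.
TIER_LIMITS = {
    "free": {"rpm": 60, "concurrent": 2},
    "pro": {"rpm": 300, "concurrent": 8},
}

# Per-model caps, copied from the model table.
MODEL_CAPS = {
    "deepseek-v3.2": {"rpm": 1000, "concurrent": 50},
    "kimi-k2.5": {"rpm": 800, "concurrent": 30},
    "claude-opus-4-7": {"rpm": 50, "concurrent": 5},
    "claude-sonnet-4-6": {"rpm": 200, "concurrent": 15},
    "gpt-4o": {"rpm": 200, "concurrent": 15},
    "o1": {"rpm": 50, "concurrent": 5},
    "kling-1.6": {"rpm": 30, "concurrent": 3},
}


def effective_limits(tier, model):
    """Return the stricter of the tier limit and the model cap.

    Assumption: the layers combine by taking the minimum.
    """
    tier_limit = TIER_LIMITS[tier]
    cap = MODEL_CAPS[model]
    return {key: min(tier_limit[key], cap[key]) for key in ("rpm", "concurrent")}
```

Under this assumption a Pro account calling `claude-opus-4-7` is bounded by the model cap (50 RPM, 5 concurrent), while a Free account calling `deepseek-v3.2` is bounded by its tier (60 RPM, 2 concurrent). The platform-wide daily cap on `kling-1.6` is not modeled here.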

Hitting a limit

A request that exceeds a limit returns HTTP 429 with a `Retry-After` header giving the time to wait in seconds:

HTTP/1.1 429 Too Many Requests
Retry-After: 30
 
{"error": {"type": "rate_limit", "message": "RPM exceeded: 300 / 60s"}}

The official OpenAI SDKs retry 429 responses automatically (by default up to two retries with exponential backoff).
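
For clients that do not use an SDK, the 429 response above can be handled by hand. This is a minimal sketch using only the standard library; `retry_after_seconds`, `call_with_retry`, and the `send` callable (assumed to return a `(status, headers, body)` tuple) are hypothetical names, and the 1-second fallback for a missing header is an arbitrary choice.

```python
import time


def retry_after_seconds(status, headers, default=1.0):
    """Return how long to wait after a response, or None if it is not a 429."""
    if status != 429:
        return None
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return max(0.0, float(lowered["retry-after"]))
    except (KeyError, ValueError):
        # Header missing or not a number of seconds: use the fallback.
        return default


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call `send()` and retry on 429, waiting as long as Retry-After says.

    `send` is a hypothetical callable returning (status, headers, body).
    The last response is returned even if it is still a 429.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        delay = retry_after_seconds(status, headers)
        if delay is None or attempt == max_attempts - 1:
            return status, headers, body
        sleep(delay)
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.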

Increasing limits

Upgrade your tier (accounts are promoted automatically based on monthly spend)
Contact us for Enterprise quotas

Endpoint-specific limits

Endpoint	Limit
`POST /v1/chat/completions`	per-tier above
`POST /v1/embeddings`	1000 RPM, 100 concurrent
`POST /api/auth/login`	10 / IP / min
`POST /api/auth/register`	3 / IP / min
`POST /api/payment/recharge`	5 / user / min
