Rate limiting is partner-scoped: limits are shared across all tenants under the same API key and environment. Each request is classified into one of six buckets.
Buckets
| Bucket | Limit | Endpoints |
|---|---|---|
| criteria_ai | 2 req/s + 4 concurrent in-flight | POST /v1/jobs/{jobId}/question-sets, POST /v1/jobs/{jobId}/criteria/generate |
| scoring_intake_batch | 1 req/s | POST /v1/jobs/{jobId}/scoring-batches |
| scoring_intake_single | 10 req/s | POST /v1/jobs/{jobId}/applications/{applicationId}/scoring-jobs |
| read_and_ops | 20 req/s | All other /v1/* endpoints (except analytics and rate-limit status) |
| analytics | 20 req/s | GET /v1/analytics/* |
| rate_limit_status | 2 req/s | GET /v1/rate-limit-status |
Buckets are isolated: exhausting one doesn’t affect others.
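If you throttle on the client side, it can help to know in advance which bucket a request will land in. The sketch below transcribes the bucket table into patterns. The function name `classify_bucket` and the matching rules are illustrative assumptions, not the server's actual classifier. In particular, treating non-GET analytics paths as read_and_ops is a guess. The X-RateLimit-Bucket response header remains the authoritative answer.

```python
import re
from typing import Optional

# Patterns transcribed from the bucket table (an assumption about how the
# server matches paths). Anything under /v1/ that matches no rule falls
# through to read_and_ops.
_RULES = [
    ("criteria_ai", "POST",
     re.compile(r"^/v1/jobs/[^/]+/(question-sets|criteria/generate)$")),
    ("scoring_intake_batch", "POST",
     re.compile(r"^/v1/jobs/[^/]+/scoring-batches$")),
    ("scoring_intake_single", "POST",
     re.compile(r"^/v1/jobs/[^/]+/applications/[^/]+/scoring-jobs$")),
    ("analytics", "GET", re.compile(r"^/v1/analytics/.+$")),
    ("rate_limit_status", "GET", re.compile(r"^/v1/rate-limit-status$")),
]


def classify_bucket(method: str, path: str) -> Optional[str]:
    """Guess the bucket for a request; returns None for non-/v1 paths.

    `path` is expected without a query string.
    """
    if not path.startswith("/v1/"):
        return None
    for bucket, rule_method, pattern in _RULES:
        if method.upper() == rule_method and pattern.match(path):
            return bucket
    return "read_and_ops"
```

For example, `classify_bucket("POST", "/v1/jobs/j1/scoring-batches")` returns `"scoring_intake_batch"`, so a client could route that call through a 1 req/s local throttle.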
The criteria_ai bucket has an additional concurrency cap: at most 4 requests can be in flight at once. Requests beyond that cap receive a 429 even if RPS tokens are available.
If you need higher limits, contact us through the Embed Portal.
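One way to stay under the criteria_ai concurrency cap is to gate those calls behind a semaphore on the client. The following is a minimal sketch using asyncio. `InFlightLimiter`, `demo`, and `fake_request` are hypothetical names, and the sleep stands in for a real HTTP call. It caps in-flight requests only and does not enforce the 2 req/s limit.

```python
import asyncio

CRITERIA_AI_MAX_IN_FLIGHT = 4  # concurrency cap from the bucket table


class InFlightLimiter:
    """Caps how many calls run at once and records the peak observed."""

    def __init__(self, max_in_flight: int) -> None:
        self._max = max_in_flight
        self._sem = None  # created lazily, inside the running event loop
        self.in_flight = 0
        self.peak = 0

    async def run(self, call):
        """Await call() once a slot is free; call is a zero-arg coroutine function."""
        if self._sem is None:
            self._sem = asyncio.Semaphore(self._max)
        async with self._sem:
            self.in_flight += 1
            self.peak = max(self.peak, self.in_flight)
            try:
                return await call()
            finally:
                self.in_flight -= 1


async def demo() -> int:
    """Fire 10 fake requests and return the peak number in flight."""
    limiter = InFlightLimiter(CRITERIA_AI_MAX_IN_FLIGHT)

    async def fake_request() -> str:
        await asyncio.sleep(0.01)  # stands in for the HTTP round trip
        return "ok"

    await asyncio.gather(*(limiter.run(fake_request) for _ in range(10)))
    return limiter.peak
```

Because limits are shared across all tenants under the same API key and environment, a single process-local limiter is only sufficient if one process issues all criteria_ai calls.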
All /v1/* responses include:
X-RateLimit-Bucket: scoring_intake_single
X-RateLimit-Limit: 10
X-RateLimit-Remaining: 8
X-RateLimit-Reset: 1734187201
| Header | Description |
|---|---|
| X-RateLimit-Bucket | Which bucket the request was classified into |
| X-RateLimit-Limit | Max requests per second for this bucket |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| X-RateLimit-Degraded | "true" when rate limiting is in degraded mode |
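Reading these headers into a typed structure makes them easier to act on. The sketch below assumes you have the response headers as a plain string mapping. `RateLimitInfo` and `parse_rate_limit_headers` are hypothetical names, and treating a missing X-RateLimit-Degraded header as "not degraded" is an assumption.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RateLimitInfo:
    bucket: str
    limit: int        # max requests per second for the bucket
    remaining: int    # requests left in the current window
    reset_epoch: int  # Unix timestamp when the window resets
    degraded: bool    # True when X-RateLimit-Degraded is "true"


def parse_rate_limit_headers(headers: Mapping[str, str]) -> Optional[RateLimitInfo]:
    """Return parsed rate-limit headers, or None if any are missing or malformed."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return RateLimitInfo(
            bucket=lowered["x-ratelimit-bucket"],
            limit=int(lowered["x-ratelimit-limit"]),
            remaining=int(lowered["x-ratelimit-remaining"]),
            reset_epoch=int(lowered["x-ratelimit-reset"]),
            # Assumption: an absent header means normal (non-degraded) mode.
            degraded=lowered.get("x-ratelimit-degraded", "").lower() == "true",
        )
    except (KeyError, ValueError):
        return None
```

When `degraded` is true, the numeric fields are best-effort estimates, so a cautious client may prefer its own pacing over `remaining`.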
Checking status
GET /v1/rate-limit-status returns the state of all buckets in one call without consuming tokens from other buckets. It requires only the Authorization header, not X-Tenant-Id.
Handling 429
When rate-limited, the response includes a Retry-After header (seconds to wait).
{
  "type": "https://docs.nova.dweet.com/embed-api/errors#rate-limited",
  "code": "RATE_LIMITED",
  "status": 429,
  "message": "Rate limit exceeded. Retry after 2 seconds.",
  "retryable": true,
  "traceId": "5c2f4f5b2c0a4ce0b6a31a1a18f8e9a1"
}
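A client can honor Retry-After with a small retry loop. The sketch below is transport-agnostic: `send` is a hypothetical zero-argument callable that performs the request and returns `(status, headers, body)`. The function names, the attempt limit of 5, and the 1-second fallback when the header is missing or unparsable are all assumptions.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Read Retry-After as seconds; fall back to `default` if absent or invalid."""
    lowered = {name.lower(): value for name, value in headers.items()}
    try:
        return max(0.0, float(lowered["retry-after"]))
    except (KeyError, ValueError):
        return default


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    response = send()
    for _ in range(max_attempts - 1):
        status, headers, _body = response
        if status != 429:
            break
        # Wait for as long as the server asked before trying again.
        sleep(retry_after_seconds(headers))
        response = send()
    return response
```

The last response is returned even if it is still a 429, so the caller can inspect the body and traceId. Passing `sleep` as a parameter keeps the loop testable without real delays.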
For high-volume backfills, use POST /v1/jobs/{jobId}/scoring-batches with up to 25 applications per request.
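Splitting a backfill into batches of at most 25 is a simple chunking step. The sketch below only groups application IDs; the helper name `chunk_applications` is hypothetical, and the request body shape for the batch endpoint is not shown here, so building the payload is left out.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 25  # per-request maximum stated for scoring-batches


def chunk_applications(
    application_ids: Sequence[str], size: int = MAX_BATCH_SIZE
) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` application IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(application_ids), size):
        yield list(application_ids[start:start + size])
```

With the scoring_intake_batch limit of 1 req/s, full batches submit up to 25 applications per second, compared with 10 per second through the single-application endpoint.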
Degraded mode
If Redis is temporarily unavailable, the system enters degraded mode: requests are allowed through (fail-open), X-RateLimit-Degraded: true is set, and header values are best-effort estimates. Normal limiting resumes automatically when Redis recovers.