Rate Limits
Per-endpoint request throttling with distributed sliding-window rate limiting.
Overview
GaaS enforces rate limits to prevent abuse, ensure fair resource allocation, and maintain system stability. Rate limits are applied per API key (organization) and vary by endpoint.
When you exceed a rate limit, the API returns an HTTP 429 Too Many Requests response with a Retry-After header indicating when you can retry.
When GAAS_DATABASE_URL is set, GaaS uses Postgres-backed distributed rate limiting with atomic sliding-window counters. This keeps rate limiting accurate across multiple API server instances.
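To make the sliding-window idea concrete, here is a minimal in-process sketch. It is not the GaaS implementation (which is Postgres-backed and shared across servers); it uses a simple sliding-window log held in memory, and the class and method names are illustrative assumptions.

```python
import math
import time
from collections import deque
from typing import Optional, Tuple


class SlidingWindowLimiter:
    """Illustrative sliding-window log: at most `limit` requests per `window` seconds."""

    def __init__(self, limit: int, window: float = 60.0):
        self.limit = limit
        self.window = window
        self._hits = {}  # key -> deque of accepted request timestamps

    def allow(self, key: str, now: Optional[float] = None) -> Tuple[bool, int]:
        """Return (allowed, retry_after_seconds) for one request charged to `key`."""
        now = time.monotonic() if now is None else now
        hits = self._hits.setdefault(key, deque())
        # Drop requests that have aged out of the window.
        while hits and hits[0] <= now - self.window:
            hits.popleft()
        if len(hits) < self.limit:
            hits.append(now)
            return True, 0
        # The oldest request still in the window decides when a slot frees up.
        retry_after = math.ceil(hits[0] + self.window - now)
        return False, max(retry_after, 1)
```

Because each request ages out individually, capacity returns one slot at a time rather than all at once at a fixed boundary.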
Per-Endpoint Rate Limits
Different endpoints have different rate limits based on their resource intensity:
| Endpoint Category | Limit | Window | Examples |
|---|---|---|---|
| Intent Submission | 20 requests | Per minute | POST /v1/intents, POST /v1/intents/batch |
| Backtest | 5 requests | Per minute | POST /v1/learning/backtest |
| Onboarding | 10 requests | Per minute | POST /v1/onboarding/quickstart, POST /v1/onboarding/intake |
| Authentication (IP-based) | 5 requests | Per minute | Dashboard login, signup, MFA verification |
| General Endpoints | 120 requests | Per minute | All other API endpoints (decisions, escalations, audit, webhooks, etc.) |
429 Response Format
When you exceed a rate limit, the API returns a 429 Too Many Requests response:
POST https://api.gaas.is/v1/intents
X-API-Key: your_api_key

# Response (when rate limit exceeded):
HTTP/1.1 429 Too Many Requests
Retry-After: 42
X-RateLimit-Limit: 20
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1707824460
Content-Type: application/json

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded for POST /v1/intents. Limit: 20 requests per minute. Retry after 42 seconds.",
    "retry_after_seconds": 42
  }
}
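If you call the API without an SDK, the wait time can be read from either the Retry-After header or the error body shown above. The helper below is a minimal sketch; the function name is an assumption, and it expects header names exactly as they appear in the example response.

```python
import json


def retry_delay_from_429(headers: dict, body: str) -> int:
    """Seconds to wait after a 429 (illustrative helper, not part of any SDK)."""
    # Prefer the Retry-After header, which this API sends as a number of seconds.
    value = headers.get("Retry-After")
    if value is not None:
        return int(value)
    # Fall back to the retry_after_seconds field in the JSON error body.
    return int(json.loads(body)["error"]["retry_after_seconds"])
```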
Rate Limit Headers
All successful responses (2xx) include rate limit headers:
- X-RateLimit-Limit: Maximum requests allowed in the window
- X-RateLimit-Remaining: Requests remaining in the current window
- X-RateLimit-Reset: Unix timestamp when the rate limit resets
On 429 responses, the Retry-After header indicates the number of seconds to wait before retrying.
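The three X-RateLimit-* headers can be collected into a small structure for use elsewhere in a client. This is a sketch under the assumption that your HTTP library exposes headers as a mapping of strings; the type and function names are illustrative.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RateLimitStatus:
    limit: int       # X-RateLimit-Limit
    remaining: int   # X-RateLimit-Remaining
    reset_at: int    # X-RateLimit-Reset (Unix timestamp, seconds)

    def seconds_until_reset(self, now: float) -> float:
        """Time left until the reset timestamp, never negative."""
        return max(0.0, self.reset_at - now)


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitStatus(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_at=int(lowered["x-ratelimit-reset"]),
    )
```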
Rate Limit Scoping
Per API Key (org_id)
Most rate limits are scoped to your API key (organization). This means all requests from your organization count toward the same limit, regardless of which agent or service is making the request.
Per IP Address
Authentication endpoints (login, signup, MFA verification) are rate-limited per IP address (not per API key) to prevent brute-force attacks. This applies to the dashboard authentication routes only.
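The two scopes can be pictured as a choice of which counter a request is charged to. The snippet below is purely illustrative: the helper name, the key format, and the sample organization IDs and IP addresses are assumptions, not GaaS internals.

```python
from collections import Counter


def rate_limit_bucket(org_id: str, client_ip: str, is_auth_route: bool) -> str:
    """Pick the counter a request is charged against (hypothetical helper)."""
    # Dashboard auth routes are limited per IP; everything else per organization.
    return f"ip:{client_ip}" if is_auth_route else f"org:{org_id}"


usage = Counter()
# Two services in the same organization, calling from different IPs.
usage[rate_limit_bucket("org_123", "203.0.113.7", is_auth_route=False)] += 1
usage[rate_limit_bucket("org_123", "198.51.100.9", is_auth_route=False)] += 1
# Two dashboard users from different organizations behind the same NAT address.
usage[rate_limit_bucket("org_123", "203.0.113.7", is_auth_route=True)] += 1
usage[rate_limit_bucket("org_456", "203.0.113.7", is_auth_route=True)] += 1
```

In this sketch the two API calls share one per-organization counter, and the two logins share one per-IP counter.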
Handling Rate Limits
Python (gaas_sdk)
import asyncio
from gaas_sdk import GaaSClient
from gaas_sdk.exceptions import RateLimitError

async def submit_with_single_retry(api_key, intent):
    async with GaaSClient("https://api.gaas.is", headers={"X-API-Key": api_key}) as client:
        try:
            return await client.submit_intent(intent)
        except RateLimitError as e:
            retry_after = e.retry_after_seconds
            print(f"Rate limited. Retrying after {retry_after} seconds...")
            # asyncio.sleep waits without blocking the event loop (time.sleep would block it).
            await asyncio.sleep(retry_after)
            return await client.submit_intent(intent)  # Retry once
TypeScript (@gaas/sdk)
import { GaaSClient, RateLimitError } from '@gaas/sdk';

const client = new GaaSClient({ baseUrl: 'https://api.gaas.is', headers: { ... } });

let response;
try {
  response = await client.submitIntent(intent);
} catch (error) {
  if (error instanceof RateLimitError) {
    const retryAfter = error.retryAfterSeconds;
    console.log(`Rate limited. Retrying after ${retryAfter}s...`);
    await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
    response = await client.submitIntent(intent); // Retry once
  } else {
    throw error; // Not a rate-limit error: do not swallow it
  }
}
Java (com.gaas.sdk)
import com.gaas.sdk.GaaSClient;
import com.gaas.sdk.exceptions.RateLimitException;

// The enclosing method must handle or declare InterruptedException (thrown by Thread.sleep).
GaaSResponse response;
try {
    response = client.submitIntent(intent);
} catch (RateLimitException e) {
    int retryAfter = e.getRetryAfterSeconds();
    System.out.println("Rate limited. Retrying after " + retryAfter + " seconds...");
    Thread.sleep(retryAfter * 1000L); // long arithmetic avoids int overflow
    response = client.submitIntent(intent); // Retry once
}
Best Practices
1. Implement Exponential Backoff
When a request is rate-limited, wait for the duration specified in Retry-After, then retry. If you hit the limit again, double the wait time with each subsequent retry:
import asyncio
from gaas_sdk.exceptions import RateLimitError

async def submit_with_backoff(client, intent, max_retries=3):
    for attempt in range(max_retries):
        try:
            return await client.submit_intent(intent)
        except RateLimitError as e:
            if attempt == max_retries - 1:
                raise  # Final attempt failed
            # Wait Retry-After on the first retry, then double the multiplier each time.
            retry_delay = e.retry_after_seconds * (2 ** attempt)
            await asyncio.sleep(retry_delay)
2. Batch Requests Where Possible
Use the bulk intent submission endpoint (POST /v1/intents/batch) to submit up to 50 intents in a single request. This counts as 1 request toward the rate limit instead of 50.
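Splitting a large list of intents into batch-sized groups might look like the sketch below. The helper name is an assumption; the size of 50 comes from the limit stated above, and how each batch is then sent depends on your client.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

MAX_BATCH_SIZE = 50  # maximum intents per POST /v1/intents/batch request


def chunk_intents(intents: Sequence[T], size: int = MAX_BATCH_SIZE) -> Iterator[List[T]]:
    """Yield consecutive groups of at most `size` intents (illustrative helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(intents), size):
        yield list(intents[start:start + size])
```

With this grouping, 120 intents become 3 batch requests instead of 120 individual ones.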
3. Monitor Rate Limit Headers
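Check X-RateLimit-Remaining on every response. If it's low (e.g., <10), slow down your request rate proactively to avoid hitting the limit. What counts as low depends on the endpoint: a threshold of 10 suits the 120 req/min general limit but is too high for the 5 req/min backtest limit, so compare against X-RateLimit-Limit when in doubt.
import time

remaining = int(response.headers['X-RateLimit-Remaining'])
if remaining < 10:
    print("Approaching rate limit. Slowing down requests...")
    time.sleep(5)  # Pause before next request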
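Instead of a fixed pause, one option is to spread the remaining budget evenly over the time left until X-RateLimit-Reset. The function below is a sketch of that idea with an assumed name; since the server uses a sliding window, treat the result as an approximation rather than a guarantee.

```python
def pacing_delay(remaining: int, reset_at: int, now: float) -> float:
    """Seconds to wait before the next request so the budget lasts until reset."""
    seconds_left = max(0.0, reset_at - now)
    if remaining <= 0:
        # Budget exhausted: wait for the reset timestamp.
        return seconds_left
    # Spread the remaining requests evenly across the time left.
    return seconds_left / remaining
```

For example, with 5 requests remaining and 30 seconds until reset, this spaces requests 6 seconds apart.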
4. Use Shadow Mode for Load Testing
Shadow mode intent submissions (?mode=shadow) are still subject to rate limits. When load testing, ensure your test traffic doesn't exceed the 20 req/min intent submission limit. Consider using multiple API keys to distribute load; because limits are scoped per organization, this requires separate organizations.
5. Cache Read-Only Data
Responses for read-only endpoints (e.g., GET /v1/intents/{intent_id}/decision) include ETag headers. Cache these responses locally and use If-None-Match headers to get 304 Not Modified responses, which don't count toward rate limits.
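A conditional-request cache can be sketched as follows. The `fetch` callable is a stand-in for whatever HTTP client you use (an assumption, not a GaaS SDK function); it takes a URL and request headers and returns the status code, response headers, and body.

```python
from typing import Callable, Dict, Optional, Tuple

# fetch(url, request_headers) -> (status, response_headers, body)
Fetch = Callable[[str, Dict[str, str]], Tuple[int, Dict[str, str], Optional[str]]]


class ETagCache:
    """Illustrative ETag cache: revalidates with If-None-Match, reuses the body on 304."""

    def __init__(self, fetch: Fetch):
        self._fetch = fetch
        self._entries: Dict[str, Tuple[str, str]] = {}  # url -> (etag, body)

    def get(self, url: str) -> str:
        headers: Dict[str, str] = {}
        cached = self._entries.get(url)
        if cached is not None:
            headers["If-None-Match"] = cached[0]
        status, response_headers, body = self._fetch(url, headers)
        if status == 304 and cached is not None:
            return cached[1]  # Not modified: serve the cached body
        if status != 200 or body is None:
            raise RuntimeError(f"unexpected response {status} for {url}")
        etag = response_headers.get("ETag")
        if etag:
            self._entries[url] = (etag, body)
        return body
```

Only the first call for a URL sends an unconditional request; later calls send the stored ETag and reuse the cached body when the server answers 304.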
Upgrading Rate Limits
Rate limits are not tied to pricing tiers—all organizations (including Developer tier) have the same rate limits. This is intentional to ensure fair access.
If you need higher rate limits for high-volume production use cases, contact sales@gaas.is to discuss custom Enterprise rate limits. Enterprise plans can negotiate:
- Higher intent submission limits (e.g., 100 req/min instead of 20)
- Dedicated API endpoints with isolated rate limits
- Burst allowances (temporary rate limit increases)
Troubleshooting
Getting 429 Errors Despite Low Traffic
Cause: Multiple services or agents sharing the same API key are collectively exceeding the limit.
Solution: Monitor X-RateLimit-Remaining headers. Consider using separate API keys for different services (requires separate organizations).
Rate Limit Not Resetting
Cause: Sliding window means the limit resets gradually as old requests age out of the 1-minute window, not all at once.
Solution: Wait at least 60 seconds from your last successful request before retrying at full speed.
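If your client records when it sent each accepted request, it can estimate when the next slot opens. This is a client-side approximation with an assumed function name; the server's counters remain authoritative.

```python
from typing import Iterable


def seconds_until_slot(timestamps: Iterable[float], limit: int, now: float,
                       window: float = 60.0) -> float:
    """Estimate the wait until one more request fits in the sliding window."""
    # Only requests sent within the last `window` seconds still count.
    recent = sorted(t for t in timestamps if t > now - window)
    if len(recent) < limit:
        return 0.0
    # Enough of the oldest requests must age out to bring the count below `limit`.
    blocker = recent[len(recent) - limit]
    return blocker + window - now
```

For 20 requests sent one second apart starting at t=0, a client checking at t=30 would wait 30 seconds for the first slot, after which one more slot opens each second.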
Auth Endpoints Blocked by IP Rate Limit
Cause: Dashboard authentication endpoints (login, signup, MFA) are limited to 5 req/min per IP address.
Solution: If multiple users share a single outbound IP (e.g., corporate NAT), requests may be rate-limited collectively. Contact support for IP allowlisting (Enterprise only).
Related Pages
- Billing & Quotas — Monthly action limits
- Advanced Features — Bulk submission to reduce rate limit impact
- Authentication — API key setup
- API Reference — Complete endpoint documentation