Rate Limits
Learn about rate limits, how they work, and best practices for handling rate limit responses in your application.
Overview
Rate limits help ensure fair usage of the API and protect against abuse. The Scoped Memory API implements rate limiting to maintain service quality for all users.
Rate Limit Information
Rate limits are applied per API key. Different plans may have different rate limits. Check your plan details for specific limits.
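Because limits are applied per API key, you can also throttle proactively on the client side so you rarely hit the server-side limit at all. The sketch below is a minimal token bucket; the 1000-requests-per-hour figure is an assumed example, not a documented limit — substitute the numbers from your own plan.

```python
import time

class TokenBucket:
    """Client-side throttle: allow up to `rate` requests per `per` seconds."""

    def __init__(self, rate, per):
        self.rate = rate                  # max requests per window
        self.per = per                    # window length in seconds
        self.tokens = rate                # start with a full bucket
        self.updated = time.monotonic()

    def acquire(self):
        now = time.monotonic()
        # Refill tokens in proportion to elapsed time, capped at the bucket size
        self.tokens = min(self.rate, self.tokens + (now - self.updated) * (self.rate / self.per))
        self.updated = now
        if self.tokens < 1:
            # Sleep just long enough for one token to accumulate, then consume it
            time.sleep((1 - self.tokens) * (self.per / self.rate))
            self.updated = time.monotonic()
            self.tokens = 0
        else:
            self.tokens -= 1

bucket = TokenBucket(rate=1000, per=3600)  # assumed example: 1000 requests/hour
bucket.acquire()                           # call before each API request
```

Calling `bucket.acquire()` before every request spreads traffic evenly across the window instead of bursting and then stalling on 429 responses.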
Rate Limit Headers
Every API response includes rate limit information in the response headers:
Rate Limit Headers
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1642680000
X-RateLimit-Limit - The maximum number of requests allowed in the current time window
X-RateLimit-Remaining - The number of requests remaining in the current time window
X-RateLimit-Reset - Unix timestamp when the rate limit window resets
Rate Limit Exceeded
When you exceed the rate limit, the API returns a 429 Too Many Requests response:
Rate Limit Error Response
{
"detail": "Rate limit exceeded. Please try again later."
}
The response will also include the rate limit headers showing when the limit will reset.
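If you call the HTTP endpoint directly, you can combine the 429 status with the X-RateLimit-Reset header to wait until the window resets before retrying. This is a minimal sketch assuming standard `requests` semantics; `post_with_rate_limit_wait` and its injectable `post` parameter are hypothetical names for illustration, not part of any SDK.

```python
import time
import requests

def post_with_rate_limit_wait(url, post=requests.post, max_attempts=3, **kwargs):
    """POST to `url`; on a 429, sleep until X-RateLimit-Reset, then retry (sketch)."""
    response = None
    for attempt in range(max_attempts):
        response = post(url, **kwargs)
        if response.status_code != 429:
            return response
        reset = response.headers.get("X-RateLimit-Reset")
        # Wait until the window resets, or fall back to exponential backoff
        wait = max(0, int(reset) - int(time.time())) if reset else 2 ** attempt
        time.sleep(wait)
    return response  # still rate limited after max_attempts
```

Making `post` a parameter keeps the helper testable without real network calls; in production you simply leave the default.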
Handling Rate Limits
Implement exponential backoff when you receive a 429 response. Here's how to handle rate limits gracefully:
Rate Limit Handling with Retry
import time
from memory_scope import MemoryScopeClient
from memory_scope.exceptions import RateLimitError
client = MemoryScopeClient(api_key="your-api-key")
def make_request_with_retry(max_retries=3):
    for attempt in range(max_retries):
        try:
            result = client.read_memory(
                user_id="user123",
                scope="preferences",
                domain="food",
                purpose="generate food recommendations"
            )
            return result
        except RateLimitError:
            if attempt < max_retries - 1:
                # Exponential backoff: wait 2^attempt seconds
                wait_time = 2 ** attempt
                print(f"Rate limit exceeded. Waiting {wait_time} seconds...")
                time.sleep(wait_time)
            else:
                # Max retries reached
                raise Exception("Rate limit exceeded. Max retries reached.")
Monitoring Rate Limits
Monitor rate limit headers to track your usage and avoid hitting limits:
Monitoring Rate Limits
import requests
response = requests.post(
    "https://api.memoryscope.dev/memory/read",
    headers={
        "X-API-Key": "your-api-key",
        "Content-Type": "application/json"
    },
    json={...}
)

# Check rate limit headers
rate_limit = response.headers.get("X-RateLimit-Limit")
remaining = response.headers.get("X-RateLimit-Remaining")
reset_time = response.headers.get("X-RateLimit-Reset")

print(f"Rate limit: {rate_limit}")
print(f"Remaining: {remaining}")
print(f"Resets at: {reset_time}")

# Alert when approaching limit
if remaining is not None and int(remaining) < 100:
    print("Warning: Approaching rate limit!")
Best Practices
- Implement Exponential Backoff: When you receive a 429 response, wait before retrying. Use exponential backoff to avoid overwhelming the API.
- Monitor Rate Limit Headers: Check rate limit headers in responses to track your usage and avoid hitting limits.
- Cache Responses: Cache API responses when appropriate to reduce the number of requests you make.
- Batch Operations: When possible, batch multiple operations into fewer requests.
- Use Webhooks: If available, use webhooks instead of polling to reduce API calls.
- Plan for Limits: Design your application to handle rate limits gracefully. Don't assume unlimited requests.
- Upgrade Plan if Needed: If you consistently hit rate limits, consider upgrading your plan for higher limits.
- Handle Gracefully: Always handle 429 responses gracefully. Don't retry immediately without waiting.
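The "Cache Responses" practice above can be sketched as a small TTL cache in front of read calls. `TTLCache` and `cached_read` are illustrative names, and the 60-second TTL is an arbitrary assumption — choose a TTL that matches how stale your data can be.

```python
import time

class TTLCache:
    """Minimal time-based cache to avoid repeating identical read requests."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]
        self._store.pop(key, None)  # expired or missing
        return None

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

cache = TTLCache(ttl_seconds=60)

def cached_read(client, **params):
    # Build a hashable key from the request parameters
    key = tuple(sorted(params.items()))
    result = cache.get(key)
    if result is None:
        result = client.read_memory(**params)  # only hit the API on a cache miss
        cache.set(key, result)
    return result
```

Every cache hit is one fewer request counted against your rate limit, which matters most for hot keys that are read far more often than they change.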
Rate Limit Windows
Rate limits are typically applied on a per-minute or per-hour basis, depending on your plan. The reset time in the headers indicates when the limit window resets. Check your plan details for specific rate limit windows.
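Whatever the window length, the reset value is a plain Unix timestamp, so converting it to a readable time works the same way for per-minute and per-hour plans; a short sketch:

```python
from datetime import datetime, timezone

def reset_to_datetime(reset_timestamp):
    """Convert an X-RateLimit-Reset Unix timestamp to an aware UTC datetime."""
    return datetime.fromtimestamp(int(reset_timestamp), tz=timezone.utc)

print(reset_to_datetime(1642680000))  # the example value from the headers above
# → 2022-01-20 12:00:00+00:00
```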