Performance Optimization
Learn how to optimize your API usage for better performance, lower latency, and reduced costs.
Performance Tips
These optimization techniques will help you build faster, more efficient applications while reducing API costs.
Use max_age_days to Filter Stale Data
Filtering out stale memories reduces response size and improves relevance. Only request data that's actually useful for your use case.
Using max_age_days
# Good: Only get recent preferences (last 30 days)
result = client.read_memory(
    user_id="user123",
    scope="preferences",
    domain="food",
    purpose="generate food recommendations",
    max_age_days=30  # Filter stale data
)

# Bad: Getting all preferences (including very old ones)
result = client.read_memory(
    user_id="user123",
    scope="preferences",
    domain="food",
    purpose="generate food recommendations"
    # No max_age_days - gets everything
)
Cache Responses Appropriately
Cache API responses to reduce the number of requests, but be mindful of revocation tokens and data freshness.
Response Caching
import time

# Simple in-process cache with a TTL
cache = {}
CACHE_TTL = 300  # 5 minutes

def get_cached_memory(user_id, scope, domain, purpose):
    cache_key = f"{user_id}:{scope}:{domain}:{purpose}"

    # Check cache
    if cache_key in cache:
        cached_data, timestamp = cache[cache_key]
        if time.time() - timestamp < CACHE_TTL:
            return cached_data

    # Cache miss - fetch from API
    result = client.read_memory(
        user_id=user_id,
        scope=scope,
        domain=domain,
        purpose=purpose
    )

    # Store in cache
    cache[cache_key] = (result, time.time())
    return result
Important: If a user revokes access, invalidate the cache for that user's data. Don't serve stale cached data after revocation.
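As a minimal sketch of that invalidation, the helper below drops every cached entry for a user from the cache dict defined above. How your application learns about a revocation (a webhook, polling, etc.) is an assumption here, not part of this API; call the helper from whatever notification path you have.
def invalidate_user_cache(user_id):
    # Cache keys are "user_id:scope:domain:purpose", so match on the user_id prefix
    stale_keys = [key for key in cache if key.startswith(f"{user_id}:")]
    for key in stale_keys:
        del cache[key]

# Call this as soon as you learn the user revoked access
invalidate_user_cache("user123")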
Batch Operations When Possible
When you need to read multiple scopes or domains, consider batching operations or using continue operations to reduce API calls.
Batching Operations
# Good: Read once, use continue for related reads
result = client.read_memory(
    user_id="user123",
    scope="preferences",
    domain="food",
    purpose="generate recommendations"
)

# Later, continue reading with the same token (no new grant)
food_prefs = client.read_memory_continue(
    revocation_token=result.revocation_token
)

# Bad: Making separate read requests
food_prefs = client.read_memory(user_id="user123", scope="preferences", domain="food", ...)
music_prefs = client.read_memory(user_id="user123", scope="preferences", domain="music", ...)
# Each creates a new grant - less efficient
Use Appropriate Scopes and Domains
Using the right scope and domain helps the API optimize merging and reduces unnecessary data processing.
Scope and Domain Selection
# Good: Specific scope and domain
client.create_memory(
    user_id="user123",
    scope="preferences",  # Correct scope
    domain="food",        # Specific domain
    value_json={"likes": ["pizza"]}
)

# Bad: Too generic
client.create_memory(
    user_id="user123",
    scope="preferences",
    domain=None,  # No domain - less organized
    value_json={"food_likes": ["pizza"], "music_likes": ["jazz"]}  # Mixed data
)
Optimize Memory Creation
Create memories efficiently by using appropriate TTL values and avoiding unnecessary duplicates.
- Use TTL for temporary data: Set ttl_days for data that should expire automatically
- Don't worry about duplicates: The API handles deduplication automatically
- Batch creates when possible: Create multiple memories in parallel if your use case allows (a sketch follows this list)
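As a rough sketch of the parallel-create case, the snippet below fans out several create_memory calls through a thread pool. The specific memories created and the assumption that the client is safe to call from multiple threads are illustrative, not guarantees of this API.
from concurrent.futures import ThreadPoolExecutor

# Hypothetical batch of memories to create for one user
memories_to_create = [
    {"domain": "food", "value_json": {"likes": ["pizza"]}},
    {"domain": "music", "value_json": {"likes": ["jazz"]}},
]

# Fan the create calls out in parallel (assumes the client is thread-safe)
with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [
        executor.submit(
            client.create_memory,
            user_id="user123",
            scope="preferences",
            domain=memory["domain"],
            source="explicit_user_input",
            value_json=memory["value_json"],
        )
        for memory in memories_to_create
    ]
    results = [future.result() for future in futures]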
Memory Creation Optimization
# Good: Set TTL for temporary preferences
client.create_memory(
    user_id="user123",
    scope="preferences",
    domain="food",
    source="explicit_user_input",
    ttl_days=7,  # Expires in 7 days
    value_json={"likes": ["seasonal_special"]}
)

# The API handles deduplication, so you can create multiple memories
# They'll be merged automatically when read
Monitor and Optimize
Monitor your API usage to identify optimization opportunities:
- Track API call frequency and patterns
- Monitor response times
- Identify unnecessary API calls
- Review cache hit rates (a simple way to track them is sketched after this list)
- Check rate limit usage
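One way to make cache hit rates measurable is to wrap the get_cached_memory helper from the Response Caching example with a pair of counters, as sketched below. It reuses cache, CACHE_TTL, time, and get_cached_memory from that example; the counter names and hit-rate helper are illustrative, not part of the API.
cache_stats = {"hits": 0, "misses": 0}

def get_cached_memory_with_stats(user_id, scope, domain, purpose):
    cache_key = f"{user_id}:{scope}:{domain}:{purpose}"
    entry = cache.get(cache_key)
    # Count whether this lookup will be served from cache or from the API
    if entry is not None and time.time() - entry[1] < CACHE_TTL:
        cache_stats["hits"] += 1
    else:
        cache_stats["misses"] += 1
    return get_cached_memory(user_id, scope, domain, purpose)

def cache_hit_rate():
    total = cache_stats["hits"] + cache_stats["misses"]
    return cache_stats["hits"] / total if total else 0.0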
Monitoring API Calls
import time
import logging
logger = logging.getLogger(__name__)
def read_with_monitoring(user_id, scope, domain, purpose):
    start_time = time.time()
    try:
        result = client.read_memory(
            user_id=user_id,
            scope=scope,
            domain=domain,
            purpose=purpose
        )
        elapsed = time.time() - start_time
        logger.info(f"API call took {elapsed:.2f}s")
        return result
    except Exception as e:
        elapsed = time.time() - start_time
        logger.error(f"API call failed after {elapsed:.2f}s: {e}")
        raise
Best Practices Summary
- Use max_age_days: Filter stale data to improve performance and relevance
- Cache responses: Reduce API calls by caching with appropriate TTL
- Use continue operations: Re-read data without creating new grants
- Choose right scopes/domains: Better organization improves API efficiency
- Set TTL appropriately: Let temporary data expire automatically
- Monitor usage: Track API calls to identify optimization opportunities
- Avoid unnecessary calls: Don't read data you don't need
- Handle rate limits: Implement exponential backoff to avoid hitting limits (see the sketch below)
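As a minimal sketch of that backoff, the wrapper below retries a read with exponentially growing delays plus jitter. Which exception actually signals a rate limit depends on the client library, so the broad except clause here is a placeholder assumption; narrow it to the client's rate-limit error in real code.
import random
import time

def read_with_backoff(max_retries=5, **read_kwargs):
    for attempt in range(max_retries):
        try:
            return client.read_memory(**read_kwargs)
        except Exception:  # Assumption: replace with the client's rate-limit error
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with jitter: ~1s, ~2s, ~4s, ~8s, ...
            delay = (2 ** attempt) + random.uniform(0, 1)
            time.sleep(delay)

prefs = read_with_backoff(
    user_id="user123",
    scope="preferences",
    domain="food",
    purpose="generate food recommendations",
)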
Related Documentation