Performance Optimization

Learn how to optimize your API usage for better performance, lower latency, and reduced costs.

Use max_age_days to Filter Stale Data

Filtering out stale memories reduces response size and improves relevance. Only request data that's actually useful for your use case.

Using max_age_days
# Good: Only get recent preferences (last 30 days)
result = client.read_memory(
    user_id="user123",
    scope="preferences",
    domain="food",
    purpose="generate food recommendations",
    max_age_days=30,  # Filter stale data
)

# Bad: Getting all preferences (including very old ones)
result = client.read_memory(
    user_id="user123",
    scope="preferences",
    domain="food",
    purpose="generate food recommendations",
    # No max_age_days - gets everything
)
Cache Responses Appropriately

Cache API responses to reduce the number of requests, but be mindful of revocation tokens and data freshness.

Response Caching
import time

# Simple in-memory cache with TTL
cache = {}
CACHE_TTL = 300  # 5 minutes

def get_cached_memory(user_id, scope, domain, purpose):
    cache_key = f"{user_id}:{scope}:{domain}:{purpose}"

    # Check cache
    if cache_key in cache:
        cached_data, timestamp = cache[cache_key]
        if time.time() - timestamp < CACHE_TTL:
            return cached_data

    # Cache miss - fetch from API
    result = client.read_memory(
        user_id=user_id,
        scope=scope,
        domain=domain,
        purpose=purpose,
    )

    # Store in cache
    cache[cache_key] = (result, time.time())
    return result
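Being "mindful of revocation tokens" means a cached entry must be dropped as soon as the underlying grant is revoked. A minimal sketch of that invalidation, assuming the `"user_id:scope:domain:purpose"` key format used in the caching example above (the helper name is illustrative, not part of the API):

```python
def invalidate_user_cache(cache, user_id):
    """Drop every cached entry for user_id; return how many were removed.

    Call this when you learn a user's grant was revoked, so stale data
    is never served from the local cache.
    """
    stale_keys = [key for key in cache if key.split(":", 1)[0] == user_id]
    for key in stale_keys:
        del cache[key]
    return len(stale_keys)
```

Scanning the whole dict is fine for small caches; for large ones, keep a per-user index of keys so invalidation stays O(1).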
Batch Operations When Possible

When you need to read multiple scopes or domains, consider batching operations or using continue operations to reduce API calls.

Batching Operations
# Good: Read once, use continue for related reads
result = client.read_memory(
    user_id="user123",
    scope="preferences",
    domain="food",
    purpose="generate recommendations",
)

# Later, continue reading with same token (no new grant)
food_prefs = client.read_memory_continue(
    revocation_token=result.revocation_token
)

# Bad: Making separate read requests
food_prefs = client.read_memory(user_id="user123", scope="preferences", domain="food", ...)
music_prefs = client.read_memory(user_id="user123", scope="preferences", domain="music", ...)
# Each creates a new grant - less efficient
Use Appropriate Scopes and Domains

Using the right scope and domain helps the API optimize merging and reduces unnecessary data processing.

Scope and Domain Selection
# Good: Specific scope and domain
client.create_memory(
    user_id="user123",
    scope="preferences",  # Correct scope
    domain="food",        # Specific domain
    value_json={"likes": ["pizza"]},
)

# Bad: Too generic
client.create_memory(
    user_id="user123",
    scope="preferences",
    domain=None,  # No domain - less organized
    value_json={"food_likes": ["pizza"], "music_likes": ["jazz"]},  # Mixed data
)
Optimize Memory Creation

Create memories efficiently by using appropriate TTL values and avoiding unnecessary duplicates.

  • Use TTL for temporary data: Set ttl_days for data that should expire automatically
  • Don't worry about duplicates: The API handles deduplication automatically
  • Batch creates when possible: Create multiple memories in parallel if your use case allows
Memory Creation Optimization
# Good: Set TTL for temporary preferences
client.create_memory(
    user_id="user123",
    scope="preferences",
    domain="food",
    source="explicit_user_input",
    ttl_days=7,  # Expires in 7 days
    value_json={"likes": ["seasonal_special"]},
)

# The API handles deduplication, so you can create multiple memories;
# they'll be merged automatically when read.
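The "batch creates when possible" tip above can be sketched with a thread pool. Here `create_fn` stands in for `client.create_memory` (passing it in keeps the sketch self-contained), and each item is the keyword arguments for one memory; the helper name is hypothetical:

```python
from concurrent.futures import ThreadPoolExecutor

def create_memories_parallel(create_fn, items, max_workers=4):
    """Submit one create call per item; return results in input order.

    items: list of dicts of keyword arguments, e.g.
        {"user_id": "user123", "scope": "preferences", "domain": "food", ...}
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(create_fn, **item) for item in items]
        # .result() re-raises any exception from the worker thread
        return [future.result() for future in futures]
```

Keep `max_workers` modest so parallel creates don't burn through your rate limit.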
Monitor and Optimize

Monitor your API usage to identify optimization opportunities:

  • Track API call frequency and patterns
  • Monitor response times
  • Identify unnecessary API calls
  • Review cache hit rates
  • Check rate limit usage
Monitoring API Calls
import logging
import time

logger = logging.getLogger(__name__)

def read_with_monitoring(user_id, scope, domain, purpose):
    start_time = time.time()
    try:
        result = client.read_memory(
            user_id=user_id,
            scope=scope,
            domain=domain,
            purpose=purpose,
        )
        elapsed = time.time() - start_time
        logger.info(f"API call took {elapsed:.2f}s")
        return result
    except Exception as e:
        elapsed = time.time() - start_time
        logger.error(f"API call failed after {elapsed:.2f}s: {e}")
        raise
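For the "review cache hit rates" item in the list above, a tiny counter is enough; a minimal sketch (the class is illustrative, not part of any SDK):

```python
class CacheStats:
    """Track cache hits and misses so you can review the hit rate."""

    def __init__(self):
        self.hits = 0
        self.misses = 0

    def record(self, hit):
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Call `stats.record(cache_key in cache)` inside your cached-read helper, then log `stats.hit_rate` periodically; a low rate suggests your TTL is too short or your keys are too fine-grained.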
Best Practices Summary
  • Use max_age_days: Filter stale data to improve performance and relevance
  • Cache responses: Reduce API calls by caching with appropriate TTL
  • Use continue operations: Re-read data without creating new grants
  • Choose right scopes/domains: Better organization improves API efficiency
  • Set TTL appropriately: Let temporary data expire automatically
  • Monitor usage: Track API calls to identify optimization opportunities
  • Avoid unnecessary calls: Don't read data you don't need
  • Handle rate limits: Implement exponential backoff to avoid hitting limits
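The last point, exponential backoff, can be sketched as a small retry wrapper. The exception class here is a local stand-in (the SDK's actual rate-limit exception name is an assumption); jitter is added so many clients don't retry in lockstep:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for the SDK's rate-limit exception."""

def with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=30.0):
    """Call fn(), retrying on RateLimitError with exponential backoff + jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # Out of retries - surface the error
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(delay + random.uniform(0, delay * 0.1))
```

Usage: `with_backoff(lambda: client.read_memory(user_id="user123", scope="preferences", domain="food", purpose="generate recommendations"))`.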