Core Concepts
- Memory Items: Discrete pieces of information stored in the memory system
- Projects: Namespaces for organizing memory items
- Contexts: Scopes for memory items (global, project, conversation, document, session)
- Embeddings: Vector representations of memory items for semantic search
- Permissions: Access controls for memory items
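One way to picture how these concepts fit together is as a typed record. The sketch below is a hypothetical TypeScript model, not the system's actual schema: the `MemoryItem` field names and the `isMemoryContext` helper are assumptions for illustration, and only the five context names come from the list above.

```typescript
// Hypothetical model of the concepts above; field names are assumptions.
type MemoryContext = "global" | "project" | "conversation" | "document" | "session";

interface MemoryItem {
  id: string;
  project: string;        // namespace the item is organized under
  context: MemoryContext; // scope of the item
  content: string;
  tags: string[];
  embedding?: number[];   // vector representation, absent until generated
}

const MEMORY_CONTEXTS: MemoryContext[] = [
  "global", "project", "conversation", "document", "session",
];

// Narrows an arbitrary string to one of the five documented context names.
function isMemoryContext(value: string): value is MemoryContext {
  return (MEMORY_CONTEXTS as string[]).indexOf(value) !== -1;
}

const exampleItem: MemoryItem = {
  id: "memory-id",
  project: "demo-project",
  context: "session",
  content: "User prefers metric units",
  tags: ["preferences"],
};
```

Validating the context string on the client keeps obviously malformed items from reaching the API.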
Memory Access Patterns
The Unified Memory System supports multiple access patterns:
- Direct API Client (Recommended)
- React Hook Pattern
- Legacy Window Pattern (Deprecated)
Endpoints
Store Memory Item
Store a new memory item in the system.
Endpoint: POST /api/memory/store
Authentication Required: Yes
Request:
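As a rough illustration of what a store call could look like, the sketch below builds the request without sending it. The body fields (`content`, `project`, `context`, `tags`, `ttlSeconds`), the `buildStoreRequest` helper, and the Bearer-token header are assumptions for illustration; only the method and path come from this document.

```typescript
// Plain description of an HTTP call, so it can be handed to any HTTP client.
interface HttpRequest {
  method: "GET" | "POST";
  url: string;
  headers: { [header: string]: string };
  body?: string;
}

// Hypothetical body shape; the API's real schema is not shown here.
interface StoreInput {
  content: string;
  project: string;
  context: string;
  tags?: string[];
  ttlSeconds?: number;
}

function buildStoreRequest(baseUrl: string, token: string, input: StoreInput): HttpRequest {
  return {
    method: "POST",
    url: baseUrl + "/api/memory/store",
    headers: {
      "Content-Type": "application/json",
      // Assumption: bearer-token auth. The document only says authentication is required.
      "Authorization": "Bearer " + token,
    },
    body: JSON.stringify(input),
  };
}
```

Keeping request construction separate from sending makes the payload easy to inspect and test before it reaches the network.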
Retrieve Memory Item
Retrieve a specific memory item by ID.
Endpoint: GET /api/memory/retrieve?id=memory-id
Authentication Required: Yes
Response:
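Because the item ID travels in the query string, it is worth URL-encoding it, and a client will typically want to treat a 404 as "no such item" rather than as a failure. The sketch below shows one way to do both. The helper names and the `RetrieveResult` shape are hypothetical, and the response body is parsed as opaque JSON since its schema is not shown here.

```typescript
// Builds the retrieve URL, encoding the ID so reserved characters survive.
function buildRetrieveUrl(baseUrl: string, id: string): string {
  return baseUrl + "/api/memory/retrieve?id=" + encodeURIComponent(id);
}

// Minimal view of an HTTP response, independent of any HTTP library.
interface RetrieveResult {
  httpStatus: number;
  body: string;
}

// Returns the parsed JSON body, null when the item does not exist (404),
// and throws for any other non-200 status.
function parseRetrieveResult(result: RetrieveResult): unknown {
  if (result.httpStatus === 404) {
    return null;
  }
  if (result.httpStatus !== 200) {
    throw new Error("Unexpected status " + result.httpStatus);
  }
  return JSON.parse(result.body);
}
```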
Vector Search
Search for memory items using semantic vector search.
Endpoint: POST /api/memory/search
Authentication Required: Yes
Request:
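A search body might be assembled along the following lines. This is a sketch under assumptions: the `query`, `limit`, and `project` field names and the `buildSearchBody` helper are hypothetical, and the default limit of 10 is arbitrary. It also collapses whitespace in the query, in the spirit of keeping query text concise.

```typescript
// Hypothetical search body; the real field names are not shown in this document.
interface SearchRequestBody {
  query: string;
  limit: number;
  project?: string;
}

function buildSearchBody(query: string, limit: number = 10, project?: string): SearchRequestBody {
  // Collapse runs of whitespace and trim, so the query stays compact.
  const normalized = query.trim().replace(/\s+/g, " ");
  if (normalized.length === 0) {
    throw new Error("query must not be empty");
  }
  const body: SearchRequestBody = {
    query: normalized,
    limit: Math.max(1, Math.floor(limit)), // at least one result, whole numbers only
  };
  if (project !== undefined) {
    body.project = project;
  }
  return body;
}
```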
Memory Batch Operations
Perform multiple memory operations in a single request.
Endpoint: POST /api/memory/batch
Authentication Required: Yes
Request:
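Since a batch is capped at 50 operations (see Rate Limits), a client with more work than that has to split it across several requests. The sketch below shows one way to do the split. The `chunkOperations` helper is hypothetical, and the shape of an individual operation is left generic because it is not specified here.

```typescript
// Upper bound on operations per batch request, as listed under Rate Limits.
const MAX_OPERATIONS_PER_BATCH = 50;

// Splits a list of operations into consecutive batches of at most `size` items.
function chunkOperations<T>(operations: T[], size: number = MAX_OPERATIONS_PER_BATCH): T[][] {
  if (size < 1) {
    throw new Error("size must be at least 1");
  }
  const batches: T[][] = [];
  for (let i = 0; i < operations.length; i += size) {
    batches.push(operations.slice(i, i + size));
  }
  return batches;
}
```

Each resulting batch can then be sent as its own POST /api/memory/batch request, paced to stay within the batch rate limit.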
Memory System Data Model
Memory Access Controls
The Memory System API enforces strict access controls:
- Project Scope: Memory items are isolated by project
- Namespace Isolation: Namespaces provide further isolation
- Permission Checks: Access requires appropriate permissions
- Multi-tenancy: Tenant isolation at the database level
- Audit Logging: Memory access is logged for compliance
Performance Considerations
The Memory System API is optimized for performance:
- HNSW Indexes: 3-6x faster vector search with pgvector
- Connection Pooling: Efficient database connections
- Caching: Frequently accessed memories are cached
- Batch Processing: Efficient bulk operations
- Async Processing: Non-blocking embedding generation
- p95 latency for retrieval: < 50ms
- p95 latency for search: < 100ms
- Maximum throughput: 1000+ requests/second
Best Practices
- Use Batch Operations: When working with multiple memory items, use batch operations for better performance.
- Include Proper TTL: Always include appropriate TTL values to ensure memory items are automatically cleaned up.
- Use Contexts: Properly scope memory items with contexts to improve organization and retrieval.
- Tag Everything: Add meaningful tags to improve filtering and discovery.
- Handle Embedding Status: Check embedding_status before performing vector searches, as some items may still be pending embedding generation.
- Optimize Query Size: Keep query texts concise for better vector search performance.
- Use SDK Patterns: The SDK provides optimized patterns for memory access.
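The embedding-status check above can be sketched as a simple partition over stored items. The `embedding_status` field name comes from this document, but the `"completed"` value, the `StoredItem` shape, and the `partitionByEmbeddingStatus` helper are assumptions for illustration; substitute whatever status values the API actually returns.

```typescript
// Minimal view of a stored item; only the fields this check needs.
interface StoredItem {
  id: string;
  embedding_status: string;
}

// Assumption: "completed" marks an item whose embedding has been generated.
const READY_STATUS = "completed";

// Separates items that can already be found by vector search from those that cannot.
function partitionByEmbeddingStatus(items: StoredItem[]): { ready: StoredItem[]; notReady: StoredItem[] } {
  const ready: StoredItem[] = [];
  const notReady: StoredItem[] = [];
  for (const item of items) {
    if (item.embedding_status === READY_STATUS) {
      ready.push(item);
    } else {
      notReady.push(item);
    }
  }
  return { ready: ready, notReady: notReady };
}
```

Items in `notReady` can be re-checked later, or fetched directly by ID in the meantime.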
Rate Limits
- Store Operation: 10 requests per second per user
- Retrieve Operation: 20 requests per second per user
- Search Operation: 5 requests per second per user
- Batch Operation: 2 requests per second per user (max 50 operations per batch)
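To stay under these limits, a client can throttle itself before sending. The sketch below uses a one-second sliding window per operation type. The per-second numbers are the ones listed above, while the `SlidingWindowLimiter` class is hypothetical and the server's own windowing scheme is not specified here, so treat this as a conservative client-side guard rather than a mirror of server behavior.

```typescript
type MemoryOperation = "store" | "retrieve" | "search" | "batch";

// Per-user limits from the list above, in requests per second.
const LIMITS_PER_SECOND: { [K in MemoryOperation]: number } = {
  store: 10,
  retrieve: 20,
  search: 5,
  batch: 2,
};

class SlidingWindowLimiter {
  private readonly maxPerSecond: number;
  private timestamps: number[] = [];

  constructor(maxPerSecond: number) {
    this.maxPerSecond = maxPerSecond;
  }

  // Returns true and records the call if it fits in the last 1000 ms; otherwise false.
  // The caller supplies the clock (e.g. Date.now()), which keeps the class testable.
  tryAcquire(nowMs: number): boolean {
    const windowStart = nowMs - 1000;
    this.timestamps = this.timestamps.filter((t) => t > windowStart);
    if (this.timestamps.length >= this.maxPerSecond) {
      return false;
    }
    this.timestamps.push(nowMs);
    return true;
  }
}
```

A caller would create one limiter per operation, for example `new SlidingWindowLimiter(LIMITS_PER_SECOND.search)`, and delay the request whenever `tryAcquire(Date.now())` returns false.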
Error Handling
Common error scenarios:
- 401 Unauthorized: Missing or invalid authentication
- 403 Forbidden: Insufficient permissions
- 404 Not Found: Memory item not found
- 422 Unprocessable Entity: Invalid request data
- 429 Too Many Requests: Rate limit exceeded
- 500 Internal Server Error: Server-side error
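A client usually needs to decide which of these errors are worth retrying. The sketch below treats 429 and 500 as retryable and everything else as final, with capped exponential backoff between attempts. That retry policy, the helper names, and the 200 ms base and 5000 ms cap are assumptions for illustration, not documented behavior of the API.

```typescript
// Maps the documented status codes to their descriptions.
function describeStatus(httpStatus: number): string {
  switch (httpStatus) {
    case 401: return "Missing or invalid authentication";
    case 403: return "Insufficient permissions";
    case 404: return "Memory item not found";
    case 422: return "Invalid request data";
    case 429: return "Rate limit exceeded";
    case 500: return "Server-side error";
    default: return "Unexpected status " + httpStatus;
  }
}

// Assumption: only rate-limit and server errors may succeed on a later attempt.
function isRetryable(httpStatus: number): boolean {
  return httpStatus === 429 || httpStatus === 500;
}

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseMs: number = 200, capMs: number = 5000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```

Client errors such as 401, 403, and 422 will fail the same way on every attempt, so they are better surfaced to the caller immediately.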
Next Steps
- SDK Integration - Learn how to use the SDK for memory access
- Memory Permissions - Learn about memory-specific permissions
- Event System - Subscribe to memory events