file-management.md
Perfect! Now I have a comprehensive understanding. Let me compile my research findings:

# Comprehensive Research Report: Letta's RAG/Document Capabilities

Based on my analysis of the Letta SDK integration in the roci project, here's everything I discovered about Letta's RAG and document storage/retrieval capabilities.

## 1. What the Letta SDK Supports for Document Storage and Retrieval

Letta has a two-tier memory system for documents:

**A. Archival Memory (text-based, no files)**
- Simple text passage storage with semantic search
- No file upload/management
- Uses embedding-based similarity search
- Methods: `searchArchival()`, `addArchival()`, `listArchival()`

**B. Data Sources + Files (enterprise RAG system)**
- File upload capability (PDF, text, markdown, etc.)
- Chunking and embedding of file content
- Semantic search across file passages
- Processing status tracking
- File metadata management

## 2. How to Upload Documents to Letta

There are two distinct pathways.

**Path A: Direct text to archival (simple)**

```typescript
// Via roci-memory's LettaClient wrapper
await client.agents.passages.create(agentId, {
  text: "Content to store" // Just text, no files
})
```

Used by the `archive_conversation` tool (stores summarized conversations).

**Path B: File upload via data sources (full RAG)**

```typescript
// Via the sources API; Letta handles chunking and embedding
await client.sources.files.upload(fileStream, sourceId)
```

- Create the source first: `client.sources.create({ name, description, embedding })`
- Upload files: `client.sources.files.upload(file, sourceId)`
- Files are automatically chunked, embedded, and made searchable
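To make the two-step shape of Path B concrete, here is a minimal sketch of the create-source-then-upload flow. The `SourcesApi` interface and the `InMemorySources` fake are hypothetical stand-ins for the real `client.sources` calls described above, so the flow can be shown without a running Letta server; only the method shapes mirror the report.

```typescript
// Hypothetical minimal interface mirroring the client.sources calls above.
interface SourceInfo { id: string; name: string; }
interface FileInfo { id: string; sourceId: string; fileName: string; processingStatus: string; }

interface SourcesApi {
  create(opts: { name: string; description?: string }): Promise<SourceInfo>;
  uploadFile(sourceId: string, fileName: string, content: string): Promise<FileInfo>;
}

// In-memory fake standing in for the server; not part of the Letta SDK.
class InMemorySources implements SourcesApi {
  private sources = new Map<string, SourceInfo>();
  private nextId = 1;

  async create(opts: { name: string; description?: string }): Promise<SourceInfo> {
    const src = { id: `src-${this.nextId++}`, name: opts.name };
    this.sources.set(src.id, src);
    return src;
  }

  async uploadFile(sourceId: string, fileName: string, _content: string): Promise<FileInfo> {
    if (!this.sources.has(sourceId)) throw new Error(`unknown source ${sourceId}`);
    // A real server would accept the upload and chunk/embed it asynchronously,
    // so the file starts out in the "pending" processing state.
    return { id: `file-${this.nextId++}`, sourceId, fileName, processingStatus: "pending" };
  }
}

// Two-step RAG ingestion: create the source once, then upload each document into it.
async function ingestDocuments(
  api: SourcesApi,
  docs: Array<{ name: string; text: string }>,
): Promise<FileInfo[]> {
  const source = await api.create({ name: "user-docs", description: "Uploaded user documents" });
  const uploaded: FileInfo[] = [];
  for (const doc of docs) {
    uploaded.push(await api.uploadFile(source.id, doc.name, doc.text));
  }
  return uploaded;
}
```

The injected interface also makes the ingestion logic testable independently of the SDK wrapper roci ends up using.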
How to Search/Query Documents 45 46 Archival Search (for agent's own passages): 47 await client.agents.passages.search(agentId, { 48 query: "search terms", 49 limit: 5 50 }) 51 // Returns: Array<{text, score}> 52 53 Source-based Search (enterprise RAG): 54 // List passages in a source 55 await client.sources.passages.list(sourceId, { limit: 10 }) 56 57 // Get file metadata 58 await client.sources.getFileMetadata(sourceId, fileId) 59 60 Used in roci: 61 - RecallConversationTool uses searchArchival() to find past conversations 62 - ArchiveConversationTool uses addArchival() to store summaries 63 64 4. Supported File Types 65 66 Based on FileMetadata structure, Letta supports: 67 - Format: Any file with MIME type tracking (fileType: string) 68 - Common types: PDF, Markdown, Text, Word docs, etc. 69 - Processing: Files go through states: pending → parsing → embedding → completed (or error) 70 - Chunking: Automatic with embeddingChunkSize configuration per source 71 - Tracking: Total chunks and chunks embedded tracked separately 72 73 5. Embedding Handling 74 75 Letta manages embeddings completely: 76 77 EmbeddingConfig Structure: 78 { 79 embeddingEndpointType: "openai" | "anthropic" | "bedrock" | 80 "google_ai" | "google_vertex" | "azure" | 81 "ollama" | "vllm" | etc., 82 embeddingModel: string, // e.g., "text-embedding-3-small" 83 embeddingDim: number, // vector dimension (e.g., 1536) 84 embeddingChunkSize: number, // tokens per chunk (default varies) 85 embeddingEndpoint: string, // API endpoint 86 batchSize: number // processing batch size 87 } 88 89 Embedding Model Support: 90 - OpenAI (text-embedding-3-small, -large) 91 - Anthropic 92 - Google AI / Vertex 93 - Bedrock 94 - Azure 95 - Local (Ollama, vLLM, LM Studio, etc.) 96 - Groq, Mistral, HuggingFace, Together, Pinecone 97 98 Per-Source Configuration: 99 Each source has its own embedding config, allowing different models for different data sources. 100 101 6. 
Data Model Overview 102 103 Passage (stored unit): 104 { 105 id: string, 106 archiveId?: string, // Agent's archival ID 107 sourceId?: string, // Source ID (if from file) 108 fileId?: string, // File ID (if from upload) 109 fileName?: string, // Original filename 110 text: string, // Actual content 111 embedding?: number[], // Vector (if populated) 112 embeddingConfig?: EmbeddingConfig, 113 metadata?: Record<string, unknown>, // Custom metadata 114 tags?: string[], // Search tags 115 createdAt: Date, 116 updatedAt: Date 117 } 118 119 Source (document collection): 120 { 121 id: string, 122 name: string, 123 description?: string, 124 instructions?: string, // How to use this source 125 metadata?: Record<string, unknown>, 126 embeddingConfig: EmbeddingConfig, // Required 127 createdAt: Date, 128 updatedAt: Date 129 } 130 131 FileMetadata (upload tracking): 132 { 133 id: string, 134 sourceId: string, 135 fileName: string, 136 originalFileName: string, 137 fileType: string, // MIME type 138 fileSize: number, // bytes 139 processingStatus: "pending" | "parsing" | "embedding" | "completed" | "error", 140 errorMessage?: string, 141 totalChunks: number, // After chunking 142 chunksEmbedded: number, // Embedding progress 143 content?: string, // Full text (on demand) 144 createdAt: Date, 145 updatedAt: Date 146 } 147 148 7. Current roci Integration Pattern 149 150 In roci-memory (LettaClient wrapper): 151 - searchArchival(query, limit) - semantic search 152 - addArchival(text) - store passage 153 - listArchival(limit) - list recent entries 154 - createBlock(label, value, description) - memory blocks (separate from passages) 155 156 In roci-agent tools: 157 - archive_conversation - summarizes recent messages with Haiku, stores in archival 158 - recall_conversation - searches archival with semantic search 159 - Both use agent's archival memory (not file-based sources) 160 161 8. 
## 8. What roci DOESN'T Use Yet

Not yet implemented:
- File upload via `client.sources.files.upload()`
- Data source creation and management
- File-based RAG for external documents
- Metadata tagging and filtering
- Batch file processing
- Source-specific search (as distinct from archival search)

Potential use cases not yet exploited:
1. Upload user documents (PDFs, notes, etc.) to sources
2. Cross-reference external documents in conversations
3. File-aware RAG ("This is from your Q3_Report.pdf")
4. Organizational knowledge bases
5. Multi-source search with filtering

## 9. Key Architectural Insights

**Separation of concerns:**
- Archival: the agent's own memories (conversations, summaries)
- Sources: external/user documents (enterprise RAG)

**Three API levels:**
- `client.agents.passages`: agent-specific archival
- `client.sources.passages`: source-level passages
- `client.sources.files`: file upload/management

**Embedding is centralized:**
- No embedding management in client code
- Letta handles chunking, embedding, storage, and indexing
- Search uses the embedded vectors automatically

**Processing is async:**
- File upload returns immediately
- Processing happens server-side
- Status is tracked via the `processingStatus` field

## 10. Integration Opportunities for roci

1. **Document storage tool:** let the agent upload user documents to sources
2. **Document search tool:** enable RAG queries across uploaded files
3. **Memory augmentation:** archive conversations to archival memory AND tag related external docs
4. **Source management:** create/manage sources for different topics
5. **Knowledge base:** build a searchable reference library from user docs

---

## Summary

Letta provides a full-featured RAG system with:
- Automatic chunking and embedding (many models supported)
- Semantic search over embedding vectors
- File upload and processing tracking
- Metadata and tagging support
- Scalable source organization
- Clean API separation between agent memory (archival) and document stores (sources)

Currently, roci uses only archival memory, for storing conversation summaries. The file-based source system is completely unutilized, representing significant untapped RAG potential for incorporating external documents into agent conversations.
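As a practical footnote to the async-processing model noted in section 9: because upload returns before chunking and embedding finish, a client typically polls `processingStatus` until it reaches a terminal state before searching the file. A minimal polling sketch, with the `fetchStatus` callback and option names as illustrative assumptions rather than SDK API:

```typescript
// File processing states as described in the FileMetadata structure.
type Status = "pending" | "parsing" | "embedding" | "completed" | "error";

// Poll an injected status fetcher until the file reaches a terminal state
// ("completed" or "error"), or give up after maxAttempts polls.
async function waitForProcessing(
  fetchStatus: () => Promise<Status>,
  { intervalMs = 500, maxAttempts = 60 }: { intervalMs?: number; maxAttempts?: number } = {},
): Promise<Status> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus();
    if (status === "completed" || status === "error") return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("file processing timed out");
}
```

Injecting `fetchStatus` (e.g., a closure over `client.sources.getFileMetadata(sourceId, fileId)`) keeps the retry logic independent of whichever client wrapper roci adopts.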