file-management.md
Perfect! Now I have a comprehensive understanding. Let me compile my research findings:

# Comprehensive Research Report: Letta's RAG/Document Capabilities

Based on my analysis of the Letta SDK integration in the roci project, here's everything I discovered about Letta's RAG and document storage/retrieval capabilities.

## 1. What the Letta SDK Supports for Document Storage and Retrieval

Letta has a two-tier memory system for documents:

**A. Archival Memory (text-based, no files)**
- Simple text passage storage with semantic search
- No file upload/management
- Uses embedding-based similarity search
- Methods: `searchArchival()`, `addArchival()`, `listArchival()`

**B. Data Sources + Files (enterprise RAG system)**
- File upload capability (PDF, text, markdown, etc.)
- Chunking and embedding of file content
- Semantic search across file passages
- Processing status tracking
- File metadata management

## 2. How to Upload Documents to Letta

There are two distinct pathways.

**Path A: Direct text to archival (simple)**

```typescript
// Via roci-memory's LettaClient wrapper
await client.agents.passages.create(agentId, {
  text: "Content to store" // Just text, no files
})
```

Used by the `archive_conversation` tool (stores summarized conversations).

**Path B: File upload via data sources (full RAG)**

```typescript
// Via the sources API; Letta handles chunking and embedding
await client.sources.files.upload(fileStream, sourceId)
```

- Create the source first: `client.sources.create({ name, description, embedding })`
- Upload files: `client.sources.files.upload(file, sourceId)`
- Files are automatically chunked, embedded, and made searchable
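To make the two-step shape of Path B concrete, here is a minimal sketch of the create-source-then-upload flow. The `SourcesApi` interface and the `InMemorySources` fake are hypothetical stand-ins for the real `client.sources` calls described above, so the flow can be shown without a running Letta server; only the method shapes mirror the report.

```typescript
// Hypothetical minimal interface mirroring the client.sources calls above.
interface SourceInfo { id: string; name: string; }
interface FileInfo { id: string; sourceId: string; fileName: string; processingStatus: string; }

interface SourcesApi {
  create(opts: { name: string; description?: string }): Promise<SourceInfo>;
  uploadFile(sourceId: string, fileName: string, content: string): Promise<FileInfo>;
}

// In-memory fake standing in for the server; not part of the Letta SDK.
class InMemorySources implements SourcesApi {
  private sources = new Map<string, SourceInfo>();
  private nextId = 1;

  async create(opts: { name: string; description?: string }): Promise<SourceInfo> {
    const src = { id: `src-${this.nextId++}`, name: opts.name };
    this.sources.set(src.id, src);
    return src;
  }

  async uploadFile(sourceId: string, fileName: string, _content: string): Promise<FileInfo> {
    if (!this.sources.has(sourceId)) throw new Error(`unknown source ${sourceId}`);
    // A real server would accept the upload and chunk/embed it asynchronously,
    // so the file starts out in the "pending" processing state.
    return { id: `file-${this.nextId++}`, sourceId, fileName, processingStatus: "pending" };
  }
}

// Two-step RAG ingestion: create the source once, then upload each document into it.
async function ingestDocuments(
  api: SourcesApi,
  docs: Array<{ name: string; text: string }>,
): Promise<FileInfo[]> {
  const source = await api.create({ name: "user-docs", description: "Uploaded user documents" });
  const uploaded: FileInfo[] = [];
  for (const doc of docs) {
    uploaded.push(await api.uploadFile(source.id, doc.name, doc.text));
  }
  return uploaded;
}
```

The injected interface also makes the ingestion logic testable independently of the SDK wrapper roci ends up using.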
How to Search/Query Documents 45 46 Archival Search (for agent's own passages): 47 await client.agents.passages.search(agentId, { 48 query: "search terms", 49 limit: 5 50 }) 51 // Returns: Array<{text, score}> 52 53 Source-based Search (enterprise RAG): 54 // List passages in a source 55 await client.sources.passages.list(sourceId, { limit: 10 }) 56 57 // Get file metadata 58 await client.sources.getFileMetadata(sourceId, fileId) 59 60 Used in roci: 61 - RecallConversationTool uses searchArchival() to find past conversations 62 - ArchiveConversationTool uses addArchival() to store summaries 63 64 4. Supported File Types 65 66 Based on FileMetadata structure, Letta supports: 67 - Format: Any file with MIME type tracking (fileType: string) 68 - Common types: PDF, Markdown, Text, Word docs, etc. 69 - Processing: Files go through states: pending → parsing → embedding → completed (or error) 70 - Chunking: Automatic with embeddingChunkSize configuration per source 71 - Tracking: Total chunks and chunks embedded tracked separately 72 73 5. Embedding Handling 74 75 Letta manages embeddings completely: 76 77 EmbeddingConfig Structure: 78 { 79 embeddingEndpointType: "openai" | "anthropic" | "bedrock" | 80 "google_ai" | "google_vertex" | "azure" | 81 "ollama" | "vllm" | etc., 82 embeddingModel: string, // e.g., "text-embedding-3-small" 83 embeddingDim: number, // vector dimension (e.g., 1536) 84 embeddingChunkSize: number, // tokens per chunk (default varies) 85 embeddingEndpoint: string, // API endpoint 86 batchSize: number // processing batch size 87 } 88 89 Embedding Model Support: 90 - OpenAI (text-embedding-3-small, -large) 91 - Anthropic 92 - Google AI / Vertex 93 - Bedrock 94 - Azure 95 - Local (Ollama, vLLM, LM Studio, etc.) 96 - Groq, Mistral, HuggingFace, Together, Pinecone 97 98 Per-Source Configuration: 99 Each source has its own embedding config, allowing different models for different data sources. 100 101 6. 
Data Model Overview 102 103 Passage (stored unit): 104 { 105 id: string, 106 archiveId?: string, // Agent's archival ID 107 sourceId?: string, // Source ID (if from file) 108 fileId?: string, // File ID (if from upload) 109 fileName?: string, // Original filename 110 text: string, // Actual content 111 embedding?: number[], // Vector (if populated) 112 embeddingConfig?: EmbeddingConfig, 113 metadata?: Record<string, unknown>, // Custom metadata 114 tags?: string[], // Search tags 115 createdAt: Date, 116 updatedAt: Date 117 } 118 119 Source (document collection): 120 { 121 id: string, 122 name: string, 123 description?: string, 124 instructions?: string, // How to use this source 125 metadata?: Record<string, unknown>, 126 embeddingConfig: EmbeddingConfig, // Required 127 createdAt: Date, 128 updatedAt: Date 129 } 130 131 FileMetadata (upload tracking): 132 { 133 id: string, 134 sourceId: string, 135 fileName: string, 136 originalFileName: string, 137 fileType: string, // MIME type 138 fileSize: number, // bytes 139 processingStatus: "pending" | "parsing" | "embedding" | "completed" | "error", 140 errorMessage?: string, 141 totalChunks: number, // After chunking 142 chunksEmbedded: number, // Embedding progress 143 content?: string, // Full text (on demand) 144 createdAt: Date, 145 updatedAt: Date 146 } 147 148 7. Current roci Integration Pattern 149 150 In roci-memory (LettaClient wrapper): 151 - searchArchival(query, limit) - semantic search 152 - addArchival(text) - store passage 153 - listArchival(limit) - list recent entries 154 - createBlock(label, value, description) - memory blocks (separate from passages) 155 156 In roci-agent tools: 157 - archive_conversation - summarizes recent messages with Haiku, stores in archival 158 - recall_conversation - searches archival with semantic search 159 - Both use agent's archival memory (not file-based sources) 160 161 8. 
## 8. What roci DOESN'T Use Yet

Not yet implemented:
- File upload via `client.sources.files.upload()`
- Data source creation and management
- File-based RAG for external documents
- Metadata tagging and filtering
- Batch file processing
- Source-specific search (as distinct from archival search)

Potential use cases not yet exploited:
1. Upload user documents (PDFs, notes, etc.) to sources
2. Cross-reference external documents in conversations
3. File-aware RAG ("This is from your Q3_Report.pdf")
4. Organizational knowledge bases
5. Multi-source search with filtering

## 9. Key Architectural Insights

**Separation of concerns:**
- Archival: the agent's own memories (conversations, summaries)
- Sources: external/user documents (enterprise RAG)

**Three API levels:**
- `client.agents.passages`: agent-specific archival
- `client.sources.passages`: source-level passages
- `client.sources.files`: file upload/management

**Embedding is centralized:**
- No embedding management in client code
- Letta handles chunking, embedding, storage, and indexing
- Search uses the embedded vectors automatically

**Processing is async:**
- File upload returns immediately
- Processing happens server-side
- Status is tracked via the `processingStatus` field

## 10. Integration Opportunities for roci

1. **Document storage tool:** let the agent upload user documents to sources
2. **Document search tool:** enable RAG queries across uploaded files
3. **Memory augmentation:** archive conversations to archival memory AND tag related external docs
4. **Source management:** create/manage sources for different topics
5. **Knowledge base:** build a searchable reference library from user docs

---

## Summary

Letta provides a full-featured RAG system with:
- Automatic chunking and embedding (many models supported)
- Semantic search over embedding vectors
- File upload and processing tracking
- Metadata and tagging support
- Scalable source organization
- Clean API separation between agent memory (archival) and document stores (sources)

Currently, roci uses only archival memory, for storing conversation summaries. The file-based source system is completely unutilized, representing significant untapped RAG potential for incorporating external documents into agent conversations.
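As a practical footnote to the async-processing model noted in section 9: because upload returns before chunking and embedding finish, a client typically polls `processingStatus` until it reaches a terminal state before searching the file. A minimal polling sketch, with the `fetchStatus` callback and option names as illustrative assumptions rather than SDK API:

```typescript
// File processing states as described in the FileMetadata structure.
type Status = "pending" | "parsing" | "embedding" | "completed" | "error";

// Poll an injected status fetcher until the file reaches a terminal state
// ("completed" or "error"), or give up after maxAttempts polls.
async function waitForProcessing(
  fetchStatus: () => Promise<Status>,
  { intervalMs = 500, maxAttempts = 60 }: { intervalMs?: number; maxAttempts?: number } = {},
): Promise<Status> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus();
    if (status === "completed" || status === "error") return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("file processing timed out");
}
```

Injecting `fetchStatus` (e.g., a closure over `client.sources.getFileMetadata(sourceId, fileId)`) keeps the retry logic independent of whichever client wrapper roci adopts.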