Read-Only PDS Implementation Plan

Overview

Implement a read-only PDS that loads repositories from CAR files on startup and serves AT Protocol read endpoints while rejecting all write operations with authentication errors.

Architecture Approach

Create a new @pds/readonly package with a CLI tool that:

1. Parses CAR files to extract repository data
2. Populates SQLite storage for efficient querying
3. Serves read-only XRPC endpoints
4. Returns AuthenticationRequired for all write endpoints

Files to Create

1. CAR Parser - /packages/core/src/car.js

The codebase has buildCarFile() but no parser. Implement:

    export function readVarint(bytes, offset)        // Decode varint, return [value, newOffset]
    export function parseCarFile(carBytes)           // Returns { roots: string[], blocks: Map }
    export async function* iterateCarBlocks(bytes)   // Memory-efficient streaming

Uses existing: cborDecode, cidToString from repo.js

2. Repository Loader - /packages/core/src/loader.js

Extract and index repository data from a parsed CAR:

    export async function loadRepositoryFromCar(carBytes, actorStorage)
    // 1. Parse CAR, get root CID (commit)
    // 2. Decode commit: { did, version, rev, prev, data (MST root), sig }
    // 3. Walk MST using walkMst() from mst.js
    // 4. Populate: blocks, records, commits, metadata tables

3. Read-Only Package - /packages/readonly/

    packages/readonly/
      package.json
      src/
        index.js    # createReadOnlyServer(options)
        cli.js      # CLI entry point

CLI Usage:

    pds-readonly \
      --car ./repos/did:plc:abc123.car \
      --car ./repos/did:plc:xyz789.car \
      --blobs ./blobs \
      --port 3000

4. Multi-Repository Support

Use per-DID SQLite databases with a routing layer:

    class MultiRepoManager {
      repos = new Map() // did -> { db, actorStorage }

      async loadCar(carPath) {
        // Create DB, load CAR, register DID
      }

      getStorage(did) {
        return this.repos.get(did)?.actorStorage
      }
    }

Files to Modify

/packages/core/src/pds.js

Add a read-only mode with guards on write handlers:

    // In constructor:
    this.readOnly = options.readOnly ?? false;

    // Guard for write endpoints:
    if (this.readOnly) {
      return Response.json(
        { error: 'AuthenticationRequired', message: 'This PDS is read-only' },
        { status: 401 }
      );
    }

Write endpoints to guard:

- handleInit
- handleCreateSession, handleRefreshSession
- handleCreateRecord, handlePutRecord, handleDeleteRecord, handleApplyWrites
- handleUploadBlob
- handlePutPreferences
- OAuth: handleOAuthPar, handleOAuthToken, handleOAuthRevoke

/packages/core/src/index.js

Export new modules:

    export * from './car.js';
    export * from './loader.js';

/package.json (root)

Add to workspaces:

    "workspaces": ["packages/*", "examples/*"]

The readonly package is auto-included via packages/*.

Read-Only Endpoint Behavior

┌────────────────────┬───────────┬─────────────────────────────────┐
│ Endpoint           │ Status    │ Notes                           │
├────────────────────┼───────────┼─────────────────────────────────┤
│ describeServer     │ Supported │ Indicates read-only in response │
│ describeRepo       │ Supported │ Returns repo metadata           │
│ getRecord          │ Supported │ Retrieve single record          │
│ listRecords        │ Supported │ Paginated record listing        │
│ listRepos          │ Supported │ Lists all loaded DIDs           │
│ getRepoStatus      │ Supported │ Status for specific DID         │
│ getRepo            │ Supported │ Full CAR export                 │
│ sync.getRecord     │ Supported │ Record with MST proof           │
│ getLatestCommit    │ Supported │ Latest commit info              │
│ getBlob            │ Supported │ If blobs directory provided     │
│ listBlobs          │ Supported │ List blob CIDs                  │
│ subscribeRepos     │ Partial   │ Historical events only          │
│ resolveHandle      │ Supported │ Handle-to-DID resolution        │
│ All POST endpoints │ 401       │ AuthenticationRequired error    │
└────────────────────┴───────────┴─────────────────────────────────┘

Blob Handling

CAR files don't contain blobs (blobs are stored separately). Options:

1. Filesystem directory (recommended): Point to existing blob storage

       --blobs ./blobs/
       # Structure: blobs/{did}/{shard}/{cid}

2. No blobs: getBlob returns 404, listBlobs returns empty

Deployment

Docker

    FROM node:20-slim
    WORKDIR /app
    COPY . .
    RUN npm install && npm run build
    ENTRYPOINT ["node", "packages/readonly/src/cli.js"]
    CMD ["--port", "3000"]

Docker Compose

    services:
      pds-readonly:
        build: .
        ports:
          - "3000:3000"
        volumes:
          - ./data/repos:/repos:ro
          - ./data/blobs:/blobs:ro
        # A list-form command is not run through a shell, so the glob below
        # reaches the CLI unexpanded; the CLI must expand it itself
        # (or read PDS_CAR_DIR instead).
        command:
          - --car
          - /repos/*.car
          - --blobs
          - /blobs
          - --port
          - "3000"

Environment Variables

    PDS_PORT=3000
    PDS_CAR_DIR=/repos            # Directory containing .car files
    PDS_BLOBS_DIR=/blobs          # Optional blob storage
    PDS_HOSTNAME=pds.example.com  # Public hostname

Verification Steps

1. Unit tests: CAR parser roundtrip with buildCarFile
2. Loader tests: Load test CAR, verify storage populated
3. Integration tests:
   - Start server, call listRepos - returns loaded DIDs
   - Call getRecord - returns record data
   - Call createRecord - returns 401
4. E2E test: Export from real PDS via getRepo, import to read-only, verify data matches

Implementation Sequence

1. Phase 1: CAR parser (car.js) + tests
2. Phase 2: Repository loader (loader.js) + tests
3. Phase 3: Read-only guards in pds.js
4. Phase 4: @pds/readonly package with CLI
5. Phase 5: Multi-repo routing layer
6. Phase 6: Documentation and deployment configs

Design Decisions

- Multi-repo: Support multiple CAR files from the start, each representing a different DID
- Blobs: Filesystem directory support (--blobs ./blobs/ with structure blobs/{did}/{shard}/{cid})
- WebSocket: subscribeRepos serves historical events only (from cursor), with no live updates since the data is static

Obtaining CAR Files

CAR files can be obtained via:

1. Export from existing PDS: GET /xrpc/com.atproto.sync.getRepo?did=
2. Relay/BGS: Many relays provide repo export endpoints
3. Direct backup: If you have database access to a PDS

Example: Export a repo

    curl "https://pds.example.com/xrpc/com.atproto.sync.getRepo?did=did:plc:abc123" \
      -o did_plc_abc123.car

Blob Export

Blobs must be copied separately. From a PDS with filesystem blobs:

    cp -r /path/to/pds/blobs// ./blobs//

Complete Deployment Example

    # 1. Create data directories
    mkdir -p data/repos data/blobs

    # 2. Export repositories
    curl "https://bsky.social/xrpc/com.atproto.sync.getRepo?did=did:plc:abc123" \
      -o data/repos/abc123.car

    # 3. Copy blobs (if available)
    # Blobs would need to come from the original PDS's blob storage

    # 4. Start read-only PDS
    docker compose up -d

    # 5. Verify
    curl http://localhost:3000/xrpc/com.atproto.sync.listRepos
    # Returns: { "repos": [{ "did": "did:plc:abc123", ... }] }

    # Quote the URL so the shell does not treat & as a background operator
    curl "http://localhost:3000/xrpc/com.atproto.repo.listRecords?repo=did:plc:abc123&collection=app.bsky.feed.post"
    # Returns posts from the loaded repository
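
Appendix: CAR Framing Sketch

The Phase 1 parser has to split a CAR file into its length-prefixed sections before anything can be decoded. The following is a minimal sketch of that framing step, not the planned car.js. It assumes CIDv1 identifiers and leaves the header as raw bytes instead of decoding it with cborDecode. readVarint follows the signature given in the plan; splitCarSections and readCidEnd are hypothetical helper names introduced here for illustration.

```javascript
// Sketch only: splitCarSections and readCidEnd are illustrative names,
// not part of the existing codebase.

/** Decode an unsigned LEB128 varint. Returns [value, newOffset]. */
function readVarint(bytes, offset = 0) {
  let value = 0;
  let shift = 0;
  let pos = offset;
  while (true) {
    if (pos >= bytes.length) throw new RangeError('Truncated varint');
    const byte = bytes[pos++];
    // Multiply instead of bit-shifting so values above 2^31 stay exact.
    value += (byte & 0x7f) * 2 ** shift;
    if ((byte & 0x80) === 0) return [value, pos];
    shift += 7;
    // Cap at 7 bytes (49 bits) to stay inside Number's safe integer range.
    if (shift > 42) throw new RangeError('Varint too large');
  }
}

/** Return the offset just past a binary CIDv1 that starts at `start`. */
function readCidEnd(bytes, start) {
  const [version, afterVersion] = readVarint(bytes, start);
  if (version !== 1) throw new Error(`Unsupported CID version ${version}`);
  const [, afterCodec] = readVarint(bytes, afterVersion);   // content codec
  const [, afterHashCode] = readVarint(bytes, afterCodec);  // multihash code
  const [digestLen, digestStart] = readVarint(bytes, afterHashCode);
  return digestStart + digestLen;
}

/**
 * Split a CAR file into its raw header bytes and a list of
 * { cid, data } blocks. Both fields are views into `carBytes`.
 */
function splitCarSections(carBytes) {
  const [headerLen, headerStart] = readVarint(carBytes, 0);
  const headerEnd = headerStart + headerLen;
  if (headerEnd > carBytes.length) throw new RangeError('Truncated CAR header');
  const header = carBytes.subarray(headerStart, headerEnd);

  const blocks = [];
  let pos = headerEnd;
  while (pos < carBytes.length) {
    // Each section is: varint(length) | CID | block data
    const [sectionLen, sectionStart] = readVarint(carBytes, pos);
    const sectionEnd = sectionStart + sectionLen;
    if (sectionEnd > carBytes.length) throw new RangeError('Truncated CAR section');
    const cidEnd = readCidEnd(carBytes, sectionStart);
    if (cidEnd > sectionEnd) throw new RangeError('CID overruns its section');
    blocks.push({
      cid: carBytes.subarray(sectionStart, cidEnd),
      data: carBytes.subarray(cidEnd, sectionEnd),
    });
    pos = sectionEnd;
  }
  return { header, blocks };
}
```

A full parseCarFile would presumably decode the header with cborDecode to obtain the roots and key the block map by cidToString, but how those existing helpers are called is not shown in this plan, so that step is left out of the sketch.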