Monorepo for wisp.place, a static site hosting service built on top of the AT Protocol.

docs

+351 -50
+1
docs/astro.config.mjs
···
  {
    label: 'Guides',
    items: [
+     { label: 'Architecture', slug: 'architecture' },
      { label: 'Self-Hosting', slug: 'deployment' },
      { label: 'Monitoring & Metrics', slug: 'monitoring' },
      { label: 'Redirects & Rewrites', slug: 'redirects' },
+211
docs/src/content/docs/architecture.md
···
+ ---
+ title: Architecture Guide
+ description: How the hosting service, firehose service, and tiered storage work together
+ ---
+
+ Wisp.place's serving infrastructure is split into two microservices: the **firehose service** (write path) and the **hosting service** (read path). They communicate through S3-compatible storage and Redis pub/sub.
+
+ ## Service Overview
+
+ ### Firehose Service
+
+ The firehose service watches the AT Protocol firehose (Jetstream WebSocket) for `place.wisp.fs` and `place.wisp.settings` record changes. When a site is created, updated, or deleted, it:
+
+ 1. Downloads all blobs from the user's PDS
+ 2. Decompresses gzipped content
+ 3. Rewrites HTML for subdirectory serving (absolute paths become relative)
+ 4. Writes the processed files to S3 (or disk)
+ 5. Publishes a cache invalidation event to Redis
+
+ The firehose service is **write-only** — it never serves requests to end users.
+
+ **Key configuration:**
+
+ ```bash
+ # Firehose connection
+ FIREHOSE_URL="wss://jetstream2.us-east.bsky.network/subscribe"
+
+ # S3 storage (recommended for production)
+ S3_BUCKET="wisp-sites"
+ S3_REGION="auto"
+ S3_ENDPOINT="https://your-account.r2.cloudflarestorage.com"
+ S3_ACCESS_KEY_ID="..."
+ S3_SECRET_ACCESS_KEY="..."
+
+ # Redis for cache invalidation
+ REDIS_URL="redis://localhost:6379"
+
+ # Concurrency control
+ FIREHOSE_CONCURRENCY=5  # Max parallel event processing
+ ```
+
+ **Backfill mode:** Start with `--backfill` to do a one-time bulk sync of all existing sites from the database into the cache.
+
+ ### Hosting Service
+
+ The hosting service is a **read-only** CDN built with Node.js and Hono. It serves static files from a three-tier cache and handles routing for custom domains, wisp subdomains, and direct URLs.
+
+ On each request, the hosting service:
+
+ 1. Resolves the site from the request hostname/path
+ 2. Looks up the file in tiered storage (hot → warm → cold)
+ 3. On a cache miss, fetches from the PDS on-demand and populates the cache
+ 4. Applies HTML path rewriting if serving from a subdirectory
+ 5. Processes `_redirects` rules
+ 6. Serves the file with appropriate headers
+
+ The hosting service subscribes to Redis pub/sub for cache invalidation messages from the firehose service. When it receives an invalidation, it evicts the affected entries from its hot and warm tiers so the next request fetches fresh content.
+
+ ## Tiered Storage
+
+ The `@wispplace/tiered-storage` package implements a three-tier cascading cache. Reads check the tiers in order from hot to cold; writes cascade **down** through every tier.
+
+ ```
+ Read path:  Hot (memory) → Warm (disk) → Cold (S3/disk)
+ Write path: Hot ← Warm ← Cold (writes cascade down through all tiers)
+ ```
+
+ ### Hot Tier (Memory)
+
+ - **Implementation:** In-memory LRU cache
+ - **Eviction:** Size-based (bytes) and count-based (max items)
+ - **Use case:** Frequently accessed files (index.html, CSS, JS)
+ - **Lost on restart** — repopulated from warm/cold tiers on access
+
+ ```bash
+ HOT_CACHE_SIZE=104857600    # 100 MB (default)
+ HOT_CACHE_COUNT=500         # Max items
+ ```
+
+ ### Warm Tier (Disk)
+
+ - **Implementation:** Filesystem with human-readable paths
+ - **Eviction:** Configurable — `lru` (default), `fifo`, or `size`
+ - **Structure:** `cache/sites/{did}/{sitename}/path/to/file`
+ - **Survives restarts** — provides fast local reads without network calls
+
+ ```bash
+ WARM_CACHE_SIZE=10737418240   # 10 GB (default)
+ WARM_EVICTION_POLICY=lru      # lru, fifo, or size
+ CACHE_DIR=./cache/sites
+ ```
+
+ The warm tier is optional when S3 is configured. Without S3, disk acts as the cold (source of truth) tier.
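The hot tier's eviction behavior (an LRU bounded by both total bytes and item count) can be illustrated with a minimal sketch. This is not the `@wispplace/tiered-storage` implementation: the `HotTier` class and `HotTierOptions` names are hypothetical, and the two limits only mirror what `HOT_CACHE_SIZE` and `HOT_CACHE_COUNT` are described as controlling.

```typescript
// Hypothetical sketch: HotTier and HotTierOptions are illustrative names,
// not the actual @wispplace/tiered-storage API.
interface HotTierOptions {
  maxBytes: number; // assumed to correspond to HOT_CACHE_SIZE
  maxItems: number; // assumed to correspond to HOT_CACHE_COUNT
}

class HotTier {
  // A Map iterates in insertion order, so its first key is the least recently used.
  private readonly entries = new Map<string, Uint8Array>();
  private totalBytes = 0;
  private readonly opts: HotTierOptions;

  constructor(opts: HotTierOptions) {
    this.opts = opts;
  }

  get(key: string): Uint8Array | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert the entry so it becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: Uint8Array): void {
    this.delete(key); // drop any previous version first
    if (value.byteLength > this.opts.maxBytes) return; // could never fit
    this.entries.set(key, value);
    this.totalBytes += value.byteLength;
    this.evict();
  }

  delete(key: string): boolean {
    const existing = this.entries.get(key);
    if (existing === undefined) return false;
    this.totalBytes -= existing.byteLength;
    return this.entries.delete(key);
  }

  get size(): number {
    return this.entries.size;
  }

  get bytes(): number {
    return this.totalBytes;
  }

  // Remove least recently used entries until both limits are satisfied.
  private evict(): void {
    while (
      this.entries.size > this.opts.maxItems ||
      this.totalBytes > this.opts.maxBytes
    ) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.delete(oldest);
    }
  }
}
```

Checking both limits in one loop means a burst of tiny files is capped by the item count, while a few large files are capped by the byte budget.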
+
+ ### Cold Tier (S3 or Disk)
+
+ - **With S3:** The firehose service writes here; the hosting service reads (read-only wrapper)
+ - **Without S3:** A disk-based tier serves as both warm and cold
+ - **Compatible with:** Cloudflare R2, MinIO, AWS S3, or any S3-compatible endpoint
+
+ ```bash
+ S3_BUCKET="wisp-sites"
+ S3_REGION="auto"
+ S3_ENDPOINT="https://your-account.r2.cloudflarestorage.com"
+ S3_ACCESS_KEY_ID="..."
+ S3_SECRET_ACCESS_KEY="..."
+ S3_METADATA_BUCKET="wisp-metadata"  # Optional, recommended for production
+ ```
+
+ ### Tier Placement Rules
+
+ Not all files are placed on every tier. The hosting service uses placement rules to keep the hot tier efficient:
+
+ | File Pattern | Tiers | Rationale |
+ |---|---|---|
+ | `index.html`, `*.css`, `*.js` | Hot, Warm, Cold | Critical for page loads |
+ | Rewritten HTML (`.rewritten/`) | Hot, Warm, Cold | Pre-processed for fast serving |
+ | Images, fonts, media (`*.jpg`, `*.woff2`, etc.) | Warm, Cold | Already compressed, large — skip memory |
+ | Everything else | Warm, Cold | Default placement |
+
+ ### Promotion and Bootstrap
+
+ When a file is found in a lower tier but not a higher one, it's **eagerly promoted** upward. For example, a cache miss on hot that hits warm will copy the file into hot for future requests.
+
+ On startup, the hosting service can **bootstrap** tiers:
+ - Hot bootstraps from warm by loading the most-accessed items
+ - Warm bootstraps from cold by loading recently written items
+
+ ## Cache Invalidation
+
+ The firehose service and hosting service communicate through Redis pub/sub:
+
+ ```
+ Firehose Service                 Hosting Service
+      │                                  │
+      │ ── Redis pub/sub ──────────────→ │
+      │    (wisp:revalidate)             │
+      │                                  │
+      │ Site updated/deleted:            │ Receives invalidation:
+      │ 1. Write new files to S3         │ 1. Evict from hot tier
+      │ 2. Publish invalidation          │ 2. Evict from warm tier
+      │                                  │ 3. Next request fetches fresh
+ ```
+
+ If Redis is not configured, the hosting service still works — it just won't receive real-time invalidation and will rely on TTL-based expiry (default 14 days) and on-demand fetching.
+
+ ## On-Demand Cache Population
+
+ When the hosting service receives a request for a site that isn't in any cache tier, it fetches directly from the user's PDS:
+
+ 1. Resolves the user's DID to their PDS endpoint
+ 2. Downloads the `place.wisp.fs` record
+ 3. Fetches the requested blob
+ 4. Decompresses and processes the file
+ 5. Stores it in the appropriate tiers based on placement rules
+ 6. Serves the response
+
+ This means the hosting service works even without the firehose service running — it just won't have pre-populated caches.
+
+ ## Deployment Scenarios
+
+ ### Minimal (Disk Only)
+
+ No S3 or Redis required. The hosting service uses disk as both warm and cold tier. Best for small deployments or development.
+
+ ```bash
+ # Hosting service only
+ CACHE_DIR=./cache/sites
+ HOT_CACHE_SIZE=104857600
+ ```
+
+ ### Production (S3 + Redis)
+
+ The firehose service pre-populates S3 and notifies the hosting service of changes via Redis. Multiple hosting service instances can share the same S3 backend.
+
+ ```bash
+ # Both services
+ S3_BUCKET=wisp-sites
+ S3_ENDPOINT=https://account.r2.cloudflarestorage.com
+ REDIS_URL=redis://localhost:6379
+
+ # Hosting service
+ HOT_CACHE_SIZE=104857600
+ WARM_CACHE_SIZE=10737418240
+ ```
+
+ ### Scaled (Multiple Hosting Instances)
+
+ Run multiple hosting service instances behind a load balancer. Each has its own hot and warm tiers, but they share the S3 cold tier and receive the same Redis invalidation events.
+
+ ```
+             Load Balancer
+            /      |      \
+     Hosting-1  Hosting-2  Hosting-3
+     (hot+warm) (hot+warm) (hot+warm)
+            \      |      /
+            S3 (cold tier)
+                   |
+            Firehose Service
+ ```
+
+ ## Observability
+
+ Both services expose internal observability endpoints:
+
+ - `/__internal__/observability/logs` — Recent log entries
+ - `/__internal__/observability/errors` — Error log entries
+ - `/__internal__/observability/metrics` — Prometheus-format metrics
+ - `/__internal__/observability/cache` — Cache tier statistics (hosting service only)
+
+ See [Monitoring & Metrics](/monitoring) for Grafana integration details.
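The read path, placement rules, promotion, and invalidation described in the architecture guide can be tied together in one small sketch. Everything named here (`Tier`, `MapTier`, `tiersFor`, `read`, `invalidate`) is hypothetical and only illustrates the documented behavior. The in-memory `MapTier` stands in for the real disk and S3 tiers, and matching rewritten HTML by a `.rewritten/` path segment is an assumption about the key layout.

```typescript
// Hypothetical sketch: these names are illustrative, not the package's real API.
type TierName = "hot" | "warm" | "cold";

interface Tier {
  get(key: string): Promise<Uint8Array | undefined>;
  set(key: string, value: Uint8Array): Promise<void>;
  delete(key: string): Promise<void>;
}

// In-memory stand-in so the sketch runs without disk or S3.
class MapTier implements Tier {
  private readonly data = new Map<string, Uint8Array>();
  async get(key: string): Promise<Uint8Array | undefined> {
    return this.data.get(key);
  }
  async set(key: string, value: Uint8Array): Promise<void> {
    this.data.set(key, value);
  }
  async delete(key: string): Promise<void> {
    this.data.delete(key);
  }
}

// Placement rules from the table: index.html, CSS, JS, and rewritten HTML
// may live in the hot tier; everything else stays on warm and cold only.
function tiersFor(path: string): TierName[] {
  const hotEligible =
    path.includes(".rewritten/") || // assumed key layout for rewritten HTML
    path.endsWith("index.html") ||
    path.endsWith(".css") ||
    path.endsWith(".js");
  return hotEligible ? ["hot", "warm", "cold"] : ["warm", "cold"];
}

const READ_ORDER: TierName[] = ["hot", "warm", "cold"];

// Check hot, then warm, then cold; on a hit, copy the file into every
// higher tier that missed and that the placement rules allow.
async function read(
  tiers: Record<TierName, Tier>,
  path: string,
): Promise<Uint8Array | undefined> {
  const allowed = tiersFor(path);
  const missed: TierName[] = [];
  for (const name of READ_ORDER) {
    const value = await tiers[name].get(path);
    if (value !== undefined) {
      for (const upper of missed) {
        if (allowed.includes(upper)) await tiers[upper].set(path, value);
      }
      return value;
    }
    missed.push(name);
  }
  return undefined; // a real service would fall back to the PDS here
}

// On an invalidation message, evict from hot and warm; cold is left alone
// because the firehose service has already written the new files there.
async function invalidate(
  tiers: Record<TierName, Tier>,
  path: string,
): Promise<void> {
  await tiers.hot.delete(path);
  await tiers.warm.delete(path);
}
```

With this shape, a hosting instance that misses an invalidation message still serves a consistent file from its own tiers until the entry expires, which matches the TTL fallback the guide describes for deployments without Redis.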
+110 -38
docs/src/content/docs/deployment.md
···
  description: Deploy your own Wisp.place instance
  ---

- This guide covers deploying your own Wisp.place instance. Wisp.place consists of two services: the main backend (handles OAuth, uploads, domains) and the hosting service (serves cached sites).
+ This guide covers deploying your own Wisp.place instance. Wisp.place consists of three services: the main backend (handles OAuth, uploads, domains), the firehose service (watches the AT Protocol firehose and populates the cache), and the hosting service (serves cached sites). See the [Architecture Guide](/architecture) for a detailed breakdown of how these services work together.

  ## Prerequisites

  - **PostgreSQL** database (14 or newer)
- - **Bun** runtime for the main backend
+ - **Bun** runtime for the main backend and firehose service
  - **Node.js** (18+) for the hosting service
  - **Caddy** (optional, for custom domain TLS)
  - **Domain name** for your instance
+ - **S3-compatible storage** (optional, recommended for production — Cloudflare R2, MinIO, etc.)
+ - **Redis** (optional, for real-time cache invalidation between services)

  ## Architecture Overview

  ```
- ┌─────────────────────────────────────────┐   ┌─────────────────────────────────────────┐
- │  Main Backend (port 8000)               │   │  Hosting Service (port 3001)            │
- │  - OAuth authentication                 │   │  - Firehose listener                    │
- │  - Site upload/management               │   │  - Site caching                         │
- │  - Domain registration                  │   │  - Content serving                      │
- │  - Admin panel                          │   │  - Redirect handling                    │
- └─────────────────────────────────────────┘   └─────────────────────────────────────────┘
-                      │                                             │
-                      └─────────────────┬───────────────────────────┘
-
-                   ┌─────────────────────────────────────────┐
-                   │  PostgreSQL Database                    │
-                   │  - User sessions                        │
-                   │  - Domain mappings                      │
-                   │  - Site metadata                        │
-                   └─────────────────────────────────────────┘
+ ┌──────────────────────────┐  ┌──────────────────────────┐  ┌──────────────────────────┐
+ │ Main Backend (:8000)     │  │ Firehose Service         │  │ Hosting Service (:3001)  │
+ │ - OAuth authentication   │  │ - Watches AT firehose    │  │ - Tiered cache (mem/     │
+ │ - Site upload/manage     │  │ - Downloads blobs        │  │   disk/S3)               │
+ │ - Domain registration    │  │ - Writes to S3/disk      │  │ - Content serving        │
+ │ - Admin panel            │  │ - Publishes invalidation │  │ - Redirect handling      │
+ └──────────────────────────┘  └──────────────────────────┘  └──────────────────────────┘
+              │                        │         │                     │
+              │                        │ S3/Disk │ Redis pub/sub       │
+              └────────┬───────────────┘         └─────────────────────┘
+
+  ┌─────────────────────────────────────────┐
+  │  PostgreSQL Database                    │
+  │  - User sessions                        │
+  │  - Domain mappings                      │
+  │  - Site metadata                        │
+  └─────────────────────────────────────────┘
  ```

  ## Database Setup
···

  Admin panel is available at `https://yourdomain.com/admin`

+ ## Firehose Service Setup
+
+ The firehose service watches the AT Protocol firehose for site changes and pre-populates the cache. It is **write-only** — it never serves requests to users.
+
+ ### Environment Variables
+
+ ```bash
+ # Required
+ DATABASE_URL="postgres://user:password@localhost:5432/wisp"
+
+ # S3 storage (recommended for production)
+ S3_BUCKET="wisp-sites"
+ S3_REGION="auto"
+ S3_ENDPOINT="https://your-account.r2.cloudflarestorage.com"
+ S3_ACCESS_KEY_ID="..."
+ S3_SECRET_ACCESS_KEY="..."
+ S3_METADATA_BUCKET="wisp-metadata"  # Optional, recommended
+
+ # Redis (for notifying hosting service of changes)
+ REDIS_URL="redis://localhost:6379"
+
+ # Firehose
+ FIREHOSE_URL="wss://jetstream2.us-east.bsky.network/subscribe"
+ FIREHOSE_CONCURRENCY=5              # Max parallel event processing
+
+ # Optional
+ CACHE_DIR="./cache/sites"           # Fallback if S3 not configured
+ ```
+
+ ### Installation
+
+ ```bash
+ cd firehose-service
+
+ # Install dependencies
+ bun install
+
+ # Production mode
+ bun run start
+
+ # With backfill (one-time bulk sync of all existing sites)
+ bun run start -- --backfill
+ ```
+
+ The firehose service will:
+ 1. Connect to the AT Protocol firehose (Jetstream)
+ 2. Filter for `place.wisp.fs` and `place.wisp.settings` events
+ 3. Download blobs, decompress, and rewrite HTML paths
+ 4. Write files to S3 (or disk)
+ 5. Publish cache invalidation events to Redis
+
  ## Hosting Service Setup

- The hosting service is a separate microservice that serves cached sites.
+ The hosting service is a **read-only** CDN that serves cached sites through a three-tier storage system (memory, disk, S3).

  ### Environment Variables

  ```bash
  # Required
  DATABASE_URL="postgres://user:password@localhost:5432/wisp"
- BASE_HOST="wisp.place"    # Same as main backend
+ BASE_HOST="wisp.place"         # Same as main backend
+
+ # Tiered storage
+ HOT_CACHE_SIZE=104857600       # Hot tier: 100 MB (memory, LRU)
+ HOT_CACHE_COUNT=500            # Max items in hot tier
+
+ WARM_CACHE_SIZE=10737418240    # Warm tier: 10 GB (disk, LRU)
+ WARM_EVICTION_POLICY="lru"     # lru, fifo, or size
+ CACHE_DIR="./cache/sites"      # Warm tier directory
+
+ # S3 cold tier (same bucket as firehose service, read-only)
+ S3_BUCKET="wisp-sites"
+ S3_REGION="auto"
+ S3_ENDPOINT="https://your-account.r2.cloudflarestorage.com"
+ S3_ACCESS_KEY_ID="..."
+ S3_SECRET_ACCESS_KEY="..."
+ S3_METADATA_BUCKET="wisp-metadata"
+
+ # Redis (receive cache invalidation from firehose service)
+ REDIS_URL="redis://localhost:6379"

  # Optional
- PORT="3001"                 # Default: 3001
- CACHE_DIR="./cache/sites"   # Site cache directory
- CACHE_ONLY_MODE="false"     # Set true to disable DB writes
+ PORT="3001"                    # Default: 3001
  ```

  ### Installation
···

  # Production mode
  npm run start
-
- # With backfill (downloads all sites from DB on startup)
- npm run start -- --backfill
  ```

  The hosting service will:
- 1. Connect to PostgreSQL
- 2. Start firehose listener (watches for new sites)
- 3. Create cache directory
- 4. Serve sites on port 3001
+ 1. Initialize tiered storage (hot → warm → cold)
+ 2. Subscribe to Redis for cache invalidation events
+ 3. Serve sites on port 3001

- ### Cache Management
+ ### Cache Behavior
+
+ Files are cached across three tiers with automatic promotion:
+
+ - **Hot (memory):** Fastest, limited by `HOT_CACHE_SIZE`. Cleared on restart.
+ - **Warm (disk):** Fast local reads at `CACHE_DIR`. Survives restarts.
+ - **Cold (S3):** Shared source of truth, populated by firehose service.

- Sites are cached to disk at `./cache/sites/{did}/{sitename}/`. The cache is automatically populated:
- - **On first request**: Downloads from PDS and caches
- - **Via firehose**: Updates when sites are deployed
- - **Backfill mode**: Downloads all sites from database on startup
+ On a cache miss at all tiers, the hosting service fetches directly from the user's PDS and promotes the file into the appropriate tiers.
+
+ **Without S3:** Disk acts as both warm and cold tier. The hosting service still works — it just relies on on-demand fetching instead of pre-populated S3 cache.

  ## Reverse Proxy Setup

···

  ## Scaling Considerations

- - **Multiple hosting instances**: Run multiple hosting services behind a load balancer
+ - **Multiple hosting instances**: Run multiple hosting services behind a load balancer — each has its own hot/warm tiers but shares the S3 cold tier and Redis invalidation
  - **Separate databases**: Split read/write with replicas
  - **CDN**: Put Cloudflare or Bunny in front for global caching
- - **Cache storage**: Use NFS/S3 for shared cache across instances
- - **Redis**: Add Redis for session storage at scale
+ - **S3 cold tier**: Shared storage across all hosting instances (Cloudflare R2, MinIO, AWS S3)
+ - **Redis**: Required for real-time cache invalidation between firehose and hosting services at scale

  ## Security Notes

+29 -12
docs/src/content/docs/index.mdx
···

  The deployment process starts when you upload your files. Each file is compressed with gzip, base64-encoded, and uploaded as a blob to your PDS. A `place.wisp.fs` record then stores the complete site structure with references to these blobs, creating a verifiable manifest of your site.

- Hosting services continuously watch the AT Protocol firehose for new and updated sites. When your site is first accessed or updated, the hosting service downloads the manifest and blobs, caching them locally for optimized delivery. Custom domains work through DNS verification, allowing your site to be served from your own domain while maintaining the cryptographic guarantees of the AT Protocol.
+ The **firehose service** continuously watches the AT Protocol firehose for new and updated sites. When a site is created or updated, it downloads the manifest and blobs from the PDS, writes them to S3 (or disk), and publishes a cache invalidation event via Redis. The **hosting service** is a read-only CDN that serves files from a three-tier cache (memory, disk, S3). When a file isn't in cache, the hosting service fetches it on-demand from the PDS and promotes it through the tiers.
+
+ Custom domains work through DNS verification, allowing your site to be served from your own domain while maintaining the cryptographic guarantees of the AT Protocol.
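The blob encoding mentioned above (gzip, then base64) and the matching decode step on the serving side can be sketched with Node's built-in `zlib` module. The helper names `encodeForUpload` and `decodeFromBlob` are hypothetical; the actual CLI and services may structure this step differently.

```typescript
import { gzipSync, gunzipSync } from "node:zlib";

// Hypothetical helper: compress a file with gzip, then base64-encode the result.
function encodeForUpload(content: Uint8Array): string {
  return gzipSync(content).toString("base64");
}

// Hypothetical helper: reverse the encoding to recover the original bytes.
function decodeFromBlob(encoded: string): Buffer {
  return gunzipSync(Buffer.from(encoded, "base64"));
}

// Round trip: the decoded bytes should match the original file.
const html = Buffer.from("<h1>Hello from wisp.place</h1>", "utf8");
const blob = encodeForUpload(html);
const restored = decodeFromBlob(blob);
console.log(restored.equals(html));
```

Base64 adds roughly a third to the size of the compressed bytes, so the savings come from gzip and are largest for text assets such as HTML, CSS, and JS.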

  ## Architecture Overview

···
  │ (Rust Binary)│          │   Website    │
  │              │          │  (React UI)  │
  └──────────────┘          └──────────────┘
-        │                         │
         │                         │
         ▼                         ▼
  ┌─────────────────────────────────────────────────────────┐
···
  │                                                         │
  │  ┌──────────────────────────────────────────────┐       │
  │  │ Blobs (gzipped + base64 encoded)             │       │
- │  │   - index.html                               │       │
- │  │   - styles.css                               │       │
- │  │   - assets/*                                 │       │
+ │  │   - index.html, styles.css, assets/*         │       │
  │  └──────────────────────────────────────────────┘       │
  └─────────────────────────────────────────────────────────┘

···


  ┌─────────────────────────────────────────────────────────┐
- │                  Wisp Hosting Service                   │
+ │              Firehose Service (Write Path)              │
+ │                                                         │
+ │  - Watches firehose for place.wisp.fs changes           │
+ │  - Downloads blobs from PDS                             │
+ │  - Writes cached files to S3 / disk                     │
+ │  - Publishes cache invalidation via Redis               │
+ └─────────────────────────────────────────────────────────┘
+          │                              │
+          │ (S3 / Disk)                  │ (Redis pub/sub)
+          ▼                              ▼
+ ┌─────────────────────────────────────────────────────────┐
+ │               Hosting Service (Read Path)               │
  │                                                         │
  │  ┌──────────────────────────────────────────────┐       │
- │  │ Cache (Disk + In-Memory)                     │       │
- │  │   - Downloads sites on first access          │       │
- │  │   - Auto-updates on firehose events          │       │
- │  │   - LRU eviction for memory limits           │       │
+ │  │ Tiered Storage                               │       │
+ │  │ ┌──────┐   ┌──────┐   ┌──────────────┐       │       │
+ │  │ │ Hot  │ → │ Warm │ → │     Cold     │       │       │
+ │  │ │(Mem) │   │(Disk)│   │  (S3/Disk)   │       │       │
+ │  │ └──────┘   └──────┘   └──────────────┘       │       │
+ │  │ On miss: fetch from PDS and promote up       │       │
  │  └──────────────────────────────────────────────┘       │
  │                                                         │
  │  ┌──────────────────────────────────────────────┐       │
···
                        └─────────────┘
  ```

+ For a detailed breakdown of the services and storage system, see the [Architecture Guide](/architecture).
+
  ## Tech Stack

  - **Backend**: Bun + Elysia + PostgreSQL
  - **Frontend**: React 19 + Tailwind 4 + Radix UI
- - **Hosting**: Node.js + Hono
+ - **Hosting Service**: Node.js + Hono
+ - **Firehose Service**: Bun
  - **CLI**: Rust + Jacquard (AT Protocol library)
  - **Protocol**: AT Protocol OAuth + custom lexicons
+ - **Storage**: S3-compatible (Cloudflare R2, MinIO, etc.) + Redis for cache invalidation

  ## Limits

···

  ## Getting Started

  - [CLI Documentation](/cli) - Deploy sites from the command line
- - [Deployment Guide](/deployment) - Configure domains, redirects, and hosting
+ - [Architecture Guide](/architecture) - How hosting, firehose, and tiered storage work
+ - [Self-Hosting Guide](/deployment) - Deploy your own instance
  - [Lexicons](/lexicons) - AT Protocol record schemas and data structures

  ## Links