WIP! A BB-style forum, on the ATmosphere! We're still working... we'll be back soon when we have something to show off!
node typescript hono htmx atproto

feat(appview): backfill & repo sync — ATB-13 (#54)

* docs: add backfill and repo sync design (ATB-13)

Approved design for gap detection, collection-based repo sync via
existing Indexer handlers, DB-backed progress tracking with resume,
and async admin API for manual backfill triggers.

* docs: add backfill implementation plan (ATB-13)

12-task TDD plan covering DB schema, gap detection, repo sync,
orchestration with progress tracking, firehose integration,
admin API endpoints, and AppContext wiring.

* feat(db): add backfill_progress and backfill_errors tables (ATB-13)

Add two tables to support crash-resilient backfill:
- backfill_progress: tracks job state, DID counts, and resume cursor
- backfill_errors: per-DID error log with FK to backfill_progress

* feat(appview): add backfill configuration fields (ATB-13)

Add three new optional config fields with sensible defaults:
- backfillRateLimit (default 10): max XRPC requests/sec per PDS
- backfillConcurrency (default 10): max DIDs processed concurrently
- backfillCursorMaxAgeHours (default 48): cursor age threshold for CatchUp

Declare env vars in turbo.json so Turbo passes them through to tests.
Update test helpers (app-context.test.ts, test-context.ts) for new fields.
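The defaults above can be sketched as a small config loader. This is an illustration only — the env var names and the `loadBackfillConfig` helper are assumptions, not the actual config code:

```typescript
// Sketch of the three backfill config fields with their documented defaults.
// Env var names here are assumptions for illustration.
interface BackfillConfig {
  backfillRateLimit: number;         // max XRPC requests/sec per PDS
  backfillConcurrency: number;       // max DIDs processed concurrently
  backfillCursorMaxAgeHours: number; // cursor age threshold for CatchUp
}

function loadBackfillConfig(env: Record<string, string | undefined>): BackfillConfig {
  return {
    backfillRateLimit: Number(env.BACKFILL_RATE_LIMIT ?? 10),
    backfillConcurrency: Number(env.BACKFILL_CONCURRENCY ?? 10),
    backfillCursorMaxAgeHours: Number(env.BACKFILL_CURSOR_MAX_AGE_HOURS ?? 48),
  };
}
```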

* feat(appview): add getCursorAgeHours to CursorManager (ATB-13)

Add method to calculate cursor age in hours from microsecond Jetstream
timestamps. Used by BackfillManager gap detection to determine if
backfill is needed when cursor is too old.
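The conversion is straightforward unit arithmetic — a minimal sketch, assuming the cursor is a Unix timestamp in microseconds (the function name follows the commit; the exact signature is an assumption):

```typescript
// Cursor age in hours from a microsecond Jetstream timestamp.
function getCursorAgeHours(cursorMicros: number, nowMs: number = Date.now()): number {
  const cursorMs = cursorMicros / 1000;          // microseconds -> milliseconds
  return (nowMs - cursorMs) / (1000 * 60 * 60);  // milliseconds -> hours
}
```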

* feat(appview): add BackfillManager with gap detection (ATB-13)

- Add BackfillManager class with checkIfNeeded() and getIsRunning()
- BackfillStatus enum: NotNeeded, CatchUp, FullSync
- Gap detection logic: null cursor → FullSync, empty DB → FullSync,
stale cursor (>backfillCursorMaxAgeHours) → CatchUp, fresh → NotNeeded
- Structured JSON logging for all decision paths
- 4 unit tests covering all decision branches
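The decision tree above can be sketched as a pure function. Inputs are simplified for illustration (the real method queries the DB and CursorManager itself), so the signature is an assumption:

```typescript
// Gap-detection decision tree: null cursor or empty DB -> FullSync,
// stale cursor -> CatchUp, fresh cursor -> NotNeeded.
enum BackfillStatus {
  NotNeeded = "not_needed",
  CatchUp = "catch_up",
  FullSync = "full_sync",
}

function checkIfNeeded(
  hasForumRows: boolean,
  cursor: number | null,
  cursorAgeHours: number | null,
  maxAgeHours: number,
): BackfillStatus {
  if (cursor === null) return BackfillStatus.FullSync;  // no cursor ever stored
  if (!hasForumRows) return BackfillStatus.FullSync;    // empty DB
  if (cursorAgeHours !== null && cursorAgeHours > maxAgeHours) {
    return BackfillStatus.CatchUp;                      // stale cursor
  }
  return BackfillStatus.NotNeeded;                      // fresh cursor
}
```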

* fix(appview): add DB error handling and fix null guard in BackfillManager (ATB-13)

- Wrap forums DB query in try-catch; return FullSync on error (fail safe)
- Replace destructuring with results[0] so forum stays in scope after the try block
- Use non-null assertion on getCursorAgeHours since cursor is proven non-null at that point
- Remove redundant null ternary in NotNeeded log payload (ageHours is always a number)
- Add test: returns FullSync when DB query fails (fail safe)

* feat(appview): add syncRepoRecords with event adapter (ATB-13)

* fix(appview): correct event adapter shape and add guard logging in BackfillManager (ATB-13)

* feat(appview): add performBackfill orchestration with progress tracking (ATB-13)

* fix(appview): mark backfill as failed on error, fix type and concurrent mutation (ATB-13)

* fix(appview): resolve TypeScript closure narrowing with const capture (ATB-13)

TypeScript cannot narrow let variables through async closure boundaries.
Replace backfillId! non-null assertions inside batch.map closures with
a const resolvedBackfillId captured immediately after the insert.
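The pattern looks like this — illustrated here with synchronous closures and placeholder values, since the point is the narrowing, not the DB call:

```typescript
// TypeScript will not narrow a `let` of type `number | null` inside
// closures, so capture it into a `const` once it is known non-null.
let backfillId: number | null = null;
backfillId = 42; // stands in for the id returned by the insert

// Without this, each closure would need a `backfillId!` assertion.
const resolvedBackfillId: number = backfillId;

const batch = ["did:plc:alice", "did:plc:bob"]; // hypothetical DIDs
const labels = batch.map((did) => () => `${did}#${resolvedBackfillId}`);
const results = labels.map((fn) => fn());
```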

* test(appview): add CatchUp path coverage for performBackfill (ATB-13)

Add two tests exercising the Phase 2 (CatchUp) branch:
- Aggregates counts correctly across 2 users × 2 collections × 1 record
- Rejected batch callbacks (backfillErrors insert failure) increment
totalErrors via the allSettled rejected branch instead of being silently swallowed

Phase 1 mocks now explicitly return empty pages for all 5 forum-owned
collections so counts are isolated to Phase 2 user records.

* feat(appview): add interrupted backfill resume (ATB-13)

- Add checkForInterruptedBackfill() to query backfill_progress for any in_progress row
- Add resumeBackfill() to continue a CatchUp from lastProcessedDid without re-running Phase 1
- Add gt to drizzle-orm imports for the WHERE did > lastProcessedDid predicate
- Cover both methods with 6 new tests (null result, found row, resume counts, no-op complete, isRunning cleanup, concurrency guard)
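The resume semantics mirror the drizzle `gt(did, lastProcessedDid)` predicate: only DIDs that sort strictly after the checkpoint are reprocessed. A pure sketch of that filter (the helper name is hypothetical):

```typescript
// DIDs still to process when resuming a CatchUp from a checkpoint.
function didsRemaining(allDids: string[], lastProcessedDid: string | null): string[] {
  if (lastProcessedDid === null) return [...allDids]; // nothing processed yet
  return allDids.filter((did) => did > lastProcessedDid); // lexicographic gt
}
```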

* feat(appview): integrate backfill check into FirehoseService.start() (ATB-13)

- Add BackfillManager setter/getter to FirehoseService for DI wiring
- Run checkForInterruptedBackfill and resumeBackfill before Jetstream starts
- Fall back to gap detection (checkIfNeeded/performBackfill) when no interrupted backfill
- Expose getIndexer() for BackfillManager wiring in Task 10
- Add 5 Backfill Integration tests covering CatchUp, NotNeeded, resume, no-manager, and getIndexer()
- Add missing handleBoard/handleRole handlers to Indexer mock

* feat(appview): add admin backfill endpoints (ATB-13)

- POST /api/admin/backfill: trigger backfill (202), check if needed (200), or force with ?force=catch_up|full_sync
- GET /api/admin/backfill/:id: fetch progress row with error count
- GET /api/admin/backfill/:id/errors: list per-DID errors for a backfill
- Add backfillManager field to AppContext (null until Task 10 wires it up)
- Add backfillProgress/backfillErrors cleanup to test-context for isolation
- Fix health.test.ts to include backfillManager: null in mock AppContext
- 16 tests covering auth, permissions, 409 conflict, 503 unavailable, 200/202 success cases, 404/400 errors
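The status codes the trigger endpoint returns can be sketched as a selection function implied by those tests. Names and shapes here are assumptions, not the actual route code:

```typescript
// Response code for POST /api/admin/backfill, per the cases tested above.
type Force = "catch_up" | "full_sync" | null;

function triggerResponseCode(
  managerWired: boolean,
  isRunning: boolean,
  needed: boolean,
  force: Force,
): number {
  if (!managerWired) return 503;            // backfillManager null until wired
  if (isRunning) return 409;                // a backfill is already in progress
  if (force !== null || needed) return 202; // async trigger accepted
  return 200;                               // nothing to do: report "not needed"
}
```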

* feat(appview): wire BackfillManager into AppContext and startup (ATB-13)

* docs: add backfill Bruno collection and update plan (ATB-13)

- Add bruno/AppView API/Admin/ with three .bru files:
- Trigger Backfill (POST /api/admin/backfill, ?force param docs)
- Get Backfill Status (GET /api/admin/backfill/:id)
- Get Backfill Errors (GET /api/admin/backfill/:id/errors)
- Mark ATB-13 complete in docs/atproto-forum-plan.md (Phase 3 entry)
- Resolve "Backfill" item in Key Risks & Open Questions

* fix(appview): address PR review feedback for ATB-13 backfill

Critical fixes:
- Wrap firehose startup backfill block in try-catch so a transient DB error
doesn't crash the entire process; stale firehose data is better than no data
- Bind the error in handleReconnect's bare catch {} so the root cause is never silently lost
- Add isProgrammingError re-throw to per-record catch in syncRepoRecords so
code bugs (TypeError, ReferenceError) surface instead of being counted as data errors
- Add try-catch to checkForInterruptedBackfill; returns null on runtime errors
- Mark interrupted FullSync backfills as failed instead of silently no-oping;
FullSync has no checkpoint to resume from and must be re-triggered

Important fixes:
- Remove yourPriority/targetRolePriority from 403 response (CLAUDE.md: no internal details)
- Add isProgrammingError re-throw to GET /roles and GET /members catch blocks
- Wrap cursor load + checkIfNeeded in try-catch in POST /api/admin/backfill
- Replace parseInt with BigInt regex validation to prevent silent precision loss
- Wrap batch checkpoint updates in separate try-catch so a failed checkpoint
logs a warning but does not abort the entire backfill run
- Add DID to batch failure logs for debuggability
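The parseInt-to-BigInt swap above guards against two failure modes: `parseInt("5.9")` silently yields 5, and ids above `Number.MAX_SAFE_INTEGER` lose precision as floats. A minimal sketch (the helper name is hypothetical):

```typescript
// Digits-only validation, then exact parsing with BigInt.
function parseBackfillId(raw: string): bigint | null {
  if (!/^\d+$/.test(raw)) return null; // rejects "5.9", "abc", "-1", ""
  return BigInt(raw);
}
```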

API improvement:
- Surface backfill ID in 202 response via prepareBackfillRow; the progress row
is created synchronously so the ID can be used immediately for status polling
- performBackfill now accepts optional existingRowId to skip duplicate row creation

Tests added:
- resumeBackfill with full_sync type marks row as failed (not completed)
- checkForInterruptedBackfill returns null on DB failure
- syncRepoRecords returns error stats when indexer is not set
- 403 tests for GET /backfill/:id and GET /backfill/:id/errors
- 500 error tests for both GET endpoints
- in_progress status response test for GET /backfill/:id
- Decimal backfill ID rejected (5.9 → 400)
- Invalid ?force falls through to gap detection
- 202 response now asserts id field and correct performBackfill call signature

* fix(backfill): address follow-up review feedback on ATB-13

HIGH priority:
- firehose.ts: add isInitialStart guard to prevent backfill re-running
on Jetstream reconnects; flag cleared before try block so reconnects
are skipped even when the initial backfill throws
- firehose.test.ts: replace stub expect(true).toBe(true) with real
graceful-degradation test; add reconnect guard test
- admin.ts: switch GET /backfill/:id and GET /backfill/:id/errors catch
blocks to handleReadError for consistent error classification
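The guard ordering matters: because the flag is cleared before the try block, reconnects skip backfill even when the initial run throws. A synchronous sketch of the shape (the real start() is async; names besides isInitialStart are illustrative):

```typescript
// Backfill runs once per process, never on reconnect.
class FirehoseStartup {
  private isInitialStart = true;
  public backfillRuns = 0;

  start(runBackfill: () => void): void {
    if (this.isInitialStart) {
      this.isInitialStart = false; // cleared first: a throw below cannot re-arm it
      try {
        this.backfillRuns++;
        runBackfill();
      } catch {
        // degrade gracefully: stale firehose data beats no data
      }
    }
    // ...start the Jetstream connection here...
  }
}
```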

Medium priority:
- route-errors.ts: tighten safeParseJsonBody catch to re-throw anything
that is not a SyntaxError (malformed user JSON), preventing silent
swallowing of programming bugs
- packages/atproto/src/errors.ts: replace broad "query" substring with
"failed query" — the exact prefix DrizzleQueryError uses when wrapping
failed DB queries, avoiding false positives on unrelated messages
- backfill-manager.ts: persist per-collection errors to backfillErrors
table during Phase 1 (forum-owned collections) to match Phase 2 behaviour
- admin.ts GET /members: add isTruncated field to response when result
set is truncated at 100 rows
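The tightened safeParseJsonBody catch follows a general pattern worth spelling out — swallow only the error class that user input can cause, re-throw everything else. A sketch (the function here is a simplified stand-in, not the real helper):

```typescript
// Only SyntaxError (malformed user JSON) maps to null; any other error
// is a programming bug and must propagate.
function safeParseJson(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch (err) {
    if (err instanceof SyntaxError) return null; // bad user input
    throw err; // never silently swallow programming errors
  }
}
```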

Authored by Malpercio and committed by GitHub (841e248e, e8d6e692)

+6177 -21
apps/appview/drizzle/0008_flat_sauron.sql (+24)
CREATE TABLE "backfill_errors" (
  "id" bigserial PRIMARY KEY NOT NULL,
  "backfill_id" bigint NOT NULL,
  "did" text NOT NULL,
  "collection" text NOT NULL,
  "error_message" text NOT NULL,
  "created_at" timestamp with time zone NOT NULL
);
--> statement-breakpoint
CREATE TABLE "backfill_progress" (
  "id" bigserial PRIMARY KEY NOT NULL,
  "status" text NOT NULL,
  "backfill_type" text NOT NULL,
  "last_processed_did" text,
  "dids_total" integer DEFAULT 0 NOT NULL,
  "dids_processed" integer DEFAULT 0 NOT NULL,
  "records_indexed" integer DEFAULT 0 NOT NULL,
  "started_at" timestamp with time zone NOT NULL,
  "completed_at" timestamp with time zone,
  "error_message" text
);
--> statement-breakpoint
ALTER TABLE "backfill_errors" ADD CONSTRAINT "backfill_errors_backfill_id_backfill_progress_id_fk" FOREIGN KEY ("backfill_id") REFERENCES "public"."backfill_progress"("id") ON DELETE no action ON UPDATE no action;
--> statement-breakpoint
CREATE INDEX "backfill_errors_backfill_id_idx" ON "backfill_errors" USING btree ("backfill_id");
apps/appview/drizzle/meta/0008_snapshot.json (+1242)
··· 1 + { 2 + "id": "53a1f2f4-ad21-481d-ad60-8769c68353d2", 3 + "prevId": "d1681012-5578-4505-9467-f4e6096facc5", 4 + "version": "7", 5 + "dialect": "postgresql", 6 + "tables": { 7 + "public.backfill_errors": { 8 + "name": "backfill_errors", 9 + "schema": "", 10 + "columns": { 11 + "id": { 12 + "name": "id", 13 + "type": "bigserial", 14 + "primaryKey": true, 15 + "notNull": true 16 + }, 17 + "backfill_id": { 18 + "name": "backfill_id", 19 + "type": "bigint", 20 + "primaryKey": false, 21 + "notNull": true 22 + }, 23 + "did": { 24 + "name": "did", 25 + "type": "text", 26 + "primaryKey": false, 27 + "notNull": true 28 + }, 29 + "collection": { 30 + "name": "collection", 31 + "type": "text", 32 + "primaryKey": false, 33 + "notNull": true 34 + }, 35 + "error_message": { 36 + "name": "error_message", 37 + "type": "text", 38 + "primaryKey": false, 39 + "notNull": true 40 + }, 41 + "created_at": { 42 + "name": "created_at", 43 + "type": "timestamp with time zone", 44 + "primaryKey": false, 45 + "notNull": true 46 + } 47 + }, 48 + "indexes": { 49 + "backfill_errors_backfill_id_idx": { 50 + "name": "backfill_errors_backfill_id_idx", 51 + "columns": [ 52 + { 53 + "expression": "backfill_id", 54 + "isExpression": false, 55 + "asc": true, 56 + "nulls": "last" 57 + } 58 + ], 59 + "isUnique": false, 60 + "concurrently": false, 61 + "method": "btree", 62 + "with": {} 63 + } 64 + }, 65 + "foreignKeys": { 66 + "backfill_errors_backfill_id_backfill_progress_id_fk": { 67 + "name": "backfill_errors_backfill_id_backfill_progress_id_fk", 68 + "tableFrom": "backfill_errors", 69 + "tableTo": "backfill_progress", 70 + "columnsFrom": [ 71 + "backfill_id" 72 + ], 73 + "columnsTo": [ 74 + "id" 75 + ], 76 + "onDelete": "no action", 77 + "onUpdate": "no action" 78 + } 79 + }, 80 + "compositePrimaryKeys": {}, 81 + "uniqueConstraints": {}, 82 + "policies": {}, 83 + "checkConstraints": {}, 84 + "isRLSEnabled": false 85 + }, 86 + "public.backfill_progress": { 87 + "name": "backfill_progress", 88 + 
"schema": "", 89 + "columns": { 90 + "id": { 91 + "name": "id", 92 + "type": "bigserial", 93 + "primaryKey": true, 94 + "notNull": true 95 + }, 96 + "status": { 97 + "name": "status", 98 + "type": "text", 99 + "primaryKey": false, 100 + "notNull": true 101 + }, 102 + "backfill_type": { 103 + "name": "backfill_type", 104 + "type": "text", 105 + "primaryKey": false, 106 + "notNull": true 107 + }, 108 + "last_processed_did": { 109 + "name": "last_processed_did", 110 + "type": "text", 111 + "primaryKey": false, 112 + "notNull": false 113 + }, 114 + "dids_total": { 115 + "name": "dids_total", 116 + "type": "integer", 117 + "primaryKey": false, 118 + "notNull": true, 119 + "default": 0 120 + }, 121 + "dids_processed": { 122 + "name": "dids_processed", 123 + "type": "integer", 124 + "primaryKey": false, 125 + "notNull": true, 126 + "default": 0 127 + }, 128 + "records_indexed": { 129 + "name": "records_indexed", 130 + "type": "integer", 131 + "primaryKey": false, 132 + "notNull": true, 133 + "default": 0 134 + }, 135 + "started_at": { 136 + "name": "started_at", 137 + "type": "timestamp with time zone", 138 + "primaryKey": false, 139 + "notNull": true 140 + }, 141 + "completed_at": { 142 + "name": "completed_at", 143 + "type": "timestamp with time zone", 144 + "primaryKey": false, 145 + "notNull": false 146 + }, 147 + "error_message": { 148 + "name": "error_message", 149 + "type": "text", 150 + "primaryKey": false, 151 + "notNull": false 152 + } 153 + }, 154 + "indexes": {}, 155 + "foreignKeys": {}, 156 + "compositePrimaryKeys": {}, 157 + "uniqueConstraints": {}, 158 + "policies": {}, 159 + "checkConstraints": {}, 160 + "isRLSEnabled": false 161 + }, 162 + "public.boards": { 163 + "name": "boards", 164 + "schema": "", 165 + "columns": { 166 + "id": { 167 + "name": "id", 168 + "type": "bigserial", 169 + "primaryKey": true, 170 + "notNull": true 171 + }, 172 + "did": { 173 + "name": "did", 174 + "type": "text", 175 + "primaryKey": false, 176 + "notNull": true 177 + }, 178 + 
"rkey": { 179 + "name": "rkey", 180 + "type": "text", 181 + "primaryKey": false, 182 + "notNull": true 183 + }, 184 + "cid": { 185 + "name": "cid", 186 + "type": "text", 187 + "primaryKey": false, 188 + "notNull": true 189 + }, 190 + "name": { 191 + "name": "name", 192 + "type": "text", 193 + "primaryKey": false, 194 + "notNull": true 195 + }, 196 + "description": { 197 + "name": "description", 198 + "type": "text", 199 + "primaryKey": false, 200 + "notNull": false 201 + }, 202 + "slug": { 203 + "name": "slug", 204 + "type": "text", 205 + "primaryKey": false, 206 + "notNull": false 207 + }, 208 + "sort_order": { 209 + "name": "sort_order", 210 + "type": "integer", 211 + "primaryKey": false, 212 + "notNull": false 213 + }, 214 + "category_id": { 215 + "name": "category_id", 216 + "type": "bigint", 217 + "primaryKey": false, 218 + "notNull": false 219 + }, 220 + "category_uri": { 221 + "name": "category_uri", 222 + "type": "text", 223 + "primaryKey": false, 224 + "notNull": true 225 + }, 226 + "created_at": { 227 + "name": "created_at", 228 + "type": "timestamp with time zone", 229 + "primaryKey": false, 230 + "notNull": true 231 + }, 232 + "indexed_at": { 233 + "name": "indexed_at", 234 + "type": "timestamp with time zone", 235 + "primaryKey": false, 236 + "notNull": true 237 + } 238 + }, 239 + "indexes": { 240 + "boards_did_rkey_idx": { 241 + "name": "boards_did_rkey_idx", 242 + "columns": [ 243 + { 244 + "expression": "did", 245 + "isExpression": false, 246 + "asc": true, 247 + "nulls": "last" 248 + }, 249 + { 250 + "expression": "rkey", 251 + "isExpression": false, 252 + "asc": true, 253 + "nulls": "last" 254 + } 255 + ], 256 + "isUnique": true, 257 + "concurrently": false, 258 + "method": "btree", 259 + "with": {} 260 + }, 261 + "boards_category_id_idx": { 262 + "name": "boards_category_id_idx", 263 + "columns": [ 264 + { 265 + "expression": "category_id", 266 + "isExpression": false, 267 + "asc": true, 268 + "nulls": "last" 269 + } 270 + ], 271 + "isUnique": 
false, 272 + "concurrently": false, 273 + "method": "btree", 274 + "with": {} 275 + } 276 + }, 277 + "foreignKeys": { 278 + "boards_category_id_categories_id_fk": { 279 + "name": "boards_category_id_categories_id_fk", 280 + "tableFrom": "boards", 281 + "tableTo": "categories", 282 + "columnsFrom": [ 283 + "category_id" 284 + ], 285 + "columnsTo": [ 286 + "id" 287 + ], 288 + "onDelete": "no action", 289 + "onUpdate": "no action" 290 + } 291 + }, 292 + "compositePrimaryKeys": {}, 293 + "uniqueConstraints": {}, 294 + "policies": {}, 295 + "checkConstraints": {}, 296 + "isRLSEnabled": false 297 + }, 298 + "public.categories": { 299 + "name": "categories", 300 + "schema": "", 301 + "columns": { 302 + "id": { 303 + "name": "id", 304 + "type": "bigserial", 305 + "primaryKey": true, 306 + "notNull": true 307 + }, 308 + "did": { 309 + "name": "did", 310 + "type": "text", 311 + "primaryKey": false, 312 + "notNull": true 313 + }, 314 + "rkey": { 315 + "name": "rkey", 316 + "type": "text", 317 + "primaryKey": false, 318 + "notNull": true 319 + }, 320 + "cid": { 321 + "name": "cid", 322 + "type": "text", 323 + "primaryKey": false, 324 + "notNull": true 325 + }, 326 + "name": { 327 + "name": "name", 328 + "type": "text", 329 + "primaryKey": false, 330 + "notNull": true 331 + }, 332 + "description": { 333 + "name": "description", 334 + "type": "text", 335 + "primaryKey": false, 336 + "notNull": false 337 + }, 338 + "slug": { 339 + "name": "slug", 340 + "type": "text", 341 + "primaryKey": false, 342 + "notNull": false 343 + }, 344 + "sort_order": { 345 + "name": "sort_order", 346 + "type": "integer", 347 + "primaryKey": false, 348 + "notNull": false 349 + }, 350 + "forum_id": { 351 + "name": "forum_id", 352 + "type": "bigint", 353 + "primaryKey": false, 354 + "notNull": false 355 + }, 356 + "created_at": { 357 + "name": "created_at", 358 + "type": "timestamp with time zone", 359 + "primaryKey": false, 360 + "notNull": true 361 + }, 362 + "indexed_at": { 363 + "name": "indexed_at", 
364 + "type": "timestamp with time zone", 365 + "primaryKey": false, 366 + "notNull": true 367 + } 368 + }, 369 + "indexes": { 370 + "categories_did_rkey_idx": { 371 + "name": "categories_did_rkey_idx", 372 + "columns": [ 373 + { 374 + "expression": "did", 375 + "isExpression": false, 376 + "asc": true, 377 + "nulls": "last" 378 + }, 379 + { 380 + "expression": "rkey", 381 + "isExpression": false, 382 + "asc": true, 383 + "nulls": "last" 384 + } 385 + ], 386 + "isUnique": true, 387 + "concurrently": false, 388 + "method": "btree", 389 + "with": {} 390 + } 391 + }, 392 + "foreignKeys": { 393 + "categories_forum_id_forums_id_fk": { 394 + "name": "categories_forum_id_forums_id_fk", 395 + "tableFrom": "categories", 396 + "tableTo": "forums", 397 + "columnsFrom": [ 398 + "forum_id" 399 + ], 400 + "columnsTo": [ 401 + "id" 402 + ], 403 + "onDelete": "no action", 404 + "onUpdate": "no action" 405 + } 406 + }, 407 + "compositePrimaryKeys": {}, 408 + "uniqueConstraints": {}, 409 + "policies": {}, 410 + "checkConstraints": {}, 411 + "isRLSEnabled": false 412 + }, 413 + "public.firehose_cursor": { 414 + "name": "firehose_cursor", 415 + "schema": "", 416 + "columns": { 417 + "service": { 418 + "name": "service", 419 + "type": "text", 420 + "primaryKey": true, 421 + "notNull": true, 422 + "default": "'jetstream'" 423 + }, 424 + "cursor": { 425 + "name": "cursor", 426 + "type": "bigint", 427 + "primaryKey": false, 428 + "notNull": true 429 + }, 430 + "updated_at": { 431 + "name": "updated_at", 432 + "type": "timestamp with time zone", 433 + "primaryKey": false, 434 + "notNull": true 435 + } 436 + }, 437 + "indexes": {}, 438 + "foreignKeys": {}, 439 + "compositePrimaryKeys": {}, 440 + "uniqueConstraints": {}, 441 + "policies": {}, 442 + "checkConstraints": {}, 443 + "isRLSEnabled": false 444 + }, 445 + "public.forums": { 446 + "name": "forums", 447 + "schema": "", 448 + "columns": { 449 + "id": { 450 + "name": "id", 451 + "type": "bigserial", 452 + "primaryKey": true, 453 + 
"notNull": true 454 + }, 455 + "did": { 456 + "name": "did", 457 + "type": "text", 458 + "primaryKey": false, 459 + "notNull": true 460 + }, 461 + "rkey": { 462 + "name": "rkey", 463 + "type": "text", 464 + "primaryKey": false, 465 + "notNull": true 466 + }, 467 + "cid": { 468 + "name": "cid", 469 + "type": "text", 470 + "primaryKey": false, 471 + "notNull": true 472 + }, 473 + "name": { 474 + "name": "name", 475 + "type": "text", 476 + "primaryKey": false, 477 + "notNull": true 478 + }, 479 + "description": { 480 + "name": "description", 481 + "type": "text", 482 + "primaryKey": false, 483 + "notNull": false 484 + }, 485 + "indexed_at": { 486 + "name": "indexed_at", 487 + "type": "timestamp with time zone", 488 + "primaryKey": false, 489 + "notNull": true 490 + } 491 + }, 492 + "indexes": { 493 + "forums_did_rkey_idx": { 494 + "name": "forums_did_rkey_idx", 495 + "columns": [ 496 + { 497 + "expression": "did", 498 + "isExpression": false, 499 + "asc": true, 500 + "nulls": "last" 501 + }, 502 + { 503 + "expression": "rkey", 504 + "isExpression": false, 505 + "asc": true, 506 + "nulls": "last" 507 + } 508 + ], 509 + "isUnique": true, 510 + "concurrently": false, 511 + "method": "btree", 512 + "with": {} 513 + } 514 + }, 515 + "foreignKeys": {}, 516 + "compositePrimaryKeys": {}, 517 + "uniqueConstraints": {}, 518 + "policies": {}, 519 + "checkConstraints": {}, 520 + "isRLSEnabled": false 521 + }, 522 + "public.memberships": { 523 + "name": "memberships", 524 + "schema": "", 525 + "columns": { 526 + "id": { 527 + "name": "id", 528 + "type": "bigserial", 529 + "primaryKey": true, 530 + "notNull": true 531 + }, 532 + "did": { 533 + "name": "did", 534 + "type": "text", 535 + "primaryKey": false, 536 + "notNull": true 537 + }, 538 + "rkey": { 539 + "name": "rkey", 540 + "type": "text", 541 + "primaryKey": false, 542 + "notNull": true 543 + }, 544 + "cid": { 545 + "name": "cid", 546 + "type": "text", 547 + "primaryKey": false, 548 + "notNull": true 549 + }, 550 + 
"forum_id": { 551 + "name": "forum_id", 552 + "type": "bigint", 553 + "primaryKey": false, 554 + "notNull": false 555 + }, 556 + "forum_uri": { 557 + "name": "forum_uri", 558 + "type": "text", 559 + "primaryKey": false, 560 + "notNull": true 561 + }, 562 + "role": { 563 + "name": "role", 564 + "type": "text", 565 + "primaryKey": false, 566 + "notNull": false 567 + }, 568 + "role_uri": { 569 + "name": "role_uri", 570 + "type": "text", 571 + "primaryKey": false, 572 + "notNull": false 573 + }, 574 + "joined_at": { 575 + "name": "joined_at", 576 + "type": "timestamp with time zone", 577 + "primaryKey": false, 578 + "notNull": false 579 + }, 580 + "created_at": { 581 + "name": "created_at", 582 + "type": "timestamp with time zone", 583 + "primaryKey": false, 584 + "notNull": true 585 + }, 586 + "indexed_at": { 587 + "name": "indexed_at", 588 + "type": "timestamp with time zone", 589 + "primaryKey": false, 590 + "notNull": true 591 + } 592 + }, 593 + "indexes": { 594 + "memberships_did_rkey_idx": { 595 + "name": "memberships_did_rkey_idx", 596 + "columns": [ 597 + { 598 + "expression": "did", 599 + "isExpression": false, 600 + "asc": true, 601 + "nulls": "last" 602 + }, 603 + { 604 + "expression": "rkey", 605 + "isExpression": false, 606 + "asc": true, 607 + "nulls": "last" 608 + } 609 + ], 610 + "isUnique": true, 611 + "concurrently": false, 612 + "method": "btree", 613 + "with": {} 614 + }, 615 + "memberships_did_idx": { 616 + "name": "memberships_did_idx", 617 + "columns": [ 618 + { 619 + "expression": "did", 620 + "isExpression": false, 621 + "asc": true, 622 + "nulls": "last" 623 + } 624 + ], 625 + "isUnique": false, 626 + "concurrently": false, 627 + "method": "btree", 628 + "with": {} 629 + } 630 + }, 631 + "foreignKeys": { 632 + "memberships_did_users_did_fk": { 633 + "name": "memberships_did_users_did_fk", 634 + "tableFrom": "memberships", 635 + "tableTo": "users", 636 + "columnsFrom": [ 637 + "did" 638 + ], 639 + "columnsTo": [ 640 + "did" 641 + ], 642 + 
"onDelete": "no action", 643 + "onUpdate": "no action" 644 + }, 645 + "memberships_forum_id_forums_id_fk": { 646 + "name": "memberships_forum_id_forums_id_fk", 647 + "tableFrom": "memberships", 648 + "tableTo": "forums", 649 + "columnsFrom": [ 650 + "forum_id" 651 + ], 652 + "columnsTo": [ 653 + "id" 654 + ], 655 + "onDelete": "no action", 656 + "onUpdate": "no action" 657 + } 658 + }, 659 + "compositePrimaryKeys": {}, 660 + "uniqueConstraints": {}, 661 + "policies": {}, 662 + "checkConstraints": {}, 663 + "isRLSEnabled": false 664 + }, 665 + "public.mod_actions": { 666 + "name": "mod_actions", 667 + "schema": "", 668 + "columns": { 669 + "id": { 670 + "name": "id", 671 + "type": "bigserial", 672 + "primaryKey": true, 673 + "notNull": true 674 + }, 675 + "did": { 676 + "name": "did", 677 + "type": "text", 678 + "primaryKey": false, 679 + "notNull": true 680 + }, 681 + "rkey": { 682 + "name": "rkey", 683 + "type": "text", 684 + "primaryKey": false, 685 + "notNull": true 686 + }, 687 + "cid": { 688 + "name": "cid", 689 + "type": "text", 690 + "primaryKey": false, 691 + "notNull": true 692 + }, 693 + "action": { 694 + "name": "action", 695 + "type": "text", 696 + "primaryKey": false, 697 + "notNull": true 698 + }, 699 + "subject_did": { 700 + "name": "subject_did", 701 + "type": "text", 702 + "primaryKey": false, 703 + "notNull": false 704 + }, 705 + "subject_post_uri": { 706 + "name": "subject_post_uri", 707 + "type": "text", 708 + "primaryKey": false, 709 + "notNull": false 710 + }, 711 + "forum_id": { 712 + "name": "forum_id", 713 + "type": "bigint", 714 + "primaryKey": false, 715 + "notNull": false 716 + }, 717 + "reason": { 718 + "name": "reason", 719 + "type": "text", 720 + "primaryKey": false, 721 + "notNull": false 722 + }, 723 + "created_by": { 724 + "name": "created_by", 725 + "type": "text", 726 + "primaryKey": false, 727 + "notNull": true 728 + }, 729 + "expires_at": { 730 + "name": "expires_at", 731 + "type": "timestamp with time zone", 732 + 
"primaryKey": false, 733 + "notNull": false 734 + }, 735 + "created_at": { 736 + "name": "created_at", 737 + "type": "timestamp with time zone", 738 + "primaryKey": false, 739 + "notNull": true 740 + }, 741 + "indexed_at": { 742 + "name": "indexed_at", 743 + "type": "timestamp with time zone", 744 + "primaryKey": false, 745 + "notNull": true 746 + } 747 + }, 748 + "indexes": { 749 + "mod_actions_did_rkey_idx": { 750 + "name": "mod_actions_did_rkey_idx", 751 + "columns": [ 752 + { 753 + "expression": "did", 754 + "isExpression": false, 755 + "asc": true, 756 + "nulls": "last" 757 + }, 758 + { 759 + "expression": "rkey", 760 + "isExpression": false, 761 + "asc": true, 762 + "nulls": "last" 763 + } 764 + ], 765 + "isUnique": true, 766 + "concurrently": false, 767 + "method": "btree", 768 + "with": {} 769 + }, 770 + "mod_actions_subject_did_idx": { 771 + "name": "mod_actions_subject_did_idx", 772 + "columns": [ 773 + { 774 + "expression": "subject_did", 775 + "isExpression": false, 776 + "asc": true, 777 + "nulls": "last" 778 + } 779 + ], 780 + "isUnique": false, 781 + "concurrently": false, 782 + "method": "btree", 783 + "with": {} 784 + }, 785 + "mod_actions_subject_post_uri_idx": { 786 + "name": "mod_actions_subject_post_uri_idx", 787 + "columns": [ 788 + { 789 + "expression": "subject_post_uri", 790 + "isExpression": false, 791 + "asc": true, 792 + "nulls": "last" 793 + } 794 + ], 795 + "isUnique": false, 796 + "concurrently": false, 797 + "method": "btree", 798 + "with": {} 799 + } 800 + }, 801 + "foreignKeys": { 802 + "mod_actions_forum_id_forums_id_fk": { 803 + "name": "mod_actions_forum_id_forums_id_fk", 804 + "tableFrom": "mod_actions", 805 + "tableTo": "forums", 806 + "columnsFrom": [ 807 + "forum_id" 808 + ], 809 + "columnsTo": [ 810 + "id" 811 + ], 812 + "onDelete": "no action", 813 + "onUpdate": "no action" 814 + } 815 + }, 816 + "compositePrimaryKeys": {}, 817 + "uniqueConstraints": {}, 818 + "policies": {}, 819 + "checkConstraints": {}, 820 + 
"isRLSEnabled": false 821 + }, 822 + "public.posts": { 823 + "name": "posts", 824 + "schema": "", 825 + "columns": { 826 + "id": { 827 + "name": "id", 828 + "type": "bigserial", 829 + "primaryKey": true, 830 + "notNull": true 831 + }, 832 + "did": { 833 + "name": "did", 834 + "type": "text", 835 + "primaryKey": false, 836 + "notNull": true 837 + }, 838 + "rkey": { 839 + "name": "rkey", 840 + "type": "text", 841 + "primaryKey": false, 842 + "notNull": true 843 + }, 844 + "cid": { 845 + "name": "cid", 846 + "type": "text", 847 + "primaryKey": false, 848 + "notNull": true 849 + }, 850 + "title": { 851 + "name": "title", 852 + "type": "text", 853 + "primaryKey": false, 854 + "notNull": false 855 + }, 856 + "text": { 857 + "name": "text", 858 + "type": "text", 859 + "primaryKey": false, 860 + "notNull": true 861 + }, 862 + "forum_uri": { 863 + "name": "forum_uri", 864 + "type": "text", 865 + "primaryKey": false, 866 + "notNull": false 867 + }, 868 + "board_uri": { 869 + "name": "board_uri", 870 + "type": "text", 871 + "primaryKey": false, 872 + "notNull": false 873 + }, 874 + "board_id": { 875 + "name": "board_id", 876 + "type": "bigint", 877 + "primaryKey": false, 878 + "notNull": false 879 + }, 880 + "root_post_id": { 881 + "name": "root_post_id", 882 + "type": "bigint", 883 + "primaryKey": false, 884 + "notNull": false 885 + }, 886 + "parent_post_id": { 887 + "name": "parent_post_id", 888 + "type": "bigint", 889 + "primaryKey": false, 890 + "notNull": false 891 + }, 892 + "root_uri": { 893 + "name": "root_uri", 894 + "type": "text", 895 + "primaryKey": false, 896 + "notNull": false 897 + }, 898 + "parent_uri": { 899 + "name": "parent_uri", 900 + "type": "text", 901 + "primaryKey": false, 902 + "notNull": false 903 + }, 904 + "created_at": { 905 + "name": "created_at", 906 + "type": "timestamp with time zone", 907 + "primaryKey": false, 908 + "notNull": true 909 + }, 910 + "indexed_at": { 911 + "name": "indexed_at", 912 + "type": "timestamp with time zone", 913 + 
"primaryKey": false, 914 + "notNull": true 915 + }, 916 + "deleted": { 917 + "name": "deleted", 918 + "type": "boolean", 919 + "primaryKey": false, 920 + "notNull": true, 921 + "default": false 922 + } 923 + }, 924 + "indexes": { 925 + "posts_did_rkey_idx": { 926 + "name": "posts_did_rkey_idx", 927 + "columns": [ 928 + { 929 + "expression": "did", 930 + "isExpression": false, 931 + "asc": true, 932 + "nulls": "last" 933 + }, 934 + { 935 + "expression": "rkey", 936 + "isExpression": false, 937 + "asc": true, 938 + "nulls": "last" 939 + } 940 + ], 941 + "isUnique": true, 942 + "concurrently": false, 943 + "method": "btree", 944 + "with": {} 945 + }, 946 + "posts_forum_uri_idx": { 947 + "name": "posts_forum_uri_idx", 948 + "columns": [ 949 + { 950 + "expression": "forum_uri", 951 + "isExpression": false, 952 + "asc": true, 953 + "nulls": "last" 954 + } 955 + ], 956 + "isUnique": false, 957 + "concurrently": false, 958 + "method": "btree", 959 + "with": {} 960 + }, 961 + "posts_board_id_idx": { 962 + "name": "posts_board_id_idx", 963 + "columns": [ 964 + { 965 + "expression": "board_id", 966 + "isExpression": false, 967 + "asc": true, 968 + "nulls": "last" 969 + } 970 + ], 971 + "isUnique": false, 972 + "concurrently": false, 973 + "method": "btree", 974 + "with": {} 975 + }, 976 + "posts_board_uri_idx": { 977 + "name": "posts_board_uri_idx", 978 + "columns": [ 979 + { 980 + "expression": "board_uri", 981 + "isExpression": false, 982 + "asc": true, 983 + "nulls": "last" 984 + } 985 + ], 986 + "isUnique": false, 987 + "concurrently": false, 988 + "method": "btree", 989 + "with": {} 990 + }, 991 + "posts_root_post_id_idx": { 992 + "name": "posts_root_post_id_idx", 993 + "columns": [ 994 + { 995 + "expression": "root_post_id", 996 + "isExpression": false, 997 + "asc": true, 998 + "nulls": "last" 999 + } 1000 + ], 1001 + "isUnique": false, 1002 + "concurrently": false, 1003 + "method": "btree", 1004 + "with": {} 1005 + } 1006 + }, 1007 + "foreignKeys": { 1008 + 
"posts_did_users_did_fk": { 1009 + "name": "posts_did_users_did_fk", 1010 + "tableFrom": "posts", 1011 + "tableTo": "users", 1012 + "columnsFrom": [ 1013 + "did" 1014 + ], 1015 + "columnsTo": [ 1016 + "did" 1017 + ], 1018 + "onDelete": "no action", 1019 + "onUpdate": "no action" 1020 + }, 1021 + "posts_board_id_boards_id_fk": { 1022 + "name": "posts_board_id_boards_id_fk", 1023 + "tableFrom": "posts", 1024 + "tableTo": "boards", 1025 + "columnsFrom": [ 1026 + "board_id" 1027 + ], 1028 + "columnsTo": [ 1029 + "id" 1030 + ], 1031 + "onDelete": "no action", 1032 + "onUpdate": "no action" 1033 + }, 1034 + "posts_root_post_id_posts_id_fk": { 1035 + "name": "posts_root_post_id_posts_id_fk", 1036 + "tableFrom": "posts", 1037 + "tableTo": "posts", 1038 + "columnsFrom": [ 1039 + "root_post_id" 1040 + ], 1041 + "columnsTo": [ 1042 + "id" 1043 + ], 1044 + "onDelete": "no action", 1045 + "onUpdate": "no action" 1046 + }, 1047 + "posts_parent_post_id_posts_id_fk": { 1048 + "name": "posts_parent_post_id_posts_id_fk", 1049 + "tableFrom": "posts", 1050 + "tableTo": "posts", 1051 + "columnsFrom": [ 1052 + "parent_post_id" 1053 + ], 1054 + "columnsTo": [ 1055 + "id" 1056 + ], 1057 + "onDelete": "no action", 1058 + "onUpdate": "no action" 1059 + } 1060 + }, 1061 + "compositePrimaryKeys": {}, 1062 + "uniqueConstraints": {}, 1063 + "policies": {}, 1064 + "checkConstraints": {}, 1065 + "isRLSEnabled": false 1066 + }, 1067 + "public.roles": { 1068 + "name": "roles", 1069 + "schema": "", 1070 + "columns": { 1071 + "id": { 1072 + "name": "id", 1073 + "type": "bigserial", 1074 + "primaryKey": true, 1075 + "notNull": true 1076 + }, 1077 + "did": { 1078 + "name": "did", 1079 + "type": "text", 1080 + "primaryKey": false, 1081 + "notNull": true 1082 + }, 1083 + "rkey": { 1084 + "name": "rkey", 1085 + "type": "text", 1086 + "primaryKey": false, 1087 + "notNull": true 1088 + }, 1089 + "cid": { 1090 + "name": "cid", 1091 + "type": "text", 1092 + "primaryKey": false, 1093 + "notNull": true 1094 + 
}, 1095 + "name": { 1096 + "name": "name", 1097 + "type": "text", 1098 + "primaryKey": false, 1099 + "notNull": true 1100 + }, 1101 + "description": { 1102 + "name": "description", 1103 + "type": "text", 1104 + "primaryKey": false, 1105 + "notNull": false 1106 + }, 1107 + "permissions": { 1108 + "name": "permissions", 1109 + "type": "text[]", 1110 + "primaryKey": false, 1111 + "notNull": true, 1112 + "default": "'{}'::text[]" 1113 + }, 1114 + "priority": { 1115 + "name": "priority", 1116 + "type": "integer", 1117 + "primaryKey": false, 1118 + "notNull": true 1119 + }, 1120 + "created_at": { 1121 + "name": "created_at", 1122 + "type": "timestamp with time zone", 1123 + "primaryKey": false, 1124 + "notNull": true 1125 + }, 1126 + "indexed_at": { 1127 + "name": "indexed_at", 1128 + "type": "timestamp with time zone", 1129 + "primaryKey": false, 1130 + "notNull": true 1131 + } 1132 + }, 1133 + "indexes": { 1134 + "roles_did_rkey_idx": { 1135 + "name": "roles_did_rkey_idx", 1136 + "columns": [ 1137 + { 1138 + "expression": "did", 1139 + "isExpression": false, 1140 + "asc": true, 1141 + "nulls": "last" 1142 + }, 1143 + { 1144 + "expression": "rkey", 1145 + "isExpression": false, 1146 + "asc": true, 1147 + "nulls": "last" 1148 + } 1149 + ], 1150 + "isUnique": true, 1151 + "concurrently": false, 1152 + "method": "btree", 1153 + "with": {} 1154 + }, 1155 + "roles_did_idx": { 1156 + "name": "roles_did_idx", 1157 + "columns": [ 1158 + { 1159 + "expression": "did", 1160 + "isExpression": false, 1161 + "asc": true, 1162 + "nulls": "last" 1163 + } 1164 + ], 1165 + "isUnique": false, 1166 + "concurrently": false, 1167 + "method": "btree", 1168 + "with": {} 1169 + }, 1170 + "roles_did_name_idx": { 1171 + "name": "roles_did_name_idx", 1172 + "columns": [ 1173 + { 1174 + "expression": "did", 1175 + "isExpression": false, 1176 + "asc": true, 1177 + "nulls": "last" 1178 + }, 1179 + { 1180 + "expression": "name", 1181 + "isExpression": false, 1182 + "asc": true, 1183 + "nulls": "last" 
1184 + } 1185 + ], 1186 + "isUnique": false, 1187 + "concurrently": false, 1188 + "method": "btree", 1189 + "with": {} 1190 + } 1191 + }, 1192 + "foreignKeys": {}, 1193 + "compositePrimaryKeys": {}, 1194 + "uniqueConstraints": {}, 1195 + "policies": {}, 1196 + "checkConstraints": {}, 1197 + "isRLSEnabled": false 1198 + }, 1199 + "public.users": { 1200 + "name": "users", 1201 + "schema": "", 1202 + "columns": { 1203 + "did": { 1204 + "name": "did", 1205 + "type": "text", 1206 + "primaryKey": true, 1207 + "notNull": true 1208 + }, 1209 + "handle": { 1210 + "name": "handle", 1211 + "type": "text", 1212 + "primaryKey": false, 1213 + "notNull": false 1214 + }, 1215 + "indexed_at": { 1216 + "name": "indexed_at", 1217 + "type": "timestamp with time zone", 1218 + "primaryKey": false, 1219 + "notNull": true 1220 + } 1221 + }, 1222 + "indexes": {}, 1223 + "foreignKeys": {}, 1224 + "compositePrimaryKeys": {}, 1225 + "uniqueConstraints": {}, 1226 + "policies": {}, 1227 + "checkConstraints": {}, 1228 + "isRLSEnabled": false 1229 + } 1230 + }, 1231 + "enums": {}, 1232 + "schemas": {}, 1233 + "sequences": {}, 1234 + "roles": {}, 1235 + "policies": {}, 1236 + "views": {}, 1237 + "_meta": { 1238 + "columns": {}, 1239 + "schemas": {}, 1240 + "tables": {} 1241 + } 1242 + }
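For readers who don't want to parse the raw snapshot, the `public.posts` entry above corresponds to a Drizzle table definition roughly like the following. This is a sketch reconstructed from the snapshot only: the file location, export name, and property names are assumptions, and the three self-referencing/board foreign keys are omitted for brevity; the column list and indexes mirror the snapshot.

```typescript
import {
  pgTable, bigserial, bigint, text, timestamp, boolean,
  index, uniqueIndex,
} from "drizzle-orm/pg-core";

// Hypothetical reconstruction of the schema that generates the
// "public.posts" snapshot entry (FKs to users/boards/posts omitted).
export const posts = pgTable(
  "posts",
  {
    id: bigserial("id", { mode: "bigint" }).primaryKey(),
    did: text("did").notNull(),
    rkey: text("rkey").notNull(),
    cid: text("cid").notNull(),
    title: text("title"),
    text: text("text").notNull(),
    forumUri: text("forum_uri"),
    boardUri: text("board_uri"),
    boardId: bigint("board_id", { mode: "bigint" }),
    rootPostId: bigint("root_post_id", { mode: "bigint" }),
    parentPostId: bigint("parent_post_id", { mode: "bigint" }),
    rootUri: text("root_uri"),
    parentUri: text("parent_uri"),
    createdAt: timestamp("created_at", { withTimezone: true }).notNull(),
    indexedAt: timestamp("indexed_at", { withTimezone: true }).notNull(),
    deleted: boolean("deleted").notNull().default(false),
  },
  (t) => ({
    didRkeyIdx: uniqueIndex("posts_did_rkey_idx").on(t.did, t.rkey),
    forumUriIdx: index("posts_forum_uri_idx").on(t.forumUri),
    boardIdIdx: index("posts_board_id_idx").on(t.boardId),
    boardUriIdx: index("posts_board_uri_idx").on(t.boardUri),
    rootPostIdIdx: index("posts_root_post_id_idx").on(t.rootPostId),
  })
);
```

The unique `(did, rkey)` index is what lets backfill re-index records idempotently: replaying a record from a repo sync upserts rather than duplicates.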
+7
apps/appview/drizzle/meta/_journal.json
···
 57  57       "when": 1771817927092,
 58  58       "tag": "0007_jittery_hellion",
 59  59       "breakpoints": true
     60 +   },
     61 +   {
     62 +     "idx": 8,
     63 +     "version": "7",
     64 +     "when": 1771898269612,
     65 +     "tag": "0008_flat_sauron",
     66 +     "breakpoints": true
 60  67     }
 61  68   ]
 62  69 }
+6
apps/appview/src/index.ts
···
 11  11   // Create application context with all dependencies
 12  12   const ctx = await createAppContext(config);
 13  13
     14 + // Wire BackfillManager ↔ FirehoseService (two-phase init: both exist now)
     15 + if (ctx.backfillManager) {
     16 +   ctx.firehose.setBackfillManager(ctx.backfillManager);
     17 +   ctx.backfillManager.setIndexer(ctx.firehose.getIndexer());
     18 + }
     19 +
 14  20   // Seed default roles if enabled
 15  21   if (process.env.SEED_DEFAULT_ROLES !== "false") {
 16  22     console.log("Seeding default roles...");
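The wired-up BackfillManager decides staleness via `getCursorAgeHours`, which the changelog describes as converting microsecond Jetstream timestamps into an age in hours. A minimal sketch of that conversion (the method name comes from the commit message; the standalone signature and body here are assumptions):

```typescript
// Jetstream cursors are microsecond Unix timestamps. Convert one into an
// age in hours relative to "now". Sketch only — the real CursorManager
// method may differ in rounding and null handling.
function getCursorAgeHours(cursorMicros: bigint, nowMs: number = Date.now()): number {
  const cursorMs = Number(cursorMicros / 1000n); // microseconds → milliseconds
  return (nowMs - cursorMs) / (60 * 60 * 1000); // milliseconds → hours
}
```

A cursor captured 72 hours ago yields an age of about 72, which exceeds the default `backfillCursorMaxAgeHours` of 48 and would therefore trigger a CatchUp backfill.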
+3
apps/appview/src/lib/__tests__/app-context.test.ts
···
 52  52     sessionTtlDays: 7,
 53  53     forumHandle: "forum.example.com",
 54  54     forumPassword: "test-password",
     55 +   backfillRateLimit: 10,
     56 +   backfillConcurrency: 10,
     57 +   backfillCursorMaxAgeHours: 48,
 55  58   };
 56  59 });
 57  60
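The three fields added to the test config carry the defaults named in the commit message (10 req/s, 10 concurrent DIDs, 48 h cursor threshold). A hedged sketch of how optional config fields like these might be read from the environment with fallbacks — the env var names and the `intFromEnv` helper are hypothetical, not taken from the repo:

```typescript
// Hypothetical helper: read a positive integer from the environment,
// falling back to a default when unset or unparsable.
function intFromEnv(name: string, fallback: number): number {
  const raw = process.env[name];
  const parsed = raw === undefined ? NaN : Number.parseInt(raw, 10);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Field names match the config; the env var spellings are assumptions.
const backfillConfig = {
  backfillRateLimit: intFromEnv("BACKFILL_RATE_LIMIT", 10),
  backfillConcurrency: intFromEnv("BACKFILL_CONCURRENCY", 10),
  backfillCursorMaxAgeHours: intFromEnv("BACKFILL_CURSOR_MAX_AGE_HOURS", 48),
};
```

Declaring the env vars in `turbo.json`, as the commit notes, matters because Turbo otherwise strips undeclared variables from task environments, so the tests would silently see only the fallbacks.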
+925
apps/appview/src/lib/__tests__/backfill-manager.test.ts
··· 1 + import { describe, it, expect, beforeEach, vi, afterEach } from "vitest"; 2 + import { BackfillManager, BackfillStatus } from "../backfill-manager.js"; 3 + import type { Database } from "@atbb/db"; 4 + import type { AppConfig } from "../config.js"; 5 + import { AtpAgent } from "@atproto/api"; 6 + import type { Indexer } from "../indexer.js"; 7 + 8 + vi.mock("@atproto/api", () => ({ 9 + AtpAgent: vi.fn().mockImplementation(() => ({ 10 + com: { 11 + atproto: { 12 + repo: { 13 + listRecords: vi.fn(), 14 + }, 15 + }, 16 + }, 17 + })), 18 + })); 19 + 20 + // Minimal mock config 21 + function mockConfig(overrides: Partial<AppConfig> = {}): AppConfig { 22 + return { 23 + port: 3000, 24 + forumDid: "did:plc:testforum", 25 + pdsUrl: "https://pds.example.com", 26 + databaseUrl: "postgres://test", 27 + jetstreamUrl: "wss://jetstream.example.com", 28 + oauthPublicUrl: "https://example.com", 29 + sessionSecret: "a".repeat(32), 30 + sessionTtlDays: 7, 31 + backfillRateLimit: 10, 32 + backfillConcurrency: 10, 33 + backfillCursorMaxAgeHours: 48, 34 + ...overrides, 35 + } as AppConfig; 36 + } 37 + 38 + describe("BackfillManager", () => { 39 + let mockDb: Database; 40 + let manager: BackfillManager; 41 + 42 + beforeEach(() => { 43 + mockDb = { 44 + select: vi.fn().mockReturnValue({ 45 + from: vi.fn().mockReturnValue({ 46 + where: vi.fn().mockReturnValue({ 47 + limit: vi.fn().mockResolvedValue([]), 48 + }), 49 + }), 50 + }), 51 + } as unknown as Database; 52 + 53 + manager = new BackfillManager(mockDb, mockConfig()); 54 + }); 55 + 56 + afterEach(() => { 57 + vi.clearAllMocks(); 58 + }); 59 + 60 + describe("checkIfNeeded", () => { 61 + it("returns FullSync when cursor is null (no cursor)", async () => { 62 + const status = await manager.checkIfNeeded(null); 63 + expect(status).toBe(BackfillStatus.FullSync); 64 + }); 65 + 66 + it("returns FullSync when cursor exists but forums table is empty", async () => { 67 + // Forums query returns empty 68 + vi.spyOn(mockDb, 
"select").mockReturnValue({ 69 + from: vi.fn().mockReturnValue({ 70 + where: vi.fn().mockReturnValue({ 71 + limit: vi.fn().mockResolvedValue([]), 72 + }), 73 + }), 74 + } as any); 75 + 76 + // Cursor from 1 hour ago (fresh) 77 + const cursor = BigInt((Date.now() - 1 * 60 * 60 * 1000) * 1000); 78 + const status = await manager.checkIfNeeded(cursor); 79 + expect(status).toBe(BackfillStatus.FullSync); 80 + }); 81 + 82 + it("returns CatchUp when cursor age exceeds threshold", async () => { 83 + // Forums query returns a forum (DB not empty) 84 + vi.spyOn(mockDb, "select").mockReturnValue({ 85 + from: vi.fn().mockReturnValue({ 86 + where: vi.fn().mockReturnValue({ 87 + limit: vi.fn().mockResolvedValue([{ id: 1n, rkey: "self" }]), 88 + }), 89 + }), 90 + } as any); 91 + 92 + // Cursor from 72 hours ago (stale) 93 + const cursor = BigInt((Date.now() - 72 * 60 * 60 * 1000) * 1000); 94 + const status = await manager.checkIfNeeded(cursor); 95 + expect(status).toBe(BackfillStatus.CatchUp); 96 + }); 97 + 98 + it("returns NotNeeded when cursor is fresh and DB has data", async () => { 99 + // Forums query returns a forum 100 + vi.spyOn(mockDb, "select").mockReturnValue({ 101 + from: vi.fn().mockReturnValue({ 102 + where: vi.fn().mockReturnValue({ 103 + limit: vi.fn().mockResolvedValue([{ id: 1n, rkey: "self" }]), 104 + }), 105 + }), 106 + } as any); 107 + 108 + // Cursor from 1 hour ago (fresh) 109 + const cursor = BigInt((Date.now() - 1 * 60 * 60 * 1000) * 1000); 110 + const status = await manager.checkIfNeeded(cursor); 111 + expect(status).toBe(BackfillStatus.NotNeeded); 112 + }); 113 + 114 + it("returns FullSync when DB query fails (fail safe)", async () => { 115 + vi.spyOn(mockDb, "select").mockReturnValue({ 116 + from: vi.fn().mockReturnValue({ 117 + where: vi.fn().mockReturnValue({ 118 + limit: vi.fn().mockRejectedValue(new Error("DB connection lost")), 119 + }), 120 + }), 121 + } as any); 122 + 123 + const consoleSpy = vi.spyOn(console, "error").mockImplementation(() => 
{}); 124 + const cursor = BigInt((Date.now() - 1 * 60 * 60 * 1000) * 1000); 125 + const status = await manager.checkIfNeeded(cursor); 126 + expect(status).toBe(BackfillStatus.FullSync); 127 + consoleSpy.mockRestore(); 128 + }); 129 + }); 130 + 131 + describe("syncRepoRecords", () => { 132 + let mockIndexer: Indexer; 133 + 134 + beforeEach(() => { 135 + mockIndexer = { 136 + handlePostCreate: vi.fn().mockResolvedValue(true), 137 + handleForumCreate: vi.fn().mockResolvedValue(true), 138 + } as unknown as Indexer; 139 + }); 140 + 141 + it("fetches records and calls indexer for each one", async () => { 142 + const mockAgent = new AtpAgent({ service: "https://pds.example.com" }); 143 + (mockAgent.com.atproto.repo.listRecords as any).mockResolvedValueOnce({ 144 + data: { 145 + records: [ 146 + { 147 + uri: "at://did:plc:user1/space.atbb.post/abc123", 148 + cid: "bafyabc", 149 + value: { $type: "space.atbb.post", text: "Hello", createdAt: "2026-01-01T00:00:00Z" }, 150 + }, 151 + { 152 + uri: "at://did:plc:user1/space.atbb.post/def456", 153 + cid: "bafydef", 154 + value: { $type: "space.atbb.post", text: "World", createdAt: "2026-01-01T01:00:00Z" }, 155 + }, 156 + ], 157 + cursor: undefined, 158 + }, 159 + }); 160 + 161 + manager.setIndexer(mockIndexer); 162 + const stats = await manager.syncRepoRecords( 163 + "did:plc:user1", 164 + "space.atbb.post", 165 + mockAgent 166 + ); 167 + 168 + expect(stats.recordsFound).toBe(2); 169 + expect(stats.recordsIndexed).toBe(2); 170 + expect(stats.errors).toBe(0); 171 + expect(mockIndexer.handlePostCreate).toHaveBeenCalledTimes(2); 172 + expect(mockIndexer.handlePostCreate).toHaveBeenCalledWith( 173 + expect.objectContaining({ 174 + did: "did:plc:user1", 175 + commit: expect.objectContaining({ 176 + rkey: "abc123", 177 + cid: "bafyabc", 178 + record: expect.objectContaining({ text: "Hello" }), 179 + }), 180 + }) 181 + ); 182 + }); 183 + 184 + it("paginates through multiple pages", async () => { 185 + const mockAgent = new AtpAgent({ 
service: "https://pds.example.com" }); 186 + (mockAgent.com.atproto.repo.listRecords as any) 187 + .mockResolvedValueOnce({ 188 + data: { 189 + records: [{ 190 + uri: "at://did:plc:user1/space.atbb.post/page1", 191 + cid: "bafyp1", 192 + value: { $type: "space.atbb.post", text: "Page 1", createdAt: "2026-01-01T00:00:00Z" }, 193 + }], 194 + cursor: "next_page", 195 + }, 196 + }) 197 + .mockResolvedValueOnce({ 198 + data: { 199 + records: [{ 200 + uri: "at://did:plc:user1/space.atbb.post/page2", 201 + cid: "bafyp2", 202 + value: { $type: "space.atbb.post", text: "Page 2", createdAt: "2026-01-02T00:00:00Z" }, 203 + }], 204 + cursor: undefined, 205 + }, 206 + }); 207 + 208 + manager.setIndexer(mockIndexer); 209 + const stats = await manager.syncRepoRecords( 210 + "did:plc:user1", 211 + "space.atbb.post", 212 + mockAgent 213 + ); 214 + 215 + expect(stats.recordsFound).toBe(2); 216 + expect(stats.recordsIndexed).toBe(2); 217 + expect(mockAgent.com.atproto.repo.listRecords).toHaveBeenCalledTimes(2); 218 + }); 219 + 220 + it("continues on indexer errors and tracks error count", async () => { 221 + const mockAgent = new AtpAgent({ service: "https://pds.example.com" }); 222 + (mockAgent.com.atproto.repo.listRecords as any).mockResolvedValueOnce({ 223 + data: { 224 + records: [ 225 + { 226 + uri: "at://did:plc:user1/space.atbb.post/good", 227 + cid: "bafygood", 228 + value: { $type: "space.atbb.post", text: "Good", createdAt: "2026-01-01T00:00:00Z" }, 229 + }, 230 + { 231 + uri: "at://did:plc:user1/space.atbb.post/bad", 232 + cid: "bafybad", 233 + value: { $type: "space.atbb.post", text: "Bad", createdAt: "2026-01-01T01:00:00Z" }, 234 + }, 235 + ], 236 + cursor: undefined, 237 + }, 238 + }); 239 + 240 + (mockIndexer.handlePostCreate as any) 241 + .mockResolvedValueOnce(true) 242 + .mockRejectedValueOnce(new Error("FK missing")); 243 + 244 + const consoleSpy = vi.spyOn(console, "error").mockImplementation(() => {}); 245 + manager.setIndexer(mockIndexer); 246 + const stats = 
await manager.syncRepoRecords( 247 + "did:plc:user1", 248 + "space.atbb.post", 249 + mockAgent 250 + ); 251 + 252 + expect(stats.recordsFound).toBe(2); 253 + expect(stats.recordsIndexed).toBe(1); 254 + expect(stats.errors).toBe(1); 255 + consoleSpy.mockRestore(); 256 + }); 257 + 258 + it("returns error stats when indexer is not set", async () => { 259 + const mockAgent = new AtpAgent({ service: "https://pds.example.com" }); 260 + const consoleSpy = vi.spyOn(console, "error").mockImplementation(() => {}); 261 + // No setIndexer call — indexer is null 262 + const stats = await manager.syncRepoRecords("did:plc:user", "space.atbb.post", mockAgent); 263 + expect(stats.errors).toBe(1); 264 + expect(consoleSpy).toHaveBeenCalledWith(expect.stringContaining("indexer_not_set")); 265 + consoleSpy.mockRestore(); 266 + }); 267 + 268 + it("handles PDS connection failure gracefully", async () => { 269 + const mockAgent = new AtpAgent({ service: "https://pds.example.com" }); 270 + (mockAgent.com.atproto.repo.listRecords as any) 271 + .mockRejectedValueOnce(new Error("fetch failed")); 272 + 273 + const consoleSpy = vi.spyOn(console, "error").mockImplementation(() => {}); 274 + manager.setIndexer(mockIndexer); 275 + const stats = await manager.syncRepoRecords( 276 + "did:plc:user1", 277 + "space.atbb.post", 278 + mockAgent 279 + ); 280 + 281 + expect(stats.recordsFound).toBe(0); 282 + expect(stats.recordsIndexed).toBe(0); 283 + expect(stats.errors).toBe(1); 284 + consoleSpy.mockRestore(); 285 + }); 286 + }); 287 + 288 + describe("performBackfill", () => { 289 + let mockIndexer: Indexer; 290 + let consoleSpy: any; 291 + 292 + beforeEach(() => { 293 + consoleSpy = vi.spyOn(console, "log").mockImplementation(() => {}); 294 + vi.spyOn(console, "error").mockImplementation(() => {}); 295 + vi.spyOn(console, "warn").mockImplementation(() => {}); 296 + 297 + mockIndexer = { 298 + handleForumCreate: vi.fn().mockResolvedValue(true), 299 + handleCategoryCreate: vi.fn().mockResolvedValue(true), 
300 + handleBoardCreate: vi.fn().mockResolvedValue(true), 301 + handleRoleCreate: vi.fn().mockResolvedValue(true), 302 + handleMembershipCreate: vi.fn().mockResolvedValue(true), 303 + handlePostCreate: vi.fn().mockResolvedValue(true), 304 + handleModActionCreate: vi.fn().mockResolvedValue(true), 305 + } as unknown as Indexer; 306 + }); 307 + 308 + afterEach(() => { 309 + consoleSpy.mockRestore(); 310 + }); 311 + 312 + it("creates a backfill_progress row on start", async () => { 313 + const mockInsert = vi.fn().mockReturnValue({ 314 + values: vi.fn().mockReturnValue({ 315 + returning: vi.fn().mockResolvedValue([{ id: 1n }]), 316 + }), 317 + }); 318 + 319 + const mockSelectEmpty = vi.fn().mockReturnValue({ 320 + from: vi.fn().mockReturnValue({ 321 + where: vi.fn().mockReturnValue({ 322 + limit: vi.fn().mockResolvedValue([]), 323 + orderBy: vi.fn().mockResolvedValue([]), 324 + }), 325 + orderBy: vi.fn().mockResolvedValue([]), 326 + }), 327 + }); 328 + 329 + mockDb = { 330 + select: mockSelectEmpty, 331 + insert: mockInsert, 332 + update: vi.fn().mockReturnValue({ 333 + set: vi.fn().mockReturnValue({ 334 + where: vi.fn().mockResolvedValue(undefined), 335 + }), 336 + }), 337 + } as unknown as Database; 338 + 339 + manager = new BackfillManager(mockDb, mockConfig()); 340 + manager.setIndexer(mockIndexer); 341 + 342 + vi.spyOn(manager as any, "createAgentForPds").mockReturnValue({ 343 + com: { 344 + atproto: { 345 + repo: { 346 + listRecords: vi.fn().mockResolvedValue({ 347 + data: { records: [], cursor: undefined }, 348 + }), 349 + }, 350 + }, 351 + }, 352 + }); 353 + 354 + await manager.performBackfill(BackfillStatus.FullSync); 355 + 356 + expect(mockInsert).toHaveBeenCalled(); 357 + }); 358 + 359 + it("sets isRunning flag during backfill", async () => { 360 + const mockInsert = vi.fn().mockReturnValue({ 361 + values: vi.fn().mockReturnValue({ 362 + returning: vi.fn().mockResolvedValue([{ id: 1n }]), 363 + }), 364 + }); 365 + 366 + mockDb = { 367 + select: 
vi.fn().mockReturnValue({ 368 + from: vi.fn().mockReturnValue({ 369 + where: vi.fn().mockReturnValue({ 370 + limit: vi.fn().mockResolvedValue([]), 371 + orderBy: vi.fn().mockResolvedValue([]), 372 + }), 373 + orderBy: vi.fn().mockResolvedValue([]), 374 + }), 375 + }), 376 + insert: mockInsert, 377 + update: vi.fn().mockReturnValue({ 378 + set: vi.fn().mockReturnValue({ 379 + where: vi.fn().mockResolvedValue(undefined), 380 + }), 381 + }), 382 + } as unknown as Database; 383 + 384 + manager = new BackfillManager(mockDb, mockConfig()); 385 + manager.setIndexer(mockIndexer); 386 + 387 + vi.spyOn(manager as any, "createAgentForPds").mockReturnValue({ 388 + com: { 389 + atproto: { 390 + repo: { 391 + listRecords: vi.fn().mockResolvedValue({ 392 + data: { records: [], cursor: undefined }, 393 + }), 394 + }, 395 + }, 396 + }, 397 + }); 398 + 399 + expect(manager.getIsRunning()).toBe(false); 400 + const promise = manager.performBackfill(BackfillStatus.FullSync); 401 + expect(manager.getIsRunning()).toBe(true); 402 + await promise; 403 + expect(manager.getIsRunning()).toBe(false); 404 + }); 405 + 406 + it("rejects concurrent backfill attempts", async () => { 407 + const mockInsert = vi.fn().mockReturnValue({ 408 + values: vi.fn().mockReturnValue({ 409 + returning: vi.fn().mockResolvedValue([{ id: 1n }]), 410 + }), 411 + }); 412 + 413 + mockDb = { 414 + select: vi.fn().mockReturnValue({ 415 + from: vi.fn().mockReturnValue({ 416 + where: vi.fn().mockReturnValue({ 417 + limit: vi.fn().mockResolvedValue([]), 418 + orderBy: vi.fn().mockResolvedValue([]), 419 + }), 420 + orderBy: vi.fn().mockResolvedValue([]), 421 + }), 422 + }), 423 + insert: mockInsert, 424 + update: vi.fn().mockReturnValue({ 425 + set: vi.fn().mockReturnValue({ 426 + where: vi.fn().mockResolvedValue(undefined), 427 + }), 428 + }), 429 + } as unknown as Database; 430 + 431 + manager = new BackfillManager(mockDb, mockConfig()); 432 + manager.setIndexer(mockIndexer); 433 + 434 + vi.spyOn(manager as any, 
"createAgentForPds").mockReturnValue({ 435 + com: { 436 + atproto: { 437 + repo: { 438 + listRecords: vi.fn().mockImplementation( 439 + () => new Promise((resolve) => 440 + setTimeout(() => resolve({ data: { records: [], cursor: undefined } }), 100) 441 + ) 442 + ), 443 + }, 444 + }, 445 + }, 446 + }); 447 + 448 + const first = manager.performBackfill(BackfillStatus.FullSync); 449 + 450 + await expect(manager.performBackfill(BackfillStatus.FullSync)) 451 + .rejects.toThrow("Backfill is already in progress"); 452 + 453 + await first; 454 + }); 455 + 456 + it("CatchUp: syncs user-owned collections and aggregates counts", async () => { 457 + // Phase 1 (5 FORUM_OWNED_COLLECTIONS) must return empty so its records don't 458 + // pollute the count. Phase 2: 2 users × 2 USER_OWNED_COLLECTIONS × 1 record = 4. 459 + const emptyPage = { data: { records: [], cursor: undefined } }; 460 + const recordPage = { 461 + data: { 462 + records: [{ 463 + uri: "at://did:plc:u/space.atbb.post/r1", 464 + cid: "bafyr1", 465 + value: { $type: "space.atbb.post", text: "hi", createdAt: "2026-01-01T00:00:00Z" }, 466 + }], 467 + cursor: undefined, 468 + }, 469 + }; 470 + 471 + const mockListRecords = vi.fn() 472 + .mockResolvedValueOnce(emptyPage) // space.atbb.forum.forum (Phase 1 call 1) 473 + .mockResolvedValueOnce(emptyPage) // space.atbb.forum.category (Phase 1 call 2) 474 + .mockResolvedValueOnce(emptyPage) // space.atbb.forum.board (Phase 1 call 3) 475 + .mockResolvedValueOnce(emptyPage) // space.atbb.forum.role (Phase 1 call 4) 476 + .mockResolvedValueOnce(emptyPage) // space.atbb.modAction (Phase 1 call 5) 477 + .mockResolvedValue(recordPage); // all Phase 2 user collection calls 478 + 479 + mockDb = { 480 + select: vi.fn().mockReturnValue({ 481 + from: vi.fn().mockReturnValue({ 482 + orderBy: vi.fn().mockResolvedValue([ 483 + { did: "did:plc:user1" }, 484 + { did: "did:plc:user2" }, 485 + ]), 486 + }), 487 + }), 488 + insert: vi.fn().mockReturnValue({ 489 + values: 
vi.fn().mockReturnValue({ 490 + returning: vi.fn().mockResolvedValue([{ id: 42n }]), 491 + }), 492 + }), 493 + update: vi.fn().mockReturnValue({ 494 + set: vi.fn().mockReturnValue({ 495 + where: vi.fn().mockResolvedValue(undefined), 496 + }), 497 + }), 498 + } as unknown as Database; 499 + 500 + manager = new BackfillManager(mockDb, mockConfig({ backfillConcurrency: 5 })); 501 + manager.setIndexer(mockIndexer); 502 + vi.spyOn(manager as any, "createAgentForPds").mockReturnValue({ 503 + com: { atproto: { repo: { listRecords: mockListRecords } } }, 504 + }); 505 + 506 + const result = await manager.performBackfill(BackfillStatus.CatchUp); 507 + 508 + // Phase 1: 0 records (forum collections empty) 509 + // Phase 2: 2 users × 2 collections × 1 record each = 4 records indexed 510 + expect(result.recordsIndexed).toBe(4); 511 + expect(result.errors).toBe(0); 512 + expect(result.didsProcessed).toBe(2); 513 + expect(result.backfillId).toBe(42n); 514 + }); 515 + 516 + it("CatchUp: rejected user batch increments totalErrors and is not swallowed", async () => { 517 + // syncRepoRecords never throws — it catches PDS errors internally and returns errors:1. 518 + // For the batch callback to reject (tested by the allSettled handling), the 519 + // backfillErrors DB insert must fail, which propagates the rejection out of the callback. 
520 + const emptyPage = { data: { records: [], cursor: undefined } }; 521 + 522 + const mockListRecords = vi.fn() 523 + .mockResolvedValueOnce(emptyPage) // space.atbb.forum.forum (Phase 1 call 1) 524 + .mockResolvedValueOnce(emptyPage) // space.atbb.forum.category (Phase 1 call 2) 525 + .mockResolvedValueOnce(emptyPage) // space.atbb.forum.board (Phase 1 call 3) 526 + .mockResolvedValueOnce(emptyPage) // space.atbb.forum.role (Phase 1 call 4) 527 + .mockResolvedValueOnce(emptyPage) // space.atbb.modAction (Phase 1 call 5) 528 + // user1: both collections succeed, 1 record each 529 + .mockResolvedValueOnce({ data: { records: [{ 530 + uri: "at://did:plc:user1/space.atbb.membership/self", 531 + cid: "bafymem", 532 + value: { $type: "space.atbb.membership", createdAt: "2026-01-01T00:00:00Z" }, 533 + }], cursor: undefined } }) 534 + .mockResolvedValueOnce({ data: { records: [{ 535 + uri: "at://did:plc:user1/space.atbb.post/p1", 536 + cid: "bafyp1", 537 + value: { $type: "space.atbb.post", text: "hi", createdAt: "2026-01-01T00:00:00Z" }, 538 + }], cursor: undefined } }) 539 + // user2/membership: PDS error → syncRepoRecords catches → returns errors:1 → 540 + // triggers backfillErrors insert (which rejects below) → callback rejects 541 + .mockRejectedValueOnce(new Error("PDS unreachable")); 542 + 543 + mockDb = { 544 + select: vi.fn().mockReturnValue({ 545 + from: vi.fn().mockReturnValue({ 546 + orderBy: vi.fn().mockResolvedValue([ 547 + { did: "did:plc:user1" }, 548 + { did: "did:plc:user2" }, 549 + ]), 550 + }), 551 + }), 552 + insert: vi.fn() 553 + .mockReturnValueOnce({ // backfillProgress insert — must succeed 554 + values: vi.fn().mockReturnValue({ 555 + returning: vi.fn().mockResolvedValue([{ id: 7n }]), 556 + }), 557 + }) 558 + .mockReturnValueOnce({ // backfillErrors insert for user2 — rejects to make callback throw 559 + values: vi.fn().mockReturnValue({ 560 + returning: vi.fn().mockRejectedValue(new Error("backfillErrors insert failed")), 561 + }), 562 + }), 
563 + update: vi.fn().mockReturnValue({ 564 + set: vi.fn().mockReturnValue({ 565 + where: vi.fn().mockResolvedValue(undefined), 566 + }), 567 + }), 568 + } as unknown as Database; 569 + 570 + manager = new BackfillManager(mockDb, mockConfig({ backfillConcurrency: 1 })); 571 + manager.setIndexer(mockIndexer); 572 + vi.spyOn(manager as any, "createAgentForPds").mockReturnValue({ 573 + com: { atproto: { repo: { listRecords: mockListRecords } } }, 574 + }); 575 + 576 + const result = await manager.performBackfill(BackfillStatus.CatchUp); 577 + 578 + // user1 batch (concurrency=1): fulfilled, 2 records indexed (membership + post) 579 + // user2 batch: callback rejects → allSettled rejected branch → totalErrors++ = 1 580 + expect(result.recordsIndexed).toBe(2); 581 + expect(result.errors).toBe(1); 582 + }); 583 + 584 + it("clears isRunning flag even when backfill fails", async () => { 585 + const consoleSpy = vi.spyOn(console, "error").mockImplementation(() => {}); 586 + 587 + mockDb = { 588 + insert: vi.fn().mockReturnValue({ 589 + values: vi.fn().mockReturnValue({ 590 + returning: vi.fn().mockRejectedValue(new Error("DB insert failed")), 591 + }), 592 + }), 593 + update: vi.fn().mockReturnValue({ 594 + set: vi.fn().mockReturnValue({ 595 + where: vi.fn().mockResolvedValue(undefined), 596 + }), 597 + }), 598 + } as unknown as Database; 599 + 600 + manager = new BackfillManager(mockDb, mockConfig()); 601 + manager.setIndexer(mockIndexer); 602 + 603 + await expect(manager.performBackfill(BackfillStatus.FullSync)) 604 + .rejects.toThrow("DB insert failed"); 605 + 606 + expect(manager.getIsRunning()).toBe(false); 607 + consoleSpy.mockRestore(); 608 + }); 609 + }); 610 + 611 + describe("checkForInterruptedBackfill", () => { 612 + it("returns null when no interrupted backfill exists", async () => { 613 + vi.spyOn(mockDb, "select").mockReturnValue({ 614 + from: vi.fn().mockReturnValue({ 615 + where: vi.fn().mockReturnValue({ 616 + limit: vi.fn().mockResolvedValue([]), 617 + }), 
618 + }), 619 + } as any); 620 + 621 + const result = await manager.checkForInterruptedBackfill(); 622 + expect(result).toBeNull(); 623 + }); 624 + 625 + it("returns null and logs error when DB query fails", async () => { 626 + vi.spyOn(mockDb, "select").mockReturnValue({ 627 + from: vi.fn().mockReturnValue({ 628 + where: vi.fn().mockReturnValue({ 629 + limit: vi.fn().mockRejectedValue(new Error("DB connection lost")), 630 + }), 631 + }), 632 + } as any); 633 + 634 + const consoleSpy = vi.spyOn(console, "error").mockImplementation(() => {}); 635 + const result = await manager.checkForInterruptedBackfill(); 636 + expect(result).toBeNull(); 637 + expect(consoleSpy).toHaveBeenCalled(); 638 + consoleSpy.mockRestore(); 639 + }); 640 + 641 + it("returns interrupted backfill row when one exists", async () => { 642 + const interruptedRow = { 643 + id: 5n, 644 + status: "in_progress", 645 + backfillType: "catch_up", 646 + lastProcessedDid: "did:plc:halfway", 647 + didsTotal: 100, 648 + didsProcessed: 50, 649 + recordsIndexed: 250, 650 + startedAt: new Date(), 651 + completedAt: null, 652 + errorMessage: null, 653 + }; 654 + 655 + vi.spyOn(mockDb, "select").mockReturnValue({ 656 + from: vi.fn().mockReturnValue({ 657 + where: vi.fn().mockReturnValue({ 658 + limit: vi.fn().mockResolvedValue([interruptedRow]), 659 + }), 660 + }), 661 + } as any); 662 + 663 + const result = await manager.checkForInterruptedBackfill(); 664 + expect(result).toEqual(interruptedRow); 665 + }); 666 + }); 667 + 668 + describe("resumeBackfill", () => { 669 + let mockIndexer: Indexer; 670 + 671 + beforeEach(() => { 672 + vi.spyOn(console, "log").mockImplementation(() => {}); 673 + vi.spyOn(console, "error").mockImplementation(() => {}); 674 + 675 + mockIndexer = { 676 + handleForumCreate: vi.fn().mockResolvedValue(true), 677 + handleCategoryCreate: vi.fn().mockResolvedValue(true), 678 + handleBoardCreate: vi.fn().mockResolvedValue(true), 679 + handleRoleCreate: vi.fn().mockResolvedValue(true), 680 + 
handleMembershipCreate: vi.fn().mockResolvedValue(true), 681 + handlePostCreate: vi.fn().mockResolvedValue(true), 682 + handleModActionCreate: vi.fn().mockResolvedValue(true), 683 + } as unknown as Indexer; 684 + }); 685 + 686 + afterEach(() => { 687 + vi.restoreAllMocks(); 688 + }); 689 + 690 + it("resumes from lastProcessedDid and processes remaining users", async () => { 691 + // Interrupted at user1 (didsProcessed=1), user2 and user3 remain 692 + const interrupted = { 693 + id: 5n, 694 + status: "in_progress" as const, 695 + backfillType: "catch_up", 696 + lastProcessedDid: "did:plc:user1", 697 + didsTotal: 3, 698 + didsProcessed: 1, 699 + recordsIndexed: 2, 700 + startedAt: new Date(), 701 + completedAt: null, 702 + errorMessage: null, 703 + }; 704 + 705 + // user2 and user3: 1 record each per collection (2 collections = 4 total) 706 + const recordPage = { 707 + data: { 708 + records: [{ uri: "at://did:plc:u/space.atbb.post/r1", cid: "bafyr1", 709 + value: { $type: "space.atbb.post", text: "hi", createdAt: "2026-01-01T00:00:00Z" } }], 710 + cursor: undefined, 711 + }, 712 + }; 713 + 714 + const mockListRecords = vi.fn().mockResolvedValue(recordPage); 715 + 716 + mockDb = { 717 + select: vi.fn().mockReturnValue({ 718 + from: vi.fn().mockReturnValue({ 719 + where: vi.fn().mockReturnValue({ 720 + orderBy: vi.fn().mockResolvedValue([ 721 + { did: "did:plc:user2" }, 722 + { did: "did:plc:user3" }, 723 + ]), 724 + }), 725 + }), 726 + }), 727 + insert: vi.fn().mockReturnValue({ 728 + values: vi.fn().mockReturnValue({ 729 + returning: vi.fn().mockResolvedValue([]), 730 + }), 731 + }), 732 + update: vi.fn().mockReturnValue({ 733 + set: vi.fn().mockReturnValue({ 734 + where: vi.fn().mockResolvedValue(undefined), 735 + }), 736 + }), 737 + } as unknown as Database; 738 + 739 + manager = new BackfillManager(mockDb, mockConfig({ backfillConcurrency: 5 })); 740 + manager.setIndexer(mockIndexer); 741 + vi.spyOn(manager as any, "createAgentForPds").mockReturnValue({ 742 + com: 
{ atproto: { repo: { listRecords: mockListRecords } } }, 743 + }); 744 + 745 + const result = await manager.resumeBackfill(interrupted); 746 + 747 + // Starts from interrupted.recordsIndexed=2, adds 2 users × 2 collections × 1 record = 4 748 + expect(result.recordsIndexed).toBe(6); 749 + expect(result.errors).toBe(0); 750 + expect(result.didsProcessed).toBe(3); // 1 (prior) + 2 (resumed) 751 + expect(result.backfillId).toBe(5n); 752 + }); 753 + 754 + it("marks completed even when no remaining users", async () => { 755 + // Interrupted at the last user — no users with DID > lastProcessedDid 756 + const interrupted = { 757 + id: 3n, 758 + status: "in_progress" as const, 759 + backfillType: "catch_up", 760 + lastProcessedDid: "did:plc:last", 761 + didsTotal: 2, 762 + didsProcessed: 2, 763 + recordsIndexed: 10, 764 + startedAt: new Date(), 765 + completedAt: null, 766 + errorMessage: null, 767 + }; 768 + 769 + mockDb = { 770 + select: vi.fn().mockReturnValue({ 771 + from: vi.fn().mockReturnValue({ 772 + where: vi.fn().mockReturnValue({ 773 + orderBy: vi.fn().mockResolvedValue([]), // no remaining users 774 + }), 775 + }), 776 + }), 777 + update: vi.fn().mockReturnValue({ 778 + set: vi.fn().mockReturnValue({ 779 + where: vi.fn().mockResolvedValue(undefined), 780 + }), 781 + }), 782 + } as unknown as Database; 783 + 784 + manager = new BackfillManager(mockDb, mockConfig()); 785 + manager.setIndexer(mockIndexer); 786 + vi.spyOn(manager as any, "createAgentForPds").mockReturnValue({ 787 + com: { atproto: { repo: { listRecords: vi.fn() } } }, 788 + }); 789 + 790 + const result = await manager.resumeBackfill(interrupted); 791 + 792 + // No new records — just marks completed with existing counts 793 + expect(result.recordsIndexed).toBe(10); 794 + expect(result.didsProcessed).toBe(2); 795 + expect(result.backfillId).toBe(3n); 796 + 797 + // DB row should be updated to completed 798 + const updateMock = mockDb.update as any; 799 + expect(updateMock).toHaveBeenCalled(); 800 + 
}); 801 + 802 + it("clears isRunning flag even when resume fails", async () => { 803 + const interrupted = { 804 + id: 9n, 805 + status: "in_progress" as const, 806 + backfillType: "catch_up", 807 + lastProcessedDid: "did:plc:checkpoint", 808 + didsTotal: 5, 809 + didsProcessed: 3, 810 + recordsIndexed: 15, 811 + startedAt: new Date(), 812 + completedAt: null, 813 + errorMessage: null, 814 + }; 815 + 816 + mockDb = { 817 + select: vi.fn().mockReturnValue({ 818 + from: vi.fn().mockReturnValue({ 819 + where: vi.fn().mockReturnValue({ 820 + orderBy: vi.fn().mockRejectedValue(new Error("DB query failed")), 821 + }), 822 + }), 823 + }), 824 + update: vi.fn().mockReturnValue({ 825 + set: vi.fn().mockReturnValue({ 826 + where: vi.fn().mockResolvedValue(undefined), 827 + }), 828 + }), 829 + } as unknown as Database; 830 + 831 + manager = new BackfillManager(mockDb, mockConfig()); 832 + manager.setIndexer(mockIndexer); 833 + vi.spyOn(manager as any, "createAgentForPds").mockReturnValue({ 834 + com: { atproto: { repo: { listRecords: vi.fn() } } }, 835 + }); 836 + 837 + await expect(manager.resumeBackfill(interrupted)) 838 + .rejects.toThrow("DB query failed"); 839 + 840 + expect(manager.getIsRunning()).toBe(false); 841 + }); 842 + 843 + it("marks full_sync interrupted backfill as failed (cannot resume FullSync)", async () => { 844 + const interrupted = { 845 + id: 10n, 846 + status: "in_progress" as const, 847 + backfillType: "full_sync", 848 + lastProcessedDid: null, 849 + didsTotal: 0, 850 + didsProcessed: 0, 851 + recordsIndexed: 0, 852 + startedAt: new Date(), 853 + completedAt: null, 854 + errorMessage: null, 855 + }; 856 + 857 + const mockUpdate = vi.fn().mockReturnValue({ 858 + set: vi.fn().mockReturnValue({ 859 + where: vi.fn().mockResolvedValue(undefined), 860 + }), 861 + }); 862 + mockDb = { 863 + update: mockUpdate, 864 + } as unknown as Database; 865 + 866 + manager = new BackfillManager(mockDb, mockConfig()); 867 + manager.setIndexer(mockIndexer); 868 + 869 + 
await expect(manager.resumeBackfill(interrupted)) 870 + .rejects.toThrow("Interrupted FullSync cannot be resumed"); 871 + 872 + // Verify the row was marked as failed 873 + expect(mockUpdate).toHaveBeenCalled(); 874 + const setCall = mockUpdate.mock.results[0].value.set; 875 + expect(setCall).toHaveBeenCalledWith( 876 + expect.objectContaining({ status: "failed" }) 877 + ); 878 + }); 879 + 880 + it("rejects concurrent resume attempts", async () => { 881 + const interrupted = { 882 + id: 2n, 883 + status: "in_progress" as const, 884 + backfillType: "catch_up", 885 + lastProcessedDid: "did:plc:check", 886 + didsTotal: 2, 887 + didsProcessed: 1, 888 + recordsIndexed: 5, 889 + startedAt: new Date(), 890 + completedAt: null, 891 + errorMessage: null, 892 + }; 893 + 894 + mockDb = { 895 + select: vi.fn().mockReturnValue({ 896 + from: vi.fn().mockReturnValue({ 897 + where: vi.fn().mockReturnValue({ 898 + orderBy: vi.fn().mockImplementation( 899 + () => new Promise((resolve) => setTimeout(() => resolve([]), 200)) 900 + ), 901 + }), 902 + }), 903 + }), 904 + update: vi.fn().mockReturnValue({ 905 + set: vi.fn().mockReturnValue({ 906 + where: vi.fn().mockResolvedValue(undefined), 907 + }), 908 + }), 909 + } as unknown as Database; 910 + 911 + manager = new BackfillManager(mockDb, mockConfig()); 912 + manager.setIndexer(mockIndexer); 913 + vi.spyOn(manager as any, "createAgentForPds").mockReturnValue({ 914 + com: { atproto: { repo: { listRecords: vi.fn() } } }, 915 + }); 916 + 917 + const first = manager.resumeBackfill(interrupted); 918 + 919 + await expect(manager.resumeBackfill(interrupted)) 920 + .rejects.toThrow("Backfill is already in progress"); 921 + 922 + await first; 923 + }); 924 + }); 925 + });
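The resume tests above all hinge on one invariant: a crashed CatchUp only reprocesses users whose DID sorts strictly after the saved checkpoint. A minimal in-memory sketch of that keyset-resume pattern (the real code uses drizzle's `gt()`/`asc()` against the `users` table; this pure function is illustrative only):

```typescript
// Keyset resume: given all known DIDs and the lastProcessedDid checkpoint,
// return only the DIDs still to be processed, in ascending order.
// A null checkpoint means nothing was processed yet, so everything remains.
function remainingAfterCheckpoint(
  allDids: string[],
  lastProcessedDid: string | null,
): string[] {
  const sorted = [...allDids].sort();
  if (lastProcessedDid === null) return sorted;
  // Strictly greater-than: the checkpointed DID itself is already done.
  return sorted.filter((did) => did > lastProcessedDid);
}
```

Because DIDs are compared lexicographically on both sides (SQL and firehose handlers are idempotent upserts anyway), reprocessing at most one partial batch after a crash is safe.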
+26
apps/appview/src/lib/__tests__/config.test.ts
··· 203 203 expect(config.forumPassword).toBeUndefined(); 204 204 }); 205 205 }); 206 + 207 + describe("Backfill configuration", () => { 208 + it("uses default backfill values when env vars not set", async () => { 209 + delete process.env.BACKFILL_RATE_LIMIT; 210 + delete process.env.BACKFILL_CONCURRENCY; 211 + delete process.env.BACKFILL_CURSOR_MAX_AGE_HOURS; 212 + 213 + const config = await loadConfig(); 214 + 215 + expect(config.backfillRateLimit).toBe(10); 216 + expect(config.backfillConcurrency).toBe(10); 217 + expect(config.backfillCursorMaxAgeHours).toBe(48); 218 + }); 219 + 220 + it("reads backfill values from env vars", async () => { 221 + process.env.BACKFILL_RATE_LIMIT = "5"; 222 + process.env.BACKFILL_CONCURRENCY = "20"; 223 + process.env.BACKFILL_CURSOR_MAX_AGE_HOURS = "24"; 224 + 225 + const config = await loadConfig(); 226 + 227 + expect(config.backfillRateLimit).toBe(5); 228 + expect(config.backfillConcurrency).toBe(20); 229 + expect(config.backfillCursorMaxAgeHours).toBe(24); 230 + }); 231 + }); 206 232 });
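These tests pin the defaults (10 / 10 / 48) and the env overrides. A hedged sketch of how the three fields could be parsed — `intFromEnv` is an illustrative helper, not necessarily the project's actual parsing code in `loadConfig`:

```typescript
// Read an integer env var, falling back to a default when the var is
// unset or not a valid number. Illustrative helper (hypothetical name).
function intFromEnv(name: string, fallback: number): number {
  const raw = process.env[name];
  const parsed = raw === undefined ? NaN : Number.parseInt(raw, 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

// Defaults mirror the values the tests above assert.
const backfillConfig = {
  backfillRateLimit: intFromEnv("BACKFILL_RATE_LIMIT", 10),
  backfillConcurrency: intFromEnv("BACKFILL_CONCURRENCY", 10),
  backfillCursorMaxAgeHours: intFromEnv("BACKFILL_CURSOR_MAX_AGE_HOURS", 48),
};
```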
+25
apps/appview/src/lib/__tests__/cursor-manager.test.ts
··· 189 189 expect(rewound).toBe(cursor - BigInt(rewindAmount)); 190 190 }); 191 191 }); 192 + 193 + describe("getCursorAgeHours", () => { 194 + it("returns null when cursor is null", () => { 195 + const age = cursorManager.getCursorAgeHours(null); 196 + expect(age).toBeNull(); 197 + }); 198 + 199 + it("calculates age in hours from microsecond cursor", () => { 200 + // Cursor from 24 hours ago 201 + const twentyFourHoursAgoUs = BigInt( 202 + (Date.now() - 24 * 60 * 60 * 1000) * 1000 203 + ); 204 + const age = cursorManager.getCursorAgeHours(twentyFourHoursAgoUs); 205 + // Allow 1-hour tolerance for test execution time 206 + expect(age).toBeGreaterThanOrEqual(23); 207 + expect(age).toBeLessThanOrEqual(25); 208 + }); 209 + 210 + it("returns near-zero for recent cursor", () => { 211 + const recentCursorUs = BigInt(Date.now() * 1000); 212 + const age = cursorManager.getCursorAgeHours(recentCursorUs); 213 + expect(age).toBeGreaterThanOrEqual(0); 214 + expect(age).toBeLessThan(1); 215 + }); 216 + }); 192 217 });
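The arithmetic under test is simple but easy to get off by a factor of 1000: Jetstream cursors are microsecond unix timestamps, so the age in hours is `(now_ms − cursor_µs / 1000) / 3 600 000`. A standalone sketch consistent with the assertions above (assuming the real `CursorManager.getCursorAgeHours` behaves equivalently; the injectable `nowMs` parameter is for testability and may not match the real signature):

```typescript
// Convert a microsecond Jetstream cursor into an age in hours.
// Returns null when there is no cursor (first startup / wiped state).
function cursorAgeHours(cursor: bigint | null, nowMs: number = Date.now()): number | null {
  if (cursor === null) return null;
  const cursorMs = Number(cursor / 1000n); // microseconds → milliseconds
  return (nowMs - cursorMs) / (60 * 60 * 1000); // milliseconds → hours
}
```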
+176 -5
apps/appview/src/lib/__tests__/firehose.test.ts
··· 1 1 import { describe, it, expect, beforeEach, vi, afterEach } from "vitest"; 2 2 import { FirehoseService } from "../firehose.js"; 3 3 import type { Database } from "@atbb/db"; 4 + import { BackfillStatus } from "../backfill-manager.js"; 5 + 6 + // Mock backfill-manager to prevent @atproto/api from loading in unit tests 7 + // BackfillStatus enum is re-exported as a real object so it can be used in comparisons 8 + vi.mock("../backfill-manager.js", () => { 9 + return { 10 + BackfillStatus: { 11 + NotNeeded: "not_needed", 12 + CatchUp: "catch_up", 13 + FullSync: "full_sync", 14 + }, 15 + }; 16 + }); 4 17 5 18 // Mock Jetstream 6 19 vi.mock("@skyware/jetstream", () => { ··· 32 45 handleCategoryCreate: vi.fn(), 33 46 handleCategoryUpdate: vi.fn(), 34 47 handleCategoryDelete: vi.fn(), 48 + handleBoardCreate: vi.fn(), 49 + handleBoardUpdate: vi.fn(), 50 + handleBoardDelete: vi.fn(), 51 + handleRoleCreate: vi.fn(), 52 + handleRoleUpdate: vi.fn(), 53 + handleRoleDelete: vi.fn(), 35 54 handleMembershipCreate: vi.fn(), 36 55 handleMembershipUpdate: vi.fn(), 37 56 handleMembershipDelete: vi.fn(), ··· 191 210 ); 192 211 }); 193 212 194 - it("should handle connection errors gracefully", async () => { 195 - // Note: Error handling is tested through manual testing 196 - // Mocking the Jetstream implementation is complex due to class constructors 197 - // The error path logs to console.error and attempts reconnection 198 - expect(true).toBe(true); 213 + it("continues to start firehose when backfill throws on startup", async () => { 214 + const consoleSpy = vi.spyOn(console, "error").mockImplementation(() => {}); 215 + const mockBackfillManager = { 216 + checkForInterruptedBackfill: vi.fn().mockRejectedValue(new Error("DB connection lost")), 217 + checkIfNeeded: vi.fn(), 218 + performBackfill: vi.fn(), 219 + resumeBackfill: vi.fn(), 220 + getIsRunning: vi.fn().mockReturnValue(false), 221 + }; 222 + 223 + firehoseService.setBackfillManager(mockBackfillManager as any); 224 + 
await firehoseService.start(); 225 + 226 + expect(firehoseService.isRunning()).toBe(true); 227 + expect(consoleSpy).toHaveBeenCalledWith( 228 + expect.stringContaining("firehose.backfill.startup_error") 229 + ); 230 + consoleSpy.mockRestore(); 231 + }); 232 + }); 233 + }); 234 + 235 + describe("Backfill Integration", () => { 236 + let mockDb: Database; 237 + let firehoseService: FirehoseService; 238 + 239 + beforeEach(() => { 240 + const mockInsert = vi.fn().mockReturnValue({ 241 + values: vi.fn().mockReturnValue({ 242 + onConflictDoUpdate: vi.fn().mockResolvedValue(undefined), 243 + }), 199 244 }); 245 + 246 + const mockSelect = vi.fn().mockReturnValue({ 247 + from: vi.fn().mockReturnValue({ 248 + where: vi.fn().mockReturnValue({ 249 + limit: vi.fn().mockResolvedValue([]), 250 + }), 251 + }), 252 + }); 253 + 254 + mockDb = { 255 + insert: mockInsert, 256 + select: mockSelect, 257 + } as unknown as Database; 258 + 259 + vi.spyOn(console, "log").mockImplementation(() => {}); 260 + vi.spyOn(console, "error").mockImplementation(() => {}); 261 + vi.spyOn(console, "warn").mockImplementation(() => {}); 262 + firehoseService = new FirehoseService(mockDb, "wss://jetstream.example.com"); 263 + }); 264 + 265 + afterEach(() => { 266 + // Use clearAllMocks (not restoreAllMocks) to preserve module mock implementations 267 + vi.clearAllMocks(); 268 + }); 269 + 270 + it("runs backfill before starting jetstream when checkIfNeeded returns CatchUp", async () => { 271 + const mockBackfillManager = { 272 + checkForInterruptedBackfill: vi.fn().mockResolvedValue(null), 273 + checkIfNeeded: vi.fn().mockResolvedValue(BackfillStatus.CatchUp), 274 + performBackfill: vi.fn().mockResolvedValue({ 275 + backfillId: 1n, type: BackfillStatus.CatchUp, didsProcessed: 10, 276 + recordsIndexed: 100, errors: 0, durationMs: 5000, 277 + }), 278 + resumeBackfill: vi.fn(), 279 + getIsRunning: vi.fn().mockReturnValue(false), 280 + }; 281 + 282 + firehoseService.setBackfillManager(mockBackfillManager as 
any); 283 + await firehoseService.start(); 284 + 285 + expect(mockBackfillManager.checkForInterruptedBackfill).toHaveBeenCalled(); 286 + expect(mockBackfillManager.performBackfill).toHaveBeenCalledWith(BackfillStatus.CatchUp); 287 + }); 288 + 289 + it("skips backfill when checkIfNeeded returns NotNeeded", async () => { 290 + const mockBackfillManager = { 291 + checkForInterruptedBackfill: vi.fn().mockResolvedValue(null), 292 + checkIfNeeded: vi.fn().mockResolvedValue(BackfillStatus.NotNeeded), 293 + performBackfill: vi.fn(), 294 + resumeBackfill: vi.fn(), 295 + getIsRunning: vi.fn().mockReturnValue(false), 296 + }; 297 + 298 + firehoseService.setBackfillManager(mockBackfillManager as any); 299 + await firehoseService.start(); 300 + 301 + expect(mockBackfillManager.performBackfill).not.toHaveBeenCalled(); 302 + }); 303 + 304 + it("resumes interrupted backfill before gap detection", async () => { 305 + const interruptedRow = { 306 + id: 5n, 307 + status: "in_progress", 308 + backfillType: "catch_up", 309 + lastProcessedDid: "did:plc:halfway", 310 + didsTotal: 100, 311 + didsProcessed: 50, 312 + recordsIndexed: 250, 313 + startedAt: new Date(), 314 + completedAt: null, 315 + errorMessage: null, 316 + }; 317 + 318 + const mockBackfillManager = { 319 + checkForInterruptedBackfill: vi.fn().mockResolvedValue(interruptedRow), 320 + resumeBackfill: vi.fn().mockResolvedValue({ 321 + backfillId: 5n, type: BackfillStatus.CatchUp, didsProcessed: 100, 322 + recordsIndexed: 500, errors: 0, durationMs: 3000, 323 + }), 324 + checkIfNeeded: vi.fn(), 325 + performBackfill: vi.fn(), 326 + getIsRunning: vi.fn().mockReturnValue(false), 327 + }; 328 + 329 + firehoseService.setBackfillManager(mockBackfillManager as any); 330 + await firehoseService.start(); 331 + 332 + expect(mockBackfillManager.resumeBackfill).toHaveBeenCalledWith(interruptedRow); 333 + // Gap detection should NOT run when there's an interrupted backfill 334 + 
expect(mockBackfillManager.checkIfNeeded).not.toHaveBeenCalled(); 335 + }); 336 + 337 + it("starts firehose normally when no backfillManager is set", async () => { 338 + // No setBackfillManager call — should start without errors 339 + await firehoseService.start(); 340 + expect(firehoseService.isRunning()).toBe(true); 341 + }); 342 + 343 + it("exposes indexer via getIndexer()", () => { 344 + // Verify getIndexer() returns the internal indexer instance 345 + const indexer = firehoseService.getIndexer(); 346 + expect(indexer).toBeDefined(); 347 + expect(typeof indexer.handlePostCreate).toBe("function"); 348 + }); 349 + 350 + it("does not re-run backfill on reconnect", async () => { 351 + const mockBackfillManager = { 352 + checkForInterruptedBackfill: vi.fn().mockResolvedValue(null), 353 + checkIfNeeded: vi.fn().mockResolvedValue(BackfillStatus.NotNeeded), 354 + performBackfill: vi.fn(), 355 + resumeBackfill: vi.fn(), 356 + getIsRunning: vi.fn().mockReturnValue(false), 357 + }; 358 + 359 + firehoseService.setBackfillManager(mockBackfillManager as any); 360 + 361 + // Initial start: backfill check runs once 362 + await firehoseService.start(); 363 + expect(mockBackfillManager.checkForInterruptedBackfill).toHaveBeenCalledTimes(1); 364 + 365 + // Simulate reconnect: handleReconnect sets running=false then calls start() again 366 + await firehoseService.stop(); 367 + await firehoseService.start(); 368 + 369 + // Backfill check must NOT run again after the initial start 370 + expect(mockBackfillManager.checkForInterruptedBackfill).toHaveBeenCalledTimes(1); 200 371 }); 201 372 });
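Taken together, these integration tests fix a startup ordering: resume an interrupted row if one exists (skipping gap detection), otherwise gap-detect and maybe backfill, run the whole check at most once per process, and never let a backfill failure block the firehose. A sketch of that control flow under those assumptions — `BackfillLike`, `runStartupBackfill`, and the `state` flag are illustrative stand-ins, not the real `FirehoseService` internals:

```typescript
// Structural stand-in for the parts of BackfillManager the startup path uses.
interface BackfillLike {
  checkForInterruptedBackfill(): Promise<object | null>;
  resumeBackfill(row: object): Promise<unknown>;
  checkIfNeeded(cursor: bigint | null): Promise<string>;
  performBackfill(status: string): Promise<unknown>;
}

async function runStartupBackfill(
  manager: BackfillLike,
  cursor: bigint | null,
  state: { hasRunBackfillCheck: boolean },
): Promise<void> {
  // Reconnects call start() again; the check must only run on first start.
  if (state.hasRunBackfillCheck) return;
  state.hasRunBackfillCheck = true;
  try {
    const interrupted = await manager.checkForInterruptedBackfill();
    if (interrupted) {
      // Resume takes priority; gap detection is skipped entirely.
      await manager.resumeBackfill(interrupted);
      return;
    }
    const status = await manager.checkIfNeeded(cursor);
    if (status !== "not_needed") await manager.performBackfill(status);
  } catch (error) {
    // A failed backfill must not prevent the firehose from starting.
    console.error(
      JSON.stringify({ event: "firehose.backfill.startup_error", error: String(error) }),
    );
  }
}
```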
+9 -1
apps/appview/src/lib/__tests__/test-context.ts
··· 1 1 import { eq, or, like } from "drizzle-orm"; 2 2 import { drizzle } from "drizzle-orm/postgres-js"; 3 3 import postgres from "postgres"; 4 - import { forums, posts, users, categories, memberships, boards, roles, modActions } from "@atbb/db"; 4 + import { forums, posts, users, categories, memberships, boards, roles, modActions, backfillProgress, backfillErrors } from "@atbb/db"; 5 5 import * as schema from "@atbb/db"; 6 6 import type { AppConfig } from "../config.js"; 7 7 import type { AppContext } from "../app-context.js"; ··· 31 31 oauthPublicUrl: "http://localhost:3000", 32 32 sessionSecret: "test-secret-at-least-32-characters-long", 33 33 sessionTtlDays: 7, 34 + backfillRateLimit: 10, 35 + backfillConcurrency: 10, 36 + backfillCursorMaxAgeHours: 48, 34 37 }; 35 38 36 39 // Create postgres client so we can close it later ··· 60 63 await db.delete(categories).where(eq(categories.did, config.forumDid)).catch(() => {}); 61 64 await db.delete(roles).where(eq(roles.did, config.forumDid)).catch(() => {}); 62 65 await db.delete(modActions).where(eq(modActions.did, config.forumDid)).catch(() => {}); 66 + await db.delete(backfillErrors).catch(() => {}); 67 + await db.delete(backfillProgress).catch(() => {}); 63 68 await db.delete(forums).where(eq(forums.did, config.forumDid)).catch(() => {}); 64 69 }; 65 70 ··· 88 93 oauthSessionStore: stubOAuthSessionStore, 89 94 cookieSessionStore: stubCookieSessionStore, 90 95 forumAgent: stubForumAgent, 96 + backfillManager: null, 91 97 cleanDatabase, 92 98 cleanup: async () => { 93 99 // Clean up test data (order matters due to FKs: posts -> memberships -> users -> boards -> categories -> forums) ··· 127 133 await db.delete(categories).where(eq(categories.did, config.forumDid)); 128 134 await db.delete(roles).where(eq(roles.did, config.forumDid)); 129 135 await db.delete(modActions).where(eq(modActions.did, config.forumDid)); 136 + await db.delete(backfillErrors).catch(() => {}); 137 + await 
db.delete(backfillProgress).catch(() => {}); 130 138 await db.delete(forums).where(eq(forums.did, config.forumDid)); 131 139 // Close postgres connection to prevent leaks 132 140 await sql.end();
+3
apps/appview/src/lib/app-context.ts
··· 6 6 import { CookieSessionStore } from "./cookie-session-store.js"; 7 7 import { ForumAgent } from "@atbb/atproto"; 8 8 import type { AppConfig } from "./config.js"; 9 + import { BackfillManager } from "./backfill-manager.js"; 9 10 10 11 /** 11 12 * Application context holding all shared dependencies. ··· 20 21 oauthSessionStore: OAuthSessionStore; 21 22 cookieSessionStore: CookieSessionStore; 22 23 forumAgent: ForumAgent | null; 24 + backfillManager: BackfillManager | null; 23 25 } 24 26 25 27 /** ··· 110 112 oauthSessionStore, 111 113 cookieSessionStore, 112 114 forumAgent, 115 + backfillManager: new BackfillManager(db, config), 113 116 }; 114 117 } 115 118
+675
apps/appview/src/lib/backfill-manager.ts
··· 1 + import type { Database } from "@atbb/db"; 2 + import { forums, backfillProgress, backfillErrors, users } from "@atbb/db"; 3 + import { eq, asc, gt } from "drizzle-orm"; 4 + import { AtpAgent } from "@atproto/api"; 5 + import { CursorManager } from "./cursor-manager.js"; 6 + import type { AppConfig } from "./config.js"; 7 + import type { Indexer } from "./indexer.js"; 8 + import { isProgrammingError } from "./errors.js"; 9 + 10 + /** 11 + * Maps AT Proto collection NSIDs to Indexer handler method names. 12 + * Order matters: sync forum-owned records first (FK dependencies). 13 + */ 14 + // These collections define the sync order. Used by performBackfill() in Task 6. 15 + export const FORUM_OWNED_COLLECTIONS = [ 16 + "space.atbb.forum.forum", 17 + "space.atbb.forum.category", 18 + "space.atbb.forum.board", 19 + "space.atbb.forum.role", 20 + "space.atbb.modAction", 21 + ] as const; 22 + 23 + export const USER_OWNED_COLLECTIONS = [ 24 + "space.atbb.membership", 25 + "space.atbb.post", 26 + ] as const; 27 + 28 + const COLLECTION_HANDLER_MAP: Record<string, string> = { 29 + "space.atbb.post": "handlePostCreate", 30 + "space.atbb.forum.forum": "handleForumCreate", 31 + "space.atbb.forum.category": "handleCategoryCreate", 32 + "space.atbb.forum.board": "handleBoardCreate", 33 + "space.atbb.forum.role": "handleRoleCreate", 34 + "space.atbb.membership": "handleMembershipCreate", 35 + "space.atbb.modAction": "handleModActionCreate", 36 + }; 37 + 38 + export enum BackfillStatus { 39 + NotNeeded = "not_needed", 40 + CatchUp = "catch_up", 41 + FullSync = "full_sync", 42 + } 43 + 44 + export interface BackfillResult { 45 + backfillId: bigint; 46 + type: BackfillStatus; 47 + didsProcessed: number; 48 + recordsIndexed: number; 49 + errors: number; 50 + durationMs: number; 51 + } 52 + 53 + export interface SyncStats { 54 + recordsFound: number; 55 + recordsIndexed: number; 56 + errors: number; 57 + } 58 + 59 + export class BackfillManager { 60 + private cursorManager: 
CursorManager; 61 + private isRunning = false; 62 + private indexer: Indexer | null = null; 63 + 64 + constructor( 65 + private db: Database, 66 + private config: AppConfig, 67 + ) { 68 + this.cursorManager = new CursorManager(db); 69 + } 70 + 71 + /** 72 + * Inject the Indexer instance. Called during AppContext wiring. 73 + */ 74 + setIndexer(indexer: Indexer): void { 75 + this.indexer = indexer; 76 + } 77 + 78 + /** 79 + * Sync all records from a single (DID, collection) pair via listRecords. 80 + * Feeds each record through the matching Indexer handler. 81 + */ 82 + async syncRepoRecords( 83 + did: string, 84 + collection: string, 85 + agent: AtpAgent 86 + ): Promise<SyncStats> { 87 + const stats: SyncStats = { recordsFound: 0, recordsIndexed: 0, errors: 0 }; 88 + const handlerName = COLLECTION_HANDLER_MAP[collection]; 89 + 90 + if (!handlerName || !this.indexer) { 91 + console.error(JSON.stringify({ 92 + event: "backfill.sync_skipped", 93 + did, 94 + collection, 95 + reason: !handlerName ? 
"unknown_collection" : "indexer_not_set", 96 + timestamp: new Date().toISOString(), 97 + })); 98 + stats.errors = 1; 99 + return stats; 100 + } 101 + 102 + const handler = (this.indexer as any)[handlerName].bind(this.indexer); 103 + const delayMs = 1000 / this.config.backfillRateLimit; 104 + let cursor: string | undefined; 105 + 106 + try { 107 + do { 108 + const response = await agent.com.atproto.repo.listRecords({ 109 + repo: did, 110 + collection, 111 + limit: 100, 112 + cursor, 113 + }); 114 + 115 + const records = response.data.records; 116 + stats.recordsFound += records.length; 117 + 118 + for (const record of records) { 119 + try { 120 + const rkey = record.uri.split("/").pop()!; 121 + const event = { 122 + did, 123 + commit: { rkey, cid: record.cid, record: record.value }, 124 + }; 125 + await handler(event); 126 + stats.recordsIndexed++; 127 + } catch (error) { 128 + if (isProgrammingError(error)) throw error; 129 + stats.errors++; 130 + console.error(JSON.stringify({ 131 + event: "backfill.record_error", 132 + did, 133 + collection, 134 + uri: record.uri, 135 + error: error instanceof Error ? error.message : String(error), 136 + timestamp: new Date().toISOString(), 137 + })); 138 + } 139 + } 140 + 141 + cursor = response.data.cursor; 142 + 143 + // Rate limiting: delay between page fetches 144 + if (cursor) { 145 + await new Promise((resolve) => setTimeout(resolve, delayMs)); 146 + } 147 + } while (cursor); 148 + } catch (error) { 149 + stats.errors++; 150 + console.error(JSON.stringify({ 151 + event: "backfill.pds_error", 152 + did, 153 + collection, 154 + error: error instanceof Error ? error.message : String(error), 155 + timestamp: new Date().toISOString(), 156 + })); 157 + } 158 + 159 + return stats; 160 + } 161 + 162 + /** 163 + * Determine if backfill is needed based on cursor state and DB contents. 
164 + */ 165 + async checkIfNeeded(cursor: bigint | null): Promise<BackfillStatus> { 166 + // No cursor at all → first startup or wiped cursor 167 + if (cursor === null) { 168 + console.log(JSON.stringify({ 169 + event: "backfill.decision", 170 + status: BackfillStatus.FullSync, 171 + reason: "no_cursor", 172 + timestamp: new Date().toISOString(), 173 + })); 174 + return BackfillStatus.FullSync; 175 + } 176 + 177 + // Check if DB has forum data (consistency check) 178 + let forum: { rkey: string } | undefined; 179 + try { 180 + const results = await this.db 181 + .select() 182 + .from(forums) 183 + .where(eq(forums.rkey, "self")) 184 + .limit(1); 185 + forum = results[0]; 186 + } catch (error) { 187 + console.error(JSON.stringify({ 188 + event: "backfill.decision", 189 + status: BackfillStatus.FullSync, 190 + reason: "db_query_failed", 191 + error: error instanceof Error ? error.message : String(error), 192 + timestamp: new Date().toISOString(), 193 + })); 194 + return BackfillStatus.FullSync; 195 + } 196 + 197 + if (!forum) { 198 + console.log(JSON.stringify({ 199 + event: "backfill.decision", 200 + status: BackfillStatus.FullSync, 201 + reason: "db_inconsistency", 202 + cursorTimestamp: cursor.toString(), 203 + timestamp: new Date().toISOString(), 204 + })); 205 + return BackfillStatus.FullSync; 206 + } 207 + 208 + // Check cursor age 209 + const ageHours = this.cursorManager.getCursorAgeHours(cursor)!; 210 + if (ageHours > this.config.backfillCursorMaxAgeHours) { 211 + console.log(JSON.stringify({ 212 + event: "backfill.decision", 213 + status: BackfillStatus.CatchUp, 214 + reason: "cursor_too_old", 215 + cursorAgeHours: Math.round(ageHours), 216 + thresholdHours: this.config.backfillCursorMaxAgeHours, 217 + cursorTimestamp: cursor.toString(), 218 + timestamp: new Date().toISOString(), 219 + })); 220 + return BackfillStatus.CatchUp; 221 + } 222 + 223 + console.log(JSON.stringify({ 224 + event: "backfill.decision", 225 + status: BackfillStatus.NotNeeded, 226 + 
reason: "cursor_fresh", 227 + cursorAgeHours: Math.round(ageHours), 228 + timestamp: new Date().toISOString(), 229 + })); 230 + return BackfillStatus.NotNeeded; 231 + } 232 + 233 + /** 234 + * Check if a backfill is currently running. 235 + */ 236 + getIsRunning(): boolean { 237 + return this.isRunning; 238 + } 239 + 240 + /** 241 + * Create an AtpAgent pointed at the forum's PDS. 242 + * Extracted as a private method for test mocking. 243 + */ 244 + private createAgentForPds(): AtpAgent { 245 + return new AtpAgent({ service: this.config.pdsUrl }); 246 + } 247 + 248 + /** 249 + * Create a progress row and return its ID. 250 + * Use this before performBackfill when you need the ID immediately (e.g., for a 202 response). 251 + * Pass the returned ID as existingRowId to performBackfill to skip duplicate row creation. 252 + */ 253 + async prepareBackfillRow(type: BackfillStatus): Promise<bigint> { 254 + const [row] = await this.db 255 + .insert(backfillProgress) 256 + .values({ 257 + status: "in_progress", 258 + backfillType: type, 259 + startedAt: new Date(), 260 + }) 261 + .returning({ id: backfillProgress.id }); 262 + return row.id; 263 + } 264 + 265 + /** 266 + * Query the backfill_progress table for any row with status = 'in_progress'. 267 + * Returns the first such row, or null if none exists. 268 + */ 269 + async checkForInterruptedBackfill() { 270 + try { 271 + const [row] = await this.db 272 + .select() 273 + .from(backfillProgress) 274 + .where(eq(backfillProgress.status, "in_progress")) 275 + .limit(1); 276 + 277 + return row ?? null; 278 + } catch (error) { 279 + if (isProgrammingError(error)) throw error; 280 + console.error(JSON.stringify({ 281 + event: "backfill.check_interrupted.failed", 282 + error: error instanceof Error ? 
error.message : String(error), 283 + note: "Could not check for interrupted backfills — assuming none", 284 + timestamp: new Date().toISOString(), 285 + })); 286 + return null; 287 + } 288 + } 289 + 290 + /** 291 + * Resume a CatchUp backfill from its last checkpoint (lastProcessedDid). 292 + * Only processes users with DID > lastProcessedDid. 293 + * Does NOT re-run Phase 1 (forum-owned collections). 294 + */ 295 + async resumeBackfill(interrupted: typeof backfillProgress.$inferSelect): Promise<BackfillResult> { 296 + if (this.isRunning) { 297 + throw new Error("Backfill is already in progress"); 298 + } 299 + 300 + this.isRunning = true; 301 + const startTime = Date.now(); 302 + let totalIndexed = interrupted.recordsIndexed; 303 + let totalErrors = 0; 304 + let didsProcessed = interrupted.didsProcessed; 305 + 306 + console.log(JSON.stringify({ 307 + event: "backfill.resuming", 308 + backfillId: interrupted.id.toString(), 309 + lastProcessedDid: interrupted.lastProcessedDid, 310 + didsProcessed: interrupted.didsProcessed, 311 + didsTotal: interrupted.didsTotal, 312 + timestamp: new Date().toISOString(), 313 + })); 314 + 315 + try { 316 + const agent = this.createAgentForPds(); 317 + 318 + if (interrupted.backfillType !== BackfillStatus.CatchUp) { 319 + // FullSync cannot be resumed from a checkpoint — it must re-run from scratch 320 + throw new Error( 321 + "Interrupted FullSync cannot be resumed. Re-trigger via /api/admin/backfill?force=full_sync." 
322 + ); 323 + } 324 + 325 + if (interrupted.lastProcessedDid) { 326 + // Resume: fetch users after lastProcessedDid 327 + // TODO(ATB-13): Paginate for large forums 328 + const remainingUsers = await this.db 329 + .select({ did: users.did }) 330 + .from(users) 331 + .where(gt(users.did, interrupted.lastProcessedDid)) 332 + .orderBy(asc(users.did)); 333 + 334 + for (let i = 0; i < remainingUsers.length; i += this.config.backfillConcurrency) { 335 + const batch = remainingUsers.slice(i, i + this.config.backfillConcurrency); 336 + const backfillId = interrupted.id; 337 + 338 + const batchResults = await Promise.allSettled( 339 + batch.map(async (user) => { 340 + let userIndexed = 0; 341 + let userErrors = 0; 342 + for (const collection of USER_OWNED_COLLECTIONS) { 343 + const stats = await this.syncRepoRecords(user.did, collection, agent); 344 + userIndexed += stats.recordsIndexed; 345 + if (stats.errors > 0) { 346 + userErrors += stats.errors; 347 + await this.db.insert(backfillErrors).values({ 348 + backfillId, 349 + did: user.did, 350 + collection, 351 + errorMessage: `${stats.errors} record(s) failed`, 352 + createdAt: new Date(), 353 + }); 354 + } 355 + } 356 + return { indexed: userIndexed, errors: userErrors }; 357 + }) 358 + ); 359 + 360 + // Aggregate results after settlement, including DID for debuggability 361 + batchResults.forEach((result, i) => { 362 + if (result.status === "fulfilled") { 363 + totalIndexed += result.value.indexed; 364 + totalErrors += result.value.errors; 365 + } else { 366 + totalErrors++; 367 + console.error(JSON.stringify({ 368 + event: "backfill.resume.batch_user_failed", 369 + backfillId: backfillId.toString(), 370 + did: batch[i].did, 371 + error: result.reason instanceof Error ? 
result.reason.message : String(result.reason), 372 + timestamp: new Date().toISOString(), 373 + })); 374 + } 375 + }); 376 + 377 + didsProcessed += batch.length; 378 + 379 + try { 380 + await this.db 381 + .update(backfillProgress) 382 + .set({ 383 + didsProcessed, 384 + recordsIndexed: totalIndexed, 385 + lastProcessedDid: batch[batch.length - 1].did, 386 + }) 387 + .where(eq(backfillProgress.id, backfillId)); 388 + } catch (checkpointError) { 389 + if (isProgrammingError(checkpointError)) throw checkpointError; 390 + console.warn(JSON.stringify({ 391 + event: "backfill.resume.checkpoint_failed", 392 + backfillId: backfillId.toString(), 393 + didsProcessed, 394 + error: checkpointError instanceof Error ? checkpointError.message : String(checkpointError), 395 + note: "Checkpoint save failed — continuing backfill. Resume may reprocess this batch.", 396 + timestamp: new Date().toISOString(), 397 + })); 398 + } 399 + } 400 + } 401 + 402 + // Mark completed 403 + await this.db 404 + .update(backfillProgress) 405 + .set({ 406 + status: "completed", 407 + didsProcessed, 408 + recordsIndexed: totalIndexed, 409 + completedAt: new Date(), 410 + }) 411 + .where(eq(backfillProgress.id, interrupted.id)); 412 + 413 + const result: BackfillResult = { 414 + backfillId: interrupted.id, 415 + type: interrupted.backfillType as BackfillStatus, 416 + didsProcessed, 417 + recordsIndexed: totalIndexed, 418 + errors: totalErrors, 419 + durationMs: Date.now() - startTime, 420 + }; 421 + 422 + console.log(JSON.stringify({ 423 + event: totalErrors > 0 ? "backfill.resume.completed_with_errors" : "backfill.resume.completed", 424 + ...result, 425 + backfillId: result.backfillId.toString(), 426 + timestamp: new Date().toISOString(), 427 + })); 428 + 429 + return result; 430 + } catch (error) { 431 + // Best-effort: mark as failed 432 + try { 433 + await this.db 434 + .update(backfillProgress) 435 + .set({ 436 + status: "failed", 437 + errorMessage: error instanceof Error ? 
error.message : String(error), 438 + completedAt: new Date(), 439 + }) 440 + .where(eq(backfillProgress.id, interrupted.id)); 441 + } catch (updateError) { 442 + console.error(JSON.stringify({ 443 + event: "backfill.resume.failed_status_update_error", 444 + backfillId: interrupted.id.toString(), 445 + error: updateError instanceof Error ? updateError.message : String(updateError), 446 + timestamp: new Date().toISOString(), 447 + })); 448 + } 449 + 450 + console.error(JSON.stringify({ 451 + event: "backfill.resume.failed", 452 + backfillId: interrupted.id.toString(), 453 + error: error instanceof Error ? error.message : String(error), 454 + timestamp: new Date().toISOString(), 455 + })); 456 + throw error; 457 + } finally { 458 + this.isRunning = false; 459 + } 460 + } 461 + 462 + /** 463 + * Execute a backfill operation. 464 + * Phase 1: Syncs forum-owned collections from the Forum DID. 465 + * Phase 2 (CatchUp only): Syncs user-owned collections from all known users. 466 + * 467 + * @param existingRowId - If provided (from prepareBackfillRow), skips creating a new progress row. 
```typescript
   */
  async performBackfill(type: BackfillStatus, existingRowId?: bigint): Promise<BackfillResult> {
    if (this.isRunning) {
      throw new Error("Backfill is already in progress");
    }

    this.isRunning = true;
    const startTime = Date.now();
    let backfillId: bigint | undefined = existingRowId;
    let totalIndexed = 0;
    let totalErrors = 0;
    let didsProcessed = 0;

    try {
      // Create progress row only if not pre-created by prepareBackfillRow
      if (backfillId === undefined) {
        const [row] = await this.db
          .insert(backfillProgress)
          .values({
            status: "in_progress",
            backfillType: type,
            startedAt: new Date(),
          })
          .returning({ id: backfillProgress.id });
        backfillId = row.id;
      }
      // Capture in const so TypeScript can narrow through async closures
      const resolvedBackfillId: bigint = backfillId;

      const agent = this.createAgentForPds();

      // Phase 1: Sync forum-owned collections from Forum DID
      for (const collection of FORUM_OWNED_COLLECTIONS) {
        const stats = await this.syncRepoRecords(
          this.config.forumDid,
          collection,
          agent
        );
        totalIndexed += stats.recordsIndexed;
        totalErrors += stats.errors;
        if (stats.errors > 0) {
          await this.db.insert(backfillErrors).values({
            backfillId: resolvedBackfillId,
            did: this.config.forumDid,
            collection,
            errorMessage: `${stats.errors} record(s) failed`,
            createdAt: new Date(),
          });
        }
      }

      // Phase 2: For CatchUp, sync user-owned records from known DIDs
      if (type === BackfillStatus.CatchUp) {
        // TODO(ATB-13): Paginate for large forums — currently loads all DIDs into memory
        const knownUsers = await this.db
          .select({ did: users.did })
          .from(users)
          .orderBy(asc(users.did));

        const didsTotal = knownUsers.length;

        await this.db
          .update(backfillProgress)
          .set({ didsTotal })
          .where(eq(backfillProgress.id, backfillId));

        // Process in batches of backfillConcurrency
        for (let i = 0; i < knownUsers.length; i += this.config.backfillConcurrency) {
          const batch = knownUsers.slice(i, i + this.config.backfillConcurrency);

          const batchResults = await Promise.allSettled(
            batch.map(async (user) => {
              let userIndexed = 0;
              let userErrors = 0;
              for (const collection of USER_OWNED_COLLECTIONS) {
                const stats = await this.syncRepoRecords(user.did, collection, agent);
                userIndexed += stats.recordsIndexed;
                if (stats.errors > 0) {
                  userErrors += stats.errors;
                  await this.db.insert(backfillErrors).values({
                    backfillId: resolvedBackfillId,
                    did: user.did,
                    collection,
                    errorMessage: `${stats.errors} record(s) failed`,
                    createdAt: new Date(),
                  });
                }
              }
              return { indexed: userIndexed, errors: userErrors };
            })
          );

          // Aggregate results after settlement, including DID for debuggability
          batchResults.forEach((result, i) => {
            if (result.status === "fulfilled") {
              totalIndexed += result.value.indexed;
              totalErrors += result.value.errors;
            } else {
              totalErrors++;
              console.error(JSON.stringify({
                event: "backfill.batch_user_failed",
                backfillId: resolvedBackfillId.toString(),
                did: batch[i].did,
                error: result.reason instanceof Error ? result.reason.message : String(result.reason),
                timestamp: new Date().toISOString(),
              }));
            }
          });

          didsProcessed += batch.length;

          try {
            await this.db
              .update(backfillProgress)
              .set({
                didsProcessed,
                recordsIndexed: totalIndexed,
                lastProcessedDid: batch[batch.length - 1].did,
              })
              .where(eq(backfillProgress.id, backfillId));
          } catch (checkpointError) {
            if (isProgrammingError(checkpointError)) throw checkpointError;
            console.warn(JSON.stringify({
              event: "backfill.checkpoint_failed",
              backfillId: resolvedBackfillId.toString(),
              didsProcessed,
              error: checkpointError instanceof Error ? checkpointError.message : String(checkpointError),
              note: "Checkpoint save failed — continuing backfill. Resume may reprocess this batch.",
              timestamp: new Date().toISOString(),
            }));
          }

          console.log(JSON.stringify({
            event: "backfill.progress",
            backfillId: backfillId.toString(),
            type,
            didsProcessed,
            didsTotal,
            recordsIndexed: totalIndexed,
            elapsedMs: Date.now() - startTime,
            timestamp: new Date().toISOString(),
          }));
        }
      }

      // Mark completed
      await this.db
        .update(backfillProgress)
        .set({
          status: "completed",
          didsProcessed,
          recordsIndexed: totalIndexed,
          completedAt: new Date(),
        })
        .where(eq(backfillProgress.id, backfillId));

      const result: BackfillResult = {
        backfillId: resolvedBackfillId,
        type,
        didsProcessed,
        recordsIndexed: totalIndexed,
        errors: totalErrors,
        durationMs: Date.now() - startTime,
      };

      console.log(JSON.stringify({
        event: totalErrors > 0 ? "backfill.completed_with_errors" : "backfill.completed",
        ...result,
        backfillId: result.backfillId.toString(),
        timestamp: new Date().toISOString(),
      }));

      return result;
    } catch (error) {
      // Best-effort: mark progress row as failed (if it was created)
      if (backfillId !== undefined) {
        try {
          await this.db
            .update(backfillProgress)
            .set({
              status: "failed",
              errorMessage: error instanceof Error ? error.message : String(error),
              completedAt: new Date(),
            })
            .where(eq(backfillProgress.id, backfillId));
        } catch (updateError) {
          console.error(JSON.stringify({
            event: "backfill.failed_status_update_error",
            backfillId: backfillId.toString(),
            error: updateError instanceof Error ? updateError.message : String(updateError),
            timestamp: new Date().toISOString(),
          }));
        }
      }

      console.error(JSON.stringify({
        event: "backfill.failed",
        backfillId: backfillId !== undefined ? backfillId.toString() : "not_created",
        error: error instanceof Error ? error.message : String(error),
        timestamp: new Date().toISOString(),
      }));
      throw error;
    } finally {
      this.isRunning = false;
    }
  }
}
```
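The Phase 2 loop above batches DIDs and uses `Promise.allSettled` so one failing repo cannot abort the rest of its batch. A minimal standalone sketch of that pattern, for reference (the helper name, batch size, and worker below are illustrative, not part of the PR):

```typescript
// Sketch of the batching pattern performBackfill uses: slice the work into
// fixed-size batches, run each batch concurrently, and aggregate only after
// every promise in the batch has settled.
async function processInBatches<T, R>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<R>,
): Promise<{ ok: R[]; failed: number }> {
  const ok: R[] = [];
  let failed = 0;
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // allSettled never rejects, so one bad item cannot abort the batch
    const results = await Promise.allSettled(batch.map(worker));
    for (const r of results) {
      if (r.status === "fulfilled") ok.push(r.value);
      else failed++;
    }
  }
  return { ok, failed };
}

// Usage: 5 items, batch size 2, worker that fails on odd numbers
processInBatches([1, 2, 3, 4, 5], 2, async (n) => {
  if (n % 2 === 1) throw new Error(`odd: ${n}`);
  return n * 10;
}).then(({ ok, failed }) => {
  console.log(ok.join(","), failed); // → "20,40" 3
});
```

Note that `allSettled` preserves input order, which is what lets the real code pair `batchResults[i]` back to `batch[i].did` when logging failures.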
apps/appview/src/lib/config.ts (+8)

```diff
···
   // Forum credentials (optional - for server-side PDS writes)
   forumHandle?: string;
   forumPassword?: string;
+  // Backfill configuration
+  backfillRateLimit: number;
+  backfillConcurrency: number;
+  backfillCursorMaxAgeHours: number;
 }

 export function loadConfig(): AppConfig {
···
     // Forum credentials (optional - for server-side PDS writes)
     forumHandle: process.env.FORUM_HANDLE,
     forumPassword: process.env.FORUM_PASSWORD,
+    // Backfill configuration
+    backfillRateLimit: parseInt(process.env.BACKFILL_RATE_LIMIT ?? "10", 10),
+    backfillConcurrency: parseInt(process.env.BACKFILL_CONCURRENCY ?? "10", 10),
+    backfillCursorMaxAgeHours: parseInt(process.env.BACKFILL_CURSOR_MAX_AGE_HOURS ?? "48", 10),
   };

   validateOAuthConfig(config);
```
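One caveat with the `parseInt(... ?? "10", 10)` defaults above: the fallback only covers an *unset* variable. A malformed value such as `BACKFILL_RATE_LIMIT=fast` parses to `NaN` rather than the default. A small NaN-guarding helper is one way to harden this (a sketch only; `intFromEnv` is hypothetical and not part of the PR):

```typescript
// Parse an integer env var, falling back to the default when the variable
// is unset OR set to something non-numeric (parseInt → NaN).
function intFromEnv(value: string | undefined, fallback: number): number {
  const parsed = parseInt(value ?? "", 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

console.log(intFromEnv(undefined, 10)); // → 10 (unset)
console.log(intFromEnv("25", 10));      // → 25
console.log(intFromEnv("fast", 10));    // → 10 (NaN guarded)
```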
apps/appview/src/lib/cursor-manager.ts (+14)

```diff
···
   rewind(cursor: bigint, microseconds: number): bigint {
     return cursor - BigInt(microseconds);
   }
+
+  /**
+   * Calculate cursor age in hours.
+   * Cursor values are Jetstream timestamps in microseconds since epoch.
+   *
+   * @param cursor - Cursor value (microseconds), or null
+   * @returns Age in hours, or null if cursor is null
+   */
+  getCursorAgeHours(cursor: bigint | null): number | null {
+    if (cursor === null) return null;
+    const cursorMs = Number(cursor / 1000n);
+    const ageMs = Date.now() - cursorMs;
+    return ageMs / (1000 * 60 * 60);
+  }
 }
```
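The microsecond-to-hours arithmetic in `getCursorAgeHours` can be checked in isolation with a fixed clock (the free function and timestamps below are illustrative; the real method reads `Date.now()` directly):

```typescript
// Same arithmetic as getCursorAgeHours, with the clock passed in so the
// example is deterministic: µs → ms via truncating bigint division, then
// ms → hours.
function cursorAgeHours(cursor: bigint | null, nowMs: number): number | null {
  if (cursor === null) return null;
  const cursorMs = Number(cursor / 1000n); // microseconds → milliseconds
  return (nowMs - cursorMs) / (1000 * 60 * 60);
}

const now = Date.UTC(2026, 0, 3); // fixed "now"
const twoDaysAgoUs = BigInt(now - 48 * 3600 * 1000) * 1000n;
console.log(cursorAgeHours(twoDaysAgoUs, now)); // → 48
console.log(cursorAgeHours(null, now));         // → null
```

At 48 the default `backfillCursorMaxAgeHours` threshold is hit, which is exactly the CatchUp boundary described in the PR.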
apps/appview/src/lib/firehose.ts (+79 -4)

```diff
···
 import { CircuitBreaker } from "./circuit-breaker.js";
 import { ReconnectionManager } from "./reconnection-manager.js";
 import { EventHandlerRegistry } from "./event-handler-registry.js";
+import type { BackfillManager } from "./backfill-manager.js";
+import { BackfillStatus } from "./backfill-manager.js";

 /**
  * Firehose service that subscribes to AT Proto Jetstream
···
   // Event handler registry
   private handlerRegistry: EventHandlerRegistry;
+
+  private backfillManager: BackfillManager | null = null;
+
+  // Guard: only run startup backfill on the initial start, not on reconnects.
+  private isInitialStart = true;

   // Collections we're interested in (full lexicon IDs)
   private readonly wantedCollections: string[];
···
       return;
     }

+    // Check for backfill before starting firehose — only on the initial start.
+    // Reconnects skip this block to avoid re-running a completed backfill every
+    // time the Jetstream WebSocket drops and recovers.
+    // Wrapped in try-catch so a transient DB error at startup doesn't kill the
+    // process — stale data served from the firehose is better than no data at all.
+    if (this.isInitialStart && this.backfillManager) {
+      this.isInitialStart = false;
+      try {
+        const interrupted = await this.backfillManager.checkForInterruptedBackfill();
+        if (interrupted) {
+          console.log(JSON.stringify({
+            event: "firehose.backfill.resuming_interrupted",
+            backfillId: interrupted.id.toString(),
+            lastProcessedDid: interrupted.lastProcessedDid,
+            timestamp: new Date().toISOString(),
+          }));
+          await this.backfillManager.resumeBackfill(interrupted);
+          console.log(JSON.stringify({
+            event: "firehose.backfill.resumed",
+            backfillId: interrupted.id.toString(),
+            timestamp: new Date().toISOString(),
+          }));
+        } else {
+          const savedCursorForCheck = await this.cursorManager.load();
+          const backfillStatus = await this.backfillManager.checkIfNeeded(savedCursorForCheck);
+
+          if (backfillStatus !== BackfillStatus.NotNeeded) {
+            console.log(JSON.stringify({
+              event: "firehose.backfill.starting",
+              type: backfillStatus,
+              timestamp: new Date().toISOString(),
+            }));
+            await this.backfillManager.performBackfill(backfillStatus);
+            console.log(JSON.stringify({
+              event: "firehose.backfill.completed",
+              type: backfillStatus,
+              timestamp: new Date().toISOString(),
+            }));
+          }
+        }
+      } catch (error) {
+        console.error(JSON.stringify({
+          event: "firehose.backfill.startup_error",
+          error: error instanceof Error ? error.message : String(error),
+          note: "Backfill skipped — firehose will start without it",
+          timestamp: new Date().toISOString(),
+        }));
+        // Continue to start firehose — stale data is better than no data
+      }
+    }
+
     try {
       // Load the last cursor from database
       const savedCursor = await this.cursorManager.load();
···
   }

   /**
+   * Inject the BackfillManager. Called during AppContext wiring.
+   */
+  setBackfillManager(manager: BackfillManager): void {
+    this.backfillManager = manager;
+  }
+
+  /**
+   * Expose the Indexer instance for BackfillManager wiring.
+   */
+  getIndexer(): Indexer {
+    return this.indexer;
+  }
+
+  /**
    * Handle reconnection with exponential backoff
    */
   private async handleReconnect() {
···
         this.running = false;
         await this.start();
       });
-    } catch {
-      console.error(
-        `[FATAL] Firehose indexing has stopped. The appview will continue serving stale data.`
-      );
+    } catch (error) {
+      console.error(JSON.stringify({
+        event: "firehose.reconnect.exhausted",
+        error: error instanceof Error ? error.message : String(error),
+        note: "Firehose indexing has stopped. AppView will continue serving stale data.",
+        timestamp: new Date().toISOString(),
+      }));
       this.running = false;
     }
   }
```
apps/appview/src/lib/route-errors.ts (+4 -1)

```diff
···
     try {
       const body = await c.req.json();
       return { body, error: null };
-    } catch {
+    } catch (error) {
+      // Only SyntaxError is expected here (malformed JSON from user input).
+      // Re-throw anything unexpected so programming bugs are not silently swallowed.
+      if (!(error instanceof SyntaxError)) throw error;
       return {
         body: null,
         error: c.json({ error: "Invalid JSON in request body" }, 400) as unknown as Response,
```
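The narrowed catch can be exercised without Hono. A minimal sketch of the same pattern (`safeParse` is an illustrative stand-in, not the real route helper): malformed JSON throws `SyntaxError` and is treated as user error; any other exception propagates.

```typescript
// Treat only SyntaxError (malformed JSON) as a user-facing 400; let
// anything else bubble up as a programming error.
function safeParse(raw: string): { body: unknown; error: string | null } {
  try {
    return { body: JSON.parse(raw), error: null };
  } catch (error) {
    if (!(error instanceof SyntaxError)) throw error;
    return { body: null, error: "Invalid JSON in request body" };
  }
}

console.log(safeParse('{"a":1}')); // → { body: { a: 1 }, error: null }
console.log(safeParse("{oops").error); // → "Invalid JSON in request body"
```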
apps/appview/src/routes/__tests__/admin-backfill.test.ts (new file, +469)

```typescript
import { describe, it, expect, vi, beforeEach, afterEach } from "vitest";
import { Hono } from "hono";
import { createAdminRoutes } from "../admin.js";
import { createTestContext, type TestContext } from "../../lib/__tests__/test-context.js";
import { roles, memberships, users, backfillProgress, backfillErrors } from "@atbb/db";
import { BackfillStatus } from "../../lib/backfill-manager.js";

// Mock restoreOAuthSession so tests control auth without real OAuth
vi.mock("../../lib/session.js", () => ({
  restoreOAuthSession: vi.fn(),
}));

// Mock Agent construction so auth middleware doesn't fail
vi.mock("@atproto/api", () => ({
  Agent: vi.fn().mockImplementation(() => ({})),
  AtpAgent: vi.fn().mockImplementation(() => ({
    com: { atproto: { repo: { listRecords: vi.fn() } } },
  })),
}));

import { restoreOAuthSession } from "../../lib/session.js";

const ADMIN_DID = "did:plc:test-admin-backfill";
const ROLE_RKEY = "admin-backfill-owner-role";

describe("Admin Backfill Routes", () => {
  let ctx: TestContext;
  let app: Hono;
  let mockBackfillManager: any;

  const authHeaders = { Cookie: "atbb_session=test-session" };

  beforeEach(async () => {
    ctx = await createTestContext();

    mockBackfillManager = {
      getIsRunning: vi.fn().mockReturnValue(false),
      checkIfNeeded: vi.fn().mockResolvedValue(BackfillStatus.NotNeeded),
      prepareBackfillRow: vi.fn().mockResolvedValue(42n),
      performBackfill: vi.fn().mockResolvedValue({
        backfillId: 1n,
        type: BackfillStatus.CatchUp,
        didsProcessed: 0,
        recordsIndexed: 0,
        errors: 0,
        durationMs: 100,
      }),
      checkForInterruptedBackfill: vi.fn().mockResolvedValue(null),
    };

    // Inject mock backfillManager into context
    (ctx as any).backfillManager = mockBackfillManager;

    app = new Hono();
    app.route("/api/admin", createAdminRoutes(ctx));

    // Default: mock auth to return valid session for "test-session"
    vi.mocked(restoreOAuthSession).mockResolvedValue({
      oauthSession: {
        did: ADMIN_DID,
        serverMetadata: { issuer: "https://pds.example.com" },
      } as any,
      cookieSession: {
        did: ADMIN_DID,
        handle: "admin.test",
        expiresAt: new Date(Date.now() + 3600_000),
        createdAt: new Date(),
      },
    });
  });

  afterEach(async () => {
    await ctx.cleanup();
    vi.clearAllMocks();
  });

  // Helper: insert admin user with manageForum permission in DB
  async function setupAdminUser() {
    await ctx.db.insert(roles).values({
      did: ctx.config.forumDid,
      rkey: ROLE_RKEY,
      cid: "test-cid",
      name: "Owner",
      description: "Forum owner",
      permissions: ["*"],
      priority: 0,
      createdAt: new Date(),
      indexedAt: new Date(),
    });

    await ctx.db.insert(users).values({
      did: ADMIN_DID,
      handle: "admin.test",
      indexedAt: new Date(),
    });

    await ctx.db.insert(memberships).values({
      did: ADMIN_DID,
      rkey: "admin-backfill-membership",
      cid: "test-cid",
      forumUri: `at://${ctx.config.forumDid}/space.atbb.forum.forum/self`,
      roleUri: `at://${ctx.config.forumDid}/space.atbb.forum.role/${ROLE_RKEY}`,
      createdAt: new Date(),
      indexedAt: new Date(),
    });
  }

  describe("POST /api/admin/backfill", () => {
    it("returns 401 without authentication", async () => {
      const res = await app.request("/api/admin/backfill", { method: "POST" });
      expect(res.status).toBe(401);
    });

    it("returns 403 when user lacks manageForum permission", async () => {
      // Auth but no membership/role in DB → permission check fails
      await ctx.db.insert(users).values({
        did: ADMIN_DID,
        handle: "admin.test",
        indexedAt: new Date(),
      });

      const res = await app.request("/api/admin/backfill", {
        method: "POST",
        headers: authHeaders,
      });
      expect(res.status).toBe(403);
    });

    it("returns 503 when backfillManager is not available", async () => {
      await setupAdminUser();
      (ctx as any).backfillManager = null;

      const res = await app.request("/api/admin/backfill", {
        method: "POST",
        headers: authHeaders,
      });
      expect(res.status).toBe(503);
      const data = await res.json();
      expect(data.error).toContain("not available");
    });

    it("returns 409 when backfill is already running", async () => {
      await setupAdminUser();
      mockBackfillManager.getIsRunning.mockReturnValue(true);

      const res = await app.request("/api/admin/backfill", {
        method: "POST",
        headers: authHeaders,
      });
      expect(res.status).toBe(409);
      const data = await res.json();
      expect(data.error).toContain("already in progress");
    });

    it("returns 200 with helpful message when backfill is not needed", async () => {
      await setupAdminUser();
      mockBackfillManager.checkIfNeeded.mockResolvedValue(BackfillStatus.NotNeeded);

      const res = await app.request("/api/admin/backfill", {
        method: "POST",
        headers: authHeaders,
      });
      expect(res.status).toBe(200);
      const data = await res.json();
      expect(data.message).toContain("No backfill needed");
      expect(mockBackfillManager.performBackfill).not.toHaveBeenCalled();
    });

    it("returns 202 and triggers backfill when gap is detected", async () => {
      await setupAdminUser();
      mockBackfillManager.checkIfNeeded.mockResolvedValue(BackfillStatus.CatchUp);

      const res = await app.request("/api/admin/backfill", {
        method: "POST",
        headers: authHeaders,
      });
      expect(res.status).toBe(202);
      const data = await res.json();
      expect(data.message).toContain("started");
      expect(data.type).toBe(BackfillStatus.CatchUp);
      expect(data.status).toBe("in_progress");
      expect(data.id).toBe("42"); // returned by prepareBackfillRow mock
      expect(mockBackfillManager.prepareBackfillRow).toHaveBeenCalledWith(BackfillStatus.CatchUp);
      expect(mockBackfillManager.performBackfill).toHaveBeenCalledWith(BackfillStatus.CatchUp, 42n);
    });

    it("forces catch_up backfill when ?force=catch_up is specified", async () => {
      await setupAdminUser();
      // Even if checkIfNeeded says NotNeeded, force overrides
      mockBackfillManager.checkIfNeeded.mockResolvedValue(BackfillStatus.NotNeeded);

      const res = await app.request("/api/admin/backfill?force=catch_up", {
        method: "POST",
        headers: authHeaders,
      });
      expect(res.status).toBe(202);
      const data = await res.json();
      expect(data.id).toBe("42");
      expect(mockBackfillManager.prepareBackfillRow).toHaveBeenCalledWith(BackfillStatus.CatchUp);
      expect(mockBackfillManager.performBackfill).toHaveBeenCalledWith(BackfillStatus.CatchUp, 42n);
      // checkIfNeeded should NOT be called when force is specified
      expect(mockBackfillManager.checkIfNeeded).not.toHaveBeenCalled();
    });

    it("forces full_sync backfill when ?force=full_sync is specified", async () => {
      await setupAdminUser();

      const res = await app.request("/api/admin/backfill?force=full_sync", {
        method: "POST",
        headers: authHeaders,
      });
      expect(res.status).toBe(202);
      const data = await res.json();
      expect(data.id).toBe("42");
      expect(mockBackfillManager.prepareBackfillRow).toHaveBeenCalledWith(BackfillStatus.FullSync);
      expect(mockBackfillManager.performBackfill).toHaveBeenCalledWith(BackfillStatus.FullSync, 42n);
    });

    it("falls through to gap detection when ?force is an unrecognized value", async () => {
      await setupAdminUser();
      mockBackfillManager.checkIfNeeded.mockResolvedValue(BackfillStatus.CatchUp);

      const res = await app.request("/api/admin/backfill?force=garbage", {
        method: "POST",
        headers: authHeaders,
      });
      // Unrecognized force value is ignored — gap detection runs
      expect(res.status).toBe(202);
      expect(mockBackfillManager.checkIfNeeded).toHaveBeenCalled();
    });
  });

  describe("GET /api/admin/backfill/:id", () => {
    it("returns 401 without authentication", async () => {
      const res = await app.request("/api/admin/backfill/1");
      expect(res.status).toBe(401);
    });

    it("returns 403 when user lacks manageForum permission", async () => {
      // Auth works (ADMIN_DID) but no role/membership in DB
      await ctx.db.insert(users).values({
        did: ADMIN_DID,
        handle: "admin.test",
        indexedAt: new Date(),
      });

      const res = await app.request("/api/admin/backfill/1", {
        headers: authHeaders,
      });
      expect(res.status).toBe(403);
    });

    it("returns 400 for non-numeric backfill ID", async () => {
      await setupAdminUser();
      const res = await app.request("/api/admin/backfill/notanumber", {
        headers: authHeaders,
      });
      expect(res.status).toBe(400);
      const data = await res.json();
      expect(data.error).toContain("Invalid backfill ID");
    });

    it("returns 400 for decimal backfill ID", async () => {
      await setupAdminUser();
      const res = await app.request("/api/admin/backfill/5.9", {
        headers: authHeaders,
      });
      expect(res.status).toBe(400);
    });

    it("returns 500 when database query fails", async () => {
      await setupAdminUser();
      const consoleSpy = vi.spyOn(console, "error").mockImplementation(() => {});

      // requirePermission makes 2 DB selects (membership + role); let them pass,
      // then fail on the handler's backfill_progress query (call 3).
      const origSelect = ctx.db.select.bind(ctx.db);
      vi.spyOn(ctx.db, "select")
        .mockImplementationOnce(() => origSelect() as any) // permissions: membership
        .mockImplementationOnce(() => origSelect() as any) // permissions: role
        .mockReturnValueOnce({ // handler: backfill_progress
          from: vi.fn().mockReturnValue({
            where: vi.fn().mockReturnValue({
              limit: vi.fn().mockRejectedValue(new Error("DB connection lost")),
            }),
          }),
        } as any);

      const res = await app.request("/api/admin/backfill/123", { headers: authHeaders });
      expect(res.status).toBe(500);

      consoleSpy.mockRestore();
    });

    it("returns 404 for unknown backfill ID", async () => {
      await setupAdminUser();
      const res = await app.request("/api/admin/backfill/999999", {
        headers: authHeaders,
      });
      expect(res.status).toBe(404);
      const data = await res.json();
      expect(data.error).toContain("not found");
    });

    it("returns progress data for a known backfill ID (completed)", async () => {
      await setupAdminUser();

      const [row] = await ctx.db
        .insert(backfillProgress)
        .values({
          status: "completed",
          backfillType: "catch_up",
          didsTotal: 10,
          didsProcessed: 10,
          recordsIndexed: 50,
          startedAt: new Date("2026-01-01T00:00:00Z"),
          completedAt: new Date("2026-01-01T00:05:00Z"),
        })
        .returning({ id: backfillProgress.id });

      const res = await app.request(`/api/admin/backfill/${row.id.toString()}`, {
        headers: authHeaders,
      });
      expect(res.status).toBe(200);
      const data = await res.json();
      expect(data.id).toBe(row.id.toString());
      expect(data.status).toBe("completed");
      expect(data.type).toBe("catch_up");
      expect(data.didsTotal).toBe(10);
      expect(data.didsProcessed).toBe(10);
      expect(data.recordsIndexed).toBe(50);
      expect(data.errorCount).toBe(0);
    });

    it("returns in_progress status for a running backfill", async () => {
      await setupAdminUser();

      const [row] = await ctx.db
        .insert(backfillProgress)
        .values({
          status: "in_progress",
          backfillType: "full_sync",
          didsTotal: 100,
          didsProcessed: 30,
          recordsIndexed: 75,
          startedAt: new Date("2026-01-01T00:00:00Z"),
        })
        .returning({ id: backfillProgress.id });

      const res = await app.request(`/api/admin/backfill/${row.id.toString()}`, {
        headers: authHeaders,
      });
      expect(res.status).toBe(200);
      const data = await res.json();
      expect(data.status).toBe("in_progress");
      expect(data.didsProcessed).toBe(30);
      expect(data.completedAt).toBeNull();
    });
  });

  describe("GET /api/admin/backfill/:id/errors", () => {
    it("returns 401 without authentication", async () => {
      const res = await app.request("/api/admin/backfill/1/errors");
      expect(res.status).toBe(401);
    });

    it("returns 403 when user lacks manageForum permission", async () => {
      await ctx.db.insert(users).values({
        did: ADMIN_DID,
        handle: "admin.test",
        indexedAt: new Date(),
      });

      const res = await app.request("/api/admin/backfill/1/errors", {
        headers: authHeaders,
      });
      expect(res.status).toBe(403);
    });

    it("returns 400 for non-numeric backfill ID", async () => {
      await setupAdminUser();
      const res = await app.request("/api/admin/backfill/notanumber/errors", {
        headers: authHeaders,
      });
      expect(res.status).toBe(400);
    });

    it("returns 500 when database query fails", async () => {
      await setupAdminUser();
      const consoleSpy = vi.spyOn(console, "error").mockImplementation(() => {});

      const origSelect = ctx.db.select.bind(ctx.db);
      vi.spyOn(ctx.db, "select")
        .mockImplementationOnce(() => origSelect() as any) // permissions: membership
        .mockImplementationOnce(() => origSelect() as any) // permissions: role
        .mockReturnValueOnce({ // handler: backfill_errors query
          from: vi.fn().mockReturnValue({
            where: vi.fn().mockReturnValue({
              orderBy: vi.fn().mockReturnValue({
                limit: vi.fn().mockRejectedValue(new Error("DB connection lost")),
              }),
            }),
          }),
        } as any);

      const res = await app.request("/api/admin/backfill/123/errors", { headers: authHeaders });
      expect(res.status).toBe(500);

      consoleSpy.mockRestore();
    });

    it("returns empty errors array when no errors exist", async () => {
      await setupAdminUser();

      const [row] = await ctx.db
        .insert(backfillProgress)
        .values({
          status: "completed",
          backfillType: "full_sync",
          didsTotal: 3,
          didsProcessed: 3,
          recordsIndexed: 15,
          startedAt: new Date("2026-01-01T00:00:00Z"),
        })
        .returning({ id: backfillProgress.id });

      const res = await app.request(`/api/admin/backfill/${row.id.toString()}/errors`, {
        headers: authHeaders,
      });
      expect(res.status).toBe(200);
      const data = await res.json();
      expect(data.errors).toHaveLength(0);
    });

    it("returns errors for a backfill with failures", async () => {
      await setupAdminUser();

      const [row] = await ctx.db
        .insert(backfillProgress)
        .values({
          status: "completed",
          backfillType: "catch_up",
          didsTotal: 5,
          didsProcessed: 5,
          recordsIndexed: 8,
          startedAt: new Date("2026-01-01T00:00:00Z"),
        })
        .returning({ id: backfillProgress.id });

      await ctx.db.insert(backfillErrors).values({
        backfillId: row.id,
        did: "did:plc:failed-user",
        collection: "space.atbb.post",
        errorMessage: "PDS connection failed",
        createdAt: new Date("2026-01-01T00:01:00Z"),
      });

      const res = await app.request(`/api/admin/backfill/${row.id.toString()}/errors`, {
        headers: authHeaders,
      });
      expect(res.status).toBe(200);
      const data = await res.json();
      expect(data.errors).toHaveLength(1);
      expect(data.errors[0].did).toBe("did:plc:failed-user");
      expect(data.errors[0].collection).toBe("space.atbb.post");
      expect(data.errors[0].errorMessage).toBe("PDS connection failed");
    });
  });
});
```
apps/appview/src/routes/__tests__/admin.test.ts (+6 -4)

```diff
···
       expect(res.status).toBe(403);
       const data = await res.json();
       expect(data.error).toContain("equal or higher authority");
-      expect(data.yourPriority).toBe(10);
-      expect(data.targetRolePriority).toBe(10);
+      // Priority values must not be leaked in responses (security: CLAUDE.md)
+      expect(data.yourPriority).toBeUndefined();
+      expect(data.targetRolePriority).toBeUndefined();
       expect(mockPutRecord).not.toHaveBeenCalled();
     });
···
       expect(res.status).toBe(403);
       const data = await res.json();
       expect(data.error).toContain("equal or higher authority");
-      expect(data.yourPriority).toBe(10);
-      expect(data.targetRolePriority).toBe(0);
+      // Priority values must not be leaked in responses (security: CLAUDE.md)
+      expect(data.yourPriority).toBeUndefined();
+      expect(data.targetRolePriority).toBeUndefined();
       expect(mockPutRecord).not.toHaveBeenCalled();
     });
```
apps/appview/src/routes/__tests__/health.test.ts (+1)

```diff
···
         authenticated: true,
       }),
     } as any,
+    backfillManager: null,
   };

   app = new Hono().route("/", createHealthRoutes(ctx));
```
+181 -3
apps/appview/src/routes/admin.ts
··· 3 3 import type { Variables } from "../types.js"; 4 4 import { requireAuth } from "../middleware/auth.js"; 5 5 import { requirePermission, getUserRole } from "../middleware/permissions.js"; 6 - import { memberships, roles, users, forums } from "@atbb/db"; 6 + import { memberships, roles, users, forums, backfillProgress, backfillErrors } from "@atbb/db"; 7 7 import { eq, and, sql, asc } from "drizzle-orm"; 8 + import { isProgrammingError } from "../lib/errors.js"; 9 + import { BackfillStatus } from "../lib/backfill-manager.js"; 10 + import { CursorManager } from "../lib/cursor-manager.js"; 8 11 import { 9 12 handleReadError, 10 13 handleWriteError, ··· 75 78 if (role.priority <= assignerRole.priority) { 76 79 return c.json({ 77 80 error: "Cannot assign role with equal or higher authority", 78 - yourPriority: assignerRole.priority, 79 - targetRolePriority: role.priority 80 81 }, 403); 81 82 } 82 83 ··· 220 221 roleUri: member.roleUri, 221 222 joinedAt: member.joinedAt?.toISOString(), 222 223 })), 224 + isTruncated: membersList.length === 100, 223 225 }); 224 226 } catch (error) { 225 227 return handleReadError(c, error, "Failed to retrieve members", { ··· 282 284 }); 283 285 } 284 286 }); 287 + 288 + /** 289 + * POST /api/admin/backfill 290 + * 291 + * Trigger a backfill operation. Runs asynchronously. 292 + * Returns 202 Accepted immediately. 293 + * Use ?force=catch_up or ?force=full_sync to override gap detection. 
```typescript
 */
app.post(
  "/backfill",
  requireAuth(ctx),
  requirePermission(ctx, "space.atbb.permission.manageForum"),
  async (c) => {
    const backfillManager = ctx.backfillManager;
    if (!backfillManager) {
      return c.json({ error: "Backfill manager not available" }, 503);
    }

    if (backfillManager.getIsRunning()) {
      return c.json({ error: "A backfill is already in progress" }, 409);
    }

    // Determine backfill type
    const force = c.req.query("force");
    let type: BackfillStatus;

    if (force === "catch_up" || force === "full_sync") {
      type = force === "catch_up" ? BackfillStatus.CatchUp : BackfillStatus.FullSync;
    } else {
      try {
        const cursor = await new CursorManager(ctx.db).load();
        type = await backfillManager.checkIfNeeded(cursor);
      } catch (error) {
        if (isProgrammingError(error)) throw error;
        console.error(JSON.stringify({
          event: "backfill.admin_trigger.check_failed",
          error: error instanceof Error ? error.message : String(error),
          timestamp: new Date().toISOString(),
        }));
        return c.json({ error: "Failed to check backfill status. Please try again later." }, 500);
      }

      if (type === BackfillStatus.NotNeeded) {
        return c.json({
          message: "No backfill needed. Use ?force=catch_up or ?force=full_sync to override.",
        }, 200);
      }
    }

    // Create progress row first so we can return the ID immediately in the 202 response
    let progressId: bigint;
    try {
      progressId = await backfillManager.prepareBackfillRow(type);
    } catch (error) {
      if (isProgrammingError(error)) throw error;
      console.error(JSON.stringify({
        event: "backfill.admin_trigger.create_row_failed",
        error: error instanceof Error ? error.message : String(error),
        timestamp: new Date().toISOString(),
      }));
      return c.json({ error: "Failed to start backfill. Please try again later." }, 500);
    }

    // Fire and forget — don't await so response is immediate
    backfillManager.performBackfill(type, progressId).catch((err) => {
      console.error(JSON.stringify({
        event: "backfill.admin_trigger_failed",
        backfillId: progressId.toString(),
        error: err instanceof Error ? err.message : String(err),
        timestamp: new Date().toISOString(),
      }));
    });

    return c.json({
      message: "Backfill started",
      type,
      status: "in_progress",
      id: progressId.toString(),
    }, 202);
  }
);

/**
 * GET /api/admin/backfill/:id
 *
 * Get status and progress for a specific backfill by ID.
 */
app.get(
  "/backfill/:id",
  requireAuth(ctx),
  requirePermission(ctx, "space.atbb.permission.manageForum"),
  async (c) => {
    const id = c.req.param("id");
    if (!/^\d+$/.test(id)) {
      return c.json({ error: "Invalid backfill ID" }, 400);
    }
    const parsedId = BigInt(id);

    try {
      const [row] = await ctx.db
        .select()
        .from(backfillProgress)
        .where(eq(backfillProgress.id, parsedId))
        .limit(1);

      if (!row) {
        return c.json({ error: "Backfill not found" }, 404);
      }

      const [errorCount] = await ctx.db
        .select({ count: sql<number>`count(*)::int` })
        .from(backfillErrors)
        .where(eq(backfillErrors.backfillId, row.id));

      return c.json({
        id: row.id.toString(),
        status: row.status,
        type: row.backfillType,
        didsTotal: row.didsTotal,
        didsProcessed: row.didsProcessed,
        recordsIndexed: row.recordsIndexed,
        errorCount: errorCount?.count ?? 0,
        startedAt: row.startedAt.toISOString(),
        completedAt: row.completedAt?.toISOString() ?? null,
        errorMessage: row.errorMessage,
      });
    } catch (error) {
      return handleReadError(c, error, "Failed to fetch backfill progress", {
        operation: "GET /api/admin/backfill/:id",
        id,
      });
    }
  }
);

/**
 * GET /api/admin/backfill/:id/errors
 *
 * List per-DID errors for a specific backfill.
 */
app.get(
  "/backfill/:id/errors",
  requireAuth(ctx),
  requirePermission(ctx, "space.atbb.permission.manageForum"),
  async (c) => {
    const id = c.req.param("id");
    if (!/^\d+$/.test(id)) {
      return c.json({ error: "Invalid backfill ID" }, 400);
    }
    const parsedId = BigInt(id);

    try {
      const errors = await ctx.db
        .select()
        .from(backfillErrors)
        .where(eq(backfillErrors.backfillId, parsedId))
        .orderBy(asc(backfillErrors.createdAt))
        .limit(1000);

      return c.json({
        errors: errors.map((e) => ({
          id: e.id.toString(),
          did: e.did,
          collection: e.collection,
          errorMessage: e.errorMessage,
          createdAt: e.createdAt.toISOString(),
        })),
      });
    } catch (error) {
      return handleReadError(c, error, "Failed to fetch backfill errors", {
        operation: "GET /api/admin/backfill/:id/errors",
        id,
      });
    }
  }
);

return app;
}
```
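A hypothetical admin-side consumer of this async API: trigger the backfill, then poll the status endpoint until it leaves `in_progress`. The helper names (`triggerAndPoll`, `FetchLike`) are illustrative and not part of the codebase; the status codes and body shapes mirror the handlers above, and the fetch function is injected so the sketch stays testable.

```typescript
// Illustrative client for the async trigger-then-poll flow. Not part of the
// codebase; response shapes follow the route handlers above.
type FetchLike = (
  url: string,
  init?: { method?: string },
) => Promise<{ status: number; json(): Promise<any> }>;

async function triggerAndPoll(
  baseUrl: string,
  fetchFn: FetchLike,
  pollMs = 2000,
): Promise<{ outcome: string; id?: string }> {
  const res = await fetchFn(`${baseUrl}/api/admin/backfill`, { method: "POST" });
  if (res.status === 200) return { outcome: "not_needed" }; // fresh cursor
  if (res.status === 409) return { outcome: "already_running" };
  if (res.status !== 202) throw new Error(`trigger failed: HTTP ${res.status}`);

  const { id } = await res.json(); // bigint serialized as string
  for (;;) {
    const poll = await fetchFn(`${baseUrl}/api/admin/backfill/${id}`);
    const body = await poll.json();
    // Terminal states per the route: "completed" or "failed"
    if (body.status !== "in_progress") return { outcome: body.status, id };
    await new Promise((r) => setTimeout(r, pollMs)); // back off between polls
  }
}
```

Injecting `fetchFn` rather than calling the global `fetch` keeps the sketch unit-testable against a stub server.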
`bruno/AppView API/Admin/Get Backfill Errors.bru` (+50)
```
meta {
  name: Get Backfill Errors
  type: http
  seq: 3
}

get {
  url: {{appview_url}}/api/admin/backfill/1/errors
}

assert {
  res.status: eq 200
  res.body.errors: isDefined
}

docs {
  Lists per-DID errors recorded during a specific backfill run.

  Requires authentication via session cookie and `space.atbb.permission.manageForum` permission.

  Path params:
  - id: integer (required) - The numeric ID from the backfill_progress table

  Returns:
  {
    "errors": [
      {
        "id": "1",
        "did": "did:plc:example",
        "collection": "space.atbb.post",
        "errorMessage": "fetch failed: connection refused",
        "createdAt": "2026-02-23T10:05:00.000Z"
      }
    ]
  }

  Returns an empty errors array if the backfill completed with no per-DID failures.

  Error codes:
  - 400: Invalid backfill ID (non-integer path parameter)
  - 401: Unauthorized (not authenticated)
  - 403: Forbidden (lacks manageForum permission)
  - 500: Server error

  Notes:
  - Results are ordered by createdAt ascending (earliest errors first)
  - Limited to 1000 entries per response
  - Per-DID errors are partial failures; the backfill continues processing other DIDs
  - A null collection means the error occurred before reaching the collection sync stage
}
```
`bruno/AppView API/Admin/Get Backfill Status.bru` (+56)
```
meta {
  name: Get Backfill Status
  type: http
  seq: 2
}

get {
  url: {{appview_url}}/api/admin/backfill/1
}

assert {
  res.status: eq 200
  res.body.id: isDefined
  res.body.status: isDefined
  res.body.type: isDefined
}

docs {
  Returns the status and progress for a specific backfill run by its integer ID.

  Requires authentication via session cookie and `space.atbb.permission.manageForum` permission.

  Path params:
  - id: integer (required) - The numeric ID from the backfill_progress table

  Returns:
  {
    "id": "1",
    "status": "in_progress" | "completed" | "failed",
    "type": "catch_up" | "full_sync",
    "didsTotal": 150,
    "didsProcessed": 75,
    "recordsIndexed": 1200,
    "errorCount": 2,
    "startedAt": "2026-02-23T10:00:00.000Z",
    "completedAt": null,
    "errorMessage": null
  }

  Status values:
  - "in_progress" — backfill is actively running
  - "completed" — backfill finished successfully
  - "failed" — backfill encountered a fatal error (see errorMessage)

  Error codes:
  - 400: Invalid backfill ID (non-integer path parameter)
  - 401: Unauthorized (not authenticated)
  - 403: Forbidden (lacks manageForum permission)
  - 404: Backfill not found (ID does not exist in backfill_progress table)
  - 500: Server error

  Notes:
  - errorCount is the number of per-DID failures (partial failures don't set status=failed)
  - Use GET /api/admin/backfill/:id/errors to see per-DID error details
  - completedAt is null while in_progress and when status=failed
}
```
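Consumers of this payload (a dashboard or CLI) often want a one-line progress summary. A minimal sketch, assuming the response shape documented above; `formatProgress` is a hypothetical helper, not part of the API:

```typescript
// Hypothetical helper: condense the status payload into a readable line.
interface BackfillStatusBody {
  status: string;
  didsTotal: number;
  didsProcessed: number;
  recordsIndexed: number;
  errorCount: number;
}

function formatProgress(b: BackfillStatusBody): string {
  // Guard against didsTotal=0 (e.g. a run that found no members yet)
  const pct = b.didsTotal > 0
    ? Math.round((b.didsProcessed / b.didsTotal) * 100)
    : 0;
  return `${b.status}: ${b.didsProcessed}/${b.didsTotal} DIDs (${pct}%), ` +
    `${b.recordsIndexed} records, ${b.errorCount} errors`;
}
```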
`bruno/AppView API/Admin/Trigger Backfill.bru` (+65)
```
meta {
  name: Trigger Backfill
  type: http
  seq: 1
}

post {
  url: {{appview_url}}/api/admin/backfill
}

params:query {
  ~force: full_sync
}

headers {
  Content-Type: application/json
}

assert {
  res.status: in [200, 202]
  res.body.message: isDefined
}

docs {
  Triggers a backfill of AT Protocol repo records for all known forum members.
  Runs asynchronously — returns immediately with 202 Accepted or 200 (if not needed).

  Requires authentication via session cookie and `space.atbb.permission.manageForum` permission.

  Query params:
  - force: string (optional) - Override automatic gap detection. One of:
    - "catch_up" — Replay firehose events since last cursor (missed events only)
    - "full_sync" — Re-sync every member's full repo from scratch

  When `force` is omitted, the endpoint runs automatic gap detection:
  - If cursor is older than backfillCursorMaxAgeHours, triggers CatchUp
  - If no cursor exists at all, triggers FullSync
  - If cursor is recent and healthy, returns 200 with "No backfill needed" message

  Returns (202 — backfill started):
  {
    "message": "Backfill started",
    "type": "catch_up" | "full_sync",
    "status": "in_progress",
    "id": "42" // bigint as string — use with GET /api/admin/backfill/:id to poll progress
  }

  Returns (200 — no backfill needed):
  {
    "message": "No backfill needed. Use ?force=catch_up or ?force=full_sync to override."
  }

  Error codes:
  - 401: Unauthorized (not authenticated)
  - 403: Forbidden (lacks manageForum permission)
  - 409: Conflict — a backfill is already in progress
  - 500: Failed to check backfill status or create progress row (server error)
  - 503: BackfillManager not available (server configuration issue)

  Notes:
  - The backfill ID in the 202 response can be used immediately with GET /api/admin/backfill/:id
  - Unrecognized values for ?force are ignored and fall through to automatic gap detection
}
```
`docs/atproto-forum-plan.md` (+15 -1)
···
  - Files: `apps/appview/src/lib/ban-enforcer.ts` (new), `apps/appview/src/lib/indexer.ts` (3 handler overrides), `apps/appview/src/lib/__tests__/indexer-ban-enforcer.test.ts` (new), `apps/appview/src/lib/__tests__/indexer.test.ts` (additions)
- [x] Document the trust model: operators must trust their AppView instance, which is acceptable for self-hosted single-server deployments
  - ATB-22 | `docs/trust-model.md` (new) — covers operator responsibilities, user data guarantees, security implications, and future delegation path; referenced from deployment guide
- [x] **ATB-13: Backfill & Repo Sync** — **Complete:** 2026-02-23
  - Automatic gap detection on startup: if cursor is stale (>48h) → CatchUp; no cursor → FullSync; healthy cursor → skip
  - `BackfillManager` orchestrates full-repo sync via `com.atproto.sync.listRepos` + `com.atproto.repo.listRecords` for each member DID
  - Processes all `space.atbb.*` collections through the same `Indexer` handlers used by the live firehose
  - Interrupt recovery: in-progress backfills at shutdown are resumed on next startup via `status='interrupted'` checkpoint
  - Admin API endpoints for manual triggering and monitoring:
    - `POST /api/admin/backfill` — triggers backfill (202 async, 200 if not needed, 409 if running, optional `?force=catch_up|full_sync`)
    - `GET /api/admin/backfill/:id` — polls status, progress, and error count for a specific run
    - `GET /api/admin/backfill/:id/errors` — lists per-DID errors (partial failures that didn't stop the run)
  - DB tables: `backfill_progress` (status, type, dids_total/processed, records_indexed, checkpoint) and `backfill_errors` (per-DID failures)
  - Rate-limited with configurable concurrency (`backfillRateLimit`, `backfillConcurrency` in config)
  - 555 tests passing; 16 new tests in `admin-backfill.test.ts` covering auth, progress, errors, and 409/503 edge cases
  - Files: `apps/appview/src/lib/backfill-manager.ts` (new), `apps/appview/src/routes/admin.ts` (3 new routes), `apps/appview/src/lib/firehose.ts` (startup integration), `packages/db/src/schema.ts` (2 new tables)
  - Bruno collection: `bruno/AppView API/Admin/` (3 .bru files)

#### Phase 4: Web UI (Week 7–9)
- [x] Forum homepage: category list, recent topics
···
2. **Firehose filtering:** The AT Proto firehose is *all* records from *all* users. Filtering for your Lexicon types at scale requires thought. For MVP (small user base), naive filtering is fine.
3. **PDS write path:** Writing records to a user's PDS on their behalf requires proper OAuth scopes. Verify the current state of the AT Proto OAuth spec — it's been evolving.
4. **Record deletion:** If a user deletes a post from their PDS, the firehose emits a tombstone. Your indexer needs to handle this (soft-delete from index).
5. **~~Backfill~~** ✅ RESOLVED — ATB-13 implemented full backfill & repo sync. `BackfillManager` handles gap detection, CatchUp vs FullSync, interrupt recovery, rate-limited repo crawl, and admin trigger API (`POST /api/admin/backfill`). See Phase 3 ATB-13 entry for details.

---
`docs/plans/2026-02-22-backfill-repo-sync-design.md` (+197)
# ATB-13: Backfill and Repo Sync Design

**Date:** 2026-02-22
**Linear:** ATB-13
**Status:** Approved

## Problem

The AppView must handle restarts gracefully. Short downtimes (<1 hour) recover via cursor resume on the firehose. Longer downtimes (>48 hours) risk exceeding the ~72-hour firehose retention window, causing permanent data loss. First-time startups with an empty database see no historical data at all.

## Design Decisions

| Decision | Choice | Rationale |
|----------|--------|-----------|
| First-startup scope | Forum PDS only | Empty DB has no known user DIDs; firehose discovers users going forward |
| Firehose during backfill | Blocked | Simpler than concurrent — no deduplication needed |
| Progress tracking | DB table with resume | Survives crashes; operator visibility via admin API |
| Admin API style | Async with polling | Backfills can take minutes; synchronous would timeout |
| Indexing approach | Reuse existing Indexer handlers | Zero logic duplication; well-tested FK resolution, ban enforcement, soft deletes |

## Architecture

### BackfillManager Class

New class at `apps/appview/src/lib/backfill-manager.ts`. Injected into AppContext.

**Core methods:**

- `checkIfNeeded(cursor: bigint | null): Promise<BackfillStatus>` — gap detection
- `performBackfill(): Promise<BackfillResult>` — orchestrates full sync
- `syncRepoRecords(did: string, collection: string): Promise<SyncStats>` — syncs one (DID, collection) pair

### Gap Detection

```
BackfillStatus = NotNeeded | CatchUp | FullSync
```

Decision logic:

1. No cursor → `FullSync`
2. Cursor exists, forums table empty → `FullSync` (DB inconsistency)
3. Cursor age > 48 hours → `CatchUp`
4. Otherwise → `NotNeeded`

The 48-hour threshold (configurable via `BACKFILL_CURSOR_MAX_AGE_HOURS`) provides safety margin below the ~72-hour firehose retention window.

### Repo Sync Mechanism

Uses `com.atproto.repo.listRecords()` (collection-based sync, not full CAR files).

For each (DID, collection) pair:

1. Paginate through `listRecords({ repo: did, collection, limit: 100 })`
2. Transform each record to `CommitCreateEvent` shape (~10-line adapter)
3. Call matching `indexer.handleXCreate(event)` — reuses all existing logic
4. Track success/error counts

**Event shape adapter:**

```typescript
function toCreateEvent(did: string, record: ListRecordItem): CommitCreateEvent {
  const rkey = record.uri.split("/").pop()!;
  return {
    did,
    commit: { rkey, cid: record.cid },
    record: record.value,
  };
}
```

**Collection sync order** (respects FK dependencies):

1. `space.atbb.forum.forum` (no deps)
2. `space.atbb.forum.category` (FK to forum)
3. `space.atbb.forum.board` (FK to category)
4. `space.atbb.forum.role` (FK to forum)
5. `space.atbb.membership` (FK to forum, user)
6. `space.atbb.post` (FK to board, user, optionally parent post)
7. `space.atbb.mod_action` (FK to forum)
8. `space.atbb.reaction` (FK to post, user — stub)

**Rate limiting:** Delay-based throttle at `1000 / BACKFILL_RATE_LIMIT` ms between page fetches. Default: 10 req/s per PDS.

### Conflict Resolution

- `UNIQUE(did, rkey)` constraint handles duplicates naturally via Indexer's upsert logic
- CID comparison: if CID differs from indexed record, the Indexer updates it
- Tombstone/deletion detection via `listRecords` not possible — deferred to post-MVP (would require `com.atproto.sync.getRepo` CAR parsing)

### Backfill Orchestration

**FullSync flow:**

1. Sync Forum DID across all forum-owned collections (in dependency order)
2. Mark backfill completed
3. Return stats

**CatchUp flow:**

1. Sync Forum DID first (structure may have changed)
2. Query `users` table for all known DIDs, sorted by `did ASC`
3. Process DIDs in batches of `BACKFILL_CONCURRENCY` (default: 10)
4. For each DID: sync membership, posts (user-owned collections)
5. Update `backfill_progress` row every batch
6. Mark completed; return stats

**Resume from checkpoint:**

On startup, check for `backfill_progress` row with `status = 'in_progress'`. If found, resume from `last_processed_did` by skipping alphabetically earlier DIDs.

### Database Tables

```sql
CREATE TABLE backfill_progress (
  id SERIAL PRIMARY KEY,
  status VARCHAR(20) NOT NULL,          -- 'in_progress', 'completed', 'failed'
  backfill_type VARCHAR(20) NOT NULL,   -- 'full_sync', 'catch_up'
  last_processed_did VARCHAR(255),
  dids_total INTEGER DEFAULT 0,
  dids_processed INTEGER DEFAULT 0,
  records_indexed INTEGER DEFAULT 0,
  started_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
  completed_at TIMESTAMP WITH TIME ZONE,
  error_message TEXT
);

CREATE TABLE backfill_errors (
  id SERIAL PRIMARY KEY,
  backfill_id INTEGER NOT NULL REFERENCES backfill_progress(id),
  did VARCHAR(255) NOT NULL,
  collection VARCHAR(255) NOT NULL,
  error_message TEXT NOT NULL,
  created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW()
);
```

### Firehose Integration

Modified `FirehoseService.start()`:

1. Check for interrupted backfill (resume if found)
2. Load cursor, run `backfillManager.checkIfNeeded(cursor)`
3. If backfill needed: set `isBackfilling = true`, await backfill, clear flag
4. Proceed with existing cursor resume + jetstream start

Guard: reject `start()` calls while `isBackfilling === true`.

### Admin API

| Method | Path | Permission | Description |
|--------|------|------------|-------------|
| `POST` | `/api/admin/backfill` | `manageForum` | Trigger manual backfill; returns `{ backfillId, status }` |
| `GET` | `/api/admin/backfill/:id` | `manageForum` | Poll progress + error count |
| `GET` | `/api/admin/backfill/:id/errors` | `manageForum` | List errors for a backfill run |

POST behavior: check `isBackfilling` (409 if busy), determine type via `checkIfNeeded()`, allow `?force=catch_up|full_sync` override, kick off async, return immediately.

### Error Handling

- **PDS unreachable:** Log warning, insert `backfill_errors` row, continue to next DID
- **Record parse failure:** Log with AT URI, continue to next record
- **Programming errors:** Re-throw (TypeError, ReferenceError, SyntaxError)
- **Partial completion:** Status set to `completed`, errors queryable via admin API

### Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `BACKFILL_RATE_LIMIT` | `10` | Max XRPC requests/second per PDS |
| `BACKFILL_CONCURRENCY` | `10` | Max DIDs processed concurrently |
| `BACKFILL_CURSOR_MAX_AGE_HOURS` | `48` | Cursor age threshold for CatchUp |

## Files

| Action | File |
|--------|------|
| Create | `apps/appview/src/lib/backfill-manager.ts` |
| Create | `apps/appview/src/lib/__tests__/backfill-manager.test.ts` |
| Create | `apps/appview/src/lib/__tests__/backfill-integration.test.ts` |
| Create | `packages/db/drizzle/migrations/XXXX_add_backfill_tables.sql` |
| Modify | `packages/db/src/schema.ts` — add backfill tables |
| Modify | `apps/appview/src/lib/firehose.ts` — backfill check in start() |
| Modify | `apps/appview/src/lib/cursor-manager.ts` — add getCursorAge() |
| Modify | `apps/appview/src/lib/app-context.ts` — add backfillManager |
| Modify | `apps/appview/src/lib/config.ts` — add backfill config fields |
| Modify | `apps/appview/src/routes/admin.ts` — add backfill endpoints |
| Modify | `apps/appview/src/index.ts` — wire BackfillManager into startup |
| Modify | `turbo.json` — add backfill env vars |

## Testing

**Unit tests:** Gap detection (all 4 scenarios), syncRepoRecords pagination, event shape transformation, rate limiting, conflict resolution, resume logic, progress updates.

**Integration tests:** FullSync with mock PDS, CatchUp with known users, interrupted resume, partial PDS failure, admin endpoint trigger/poll, firehose blocked during backfill.

**Mocking:** Mock `AtpAgent.com.atproto.repo.listRecords` for controlled responses. Mock Indexer methods to verify event shapes. Real test DB for progress tracking.
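The gap-detection rules in the design above reduce to a small pure function over cursor age and DB state. This is an illustrative sketch, not the real implementation (which lives in `BackfillManager.checkIfNeeded` and also performs the forums-table query and structured logging); `decideBackfill` and its parameter names are assumptions for this example.

```typescript
// Sketch of the documented decision table: cursor state + DB state → action.
type BackfillStatus = "not_needed" | "catch_up" | "full_sync";

function decideBackfill(opts: {
  cursorAgeHours: number | null; // null = no cursor persisted
  hasForumData: boolean;         // forums table non-empty
  maxAgeHours?: number;          // BACKFILL_CURSOR_MAX_AGE_HOURS
}): BackfillStatus {
  const { cursorAgeHours, hasForumData, maxAgeHours = 48 } = opts;
  if (cursorAgeHours === null) return "full_sync";     // rule 1: first startup / wiped cursor
  if (!hasForumData) return "full_sync";               // rule 2: cursor without data = DB inconsistency
  if (cursorAgeHours > maxAgeHours) return "catch_up"; // rule 3: stale, but inside firehose retention
  return "not_needed";                                 // rule 4: fresh cursor, healthy DB
}
```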
`docs/plans/2026-02-22-backfill-repo-sync-implementation.md` (+1876)
# ATB-13: Backfill & Repo Sync Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

**Goal:** Enable the AppView to recover from extended downtime and bootstrap from an empty database by syncing records from AT Proto PDSs.

**Architecture:** A `BackfillManager` class performs gap detection on startup, then fetches records via `com.atproto.repo.listRecords()` and feeds them through the existing `Indexer` handlers. Progress is tracked in new DB tables for crash-resilient resume. Admin API endpoints allow manual triggers and progress monitoring.

**Tech Stack:** TypeScript, Hono, Drizzle ORM, `@atproto/api` (AtpAgent), Vitest

**Design doc:** `docs/plans/2026-02-22-backfill-repo-sync-design.md`

---

## Task 1: Add Backfill Tables to DB Schema

Add `backfillProgress` and `backfillErrors` tables to the shared DB package, then generate a Drizzle migration.

**Files:**
- Modify: `packages/db/src/schema.ts` (append after `roles` table, line ~213)
- Modify: `packages/db/src/index.ts` (already re-exports all of schema.ts via `export * from "./schema.js"`)
- Generate: `apps/appview/drizzle/0007_*.sql` (auto-generated by drizzle-kit)

**Step 1: Add backfill tables to schema**

Append to `packages/db/src/schema.ts` after the `roles` table:

```typescript
// ── backfill_progress ───────────────────────────────────
// Tracks backfill job state for crash-resilient resume.
export const backfillProgress = pgTable("backfill_progress", {
  id: bigserial("id", { mode: "bigint" }).primaryKey(),
  status: text("status").notNull(), // 'in_progress', 'completed', 'failed'
  backfillType: text("backfill_type").notNull(), // 'full_sync', 'catch_up'
  lastProcessedDid: text("last_processed_did"),
  didsTotal: integer("dids_total").notNull().default(0),
  didsProcessed: integer("dids_processed").notNull().default(0),
  recordsIndexed: integer("records_indexed").notNull().default(0),
  startedAt: timestamp("started_at", { withTimezone: true }).notNull(),
  completedAt: timestamp("completed_at", { withTimezone: true }),
  errorMessage: text("error_message"),
});

// ── backfill_errors ─────────────────────────────────────
// Per-DID error log for failed backfill syncs.
export const backfillErrors = pgTable(
  "backfill_errors",
  {
    id: bigserial("id", { mode: "bigint" }).primaryKey(),
    backfillId: bigint("backfill_id", { mode: "bigint" })
      .notNull()
      .references(() => backfillProgress.id),
    did: text("did").notNull(),
    collection: text("collection").notNull(),
    errorMessage: text("error_message").notNull(),
    createdAt: timestamp("created_at", { withTimezone: true }).notNull(),
  },
  (table) => [index("backfill_errors_backfill_id_idx").on(table.backfillId)]
);
```

**Step 2: Build the DB package**

Run: `pnpm --filter @atbb/db build`
Expected: Clean build with no errors.

**Step 3: Generate the Drizzle migration**

Run: `cd apps/appview && pnpm db:generate`
Expected: New migration file in `apps/appview/drizzle/` with `CREATE TABLE backfill_progress` and `CREATE TABLE backfill_errors`.

**Step 4: Verify migration SQL**

Read the generated `.sql` file and confirm it creates both tables with correct columns, types, and the FK from `backfill_errors.backfill_id` to `backfill_progress.id`.

**Step 5: Commit**

```bash
git add packages/db/src/schema.ts apps/appview/drizzle/
git commit -m "feat(db): add backfill_progress and backfill_errors tables (ATB-13)"
```

---

## Task 2: Add Backfill Config Fields

Extend `AppConfig` with three new optional fields for backfill tuning, all with sensible defaults.

**Files:**
- Modify: `apps/appview/src/lib/config.ts`
- Modify: `apps/appview/src/lib/__tests__/config.test.ts`
- Modify: `turbo.json` (add env vars to test task)

**Step 1: Write the failing test**

Add to `apps/appview/src/lib/__tests__/config.test.ts`:

```typescript
describe("backfill config", () => {
  it("uses default backfill values when env vars not set", () => {
    const config = loadConfig();
    expect(config.backfillRateLimit).toBe(10);
    expect(config.backfillConcurrency).toBe(10);
    expect(config.backfillCursorMaxAgeHours).toBe(48);
  });

  it("reads backfill values from env vars", () => {
    process.env.BACKFILL_RATE_LIMIT = "5";
    process.env.BACKFILL_CONCURRENCY = "20";
    process.env.BACKFILL_CURSOR_MAX_AGE_HOURS = "24";

    const config = loadConfig();
    expect(config.backfillRateLimit).toBe(5);
    expect(config.backfillConcurrency).toBe(20);
    expect(config.backfillCursorMaxAgeHours).toBe(24);

    delete process.env.BACKFILL_RATE_LIMIT;
    delete process.env.BACKFILL_CONCURRENCY;
    delete process.env.BACKFILL_CURSOR_MAX_AGE_HOURS;
  });
});
```

**Step 2: Run test to verify it fails**

Run: `pnpm --filter @atbb/appview test src/lib/__tests__/config.test.ts`
Expected: FAIL — `config.backfillRateLimit` is `undefined`.

**Step 3: Implement config changes**

In `apps/appview/src/lib/config.ts`, add to `AppConfig` interface:

```typescript
// Backfill configuration
backfillRateLimit: number;
backfillConcurrency: number;
backfillCursorMaxAgeHours: number;
```

In `loadConfig()`, add to the config object:

```typescript
// Backfill configuration
backfillRateLimit: parseInt(process.env.BACKFILL_RATE_LIMIT ?? "10", 10),
backfillConcurrency: parseInt(process.env.BACKFILL_CONCURRENCY ?? "10", 10),
backfillCursorMaxAgeHours: parseInt(process.env.BACKFILL_CURSOR_MAX_AGE_HOURS ?? "48", 10),
```

**Step 4: Add env vars to turbo.json**

In `turbo.json`, update the `test` task env array:

```json
"env": ["DATABASE_URL", "BACKFILL_RATE_LIMIT", "BACKFILL_CONCURRENCY", "BACKFILL_CURSOR_MAX_AGE_HOURS"]
```

**Step 5: Run test to verify it passes**

Run: `pnpm --filter @atbb/appview test src/lib/__tests__/config.test.ts`
Expected: PASS

**Step 6: Commit**

```bash
git add apps/appview/src/lib/config.ts apps/appview/src/lib/__tests__/config.test.ts turbo.json
git commit -m "feat(appview): add backfill configuration fields (ATB-13)"
```

---

## Task 3: Add `getCursorAgeHours()` to CursorManager

Add a method that calculates cursor age in hours, used by gap detection to decide if backfill is needed.

**Files:**
- Modify: `apps/appview/src/lib/cursor-manager.ts`
- Modify: `apps/appview/src/lib/__tests__/cursor-manager.test.ts`

**Step 1: Write the failing tests**

Add to `apps/appview/src/lib/__tests__/cursor-manager.test.ts`:

```typescript
describe("getCursorAgeHours", () => {
  it("returns null when cursor is null", () => {
    const age = cursorManager.getCursorAgeHours(null);
    expect(age).toBeNull();
  });

  it("calculates age in hours from microsecond cursor", () => {
    // Cursor from 24 hours ago
    const twentyFourHoursAgoUs = BigInt((Date.now() - 24 * 60 * 60 * 1000) * 1000);
    const age = cursorManager.getCursorAgeHours(twentyFourHoursAgoUs);
    // Allow 1-hour tolerance for test execution time
    expect(age).toBeGreaterThanOrEqual(23);
    expect(age).toBeLessThanOrEqual(25);
  });

  it("returns near-zero for recent cursor", () => {
    const recentCursorUs = BigInt(Date.now() * 1000);
    const age = cursorManager.getCursorAgeHours(recentCursorUs);
    expect(age).toBeGreaterThanOrEqual(0);
    expect(age).toBeLessThan(1);
  });
});
```

**Step 2: Run test to verify it fails**

Run: `pnpm --filter @atbb/appview test src/lib/__tests__/cursor-manager.test.ts`
Expected: FAIL — `cursorManager.getCursorAgeHours is not a function`.

**Step 3: Implement getCursorAgeHours**

Add to `apps/appview/src/lib/cursor-manager.ts`:

```typescript
/**
 * Calculate cursor age in hours.
 * Cursor values are Jetstream timestamps in microseconds since epoch.
 *
 * @param cursor - Cursor value (microseconds), or null
 * @returns Age in hours, or null if cursor is null
 */
getCursorAgeHours(cursor: bigint | null): number | null {
  if (cursor === null) return null;
  const cursorMs = Number(cursor / 1000n);
  const ageMs = Date.now() - cursorMs;
  return ageMs / (1000 * 60 * 60);
}
```

**Step 4: Run test to verify it passes**

Run: `pnpm --filter @atbb/appview test src/lib/__tests__/cursor-manager.test.ts`
Expected: All tests PASS.

**Step 5: Commit**

```bash
git add apps/appview/src/lib/cursor-manager.ts apps/appview/src/lib/__tests__/cursor-manager.test.ts
git commit -m "feat(appview): add getCursorAgeHours to CursorManager (ATB-13)"
```

---

## Task 4: BackfillManager — Gap Detection

Create the `BackfillManager` class with gap detection logic (`checkIfNeeded`).

**Files:**
- Create: `apps/appview/src/lib/backfill-manager.ts`
- Create: `apps/appview/src/lib/__tests__/backfill-manager.test.ts`

**Step 1: Write the failing tests**

Create `apps/appview/src/lib/__tests__/backfill-manager.test.ts`:

```typescript
import { describe, it, expect, beforeEach, vi, afterEach } from "vitest";
import { BackfillManager, BackfillStatus } from "../backfill-manager.js";
import type { Database } from "@atbb/db";
import type { AppConfig } from "../config.js";

// Minimal mock config
function mockConfig(overrides: Partial<AppConfig> = {}): AppConfig {
  return {
    port: 3000,
    forumDid: "did:plc:testforum",
    pdsUrl: "https://pds.example.com",
    databaseUrl: "postgres://test",
    jetstreamUrl: "wss://jetstream.example.com",
    oauthPublicUrl: "https://example.com",
    sessionSecret: "a".repeat(32),
    sessionTtlDays: 7,
    backfillRateLimit: 10,
    backfillConcurrency: 10,
    backfillCursorMaxAgeHours: 48,
    ...overrides,
  } as AppConfig;
}

describe("BackfillManager", () => {
  let mockDb: Database;
  let manager: BackfillManager;

  beforeEach(() => {
    mockDb = {
      select: vi.fn().mockReturnValue({
        from: vi.fn().mockReturnValue({
          where: vi.fn().mockReturnValue({
            limit: vi.fn().mockResolvedValue([]),
          }),
        }),
      }),
    } as unknown as Database;

    manager = new BackfillManager(mockDb, mockConfig());
  });

  afterEach(() => {
    vi.clearAllMocks();
  });

  describe("checkIfNeeded", () => {
    it("returns FullSync when cursor is null (no cursor)", async () => {
      const status = await manager.checkIfNeeded(null);
      expect(status).toBe(BackfillStatus.FullSync);
    });

    it("returns FullSync when cursor exists but forums table is empty", async () => {
      // Forums query returns empty
      vi.spyOn(mockDb, "select").mockReturnValue({
        from: vi.fn().mockReturnValue({
          where: vi.fn().mockReturnValue({
            limit: vi.fn().mockResolvedValue([]),
          }),
        }),
      } as any);

      // Cursor from 1 hour ago (fresh)
      const cursor = BigInt((Date.now() - 1 * 60 * 60 * 1000) * 1000);
      const status = await manager.checkIfNeeded(cursor);
      expect(status).toBe(BackfillStatus.FullSync);
    });

    it("returns CatchUp when cursor age exceeds threshold", async () => {
      // Forums query returns a forum (DB not empty)
      vi.spyOn(mockDb, "select").mockReturnValue({
        from: vi.fn().mockReturnValue({
          where: vi.fn().mockReturnValue({
            limit: vi.fn().mockResolvedValue([{ id: 1n, rkey: "self" }]),
          }),
        }),
      } as any);

      // Cursor from 72 hours ago (stale)
      const cursor = BigInt((Date.now() - 72 * 60 * 60 * 1000) * 1000);
      const status = await manager.checkIfNeeded(cursor);
      expect(status).toBe(BackfillStatus.CatchUp);
    });

    it("returns NotNeeded when cursor is fresh and DB has data", async () => {
      // Forums query returns a forum
      vi.spyOn(mockDb, "select").mockReturnValue({
        from: vi.fn().mockReturnValue({
          where: vi.fn().mockReturnValue({
            limit: vi.fn().mockResolvedValue([{ id: 1n, rkey: "self" }]),
          }),
        }),
      } as any);

      // Cursor from 1 hour ago (fresh)
      const cursor = BigInt((Date.now() - 1 * 60 * 60 * 1000) * 1000);
      const status = await manager.checkIfNeeded(cursor);
      expect(status).toBe(BackfillStatus.NotNeeded);
    });
  });
});
```

**Step 2: Run test to verify it fails**

Run: `pnpm --filter @atbb/appview test src/lib/__tests__/backfill-manager.test.ts`
Expected: FAIL — module `../backfill-manager.js` not found.

**Step 3: Implement BackfillManager with gap detection**

Create `apps/appview/src/lib/backfill-manager.ts`:

```typescript
import type { Database } from "@atbb/db";
import { forums, backfillProgress, backfillErrors, users } from "@atbb/db";
import { eq, asc, gt } from "drizzle-orm";
import { CursorManager } from "./cursor-manager.js";
import type { Indexer } from "./indexer.js";
import type { AppConfig } from "./config.js";

export enum BackfillStatus {
  NotNeeded = "not_needed",
  CatchUp = "catch_up",
  FullSync = "full_sync",
}

export interface BackfillResult {
  backfillId: bigint;
  type: BackfillStatus;
  didsProcessed: number;
  recordsIndexed: number;
  errors: number;
  durationMs: number;
}

export interface SyncStats {
  recordsFound: number;
  recordsIndexed: number;
  errors: number;
}

export class BackfillManager {
  private cursorManager: CursorManager;
  private isRunning = false;

  constructor(
    private db: Database,
    private config: AppConfig,
  ) {
    this.cursorManager = new CursorManager(db);
  }

  /**
   * Determine if backfill is needed based on cursor state and DB contents.
   */
  async checkIfNeeded(cursor: bigint | null): Promise<BackfillStatus> {
    // No cursor at all → first startup or wiped cursor
    if (cursor === null) {
      console.log(JSON.stringify({
        event: "backfill.decision",
        status: BackfillStatus.FullSync,
        reason: "no_cursor",
        timestamp: new Date().toISOString(),
      }));
      return BackfillStatus.FullSync;
    }

    // Check if DB has forum data (consistency check)
    const [forum] = await this.db
      .select()
      .from(forums)
      .where(eq(forums.rkey, "self"))
      .limit(1);

    if (!forum) {
      console.log(JSON.stringify({
        event: "backfill.decision",
        status: BackfillStatus.FullSync,
        reason: "db_inconsistency",
        cursorTimestamp: cursor.toString(),
        timestamp: new Date().toISOString(),
      }));
      return BackfillStatus.FullSync;
    }

    // Check cursor age
    const ageHours = this.cursorManager.getCursorAgeHours(cursor);
    if (ageHours !== null && ageHours > this.config.backfillCursorMaxAgeHours) {
      console.log(JSON.stringify({
        event: "backfill.decision",
        status: BackfillStatus.CatchUp,
        reason: "cursor_too_old",
        cursorAgeHours: Math.round(ageHours),
        thresholdHours: this.config.backfillCursorMaxAgeHours,
        cursorTimestamp: cursor.toString(),
        timestamp: new Date().toISOString(),
      }));
      return BackfillStatus.CatchUp;
    }

    console.log(JSON.stringify({
      event: "backfill.decision",
      status: BackfillStatus.NotNeeded,
      reason: "cursor_fresh",
      cursorAgeHours: ageHours !== null ?
Math.round(ageHours) : null, 464 + timestamp: new Date().toISOString(), 465 + })); 466 + return BackfillStatus.NotNeeded; 467 + } 468 + 469 + /** 470 + * Check if a backfill is currently running. 471 + */ 472 + getIsRunning(): boolean { 473 + return this.isRunning; 474 + } 475 + } 476 + ``` 477 + 478 + **Step 4: Run test to verify it passes** 479 + 480 + Run: `pnpm --filter @atbb/appview test src/lib/__tests__/backfill-manager.test.ts` 481 + Expected: All 4 gap detection tests PASS. 482 + 483 + **Step 5: Commit** 484 + 485 + ```bash 486 + git add apps/appview/src/lib/backfill-manager.ts apps/appview/src/lib/__tests__/backfill-manager.test.ts 487 + git commit -m "feat(appview): add BackfillManager with gap detection (ATB-13)" 488 + ``` 489 + 490 + --- 491 + 492 + ## Task 5: BackfillManager — Event Adapter & syncRepoRecords 493 + 494 + Implement the core repo sync method that fetches records from a PDS and feeds them through the Indexer. 495 + 496 + **Files:** 497 + - Modify: `apps/appview/src/lib/backfill-manager.ts` 498 + - Modify: `apps/appview/src/lib/__tests__/backfill-manager.test.ts` 499 + 500 + **Step 1: Write the failing tests** 501 + 502 + Add to `backfill-manager.test.ts`: 503 + 504 + ```typescript 505 + import { Indexer } from "../indexer.js"; 506 + import { AtpAgent } from "@atproto/api"; 507 + 508 + // Mock AtpAgent 509 + vi.mock("@atproto/api", () => ({ 510 + AtpAgent: vi.fn().mockImplementation(() => ({ 511 + com: { 512 + atproto: { 513 + repo: { 514 + listRecords: vi.fn(), 515 + }, 516 + }, 517 + }, 518 + })), 519 + })); 520 + 521 + describe("syncRepoRecords", () => { 522 + let mockIndexer: Indexer; 523 + 524 + beforeEach(() => { 525 + mockIndexer = { 526 + handlePostCreate: vi.fn().mockResolvedValue(true), 527 + handleForumCreate: vi.fn().mockResolvedValue(true), 528 + } as unknown as Indexer; 529 + }); 530 + 531 + it("fetches records and calls indexer for each one", async () => { 532 + const mockAgent = new AtpAgent({ service: 
"https://pds.example.com" }); 533 + (mockAgent.com.atproto.repo.listRecords as any).mockResolvedValueOnce({ 534 + data: { 535 + records: [ 536 + { 537 + uri: "at://did:plc:user1/space.atbb.post/abc123", 538 + cid: "bafyabc", 539 + value: { $type: "space.atbb.post", text: "Hello", createdAt: "2026-01-01T00:00:00Z" }, 540 + }, 541 + { 542 + uri: "at://did:plc:user1/space.atbb.post/def456", 543 + cid: "bafydef", 544 + value: { $type: "space.atbb.post", text: "World", createdAt: "2026-01-01T01:00:00Z" }, 545 + }, 546 + ], 547 + cursor: undefined, // No more pages 548 + }, 549 + }); 550 + 551 + manager.setIndexer(mockIndexer); 552 + const stats = await manager.syncRepoRecords( 553 + "did:plc:user1", 554 + "space.atbb.post", 555 + mockAgent 556 + ); 557 + 558 + expect(stats.recordsFound).toBe(2); 559 + expect(stats.recordsIndexed).toBe(2); 560 + expect(stats.errors).toBe(0); 561 + expect(mockIndexer.handlePostCreate).toHaveBeenCalledTimes(2); 562 + expect(mockIndexer.handlePostCreate).toHaveBeenCalledWith( 563 + expect.objectContaining({ 564 + did: "did:plc:user1", 565 + commit: { rkey: "abc123", cid: "bafyabc" }, 566 + record: expect.objectContaining({ text: "Hello" }), 567 + }) 568 + ); 569 + }); 570 + 571 + it("paginates through multiple pages", async () => { 572 + const mockAgent = new AtpAgent({ service: "https://pds.example.com" }); 573 + (mockAgent.com.atproto.repo.listRecords as any) 574 + .mockResolvedValueOnce({ 575 + data: { 576 + records: [{ 577 + uri: "at://did:plc:user1/space.atbb.post/page1", 578 + cid: "bafyp1", 579 + value: { $type: "space.atbb.post", text: "Page 1", createdAt: "2026-01-01T00:00:00Z" }, 580 + }], 581 + cursor: "next_page", 582 + }, 583 + }) 584 + .mockResolvedValueOnce({ 585 + data: { 586 + records: [{ 587 + uri: "at://did:plc:user1/space.atbb.post/page2", 588 + cid: "bafyp2", 589 + value: { $type: "space.atbb.post", text: "Page 2", createdAt: "2026-01-02T00:00:00Z" }, 590 + }], 591 + cursor: undefined, 592 + }, 593 + }); 594 + 595 + 
manager.setIndexer(mockIndexer); 596 + const stats = await manager.syncRepoRecords( 597 + "did:plc:user1", 598 + "space.atbb.post", 599 + mockAgent 600 + ); 601 + 602 + expect(stats.recordsFound).toBe(2); 603 + expect(stats.recordsIndexed).toBe(2); 604 + expect(mockAgent.com.atproto.repo.listRecords).toHaveBeenCalledTimes(2); 605 + }); 606 + 607 + it("continues on indexer errors and tracks error count", async () => { 608 + const mockAgent = new AtpAgent({ service: "https://pds.example.com" }); 609 + (mockAgent.com.atproto.repo.listRecords as any).mockResolvedValueOnce({ 610 + data: { 611 + records: [ 612 + { 613 + uri: "at://did:plc:user1/space.atbb.post/good", 614 + cid: "bafygood", 615 + value: { $type: "space.atbb.post", text: "Good", createdAt: "2026-01-01T00:00:00Z" }, 616 + }, 617 + { 618 + uri: "at://did:plc:user1/space.atbb.post/bad", 619 + cid: "bafybad", 620 + value: { $type: "space.atbb.post", text: "Bad", createdAt: "2026-01-01T01:00:00Z" }, 621 + }, 622 + ], 623 + cursor: undefined, 624 + }, 625 + }); 626 + 627 + (mockIndexer.handlePostCreate as any) 628 + .mockResolvedValueOnce(true) 629 + .mockRejectedValueOnce(new Error("FK missing")); 630 + 631 + const consoleSpy = vi.spyOn(console, "error").mockImplementation(() => {}); 632 + manager.setIndexer(mockIndexer); 633 + const stats = await manager.syncRepoRecords( 634 + "did:plc:user1", 635 + "space.atbb.post", 636 + mockAgent 637 + ); 638 + 639 + expect(stats.recordsFound).toBe(2); 640 + expect(stats.recordsIndexed).toBe(1); 641 + expect(stats.errors).toBe(1); 642 + consoleSpy.mockRestore(); 643 + }); 644 + 645 + it("handles PDS connection failure gracefully", async () => { 646 + const mockAgent = new AtpAgent({ service: "https://pds.example.com" }); 647 + (mockAgent.com.atproto.repo.listRecords as any) 648 + .mockRejectedValueOnce(new Error("fetch failed")); 649 + 650 + const consoleSpy = vi.spyOn(console, "error").mockImplementation(() => {}); 651 + manager.setIndexer(mockIndexer); 652 + const stats 
= await manager.syncRepoRecords( 653 + "did:plc:user1", 654 + "space.atbb.post", 655 + mockAgent 656 + ); 657 + 658 + expect(stats.recordsFound).toBe(0); 659 + expect(stats.recordsIndexed).toBe(0); 660 + expect(stats.errors).toBe(1); 661 + consoleSpy.mockRestore(); 662 + }); 663 + }); 664 + ``` 665 + 666 + **Step 2: Run test to verify it fails** 667 + 668 + Run: `pnpm --filter @atbb/appview test src/lib/__tests__/backfill-manager.test.ts` 669 + Expected: FAIL — `manager.setIndexer is not a function` / `manager.syncRepoRecords is not a function`. 670 + 671 + **Step 3: Implement syncRepoRecords and the event adapter** 672 + 673 + Add to `apps/appview/src/lib/backfill-manager.ts`: 674 + 675 + 1. Add import for `AtpAgent`: 676 + ```typescript 677 + import { AtpAgent } from "@atproto/api"; 678 + ``` 679 + 680 + 2. Add the collection-to-handler mapping constant outside the class: 681 + ```typescript 682 + /** 683 + * Maps AT Proto collection NSIDs to Indexer handler method names. 684 + * Order matters: sync forum-owned records first (FK dependencies). 685 + */ 686 + const FORUM_OWNED_COLLECTIONS = [ 687 + "space.atbb.forum.forum", 688 + "space.atbb.forum.category", 689 + "space.atbb.forum.board", 690 + "space.atbb.forum.role", 691 + "space.atbb.modAction", 692 + ] as const; 693 + 694 + const USER_OWNED_COLLECTIONS = [ 695 + "space.atbb.membership", 696 + "space.atbb.post", 697 + ] as const; 698 + 699 + const COLLECTION_HANDLER_MAP: Record<string, string> = { 700 + "space.atbb.post": "handlePostCreate", 701 + "space.atbb.forum.forum": "handleForumCreate", 702 + "space.atbb.forum.category": "handleCategoryCreate", 703 + "space.atbb.forum.board": "handleBoardCreate", 704 + "space.atbb.forum.role": "handleRoleCreate", 705 + "space.atbb.membership": "handleMembershipCreate", 706 + "space.atbb.modAction": "handleModActionCreate", 707 + }; 708 + ``` 709 + 710 + 3. 
Add `setIndexer()` and `syncRepoRecords()` methods to the class: 711 + ```typescript 712 + private indexer: Indexer | null = null; 713 + 714 + /** 715 + * Inject the Indexer instance. Called during AppContext wiring. 716 + */ 717 + setIndexer(indexer: Indexer): void { 718 + this.indexer = indexer; 719 + } 720 + 721 + /** 722 + * Sync all records from a single (DID, collection) pair via listRecords. 723 + * Feeds each record through the matching Indexer handler. 724 + */ 725 + async syncRepoRecords( 726 + did: string, 727 + collection: string, 728 + agent: AtpAgent 729 + ): Promise<SyncStats> { 730 + const stats: SyncStats = { recordsFound: 0, recordsIndexed: 0, errors: 0 }; 731 + const handlerName = COLLECTION_HANDLER_MAP[collection]; 732 + 733 + if (!handlerName || !this.indexer) { 734 + stats.errors = 1; 735 + return stats; 736 + } 737 + 738 + const handler = (this.indexer as any)[handlerName].bind(this.indexer); 739 + const delayMs = 1000 / this.config.backfillRateLimit; 740 + let cursor: string | undefined; 741 + 742 + try { 743 + do { 744 + const response = await agent.com.atproto.repo.listRecords({ 745 + repo: did, 746 + collection, 747 + limit: 100, 748 + cursor, 749 + }); 750 + 751 + const records = response.data.records; 752 + stats.recordsFound += records.length; 753 + 754 + for (const record of records) { 755 + try { 756 + const rkey = record.uri.split("/").pop()!; 757 + const event = { 758 + did, 759 + commit: { rkey, cid: record.cid }, 760 + record: record.value, 761 + }; 762 + await handler(event); 763 + stats.recordsIndexed++; 764 + } catch (error) { 765 + stats.errors++; 766 + console.error(JSON.stringify({ 767 + event: "backfill.record_error", 768 + did, 769 + collection, 770 + uri: record.uri, 771 + error: error instanceof Error ? 
error.message : String(error), 772 + timestamp: new Date().toISOString(), 773 + })); 774 + } 775 + } 776 + 777 + cursor = response.data.cursor; 778 + 779 + // Rate limiting: delay between page fetches 780 + if (cursor) { 781 + await new Promise((resolve) => setTimeout(resolve, delayMs)); 782 + } 783 + } while (cursor); 784 + } catch (error) { 785 + stats.errors++; 786 + console.error(JSON.stringify({ 787 + event: "backfill.pds_error", 788 + did, 789 + collection, 790 + error: error instanceof Error ? error.message : String(error), 791 + timestamp: new Date().toISOString(), 792 + })); 793 + } 794 + 795 + return stats; 796 + } 797 + ``` 798 + 799 + **Step 4: Run test to verify it passes** 800 + 801 + Run: `pnpm --filter @atbb/appview test src/lib/__tests__/backfill-manager.test.ts` 802 + Expected: All tests PASS. 803 + 804 + **Step 5: Commit** 805 + 806 + ```bash 807 + git add apps/appview/src/lib/backfill-manager.ts apps/appview/src/lib/__tests__/backfill-manager.test.ts 808 + git commit -m "feat(appview): add syncRepoRecords with event adapter (ATB-13)" 809 + ``` 810 + 811 + --- 812 + 813 + ## Task 6: BackfillManager — performBackfill Orchestration 814 + 815 + Implement the orchestration layer that coordinates syncing across multiple DIDs with DB-backed progress tracking. 
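The orchestration processes DIDs in fixed-size batches of `backfillConcurrency`, running each batch through `Promise.allSettled` so one failing DID cannot abort the run. A minimal standalone sketch of that pattern (the `processInBatches` helper is hypothetical, for illustration only):

```typescript
// Hypothetical helper illustrating the batch + allSettled pattern that
// performBackfill uses inline; not part of the implementation itself.
async function processInBatches<T>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<void>,
): Promise<{ ok: number; failed: number }> {
  let ok = 0;
  let failed = 0;
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    // allSettled (not all): a rejected worker is counted, never rethrown
    const results = await Promise.allSettled(batch.map(worker));
    for (const r of results) {
      if (r.status === "fulfilled") ok++;
      else failed++;
    }
  }
  return { ok, failed };
}

// 5 DIDs, concurrency 2, one simulated failure:
processInBatches([1, 2, 3, 4, 5], 2, async (n) => {
  if (n === 3) throw new Error("boom");
}).then((summary) => console.log(summary)); // { ok: 4, failed: 1 }
```

Note this caps concurrency per batch rather than keeping a rolling window of `backfillConcurrency` requests in flight; a worker pool would give steadier throughput, but the batch form matches the loop in Step 3 below and makes checkpointing after each batch straightforward.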
**Files:**
- Modify: `apps/appview/src/lib/backfill-manager.ts`
- Modify: `apps/appview/src/lib/__tests__/backfill-manager.test.ts`

**Step 1: Write the failing tests**

Add to `backfill-manager.test.ts`:

```typescript
describe("performBackfill", () => {
  let mockIndexer: Indexer;
  let consoleSpy: any;

  beforeEach(() => {
    consoleSpy = vi.spyOn(console, "log").mockImplementation(() => {});
    vi.spyOn(console, "error").mockImplementation(() => {});
    vi.spyOn(console, "warn").mockImplementation(() => {});

    mockIndexer = {
      handleForumCreate: vi.fn().mockResolvedValue(true),
      handleCategoryCreate: vi.fn().mockResolvedValue(true),
      handleBoardCreate: vi.fn().mockResolvedValue(true),
      handleRoleCreate: vi.fn().mockResolvedValue(true),
      handleMembershipCreate: vi.fn().mockResolvedValue(true),
      handlePostCreate: vi.fn().mockResolvedValue(true),
      handleModActionCreate: vi.fn().mockResolvedValue(true),
    } as unknown as Indexer;
  });

  afterEach(() => {
    consoleSpy.mockRestore();
  });

  it("creates a backfill_progress row on start", async () => {
    const mockInsert = vi.fn().mockReturnValue({
      values: vi.fn().mockReturnValue({
        returning: vi.fn().mockResolvedValue([{ id: 1n }]),
      }),
    });

    // Mock: no incomplete backfill, no users, PDS returns empty
    const mockSelectEmpty = vi.fn().mockReturnValue({
      from: vi.fn().mockReturnValue({
        where: vi.fn().mockReturnValue({
          limit: vi.fn().mockResolvedValue([]),
          orderBy: vi.fn().mockResolvedValue([]),
        }),
        orderBy: vi.fn().mockResolvedValue([]),
      }),
    });

    mockDb = {
      select: mockSelectEmpty,
      insert: mockInsert,
      update: vi.fn().mockReturnValue({
        set: vi.fn().mockReturnValue({
          where: vi.fn().mockResolvedValue(undefined),
        }),
      }),
    } as unknown as Database;

    manager = new BackfillManager(mockDb, mockConfig());
    manager.setIndexer(mockIndexer);

    // Mock the AtpAgent to return empty records
    vi.spyOn(manager as any, "createAgentForPds").mockReturnValue({
      com: {
        atproto: {
          repo: {
            listRecords: vi.fn().mockResolvedValue({
              data: { records: [], cursor: undefined },
            }),
          },
        },
      },
    });

    await manager.performBackfill(BackfillStatus.FullSync);

    expect(mockInsert).toHaveBeenCalled();
  });

  it("sets isRunning flag during backfill", async () => {
    // Setup minimal mocks
    const mockInsert = vi.fn().mockReturnValue({
      values: vi.fn().mockReturnValue({
        returning: vi.fn().mockResolvedValue([{ id: 1n }]),
      }),
    });

    mockDb = {
      select: vi.fn().mockReturnValue({
        from: vi.fn().mockReturnValue({
          where: vi.fn().mockReturnValue({
            limit: vi.fn().mockResolvedValue([]),
            orderBy: vi.fn().mockResolvedValue([]),
          }),
          orderBy: vi.fn().mockResolvedValue([]),
        }),
      }),
      insert: mockInsert,
      update: vi.fn().mockReturnValue({
        set: vi.fn().mockReturnValue({
          where: vi.fn().mockResolvedValue(undefined),
        }),
      }),
    } as unknown as Database;

    manager = new BackfillManager(mockDb, mockConfig());
    manager.setIndexer(mockIndexer);

    vi.spyOn(manager as any, "createAgentForPds").mockReturnValue({
      com: {
        atproto: {
          repo: {
            listRecords: vi.fn().mockResolvedValue({
              data: { records: [], cursor: undefined },
            }),
          },
        },
      },
    });

    expect(manager.getIsRunning()).toBe(false);
    const promise = manager.performBackfill(BackfillStatus.FullSync);
    expect(manager.getIsRunning()).toBe(true);
    await promise;
    expect(manager.getIsRunning()).toBe(false);
  });

  it("rejects concurrent backfill attempts", async () => {
    // Setup minimal mocks that create a slow backfill
    const mockInsert = vi.fn().mockReturnValue({
      values: vi.fn().mockReturnValue({
        returning: vi.fn().mockResolvedValue([{ id: 1n }]),
      }),
    });

    mockDb = {
      select: vi.fn().mockReturnValue({
        from: vi.fn().mockReturnValue({
          where: vi.fn().mockReturnValue({
            limit: vi.fn().mockResolvedValue([]),
            orderBy: vi.fn().mockResolvedValue([]),
          }),
          orderBy: vi.fn().mockResolvedValue([]),
        }),
      }),
      insert: mockInsert,
      update: vi.fn().mockReturnValue({
        set: vi.fn().mockReturnValue({
          where: vi.fn().mockResolvedValue(undefined),
        }),
      }),
    } as unknown as Database;

    manager = new BackfillManager(mockDb, mockConfig());
    manager.setIndexer(mockIndexer);

    vi.spyOn(manager as any, "createAgentForPds").mockReturnValue({
      com: {
        atproto: {
          repo: {
            listRecords: vi.fn().mockImplementation(
              () => new Promise((resolve) =>
                setTimeout(() => resolve({ data: { records: [], cursor: undefined } }), 100)
              )
            ),
          },
        },
      },
    });

    // Start first backfill (don't await)
    const first = manager.performBackfill(BackfillStatus.FullSync);

    // Try to start second — should reject
    await expect(manager.performBackfill(BackfillStatus.FullSync))
      .rejects.toThrow("Backfill is already in progress");

    await first;
  });
});
```

**Step 2: Run test to verify it fails**

Run: `pnpm --filter @atbb/appview test src/lib/__tests__/backfill-manager.test.ts`
Expected: FAIL — `manager.performBackfill is not a function`.

**Step 3: Implement performBackfill**

Add to `BackfillManager` class in `backfill-manager.ts`:

```typescript
/**
 * Create an AtpAgent pointed at the forum's PDS.
 * Extracted for test mocking.
 */
private createAgentForPds(): AtpAgent {
  return new AtpAgent({ service: this.config.pdsUrl });
}

/**
 * Execute a backfill operation.
 */
async performBackfill(type: BackfillStatus): Promise<BackfillResult> {
  if (this.isRunning) {
    throw new Error("Backfill is already in progress");
  }

  this.isRunning = true;
  const startTime = Date.now();
  let backfillId: bigint;
  let totalIndexed = 0;
  let totalErrors = 0;
  let didsProcessed = 0;

  try {
    // Create progress row
    const [row] = await this.db
      .insert(backfillProgress)
      .values({
        status: "in_progress",
        backfillType: type,
        startedAt: new Date(),
      })
      .returning({ id: backfillProgress.id });
    backfillId = row.id;

    const agent = this.createAgentForPds();

    // Phase 1: Sync forum-owned collections from Forum DID
    for (const collection of FORUM_OWNED_COLLECTIONS) {
      const stats = await this.syncRepoRecords(
        this.config.forumDid,
        collection,
        agent
      );
      totalIndexed += stats.recordsIndexed;
      totalErrors += stats.errors;
    }

    // Phase 2: For CatchUp, sync user-owned records from known DIDs
    if (type === BackfillStatus.CatchUp) {
      const knownUsers = await this.db
        .select({ did: users.did })
        .from(users)
        .orderBy(asc(users.did));

      const didsTotal = knownUsers.length;

      await this.db
        .update(backfillProgress)
        .set({ didsTotal })
        .where(eq(backfillProgress.id, backfillId));

      // Process in batches
      for (let i = 0; i < knownUsers.length; i += this.config.backfillConcurrency) {
        const batch = knownUsers.slice(i, i + this.config.backfillConcurrency);

        await Promise.allSettled(
          batch.map(async (user) => {
            for (const collection of USER_OWNED_COLLECTIONS) {
              const stats = await this.syncRepoRecords(user.did, collection, agent);
              totalIndexed += stats.recordsIndexed;
              if (stats.errors > 0) {
                totalErrors += stats.errors;
                await this.db.insert(backfillErrors).values({
                  backfillId,
                  did: user.did,
                  collection,
                  errorMessage: `${stats.errors} record(s) failed`,
                  createdAt: new Date(),
                });
              }
            }
          })
        );

        didsProcessed += batch.length;

        // Update progress
        await this.db
          .update(backfillProgress)
          .set({
            didsProcessed,
            recordsIndexed: totalIndexed,
            lastProcessedDid: batch[batch.length - 1].did,
          })
          .where(eq(backfillProgress.id, backfillId));

        console.log(JSON.stringify({
          event: "backfill.progress",
          backfillId: backfillId.toString(),
          type,
          didsProcessed,
          didsTotal,
          recordsIndexed: totalIndexed,
          elapsedMs: Date.now() - startTime,
          timestamp: new Date().toISOString(),
        }));
      }
    }

    // Mark completed
    await this.db
      .update(backfillProgress)
      .set({
        status: "completed",
        didsProcessed,
        recordsIndexed: totalIndexed,
        completedAt: new Date(),
      })
      .where(eq(backfillProgress.id, backfillId));

    const result: BackfillResult = {
      backfillId,
      type,
      didsProcessed,
      recordsIndexed: totalIndexed,
      errors: totalErrors,
      durationMs: Date.now() - startTime,
    };

    console.log(JSON.stringify({
      event: totalErrors > 0 ? "backfill.completed_with_errors" : "backfill.completed",
      ...result,
      backfillId: result.backfillId.toString(),
      timestamp: new Date().toISOString(),
    }));

    return result;
  } catch (error) {
    console.error(JSON.stringify({
      event: "backfill.failed",
      error: error instanceof Error ? error.message : String(error),
      timestamp: new Date().toISOString(),
    }));
    throw error;
  } finally {
    this.isRunning = false;
  }
}
```

**Step 4: Run test to verify it passes**

Run: `pnpm --filter @atbb/appview test src/lib/__tests__/backfill-manager.test.ts`
Expected: All tests PASS.

**Step 5: Commit**

```bash
git add apps/appview/src/lib/backfill-manager.ts apps/appview/src/lib/__tests__/backfill-manager.test.ts
git commit -m "feat(appview): add performBackfill orchestration with progress tracking (ATB-13)"
```

---

## Task 7: BackfillManager — Resume from Checkpoint

Implement detection and resume of interrupted backfills.
**Files:**
- Modify: `apps/appview/src/lib/backfill-manager.ts`
- Modify: `apps/appview/src/lib/__tests__/backfill-manager.test.ts`

**Step 1: Write the failing tests**

Add to `backfill-manager.test.ts`:

```typescript
describe("checkForInterruptedBackfill", () => {
  it("returns null when no interrupted backfill exists", async () => {
    vi.spyOn(mockDb, "select").mockReturnValue({
      from: vi.fn().mockReturnValue({
        where: vi.fn().mockReturnValue({
          limit: vi.fn().mockResolvedValue([]),
        }),
      }),
    } as any);

    const result = await manager.checkForInterruptedBackfill();
    expect(result).toBeNull();
  });

  it("returns interrupted backfill row when one exists", async () => {
    const interruptedRow = {
      id: 5n,
      status: "in_progress",
      backfillType: "catch_up",
      lastProcessedDid: "did:plc:halfway",
      didsTotal: 100,
      didsProcessed: 50,
    };

    vi.spyOn(mockDb, "select").mockReturnValue({
      from: vi.fn().mockReturnValue({
        where: vi.fn().mockReturnValue({
          limit: vi.fn().mockResolvedValue([interruptedRow]),
        }),
      }),
    } as any);

    const result = await manager.checkForInterruptedBackfill();
    expect(result).toEqual(interruptedRow);
  });
});
```

**Step 2: Run test to verify it fails**

Run: `pnpm --filter @atbb/appview test src/lib/__tests__/backfill-manager.test.ts`
Expected: FAIL — `manager.checkForInterruptedBackfill is not a function`.

**Step 3: Implement checkForInterruptedBackfill**

Add to `BackfillManager` class:

```typescript
/**
 * Check for an interrupted backfill from a previous run.
 * Returns the in-progress row if found, null otherwise.
 */
async checkForInterruptedBackfill() {
  const [row] = await this.db
    .select()
    .from(backfillProgress)
    .where(eq(backfillProgress.status, "in_progress"))
    .limit(1);

  return row ?? null;
}

/**
 * Resume an interrupted backfill from its last checkpoint.
 * Reuses performBackfill but skips already-processed DIDs.
 */
async resumeBackfill(interrupted: typeof backfillProgress.$inferSelect): Promise<BackfillResult> {
  if (this.isRunning) {
    throw new Error("Backfill is already in progress");
  }

  this.isRunning = true;
  const startTime = Date.now();
  let totalIndexed = interrupted.recordsIndexed;
  let totalErrors = 0;
  let didsProcessed = interrupted.didsProcessed;

  console.log(JSON.stringify({
    event: "backfill.resuming",
    backfillId: interrupted.id.toString(),
    lastProcessedDid: interrupted.lastProcessedDid,
    didsProcessed: interrupted.didsProcessed,
    didsTotal: interrupted.didsTotal,
    timestamp: new Date().toISOString(),
  }));

  try {
    const agent = this.createAgentForPds();

    if (interrupted.backfillType === BackfillStatus.CatchUp && interrupted.lastProcessedDid) {
      // Resume: fetch users after lastProcessedDid
      const remainingUsers = await this.db
        .select({ did: users.did })
        .from(users)
        .where(gt(users.did, interrupted.lastProcessedDid))
        .orderBy(asc(users.did));

      for (let i = 0; i < remainingUsers.length; i += this.config.backfillConcurrency) {
        const batch = remainingUsers.slice(i, i + this.config.backfillConcurrency);

        await Promise.allSettled(
          batch.map(async (user) => {
            for (const collection of USER_OWNED_COLLECTIONS) {
              const stats = await this.syncRepoRecords(user.did, collection, agent);
              totalIndexed += stats.recordsIndexed;
              if (stats.errors > 0) {
                totalErrors += stats.errors;
                await this.db.insert(backfillErrors).values({
                  backfillId: interrupted.id,
                  did: user.did,
                  collection,
                  errorMessage: `${stats.errors} record(s) failed`,
                  createdAt: new Date(),
                });
              }
            }
          })
        );

        didsProcessed += batch.length;

        await this.db
          .update(backfillProgress)
          .set({
            didsProcessed,
            recordsIndexed: totalIndexed,
            lastProcessedDid: batch[batch.length - 1].did,
          })
          .where(eq(backfillProgress.id, interrupted.id));
      }
    }

    // Mark completed
    await this.db
      .update(backfillProgress)
      .set({
        status: "completed",
        didsProcessed,
        recordsIndexed: totalIndexed,
        completedAt: new Date(),
      })
      .where(eq(backfillProgress.id, interrupted.id));

    return {
      backfillId: interrupted.id,
      type: interrupted.backfillType as BackfillStatus,
      didsProcessed,
      recordsIndexed: totalIndexed,
      errors: totalErrors,
      durationMs: Date.now() - startTime,
    };
  } finally {
    this.isRunning = false;
  }
}
```

**Step 4: Run test to verify it passes**

Run: `pnpm --filter @atbb/appview test src/lib/__tests__/backfill-manager.test.ts`
Expected: All tests PASS.

**Step 5: Commit**

```bash
git add apps/appview/src/lib/backfill-manager.ts apps/appview/src/lib/__tests__/backfill-manager.test.ts
git commit -m "feat(appview): add interrupted backfill resume (ATB-13)"
```

---

## Task 8: Firehose Integration

Modify `FirehoseService.start()` to check for and execute backfill before starting the Jetstream subscription.
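Condensed, the ordering this task enforces is: finish any interrupted backfill first; otherwise run gap detection and backfill if needed; only then subscribe. That branching can be sketched as a small decision function (hypothetical `decideStartup`; the real logic is inlined in `FirehoseService.start()` in Step 3):

```typescript
// Hypothetical condensation of the Task 8 startup ordering; names mirror the
// plan but this is a sketch, not code the tasks add.
type GapStatus = "not_needed" | "catch_up" | "full_sync";
type StartupAction =
  | { kind: "resume" }
  | { kind: "backfill"; type: "catch_up" | "full_sync" }
  | { kind: "subscribe_only" };

function decideStartup(hasInterrupted: boolean, gapStatus: GapStatus): StartupAction {
  // An interrupted run always wins: finish it and skip fresh gap detection
  if (hasInterrupted) return { kind: "resume" };
  if (gapStatus !== "not_needed") return { kind: "backfill", type: gapStatus };
  return { kind: "subscribe_only" };
}

console.log(decideStartup(true, "not_needed"));  // { kind: "resume" }
console.log(decideStartup(false, "full_sync"));  // { kind: "backfill", type: "full_sync" }
console.log(decideStartup(false, "not_needed")); // { kind: "subscribe_only" }
```

Keeping backfill strictly before the subscription means events are never double-indexed mid-backfill, at the cost of delaying live indexing until the sync finishes.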
**Files:**
- Modify: `apps/appview/src/lib/firehose.ts`
- Modify: `apps/appview/src/lib/__tests__/firehose.test.ts`

**Step 1: Write the failing tests**

Add to `apps/appview/src/lib/__tests__/firehose.test.ts`. The exact test additions depend on how the existing tests mock the Jetstream. Look at the existing test file for patterns and add:

```typescript
describe("backfill integration", () => {
  it("runs backfill before starting jetstream when checkIfNeeded returns CatchUp", async () => {
    // Mock backfillManager
    const mockBackfillManager = {
      checkForInterruptedBackfill: vi.fn().mockResolvedValue(null),
      checkIfNeeded: vi.fn().mockResolvedValue("catch_up"),
      performBackfill: vi.fn().mockResolvedValue({
        backfillId: 1n, type: "catch_up", didsProcessed: 10,
        recordsIndexed: 100, errors: 0, durationMs: 5000,
      }),
      getIsRunning: vi.fn().mockReturnValue(false),
    };

    firehoseService.setBackfillManager(mockBackfillManager as any);
    await firehoseService.start();

    expect(mockBackfillManager.checkForInterruptedBackfill).toHaveBeenCalled();
    expect(mockBackfillManager.performBackfill).toHaveBeenCalledWith("catch_up");
  });

  it("skips backfill when checkIfNeeded returns NotNeeded", async () => {
    const mockBackfillManager = {
      checkForInterruptedBackfill: vi.fn().mockResolvedValue(null),
      checkIfNeeded: vi.fn().mockResolvedValue("not_needed"),
      performBackfill: vi.fn(),
      getIsRunning: vi.fn().mockReturnValue(false),
    };

    firehoseService.setBackfillManager(mockBackfillManager as any);
    await firehoseService.start();

    expect(mockBackfillManager.performBackfill).not.toHaveBeenCalled();
  });

  it("resumes interrupted backfill before gap detection", async () => {
    const interruptedRow = {
      id: 5n, status: "in_progress", backfillType: "catch_up",
      lastProcessedDid: "did:plc:halfway", didsTotal: 100, didsProcessed: 50,
    };

    const mockBackfillManager = {
      checkForInterruptedBackfill: vi.fn().mockResolvedValue(interruptedRow),
      resumeBackfill: vi.fn().mockResolvedValue({
        backfillId: 5n, type: "catch_up", didsProcessed: 100,
        recordsIndexed: 500, errors: 0, durationMs: 3000,
      }),
      checkIfNeeded: vi.fn(),
      performBackfill: vi.fn(),
      getIsRunning: vi.fn().mockReturnValue(false),
    };

    firehoseService.setBackfillManager(mockBackfillManager as any);
    await firehoseService.start();

    expect(mockBackfillManager.resumeBackfill).toHaveBeenCalledWith(interruptedRow);
    // Should NOT run gap detection after resume
    expect(mockBackfillManager.checkIfNeeded).not.toHaveBeenCalled();
  });
});
```

**Step 2: Run test to verify it fails**

Run: `pnpm --filter @atbb/appview test src/lib/__tests__/firehose.test.ts`
Expected: FAIL — `firehoseService.setBackfillManager is not a function`.

**Step 3: Modify FirehoseService**

In `apps/appview/src/lib/firehose.ts`:

1. Add import:

   ```typescript
   import type { BackfillManager } from "./backfill-manager.js";
   import { BackfillStatus } from "./backfill-manager.js";
   ```

2. Add field and setter:

   ```typescript
   private backfillManager: BackfillManager | null = null;

   setBackfillManager(manager: BackfillManager): void {
     this.backfillManager = manager;
   }
   ```
3. Modify `start()` to add the backfill check at the top (after the `if (this.running)` guard, before cursor loading):

   ```typescript
   async start() {
     if (this.running) {
       console.warn("Firehose service is already running");
       return;
     }

     try {
       // Check for backfill before starting firehose
       if (this.backfillManager) {
         // First: check for interrupted backfill from previous run
         const interrupted = await this.backfillManager.checkForInterruptedBackfill();
         if (interrupted) {
           console.log("Resuming interrupted backfill before starting firehose");
           await this.backfillManager.resumeBackfill(interrupted);
           console.log("Interrupted backfill resumed successfully");
         } else {
           // Normal gap detection
           const savedCursorForCheck = await this.cursorManager.load();
           const backfillStatus = await this.backfillManager.checkIfNeeded(savedCursorForCheck);

           if (backfillStatus !== BackfillStatus.NotNeeded) {
             console.log(`Backfill required (${backfillStatus}) before starting firehose`);
             await this.backfillManager.performBackfill(backfillStatus);
             console.log("Backfill completed, starting firehose");
           }
         }
       }

       // Existing cursor resume logic...
       const savedCursor = await this.cursorManager.load();
       // ... rest of existing start() method
   ```

**Step 4: Run test to verify it passes**

Run: `pnpm --filter @atbb/appview test src/lib/__tests__/firehose.test.ts`
Expected: All tests PASS.

**Step 5: Run full test suite**

Run: `pnpm test`
Expected: All tests across all packages PASS.
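For reference, the gap-detection rules behind `checkIfNeeded` (per the design: null cursor or empty DB means full sync, a cursor older than the configured threshold means catch-up, otherwise nothing) can be sketched as a pure function. This is an illustrative standalone sketch, not the real `BackfillManager` implementation; `decideBackfill` and its parameters are made up for the example:

```typescript
// Status values mirror the strings the mocks above use.
enum BackfillStatus {
  NotNeeded = "not_needed",
  CatchUp = "catch_up",
  FullSync = "full_sync",
}

// Illustrative decision function. Jetstream cursors are microseconds
// since epoch, so divide by 1000 before comparing against Date.now().
function decideBackfill(
  cursorUs: number | null,
  dbIsEmpty: boolean,
  nowMs: number,
  maxAgeHours: number,
): BackfillStatus {
  if (cursorUs === null || dbIsEmpty) return BackfillStatus.FullSync;
  const ageHours = (nowMs - cursorUs / 1000) / 3_600_000;
  return ageHours > maxAgeHours ? BackfillStatus.CatchUp : BackfillStatus.NotNeeded;
}
```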
**Step 6: Commit**

```bash
git add apps/appview/src/lib/firehose.ts apps/appview/src/lib/__tests__/firehose.test.ts
git commit -m "feat(appview): integrate backfill check into FirehoseService.start() (ATB-13)"
```

---

## Task 9: Admin Backfill Endpoints

Add three admin endpoints for manual backfill management: trigger, poll status, and list errors.

**Files:**
- Modify: `apps/appview/src/routes/admin.ts`
- Create or modify: `apps/appview/src/routes/__tests__/admin-backfill.test.ts`

**Step 1: Write the failing tests**

Create `apps/appview/src/routes/__tests__/admin-backfill.test.ts`. Follow the existing admin route test patterns (look at existing admin tests for `createTestContext` helpers and auth patterns). Tests should cover:

```typescript
describe("POST /api/admin/backfill", () => {
  it("returns 409 when backfill is already running", async () => {
    // Mock backfillManager.getIsRunning() → true
    // Expect 409 Conflict
  });

  it("triggers backfill and returns backfill ID", async () => {
    // Mock backfillManager.getIsRunning() → false
    // Mock checkIfNeeded → CatchUp
    // Mock performBackfill → resolves
    // Expect 202 with { backfillId, type, status: "in_progress" }
  });

  it("allows force override when backfill not needed", async () => {
    // Request with ?force=catch_up
    // Mock checkIfNeeded → NotNeeded
    // Expect backfill triggered anyway with CatchUp type
  });

  it("returns 401 without auth", async () => {
    // No auth header
    // Expect 401
  });

  it("returns 403 without manageForum permission", async () => {
    // Auth but lacking permission
    // Expect 403
  });
});
describe("GET /api/admin/backfill/:id", () => {
  it("returns backfill progress", async () => {
    // Mock db select returns progress row
    // Expect 200 with progress data
  });

  it("returns 404 for unknown backfill ID", async () => {
    // Mock db select returns empty
    // Expect 404
  });
});

describe("GET /api/admin/backfill/:id/errors", () => {
  it("returns error list for backfill", async () => {
    // Mock db select returns error rows
    // Expect 200 with errors array
  });
});
```

**Step 2: Run tests to verify they fail**

Run: `pnpm --filter @atbb/appview test src/routes/__tests__/admin-backfill.test.ts`
Expected: FAIL — routes not defined.

**Step 3: Implement admin backfill endpoints**

Add to `apps/appview/src/routes/admin.ts`. Import `backfillProgress` and `backfillErrors` from `@atbb/db`, along with `BackfillStatus`, `CursorManager`, and the `eq`, `asc`, `desc`, and `sql` operators used below. Then add three routes:

**POST /backfill:**
```typescript
app.post(
  "/backfill",
  requireAuth(ctx),
  requirePermission(ctx, "space.atbb.permission.manageForum"),
  async (c) => {
    const backfillManager = ctx.backfillManager;
    if (!backfillManager) {
      return c.json({ error: "Backfill manager not available" }, 503);
    }

    if (backfillManager.getIsRunning()) {
      return c.json({ error: "A backfill is already in progress" }, 409);
    }

    // Determine backfill type
    const force = c.req.query("force");
    let type: BackfillStatus;

    if (force === "catch_up" || force === "full_sync") {
      type = force === "catch_up" ? BackfillStatus.CatchUp : BackfillStatus.FullSync;
    } else {
      const cursor = await new CursorManager(ctx.db).load();
      type = await backfillManager.checkIfNeeded(cursor);

      if (type === BackfillStatus.NotNeeded) {
        return c.json({
          message: "No backfill needed. Use ?force=catch_up or ?force=full_sync to override.",
        }, 200);
      }
    }

    // Start async (don't await the full run), but log failures
    const resultPromise = backfillManager.performBackfill(type);
    resultPromise.catch((err) => {
      console.error(JSON.stringify({
        event: "backfill.admin_trigger_failed",
        error: err instanceof Error ? err.message : String(err),
        timestamp: new Date().toISOString(),
      }));
    });

    // performBackfill creates its progress row immediately, so poll briefly
    // for the newest in-progress row and return its ID to the caller
    let backfillId: string | null = null;
    for (let attempt = 0; attempt < 10 && backfillId === null; attempt++) {
      const [row] = await ctx.db
        .select({ id: backfillProgress.id })
        .from(backfillProgress)
        .where(eq(backfillProgress.status, "in_progress"))
        .orderBy(desc(backfillProgress.id))
        .limit(1);
      if (row) {
        backfillId = row.id.toString();
      } else {
        await new Promise((resolve) => setTimeout(resolve, 50));
      }
    }

    return c.json({
      message: "Backfill started",
      backfillId,
      type,
      status: "in_progress",
    }, 202);
  }
);
```

**GET /backfill/:id:**
```typescript
app.get(
  "/backfill/:id",
  requireAuth(ctx),
  requirePermission(ctx, "space.atbb.permission.manageForum"),
  async (c) => {
    const id = c.req.param("id");
    const parsedId = parseInt(id, 10);
    if (isNaN(parsedId)) {
      return c.json({ error: "Invalid backfill ID" }, 400);
    }

    try {
      const [row] = await ctx.db
        .select()
        .from(backfillProgress)
        .where(eq(backfillProgress.id, BigInt(parsedId)))
        .limit(1);

      if (!row) {
        return c.json({ error: "Backfill not found" }, 404);
      }

      // Count errors
      const [errorCount] = await ctx.db
        .select({ count: sql<number>`count(*)` })
        .from(backfillErrors)
        .where(eq(backfillErrors.backfillId, row.id));

      return c.json({
        id: row.id.toString(),
        status: row.status,
        type: row.backfillType,
        didsTotal: row.didsTotal,
        didsProcessed: row.didsProcessed,
        recordsIndexed: row.recordsIndexed,
        errorCount: errorCount?.count ?? 0,
        startedAt: row.startedAt.toISOString(),
        completedAt: row.completedAt?.toISOString() ?? null,
        errorMessage: row.errorMessage,
      });
    } catch (error) {
      if (isProgrammingError(error)) throw error;
      console.error("Failed to fetch backfill progress", { error });
      return c.json({ error: "Failed to fetch backfill progress" }, 500);
    }
  }
);
```

**GET /backfill/:id/errors:**
```typescript
app.get(
  "/backfill/:id/errors",
  requireAuth(ctx),
  requirePermission(ctx, "space.atbb.permission.manageForum"),
  async (c) => {
    const id = c.req.param("id");
    const parsedId = parseInt(id, 10);
    if (isNaN(parsedId)) {
      return c.json({ error: "Invalid backfill ID" }, 400);
    }

    try {
      const errors = await ctx.db
        .select()
        .from(backfillErrors)
        .where(eq(backfillErrors.backfillId, BigInt(parsedId)))
        .orderBy(asc(backfillErrors.createdAt))
        .limit(1000);

      return c.json({
        errors: errors.map((e) => ({
          id: e.id.toString(),
          did: e.did,
          collection: e.collection,
          errorMessage: e.errorMessage,
          createdAt: e.createdAt.toISOString(),
        })),
      });
    } catch (error) {
      if (isProgrammingError(error)) throw error;
      console.error("Failed to fetch backfill errors", { error });
      return c.json({ error: "Failed to fetch backfill errors" }, 500);
    }
  }
);
```

**Step 4: Run test to verify it passes**

Run: `pnpm --filter @atbb/appview test src/routes/__tests__/admin-backfill.test.ts`
Expected: All tests PASS.
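The `?force=` handling in the POST route reduces to a small mapping: an explicit, valid force value wins, and anything else defers to gap detection. As a standalone sketch (the `resolveForcedType` name is made up for illustration):

```typescript
// The two backfill kinds the admin API accepts for ?force=.
type BackfillType = "catch_up" | "full_sync";

// Returns the forced type if the query value is valid, or null to
// signal that the caller should fall back to checkIfNeeded().
function resolveForcedType(force: string | undefined): BackfillType | null {
  if (force === "catch_up" || force === "full_sync") return force;
  return null;
}
```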
**Step 5: Commit**

```bash
git add apps/appview/src/routes/admin.ts apps/appview/src/routes/__tests__/admin-backfill.test.ts
git commit -m "feat(appview): add admin backfill endpoints (ATB-13)"
```

---

## Task 10: Wire BackfillManager into AppContext & Startup

Connect everything together: add `BackfillManager` to `AppContext`, inject the `Indexer`, and update startup.

**Files:**
- Modify: `apps/appview/src/lib/app-context.ts`
- Modify: `apps/appview/src/index.ts`
- Modify: `apps/appview/src/lib/__tests__/app-context.test.ts` (if it tests the interface)

**Step 1: Add backfillManager to AppContext interface**

In `apps/appview/src/lib/app-context.ts`:

```typescript
import { BackfillManager } from "./backfill-manager.js";

export interface AppContext {
  // ... existing fields ...
  backfillManager: BackfillManager;
}
```

**Step 2: Create BackfillManager in createAppContext**

In `createAppContext()`, after creating the `firehose`:

```typescript
const backfillManager = new BackfillManager(db, config);
```

And add it to the return object:

```typescript
return {
  // ... existing fields ...
  backfillManager,
};
```

**Step 3: Wire up in index.ts**

In `apps/appview/src/index.ts`, after `createAppContext` and before `firehose.start()`:

```typescript
// Wire backfill manager into firehose (Indexer is created inside FirehoseService)
ctx.firehose.setBackfillManager(ctx.backfillManager);
```

The `Indexer` is currently created inside `FirehoseService`.
To make it available to `BackfillManager`, either:

- Extract `Indexer` creation to `createAppContext` and inject it into both, or
- Have `FirehoseService` expose its `Indexer` via a getter, and wire it in `index.ts`:

```typescript
// After firehose creation, wire indexer to backfill manager
ctx.backfillManager.setIndexer(ctx.firehose.getIndexer());
```

Add a public getter to `FirehoseService`:

```typescript
getIndexer(): Indexer {
  return this.indexer;
}
```

**Step 4: Run full test suite**

Run: `pnpm test`
Expected: All tests PASS across all packages.

**Step 5: Run build**

Run: `pnpm build`
Expected: Clean build.

**Step 6: Commit**

```bash
git add apps/appview/src/lib/app-context.ts apps/appview/src/lib/firehose.ts apps/appview/src/index.ts
git commit -m "feat(appview): wire BackfillManager into AppContext and startup (ATB-13)"
```

---

## Task 11: Update Bruno Collection & Documentation

Add Bruno requests for the new admin backfill endpoints and update the project plan.

**Files:**
- Create: `bruno/AppView API/Admin/Trigger Backfill.bru`
- Create: `bruno/AppView API/Admin/Get Backfill Status.bru`
- Create: `bruno/AppView API/Admin/Get Backfill Errors.bru`
- Modify: `docs/atproto-forum-plan.md` — mark ATB-13 complete

**Step 1: Create Bruno files**

Follow the existing `.bru` file patterns in the `bruno/AppView API/` directory.

**Step 2: Update project plan**

In `docs/atproto-forum-plan.md`, mark ATB-13 as complete with a brief status note.
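For Step 1, a minimal `.bru` sketch of the trigger request is below. The `{{baseUrl}}` variable and the `auth: none` setting are assumptions; mirror whatever environment variables and auth blocks the existing collection actually uses:

```
meta {
  name: Trigger Backfill
  type: http
  seq: 1
}

post {
  url: {{baseUrl}}/api/admin/backfill?force=catch_up
  body: none
  auth: none
}
```

The status and errors requests follow the same shape with `get` blocks pointed at `/api/admin/backfill/:id` and `/api/admin/backfill/:id/errors`.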
**Step 3: Commit**

```bash
git add bruno/ docs/atproto-forum-plan.md
git commit -m "docs: add backfill Bruno collection and update plan (ATB-13)"
```

---

## Task 12: Final Verification

Run the full build + test + lint pipeline to confirm everything works.

**Step 1: Build**

Run: `pnpm build`
Expected: Clean build across all packages.

**Step 2: Test**

Run: `pnpm test`
Expected: All tests pass.

**Step 3: Lint**

Run: `pnpm turbo lint`
Expected: No type errors.

**Step 4: Review all changes**

Run: `git log --oneline main..HEAD`
Verify the commit history is clean and each commit is atomic.
---

**`packages/atproto/src/errors.ts`** (+2 −1)

```diff
     msg.includes("postgres") ||
     msg.includes("database") ||
     msg.includes("sql") ||
-    msg.includes("query")
+    // drizzle-orm wraps all failed queries as: "Failed query: <sql>\nparams: <params>"
+    msg.includes("failed query")
   );
 }
```
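The effect of this change can be shown in isolation. The helper below is a hypothetical standalone version of the check in `errors.ts` (the real function name and surroundings may differ): drizzle-orm wraps query failures as `"Failed query: <sql>\nparams: <params>"`, so matching `"failed query"` still catches those while no longer matching every unrelated message that merely contains the word "query".

```typescript
// Hypothetical standalone version of the errors.ts check: classify an
// error as database-related based on its message text.
function looksLikeDatabaseError(err: unknown): boolean {
  const msg = (err instanceof Error ? err.message : String(err)).toLowerCase();
  return (
    msg.includes("postgres") ||
    msg.includes("database") ||
    msg.includes("sql") ||
    // narrowed from "query": drizzle prefixes failures with "Failed query:"
    msg.includes("failed query")
  );
}
```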
**`packages/db/src/schema.ts`** (+32, all additions, appended after the `roles` table)

```typescript
// ── backfill_progress ───────────────────────────────────
// Tracks backfill job state for crash-resilient resume.
export const backfillProgress = pgTable("backfill_progress", {
  id: bigserial("id", { mode: "bigint" }).primaryKey(),
  status: text("status").notNull(), // 'in_progress', 'completed', 'failed'
  backfillType: text("backfill_type").notNull(), // 'full_sync', 'catch_up'
  lastProcessedDid: text("last_processed_did"),
  didsTotal: integer("dids_total").notNull().default(0),
  didsProcessed: integer("dids_processed").notNull().default(0),
  recordsIndexed: integer("records_indexed").notNull().default(0),
  startedAt: timestamp("started_at", { withTimezone: true }).notNull(),
  completedAt: timestamp("completed_at", { withTimezone: true }),
  errorMessage: text("error_message"),
});

// ── backfill_errors ─────────────────────────────────────
// Per-DID error log for failed backfill syncs.
export const backfillErrors = pgTable(
  "backfill_errors",
  {
    id: bigserial("id", { mode: "bigint" }).primaryKey(),
    backfillId: bigint("backfill_id", { mode: "bigint" })
      .notNull()
      .references(() => backfillProgress.id),
    did: text("did").notNull(),
    collection: text("collection").notNull(),
    errorMessage: text("error_message").notNull(),
    createdAt: timestamp("created_at", { withTimezone: true }).notNull(),
  },
  (table) => [index("backfill_errors_backfill_id_idx").on(table.backfillId)]
);
```
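The `last_processed_did` column is what makes resume possible: on restart, already-processed DIDs can be skipped, assuming the DID list is fetched in a stable order between runs. An illustrative helper (not the real `BackfillManager` code; the name is made up):

```typescript
// Given a stable, ordered DID list and the last_processed_did stored in
// backfill_progress, return only the DIDs still to be processed. If the
// stored DID is missing (e.g. the list changed between runs), fall back
// to reprocessing everything rather than silently skipping DIDs.
function remainingDids(orderedDids: string[], lastProcessedDid: string | null): string[] {
  if (lastProcessedDid === null) return orderedDids;
  const idx = orderedDids.indexOf(lastProcessedDid);
  return idx === -1 ? orderedDids : orderedDids.slice(idx + 1);
}
```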
**`turbo.json`** (+1 −1)

```diff
   "test": {
     "dependsOn": ["^build"],
-    "env": ["DATABASE_URL"]
+    "env": ["DATABASE_URL", "BACKFILL_RATE_LIMIT", "BACKFILL_CONCURRENCY", "BACKFILL_CURSOR_MAX_AGE_HOURS"]
   },
   "clean": {
     "cache": false
```