A tool for parsing traffic on the Jetstream and applying a moderation workstream using regexp-based rules

Added claude instructions

+410
+98
.claude/.agents/code-reviewer.md
+---
+name: code-reviewer-v1
+description: Call this agent to review staged and unstaged code in the repository. It evaluates code quality and security.
+tools: Bash, Glob, Grep, LS, Read, WebFetch, TodoWrite, WebSearch, BashOutput, KillBash, mcp__git-mcp-server__git_add, mcp__git-mcp-server__git_branch, mcp__git-mcp-server__git_checkout, mcp__git-mcp-server__git_cherry_pick, mcp__git-mcp-server__git_clean, mcp__git-mcp-server__git_clear_working_dir, mcp__git-mcp-server__git_clone, mcp__git-mcp-server__git_commit, mcp__git-mcp-server__git_diff, mcp__git-mcp-server__git_fetch, mcp__git-mcp-server__git_init, mcp__git-mcp-server__git_log, mcp__git-mcp-server__git_merge, mcp__git-mcp-server__git_pull, mcp__git-mcp-server__git_push, mcp__git-mcp-server__git_rebase, mcp__git-mcp-server__git_remote, mcp__git-mcp-server__git_reset, mcp__git-mcp-server__git_set_working_dir, mcp__git-mcp-server__git_show, mcp__git-mcp-server__git_stash, mcp__git-mcp-server__git_status, mcp__git-mcp-server__git_tag, mcp__git-mcp-server__git_worktree, mcp__git-mcp-server__git_wrapup_instructions
+color: green
+---
+**All imports in this document should be treated as if they were in the main prompt file.**
+
+You are a comprehensive code review agent examining a piece of code that has been created by the main agent that calls you. Your role is to provide thorough, constructive feedback that ensures code quality, maintainability, and alignment with established patterns and decisions, while also suggesting ways to improve both the code in question and our stored memory bank for future iterations.
+
+The agent that calls you may also provide you with a Task Master task definition. Your evaluation of the output should take into account this task definition and ensure that the provided solution meets our goals.
+
+## Review Methodology
+
+### Phase 1: Context Gathering
+1. Check the repository's Git status, both staged and unstaged
+2. Examine the full diff to understand what's changing
+3. Search the codebase for similar patterns or implementations that might be reusable
+
+### Phase 2: Comprehensive Review
+#### Code Quality & Patterns
+- **Compilation**: For all touched packages and apps, make sure the code compiles and all tests pass
+- **DRY Violations**: Search for similar code patterns elsewhere in the codebase
+- **Consistency**: Does this follow established patterns in the project?
+- **Abstraction Level**: Is this the right level of generalization?
+- **Naming**: Are names clear, consistent, and follow project conventions?
+
+#### Engineering Excellence
+- **Error Handling**: How are errors caught, logged, and recovered from?
+- **Edge Cases**: What happens with null/undefined/empty/malformed inputs?
+- **Performance**: Will this scale with realistic data volumes?
+  - Consider cases where work is done sequentially when a parallel approach would be better
+    - Example: the original implementation of Fastify health checks had try-catch blocks all in a row; a good suggestion would be to make these into functions called with `Promise.allSettled`
+- **Security**: Are there injection risks, exposed secrets, or auth bypasses?
+- **Testing**: Are critical paths tested? Are tests meaningful?
+  - Our system is entirely built around a dependency injector; we can create (and make DRY and reusable) stub implementations of our services in order to allow for more integrated tests. Recommend this proactively.
+
+#### Integration & Dependencies
+- **Codebase Fit**: Does this integrate well with existing modules?
+- **Dependencies**: Are we adding unnecessary dependencies when existing utilities could work?
+- **Side Effects**: What other parts of the system might this affect?
+
+### Phase 3: Knowledge Management Assessment
+
+Identify knowledge gaps and opportunities:
+
+#### Flag for Documentation
+- **New Techniques**: "This retry mechanism is well-implemented and reusable."
+- **Missing Decisions**: "Choosing WebSockets over SSE here seems like an architectural decision that should be recorded"
+- **Complex Logic**: "This order processing logic should be captured as a detail entry"
+- **Implementation doesn't match product concepts**:
+
+## Review Output Format
+
+Structure your review as:
+
+### Summary
+Brief overview of the changes and overall assessment
+
+### Critical Issues 🔴
+Must-fix problems (security, bugs, broken functionality)
+
+### Important Suggestions 🟡
+Should-fix issues (performance, maintainability, patterns)
+
+### Minor Improvements 🟢
+Nice-to-have enhancements (style, optimization, clarity)
+
+### Knowledge Management
+- **Alignment Check**: How this aligns with existing knowledge
+- **Documentation Opportunities**: What should be added to Basic Memory
+- **Updates Needed**: What existing entries need updating
+
+### Code Reuse Opportunities
+Specific suggestions for using existing code instead of reimplementing
+
+## Review Tone
+
+Be constructive and specific:
+- ✅ "Consider using the cursor pagination technique from `src/api/utils.ts:142` instead"
+- ❌ "This pagination is wrong"
+
+- ✅ "This deviates from our decision to use Zod for validation. If intentional, please update the decision entry"
+- ❌ "You should use Zod"
+
+- ✅ "Great implementation of circuit breaker! This is reusable - worth documenting"
+- ❌ "Good code"
+
+## Special Instructions
+
+1. **Search Extensively**: Use Grep and Glob liberally to find similar code patterns
+2. **Reference Specifically**: Include file paths and line numbers in feedback
+3. **Suggest Alternatives**: Don't just identify problems - propose solutions
+4. **Prioritize Feedback**: Focus on what matters most for safety and maintainability
+5. **Learn from History**: Check Basic Memory for past decisions and patterns
+6. **Think Long-term**: Consider how this code will age and be maintained
+
+Remember: Your goal is not just to find problems, but to help maintain a coherent, well-documented, and maintainable codebase that builds on established knowledge and patterns.
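The review prompt above recommends replacing a row of sequential try-catch blocks with functions called through `Promise.allSettled`. A minimal TypeScript sketch of that suggestion follows. The `HealthCheck` and `HealthReport` types and the `runHealthChecks` helper are hypothetical names chosen for illustration; they are not taken from this repository or from Fastify.

```typescript
// Hypothetical shapes for illustration only; not part of this repository.
type HealthCheck = { name: string; run: () => Promise<void> };
type HealthReport = { name: string; ok: boolean; error?: string };

// Start every check at once and wait for all of them to settle.
// Promise.allSettled never rejects, so one failing check cannot
// prevent the remaining checks from being reported.
async function runHealthChecks(checks: HealthCheck[]): Promise<HealthReport[]> {
  const settled = await Promise.allSettled(checks.map((check) => check.run()));

  // Results come back in the same order as the input promises.
  return settled.map((result, index): HealthReport => {
    const name = checks[index].name;
    if (result.status === "fulfilled") {
      return { name, ok: true };
    }
    const reason: unknown = result.reason;
    return {
      name,
      ok: false,
      error: reason instanceof Error ? reason.message : String(reason),
    };
  });
}
```

Compared with sequential try-catch blocks, the total wait is roughly that of the slowest check rather than the sum of all checks, and each failure is still attributed to the check that produced it.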
+174
.claude/mcp-descriptions/git-mcp.mdc
+---
+description:
+globs:
+alwaysApply: true
+---
+# LLM Agent Guidelines for `@cyanheads/git-mcp-server`
+
+This document provides a concise overview of the available Git tools, designed to be used as a quick-reference guide for an LLM coding assistant.
+
+### Guiding Principles for the LLM Agent
+
+* **Human-in-the-Loop**: Do not commit any changes without explicit permission from a human operator.
+* **Safety First**: Never use potentially destructive commands like `git_reset`, `git_clean`, or `git_push` with the `force` option enabled. These operations can lead to permanent data loss.
+* **Session Context is Key**: Always start your workflow by setting a working directory with `git_set_working_dir`. Subsequent commands can then use `.` as the path, which is more efficient. Use `git_clear_working_dir` when a session is complete.
+* **Conventional Commits**: When using `git_commit`, write clear, concise messages following the Conventional Commits format: `type(scope): subject`. The tool's description provides detailed guidance.
+* **Review Before Committing**: Before committing, always use `git_status` and `git_diff` to review the changes. This ensures you create logical, atomic commits.
+
+---
+
+## Commonly Used Tools
+
+These are the essential tools for day-to-day version control tasks.
+
+### `git_set_working_dir`
+
+* **Description**: Sets the default working directory for the current session. Subsequent Git tool calls can use `.` for the `path`, which will resolve to this directory. **This should be the first tool you use in any workflow.**
+* **When to Use**: At the beginning of any task that involves a Git repository to establish context for all subsequent commands.
+* **Input Parameters**:
+
+| Parameter | Type | Description |
+| :--- | :--- | :--- |
+| `path` | string | The **absolute path** to set as the default working directory. |
+| `validateGitRepo` | boolean | Validate that the path is a Git repository. Defaults to `true`. |
+| `initializeIfNotPresent` | boolean | If not a Git repository, initialize it with 'git init'. Defaults to `false`. |
+
+### `git_status`
+
+* **Description**: Retrieves the status of the repository, showing staged, unstaged, and untracked files.
+* **When to Use**: Use this frequently to check the state of the repository before staging changes, after pulling from a remote, or before committing.
+* **Input Parameters**:
+
+| Parameter | Type | Description |
+| :--- | :--- | :--- |
+| `path` | string | Path to the Git repository. Defaults to `.` (the session's working directory). |
+
+### `git_add`
+
+* **Description**: Stages changes, adding them to the index before committing.
+* **When to Use**: After making changes to files and before you are ready to commit them.
+* **Input Parameters**:
+
+| Parameter | Type | Description |
+| :--- | :--- | :--- |
+| `path` | string | Path to the Git repository. Defaults to the directory set via `git_set_working_dir`. |
+| `files` | string \| string\[] | Files or patterns to stage. Defaults to all changes (`.`). |
+
+### `git_commit`
+
+* **Description**: Commits staged changes to the repository with a descriptive message.
+* **When to Use**: After staging a logical group of changes with `git_add` and receiving approval from the operator to commit.
+* **Input Parameters**:
+
+| Parameter | Type | Description |
+| :--- | :--- | :--- |
+| `path` | string | Path to the Git repository. |
+| `message` | string | The commit message. |
+| `author` | object | Override the commit author (`{ name: string, email: string }`). |
+| `filesToStage` | string\[] | An array of file paths to stage before committing. |
+
+### `git_log`
+
+* **Description**: Shows the commit history. Can be filtered by author, date, or branch.
+* **When to Use**: To review recent changes, find a specific commit, or understand the history of a file or branch.
+* **Input Parameters**:
+
+| Parameter | Type | Description |
+| :--- | :--- | :--- |
+| `path` | string | Path to the Git repository. |
+| `maxCount` | number | Limit the number of commits to output. |
+| `author` | string | Filter commits by a specific author. |
+| `since` | string | Show commits more recent than a specific date. |
+| `until` | string | Show commits older than a specific date. |
+| `branchOrFile` | string | Show logs for a specific branch, tag, or file path. |
+| `showSignature` | boolean | Show signature verification status for commits. |
+
+### `git_diff`
+
+* **Description**: Shows changes between commits, the working tree, etc.
+* **When to Use**: To review unstaged changes before adding, or to see the difference between two branches or commits.
+* **Input Parameters**:
+
+| Parameter | Type | Description |
+| :--- | :--- | :--- |
+| `path` | string | Path to the Git repository. |
+| `commit1` | string | First commit, branch, or ref for comparison. |
+| `commit2` | string | Second commit, branch, or ref for comparison. |
+| `staged` | boolean | Show diff of staged changes. |
+| `file` | string | Limit the diff to a specific file. |
+| `includeUntracked` | boolean | Include untracked files in the diff output. |
+
+### `git_branch`
+
+* **Description**: Manages branches: list, create, delete, and rename.
+* **When to Use**: To see what branches are available, create a new branch for a feature or bugfix, or clean up old branches. DO NOT do this without human operator confirmation.
+* **Input Parameters**:
+
+| Parameter | Type | Description |
+| :--- | :--- | :--- |
+| `path` | string | Path to the Git repository. |
+| `mode` | enum | The operation: `list`, `create`, `delete`, `rename`, `show-current`. |
+| `branchName` | string | Name of the branch for create/delete/rename operations. |
+| `newBranchName` | string | The new name for the branch when renaming. |
+| `startPoint` | string | The starting point for a new branch. |
+| `force` | boolean | Force the operation (e.g., deleting an unmerged branch). |
+| `all` | boolean | List all branches (local and remote). |
+| `remote` | boolean | Act on remote-tracking branches. |
+
+### `git_checkout`
+
+* **Description**: Switches branches or restores working tree files.
+* **When to Use**: To start working on a different branch or to discard changes in a specific file. DO NOT do this without human operator confirmation.
+* **Input Parameters**:
+
+| Parameter | Type | Description |
+| :--- | :--- | :--- |
+| `path` | string | Path to the Git repository. |
+| `branchOrPath` | string | The branch, commit, tag, or file path to checkout. |
+| `newBranch` | string | Create a new branch before checking out. |
+| `force` | boolean | Force checkout, discarding local changes. |
+
+### `git_pull`
+
+* **Description**: Fetches from and integrates with a remote repository or a local branch.
+* **When to Use**: To update your current local branch with changes from its remote counterpart. DO NOT do this without human operator confirmation.
+* **Input Parameters**:
+
+| Parameter | Type | Description |
+| :--- | :--- | :--- |
+| `path` | string | Path to the Git repository. |
+| `remote` | string | The remote repository to pull from (e.g., 'origin'). |
+| `branch` | string | The remote branch to pull. |
+| `rebase` | boolean | Use 'git pull --rebase' instead of merge. |
+| `ffOnly` | boolean | Only allow fast-forward merges. |
+
+### `git_push`
+
+* **Description**: Updates remote refs with local changes.
+* **When to Use**: After committing your changes locally, use this to share them on the remote repository. DO NOT do this without human operator confirmation.
+* **Input Parameters**:
+
+| Parameter | Type | Description |
+| :--- | :--- | :--- |
+| `path` | string | Path to the Git repository. |
+| `remote` | string | The remote repository to push to. |
+| `branch` | string | The local branch to push. |
+| `remoteBranch` | string | The remote branch to push to. |
+| `force` | boolean | Force the push (use with caution). |
+| `forceWithLease` | boolean | Force push only if remote ref is as expected. |
+| `setUpstream` | boolean | Set the upstream tracking configuration. |
+| `tags` | boolean | Push all tags. |
+| `delete` | boolean | Delete the remote branch. |
+
+---
+
+## Complex Situations
+
+If you encounter a situation where you believe a more advanced or potentially destructive tool is needed (such as `git rebase`, `git reset`, `git cherry-pick`, or `git clean`), **do not proceed automatically**.
+
+Instead, you should:
+
+1. **Pause execution.**
+2. **Explain the situation** to the human operator.
+3. **State which advanced Git operation you think is necessary and why.**
+4. **Await explicit instruction** from the operator before taking any further action.
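The guidelines above ask for commit messages in the Conventional Commits form `type(scope): subject`. As a rough illustration, the sketch below tests a message header against that shape before it would be handed to `git_commit`. The list of allowed types and the `isConventionalHeader` helper are assumptions made for this example; the MCP server itself may apply different validation or none at all.

```typescript
// Assumed allow-list of commit types; adjust to the project's own conventions.
const COMMIT_TYPES = [
  "feat", "fix", "docs", "style", "refactor",
  "perf", "test", "build", "ci", "chore", "revert",
];

// type, optional "(scope)", optional "!" for breaking changes, then ": subject".
const HEADER_PATTERN = new RegExp(
  `^(${COMMIT_TYPES.join("|")})(\\([^()\\s]+\\))?(!)?: \\S.*$`,
);

// Hypothetical helper: only the first line (the header) is checked.
function isConventionalHeader(message: string): boolean {
  const newline = message.indexOf("\n");
  const header = newline === -1 ? message : message.slice(0, newline);
  return HEADER_PATTERN.test(header);
}
```

Under these assumptions, `feat(checks): add starter pack rule` passes, while a free-form message such as `Added claude instructions` does not.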
+21
.claude/settings.local.json
+{
+  "permissions": {
+    "allow": [
+      "mcp__git-mcp-server__git_status",
+      "mcp__git-mcp-server__git_diff",
+      "mcp__git-mcp-server__git_set_working_dir",
+      "mcp__git-mcp-server__git_branch",
+      "mcp__git-mcp-server__git_checkout",
+      "Bash(bun run lint:*)",
+      "mcp__git-mcp-server__git_commit",
+      "mcp__git-mcp-server__git_push"
+    ],
+    "ask": [
+      "curl"
+    ]
+  },
+  "enableAllProjectMcpServers": true,
+  "enabledMcpjsonServers": [
+    "git-mcp-server"
+  ]
+}
+13
.mcp.json
+{
+  "mcpServers": {
+    "git-mcp-server": {
+      "type": "stdio",
+      "command": "npx",
+      "args": ["@cyanheads/git-mcp-server"],
+      "env": {
+        "MCP_LOG_LEVEL": "info",
+        "GIT_SIGN_COMMITS": "false"
+      }
+    }
+  }
+}
+104
CLAUDE.md
+# Claude Code Instructions
+
+**All imports in this document should be treated as if they were in the main prompt file.**
+
+## MCP Orientation Instructions
+
+@.claude/mcp-descriptions/git-mcp.mdc
+
+NEVER USE A COMMAND-LINE TOOL WHEN AN MCP TOOL IS AVAILABLE. IF YOU THINK AN MCP TOOL IS MALFUNCTIONING AND CANNOT OTHERWISE CONTINUE, STOP AND ASK THE HUMAN OPERATOR FOR ASSISTANCE.
+
+## Development Commands
+
+### Running the Application
+
+- `bun run start` - Run the main application (production mode)
+- `bun run dev` - Run in development mode with file watching
+- `bun i` - Install dependencies
+
+### Code Quality
+
+- `bun run format` - Format code using Prettier
+- `bun run lint` - Run ESLint to check for issues
+- `bun run lint:fix` - Automatically fix ESLint issues where possible
+
+### Docker Deployment
+
+- `docker build --pull -t skywatch-tools .` - Build Docker image
+- `docker run -d -p 4101:4101 skywatch-tools` - Run container
+
+## Architecture Overview
+
+This is a TypeScript rewrite of a Bash-based Bluesky content moderation system for the skywatch.blue independent labeler. The application monitors the Bluesky firehose in real-time and automatically applies labels to content that meets specific moderation criteria.
+
+### Core Components
+
+- **`main.ts`** - Entry point that sets up Jetstream WebSocket connection to monitor Bluesky firehose events (posts, profiles, handles, starter packs)
+- **`agent.ts`** - Configures the AtpAgent for interacting with Ozone PDS for labeling operations
+- **`constants.ts`** - Contains all moderation check definitions (PROFILE_CHECKS, POST_CHECKS, HANDLE_CHECKS)
+- **`config.ts`** - Environment variable configuration and application settings
+- **Check modules** - Individual modules for different content types:
+  - `checkPosts.ts` - Analyzes post content and URLs
+  - `checkHandles.ts` - Validates user handles
+  - `checkProfiles.ts` - Examines profile descriptions and display names
+  - `checkStarterPack.ts` - Reviews starter pack content
+
+### Moderation Check System
+
+The system uses a `Checks` interface to define moderation rules with the following properties:
+
+- `label` - The label to apply when content matches
+- `check` - RegExp pattern to match against content
+- `whitelist` - Optional RegExp to exempt certain content
+- `ignoredDIDs` - Array of DIDs to skip for this check
+- `reportAcct/commentAcct/toLabel` - Actions to take when content matches
+
+### Environment Configuration
+
+The application requires several environment variables:
+
+- Bluesky credentials (`BSKY_HANDLE`, `BSKY_PASSWORD`)
+- Ozone server configuration (`OZONE_URL`, `OZONE_PDS`)
+- Optional: firehose URL, ports, rate limiting settings
+
+### Data Flow
+
+1. Jetstream receives events from Bluesky firehose
+2. Events are categorized by type (post, profile, handle, starter pack)
+3. Appropriate check functions validate content against defined patterns
+4. Matching content triggers labeling actions via Ozone PDS
+5. Cursor position is periodically saved for resumption after restart
+
+### Development Notes
+
+- Uses Bun as the runtime and package manager
+- Built with modern TypeScript and ESNext modules
+- Implements rate limiting and error handling for API calls
+- Supports both labeling and reporting workflows
+- Includes metrics server on port 4101 for monitoring
+
+See `src/developing_checks.md` for detailed instructions on creating new moderation checks.
+
+## TODO
+
+The code-reviewer has completed a comprehensive review of the codebase and identified several critical issues that need immediate attention:
+
+Immediate Blocking Issues
+
+- Missing constants.ts file (only example exists)
+- Inadequate error handling for async operations
+
+High Priority Security & Reliability Concerns
+
+- Hardcoded DIDs should be moved to environment variables
+- Missing structured error handling and logging
+- No environment variable validation at startup
+
+Medium Priority Code Quality Issues
+
+- Duplicate profile checking logic needs refactoring
+- ESLint configuration needs TypeScript updates
+- Missing comprehensive test suite
+
+The reviewer noted that while the modular architecture is well-designed, there are critical execution flaws that must be addressed before this
+can be safely deployed to production.
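The "Moderation Check System" section of CLAUDE.md lists the properties of the `Checks` interface without showing how a rule is evaluated. The sketch below is one possible reading of that description, not the project's actual code. The property names come from CLAUDE.md; the types (in particular treating `reportAcct`, `commentAcct`, and `toLabel` as boolean flags and making `ignoredDIDs` optional), the `findMatches` helper, and the example rule are all assumptions.

```typescript
// Property names follow CLAUDE.md; the types are assumptions for this sketch.
interface Checks {
  label: string;          // label to apply when content matches
  check: RegExp;          // pattern matched against the content
  whitelist?: RegExp;     // optional pattern that exempts content
  ignoredDIDs?: string[]; // DIDs skipped for this check
  reportAcct: boolean;    // assumed flag: report the account
  commentAcct: boolean;   // assumed flag: comment on the account
  toLabel: boolean;       // assumed flag: apply the label
}

// Hypothetical helper: return every rule that applies to `text` written by `did`.
// Patterns should not use the `g` flag, because RegExp.test is stateful with it.
function findMatches(did: string, text: string, checks: Checks[]): Checks[] {
  return checks.filter((rule) => {
    if (rule.ignoredDIDs?.includes(did)) return false; // author is exempt
    if (!rule.check.test(text)) return false;          // pattern does not match
    if (rule.whitelist?.test(text)) return false;      // content is exempted
    return true;
  });
}

// Made-up rule for demonstration; not a real skywatch.blue check.
const EXAMPLE_CHECKS: Checks[] = [
  {
    label: "example-spam",
    check: /free\s+crypto/i,
    whitelist: /crypto\s+scam\s+warning/i,
    ignoredDIDs: ["did:plc:trusted-example"],
    reportAcct: false,
    commentAcct: false,
    toLabel: true,
  },
];
```

Each matching rule could then be passed to whatever labeling or reporting step the flags select; that step depends on the Ozone client setup and is left out here.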