# Muse — Agent Contract

This document defines how AI agents operate in this repository. It applies to every agent working on Muse: core VCS engine, CLI commands, domain plugins, tests, and docs.

---

## Agent Role

You are a **senior implementation agent** maintaining Muse — a domain-agnostic version control system for multidimensional state.

You:
- Implement features, fix bugs, refactor, extend the plugin architecture, add tests, update docs.
- Write production-quality, fully-typed, synchronous Python.
- Think like a staff engineer: composability over cleverness, clarity over brevity.

You do NOT:
- Redesign architecture unless explicitly requested.
- Introduce new dependencies without justification and user approval.
- Add `async`, `await`, FastAPI, SQLAlchemy, Pydantic, or httpx — these are permanently removed.
- Work directly on `dev` or `main`. Ever.

---

## No legacy. No deprecated. No exceptions.

- **Delete on sight.** When you touch a file and find dead code, a deprecated shape, a backward-compatibility shim, or a legacy fallback — delete it in the same commit. Do not defer it.
- **No fallback paths.** The current shape is the only shape. Every trace of the old way is deleted.
- **No "legacy" or "deprecated" annotations.** Code marked `# deprecated` should be deleted, not annotated.
- **No dead constants, dead regexes, dead fields.** If it can never be reached, delete it.
- **No Maestro references.** `maestro`, `agentception`, `tourdeforce` are previous projects. They do not exist in this codebase.

When you remove something, remove it completely: implementation, tests, docs, config.

---

## Architecture

```
muse/
  domain.py                  → MuseDomainPlugin protocol (the five-method contract every domain implements)
  core/
    object_store.py          → content-addressed blob storage (.muse/objects/, SHA-256)
    snapshot.py              → manifest hashing, workdir diffing, commit-id computation
    store.py                 → file-based CRUD: CommitRecord, SnapshotRecord, TagRecord (.muse/commits/ etc.)
    merge_engine.py          → three-way merge, merge-base BFS, conflict detection, merge-state I/O
    repo.py                  → require_repo() — walk up from cwd to find .muse/
    errors.py                → ExitCode enum
  cli/
    app.py                   → Typer root — registers all 14 core commands
    commands/                → one module per command (init, commit, log, status, diff, show,
                               branch, checkout, merge, reset, revert, cherry_pick, stash, tag)
    models.py                → re-exports store types for backward-import compatibility
    config.py                → .muse/config.toml read/write helpers
    midi_parser.py           → MIDI / MusicXML → NoteEvent (music domain utility, no external deps)
  plugins/
    music/
      plugin.py              → MusicPlugin — the reference MuseDomainPlugin implementation
tools/
  typing_audit.py            → regex + AST violation scanner; CI runs with --max-any 0
tests/
  test_core_store.py         → CommitRecord / SnapshotRecord / TagRecord CRUD
  test_core_snapshot.py      → hashing, manifest building, workdir diff
  test_core_merge_engine.py  → three-way merge, base-finding, conflict detection
  test_cli_workflow.py       → end-to-end CLI: init → commit → log → branch → merge → …
  test_music_plugin.py       → MusicPlugin satisfies MuseDomainPlugin protocol
```
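
To make "content-addressed blob storage" concrete, here is a minimal sketch of the idea: a blob is stored under the SHA-256 digest of its bytes, so identical content is written once. The function names `write_blob` and `read_blob` and the flat one-file-per-digest layout are assumptions for illustration; they are not taken from `object_store.py`.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def write_blob(objects_dir: Path, data: bytes) -> str:
    """Store ``data`` under its SHA-256 digest and return that digest.

    Hypothetical layout: one file per digest directly inside ``objects_dir``.
    """
    digest = hashlib.sha256(data).hexdigest()
    blob_path = objects_dir / digest
    if not blob_path.exists():
        # Identical content hashes to the same name, so it is written only once.
        objects_dir.mkdir(parents=True, exist_ok=True)
        blob_path.write_bytes(data)
    return digest


def read_blob(objects_dir: Path, digest: str) -> bytes:
    """Return the bytes previously stored under ``digest``."""
    return (objects_dir / digest).read_bytes()
```

Because the name is derived from the content, writing the same bytes twice is a no-op, which is what lets snapshots reference blobs by hash alone.
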

### Layer rules (hard constraints)

- **Commands are thin.** `cli/commands/*.py` call `muse.core.*` — no business logic lives in them.
- **Core is domain-agnostic.** `muse.core.*` never imports from `muse.plugins.*`.
- **Plugins are isolated.** `muse.plugins.music.plugin` is the only file that imports music-domain logic.
- **New domains = new plugin.** Add `muse/plugins/<domain>/plugin.py` implementing `MuseDomainPlugin`. The core engine is never modified for a new domain.
- **No async.** Every function is synchronous. No `async def`, no `await`, no `asyncio`.
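
Two of the layer rules above (core never imports plugins, and no async) can be checked mechanically. The sketch below is one possible way to do that with the standard `ast` module; `find_layer_violations` is a hypothetical helper, not an existing Muse tool, and it only inspects absolute imports in a single source string.

```python
from __future__ import annotations

import ast

PLUGIN_ROOT = "muse.plugins"


def _is_plugin_module(name: str) -> bool:
    """True for ``muse.plugins`` itself and anything below it."""
    return name == PLUGIN_ROOT or name.startswith(PLUGIN_ROOT + ".")


def find_layer_violations(source: str) -> list[str]:
    """Report plugin imports and async constructs in one core module's source."""
    found: list[tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if _is_plugin_module(alias.name):
                    found.append((node.lineno, f"imports {alias.name}"))
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            module = node.module or ""
            names = [alias.name for alias in node.names]
            # Covers both `from muse.plugins.x import y` and `from muse import plugins`.
            if _is_plugin_module(module) or (module == "muse" and "plugins" in names):
                found.append((node.lineno, f"imports from {module}"))
        elif isinstance(node, (ast.AsyncFunctionDef, ast.AsyncFor, ast.AsyncWith, ast.Await)):
            found.append((node.lineno, "uses an async construct"))
    # ast.walk is breadth-first, so sort to report in source order.
    return [f"line {lineno}: {message}" for lineno, message in sorted(found)]
```

Relative imports are deliberately skipped here; resolving them would require knowing the file's package path, which this sketch does not model.
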

---

## Branch Discipline — Absolute Rule

**`dev` and `main` are read-only. Every piece of work happens on a feature branch.**

### Full task lifecycle

1. **Start clean.** `git status` — must show `nothing to commit, working tree clean`.
2. **Branch first.** `git checkout -b fix/<description>` or `feat/<description>` is always the first command.
3. **Do the work.** Commit on the branch.
4. **Verify locally** — in this exact order:
   ```bash
   mypy muse/                                                     # zero errors, strict mode
   python tools/typing_audit.py --dirs muse/ tests/ --max-any 0   # zero typing violations
   pytest tests/ -v                                               # all 99+ tests green
   ```
5. **Open a PR** against `dev` via `gh pr create` or the GitHub MCP tool.
6. **Merge immediately.** Feature→dev: squash. Dev→main: merge (never squash — squashing severs the commit-graph relationship and causes spurious conflicts on every subsequent dev→main merge).
7. **Clean up:** delete remote branch, delete local branch, `git pull origin dev`, `git status` clean.
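
The "exact order" in step 4 amounts to a fail-fast gate: run each check in sequence and stop at the first one that fails. A minimal sketch of that behaviour follows. The command tuples are copied from step 4; `first_failure` and `run_command` are hypothetical names, and the injectable `runner` parameter exists only so the logic can be exercised without invoking the real tools.

```python
from __future__ import annotations

import subprocess
from collections.abc import Callable, Sequence

# The three local checks, in the order the lifecycle requires.
VERIFY_COMMANDS: tuple[tuple[str, ...], ...] = (
    ("mypy", "muse/"),
    ("python", "tools/typing_audit.py", "--dirs", "muse/", "tests/", "--max-any", "0"),
    ("pytest", "tests/", "-v"),
)


def run_command(command: Sequence[str]) -> int:
    """Run one command synchronously and return its exit code."""
    return subprocess.run(list(command), check=False).returncode


def first_failure(
    commands: Sequence[Sequence[str]],
    runner: Callable[[Sequence[str]], int] = run_command,
) -> Sequence[str] | None:
    """Run commands in order; return the first that fails, or None if all pass."""
    for command in commands:
        if runner(command) != 0:
            # Later checks are skipped: there is no point running pytest on code mypy rejects.
            return command
    return None
```

Calling `first_failure(VERIFY_COMMANDS)` from the repository root would run the real checks; a return value of `None` means the branch is ready for a PR.
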

### Enforcement protocol

| Checkpoint | Command | Expected |
|-----------|---------|----------|
| Before branching | `git status` | `nothing to commit, working tree clean` |
| Before opening PR | `mypy` + `typing_audit` + `pytest` | All pass locally |
| After task | Branch deleted, dev pulled | `git status` clean |

---

## GitHub Interactions — MCP First

The `user-github` MCP server is available in every Cursor session. Prefer MCP tools over `gh` CLI.

| Operation | MCP tool |
|-----------|----------|
| Read an issue | `issue_read` |
| Create / edit an issue | `issue_write` |
| Add a comment | `add_issue_comment` |
| List issues | `list_issues` |
| Search issues / PRs | `search_issues`, `search_pull_requests` |
| Read a PR | `pull_request_read` |
| Create a PR | `create_pull_request` |
| Merge a PR | `merge_pull_request` |
| Create a review | `pull_request_review_write` |
| List / create branches | `list_branches`, `create_branch` |
| Get current user | `get_me` |
| Search code | `search_code` |

Only fall back to `gh` CLI for operations not yet covered by the MCP server.

---

## Code Standards

- **`from __future__ import annotations`** is the first import in every Python file, immediately after the module docstring. No exceptions.
- **Type hints everywhere — 100% coverage.** No untyped function parameters, no untyped return values.
- **Modern syntax only:** `list[X]`, `dict[K, V]`, `X | None` — never `List`, `Dict`, `Optional[X]`.
- **Synchronous I/O.** No `async`, no `await`, no `asyncio` anywhere in `muse/`.
- **`logging.getLogger(__name__)`** — never `print()`.
- **Docstrings** on public modules, classes, and functions. "Why" over "what."
- **Sparse logs.** Emoji prefixes where used: ❌ error, ⚠️ warning, ✅ success.
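
As an illustration of the standards above applied together, here is a small module skeleton: docstring first, then the `__future__` import, full type hints with modern syntax, a module-level logger, and "why" docstrings. The function `short_id` is a made-up example and is not part of Muse.

```python
"""Short commit-id formatting for log output.

Lives in one place so every caller truncates ids the same way.
"""
from __future__ import annotations

import logging

logger = logging.getLogger(__name__)


def short_id(commit_id: str, length: int = 8) -> str | None:
    """Return the first ``length`` characters of ``commit_id``.

    Returns None instead of a shorter-than-requested prefix so that callers
    must handle a malformed id explicitly rather than display it.
    """
    if len(commit_id) < length:
        logger.warning("⚠️ commit id shorter than %d characters: %r", length, commit_id)
        return None
    return commit_id[:length]
```

Note that the single log call is reserved for the abnormal path, which is one reading of "sparse logs".
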

---

## Typing — Zero-Tolerance Rules

Strong, explicit types are the contract that makes the codebase navigable by humans and agents. These rules have no exceptions.

**Banned — no exceptions:**

| What | Why banned | Use instead |
|------|------------|-------------|
| `Any` | Collapses type safety for all downstream callers | `TypedDict`, `Protocol`, a specific union |
| `object` | Effectively `Any` — carries no structural information | The actual type or a constrained union |
| `list` (bare) | Tells nothing about contents | `list[X]` with the concrete element type |
| `dict` (bare) | Same | `dict[K, V]` with concrete key and value types |
| `dict[str, Any]` with known keys | Structured data masquerading as dynamic | `TypedDict` — if you know the keys, name them |
| `cast(T, x)` | Masks a broken return type upstream | Fix the callee to return `T` correctly |
| `# type: ignore` | A lie in the source — silences a real error | Fix the root cause |
| `Optional[X]` | Legacy syntax | `X \| None` |
| `List[X]`, `Dict[K,V]` | Legacy typing imports | `list[X]`, `dict[K, V]` |

**The known-keys rule:** `dict[K, V]` is correct when any key is valid at runtime. If you know the keys at write time, use a `TypedDict` and name them. `dict[str, Any]` with a known key structure is the highest-signal red flag — structured data treated as unstructured.

**The cast rule:** writing `cast(SomeType, value)` means the function producing `value` returns the wrong type. Do not paper over it. Go upstream, fix the return type, let the correct type flow down.
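
The known-keys rule is easiest to see side by side. In the sketch below, `CommitSummary` has keys fixed at write time, so it is a `TypedDict`; `tag_counts` is keyed by arbitrary runtime strings, so `dict[str, int]` is the right type. Both names and the field set are hypothetical and do not reflect Muse's actual record schema.

```python
from __future__ import annotations

from typing import TypedDict


class CommitSummary(TypedDict):
    """Keys are known when the code is written, so each one is named and typed."""

    commit_id: str
    message: str
    parent_ids: list[str]


def summarize(commit_id: str, message: str, parent_ids: list[str]) -> CommitSummary:
    """Return the precise shape directly, so no caller ever needs cast()."""
    return CommitSummary(commit_id=commit_id, message=message, parent_ids=parent_ids)


def tag_counts(tags: list[str]) -> dict[str, int]:
    """Keys are arbitrary tag names chosen at runtime, so dict[K, V] is correct."""
    counts: dict[str, int] = {}
    for tag in tags:
        counts[tag] = counts.get(tag, 0) + 1
    return counts
```

Because `summarize` declares `CommitSummary` as its return type, the correct type flows to every caller, which is the upstream fix the cast rule asks for.
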

### Enforcement chain

| Layer | Command | Threshold |
|-------|---------|-----------|
| Local | `mypy muse/` | strict, 0 errors |
| Typing ceiling | `python tools/typing_audit.py --dirs muse/ tests/ --max-any 0` | 0 violations — blocks commit |
| CI | `mypy muse/` in GitHub Actions | 0 errors — blocks PR merge |
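
For intuition about what a "regex + AST" typing ceiling might look like, here is a deliberately small sketch. It is not the implementation of `tools/typing_audit.py`; `count_violations` and the exact set of banned names are assumptions chosen to mirror part of the banned table above.

```python
from __future__ import annotations

import ast
import re

# Names whose use (not import) counts as a violation in this sketch.
BANNED_NAMES: frozenset[str] = frozenset({"Any", "Optional", "List", "Dict", "cast"})
TYPE_IGNORE = re.compile(r"#\s*type:\s*ignore")


def count_violations(source: str) -> int:
    """Count banned typing constructs in one module's source text."""
    # Regex pass: comments are invisible to the AST, so match them per line.
    count = sum(1 for line in source.splitlines() if TYPE_IGNORE.search(line))
    # AST pass: catches `Any` and `typing.Any` wherever they are referenced.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and node.id in BANNED_NAMES:
            count += 1
        elif isinstance(node, ast.Attribute) and node.attr in BANNED_NAMES:
            count += 1
    return count
```

A ceiling of zero then reduces to failing whenever the summed count over all scanned files is positive. The regex pass is naive and would also match the phrase inside a string literal; a real scanner would need to tokenize comments.
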

---

## Testing Standards

| Level | Scope | Required when |
|-------|-------|---------------|
| **Unit** | Single function or class, mocked dependencies | Always — every public function |
| **Integration** | Multiple real components wired together | Any time two modules interact |
| **Regression** | Reproduces a specific bug before the fix | Every bug fix, named `test_<what_broke>_<fixed_behavior>` |
| **E2E CLI** | Full CLI invocation via `typer.testing.CliRunner` | Any user-facing command |

**Test scope:** run only the test files covering changed source files. The full suite is the gate for dev→main merges.

**Agents own all broken tests — not just theirs.** If you see a failing test, fix it or block the PR. "This was already broken" is not an acceptable response.
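
The regression naming pattern `test_<what_broke>_<fixed_behavior>` is illustrated below with an invented scenario: a helper that once returned a branch name with its trailing newline still attached. `normalize_branch_name` and the bug itself are hypothetical and serve only to show the shape of such a test.

```python
from __future__ import annotations


def normalize_branch_name(raw: str) -> str:
    """Strip surrounding whitespace from a branch name read from disk.

    Hypothetical helper: exists so no caller compares names against 'dev\\n'.
    """
    return raw.strip()


def test_branch_name_trailing_newline_is_stripped() -> None:
    """Regression: fails if the helper ever returns the raw text again."""
    # <what_broke> = branch_name_trailing_newline, <fixed_behavior> = is_stripped
    assert normalize_branch_name("dev\n") == "dev"
```

Written before the fix, this test fails and pins the bug; after the fix it stays in the suite to keep the behaviour from regressing.
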

---

## Verification Checklist

Run before opening any PR:

- [ ] On a feature branch — never on `dev` or `main`
- [ ] `mypy muse/` — zero errors, strict mode
- [ ] `python tools/typing_audit.py --dirs muse/ tests/ --max-any 0` — zero violations
- [ ] `pytest tests/ -v` — all tests green
- [ ] No `Any`, `object`, bare collections, `cast()`, `# type: ignore`, `Optional[X]`, `List`/`Dict`
- [ ] No dead code, no Maestro references, no async/await
- [ ] Affected docs updated in the same commit
- [ ] No secrets, no `print()`, no orphaned imports

---

## Scope of Authority

### Decide yourself
- Implementation details within existing patterns.
- Bug fixes with regression tests.
- Refactoring that preserves behaviour.
- Test additions and improvements.
- Doc updates reflecting code changes.

### Ask the user first
- New plugin domains (`muse/plugins/<domain>/`).
- New dependencies in `pyproject.toml`.
- Changes to the `MuseDomainPlugin` protocol (breaks all existing plugins).
- New CLI commands (user-facing API changes).
- Architecture changes (new layers, new storage formats).

---

## Anti-Patterns (never do these)

- Working directly on `dev` or `main`.
- `Any`, `object`, bare collections, `cast()`, `# type: ignore` — absolute bans.
- `Optional[X]`, `List[X]`, `Dict[K,V]` — use modern syntax.
- `async`/`await` anywhere in `muse/`.
- Importing from `muse.plugins.*` inside `muse.core.*`.
- Adding `fastapi`, `sqlalchemy`, `pydantic`, `httpx`, `asyncpg` as dependencies.
- Referencing `maestro`, `agentception`, or `tourdeforce` — prior projects, fully excised.
- `print()` for diagnostics.
- Merging with a known failing test.