# Muse VCS — Architecture Reference

> **Status:** Canonical — Muse v0.1.1
> **See also:** [E2E Walkthrough](muse-e2e-demo.md) · [Plugin Protocol](../protocol/muse-protocol.md) · [Domain Concepts](../protocol/muse-domain-concepts.md) · [Type Contracts](../reference/type-contracts.md)

---

## What Muse Is

Muse is a **domain-agnostic version control system for multidimensional state**. It provides
a complete DAG engine — content-addressed objects, commits, branches, three-way merge, drift
detection, time-travel checkout, and a full log graph — with one deliberate gap: it does not
know what "state" is.

That gap is the plugin slot. A `MuseDomainPlugin` tells Muse how to:

- **snapshot** the current live state into a serializable, content-addressable dict
- **diff** two snapshots into a minimal delta
- **merge** two divergent snapshots against a common ancestor
- measure **drift**: how far live state has diverged from the last commit
- **apply** a delta to produce a new live state (checkout execution)

Everything else — the DAG, object store, branching, lineage walking, log, merge state
machine — is provided by the core engine and shared across all domains.

---

## The Seven Invariants

```
State    = a serializable, content-addressed snapshot of any multidimensional space
Commit   = a named delta from a parent state, recorded in a DAG
Branch   = a divergent line of intent forked from a shared ancestor
Merge    = three-way reconciliation of two divergent state lines against a common base
Drift    = the gap between committed state and live state
Checkout = deterministic reconstruction of any historical state from the DAG
Lineage  = the causal chain from root to any commit
```

None of those definitions contain the word "music."

---

## Repository Structure on Disk

Every Muse repository is a `.muse/` directory containing:

```
.muse/
  repo.json              — repository ID, domain name, creation metadata
  HEAD                   — ref pointer, e.g. refs/heads/main
  config.toml            — optional local config (auth token, remotes)
  refs/
    heads/
      main               — SHA-256 commit ID of branch HEAD
      feature/...        — additional branch HEADs
  objects/
    <sha2>/              — shard directory (first 2 hex chars)
      <sha62>            — raw content-addressed blob (62 remaining hex chars)
  commits/
    <commit_id>.json     — CommitRecord
  snapshots/
    <snapshot_id>.json   — SnapshotRecord (manifest: {path → object_id})
  tags/
    <tag_id>.json        — TagRecord
  MERGE_STATE.json       — present only during an active merge conflict
  sessions/              — optional: named work sessions (muse session)
muse-work/               — the working tree (domain files live here)
.museattributes          — optional: per-path merge strategy overrides
```

The object store mirrors Git's loose-object layout: sharding by the first two hex
characters of each SHA-256 digest spreads objects across up to 256 subdirectories, so no
single directory grows large enough to slow filesystem operations as the repository grows.
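
As an illustration of the layout above, the following sketch maps a digest to its shard path and writes a blob only if it is not already present. The helper names `object_path` and `write_object` are hypothetical and chosen for this example; the actual API in `muse/core/object_store.py` may differ. The boolean result follows the convention that `False` means the object already existed.

```python
import hashlib
import pathlib
import tempfile


def object_path(root: pathlib.Path, object_id: str) -> pathlib.Path:
    """Map a 64-char hex digest to .muse/objects/<first 2>/<remaining 62>."""
    return root / ".muse" / "objects" / object_id[:2] / object_id[2:]


def write_object(root: pathlib.Path, data: bytes) -> tuple[str, bool]:
    """Store a blob (hypothetical helper).

    Returns (object_id, written); written is False when the blob was
    already in the store and the write was skipped.
    """
    object_id = hashlib.sha256(data).hexdigest()
    dest = object_path(root, object_id)
    if dest.exists():
        return object_id, False
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(data)
    return object_id, True


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    oid, first_write = write_object(root, b"hello")
    _, second_write = write_object(root, b"hello")  # same bytes, same ID
    shard_name = object_path(root, oid).parent.name
    blob_name = object_path(root, oid).name
```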

---

## Core Engine Modules

```
muse/
  domain.py            — MuseDomainPlugin Protocol + all shared type definitions
  core/
    store.py           — file-based commit / snapshot / tag store (no external DB)
    repo.py            — repository detection (MUSE_REPO_ROOT or directory walk)
    snapshot.py        — content-addressed snapshot and commit ID derivation
    object_store.py    — SHA-256 blob storage under .muse/objects/
    merge_engine.py    — three-way merge state machine + conflict resolution
    errors.py          — ExitCode enum and error primitives
  plugins/
    registry.py        — maps domain names → MuseDomainPlugin instances
    music/
      plugin.py        — MusicPlugin: reference MuseDomainPlugin implementation
  cli/
    app.py             — Typer application root, command registration
    commands/          — one file per subcommand
```

---

## Deterministic ID Derivation

All IDs are SHA-256 digests, making the DAG fully content-addressed:

```
object_id   = sha256(raw_file_bytes)
snapshot_id = sha256(sorted("path:object_id\n" pairs))
commit_id   = sha256(sorted_parent_ids | snapshot_id | message | timestamp_iso)
```

The same snapshot always produces the same ID. Two commits that point to identical
state will share a `snapshot_id`. Objects are never overwritten: writes are idempotent,
and a `False` return means the object already existed and the write was skipped.
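
The three formulas can be sketched in Python as follows. The function names `compute_snapshot_id` and `compute_commit_id` appear elsewhere in this document, but the bodies here are assumptions: the exact byte encoding and the separators (`,` between parent IDs, `|` between fields) are chosen for illustration and may not match `muse/core/snapshot.py`. `compute_object_id` is a hypothetical name. Sorting before hashing is what makes the IDs independent of dict or parent ordering.

```python
import hashlib


def compute_object_id(raw: bytes) -> str:
    """object_id = sha256(raw_file_bytes)."""
    return hashlib.sha256(raw).hexdigest()


def compute_snapshot_id(manifest: dict[str, str]) -> str:
    """Hash the sorted "path:object_id" lines of a manifest."""
    lines = sorted(f"{path}:{object_id}\n" for path, object_id in manifest.items())
    return hashlib.sha256("".join(lines).encode("utf-8")).hexdigest()


def compute_commit_id(
    parent_ids: list[str], snapshot_id: str, message: str, timestamp_iso: str
) -> str:
    """Hash parents, snapshot, message, and timestamp together.

    Assumption: parents are joined with "," and fields with "|".
    """
    payload = "|".join(
        [",".join(sorted(parent_ids)), snapshot_id, message, timestamp_iso]
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


manifest = {
    "tracks/bass.mid": compute_object_id(b"bass"),
    "tracks/lead.mid": compute_object_id(b"lead"),
}
reordered = dict(reversed(list(manifest.items())))
snapshot_id = compute_snapshot_id(manifest)
same_snapshot = snapshot_id == compute_snapshot_id(reordered)

timestamp = "2024-01-01T00:00:00+00:00"
commit_a = compute_commit_id(["p1", "p2"], snapshot_id, "merge", timestamp)
commit_b = compute_commit_id(["p2", "p1"], snapshot_id, "merge", timestamp)
```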

---

## Plugin Architecture

### The Protocol

```python
import pathlib
from typing import Protocol

# LiveState, StateSnapshot, StateDelta, MergeResult, and DriftReport are among
# the shared type definitions in muse/domain.py.


class MuseDomainPlugin(Protocol):
    def snapshot(self, live_state: LiveState) -> StateSnapshot:
        """Capture current live state as a serializable, hashable snapshot."""

    def diff(self, base: StateSnapshot, target: StateSnapshot) -> StateDelta:
        """Compute the minimal delta between two snapshots."""

    def merge(
        self,
        base: StateSnapshot,
        left: StateSnapshot,
        right: StateSnapshot,
        *,
        repo_root: pathlib.Path | None = None,
    ) -> MergeResult:
        """Three-way merge. Loads .museattributes when repo_root is given.
        Returns merged snapshot, conflict paths, applied_strategies, and
        dimension_reports."""

    def drift(
        self,
        committed: StateSnapshot,
        live: LiveState,
    ) -> DriftReport:
        """Compare committed state against current live state."""

    def apply(self, delta: StateDelta, live_state: LiveState) -> LiveState:
        """Apply a delta to produce a new live state (checkout execution)."""
```

### How CLI Commands Use the Plugin

Every CLI command that touches domain state goes through `resolve_plugin(root)`:

| Command | Plugin method(s) called |
|---|---|
| `muse commit` | `snapshot()` |
| `muse status` | `drift()` |
| `muse diff` | `diff()` |
| `muse merge` | `merge()` |
| `muse cherry-pick` | `merge()` |
| `muse stash` | `snapshot()` |
| `muse checkout` | `diff()` + `apply()` |

The plugin registry (`muse/plugins/registry.py`) reads `domain` from `.muse/repo.json`
and returns the appropriate `MuseDomainPlugin` instance. Unknown domains raise a
`ValueError` listing the registered alternatives.
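
A minimal sketch of that lookup is shown below. The name `resolve_plugin` and the `domain` key come from the description above; the function body, the error message wording, and the empty `MusicPlugin` stand-in class are assumptions made so the example runs on its own.

```python
import json
import pathlib
import tempfile


class MusicPlugin:
    """Stand-in for the real plugin class; methods omitted for brevity."""


_REGISTRY: dict[str, object] = {"music": MusicPlugin()}


def resolve_plugin(root: pathlib.Path) -> object:
    """Read the domain from .muse/repo.json and return its registered plugin."""
    repo = json.loads((root / ".muse" / "repo.json").read_text())
    domain = repo["domain"]
    try:
        return _REGISTRY[domain]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY))
        raise ValueError(
            f"unknown domain {domain!r}; registered domains: {known}"
        ) from None


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    (root / ".muse").mkdir()
    repo_json = root / ".muse" / "repo.json"

    repo_json.write_text(json.dumps({"domain": "music"}))
    plugin = resolve_plugin(root)

    repo_json.write_text(json.dumps({"domain": "genomics"}))
    try:
        resolve_plugin(root)
        error = None
    except ValueError as exc:
        error = str(exc)
```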

### Registering a New Domain

```python
# muse/plugins/registry.py
from muse.domain import MuseDomainPlugin
from muse.plugins.music.plugin import MusicPlugin
from muse.plugins.my_domain.plugin import MyDomainPlugin  # new import

_REGISTRY: dict[str, MuseDomainPlugin] = {
    "music": MusicPlugin(),
    "my_domain": MyDomainPlugin(),  # new entry
}
```

Then initialize a repository for that domain:

```bash
muse init --domain my_domain
```

---

## Music Plugin — Reference Implementation

The music plugin (`muse/plugins/music/plugin.py`) implements `MuseDomainPlugin` for
MIDI state stored as files in `muse-work/`. It is the proof that the abstraction works.

| Method | Music domain behavior |
|---|---|
| `snapshot()` | Walk `muse-work/`, SHA-256 each file → `{"files": {path: hash}, "domain": "music"}` |
| `diff()` | Set difference on file paths + hash comparison → added / removed / modified lists |
| `merge()` | Three-way set reconciliation; consensus deletions are not conflicts |
| `drift()` | `snapshot(workdir)` then `diff(committed, live)` → `DriftReport` |
| `apply()` | With a Path: rescan workdir (files already updated). With a dict: apply removals. |
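
The `snapshot()` and `diff()` rows can be approximated with a few lines of standard-library Python. This is a sketch of the technique the table describes, not the plugin's code: `snapshot_dir` and `diff_manifests` are hypothetical names, and the delta is assumed to be a dict with `added`, `removed`, and `modified` path lists. Running the diff between a committed snapshot and a fresh snapshot of the working directory is the same comparison the `drift()` row describes.

```python
import hashlib
import pathlib
import tempfile


def snapshot_dir(workdir: pathlib.Path) -> dict:
    """Walk a directory and hash every file into a {path: sha256} manifest."""
    files = {
        path.relative_to(workdir).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(workdir.rglob("*"))
        if path.is_file()
    }
    return {"files": files, "domain": "music"}


def diff_manifests(base: dict[str, str], target: dict[str, str]) -> dict[str, list[str]]:
    """Set difference on paths plus hash comparison on shared paths."""
    base_paths, target_paths = set(base), set(target)
    return {
        "added": sorted(target_paths - base_paths),
        "removed": sorted(base_paths - target_paths),
        "modified": sorted(
            path for path in base_paths & target_paths if base[path] != target[path]
        ),
    }


with tempfile.TemporaryDirectory() as tmp:
    workdir = pathlib.Path(tmp)
    (workdir / "a.mid").write_bytes(b"v1")
    (workdir / "b.mid").write_bytes(b"v1")
    committed = snapshot_dir(workdir)

    (workdir / "a.mid").write_bytes(b"v2")  # modify
    (workdir / "b.mid").unlink()            # remove
    (workdir / "c.mid").write_bytes(b"v1")  # add
    live = snapshot_dir(workdir)

    delta = diff_manifests(committed["files"], live["files"])
```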

---

## Merge Algorithm

`muse merge <branch>` performs a three-way merge:

1. **Find merge base** — walk the commit DAG from both HEADs to find the LCA
2. **Construct snapshots** — load base, ours, and theirs `StateSnapshot` objects
3. **Call `plugin.merge(base, ours, theirs)`** — domain logic reconciles the state
4. **Handle result:**
   - Clean merge → restore working tree, create merge commit (two parents)
   - Conflicts → write `MERGE_STATE.json`, restore what can be auto-merged, report conflict paths
5. **`muse merge --continue`** — after manual resolution, commit with stored parents
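
Step 1 can be sketched as a search over a parent map. The code below is one straightforward way to compute lowest common ancestors, assuming the DAG is available as a dict from commit ID to parent IDs; `all_ancestors` and `merge_bases` are hypothetical names, and the engine's actual traversal may differ. It returns a list because a criss-cross history can have more than one lowest common ancestor.

```python
def all_ancestors(parents: dict[str, list[str]], start: str) -> set[str]:
    """Return every commit reachable from start via parent links, start included."""
    seen = {start}
    stack = [start]
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def merge_bases(parents: dict[str, list[str]], ours: str, theirs: str) -> list[str]:
    """Common ancestors that are not proper ancestors of another common ancestor."""
    common = all_ancestors(parents, ours) & all_ancestors(parents, theirs)
    dominated: set[str] = set()
    for commit in common:
        dominated |= all_ancestors(parents, commit) - {commit}
    return sorted(common - dominated)


# root -> a -> b (main), with a feature branch c forked from a
history = {"root": [], "a": ["root"], "b": ["a"], "c": ["a"]}
base_for_branches = merge_bases(history, "b", "c")

# When one head is an ancestor of the other, that head is the merge base.
base_for_fast_forward = merge_bases(history, "b", "a")

# Criss-cross history: x and y each merged both a2 and b2.
criss_cross = {"r": [], "a2": ["r"], "b2": ["r"], "x": ["a2", "b2"], "y": ["a2", "b2"]}
bases_for_criss_cross = merge_bases(criss_cross, "x", "y")
```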

`MERGE_STATE.json` records `base_commit`, `ours_commit`, `theirs_commit`, and
`conflict_paths` so the CLI can resume after the user resolves conflicts.
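
For file manifests, the reconciliation in step 3 and the resulting merge state can be sketched as below. This is an illustrative three-way merge over `{path: object_id}` dicts, not the music plugin's implementation: `merge_manifests` is a hypothetical name, the commit IDs are placeholders, and keeping the "ours" version for conflicting paths until the user resolves them is an assumption. The four keys of the merge state dict are the ones named above.

```python
def merge_manifests(
    base: dict[str, str], ours: dict[str, str], theirs: dict[str, str]
) -> tuple[dict[str, str], list[str]]:
    """Three-way merge of manifests; returns (merged, conflict_paths)."""
    merged: dict[str, str] = {}
    conflicts: list[str] = []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            winner = o  # both sides agree, including deletion on both sides
        elif o == b:
            winner = t  # only theirs changed this path
        elif t == b:
            winner = o  # only ours changed this path
        else:
            conflicts.append(path)
            winner = o  # assumption: keep ours until the conflict is resolved
        if winner is not None:
            merged[path] = winner
    return merged, conflicts


base = {"a": "1", "b": "1", "c": "1", "d": "1"}
ours = {"a": "2", "b": "1", "d": "2"}              # edit a, delete c, edit d
theirs = {"a": "1", "b": "3", "d": "3", "e": "1"}  # edit b, delete c, edit d, add e

merged, conflicts = merge_manifests(base, ours, theirs)

merge_state = None
if conflicts:
    merge_state = {
        "base_commit": "base-id",      # placeholder IDs for illustration
        "ours_commit": "ours-id",
        "theirs_commit": "theirs-id",
        "conflict_paths": conflicts,
    }
```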

---

## Checkout Algorithm

`muse checkout <target>` uses incremental delta restoration:

1. Read current branch's `StateSnapshot` from the object store
2. Read target `StateSnapshot`
3. Call `plugin.diff(current, target)` → delta
4. **Remove** files in `delta["removed"]` from `muse-work/`
5. **Restore** files in `delta["added"] + delta["modified"]` from the object store
6. Call `plugin.apply(delta, workdir)` — domain-level post-checkout hook

Only files that actually changed are touched. Unchanged files are never re-copied,
making checkout fast even for large repositories.
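
Steps 3 to 5 can be sketched as follows, assuming manifests are `{path: object_id}` dicts and using an in-memory dict as a stand-in for the on-disk object store. `diff_manifests` and `restore_delta` are hypothetical names for this example. The returned list of touched paths shows that an unchanged file is never rewritten.

```python
import hashlib
import pathlib
import tempfile


def diff_manifests(base: dict[str, str], target: dict[str, str]) -> dict[str, list[str]]:
    """Compute added / removed / modified path lists between two manifests."""
    shared = set(base) & set(target)
    return {
        "added": sorted(set(target) - set(base)),
        "removed": sorted(set(base) - set(target)),
        "modified": sorted(p for p in shared if base[p] != target[p]),
    }


def restore_delta(
    workdir: pathlib.Path,
    delta: dict[str, list[str]],
    target: dict[str, str],
    objects: dict[str, bytes],
) -> list[str]:
    """Apply a delta to the working tree; return the paths that were touched."""
    touched = []
    for path in delta["removed"]:
        (workdir / path).unlink(missing_ok=True)
        touched.append(path)
    for path in delta["added"] + delta["modified"]:
        dest = workdir / path
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_bytes(objects[target[path]])
        touched.append(path)
    return touched


def oid(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


# In-memory stand-in for the object store: object_id -> raw bytes
objects = {oid(blob): blob for blob in (b"keep", b"old", b"edit-v1", b"edit-v2", b"new")}
current = {"keep.mid": oid(b"keep"), "old.mid": oid(b"old"), "edit.mid": oid(b"edit-v1")}
target = {"keep.mid": oid(b"keep"), "edit.mid": oid(b"edit-v2"), "new.mid": oid(b"new")}

with tempfile.TemporaryDirectory() as tmp:
    workdir = pathlib.Path(tmp)
    for path, object_id in current.items():
        (workdir / path).write_bytes(objects[object_id])

    delta = diff_manifests(current, target)
    touched = restore_delta(workdir, delta, target, objects)
    final = {p.name: p.read_bytes() for p in sorted(workdir.iterdir())}
```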

---

## Commit Data Flow

```
muse commit -m "message"
  │
  ├─ plugin.snapshot(workdir) → StateSnapshot {"files": {path: sha}, "domain": "..."}
  │
  ├─ compute_snapshot_id(manifest) → snapshot_id (sha256 of sorted path:sha pairs)
  │
  ├─ compute_commit_id(parents, snapshot_id, message, timestamp) → commit_id
  │
  ├─ write_object_from_path(root, sha, src) ×N (idempotent)
  │
  ├─ write_snapshot(root, SnapshotRecord) (idempotent)
  │
  ├─ write_commit(root, CommitRecord)
  │
  └─ update refs/heads/<branch> → commit_id
```

Revert and cherry-pick reuse existing snapshot IDs directly — no re-scan needed
since the objects are already content-addressed in the store.
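
The last two steps of the flow, and the way a revert can reuse an existing snapshot ID, can be sketched like this. Everything here is an assumption made for illustration: `record_commit` is a hypothetical helper, the record's field names are not taken from the real `CommitRecord`, and the commit ID payload format is invented. Only the file locations (`.muse/commits/<commit_id>.json` and `.muse/refs/heads/<branch>`) come from the repository layout described earlier.

```python
import hashlib
import json
import pathlib
import tempfile


def record_commit(
    root: pathlib.Path, branch: str, snapshot_id: str, message: str, timestamp_iso: str
) -> str:
    """Write a commit record and advance the branch ref (illustrative only)."""
    muse_dir = root / ".muse"
    ref = muse_dir / "refs" / "heads" / branch
    parents = [ref.read_text().strip()] if ref.exists() else []

    # Assumed payload format: parents joined with ",", fields joined with "|".
    payload = "|".join([",".join(sorted(parents)), snapshot_id, message, timestamp_iso])
    commit_id = hashlib.sha256(payload.encode("utf-8")).hexdigest()

    record = {
        "commit_id": commit_id,
        "parents": parents,
        "snapshot_id": snapshot_id,
        "message": message,
        "timestamp": timestamp_iso,
    }
    commits_dir = muse_dir / "commits"
    commits_dir.mkdir(parents=True, exist_ok=True)
    (commits_dir / f"{commit_id}.json").write_text(json.dumps(record, indent=2))

    ref.parent.mkdir(parents=True, exist_ok=True)
    ref.write_text(commit_id)
    return commit_id


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    first = record_commit(root, "main", "snap-1", "initial", "2024-01-01T00:00:00+00:00")
    second = record_commit(root, "main", "snap-2", "edit", "2024-01-02T00:00:00+00:00")
    # A revert points a new commit at the earlier snapshot; nothing is rescanned.
    revert = record_commit(root, "main", "snap-1", "revert edit", "2024-01-03T00:00:00+00:00")

    head = (root / ".muse" / "refs" / "heads" / "main").read_text()
    revert_record = json.loads((root / ".muse" / "commits" / f"{revert}.json").read_text())
```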

---

## CLI Command Map

### Core VCS (all domains)

| Command | Description |
|---|---|
| `muse init [--domain <name>]` | Initialize a repository |
| `muse commit -m <msg>` | Snapshot live state and record a commit |
| `muse status` | Show drift between HEAD and working tree |
| `muse diff [<base>] [<target>]` | Show delta between commits or vs. working tree |
| `muse log [--oneline] [--graph] [--stat]` | Display commit history |
| `muse show [<ref>] [--json] [--stat]` | Inspect a single commit |
| `muse branch [<name>] [-d <name>]` | Create or delete branches |
| `muse checkout <branch\|commit> [-b]` | Switch branches or restore historical state |
| `muse merge <branch>` | Three-way merge a branch into HEAD |
| `muse cherry-pick <commit>` | Apply a specific commit's delta on top of HEAD |
| `muse revert <commit>` | Create a new commit undoing a prior commit |
| `muse reset <commit> [--hard]` | Move branch pointer (hard: also restore working tree) |
| `muse stash` / `pop` / `list` / `drop` | Temporarily shelve uncommitted changes |
| `muse tag add <tag> [<ref>]` | Tag a commit |
| `muse tag list [<ref>]` | List tags |

### Music-Domain Extras (music plugin only)

| Command | Description |
|---|---|
| `muse commit --section <name> --track <name> --emotion <name>` | Commit with music metadata |
| `muse log --section <s> --track <t> --emotion <e>` | Filter log by music metadata |
| `muse groove-check` | Analyse rhythmic drift across history |
| `muse emotion-diff <a> <b>` | Compare emotion vectors between commits |

---

## Testing

```bash
# Run full test suite
python -m pytest

# Run with coverage report
python -m pytest --cov=muse --cov-report=term-missing

# Run type audit (zero violations enforced in CI)
python tools/typing_audit.py --dirs muse/ tests/ --max-any 0

# Run mypy
mypy muse/
```

Coverage target: ≥ 80% (currently 91%, excluding `config.py`, `midi_parser.py`).

CI runs pytest + mypy + typing_audit on every pull request to `main` and `dev`.

---

## Adding a Second Domain

To add a new domain (e.g. `genomics`):

1. Create `muse/plugins/genomics/plugin.py` implementing `MuseDomainPlugin`
2. Register it in `muse/plugins/registry.py`
3. Run `muse init --domain genomics` in any project directory
4. All existing CLI commands work immediately — no changes needed

The music plugin (`muse/plugins/music/plugin.py`) is the complete reference for what
each method should do. It is 326 lines including full docstrings.