83fa3d6e docs: full sweep — domain-agnostic rewrite of all docs Gabriel Cardona <gabriel@tellurstori.com> 3d ago
# Muse VCS — Architecture Reference

> **Status:** Canonical — Muse v0.1.1
> **See also:** [E2E Walkthrough](muse-e2e-demo.md) · [Plugin Protocol](../protocol/muse-protocol.md) · [Domain Concepts](../protocol/muse-domain-concepts.md) · [Type Contracts](../reference/type-contracts.md)

---

## What Muse Is

Muse is a **domain-agnostic version control system for multidimensional state**. It provides
a complete DAG engine — content-addressed objects, commits, branches, three-way merge, drift
detection, time-travel checkout, and a full log graph — with one deliberate gap: it does not
know what "state" is.

That gap is the plugin slot. A `MuseDomainPlugin` tells Muse how to:

- **snapshot** the current live state into a serializable, content-addressable dict
- **diff** two snapshots into a minimal delta
- **merge** two divergent snapshots against a common ancestor
- **drift** — detect how much live state has diverged from the last commit
- **apply** a delta to produce a new live state (checkout execution)

Everything else — the DAG, object store, branching, lineage walking, log, merge state
machine — is provided by the core engine and shared across all domains.

---

## The Seven Invariants

```
State    = a serializable, content-addressed snapshot of any multidimensional space
Commit   = a named delta from a parent state, recorded in a DAG
Branch   = a divergent line of intent forked from a shared ancestor
Merge    = three-way reconciliation of two divergent state lines against a common base
Drift    = the gap between committed state and live state
Checkout = deterministic reconstruction of any historical state from the DAG
Lineage  = the causal chain from root to any commit
```

None of those definitions contain the word "music."

---

## Repository Structure on Disk

Every Muse repository is a `.muse/` directory containing:

```
.muse/
  repo.json              — repository ID, domain name, creation metadata
  HEAD                   — ref pointer, e.g. refs/heads/main
  config.toml            — optional local config (auth token, remotes)
  refs/
    heads/
      main               — SHA-256 commit ID of branch HEAD
      feature/...        — additional branch HEADs
  objects/
    <sha2>/              — shard directory (first 2 hex chars)
      <sha62>            — raw content-addressed blob (62 remaining hex chars)
  commits/
    <commit_id>.json     — CommitRecord
  snapshots/
    <snapshot_id>.json   — SnapshotRecord (manifest: {path → object_id})
  tags/
    <tag_id>.json        — TagRecord
  MERGE_STATE.json       — present only during an active merge conflict
  sessions/              — optional: named work sessions (muse session)
muse-work/               — the working tree (domain files live here)
.museattributes          — optional: per-path merge strategy overrides
```

The object store mirrors Git's loose-object layout: sharding by the first two hex
characters of each SHA-256 digest spreads blobs across up to 256 subdirectories, so no
single directory grows large enough to slow filesystem lookups as the repository grows.
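
As a rough illustration of the sharded layout, the sketch below maps a digest to its path under `.muse/objects/` and writes a blob only when it is not already present. The helper names `object_path` and `write_object` are hypothetical and are not taken from Muse's `object_store.py`; the boolean flag simply mimics an "already existed, skipped" convention.

```python
import hashlib
from pathlib import Path


def object_path(repo_root: Path, object_id: str) -> Path:
    """Map a 64-char SHA-256 hex digest to <root>/.muse/objects/<2 chars>/<62 chars>."""
    return repo_root / ".muse" / "objects" / object_id[:2] / object_id[2:]


def write_object(repo_root: Path, data: bytes) -> tuple[str, bool]:
    """Store a blob by content hash.

    Returns (object_id, written), where written is False if the blob
    was already in the store and the write was skipped.
    """
    object_id = hashlib.sha256(data).hexdigest()
    dest = object_path(repo_root, object_id)
    if dest.exists():
        return object_id, False  # identical content is already stored
    dest.parent.mkdir(parents=True, exist_ok=True)  # create the shard directory
    dest.write_bytes(data)
    return object_id, True
```

Because the path is derived purely from the content hash, writing the same bytes twice resolves to the same file, which is what makes the skip safe.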

---

## Core Engine Modules

```
muse/
  domain.py            — MuseDomainPlugin Protocol + all shared type definitions
  core/
    store.py           — file-based commit / snapshot / tag store (no external DB)
    repo.py            — repository detection (MUSE_REPO_ROOT or directory walk)
    snapshot.py        — content-addressed snapshot and commit ID derivation
    object_store.py    — SHA-256 blob storage under .muse/objects/
    merge_engine.py    — three-way merge state machine + conflict resolution
    errors.py          — ExitCode enum and error primitives
  plugins/
    registry.py        — maps domain names → MuseDomainPlugin instances
    music/
      plugin.py        — MusicPlugin: reference MuseDomainPlugin implementation
  cli/
    app.py             — Typer application root, command registration
    commands/          — one file per subcommand
```

---

## Deterministic ID Derivation

All IDs are SHA-256 digests, making the DAG fully content-addressed:

```
object_id   = sha256(raw_file_bytes)
snapshot_id = sha256(sorted("path:object_id\n" pairs))
commit_id   = sha256(sorted_parent_ids | snapshot_id | message | timestamp_iso)
```

The same snapshot always produces the same ID. Two commits that point to identical
state will share a `snapshot_id`. Objects are never overwritten — write is always
idempotent (`False` return means "already existed, skipped").
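
The three formulas can be sketched in Python as follows. The function names `compute_snapshot_id` and `compute_commit_id` appear in this document's commit data flow, but the bodies here are assumptions: the UTF-8 encoding, the `,` joiner for parent IDs, and the literal `|` field separator are illustrative choices, not the confirmed byte layout used by `muse/core/snapshot.py`.

```python
import hashlib


def compute_object_id(raw_file_bytes: bytes) -> str:
    """object_id = sha256(raw_file_bytes)."""
    return hashlib.sha256(raw_file_bytes).hexdigest()


def compute_snapshot_id(manifest: dict[str, str]) -> str:
    """Hash the sorted "path:object_id\\n" lines of a manifest."""
    lines = sorted(f"{path}:{object_id}\n" for path, object_id in manifest.items())
    return hashlib.sha256("".join(lines).encode("utf-8")).hexdigest()


def compute_commit_id(
    parent_ids: list[str],
    snapshot_id: str,
    message: str,
    timestamp_iso: str,
) -> str:
    """Hash parents, snapshot, message, and timestamp (separator is an assumption)."""
    fields = [",".join(sorted(parent_ids)), snapshot_id, message, timestamp_iso]
    return hashlib.sha256("|".join(fields).encode("utf-8")).hexdigest()
```

Sorting both the manifest lines and the parent IDs is what makes the result independent of dict insertion order and of the order in which parents are listed.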

---

## Plugin Architecture

### The Protocol

```python
from typing import Protocol

# LiveState, StateSnapshot, StateDelta, MergeResult, and DriftReport are the
# shared types defined alongside this protocol in muse/domain.py.


class MuseDomainPlugin(Protocol):
    def snapshot(self, live_state: LiveState) -> StateSnapshot:
        """Capture current live state as a serializable, hashable snapshot."""

    def diff(self, base: StateSnapshot, target: StateSnapshot) -> StateDelta:
        """Compute the minimal delta between two snapshots."""

    def merge(
        self,
        base: StateSnapshot,
        left: StateSnapshot,
        right: StateSnapshot,
    ) -> MergeResult:
        """Three-way merge. Return merged snapshot + conflict paths."""

    def drift(
        self,
        committed: StateSnapshot,
        live: LiveState,
    ) -> DriftReport:
        """Compare committed state against current live state."""

    def apply(self, delta: StateDelta, live_state: LiveState) -> LiveState:
        """Apply a delta to produce a new live state (checkout execution)."""
```

### How CLI Commands Use the Plugin

Every CLI command that touches domain state goes through `resolve_plugin(root)`:

| Command | Plugin method(s) called |
|---|---|
| `muse commit` | `snapshot()` |
| `muse status` | `drift()` |
| `muse diff` | `diff()` |
| `muse merge` | `merge()` |
| `muse cherry-pick` | `merge()` |
| `muse stash` | `snapshot()` |
| `muse checkout` | `diff()` + `apply()` |

The plugin registry (`muse/plugins/registry.py`) reads `domain` from `.muse/repo.json`
and returns the appropriate `MuseDomainPlugin` instance. Unknown domains raise a
`ValueError` listing the registered alternatives.

### Registering a New Domain

```python
# muse/plugins/registry.py
from muse.domain import MuseDomainPlugin
from muse.plugins.music.plugin import MusicPlugin
from muse.plugins.my_domain.plugin import MyDomainPlugin

# Maps the `domain` value in .muse/repo.json to a plugin instance.
_REGISTRY: dict[str, MuseDomainPlugin] = {
    "music": MusicPlugin(),
    "my_domain": MyDomainPlugin(),
}
```

Then initialize a repository for that domain:

```bash
muse init --domain my_domain
```

---

## Music Plugin — Reference Implementation

The music plugin (`muse/plugins/music/plugin.py`) implements `MuseDomainPlugin` for
MIDI state stored as files in `muse-work/`. It is the proof that the abstraction works.

| Method | Music domain behavior |
|---|---|
| `snapshot()` | Walk `muse-work/`, SHA-256 each file → `{"files": {path: hash}, "domain": "music"}` |
| `diff()` | Set difference on file paths + hash comparison → added / removed / modified lists |
| `merge()` | Three-way set reconciliation; consensus deletions are not conflicts |
| `drift()` | `snapshot(workdir)` then `diff(committed, live)` → `DriftReport` |
| `apply()` | With a Path: rescan workdir (files already updated). With a dict: apply removals. |
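
The file-manifest behaviors in the table can be approximated with plain dictionaries. The sketch below is not the code in `muse/plugins/music/plugin.py`: the function names are hypothetical, the return shapes are simplified, and keeping the left side's entry for a conflicted path is an assumption made only so the example returns something concrete.

```python
import hashlib
from pathlib import Path

Manifest = dict[str, str]  # relative path -> SHA-256 hex digest


def snapshot_files(workdir: Path) -> dict:
    """Walk a directory and hash every file into a manifest."""
    files = {
        p.relative_to(workdir).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(workdir.rglob("*"))
        if p.is_file()
    }
    return {"files": files, "domain": "music"}


def diff_manifests(base: Manifest, target: Manifest) -> dict[str, list[str]]:
    """Set difference on paths plus hash comparison on shared paths."""
    shared = base.keys() & target.keys()
    return {
        "added": sorted(target.keys() - base.keys()),
        "removed": sorted(base.keys() - target.keys()),
        "modified": sorted(p for p in shared if base[p] != target[p]),
    }


def merge_manifests(
    base: Manifest, left: Manifest, right: Manifest
) -> tuple[Manifest, list[str]]:
    """Three-way reconciliation; returns (merged manifest, conflict paths)."""
    merged: Manifest = {}
    conflicts: list[str] = []
    for path in sorted(base.keys() | left.keys() | right.keys()):
        b, l, r = base.get(path), left.get(path), right.get(path)
        if l == r:
            winner = l  # both sides agree, including a consensus deletion (None)
        elif l == b:
            winner = r  # only the right side changed this path
        elif r == b:
            winner = l  # only the left side changed this path
        else:
            conflicts.append(path)
            winner = l  # assumption: provisionally keep the left side
        if winner is not None:
            merged[path] = winner
    return merged, conflicts
```

A path deleted on both sides yields `None == None` in the first branch, so it drops out of the merged manifest without being reported as a conflict.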

---

## Merge Algorithm

`muse merge <branch>` performs a three-way merge:

1. **Find merge base** — walk the commit DAG from both HEADs to find the lowest common ancestor (LCA)
2. **Construct snapshots** — load base, ours, and theirs `StateSnapshot` objects
3. **Call `plugin.merge(base, ours, theirs)`** — domain logic reconciles the state
4. **Handle result:**
   - Clean merge → restore working tree, create merge commit (two parents)
   - Conflicts → write `MERGE_STATE.json`, restore what can be auto-merged, report conflict paths
5. **`muse merge --continue`** — after manual resolution, commit with stored parents

`MERGE_STATE.json` records `base_commit`, `ours_commit`, `theirs_commit`, and
`conflict_paths` so the CLI can resume after the user resolves conflicts.
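
The merge-base search and the conflict record can be sketched as below. The in-memory `commits` mapping (commit ID to parent IDs) and both function names are hypothetical; the actual engine reads `CommitRecord` files and may use a different traversal. Only the four `MERGE_STATE.json` field names are taken from this document.

```python
import json
from pathlib import Path


def ancestors(commits: dict[str, list[str]], start: str) -> set[str]:
    """Return start plus every commit reachable through parent links."""
    seen: set[str] = set()
    stack = [start]
    while stack:
        commit_id = stack.pop()
        if commit_id in seen:
            continue
        seen.add(commit_id)
        stack.extend(commits[commit_id])
    return seen


def merge_bases(commits: dict[str, list[str]], ours: str, theirs: str) -> list[str]:
    """Common ancestors that are not strict ancestors of another common ancestor."""
    common = ancestors(commits, ours) & ancestors(commits, theirs)
    redundant: set[str] = set()
    for commit_id in common:
        redundant |= ancestors(commits, commit_id) - {commit_id}
    return sorted(common - redundant)


def write_merge_state(
    root: Path, base: str, ours: str, theirs: str, conflict_paths: list[str]
) -> Path:
    """Persist the fields needed to resume a conflicted merge."""
    state = {
        "base_commit": base,
        "ours_commit": ours,
        "theirs_commit": theirs,
        "conflict_paths": sorted(conflict_paths),
    }
    target = root / ".muse" / "MERGE_STATE.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(state, indent=2), encoding="utf-8")
    return target
```

`merge_bases` returns a list because a criss-cross history can have more than one lowest common ancestor; a linear or simply forked history yields exactly one.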

---

## Checkout Algorithm

`muse checkout <target>` uses incremental delta restoration:

1. Read current branch's `StateSnapshot` from the object store
2. Read target `StateSnapshot`
3. Call `plugin.diff(current, target)` → delta
4. **Remove** files in `delta["removed"]` from `muse-work/`
5. **Restore** files in `delta["added"] + delta["modified"]` from the object store
6. Call `plugin.apply(delta, workdir)` — domain-level post-checkout hook

Only files that actually changed are touched. Unchanged files are never re-copied,
making checkout fast even for large repositories.
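
Steps 4 and 5 amount to a small file-level loop. The sketch below assumes that `muse-work/` sits next to `.muse/` under the repository root and that blobs are stored at `.muse/objects/<first 2 hex chars>/<remaining 62>`; the function name `restore_workdir` and its signature are hypothetical.

```python
import shutil
from pathlib import Path


def restore_workdir(
    root: Path,
    delta: dict[str, list[str]],
    target_manifest: dict[str, str],
) -> None:
    """Apply a checkout delta to muse-work/, touching only changed paths."""
    workdir = root / "muse-work"
    objects = root / ".muse" / "objects"

    # Step 4: remove files that do not exist in the target snapshot.
    for rel_path in delta["removed"]:
        (workdir / rel_path).unlink(missing_ok=True)

    # Step 5: copy added and modified files out of the object store.
    for rel_path in delta["added"] + delta["modified"]:
        object_id = target_manifest[rel_path]
        dest = workdir / rel_path
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(objects / object_id[:2] / object_id[2:], dest)
```

Paths absent from all three delta lists are never opened, which is where the cost savings for large working trees come from.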

---

## Commit Data Flow

```
muse commit -m "message"
  │
  ├─ plugin.snapshot(workdir) → StateSnapshot {"files": {path: sha}, "domain": "..."}
  │
  ├─ compute_snapshot_id(manifest) → snapshot_id (sha256 of sorted path:sha pairs)
  │
  ├─ compute_commit_id(parents, snapshot_id, message, timestamp) → commit_id
  │
  ├─ write_object_from_path(root, sha, src) ×N (idempotent)
  │
  ├─ write_snapshot(root, SnapshotRecord) (idempotent)
  │
  ├─ write_commit(root, CommitRecord)
  │
  └─ update refs/heads/<branch> → commit_id
```

Revert and cherry-pick reuse existing snapshot IDs directly — no re-scan needed
since the objects are already content-addressed in the store.

---

## CLI Command Map

### Core VCS (all domains)

| Command | Description |
|---|---|
| `muse init [--domain <name>]` | Initialize a repository |
| `muse commit -m <msg>` | Snapshot live state and record a commit |
| `muse status` | Show drift between HEAD and working tree |
| `muse diff [<base>] [<target>]` | Show delta between commits or vs. working tree |
| `muse log [--oneline] [--graph] [--stat]` | Display commit history |
| `muse show [<ref>] [--json] [--stat]` | Inspect a single commit |
| `muse branch [<name>] [-d <name>]` | Create or delete branches |
| `muse checkout <branch\|commit> [-b]` | Switch branches or restore historical state |
| `muse merge <branch>` | Three-way merge a branch into HEAD |
| `muse cherry-pick <commit>` | Apply a specific commit's delta on top of HEAD |
| `muse revert <commit>` | Create a new commit undoing a prior commit |
| `muse reset <commit> [--hard]` | Move branch pointer (hard: also restore working tree) |
| `muse stash` / `pop` / `list` / `drop` | Temporarily shelve uncommitted changes |
| `muse tag add <tag> [<ref>]` | Tag a commit |
| `muse tag list [<ref>]` | List tags |

### Music-Domain Extras (music plugin only)

| Command | Description |
|---|---|
| `muse commit --section <name> --track <name> --emotion <name>` | Commit with music metadata |
| `muse log --section <s> --track <t> --emotion <e>` | Filter log by music metadata |
| `muse groove-check` | Analyse rhythmic drift across history |
| `muse emotion-diff <a> <b>` | Compare emotion vectors between commits |

---

## Testing

```bash
# Run full test suite
python -m pytest

# Run with coverage report
python -m pytest --cov=muse --cov-report=term-missing

# Run type audit (zero violations enforced in CI)
python tools/typing_audit.py --dirs muse/ tests/ --max-any 0

# Run mypy
mypy muse/
```

Coverage target: ≥ 80% (currently 91%, excluding `config.py`, `midi_parser.py`).

CI runs pytest + mypy + typing_audit on every pull request to `main` and `dev`.

---

## Adding a Second Domain

To add a new domain (e.g. `genomics`):

1. Create `muse/plugins/genomics/plugin.py` implementing `MuseDomainPlugin`
2. Register it in `muse/plugins/registry.py`
3. Run `muse init --domain genomics` in any project directory
4. All existing CLI commands work immediately — no changes needed

The music plugin (`muse/plugins/music/plugin.py`) is the complete reference for what
each method should do. It is 326 lines including full docstrings.
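
Before registering a new plugin, a quick structural check can catch a missing method early. The sketch below uses a stand-in protocol, `DomainPluginShape`, with loosely typed signatures because the real `MuseDomainPlugin` and its types are not importable here; `GenomicsPlugin` and its placeholder bodies are likewise hypothetical. Note that a `runtime_checkable` protocol verifies only that the methods exist, not their signatures or behavior.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class DomainPluginShape(Protocol):
    """Stand-in for MuseDomainPlugin: same five method names, loose types."""

    def snapshot(self, live_state: object) -> object: ...
    def diff(self, base: object, target: object) -> object: ...
    def merge(self, base: object, left: object, right: object) -> object: ...
    def drift(self, committed: object, live: object) -> object: ...
    def apply(self, delta: object, live_state: object) -> object: ...


class GenomicsPlugin:
    """Hypothetical skeleton for step 1; every body is a placeholder."""

    def snapshot(self, live_state: object) -> object:
        raise NotImplementedError("hash the live genomics state here")

    def diff(self, base: object, target: object) -> object:
        raise NotImplementedError

    def merge(self, base: object, left: object, right: object) -> object:
        raise NotImplementedError

    def drift(self, committed: object, live: object) -> object:
        raise NotImplementedError

    def apply(self, delta: object, live_state: object) -> object:
        raise NotImplementedError


def has_plugin_shape(candidate: object) -> bool:
    """True if the candidate exposes all five plugin methods."""
    return isinstance(candidate, DomainPluginShape)
```

Static checking with mypy against the real protocol remains the stronger guarantee; a runtime check like this is only a convenience for tests or for a registry that wants to fail fast.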