# Muse / Variation Specification — End-to-End UX + Technical Contract (Stori)

> **Status:** Implementation Specification (v1)
> **Date:** February 2026
> **Target:** Stori DAW (Swift/SwiftUI) + Maestro/Intent Engine (Python)
> **Goal:** Ship a *demo-grade* implementation inside Stori that proves the "Cursor of DAWs" paradigm: **reviewable, audible, non-destructive AI changes**.

> **Canonical Time Unit:** All Muse and Variation data structures use **beats** as the canonical time unit. Seconds are a derived, playback-only representation. Muse reasons musically, not in wall-clock time.

> **Canonical Backend References:**
> For backend wire contract, state machine, and terminology, these docs are authoritative:
> - [variation_api.md](variation_api.md) — Wire contract, endpoints, SSE events, error codes
> - [terminology.md](terminology.md) — Canonical vocabulary (normative)
> - [muse_vcs.md](../architecture/muse_vcs.md) — Muse VCS architecture (persistent history, checkout, merge, log graph)

---

## What Is Muse?

**Muse** is Stori's change-proposal system for music.

Just as Git is a system for proposing, reviewing, and applying changes to source code, Muse is a system for proposing, reviewing, and applying changes to musical material.

Muse does not edit music directly.

Muse computes **Variations** — structured, reviewable descriptions of how one musical state differs from another — and presents them for human evaluation.

---

### Muse's Role in the System

Muse sits between **intent** and **mutation**.

## 0) Canonical Terms (Do Not Drift)

This vocabulary is **normative**. Use these exact words in code, UI, docs, and agent prompts.

| Software analogy | Stori term | Definition |
|---|---|---|
| Git | **Muse** | The creative intelligence / system that proposes musical ideas |
| Diff | **Variation** | A proposed musical interpretation expressed as a semantic, audible change set |
| Hunk | **Phrase** | An independently reviewable/applicable musical phrase (bars/region slice) |
| Commit | **Accept Variation** | Apply selected phrases to canonical state; creates a single undo boundary |
| Reject | **Discard Variation** | Close the proposal without mutating canonical state |
| Revert | **Undo Variation** | Uses DAW undo/redo; engine-aware and audio-safe |
| Branch (future) | Alternate Interpretation | Parallel musical directions |
| Merge (future) | Blend Variations | Combine harmony from A + rhythm from B + etc. |

> **Key concept:** A diff is read. A Variation is **heard**.
> **Time unit:** Muse reasons in **beats**, not seconds. Time is a playback concern.

---

## 1) When Variations Appear (Execution Mode Policy)

The backend enforces execution mode based on intent classification. The frontend does not choose the mode — it reacts to the `state` SSE event emitted at the start of every compose stream.

### 1.1 Core Rule — COMPOSING Always Produces a Variation

| Intent state | `execution_mode` | Behavior |
|---|---|---|
| **COMPOSING** | `variation` (forced by backend) | All tool calls produce a Variation for human review |
| **EDITING** | `apply` (forced by backend) | Structural ops (add track, set tempo, mute, etc.) apply immediately |
| **REASONING** | n/a | Chat only, no tools |

**Every COMPOSING request produces a Variation** — including purely additive ones (first-time MIDI generation, creating a new song from scratch). This mirrors the "Cursor of DAWs" paradigm: AI-generated musical content always requires human approval before becoming canonical state.

**Examples (Variation Review UI — COMPOSING):**
- "Create a new song in the style of Phish" — additive, but COMPOSING -> Variation
- "Make a chill lo-fi beat at 85 BPM" — additive, COMPOSING -> Variation
- "Make that minor" (transforms pitches) — COMPOSING -> Variation
- "Simplify the melody" (removals/modifications) — COMPOSING -> Variation
- "Change the bassline to be more syncopated" (re-writes notes) — COMPOSING -> Variation

**Examples (direct apply, no Variation — EDITING):**
- "Add a drum track" — structural, EDITING -> apply
- "Set the tempo to 120 BPM" — structural, EDITING -> apply
- "Mute the bass" — structural, EDITING -> apply

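
The policy table in 1.1 is small enough to express as a lookup. The following is a minimal sketch, not the actual `orchestrate()` code: `IntentState` and `resolve_execution_mode` are hypothetical names introduced for illustration, while the state strings and mode values come from this spec.

```python
from enum import Enum
from typing import Optional


class IntentState(str, Enum):
    """Intent classes named in this spec (hypothetical enum)."""
    COMPOSING = "composing"
    EDITING = "editing"
    REASONING = "reasoning"


# Mapping copied from the 1.1 table; REASONING runs no tools, so it has no mode.
_EXECUTION_MODE = {
    IntentState.COMPOSING: "variation",
    IntentState.EDITING: "apply",
    IntentState.REASONING: None,
}


def resolve_execution_mode(state: IntentState) -> Optional[str]:
    """Return the backend-forced execution mode for a classified intent."""
    return _EXECUTION_MODE[state]


print(resolve_execution_mode(IntentState.COMPOSING))  # variation
```

Because the mode is derived only from the classified intent, any `execution_mode` value sent by a client never enters the decision.
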
### 1.2 "Create a new song in the style of ..." (Multi-step Tool Flow)

When the user asks to create a song from scratch, the backend classifies this as COMPOSING and the entire plan (tracks + regions + notes + FX) is proposed as a **single Variation** for review.

**Behavior:**
1. The planner generates a full plan (create tracks -> add regions -> generate MIDI -> add FX).
2. The executor simulates the plan without mutation and computes a Variation with Phrases.
3. The SSE stream emits `meta` -> `phrase*` -> `done` events.
4. The frontend enters **Variation Review Mode** showing the proposed changes.
5. The user reviews, auditions (A/B), and accepts or discards.

This ensures the user always has agency over AI-generated content, even during initial creation. The UX is a single review step at the end of generation — not repeated pop-ups per tool call.

### 1.3 User Trust Overrides
Always show Variation UI when:
- The change is **destructive** (deletes/overwrites notes/regions)
- The target material is **user-edited** (has `userTouched=true`) or "pinned/locked"
- The change is **large-scope** (multi-track rewrite)
- The model's confidence is low OR the engine produced a best-effort fallback

### 1.4 Quick Setting (future)
Add a user preference (later):
- **Muse Review Mode:** `Always` | `Smart (default)` | `Never (power users)`

When implemented, this preference will be stored server-side and consulted in `orchestrate()`. Even in `Never` mode, destructive changes should warn.

---

## 2) System Model

### 2.1 Canonical vs Proposed State
- **Canonical State**: the DAW's real project state (undoable, playable, saved).
- **Proposed State**: an ephemeral, derived state computed by the backend to propose a Variation.

**Important:** The backend does **not** mutate canonical state during proposal.

### 2.2 Variation Lifecycle

1. **Propose**: Muse generates a Variation from intent.
2. **Stream**: Phrases stream to the frontend as soon as they're computed.
3. **Review**: FE enters Variation Review Mode (overlay + A/B audition).
4. **Accept**: FE sends accepted phrase IDs; BE applies them transactionally.
5. **Discard**: FE discards; no mutation.

---

## 3) API Contract (Backend <-> Frontend)

This spec assumes HTTP + **SSE** (server-sent events) for streaming. WebSockets are also acceptable, but SSE is simpler for v1.

### 3.1 Identifiers & Concurrency
All Variation operations must carry:
- `project_id`
- `base_state_id` (project version token, e.g., a monotonically increasing int or a UUID)
- `variation_id`
- Optional `request_id` for idempotency

The backend must reject commits if `base_state_id` does not match the current project state (optimistic concurrency), unless the FE explicitly requests a rebase.

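
The concurrency rule in 3.1 amounts to a single comparison before any mutation. This is a minimal sketch under the assumption that state IDs are compared for equality as opaque strings; `StaleBaseStateError` and `check_base_state` are hypothetical names, not part of the existing backend.

```python
class StaleBaseStateError(Exception):
    """Raised when a commit targets an outdated project version (hypothetical)."""


def check_base_state(current_state_id: str, request_base_state_id: str) -> None:
    """Reject the commit unless it was computed against the current state."""
    if request_base_state_id != current_state_id:
        raise StaleBaseStateError(
            f"base_state_id {request_base_state_id!r} does not match "
            f"current project state {current_state_id!r}"
        )


check_base_state("42", "42")  # passes silently
try:
    check_base_state("43", "42")  # project moved on while the user was reviewing
except StaleBaseStateError as exc:
    print("commit rejected:", exc)
```

Equality is enough for the check itself, so the same code works whether the ID is an integer counter or a UUID.
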
### 3.2 Endpoints

#### (A) Propose Variation
`POST /variation/propose`

**Request**
```json
{
  "project_id": "uuid",
  "base_state_id": "uuid-or-int",
  "intent": "make that minor",
  "scope": {
    "track_ids": ["uuid"],
    "region_ids": ["uuid"],
    "beat_range": [4.0, 8.0]
  },
  "options": {
    "phrase_grouping": "bars",
    "bar_size": 4,
    "stream": true
  },
  "request_id": "uuid"
}
```

**Immediate Response (fast)**
```json
{
  "variation_id": "uuid",
  "project_id": "uuid",
  "base_state_id": "uuid-or-int",
  "intent": "make that minor",
  "ai_explanation": null,
  "stream_url": "/variation/stream?variation_id=uuid"
}
```

#### (B) Stream Variation (phrases)
`GET /variation/stream?variation_id=...` (SSE)

All events are wrapped in a transport-agnostic `EventEnvelope`:
```json
{
  "type": "meta|phrase|done|error|heartbeat",
  "sequence": 1,
  "variation_id": "uuid",
  "project_id": "uuid",
  "base_state_id": "uuid-or-int",
  "timestamp_ms": 1700000000000,
  "payload": { }
}
```
`sequence` is strictly increasing per variation (meta=1, then phrases, then done last).
The event-specific data lives in `payload`; outer fields provide routing and ordering context.

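
The envelope and its sequence rule could be modeled as below. This is a sketch under stated assumptions: the class names echo the `EventEnvelope` and `SequenceCounter` items in the appendix, but the field layout, the `next()` method, and the `to_sse_frame` helper (including its use of the SSE `event:` field) are illustrative guesses rather than the real implementation.

```python
import json
import time
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class EventEnvelope:
    type: str
    sequence: int
    variation_id: str
    project_id: str
    base_state_id: str
    payload: Dict[str, Any]
    timestamp_ms: int = field(default_factory=lambda: int(time.time() * 1000))


class SequenceCounter:
    """Per-variation counter; the first value handed out is 1 (the meta event)."""

    def __init__(self) -> None:
        self._value = 0

    def next(self) -> int:
        self._value += 1
        return self._value


def to_sse_frame(envelope: EventEnvelope) -> str:
    """Serialize one envelope as a single SSE message (assumed framing)."""
    return f"event: {envelope.type}\ndata: {json.dumps(asdict(envelope))}\n\n"


counter = SequenceCounter()
meta = EventEnvelope("meta", counter.next(), "var-1", "proj-1", "7",
                     {"intent": "make that minor"})
print(to_sse_frame(meta), end="")
```
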
**SSE Events**
- `meta` — overall summary + UX copy + counts
- `phrase` — one musical phrase at a time
- `done` — end of stream
- `error` — terminal
- `heartbeat` — keepalive (no payload significance)

> `progress` events are not yet implemented.

**Example: `meta`** (this is the `payload` field inside the `EventEnvelope`; `variation_id`, `project_id`, `base_state_id`, and `sequence` are in the outer envelope)
```json
{
  "intent": "make that minor",
  "ai_explanation": "Lowered scale degrees 3 and 7",
  "affected_tracks": ["uuid"],
  "affected_regions": ["uuid"],
  "note_counts": { "added": 12, "removed": 4, "modified": 8 }
}
```

**Example: `phrase`**
```json
{
  "phrase_id": "uuid",
  "track_id": "uuid",
  "region_id": "uuid",
  "start_beat": 16.0,
  "end_beat": 32.0,
  "label": "Bars 5-8",
  "tags": ["harmonyChange", "scaleChange"],
  "explanation": "Converted major 3rds to minor 3rds",
  "note_changes": [
    {
      "note_id": "uuid",
      "change_type": "modified",
      "before": { "pitch": 64, "start_beat": 0.0, "duration_beats": 0.5, "velocity": 90 },
      "after": { "pitch": 63, "start_beat": 0.0, "duration_beats": 0.5, "velocity": 90 }
    }
  ],
  "controller_changes": [
    { "kind": "cc", "cc": 64, "beat": 0.0, "value": 127 },
    { "kind": "pitch_bend", "beat": 1.5, "value": 4096 },
    { "kind": "aftertouch", "beat": 2.0, "value": 80 }
  ]
}
```

> **Beat semantics:** `phrase.start_beat` / `phrase.end_beat` are **absolute project positions**. Note `start_beat` values inside `note_changes` are **region-relative** (offset from the region's start beat). This matches how DAWs universally store MIDI data within regions.

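
The two coordinate systems, and the derived playback time, can be related with two small helpers. This is a sketch assuming a constant tempo and a region whose absolute start beat is known; both function names are hypothetical.

```python
def absolute_beat(region_start_beat: float, note_start_beat: float) -> float:
    """Convert a region-relative note position to an absolute project beat."""
    return region_start_beat + note_start_beat


def beats_to_seconds(beats: float, tempo_bpm: float) -> float:
    """Derive wall-clock time for playback only, assuming a constant tempo."""
    return beats * 60.0 / tempo_bpm


# A note 0.5 beats into a region that starts at project beat 16.
position = absolute_beat(16.0, 0.5)
print(position)                           # 16.5
print(beats_to_seconds(position, 120.0))  # 8.25
```
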
**Example: `done`**

The `variation_id` is carried in the outer `EventEnvelope` wrapper (not repeated in the payload).
```json
{ "status": "ready", "phrase_count": 3 }
```

#### (C) Commit (Accept Variation)
`POST /variation/commit`

**Request**
```json
{
  "project_id": "uuid",
  "base_state_id": "uuid-or-int",
  "variation_id": "uuid",
  "accepted_phrase_ids": ["uuid", "uuid"],
  "request_id": "uuid"
}
```

**Response**
```json
{
  "project_id": "uuid",
  "new_state_id": "uuid-or-int",
  "applied_phrase_ids": ["uuid", "uuid"],
  "undo_label": "Accept Variation: make that minor",
  "updated_regions": [
    {
      "region_id": "uuid",
      "track_id": "uuid",
      "notes": [
        { "pitch": 60, "start_beat": 0.0, "duration_beats": 1.0, "velocity": 100, "channel": 0 }
      ],
      "cc_events": [
        { "cc": 64, "beat": 0.0, "value": 127 }
      ],
      "pitch_bends": [],
      "aftertouch": []
    }
  ]
}
```

#### (D) Poll Variation Status
`GET /variation/{variation_id}`

Returns the current status and accumulated phrases for a variation. Useful for
reconnect flows and clients that can't maintain a long-lived SSE connection.

**Response**
```json
{
  "variation_id": "uuid",
  "status": "ready",
  "intent": "make that minor",
  "phrases": []
}
```

#### (E) Discard Variation
`POST /variation/discard`

```json
{
  "project_id": "uuid",
  "variation_id": "uuid",
  "request_id": "uuid"
}
```

Returns `{ "ok": true }`.

---

## 4) Variation Data Shapes (Canonical JSON)

### 4.1 Variation (meta)
```json
{
  "variation_id": "uuid",
  "intent": "string",
  "ai_explanation": "string|null",
  "affected_tracks": ["uuid"],
  "affected_regions": ["uuid"],
  "beat_range": [0.0, 16.0],
  "note_counts": { "added": 0, "removed": 0, "modified": 0 }
}
```

### 4.2 Phrase
```json
{
  "phrase_id": "uuid",
  "track_id": "uuid",
  "region_id": "uuid",
  "start_beat": 0.0,
  "end_beat": 16.0,
  "label": "Bars 1-4",
  "tags": [],
  "explanation": "string|null",
  "note_changes": [],
  "controller_changes": []
}
```

### 4.3 NoteChange
```json
{
  "note_id": "uuid",
  "change_type": "added|removed|modified",
  "before": { "pitch": 60, "start_beat": 0.0, "duration_beats": 1.0, "velocity": 90 },
  "after": { "pitch": 60, "start_beat": 0.0, "duration_beats": 1.0, "velocity": 90 }
}
```

Rules:
- `added` -> `before` must be null (enforced by backend)
- `removed` -> `after` must be null (enforced by backend)
- `modified` -> both `before` and `after` must be present
- All positions in **beats** (not seconds)
- `start_beat` within `before`/`after` is **region-relative** (offset from the region's start)

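
The `NoteChange` rules above translate directly into a validator. This is a minimal sketch: `validate_note_change` is a hypothetical helper, and requiring the non-null side for `added` and `removed` is an assumption implied by, but not stated in, the rules.

```python
from typing import Any, Dict


def validate_note_change(change: Dict[str, Any]) -> None:
    """Raise ValueError if a NoteChange violates the rules in 4.3."""
    kind = change.get("change_type")
    before = change.get("before")
    after = change.get("after")

    if kind == "added":
        # Assumption: an added note must also carry its `after` snapshot.
        if before is not None or after is None:
            raise ValueError("added: before must be null and after present")
    elif kind == "removed":
        # Assumption: a removed note must also carry its `before` snapshot.
        if after is not None or before is None:
            raise ValueError("removed: after must be null and before present")
    elif kind == "modified":
        if before is None or after is None:
            raise ValueError("modified: both before and after are required")
    else:
        raise ValueError(f"unknown change_type: {kind!r}")


note = {"pitch": 60, "start_beat": 0.0, "duration_beats": 1.0, "velocity": 90}
validate_note_change({"note_id": "n1", "change_type": "added",
                      "before": None, "after": note})
```

Running the same check on both sides of the wire would catch malformed phrases before they reach the review overlay.
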
### 4.4 Controller Changes (Expressive MIDI)

Phrases carry `controller_changes` — expressive MIDI data beyond notes. The
pipeline supports the **complete** set of musically relevant MIDI messages:

| `kind` | Fields | MIDI message (status byte) | Coverage |
|--------|--------|----------------------------|----------|
| `cc` | `cc`, `beat`, `value` | Control Change (0xBn) | All 128 CC numbers: sustain (64), expression (11), modulation (1), volume (7), pan (10), filter cutoff (74), resonance (71), reverb send (91), chorus send (93), attack (73), release (72), soft pedal (67), sostenuto (66), legato (68), breath (2), etc. |
| `pitch_bend` | `beat`, `value` | Pitch Bend (0xEn) | 14-bit signed (−8192 to 8191) |
| `aftertouch` | `beat`, `value` | Channel Pressure (0xDn) | No `pitch` field → channel-wide pressure |
| `aftertouch` | `beat`, `value`, `pitch` | Poly Key Pressure (0xAn) | `pitch` present → per-note pressure |

Program Change is handled at track level (`stori_set_midi_program`).
Track-level automation curves (volume, pan, FX params) are handled by
`stori_add_automation`.

After commit, the full expressive state is materialized in `updated_regions`
as three separate arrays: `cc_events`, `pitch_bends`, `aftertouch`.

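
Splitting a phrase's `controller_changes` into the three post-commit arrays could look like the sketch below. `materialize_controllers` is a hypothetical helper. The `cc_events` entry shape matches the commit response example in 3.2; the shapes used for `pitch_bends` and `aftertouch` entries (the change minus its `kind` field) are assumptions.

```python
from typing import Any, Dict, List

# Which output array each `kind` lands in.
_BUCKET_FOR_KIND = {
    "cc": "cc_events",
    "pitch_bend": "pitch_bends",
    "aftertouch": "aftertouch",
}


def materialize_controllers(
    controller_changes: List[Dict[str, Any]],
) -> Dict[str, List[Dict[str, Any]]]:
    """Group controller changes by kind and order each group by beat."""
    out: Dict[str, List[Dict[str, Any]]] = {
        "cc_events": [], "pitch_bends": [], "aftertouch": [],
    }
    for change in controller_changes:
        bucket = _BUCKET_FOR_KIND[change["kind"]]  # KeyError on unknown kinds
        # Drop the discriminator; the array name already implies the kind.
        out[bucket].append({k: v for k, v in change.items() if k != "kind"})
    for events in out.values():
        events.sort(key=lambda event: event["beat"])
    return out


result = materialize_controllers([
    {"kind": "pitch_bend", "beat": 1.5, "value": 4096},
    {"kind": "cc", "cc": 64, "beat": 0.0, "value": 127},
    {"kind": "aftertouch", "beat": 2.0, "value": 80, "pitch": 60},
])
print(result["cc_events"])  # [{'cc': 64, 'beat': 0.0, 'value': 127}]
```
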
---

## 5) Backend Implementation Guidance

### 5.1 Execution Mode Policy (Backend-Owned)

The backend determines `execution_mode` based on intent classification. The frontend's `execution_mode` field is deprecated and ignored.

- **COMPOSING** -> `execution_mode="variation"` -> Variation proposal (no mutation)
- **EDITING** -> `execution_mode="apply"` -> Immediate tool call execution
- **REASONING** -> no tools

This is enforced in `orchestrate()` (`app/core/maestro_handlers.py`). The frontend knows which mode is active from the `state` SSE event (`"composing"` / `"editing"` / `"reasoning"`) emitted at the start of every stream.

### 5.2 Proposed State Construction
Avoid copying whole projects:
- Identify affected regions/tracks
- Clone only those regions (notes + essential metadata)
- Apply existing transform functions onto the clones

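
The clone-then-transform approach in 5.2 can be sketched in a few lines. The region dictionaries and the `build_proposed_regions` name are illustrative assumptions; the point is that canonical data is deep-copied before any transform touches it.

```python
import copy
from typing import Any, Callable, Dict, Iterable

Region = Dict[str, Any]


def build_proposed_regions(
    regions_by_id: Dict[str, Region],
    affected_region_ids: Iterable[str],
    transform: Callable[[Region], Region],
) -> Dict[str, Region]:
    """Clone only the affected regions and run the transform on the clones."""
    proposed: Dict[str, Region] = {}
    for region_id in affected_region_ids:
        clone = copy.deepcopy(regions_by_id[region_id])  # canonical stays untouched
        proposed[region_id] = transform(clone)
    return proposed


def lower_major_thirds(region: Region) -> Region:
    """Toy transform: turn E (64) into E-flat (63)."""
    for note in region["notes"]:
        if note["pitch"] == 64:
            note["pitch"] = 63
    return region


canonical = {"r1": {"notes": [{"pitch": 64}]}, "r2": {"notes": [{"pitch": 60}]}}
proposed = build_proposed_regions(canonical, ["r1"], lower_major_thirds)
print(canonical["r1"]["notes"][0]["pitch"], proposed["r1"]["notes"][0]["pitch"])  # 64 63
```

Unaffected regions such as `r2` are never copied, which keeps proposal cost proportional to the size of the change rather than the size of the project.
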
### 5.3 Diffing / Matching Notes
Start simple:
- Match by `(pitch, start)` proximity with a tolerance (e.g., a 1/16 note, which is 0.25 beats when a beat is a quarter note)
- If ambiguous, prefer the same pitch, then the closest start beat
- Emit `modified` rather than a `removed` + `added` pair when a single note clearly moved

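
One way to realize these matching rules is a greedy two-pass pairing: first pair notes that share a pitch, then pair leftovers that start within the tolerance. This is a sketch of that idea rather than the `VariationService` algorithm; `diff_notes`, the helper, and the fixed 0.25-beat tolerance are assumptions.

```python
from typing import Any, Dict, List, Optional, Set

Note = Dict[str, float]
TOLERANCE_BEATS = 0.25  # one sixteenth note when a beat is a quarter note


def _closest(note: Note, candidates: List[Note], used: Set[int],
             same_pitch: bool) -> Optional[int]:
    """Index of the best unused candidate within tolerance, or None."""
    best = None
    for idx, cand in enumerate(candidates):
        if idx in used or (same_pitch and cand["pitch"] != note["pitch"]):
            continue
        start_gap = abs(cand["start_beat"] - note["start_beat"])
        if start_gap > TOLERANCE_BEATS:
            continue
        key = (abs(cand["pitch"] - note["pitch"]), start_gap)
        if best is None or key < best[0]:
            best = (key, idx)
    return None if best is None else best[1]


def diff_notes(before: List[Note], after: List[Note]) -> List[Dict[str, Any]]:
    pairs: Dict[int, int] = {}
    used: Set[int] = set()
    for same_pitch in (True, False):  # pass 1: same pitch; pass 2: any pitch
        for i, note in enumerate(before):
            if i in pairs:
                continue
            j = _closest(note, after, used, same_pitch)
            if j is not None:
                pairs[i] = j
                used.add(j)

    changes: List[Dict[str, Any]] = []
    for i, note in enumerate(before):
        if i not in pairs:
            changes.append({"change_type": "removed", "before": note, "after": None})
        elif note != after[pairs[i]]:
            changes.append({"change_type": "modified", "before": note,
                            "after": after[pairs[i]]})
    for j, note in enumerate(after):
        if j not in used:
            changes.append({"change_type": "added", "before": None, "after": note})
    return changes
```

Greedy pairing is order-dependent and can mis-pair dense chords; that trade-off seems acceptable for a "start simple" first version.
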
### 5.4 Phrase Grouping (MVP)
- Group changes by **bar windows** (e.g., 4 bars per phrase)
- Or by region boundaries if the region already stores bar markers

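
Bar-window grouping can be sketched as integer division on absolute beat positions. The example assumes 4 beats per bar and bar numbering that starts at 1, which matches the `"Bars 5-8"` phrase spanning beats 16 to 32 in section 3.2; `group_by_bar_window` is a hypothetical name.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_bar_window(
    changes: List[Dict[str, Any]],
    region_start_beat: float,
    beats_per_bar: int = 4,
    bars_per_phrase: int = 4,
) -> List[Dict[str, Any]]:
    """Bucket note changes into fixed bar windows on the project timeline."""
    window = beats_per_bar * bars_per_phrase
    groups: Dict[int, List[Dict[str, Any]]] = defaultdict(list)
    for change in changes:
        # Assumption: a change is placed by its proposed position when one exists.
        note = change["after"] or change["before"]
        absolute = region_start_beat + note["start_beat"]
        groups[int(absolute // window)].append(change)

    phrases = []
    for index in sorted(groups):
        first_bar = index * bars_per_phrase + 1
        phrases.append({
            "start_beat": float(index * window),
            "end_beat": float((index + 1) * window),
            "label": f"Bars {first_bar}-{first_bar + bars_per_phrase - 1}",
            "note_changes": groups[index],
        })
    return phrases


change = {"change_type": "added", "before": None,
          "after": {"pitch": 63, "start_beat": 0.5}}
print(group_by_bar_window([change], region_start_beat=16.0)[0]["label"])  # Bars 5-8
```

A real implementation would read the time signature from the project instead of hard-coding 4 beats per bar.
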
### 5.5 Streaming
Compute phrases incrementally and stream each one as soon as it is available:
- `meta` ASAP
- then `phrase` events
- `progress` events are optional (not yet implemented)
- `done` last

Streaming is what makes the UI feel alive and Cursor-like.

---

## 6) Frontend UX Spec (Variation Review Mode)

### 6.1 Entry
Variation Review Mode enters when the compose stream emits a `state` event with `state: "composing"`, followed by `meta` and `phrase` events. The frontend must:
1. Detect `state: "composing"` -> prepare for Variation Review Mode
2. Receive `meta` event -> show banner with intent, explanation, counts
3. Receive `phrase` events -> accumulate phrases for review
4. Receive `done` event -> enable Accept/Discard controls

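
The client-side handling in the list above is essentially a small reducer over envelopes. The real client is Swift; the sketch below uses Python only to show the logic. `ReviewSession` and its fields are hypothetical, and dropping events whose `sequence` is not greater than the last one seen is an assumed way to tolerate replays after a reconnect.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ReviewSession:
    meta: Optional[Dict[str, Any]] = None
    phrases: List[Dict[str, Any]] = field(default_factory=list)
    error: Optional[Dict[str, Any]] = None
    last_sequence: int = 0
    can_accept: bool = False  # Accept/Discard controls enabled after `done`

    def handle(self, envelope: Dict[str, Any]) -> None:
        kind = envelope["type"]
        if kind == "heartbeat":
            return  # keepalive only
        if envelope["sequence"] <= self.last_sequence:
            return  # duplicate or replayed event
        self.last_sequence = envelope["sequence"]

        if kind == "meta":
            self.meta = envelope["payload"]          # banner: intent, counts
        elif kind == "phrase":
            self.phrases.append(envelope["payload"])  # accumulate for review
        elif kind == "done":
            self.can_accept = True
        elif kind == "error":
            self.error = envelope["payload"]


session = ReviewSession()
for event in (
    {"type": "meta", "sequence": 1, "payload": {"intent": "make that minor"}},
    {"type": "phrase", "sequence": 2, "payload": {"phrase_id": "p1"}},
    {"type": "phrase", "sequence": 2, "payload": {"phrase_id": "p1"}},  # replay
    {"type": "done", "sequence": 3, "payload": {"status": "ready", "phrase_count": 1}},
):
    session.handle(event)
print(len(session.phrases), session.can_accept)  # 1 True
```
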
For `state: "editing"`, the frontend applies `toolCall` events directly. The backend also emits `plan` and `planStepUpdate` events to render a step-by-step checklist. See [api.md](../reference/api.md) for the full event reference.

### 6.2 Chrome (always visible while reviewing)
Banner containing:
- Intent text
- AI explanation (optional)
- Counts: +added / -removed / ~modified
- Controls: **A/B**, **Delta Solo**, **Accept**, **Discard**, **Review Phrases**

### 6.3 Visual Language (Piano Roll + Score)
- Added: green
- Removed: red ghost
- Modified: connector + highlighted proposed note
- Unchanged: normal

### 6.4 Audition
Required:
- Play Original (A)
- Play Variation (B)
- Delta Solo (changes only)
- Loop selected phrase

MVP audio strategy:
- Rebuild MIDI regions in-memory for audition modes and switch at a beat boundary.
- If switching causes glitches, pause -> swap -> resume at the same transport position (acceptable for MVP).

### 6.5 Partial Acceptance
In the "Review Phrases" sheet/list:
- Each phrase row shows summary `+ / - / ~`
- Accept / reject per phrase
- "Apply Selected" commits accepted phrase IDs

### 6.6 Exit
- Accept -> applies to project, pushes one undo group, exits review mode
- Discard -> exits review mode without changes

---

## 7) Failure Modes & UX Rules

### 7.1 If streaming fails mid-way
- Keep the phrases already received
- Show a "Retry stream" button
- Allow Discard

### 7.2 If commit fails due to `base_state_id` mismatch
- Offer: "Rebase Variation" (future)
- MVP: show message: "Project changed while reviewing; regenerate variation."

### 7.3 If the user edits while reviewing
MVP rule:
- Block destructive edits to affected regions, or
- Allow edits but invalidate Variation (recommended: invalidate with clear toast)

---

## 8) MVP Cut (What to Ship First)

1. **Variation propose + stream phrases (SSE)**
2. **Piano roll overlay rendering**
3. **A/B audition (pause/swap/resume acceptable)**
4. **Accept all / Discard**
5. **Per-phrase accept (optional but high value)**

Score view diff + controller diffs can come after the demo.

---

## 9) Demo Script (Suggested)

1. Generate a major piano riff.
2. Ask: "Make that minor and more mysterious."
3. Variation Review appears:
   - green/red note overlay
   - A/B toggle + Delta Solo
4. Accept only bars 5-8, discard the rest.
5. Undo to prove it's safe.

---

## 10) Appendix: Implementation Checklist

### Backend

**Core (Implemented & Tested):**
- [x] `POST /variation/propose` returns `variation_id` + `stream_url`
- [x] `POST /variation/commit` accepts `accepted_phrase_ids`
- [x] `POST /variation/discard` returns `{"ok": true}`
- [x] SSE stream emits `meta`, `phrase*`, `done` (via `/maestro/stream`)
- [x] Phrase grouping by bars (4 bars per phrase default)
- [x] Commit applies accepted phrases only, returns `new_state_id`
- [x] No mutation in variation mode
- [x] All data uses beats as canonical unit (not seconds/milliseconds)
- [x] Optimistic concurrency via `base_state_id` checks
- [x] Zero Git terminology — pristine musical language
- [x] `VariationService` computes variations (not "diffs")
- [x] `Phrase` model for independently reviewable changes
- [x] `NoteChange` model for note transformations
- [x] Beat-based fields: `start_beat`, `duration_beats`, `beat_range`

**v1 Infrastructure (State Machine + Envelope + Store):**
- [x] `VariationStatus` enum: CREATED -> STREAMING -> READY -> COMMITTED/DISCARDED/FAILED/EXPIRED
- [x] `assert_transition()` enforces valid state machine transitions
- [x] `EventEnvelope` with type, sequence, variation_id, project_id, base_state_id, payload
- [x] `SequenceCounter` for per-variation monotonic sequence numbers
- [x] `VariationStore` (in-memory) for variation records + phrase storage
- [x] `SSEBroadcaster` with publish, subscribe, replay, late-join support
- [x] Builder helpers: `build_meta_envelope`, `build_phrase_envelope`, `build_done_envelope`, `build_error_envelope`

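
A transition guard for the status enum could be sketched as follows. The enum members and the `assert_transition` name come from the checklist above, but the string values, the exact set of allowed edges, and the `InvalidTransitionError` type are assumptions. The authoritative state machine lives in [variation_api.md](variation_api.md).

```python
from enum import Enum
from typing import Dict, Set


class VariationStatus(str, Enum):
    CREATED = "created"
    STREAMING = "streaming"
    READY = "ready"
    COMMITTED = "committed"
    DISCARDED = "discarded"
    FAILED = "failed"
    EXPIRED = "expired"


class InvalidTransitionError(Exception):
    """Hypothetical error type for rejected transitions."""


S = VariationStatus
# Assumed edges: the happy path plus discard/fail/expire from any live state.
_ALLOWED: Dict[VariationStatus, Set[VariationStatus]] = {
    S.CREATED: {S.STREAMING, S.DISCARDED, S.FAILED, S.EXPIRED},
    S.STREAMING: {S.READY, S.DISCARDED, S.FAILED, S.EXPIRED},
    S.READY: {S.COMMITTED, S.DISCARDED, S.FAILED, S.EXPIRED},
}


def assert_transition(current: VariationStatus, target: VariationStatus) -> None:
    """Raise unless `current -> target` is an allowed edge; terminal states allow none."""
    if target not in _ALLOWED.get(current, set()):
        raise InvalidTransitionError(f"{current.name} -> {target.name} is not allowed")


assert_transition(S.CREATED, S.STREAMING)  # ok
assert_transition(S.READY, S.COMMITTED)    # ok
```
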
**v1 Supercharge (Complete):**
- [x] Wired infrastructure into endpoints (propose/commit/discard)
- [x] `GET /variation/stream` — real SSE with envelopes, replay, heartbeat
- [x] `GET /variation/{variation_id}` — status polling + reconnect
- [x] Note removals implemented in commit engine
- [x] Background generation task (async propose via `asyncio.create_task`)
- [x] Discard cancels in-flight generation
- [x] `stream_router.py` — single publish entry point (WS-ready)
- [x] Commit loads variation from store

**Execution Mode Policy (New):**
- [x] Backend forces `execution_mode="variation"` for all COMPOSING intents
- [x] Backend forces `execution_mode="apply"` for all EDITING intents
- [x] Frontend reacts to `state` SSE event; backend determines mode from intent

### Frontend (Not Yet Started)
- [ ] Detect `state: "composing"` SSE event and enter Variation Review Mode
- [ ] Detect `state: "editing"` SSE event and apply tool calls directly (existing behavior)
- [ ] Parse and accumulate `meta`, `phrase`, `done` events during COMPOSING
- [ ] Variation Review Mode overlay chrome (banner, counts, intent)
- [ ] Render note states (added/removed/modified) in piano roll
- [ ] Phrase list UI with accept/reject per phrase
- [ ] A/B + Delta Solo audition
- [ ] Commit/discard flows with state-id checks
- [ ] Convert beats to audio time for playback only

---

## North-Star Reminder

> **Muse proposes Variations organized as Phrases.**
> **Humans choose the music.**
> **Everything is measured in beats.**

If this sticks, it becomes a new creative primitive for the entire industry.