Advanced Techniques

Parallel workscopes, deep review, in-flight failures, IDE pairing, and prompt processing for power users.

Parallel Workscope Sessions

Since AI assistants are not instant, there is idle time between prompts, and this is time you can put to work. Running multiple workscope sessions simultaneously is a straightforward way to increase throughput without sacrificing quality.

The mechanics rely on a single coordination primitive: the [*] checkbox state. When the Task-Master assigns tasks to a workscope, it marks them [*] in the checkboxlist. Any subsequent Task-Master invocation (whether in the same terminal or a different one) sees those marks and skips past them. Each session gets its own non-overlapping slice of work, and no manual coordination is required.
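Concretely, a checkboxlist partway through two parallel sessions might look like this (the task names and numbering are illustrative, not from a real Action Plan):

```markdown
- [x] 1. Define the new settings schema
- [*] 2. Migrate the settings module        (claimed by session A)
- [*] 3. Update call sites in the importer  (claimed by session A)
- [ ] 4. Write the migration guide          (open; session B starts here)
- [ ] 5. Add regression tests
```

A second Task-Master reading this list skips items 2 and 3 and assigns work from item 4 onward.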

The key practice is to start sessions with a slight offset. Launch the first session, let the Task-Master mark its tasks, and then launch the second. The offset can be as short as a few seconds, just long enough for the first session’s checkboxlist updates to reach disk. If two Task-Masters read the checkboxlist at the exact same moment before either has written its [*] marks, they could select the same tasks. The offset prevents this.

Parallelization works best when the sessions address unrelated concerns. Two sessions implementing different features, or one implementing code while another writes documentation, will rarely conflict. Two sessions modifying the same module risk producing changes that clash at the code level — the QA gauntlet will catch this, but resolving merge-level conflicts manually is overhead that erodes the productivity gain.

Beyond parallel workscopes on a single project, there is multi-project interleaving. If you maintain several WSD-managed projects, you can run sessions across them simultaneously. Each project has its own Action Plan, its own checkboxlists, and its own agent definitions. There is no cross-project coordination needed — just separate terminal windows and enough screen space (or desktop spaces) to monitor progress.

The practical ceiling depends on your ability to handle the post-workscope activities — reviewing action items, committing work, confirming results — before the next session finishes. Two or three parallel sessions is a comfortable range. Beyond that, the supervisory overhead can exceed the throughput gain.

Deep Review

Specifications are the foundation that implementation builds on. A flaw in a specification discovered during Phase 1 costs minutes to fix. The same flaw discovered during Phase 4, after multiple workscopes have built on the flawed assumption, can require reworking days of completed work.[1][2]

The /deep-review command is a karpathy script that you can run to catch those flaws before implementation begins.

Deep review is a multi-agent process designed for large, intersectional specifications: documents that touch multiple systems, reference several external files, and contain enough complexity that a single reviewer would miss things. The architecture works in stages:

  1. A Review-Planner reads the target document, identifies thematic concerns that warrant focused review (naming consistency, constraint completeness, cross-reference accuracy), and produces a review plan assigning each concern to a separate reviewer along with the specific files they need to check against.

  2. Multiple Deep-Reviewers execute in parallel, each focused on a single concern. One reviewer might check whether all error handling paths are specified. Another might verify that the document’s terminology is consistent with the system documentation it references. Each reviewer writes its findings to disk.

  3. A Review-Synthesizer reads all findings, categorizes each issue as either an auto-fix (unambiguous correction) or a judgment call (design decision requiring your input), and produces a unified report with specific edit recommendations.

  4. You review the report. Auto-fixes are applied directly. Judgment calls are presented as numbered items with options and recommendations, and you decide how to resolve each one. After resolution, the specification is tighter, more consistent, and ready for implementation.
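A review plan produced in stage 1 might look something like the following; the format, field names, and file paths here are illustrative rather than WSD's actual schema:

```yaml
target: docs/specs/Settings-Feature-Overview.md
concerns:
  - name: naming-consistency
    reviewer: deep-reviewer-1
    check_against:
      - docs/Glossary.md
  - name: constraint-completeness
    reviewer: deep-reviewer-2
    check_against:
      - docs/System-Constraints.md
  - name: cross-reference-accuracy
    reviewer: deep-reviewer-3
    check_against:
      - docs/specs/Settings-Data-Model.md
```

Each reviewer in stage 2 receives one entry from this plan and nothing else, which is what keeps its attention narrow.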

The multi-pass convergence pattern is how deep review delivers its full value. Run it once, address the findings, then run it again. The second pass catches issues exposed by the first round of fixes, like a renamed parameter that was not updated in one cross-reference, or a constraint that became inconsistent after another was tightened. Run it until the findings become trivial. Two or three passes typically suffice.

Deep review has real cost in time and API usage. Reserve it for specifications that justify the investment: feature overviews with complex checkboxlists, documents that intersect with five or more external files, or any specification that has been through multiple review cycles without converging. Routine tickets and straightforward feature additions do not need it; a careful read and a single review pass are sufficient for simpler documents.

In-Flight Failures (IFFs)

Multi-phase implementation creates a natural phenomenon: work completed in one phase can intentionally break things that a later phase will fix. Phase 1 changes a function signature. Phase 3 updates the callers. Between those phases, the test suite has failures that are expected, documented, and scheduled for resolution, not bugs to chase.

These are In-Flight Failures. They are a planned consequence of phased work, not an error in the current workscope. The distinction matters because it determines what the agent should do when tests fail. A failure introduced by the current workscope must be fixed before the workscope can close. An IFF must not be fixed — it belongs to a later phase, and “fixing” it prematurely could conflict with the planned approach.

To manage IFFs, Work Plan Documents include a dedicated section (## In-Flight Failures (IFF)) where expected failures are documented with the phase that introduced them and the phase that will resolve them. When the Test-Guardian runs the test suite during the QA gauntlet, it reports all failures. The User Agent then categorizes each one:

  • Introduced — caused by the current workscope’s changes. These must be fixed.
  • IFF (Documented) — listed in the ticket’s IFF section. Leave them alone.
  • IFF (New) — caused by earlier phases of the same ticket but not yet documented. Report these to the developer for addition to the IFF section.
  • Pre-existing — existed before the current ticket began, unrelated to the current work. Rare, but worth escalating.
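The categorization above amounts to a small decision ladder. The sketch below is illustrative only: the function and set names are invented, and WSD's actual logic lives in the User Agent's instructions rather than in code.

```python
def categorize_failure(test, documented_iffs, pre_existing, touched_by_workscope):
    """Assign one failing test to an IFF disposition (illustrative sketch)."""
    if test in documented_iffs:
        return "IFF (Documented)"   # listed in the WPD's IFF section: leave alone
    if test in pre_existing:
        return "Pre-existing"       # failed before the ticket began: escalate
    if test in touched_by_workscope:
        return "Introduced"         # caused by the current workscope: must fix
    return "IFF (New)"              # earlier phase of this ticket: report it

# Hypothetical failure report from the Test-Guardian:
failures = ["test_parser", "test_signature", "test_legacy", "test_callers"]
disposition = {
    f: categorize_failure(
        f,
        documented_iffs={"test_signature"},
        pre_existing={"test_legacy"},
        touched_by_workscope={"test_parser"},
    )
    for f in failures
}
print(disposition["test_parser"])   # → Introduced
print(disposition["test_callers"])  # → IFF (New)
```

Note that the documented-IFF check comes first: even if a documented IFF is in a file the workscope touched, it stays untouched until its scheduled phase.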

This classification system prevents two failure modes. Without it, agents either waste time fixing failures that belong to later phases, or they dismiss genuine regressions as someone else’s problem. The IFF framework gives each failure a category and a disposition, keeping the QA process accurate even in the middle of complex, multi-phase work.
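For the signature-change example above, a WPD's IFF section might read as follows (the entry format shown is illustrative):

```markdown
## In-Flight Failures (IFF)

- test_update_callers: broken by Phase 1 (function signature change);
  resolved in Phase 3 when the callers are updated.
- test_export_format: broken by Phase 2 (legacy format removal);
  resolved in Phase 4.
```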

Suspected IFFs are presented after the /wsd:execute phase and listed among the User Action Items. You can verify them and ask the User Agent to add them to the WPD’s IFF section. It is customary to leave the IFF section in place after the WPD is completed, as a record of the temporary breakage endured during implementation.

Health check violations are similar to, but distinct from, IFFs. Some health check failures cannot be helped, such as a security issue that is pending a fix in a third-party dependency. docs/read-only/Health-Check-Exceptions.md is a permanent, write-restricted file where the User can list such expected violations so that they do not trigger escalations, much as the IFF section prevents confusion around known test failures.
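An entry in Health-Check-Exceptions.md might look like the following; the layout is illustrative, and the package name is a placeholder:

```markdown
## Expected Health Check Violations

- dependency-audit: known advisory in third-party package example-lib;
  a patched release is pending upstream. Remove this entry once the
  dependency can be updated.
```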

IDE Pairing

WSD is a terminal-based workflow, but it is designed to work alongside a code editor. Several pieces of WSD infrastructure exist specifically to support this pairing, and understanding them makes the moment-to-moment experience noticeably smoother.

The most important mechanism is the symlink system. WSD maintains automatically updated symlinks like dev/journal/Current-Journal.md and dev/prompts/Current-Prompts.md that always point to the active file for their respective concern. If you pin these in dedicated editor panes, they stay current without manual tab-switching. When a new workscope starts and the journal rotates, the symlink updates and your pane follows. This turns your editor into a passive monitoring surface — you can watch the Work Journal update in real time as the agent works, verify workscope selections as they happen, and keep the current frontier of the Action Plan visible without hunting for the right file.
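The rotation itself is ordinary symlink replacement, equivalent to `ln -sfn`. The sketch below illustrates the mechanism with a hypothetical journal filename; it is not WSD's actual rotation code.

```python
import os
from pathlib import Path

def rotate_symlink(link: Path, new_target: str) -> None:
    """Repoint a Current-* symlink at a new file (equivalent to `ln -sfn`)."""
    tmp = link.parent / (link.name + ".tmp")
    tmp.unlink(missing_ok=True)
    tmp.symlink_to(new_target)  # build the new link beside the old one
    os.replace(tmp, link)       # atomic rename: readers never see a missing link

journal = Path("dev/journal")
journal.mkdir(parents=True, exist_ok=True)
(journal / "2025-01-15-Journal.md").write_text("## Workscope 12\n")  # hypothetical name
rotate_symlink(journal / "Current-Journal.md", "2025-01-15-Journal.md")
print((journal / "Current-Journal.md").read_text())
```

Because the replacement is a single rename, an editor pane pinned to Current-Journal.md always resolves to some journal file, never to a dangling link.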

The prompt scratch pad is where this pairing becomes essential. Rather than composing prompts directly in the terminal, the practice is to write them in Current-Prompts.md inside your editor, then copy the finished prompt into the AI harness. This is not a minor convenience — it is where the majority of your active time goes. The editor gives you revision, restructuring, and the ability to reference surrounding panes (the Action Plan, the journal, the rules) while composing. Prompts written this way tend to be more thorough and more precise than anything typed directly into a chat interface. The Previous-Prompts.md symlink keeps the last prompt page accessible for quick lookup or reuse of recent material. Both symlinks are cycled appropriately when you run the wsd prompt command.

For multi-project work, post-tool hooks close the awareness gap. When you have sessions running across separate desktop spaces or monitors, a hook that fires a system notification or plays a sound on workscope completion means you do not need to poll each project manually. You stay focused on whichever project needs active attention and get pulled to the others only when they are ready for input. The combination of desktop-level separation (distinct spaces, color-coded terminals for visual distinction) and hook-driven notifications makes interleaving across projects practical rather than frantic.
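A completion hook can be as small as a script that logs the event and rings the terminal bell. The sketch below is a generic illustration: the hook contract, the log filename, and how WSD invokes hooks are assumptions, and a real setup would typically call a desktop notifier instead.

```python
import datetime
import sys
from pathlib import Path

LOG = Path("wsd-notify.log")  # hypothetical location

def notify(project: str) -> str:
    """Record a workscope completion and ring the terminal bell."""
    stamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M")
    line = f"{stamp} workscope complete: {project}"
    with LOG.open("a") as f:
        f.write(line + "\n")
    sys.stdout.write("\a")  # audible cue in most terminals
    return line

if __name__ == "__main__":
    notify(sys.argv[1] if len(sys.argv) > 1 else Path.cwd().name)
```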

None of this prescribes a specific layout. IDE setups are personal, and developers tend to lock into arrangements that suit their habits. The point is that WSD’s symlinks, prompt files, and hooks are provided as deliberate infrastructure for editor integration. However you arrange your workspace, these are the attachment points worth knowing about.

Prompt Processing

Your prompt archive is more than a historical record. It contains every decision, discovery, and instruction that shaped your project — a complete, chronological account of Design Mode thinking. The /process-prompts command extracts insights from this archive into persistent, customizable reports.

The concept is straightforward: you define report files in dev/prompts/reports/, each with extraction criteria in its YAML frontmatter that describe what kind of information to look for. When you run /process-prompts, the system reads your prompt archives and integrates relevant findings into each report. The default reports build a project timeline and design evolution record, but you can create additional reports for anything you want to track — feature ideas, research questions, AI interaction best practices, experiment logs.
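A report definition is a markdown file whose frontmatter states what to extract. The field names below are illustrative, not a documented schema:

```markdown
---
name: Feature Ideas
extraction_criteria: >
  Capture any passage where a new feature, enhancement, or "someday"
  idea is proposed, even in passing. Note the date and the prompt page
  it came from.
---

# Feature Ideas

(findings are appended here by /process-prompts)
```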

What makes this valuable is that prompts capture information that no other artifact preserves. Code captures what was built. Specifications capture what was planned. But prompts capture the reasoning, the false starts, the decisions that were considered and rejected, the instructions that shaped agent behavior over dozens of sessions. Processing this archive surfaces patterns and insights that would otherwise be lost to the ephemeral nature of individual sessions.

Creating a custom report is as simple as adding a new file to the reports directory with frontmatter that describes what to extract. The next time you run /process-prompts, the system picks it up automatically. Past examples include reports tracking experiment studies, future feature brainstorms, and lessons learned about effective prompting techniques. These reports can feed downstream processes like release notes generation or retrospective reviews.

As reports grow through repeated processing passes, they can become unwieldy. The /decompose-report command is a karpathy script that splits a monolithic report file into a directory of section files, each independently maintainable. This keeps individual sections focused and prevents any single report from growing past the point where it is useful as context.
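The underlying operation is a split on top-level section headings. The sketch below approximates it; the real script's naming scheme and handling of preamble text are assumptions.

```python
import re
from pathlib import Path

def decompose_report(report_path, out_dir):
    """Split a markdown report into one file per top-level '## ' section."""
    text = Path(report_path).read_text()
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    # Split at every line that starts a '## ' heading, keeping the heading
    # attached to the body that follows it.
    for chunk in re.split(r"(?m)^(?=## )", text):
        if not chunk.startswith("## "):
            continue  # skip any preamble before the first section
        title = chunk.splitlines()[0][3:].strip()
        slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
        (out / f"{slug}.md").write_text(chunk)
        written.append(f"{slug}.md")
    return written

Path("Timeline-Report.md").write_text(
    "# Timeline\n\n## January\n- kickoff\n\n## February\n- first release\n"
)
print(decompose_report("Timeline-Report.md", "Timeline-Report"))
# → ['january.md', 'february.md']
```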

Footnotes

  1. B. W. Boehm and V. R. Basili, “Software Defect Reduction Top 10 List,” IEEE Computer, vol. 34, no. 1, pp. 135-137, Jan. 2001.

  2. S. McConnell, Code Complete, 2nd ed. Redmond, WA: Microsoft Press, 2004, ch. 3.