overnight-multi-issue-implementation

wan-huiyan-overnight-workflows

OtherClaude Codeby wan-huiyan

Summary

Overnight autonomous workflow that takes a cluster of related GitHub issues (typically a P1 review-panel finding set) and ships them to merged stacked PRs by morning. Builds on subagent-driven-development with overnight-specific discipline: stacked PRs (so PR2 doesn't wait on a human PR1-merge mid-night), pre-flight tracker-id audit (concurrent sessions on main steal IDs), final PR-level code review before proposing merge, review findings preserved as PR comments before squash. Sister to overnight-review-client-delivery and overnight-insight-discovery.

Install to Claude Code

/plugin install overnight-multi-issue-implementation@wan-huiyan-overnight-workflows

Run in Claude Code. Add the marketplace first with /plugin marketplace add wan-huiyan/overnight-workflows if you haven't already.

README.md

Overnight Workflows

Sister Claude Code plugins for running autonomous overnight work sessions that land a polished deliverable, an insight brief, or a stack of reviewed PRs on your desk by morning, with multi-agent review panels baked in to catch factual errors before they reach the client.

![license](LICENSE) ![last commit](https://github.com/wan-huiyan/overnight-workflows/commits) ![Claude Code](https://claude.com/claude-code)

Workflow plugins

| Plugin | When to use | |---|---| | overnight-review-client-delivery | You already have a client deliverable (slide deck, report, HTML, memo) that needs polishing + quality-gating before a morning hand-off. Runs Phase A (content work) + Phase B (8-agent review panel in parallel) + Phase C (morning synthesis). | | overnight-insight-discovery | You want to generate a client-facing insight brief from scratch — surfacing funnel leaks and surprise patterns from data. Runs two parallel tracks (B = LLM-autonomous creative exploration + C = hybrid deterministic-with-narration), consolidates, and reviews. | | overnight-multi-issue-implementation | You have a cluster of 6–15 related GitHub issues (typically a P1 review-panel finding set) and want them implemented + reviewed + opened as stacked PRs by morning. Runs Phase A (PR1 tasks via subagent-driven-development) + Phase B (PR2 tasks stacked on PR1) + Phase C (PR-level code review + morning hand-off). Also covers the plan-driven variant (independent PRs from a written plan rather than stacked PRs from issues). |

Companion safety patterns

Cross-cutting skills that strengthen any overnight workflow. Installable independently or as part of the bundle.

| Plugin | When to use | |---|---| | large-redesign-parallel-branch-collision-audit | Pre-flight audit BEFORE starting a multi-PR redesign that rewrites shared files. Catches the failure mode where a long-running parallel feature branch (client-variant, staging, whitelabel) has unmerged commits touching the same files the redesign is about to rewrite — so they end up stranded with head-on conflicts that can't be cleanly cherry-picked. Adjacent to but distinct from the tracker-id audit in overnight-multi-issue-implementation. | | subagent-review-tier-calibration-for-overnight-pr-chains | Calibrate review intensity per-PR (Tier 1 two-stage / Tier 2 combined single-agent / Tier 3 bash-only verification) in long overnight chains (10+ PRs). Specializes superpowers:subagent-driven-development's review step with a decision rubric + concrete bash-verification recipe for low-risk visual-restyle PRs. | | overnight-review-panel-blocked-reviewer-reads-as-clean | Harden the review panel against a silent tool gap: code-review subagents often have no Bash, so when told to gh pr diff/checkout a PR they return a BLOCKED report (or review main, which predates the PR) — and in an unattended run a BLOCKED reviewer reads as a CLEAN one. Fix: pre-generate per-base diffs + materialize worktrees + hand explicit paths, and treat BLOCKED as not-clean. (Overnight specialization of the general code-reviewer-subagent-no-bash-blocked-on-pr-diff.) | | schedule-poll-orchestrator-pattern | Fire-ASAP orchestration for multi-track overnight runs dispatched via scheduled triggers (RemoteTrigger / CronCreate). Replaces a fixed t+Nh consolidation timer with a self-rescheduling poll loop that consolidates the moment all parallel tracks report phase: complete, and lets a scheduled successor survive a 12–20h session end. Distinct from in-session successor-handoff. |

The workflow plugins share the same phase structure, locked-file escape hatch, branch hygiene, and file-first discipline — use them as a set, in pairs, or individually. The companion safety patterns layer on top of any of the workflow plugins (or on standalone subagent-driven-development runs). For the orchestrator-takeover boundary when a track's subagent is blocked waiting on external state (CI / Cloud Build / a gcloud poll), see subagent-external-wait-orchestrator-takeover in agent-traffic-control.

Why use these

Overnight autonomous runs are seductive but brittle. The typical failure modes:

  • Hallucinated conclusions. The model "finds" patterns that are restatements of known features, or narrates trivial tautologies as surprising.
  • Factual errors ship to the client. A single reviewer (you, sleep-deprived in the morning) misses a wrong BSTS CI, a decomposition table with inverted signs, a mislabelled cohort.
  • Stale content dressed as fresh. Author adds an "archive banner" at the top + updates the headline, leaves the body with old numbers — readers can't tell which parts are current.
  • Context-window blowup. A 6-hour autonomous run fills the model's window; the session compresses lossy, then drifts.
  • Parallel session commit-dropping. Two agents on the same branch silently rebase each other's commits into oblivion.

These plugins encode the hard-won patterns that fix each of these — extracted from real overnight runs that caught real P0 errors before they reached real clients.

Core patterns (shared across both plugins)

1. Multi-agent review panel

Neither plugin trusts the author (or the track) to self-review. A panel of 4–8 specialized reviewers runs on the deliverable — data-scientist, data-analyst, scientific-critical-thinker, client-trust-evaluator, compliance-auditor, qa-expert. A Supreme Judge arbitrates. Dependency: agent-review-panel.

But verify every reviewer actually saw what it reviewed. Many review/search subagents (feature-dev:code-reviewer, voltagent-*, Explore) ship without a Bash tool, so a reviewer told to gh pr diff/checkout a PR returns a BLOCKED report — or silently reviews the current checkout (often main, which predates the work) instead. In an unattended overnight run, a BLOCKED reviewer reads as a CLEAN one, and the bug it never looked at ships by morning. Pre-generate per-base diffs to files + materialize PR branches as worktrees + hand each reviewer explicit paths, and in the morning synthesis treat BLOCKED as not-clean (re-dispatch before counting the vote). See overnight-review-panel-blocked-reviewer-reads-as-clean.

> Standing convention — review every non-trivial PR with the panel before merge. Distinct from the deliverable panel above (Phase B audits a doc/deck): before squash-merging any non-trivial code PR, run the roundtable:agent-review-panel skill with all panel agents set to model: opus (the skill's enforced default) instead of (or in addition to) a single code-reviewer agent — multiple independent opus reviewers catch what one reviewer misses, gating client-facing / substantive changes. Fold/triage findings, fix, re-run if needed, THEN squash-merge. Trivial / docs-only PRs may skip the full panel (same non-trivial threshold). This roundtable:-invoked panel is the same agent-review-panel dependency named in §1 + Dependencies. (Origin: barryU propensity project, 2026-06-02.)

2. Locked-file escape hatch

Client-facing files are LOCKED by default. Modifying one requires four conditions: explicit prompt authorization, independent verification (BQ query OR second reviewer confirming), surgical-only edit, and prominent documentation in the morning summary. Without all four, flag as "REQUIRES USER DECISION."

3. File-first successor handoff

The parent orchestrator never loads working data — reads only small status files. Each track writes state to state/status.json, state/planning_board.md, state/findings/*.md. When context pressure rises, parent dispatches a fresh successor subagent that reads state files and continues. Max 3 hops per track.

4. Archive-and-regenerate (not banner-and-partial-update)

When refreshing stale content, never add an archive banner + update headlines in place. Archive the prior version (name_context.html) as a snapshot, then regenerate the active version from the current source of truth. Keeps readers oriented; passes the "would you stake your reputation on this" test.

5. Aggressive cost cap

£0 Cloud Run + £0 Cloud Build + 5 TB BQ read is the recommended envelope. Validated: entire overnight runs complete within this for most client-delivery and insight-discovery sessions. A bq_budget.py wrapper (shipped with overnight-insight-discovery) dry-runs every query, logs to JSONL, aborts on soft-cap hit.

6. Unique branch names per parallel agent

Critical gotcha: parallel Claude sessions on the same repo can silently drop each other's commits via rebase. Use feature/session-NN-claude-A vs feature/session-NN-claude-B. Push immediately after every commit. Treat git reflog as the safety net.

Installation

# Add the marketplace
/plugin marketplace add wan-huiyan/overnight-workflows

# Install one or more plugins
/plugin install overnight-review-client-delivery@wan-huiyan-overnight-workflows
/plugin install overnight-insight-discovery@wan-huiyan-overnight-workflows
/plugin install overnight-multi-issue-implementation@wan-huiyan-overnight-workflows

Or clone directly:

git clone https://github.com/wan-huiyan/overnight-workflows.git
cp -R overnight-workflows/plugins/overnight-insight-discovery ~/.claude/skills/
cp -R overnight-workflows/plugins/overnight-review-client-delivery ~/.claude/skills/
cp -R overnight-workflows/plugins/overnight-multi-issue-implementation ~/.claude/skills/

Dependencies

Both plugins integrate tightly with:

  • agent-review-panel — REQUIRED. 16-phase review protocol with Supreme Judge + HTML dashboard.
  • plan-review-integrator — Applies review findings to plans/briefs with rollback on coherence break.
  • planning-with-files — File-first discipline that makes successor handoff possible.
  • claudeception — Post-run knowledge capture into updated skill versions.

Quick start

Polish a client deliverable overnight (existing doc/deck):

> "Run overnight-review-client-delivery on the Q4 marketing campaign report. The deliverable lives at deliverables/campaign_impact.html. Canonical numbers are in scoping/expected_metrics.md. Locked files: the three HTMLs going to the client tomorrow. Cap: £0 cloud spend, use BQ queries only."

Discover ah-ha insights overnight (generate from data):

> "Run overnight-insight-discovery on Q4 e-commerce data. Target: 2 funnel leaks + 2 surprise patterns for the exec brief on Monday. Fall campaign scope. Cap: 5 TB BQ, 8 hr wall-clock. Client = retail ops team."

Implement an issue cluster overnight (issues → stacked PRs):

> "Run overnight-multi-issue-implementation on issues #437–#442 in barryu_application_propensity. Two-PR shape: hardening (#438–#441) + knowledge-gap (#437, #442). Brainstorm + plan first, then subagent-driven execution, code-review subagent before merge. I'm asleep — wake up to merged PRs and follow-up issues filed."

All three plugins will ask for scoping details (target date, known-knowns table, canonical numbers, panel personas, issue cluster, PR shape) before kicking off. Morning output: a PR with the deliverable, a morning summary flagging anything that needs your attention first, and a review-panel HTML dashboard.

What you get by morning

  • The finished deliverable (Markdown + HTML, client-ready)
  • A morning_summary.md that flags the ONE thing to look at first (P0 fixes to locked files, capped loops, unresolved P1s)
  • A review-panel HTML dashboard with per-round scores and persona-by-persona verdicts
  • A PR to main with a DO NOT MERGE banner (you eyeball first)
  • A workflow_learnings.md capturing concrete recommendations for the next run
  • Full traceability: every commit per phase, every BQ query, every finding that DIDN'T make the brief (and why)

Limitations

  • These plugins are structured for 8 hour overnight windows. Sub-hour sessions are overkill; multi-day projects need further decomposition.
  • They assume a BigQuery-style data warehouse with read access + at least one scratch dataset. Snowflake / Redshift / Postgres will work but the budget wrapper and SHAP compute scripts need adaptation.
  • Neither plugin can execute trades, move money, push to production, or run migrations — all side-effecting actions require explicit human authorization.
  • overnight-insight-discovery requires a pre-populated cohort known-knowns table (~30 cells × top-20 features each). Without it, the novelty gate has nothing to enforce.

How they compose

overnight-insight-discovery        →  generates the brief from scratch
                                   ↓
overnight-review-client-delivery   →  polishes a known-good brief into client-shape

overnight-multi-issue-implementation  →  ships a cluster of issues to stacked PRs
                                       (independent track — engineering, not deliverables)

For new insight work, start with overnight-insight-discovery. For existing deliverables that just need polish + QA, go straight to overnight-review-client-delivery. For an engineering issue cluster (typically a P1 review-panel finding set), use overnight-multi-issue-implementation. The first two chain naturally; the third runs as an independent track.

Related repos

For synchronous end-to-end audit of a live data dashboard (one ~30–45 min round of parallel cluster agents — different shape from the autonomous overnight runs in this repo), see wan-huiyan/dashboard-audit-toolkit. It bundles four sister skills: a parallel-cluster-agents methodology spine, the most common fix-shape produced by the audit, single-metric depth audit, and the GitHub squash-merge gotcha when shipping the fixes.

For a per-instance ML explainability fix-shape — SHAP waterfall charts in production dashboards where a post-hoc calibrator (isotonic regression, Platt scaling) sits between the raw model and the displayed score — see wan-huiyan/shap-waterfall-calibrator-skill. The "rescale to fit the chip" anti-pattern, the negative-scale-guard workaround that trades one bug for another, and the correct fix (apply the calibrator point-by-point to the cumulative-probability path so every bar lives in calibrated space).

Origin

All three plugins encode patterns from real overnight runs. overnight-review-client-delivery was validated on a causal-impact project; overnight-insight-discovery was extracted from a university-admissions propensity project; overnight-multi-issue-implementation was extracted from a 12-task chatbox-hardening + knowledge-gap session on the same admissions propensity project (2026-05-08, issues #437–#442 → 2 stacked PRs merged by morning + 5 follow-ups filed). The patterns are generalized for any project that needs autonomous overnight work with quality gates.

Version history

  • 2026-06-02 — Standing convention added: review every non-trivial PR with the roundtable:agent-review-panel skill (all agents model: opus) before squash-merge; trivial/docs-only PRs may skip the full panel. Reconciled with the per-PR tier rubric (the panel is the heavyweight tier; single-reviewer tiers remain for low-risk PRs). overnight-multi-issue-implementation SKILL → v1.1.1, subagent-review-tier-calibration-for-overnight-pr-chains SKILL → v1.0.1.
  • v1.1.0 (2026-05-08) — Adds overnight-multi-issue-implementation for the engineering-side overnight pattern (issues → stacked PRs). README updated to reflect three plugins; install + compose sections expanded.
  • v1.0.0 (2026-04-17) — Initial release bundling two plugins. overnight-review-client-delivery was previously a standalone skill; this bundle adds the insight-discovery sibling and unifies the shared patterns (locked-file escape hatch, branch hygiene, file-first successor handoff, archive-and-regenerate).

License

MIT — see LICENSE.

Contributing

Patches welcome. The shape of both plugins is still settling — if you run them in production and learn something the skills didn't catch, please open an issue or PR against the relevant SKILL.md or reference doc.

Common contribution targets:

  • Additional panel personas for specific domains (medical, financial, legal)
  • New yield classes for the adaptive tuning loop (insight-discovery only)
  • Platform-specific adaptations of bq_budget.py (Snowflake, Redshift, Databricks)
  • Alternative HTML renderers (beyond markdown2)

---

🤖 Patterns co-developed with Claude Code. All examples in the skills use synthetic data; no client-specific numbers in this repo.

Related plugins

Browse all →

large-redesign-parallel-branch-collision-audit

wan-huiyan-overnight-workflows

Pre-flight audit before starting a large multi-PR redesign that rewrites shared files. Catches the failure mode where a long-running parallel feature branch (e.g. client-variant, staging, whitelabel) has unmerged commits touching the same files the redesign is about to rewrite — so they end up stranded with head-on conflicts that can't be cleanly cherry-picked. Sister to pre-merge-client-variant-regression-audit (audits a variant branch BEING merged into main; this audits BEFORE main diverges from a variant) and parallel-pr-scope-overlap-tiebreaker-delta-check (two simultaneous PRs vs. one redesign on main + WIP elsewhere). Companion to overnight-multi-issue-implementation for the file-collision dimension that the existing tracker-id audit doesn't cover.

Open plugin →

overnight-insight-discovery

wan-huiyan-overnight-workflows

Overnight autonomous B-vs-C parallel insight discovery that surfaces genuinely ah-ha findings from data, with cohort-conditional novelty gate, adaptive tuning, and agent-review-panel loop.

Open plugin →

overnight-review-client-delivery

wan-huiyan-overnight-workflows

Overnight autonomous work session that produces a polished client delivery package by morning, with an 8-agent review panel to catch factual errors before they reach the client.

Open plugin →

overnight-review-panel-blocked-reviewer-reads-as-clean

wan-huiyan-overnight-workflows

Companion safety pattern: in an unattended overnight review panel, a reviewer that couldn't see the code reads as a CLEAN one — usually because the code-review subagent lacks a Bash tool, so when told to `gh pr diff`/checkout a PR it returns a BLOCKED report (or reviews main, which predates the PR). Pre-generate per-base diffs + materialize worktrees + hand explicit paths, and treat BLOCKED as not-clean. Overnight specialization of the general skill code-reviewer-subagent-no-bash-blocked-on-pr-diff.

Open plugin →

schedule-poll-orchestrator-pattern

wan-huiyan-overnight-workflows

Fire-ASAP orchestrator pattern for multi-track autonomous overnight workflows on scheduled triggers — a self-rescheduling poll loop that consolidates the moment all tracks report complete, and survives a 12-20h session end. Companion safety pattern.

Open plugin →

subagent-review-tier-calibration-for-overnight-pr-chains

wan-huiyan-overnight-workflows

Calibrate review intensity per-PR (Tier 1 two-stage / Tier 2 combined single-agent / Tier 3 bash-only verification) when orchestrating a long chain (10+ PRs) of independent tasks overnight from a single implementation plan. Specializes subagent-driven-development's review step for throughput-tuned chains: prescribes three tiers, a decision rubric, and a concrete bash-verification recipe for low-risk visual-restyle PRs. Sister to subagent-driven-development (master pattern; this specializes the review step) and overnight-multi-issue-implementation (different shape: stacked PRs from issues vs. independent PRs from plan). Companion to large-redesign-parallel-branch-collision-audit (pre-flight before STARTING a chain).

Open plugin →