Codex PPT
Overview
This skill creates image-based PowerPoint decks from source material. Each slide is a complete 16:9 generated image. Final images are assembled into .pptx with scripts/assemble_ppt.py.
Use this when the user wants a visually unified presentation and accepts full-slide image pages. Do not use it when every textbox, chart, or shape must remain separately editable.
Prefer the built-in image generation/editing tool. Use scripts/image_gen.py only when the built-in backend is unavailable, lacks a required capability, or the user explicitly asks for API/CLI mode.
Hard Constraints
- Read the relevant Reference Map files before each phase. This file is the orchestration contract; detailed rules live in docs/ and worker prompts in prompts/.
- Respect approval gates. Do not create final deck_spec.json, speech.md, prompt jobs, slide images, or .pptx before the approvals in docs/workflow-gates-and-progress.md.
- After the user approves the sample slide and authorizes full-deck generation, every remaining slide image job must be dispatched to a slide subagent whenever subagents are available.
- The main agent owns orchestration, prompt jobs, state recording, QA, speaker notes, and assembly. Do not silently replace available slide subagents with sequential production.
- Every final origin_image/slide_XX.png must be generated by the selected image backend: built-in image generation/editing tool or scripts/image_gen.py.
- Local drawing, Pillow, SVG, HTML/CSS/canvas screenshots, python-pptx/PptxGenJS layouts, and manual overlays are failure modes, not fallbacks.
- The selected image backend must stay fixed after backend confirmation. Do not let subagents switch backend for convenience.
- After sample approval, record how the approved sample was generated and pass that exact method to every slide subagent.
- Slide dispatch and result state must be recorded with the bundled scripts. Chat messages alone do not make a slide dispatched or complete.
- If a required subagent, image backend, or required-image path is unavailable, stop and report a blocker with the slide id and evidence. Do not create a lower-quality replacement.
Visible Progress
For non-trivial decks, keep a user-visible checklist with one active step. Canonical completion evidence is in docs/workflow-gates-and-progress.md.
Default visible steps:
- Prepare source, outline, style, and backend decisions.
- Generate and approve one sample slide.
- Prepare slide jobs and slide state.
- Dispatch slide subagents.
- Record generated slide results.
- QA, repair, notes, and PPT assembly.
Do not mark a step complete from chat alone; use real files or script-recorded state.
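As a rough illustration of the "one active step" rule, the checklist could be rendered as below. This is only a sketch: the function name `render_checklist`, the `[x]`/`[>]`/`[ ]` markers, and the idea of passing a completed-step count are assumptions for illustration, not part of the skill's scripts. In practice the count should come from real files or script-recorded state, not from chat.

```python
# Default visible steps, copied from the list above.
STEPS = [
    "Prepare source, outline, style, and backend decisions.",
    "Generate and approve one sample slide.",
    "Prepare slide jobs and slide state.",
    "Dispatch slide subagents.",
    "Record generated slide results.",
    "QA, repair, notes, and PPT assembly.",
]


def render_checklist(steps, completed_count):
    """Render steps so that at most one step is marked active.

    completed_count is assumed to be derived from recorded evidence
    (files or script state), never from chat messages alone.
    """
    lines = []
    for index, step in enumerate(steps):
        if index < completed_count:
            mark = "[x]"  # done, backed by evidence
        elif index == completed_count:
            mark = "[>]"  # the single active step
        else:
            mark = "[ ]"  # not started
        lines.append(f"{mark} {step}")
    return "\n".join(lines)


print(render_checklist(STEPS, 2))
```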
Default Workflow
- Understand the source content.
- Identify topic, audience, goal, page count, style/brand constraints, and sections to include or exclude.
- If no page count is specified, choose a practical count. Typical decks are 8-12 slides.
- Plan the deck outline.
- Before writing or updating outline.md, read docs/workflow-gates-and-progress.md and docs/outline-style-and-sample.md.
- Draft slide roles and required source images. Ask for confirmation, then stop before style, backend, sample, or downstream artifacts until approved.
- Confirm a unified visual style.
- Before offering style options or using files from references/, read docs/outline-style-and-sample.md.
- Offer 2-3 concrete style directions, recommend one, wait for confirmation, then keep one visual identity while varying layouts by page role.
- Confirm the image backend.
- Before generating any slide image, read docs/backend-selection.md.
- Check whether a built-in image tool is callable, state what you checked, name the backend, explain fallback status, and wait for confirmation.
- If CLI/API fallback is selected, read docs/cli-api-fallback.md. Read docs/image-model-configuration.md only after config errors or explicit API-setting requests.
- Generate one sample slide for approval.
- Before generating or approving the sample slide, read docs/outline-style-and-sample.md.
- Generate exactly one representative sample after outline, style, and backend are confirmed. Do not generate the full deck until approved.
- After approval, record sample_generation_method in deck_spec.json so jobs and subagents inherit the same path.
- Create the project directory.
- Before initializing folders or assembling files, read docs/project-assembly-and-reporting.md.
- If no destination is specified, use the current working directory or the source file directory.
- Prepare user-supplied assets.
- Before using paper figures, charts, screenshots, logos, or other required assets, read docs/user-supplied-assets.md.
- Treat required assets as strict inputs and confirm slide-to-asset mapping before generation.
- Generate all slide images.
- Before full-deck image generation, read docs/slide-generation-and-subagents.md.
- Create per-slide jobs with scripts/prepare_slide_prompts.py or saved prompts/slide_XX.json files.
- Every final image must come from the selected backend and be recorded with bundled state scripts.
- Dispatch slide subagents.
- Before dispatching or replacing slide workers, read docs/slide-generation-and-subagents.md and prompts/slide-worker.md.
- Use one subagent per remaining slide job whenever possible. If required subagents cannot be spawned, stop and report a blocker unless the user changes the workflow.
- Quality check and repair.
- Before QA or assembly, read docs/project-assembly-and-reporting.md.
- Inspect every slide before assembly: text, outline match, truncation, style, unwanted page numbers, overlaps, and required assets.
- Regenerate severe failures with a tighter prompt. Use backend editing for localized issues when available.
- For CLI/API fallback edit commands, read docs/cli-api-fallback.md. Replace the final slide only after validating the edited output.
- Write speaker notes and assemble the PPT.
- Before writing speech.md or running assembly, read docs/project-assembly-and-reporting.md.
- Make sure outline.md reflects the final confirmed deck outline. Use speech.md headings that map to Slide N.
- Before assembly, ensure slide_jobs.json shows generated slides as recorded and approved samples as accepted. If any slide is pending, dispatched, or blocked, stop.
- Report the result.
- Use the final report checklist in docs/project-assembly-and-reporting.md.
- Include paths, slide count, backend used, recorded-result status, and any limitations or blockers.
- Save reusable styles when requested.
- If asked to save the current deck style or a supplied image/PDF/PPT/PPTX style, read docs/style-library.md.
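The pre-assembly gate in the workflow above (generated slides recorded, approved samples accepted, stop on pending, dispatched, or blocked) could be checked with something like the following sketch. The status names come from this document; the JSON layout (`{"jobs": [{"slide_id": ..., "status": ...}]}`) and the helper names `load_jobs` and `slides_not_ready` are assumptions for illustration, since the real schema is defined by the bundled scripts and docs/slide-generation-and-subagents.md.

```python
import json
from pathlib import Path

# Statuses that allow assembly, as named in the workflow text.
READY_STATUSES = {"recorded", "accepted"}


def load_jobs(path):
    """Load a slide_jobs.json file.

    Assumption: the file is shaped as {"jobs": [...]}; the real layout
    may differ and should be taken from the bundled scripts.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return data["jobs"]


def slides_not_ready(jobs):
    """Return (slide_id, status) pairs for jobs that block assembly."""
    return [
        (job.get("slide_id"), job.get("status"))
        for job in jobs
        if job.get("status") not in READY_STATUSES
    ]


# Hypothetical in-memory example instead of a real file.
example_jobs = [
    {"slide_id": "slide_01", "status": "accepted"},
    {"slide_id": "slide_02", "status": "recorded"},
    {"slide_id": "slide_03", "status": "dispatched"},
]

not_ready = slides_not_ready(example_jobs)
if not_ready:
    print("Stop before assembly:", not_ready)
```

Treating every status outside the ready set as blocking means an unknown or missing status also stops assembly, which matches the intent of not completing a deck on weak evidence.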
Subagent Dispatch
Slide subagents are mandatory after sample approval whenever the runtime can spawn them. The main agent prepares jobs and records state; each worker handles exactly one prompts/slide_XX.json job and returns only selected image path, backend, and QA note.
Use docs/slide-generation-and-subagents.md for dispatch, commands, result recording, blockers, and backend provenance. Use prompts/slide-worker.md as the handoff template.
Subagents must not edit outline.md, deck_spec.json, other slide jobs, origin_image/, speech.md, or the final .pptx. The parent records outputs and assembles.
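To make the worker contract concrete, the parent might validate each returned payload before recording it. The sketch below assumes a dict with the hypothetical keys `image_path`, `backend`, and `qa_note` (the document only says a worker returns the selected image path, backend, and QA note), and a hypothetical backend label `"builtin"`; the actual handoff format is whatever prompts/slide-worker.md specifies.

```python
# Hypothetical field names for the three items a worker may return.
ALLOWED_KEYS = {"image_path", "backend", "qa_note"}


def validate_worker_result(result, confirmed_backend):
    """Return a list of problems; an empty list means the payload is acceptable."""
    problems = []
    extra = set(result) - ALLOWED_KEYS
    missing = ALLOWED_KEYS - set(result)
    if extra:
        problems.append(f"unexpected fields: {sorted(extra)}")
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    # The backend must stay fixed after confirmation; workers may not switch it.
    if "backend" in result and result["backend"] != confirmed_backend:
        problems.append(
            f"backend {result['backend']!r} differs from confirmed {confirmed_backend!r}"
        )
    return problems


good = {"image_path": "origin_image/slide_03.png", "backend": "builtin", "qa_note": "ok"}
bad = {"image_path": "origin_image/slide_04.png", "backend": "cli", "speech": "..."}

print(validate_worker_result(good, "builtin"))
print(validate_worker_result(bad, "builtin"))
```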
Acceptance Criteria
- Output is a valid .pptx.
- Each expected final slide image exists under origin_image/slide_XX.png.
- Every final slide image was generated by the confirmed backend and recorded through record_slide_result.py, except an approved sample marked accepted by run state.
- outline.md reflects the approved deck outline.
- speech.md exists when speaker notes are expected, and assembly writes those notes into the PPT.
- slide_jobs.json and slide_run_state.json reflect the final state.
- Required source images are visibly represented, or a blocker is reported.
- If blocked, the final response identifies phase, slide id, evidence path, and unfinished reason; do not call the deck complete.
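The image-existence criterion above can be checked mechanically. The following is a minimal sketch, assuming that XX in slide_XX.png is a two-digit, 1-based slide number; the function name `missing_slide_images` is hypothetical and is not one of the bundled scripts. It covers only file presence, not backend provenance or recorded state.

```python
import tempfile
from pathlib import Path


def missing_slide_images(project_dir, slide_count):
    """List expected origin_image/slide_XX.png files that do not exist.

    Assumption: XX is a zero-padded, 1-based slide number.
    """
    origin = Path(project_dir) / "origin_image"
    expected = [origin / f"slide_{n:02d}.png" for n in range(1, slide_count + 1)]
    return [path.name for path in expected if not path.is_file()]


# Demonstration in a throwaway directory with only the first slide present.
with tempfile.TemporaryDirectory() as tmp:
    origin_dir = Path(tmp) / "origin_image"
    origin_dir.mkdir()
    (origin_dir / "slide_01.png").write_bytes(b"")
    missing = missing_slide_images(tmp, 3)

print(missing)  # ['slide_02.png', 'slide_03.png']
```

A non-empty result would be reported as a blocker with the slide ids, rather than papered over with a locally drawn replacement.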
Reference Map
- docs/workflow-gates-and-progress.md: approval gates, progress, completion evidence.
- docs/backend-selection.md: backend decision rules and confirmation text.
- docs/outline-style-and-sample.md: outline, style, sample rules, prompt examples.
- docs/user-supplied-assets.md: strict handling for required source assets.
- docs/slide-generation-and-subagents.md: jobs, dispatch, result recording, blockers, provenance.
- docs/cli-api-fallback.md: fallback runtime, generation/edit commands, image limits, troubleshooting.
- docs/image-model-configuration.md: API key, base URL, model, .env; read only when config is needed.
- docs/project-assembly-and-reporting.md: project directory, notes, assembly, final report, prompting principles.
- prompts/slide-worker.md: slide subagent handoff template.
- references/*.md: visual style references.
Documentation and Updates
For source, docs, install, config, and examples, see ningzimu/codex-ppt-skill.

