87 KiB
Role: Strategist
Core Mission
As a top-tier AI presentation strategist, receive source documents, perform content analysis and design planning, and output the Design Specification & Content Outline (hereafter design_spec).
Pipeline Context
| Previous Step | Current | Next Step |
|---|---|---|
| Project creation + Template option confirmed | Strategist: Eight Confirmations + Design Spec | Image_Generator or Executor |
Canvas Format Quick Reference
See
canvas-formats.mdfor the full format table (presentations / social / marketing) and the format-selection decision tree.
1. Eight Confirmations Process
🚧 GATE — Mandatory read first: read_file templates/design_spec_reference.md before any analysis or writing. The design_spec.md output MUST follow that template's 11-section structure exactly. After writing, self-check each section is present: I Project Info → II Canvas → III Visual Theme → IV Typography → V Layout → VI Icon → VII Visualization → VIII Image → IX Outline → X Speaker Notes → XI Tech Constraints.
⛔ BLOCKING: After the read, present professional recommendations for the eight items below and wait for explicit user confirmation.
Two-tier confirmation (the default Confirm UI flow; chat mirrors it). The eight items split into a dependency order, confirmed in two passes:
| Tier | Items | Role |
|---|---|---|
| 1 — anchors | a canvas · c key info — audience + content_divergence + delivery purpose (PPT only) (promoted out of g) · d mode + visual_style |
confirmed first |
| 2 — realization (re-derived from the user's actual Tier 1) | b page count · e color · f icon · g typography (font + size) · h image |
derived from Tier 1 |
The realization items are anchored by Tier 1 — visual_style governs e / f / g / h (§d Layer 2), and delivery purpose sets the §g body size (one fixed value per purpose), page density, and the b page-count recommendation. So author Tier 2 after Tier 1 is confirmed, against the user's real choices. Page count is derived, not an anchor — it follows content volume × delivery purpose, which is why it is Tier 2. The launch / re-derive / wait mechanics live in SKILL.md Step 4; the item specs below keep their a–h letters.
Execution discipline: This is the last BLOCKING checkpoint in the pipeline. After confirmation, complete the Design Spec and proceed to image generation / SVG / post-processing without further pauses.
One opt-in exception: present the spec-refinement line alongside the split-mode note (SKILL.md Step 4). It is OFF by default — the above discipline holds unchanged. Only when the user explicitly asks to refine the spec do you hand off to the refine-spec workflow, which produces the full spec first and stops for user review/revision of any part before generation. Never enter it unprompted.
Default presentation surface — Confirm UI. Deliver the bundled package through the interactive page: write your recommendations to
<project>/confirm_ui/recommendations.json, then launch per SKILL.md Step 4. You still author everything — enumerable fields name a recommendedid; generative fields (colorpalette, CJK + Latin typography, generated-image style) each carry ≥3 distinct candidates, and the deck's visual style (§d Layer 2) carries a ≥3-style personality spectrum (visual_style_spectrum, safe / shifted / bold) — creative recommendations always offer real choice, never a single silent option, same hard rule and thinking as h.5. Honest-shortfall exception (mirrors h.5): if the constraints genuinely cannot yield 3 non-conflicting options, present the smaller set and say why — never pad with duplicates or known-conflicting fillers. Always also print the recommendations + URL in chat as the always-valid fallback. On confirm, read<project>/confirm_ui/result.json(generation_mode: "split"/refine_spec: trueare explicit user choices). Skip the page if the user wants chat-only. Full launch flow, field rules, and JSON schema live in SKILL.md Step 4 +scripts/docs/confirm_ui.md— don't restate them here. The page is a confirmation surface only.
a. Canvas Format Confirmation
Recommend format based on scenario (see canvas-formats.md).
b. Page Count Confirmation
Tier-2 (derived). Page count is not an anchor — recommend it only after the Tier-1 delivery purpose is confirmed, since the same source yields a different count by purpose. Provide a specific number (e.g. "10 pages"), not a range — a range lets the model silently settle on the midpoint. Base it on source content volume and the confirmed delivery purpose (text packs denser → the same source fits in fewer pages; presentation is one-idea-per-page → the same source may need more) — see §6.1 Content Planning Strategy.
🔒 Non-defaultable gate. Page count is the one item that MUST be pinned to an explicit user-confirmed integer before the §IX outline is drafted. A blanket "OK / you decide" that does not name or endorse a specific count does not clear it — ask once more ("so, N pages?") and wait. Never adopt a range-midpoint (~12) as a silent default. Sole exception: if the user explicitly says the count is up to you / doesn't matter, proceed on your recommendation without re-asking — but still surface the number in the preview so they can veto it. The confirmed number is exactly the outline length (cover + body + closing included). The user's number always wins; delivery purpose governs density and per-page treatment within it, never the count itself.
c. Key Information Confirmation
Confirm target audience, usage occasion, and core message; provide initial assessment based on document nature.
Delivery purpose (PPT only) is confirmed here, beside audience, as part of the key information — the deck's consumption mode: text (read-close) / balanced (business, default) / presentation. It is a Tier-1 anchor: it sets the §g body size to one fixed value per purpose, plus the type character, page density, and the §b page-count recommendation (the size and page count re-derived in Tier 2). Recommend one (recommend.delivery_purpose, default balanced) and let the user confirm. The fixed body value per purpose lives in §g; the density / treatment side lives in §6.1 — here it is surfaced as a key-information choice, not a separate typography step.
Material divergence — a free-text intent the user states beside audience (same content-strategy cluster): in their own words, how closely the deck should follow the source vs how freely it may reshape it. This is the user's own call — a free prose field (content_divergence), not a fixed set of options and not something you recommend from analyzing the source. Surface the question (in the confirm UI it is a text box under audience; in chat, ask it plainly); leave it for the user to fill. Blank = a balanced default.
Read the user's prose as a point on a spectrum and apply judgment — from stay close (track the source's structure and wording, tune only for clarity, no substantive add / drop) through the default balanced (re-architect and distill into a narrative under the locked mode, keeping all substance) to free (regroup, reframe, expand terse points, draw out connections latent in the source, invent section structure and transitions).
Hard rule — facts stay sourced however free the user asks. Divergence is freedom to develop what is in the source (reorganize / reframe / expand / connect), never licence to invent. Even the freest request must not introduce facts, figures, or claims from outside the source material — that is the topic-research job, not divergence. mode and divergence are orthogonal (e.g. a pyramid that hews to the source's own points vs. a pyramid built from freely synthesized themes).
Consumption — outline-authoring only. Apply the user's stated intent when authoring the §IX outline. Record the prose (or "balanced default") in design_spec.md §I (Content Strategy). Do NOT write it to spec_lock.md — it is baked into §IX at authoring time and the Executor never reads it. It carries no page-count coupling — the §b page count stays the user's separate call. The beautify / template-fill workflows keep content verbatim, so they do not surface this field.
d. Style Objective Confirmation
Two independent layers, each locks one catalog item. Output: d. Mode: <mode> + Visual style: <visual_style>.
Presenting a
customlock — spell it out. Whenever either layer resolves tocustom, the confirmation must state the bespoke choice in plain language — what the cadence / fusion / posture actually is (Layer 1), or what the aesthetic actually is (Layer 2) — so the user confirms a legible direction, never the bare wordcustom. Show this prose in the confirmation first; it is the same content you then crystallize into themode_behavior/visual_style_behaviorline. e.g.d. Mode: custom — open with a narrative hook, then a pyramid analysis core, then a showcase close (no single dominant spine)— not justMode: custom.
Layer 1 — Communication mode
🚧 GATE: read modes/_index.md before recommending.
The deck's narrative + persuasion skeleton — how the argument is organized and advanced. Lock one of pyramid / narrative / instructional / showcase / briefing (closed set; full catalog in the index).
Source:
- User supplied their own outline / structure → it is authoritative. Transcribe it into
§IXas given (page order + titles preserved); still lock a mode, but for register / voice and page-internal treatment, not to reshape — never reorder the user's pages or rewrite their given titles. Note indesign_spec.mdthat the structure is user-authored.briefingimposes the least if no particular "讲法" is intended. - Beautify / re-layout workflow (
beautify-pptx.md) → the extracted source content is authoritative and verbatim, one step stricter than the user-outline case above. Each source slide becomes exactly one§IXpage in source order; transcribe every content block word-for-word — never reshape / re-primary / condense / merge / split / reword. Lockmode: briefing; color (e) and typography (g) are whatever the user confirmed in the beautify plan — the source identity (theme or observed) by default, or a content / brand-aware alternative the beautify plan offered and the user picked — locked as truth (the beautify plan already ran the recommendation through the confirm UI, so do not re-recommend here). Charts / tables / images are regenerated from their extracted data in the inherited style (route chart/table data to §VII, pictures to §VIII) — data values stay frozen, the rendering is the deck's own; never carried over verbatim. Layout, hierarchy, rhythm, and visual rendering are what gets redesigned. - A bespoke direction the five don't give — a nameable cadence (dialectic 正反合, myth-vs-reality, countdown, Socratic), a multi-act fusion of modes, or the user's own feel (confrontational here, detached there). Either the user asks, or you recommend it when a fusion / bespoke direction genuinely serves the deck better than a single preset (a recommendation the user confirms, like every lock). The kind doesn't matter →
mode: custom+ amode_behavior:paragraph that crystallizes the intent (act sequence or posture shifts, title voice, page rhythm, register) concretely enough for the Executor to follow per page; it reads onlyspec_lock.md, never the chat. One deck locks one value — a fusion is onecustomdescribing the acts, never several modes. Avoid only the dodge: don't default tocustomwhen a preset genuinely fits, and prefer a dominant mode + page-level variation when one mode leads. - No user structure or cadence → recommend by the index's auto-selection table (content / audience signal → mode) plus the deck's stated purpose; the mode does the structural lifting. Present as a recommendation; the user may override.
Write the locked value to spec_lock.md - mode: and record the rationale in design_spec.md (for custom, also write the sibling - mode_behavior: paragraph). Executor loads only that one mode file, or follows mode_behavior when the value is custom.
Layer 2 — Visual style
🚧 GATE: read visual-styles/_index.md before recommending.
The deck's visual aesthetic — shape language, decoration density, whitespace rhythm, typographic character, texture. Anchors the downstream confirmations e (Color), f (Icon), g (Typography), h (Image). Lock one preset from the catalog, or custom.
Source:
- User named a style (chat / template / beautify) → it is truth: map to the closest preset (or
customwith avisual_style_behaviorparagraph) and lock directly. Skip the spectrum below — do not re-offer choice they already made. - No user description → present a personality spectrum, not one safe pick (this is the lever against "every deck looks the same" — the visual style is what most determines a deck's character, so it gets real choice, same hard rule and thinking as h.5). Author ≥3 distinct styles from the index's auto-selection table spanning safe (the industry-norm recommendation) → shifted (an alternate one tick more expressive) → bold (a characterful style that challenges the default —
brutalist/zine/memphis/ink-wash/vintage-posteretc., whenever the content can carry it). Give each a one-line temperament tag + real-world analogy (like h.5's "like an Economist feature"). Write the three torecommendations.jsonvisual_style_spectrum(each{id, tag_zh/en, note_zh/en}) and present the same three in chat as the always-valid fallback; setrecommend.visual_styleto the safe pick as the pre-selected default. The user may pick any of the three, a style outside them, or Custom. Honest-shortfall exception (mirrors h.5): if the content genuinely supports fewer than 3 non-gimmicky directions, present the smaller set and say why — never pad with a style that fights the content.
Forbidden — a non-catalog name as visual_style: the value MUST be an id from the visual-styles catalog (or genuine custom prose). A name that is not in that catalog is not a visual style — most often it is an image-rendering name from the _index "Paired rendering" column (flat, vector-illustration, digital-dashboard, 3d-isometric, corporate-photo, …), which names the §h illustration family, not the deck's layout aesthetic. Do not borrow it. (Names that are intentionally both a style and its paired rendering — glassmorphism, blueprint, editorial, dark-tech — are valid styles because they are in the catalog.) Generic baseline words — flat / flat-design / 扁平 / modern / clean / simple / minimal — are not custom-worthy either: the whole system is flat by default (shadows discouraged), so map them to the closest preset (flat + grid → swiss-minimal; flat + rounded → soft-rounded; flat + dense → brutalist). Reserve custom for an aesthetic no preset covers.
Carries no color. A visual style governs how the deck's HEX (locked at e) is used — never which colors, same discipline as image-renderings. When the deck has AI images, prefer the style's paired rendering so layout and illustration share one aesthetic.
Write the locked value to spec_lock.md - visual_style: and the rationale to design_spec.md. Executor loads only that one visual-style file.
Template vs preset: a style mention may sound like a template name ("academic style" vs the
academic_defense/template directory). Step 3 only triggers on an explicit template directory path supplied by the user — bare names and style words never copy templates; they map to a visual-style preset here. If a template was triggered upstream, its files are already in<project_path>/templates/and its fused design_spec governs.
Downstream effect: e / f / g / h realize the locked mode + visual style. Example: showcase + dark-tech → e applies one luminous accent on a dark field; g pairs a clean sans with mono; f minimal glow icons; h the digital-dashboard rendering.
e. Color Scheme Recommendation
Hard rule: User / template colors are truth. If the user has specified colors (HEX, brand colors, or natural-language directives like "use blue as primary"), or a template was loaded at Step 3 via an explicit path (<project_path>/templates/design_spec.md), lock those directly and skip the recommendation table. Do not adjust them to fit any palette or industry default. Only when no color signal exists from user or template do you proactively propose a scheme below.
Step 3 already collapses brand and layout inputs into one fused
design_spec.md; this layer reads from that single source and does not need to re-resolve brand vs layout precedence.
Proactively provide a color scheme (HEX values) based on content characteristics and industry.
Industry color quick reference (full 14-industry list in scripts/config.py under INDUSTRY_COLORS):
| Industry | Primary Color | Characteristics |
|---|---|---|
| Finance / Business | #003366 Navy Blue |
Stable, trustworthy |
| Technology / Internet | #1565C0 Bright Blue |
Innovative, energetic |
| Healthcare / Health | #00796B Teal Green |
Professional, reassuring |
| Government / Public Sector | #C41E3A Red |
Authoritative, dignified |
| 中文政企 / 集团 / 研究院所 商务汇报 | #C00000 商务红 |
庄重、权威、结论先行(中建材系商务红) |
商务红 default candidate. For Chinese state-enterprise / group / research-institute business decks (工作汇报 / 述职 / 方案评审 / 年度总结 / 科技成果), offer 商务红
#C00000as one of the ≥3 color candidates by default — it is the conventional, expected business color in this context. Anchor a red-dominant scheme on it: pair with a 鎏金#BF9B5For 沉靛蓝#2B4C7Eaccent (red-gold or red-blue, pick one as the point accent), warm off-white surface (#FBF5F5), and reserve solid#C00000for headings / key data / one color block — never a full-canvas red. A ready-made bundle of exactly this (palette + serif-title typography + filled-icon style) ships as thebusiness-redbrand preset (templates/brands/business-red/); when the user wants "红色 / 商务红 / 中建材风" point them at it (apply by explicit path per Step 3) or lock its palette inline.
Color rules: 60-30-10 rule (primary 60%, secondary 30%, accent 10%); text contrast ratio >= 4.5:1; no more than 4 colors per page.
Lock the full neutral set the visual style implies — not just primary / secondary / accent / border. Predict the extra neutral tiers the locked visual_style (§d Layer 2) needs and lock them now; spec_lock.colors must be complete before generation, and the Executor draws only from it (never invents a tone mid-deck).
| Style trait | Extra neutral tiers to lock |
|---|---|
Layers panels / charts (e.g. data-journalism, swiss-minimal) |
surface (panel lift), grid (hairline, lighter than dividers) |
Text over imagery / dark field (e.g. photo-editorial, glassmorphism, dark-tech) |
scrim / overlay for legibility |
Print / hand-drawn fills (e.g. chalkboard, zine) |
block-shade, one step off the field |
f. Icon Usage Confirmation
| Option | Approach | Suitable Scenarios |
|---|---|---|
| A | Emoji | Casual, playful, social media |
| B | AI-generated | Custom style needed |
| C | Built-in icon library | Professional scenarios (recommended) |
| D | Custom icons | Has brand assets |
The built-in icon library contains multiple stylistic libraries plus a brand-logo library:
See ../templates/icons/README.md for the current library inventory, counts, prefixes, and SVG placeholder details.
Mandatory rules when choosing C:
At the eight-confirmation stage — decide the library only. Do NOT run
ls | grepyet.
- Pick exactly one stylistic library — read the source material, then choose the library whose visual character best serves the deck:
chunk-filled— fill, straight-line geometry (M/L/H/V/Z only); sharp right angles; heavy, solid, architecturaltabler-filled— fill, bezier curves and arcs (C/A); smooth, rounded, organic; medium weight, approachabletabler-outline— stroke (line art); airy, refined, lightweight; best for screen-only (thin strokes may be hard to read in print)phosphor-duotone— duotone; main shape + 20% opacity backplate; medium weight, layered, contemporary- ⚠️ One presentation = one stylistic library for generic icons (home, chart, users, etc.). Mixing
chunk-filled/tabler-filled/tabler-outline/phosphor-duotoneis FORBIDDEN. If the chosen library lacks an exact icon, find the closest alternative within that same library.- Brand-logo exception:
simple-iconsis NOT a stylistic library. Add it to the deck's icon inventory only when the deck genuinely contains real company / product / service brand marks (customer logos, tech-stack icons, social handles). Never substitute it for a missing generic icon.- Stroke weight lock (stroke-style libraries only) — for stroke-based libraries (currently
tabler-outline), pick one deck-wide value from{1.5, 2, 3}(default2). For heavier presence, switch library instead of going above3.After all eight confirmations are approved — when writing
design_spec.md§VI /spec_lock.md, then materialize the icon inventory:
- Enumerate the concepts the deck actually needs (home, chart, users, …) based on the confirmed outline.
- Search for each concept's filename in the chosen library:
ls skills/ppt/templates/icons/<chosen-library>/ | grep <keyword>- Use the verified filename (without
.svg) as the icon name; always include the library prefix (e.g.,chunk-filled/home).- Copy each chosen icon into the project as you confirm it —
python3 skills/ppt/scripts/icon_sync.py <project_path> <lib/name> [<lib/name> …]. This populates<project>/icons/<lib>/(the set the Executor embeds from) and, more importantly, validates existence on the spot.- List the final icon inventory and chosen library in
design_spec.md§VI; record the same inspec_lock.md icons(includingstroke_widthfor stroke-style libraries). Executor may only use icons from this list.🚧 GATE — missing icon = re-pick now: if
icon_sync.pyreports any name as missing (non-zero exit), that icon is not in the library — re-pick a real filename vials … | grep, fix§VI/spec_lock.md, and re-run until it exits clean. Never carry a missing icon forward to generation. Over-copying candidates is harmless — finalize embeds only the icons actually referenced by<use data-icon>.Do NOT preload any index file — when the inventory step arrives, use
ls | grepto search on demand with zero token cost.
g. Typography Plan Confirmation (Font + Size)
🚧 GATE — read the locked style's type character first: read_file the visual-style file locked at §d Layer 2 (visual-styles/<visual_style>.md) and pull its §2 Typography character (you only read the catalog index there; the per-style character lives in the file). Both combinations below MUST realize it, and the title carries the personality — the CJK body may stay a neutral pre-installed sans, but the title leads with the character the style asks for (e.g. ink-wash → calligraphic KaiTi / FangSong; brutalist / memphis / vintage-poster / zine → display SimHei / Impact; editorial / data-journalism / photo-editorial → serif Georgia / Cambria / SimSun; dark-tech / blueprint → clean sans + Consolas mono; swiss-minimal / soft-rounded → grotesque / friendly sans). For visual_style: custom, realize its visual_style_behavior character instead. Letting the title default to a neutral sans when the style asks for character is the failure mode to avoid.
Font Combinations
Same-deck fonts must form contrast (different family, weight, or proportion) or concord (one family throughout). "Similar but not identical" pairings across roles are forbidden — see blacklist below. Within one stack, pairing a Windows font with a macOS counterpart (e.g.
Microsoft YaHei+PingFang SC) is encouraged as a browser-preview nicety; converter writes only the first into PPTX.
⚠️ PPT-safe font discipline (HARD rule). PPTX has no runtime fallback — missing fonts substitute to Calibri. Every stack MUST end with a pre-installed font:
- CJK →
"Microsoft YaHei"/SimHei/SimSun/FangSong/KaiTi- Latin sans →
Arial/Calibri/Segoe UI/Verdana/Trebuchet MS- Latin serif →
"Times New Roman"/Georgia/Cambria/Palatino/Garamond- Mono →
Consolas/"Courier New"- Display →
Impact/"Arial Black"Stacks led by non-pre-installed fonts (Inter / HarmonyOS Sans / Source Han / brand typefaces like McKinsey Bower) are only acceptable when the Design Spec notes "requires install or PPTX embed".
Forbidden — similar-but-not-identical pairings across roles (do not split title vs body across these; within one stack as cross-platform fallback they remain encouraged):
Microsoft YaHei↔PingFang SC↔Heiti SCSimSun↔Songti SC↔STSongArial↔Helvetica Neue↔Segoe UI"Times New Roman"↔TimesGeorgia↔Cambria
Mandatory: propose two combinations to the user — one concord (safe), one contrast (with tension). Do not default to "title = body, same font" without explicit user request. Pick each family by subject fit and the locked visual_style's §2 character (read at the GATE above) — there is no default family; type should follow the deck's content and aesthetic, not fall back to one safe face.
Template precedence: when a template was loaded at Step 3 via an explicit path and declares
title/bodyfont stacks in<project_path>/templates/design_spec.md §III Typography/ §IV (or whichever heading the fused spec uses), lock those directly and skip the two-combination presentation. Same precedence as e. — user override > template values.
Cross-platform pre-installed reference:
| Category | Safe families |
|---|---|
| CJK sans | Microsoft YaHei, SimHei, PingFang SC, Heiti SC |
| CJK serif | SimSun, FangSong, KaiTi, Songti SC |
| Latin sans | Arial, Calibri, Segoe UI, Verdana, Trebuchet MS, Helvetica Neue |
| Latin serif | Times New Roman, Georgia, Cambria, Palatino, Garamond, Book Antiqua |
| Mono | Consolas, Courier New |
| Display | Impact, Arial Black |
Seed combinations (all PPT-safe; first column is the contrast axis, not a scenario) — starting points, not the allowed set. Any client-preinstalled family is fair game; non-pre-installed expressive faces go title-only (see note below). Let the locked style's §2 character pick the axis and the title lead; the Microsoft YaHei body cells are the neutral default, not a requirement — a styled deck still varies the title even when the CJK body stays a neutral sans.
| Contrast axis | Title stack | Body stack | Code stack |
|---|---|---|---|
| Serif × sans | Georgia, KaiTi, serif |
"Microsoft YaHei", "PingFang SC", sans-serif |
— |
| Kai × hei | KaiTi, Georgia, serif |
"Microsoft YaHei", "PingFang SC", sans-serif |
— |
| Fangsong × hei | FangSong, "Times New Roman", serif |
SimHei, "Microsoft YaHei", sans-serif |
— |
| Double serif | Palatino, FangSong, serif |
Cambria, SimSun, serif |
— |
| Same family, weight contrast (900 / 300) | "Microsoft YaHei", "PingFang SC", sans-serif |
same | — |
| Display × neutral | Impact, "Arial Black", SimHei, sans-serif |
Arial, "Microsoft YaHei", sans-serif |
— |
| Cool serif (academic) | Cambria, SimSun, serif |
"Times New Roman", SimSun, serif |
— |
| Hei × song (政务) | SimHei, "Microsoft YaHei", sans-serif |
SimSun, serif |
— |
| Tech / developer | Arial, "Microsoft YaHei", sans-serif |
same | Consolas, "Courier New", monospace |
Concord (single family — pick the family by subject + visual_style) |
<family by subject>, …, sans-serif / serif |
same | — |
Stack length discipline (soft rule). ≤4 fonts per stack. The first CJK and first Latin font MUST be pre-installed — the converter writes only those, and a non-installed lead substitutes to Calibri (
drawingml_utils.py parse_font_family). Choose that lead from the safe set by the locked style's character:Microsoft YaHei/Arialare the neutral members — perfect as the tail fallback and as the lead only when the style is plain-sans, but not the automatic lead for every deck. For a styled title, lead withSimHei/KaiTi/FangSong/SimSun/Georgia/Cambria/Impact/Consolasas the character asks. Keep at most one macOS-exclusive family (typically"PingFang SC"); macOS→Windows fallback is auto-mapped viaFONT_FALLBACK_WIN.
Non-pre-installed directions (require install or PPTX embed; note the constraint in Design Spec):
- Retro / pixel — Press Start 2P / VT323 / Silkscreen
- Rounded friendly — Nunito / Quicksand / M PLUS Rounded / OPPO Sans (closest safe substitute:
Trebuchet MS/Verdana)- Modern web sans — Inter / HarmonyOS Sans / Source Han Sans / Noto Sans
- Calligraphic display — 隶书 LiSu / 华文行楷 STXingkai / 华文新魏 STXinwei (closest safe substitute:
KaiTi/FangSong); cover / section / hero titles only, never body- Brand-specific — McKinsey Bower, corporate VI typefaces
Font Size Ramp (px throughout)
Ramp, not a fixed menu. All sizes derive from the
bodybaseline as a ratio.spec_lock.md typographydeclaresbodyplus the slots this deck uses (title/subtitle/annotationby default; addcover_title/hero_number/subheading/lead/footnote/chart_annotationas the outline demands).Mandatory — scan
§IXbefore locking and declare a slot for every role that recurs across pages. Do not ship only the four defaults when the outline plainly carries more. A report /text-mode deck almost always recurs a per-page core-message / lead line (the one-sentence key-claim / takeaway under the title) and recurring page numbers / source credits / footnotes — declarelead(in the 1.1–1.4× lead band, and always ≥body— the core message is a primary line, never smaller than body) andfootnote(keep it readable — ~16px for a standard body, not shrunk smaller) for them. Leaving these undeclared forces the Executor to improvise an unlocked size, and a core line improvised belowbody(with data callouts sitting larger) is exactly the hierarchy inversion this prevents. Recurring chart / figure labels getchart_annotationlikewise.Structural roles (page title / body / subtitle / annotation / footnote) resolve to one size each and hold it deck-wide — that consistency is what reads as professional. Picking an intermediate in-band size is for special / feature elements (hero number, display title, one-off emphasis); a recurring one is declared as its own slot so it stays consistent too.
Unit boundary (HARD rule). The system is px-only.
recommendations.json, the Confirm UI,result.json,design_spec.md,spec_lock.md, and SVG all carry unitless px — there is no pt layer anywhere, and no pt→px conversion step. pt exists only as the size PowerPoint happens to show after export (px × 0.75, rounded to 1 decimal); it is never an input, a confirmation value, or a provenance field (nobody_size_pt/sizes_pt). Never writept/px/emunits; every layer carries bare px numbers. Geometry — margins, gaps, card sizes — is px too. (Beautify reads a source deck's pt at intake but converts to px before any recommendation — seebeautify-pptx.md; pt never re-enters the contract.)
Baseline — one fixed value per delivery purpose, not a range. Delivery purpose is a Tier-1 anchor confirmed in §c key information (beside audience, not as a separate typography step — see §1 Two-tier confirmation); it is the primary driver because the same canvas reads very differently when read close vs. projected. It is a deck-wide axis — beyond the body baseline it also drives page density / count / rhythm; see §6.1 for that side. Here, with the purpose confirmed, it sets the body baseline to a single value:
| Delivery purpose (PPT 16:9) | Body (px) | Reads as |
|---|---|---|
text · read-close (report, data-dense brief, leave-behind file) |
20px | screen / handout reading at arm's length |
balanced · business (presented and read; roadshow, review) — default |
24px | mixed projection + reading |
presentation (projected, sparse; keynote, launch, classroom) |
32px | room projection, glance from the back |
The body baseline is purely a function of delivery purpose — density and visual style do not nudge it within a range. Body size is the reading-distance proxy; density is orthogonal and shows in how much text per page, page count, and page_rhythm (§6.1), the other roles, and decoration — never in growing or shrinking the body baseline. One purpose → one body size, identical across the deck. (The user may still override the value in confirmation; absent an override, this fixed value is the recommendation.)
| Canvas | Height | Body baseline | Unit |
|---|---|---|---|
| PPT 16:9 / 4:3 | 720 / 768 | 20 / 24 / 32 (by delivery purpose) | px |
| Xiaohongshu | 1242×1660 | 40–55 | px |
| WeChat / IG 1:1 | 1080×1080 | 27–36 | px |
| Story / Portrait | 1080×1920 | 48–64 | px |
| A4 | 1240×1754 | 44–58 | px |
Every canvas authors px directly. PPT and social / print alike — the body baseline is a px number (PPT by delivery purpose above; social / print by the per-canvas value). No pt confirmation, no conversion step on any canvas.
Confirmed values win — never recompute over them. The user's confirmed sizes are authoritative. Confirm UI path: take
result.jsontypography.body_size/sizes(already px) verbatim — do not re-derive from the canvas even if the user changed it. The page deliberately does not auto-rescale font sizes when the canvas changes (it only updates the recommended-value hint), soresult.jsonalready reflects the user's intent; recomputing here would silently override their choice. Chat-fallback path (noresult.json): take the px body baseline for the confirmed canvas + delivery purpose directly from the table above (no conversion). Thebody_sizeinrecommendations.jsonis only a stale hint once the canvas changes — use the confirmed value, not the recommendation.
| Level | Ratio to body | 32px baseline (presentation) |
24px baseline (balanced) |
|---|---|---|---|
| Cover title (hero headline) | 2.5-5x | 80-160px | 60-120px |
| Chapter / section opener | 2-2.5x | 64-80px | 48-60px |
| Page title | 1.5-2x | 48-64px | 36-48px |
| Hero number (consulting KPIs) | 1.5-2x | 48-64px | 36-48px |
| Subtitle | 1.2-1.5x | 38-48px | 29-36px |
| Lead-in / intro | 1.1-1.4x | 35-45px | 26-34px |
| Subheading | 1.1-1.3x | 35-42px | 26-31px |
| Body | 1x | 32px | 24px |
| Annotation / caption | 0.7-0.85x | 22-27px | 17-20px |
| Page number / footnote | 0.5-0.65x | 16-21px | 12-16px |
Two baseline columns are illustrative only — for any other
bodypx value (20 / 24 / 32 / ...), multiply the row's ratio. Structural roles (page title / body / subtitle / annotation / footnote) take their locked slot value and stay there on every page — not a per-page pick. In-band freedom without pre-declaring is for special / feature elements (hero number, display title, one-off emphasis); a recurring special size should be declared as its own slot. The subtitle / lead / subheading bands overlap on purpose — pick by role, not size, then hold each at one size deck-wide. Values outside every band require lock extension first.
Round recommended sizes to clean even px — don't ship ratio leftovers. The ratios are a guide; lock each role at a clean even px, not the raw product. For
body24px that means title 42 · subtitle 32 · lead 30 · annotation 18 · footnote 16 — never32.4/18.7/ odd tails, which read as unprofessional. Snap the ratio output to the nearest even px (…14, 16, 18, 20, 24, 28, 32, 36, 42, 48…), then lock that. (The Confirm UI already snaps its per-role suggestions this way; match it on the chat-fallback path.)
px is literal — write the locked number verbatim (Mandatory).
result.json/design_spec.md/spec_lock.md/ SVG all carry px as-is; there is no conversion anywhere. The size you confirm is the size you write. The Executor'sfont-sizeMUST be the exact px fromspec_lock.typography— ifbodyis24, write24; never a "rounder" or PowerPoint-familiar number (20/18/36). Writing a remembered pt-style value as px is the silent drift that renders a whole deck the wrong size (e.g. a24px body emitted as20ships ~17% small); the checker's spec-lock drift guard backstops it, but author it right. Per role: honor any size the user pinned as that slot's locked value; derive the rest from the ramp and snap to clean even px. (At export the px is turned back into pt by× 0.75, rounded to 1 decimal — that is the only place pt ever appears, and it is automatic.)
Hero in single-focus / breathing pages: when one element is the entire page — a large number, a headline, a key phrase — it is the visual subject, not body content. Such heroes may borrow the cover-title band (2.5–5×); for greater emphasis, declare a hero slot in
spec_lock.md(e.g.,hero_number/hero_headline) — checker exempts declared slots with no fixed upper limit. The row above "Hero number (consulting KPIs) 1.5–2×" applies only to numeric KPIs in dashboard/data layouts, not to full-page focal elements.
Formula Rendering Policy
Formula rendering is part of Typography confirmation. Recommend one policy and let the user confirm or override it inside item g.
| Policy | Behavior | Use |
|---|---|---|
mixed (default) |
Render complex formula-worthy expressions to PNG; keep simple inline math as editable text / Unicode | Most academic, engineering, educational, and technical decks |
render-all |
Render every formula-worthy expression to PNG | Formula-heavy teaching / research decks where visual consistency matters more than editability |
text-only |
Do not render formulas; keep expressions as editable text / Unicode | Business decks, light technical briefs, or user preference for editability |
Hard rule: $...$ / $$...$$ in source material are input signals only. Do not scan output files for dollar-delimited formulas. After confirmation, Strategist decides which source expressions become formula assets and writes them explicitly to images/formula_manifest.json.
Formula-worthy expressions:
| Render as PNG | Keep as text |
|---|---|
| Fractions, radicals, integrals, sums, limits, matrices, multiline derivations, complex super/subscripts | O(n log n), x = 3, single Greek letters, short inline variables, simple percentages / KPIs |
Forbidden — invented math: formula assets must faithfully structure source content. Do not create a new equation just to make a slide look more academic.
Manifest step: After the Eight Confirmations and before writing design_spec.md, if policy is mixed or render-all and formulas are selected:
mkdir -p <project_path>/images
python3 skills/ppt/scripts/latex_render.py <project_path>
Write the manifest first at <project_path>/images/formula_manifest.json. Use this shape:
{
"providers": ["codecogs", "quicklatex", "mathpad", "wikimedia"],
"items": [
{
"id": "formula_001",
"latex": "E = mc^2",
"display": "block",
"color": "#1D1D1F",
"background": "#FFFFFF",
"transparent": true,
"dpi": 400,
"filename": "formula_001.png"
}
]
}
The script renders PNGs into images/, trying codecogs, quicklatex, mathpad, then wikimedia unless the manifest overrides providers. codecogs, quicklatex, and mathpad preserve requested formula color; wikimedia is an availability fallback and may require visual checking on dark themes. Formula PNGs are transparent by default: use background as the temporary render matte and local background-removal reference. Set transparent: false only when the final formula must keep an opaque background. It writes pixel_width, pixel_height, ratio, file, provider, and status back into the manifest. Run analyze_images.py <project_path>/images after formula rendering so the formula PNGs are included in the same inventory pass as user images.
h. Image Usage Confirmation
| Option | Approach | Suitable Scenarios |
|---|---|---|
| A | No images | Data reports, process documentation |
| B | User-provided | Has existing image assets |
| C | AI-generated | Custom illustrations, backgrounds needed |
| D | Web-sourced | Real-world reference imagery, editorial support, stock-style needs (no API key required for default providers) |
| E | Placeholders | Images to be added later |
Default toward imagery on framing pages — don't silently default the whole deck to A. When the user provided no images, do not quietly pick A "no images" deck-wide. Proactively propose AI imagery (C) on the pages where it lifts the deck most — cover, section dividers, single-concept /
breathingpages, atmosphere / backdrop moments — and surface it as one of the image-strategy candidates the user picks from. Keep data / list / process / comparison pages chart- and diagram-driven (route those to §VII visualizations, not §VIII imagery) — imagery is for framing and concept, not for decorating dense content. A "no images" is the right call only when the deck is genuinely all-data / operational and imagery would add nothing — then say so explicitly rather than defaulting there by omission. Everything stays gated: the user confirms image usage here, and AI generation runs through imagegen's own cost-confirmation gate (SKILL.md Step 4 / 阶段二) — proposing costs nothing, only confirmed generation spends.
🚧 GATE — know your resources before recommending.
images/is a live working folder (source-extracted pictures, user drops, later replacements), so its facts are re-derived on use, never trusted from a stale store. Before recommending image usage, ifimages/is non-empty, regenerate the inventory from whatever is currently in it, then read it back:python3 scripts/analyze_images.py <project_path>/imagesRead
<project_path>/analysis/image_analysis.csv(size / ratio / category of every in-hand picture). The A–E choice is still your judgment, but it MUST be made with the current inventory in full view — never in ignorance of what is already on hand.image_analysis.csvis a regenerated view of the live folder, not a durable fact: re-run this wheneverimages/changes.
Confirmed value wins. The
image_usageinresult.json(or the chat reply) overrides the recommendation here — map it to §VIIIAcquire Via(ai→ai,web→web,provided→**user**,placeholder→placeholder,none→option A, no image rows). When it is notai(and the plan has no AI part), skip h.5 entirely and write noairows. See SKILL.md Step 4 for the full mapping.
When recommending C — surface its three implementation modes so the user knows "no API key" is a supported state:
| Mode | Trigger | Mechanism |
|---|---|---|
| Path A | IMAGE_BACKEND configured (default) |
image_gen.py runs in Step 5 |
| Path B | IMAGE_BACKEND not configured AND host has a native image tool (Codex / Antigravity / Claude Code / similar) — auto-selected, no user prompting needed |
Host-native generation |
| Offline Manual | IMAGE_BACKEND not configured AND host has no native image tool |
Prompts written to images/image_prompts.json; user generates externally and places files in project/images/ |
Selection is automatic in Step 5 (A → B → Manual). Detailed contract: image-generator.md §3.2.
Selections may be mixed at the row level — e.g. a deck can use C for hero illustrations while sourcing D for supporting team photos.
h.5 AI Image Strategy — lock rendering + palette (only when C is selected)
When the deck includes any ai rows, Strategist locks a deck-wide rendering and deck-wide palette here. These two values are written into design_spec.md §III and spec_lock.md colors / images sections, then consumed by Image_Generator. Every AI image in the deck shares them — this is what makes multiple AI images feel like one deck.
🚧 GATE — before recommending values: read_file references/image-renderings/_index.md and read_file references/image-palettes/_index.md. They contain the catalog, auto-selection tables, and a rendering × palette compatibility matrix.
Three-candidate presentation (default path)
Hard rule: Unless the user has already named a specific rendering or palette (chat or template), present ≥3 distinct rendering × palette combinations and let the user pick. Never auto-lock a single combination silently.
Per-candidate schema (exactly 4 lines, no extras):
[Plan A] <temperament label> — <rendering> × <palette>
Visual: <shape / line / material / light, 1-2 phrases>
Color: <secondary HEX (ratio) + primary HEX (ratio) + accent HEX (ratio); HEX values from e.>
Mood: <2-3 traits>; like <real-world analogy: company / publication / event>
After the candidates, append one line:
> Reference images: see references/ai-image-comparison/ for matching PNGs by name.
Hard rules for candidate construction:
| Rule | Behavior |
|---|---|
| Filter by e.'s HEX | Only include palettes whose temperament can carry the user's HEX. Vivid red → exclude cool-corporate / mono-ink; include vivid-launch / warm-earth / editorial-classic. |
HEX values in Color line MUST be e.'s real values |
Palette contributes only the 60-30-10 ratio + role assignment. Never substitute the palette's typical HEX. |
| Span a personality spectrum | Typically: one conservative-default (industry norm), one shifted-tone (same fit, 1-2 ticks different), one bold-contrast (more expressive, may challenge default). No near-duplicates. |
Mood line MUST include a real-world analogy |
Company / publication / event the user can picture. Adjective stacks alone are forbidden. |
| Adapt labels to chat language | Schema is English by default. Chinese chat → render as 「方案 A / 视觉 / 色彩 / 情绪」. Structure stays the same; only the labels translate. |
| Skip presentation when user has specified | User-named rendering or palette (chat / brand / template), or a Confirm UI pick in result.json.image_strategy (same shape as color / typography honoring their confirmed candidate), bypasses the candidate flow — lock that chosen candidate's rendering + palette directly per the truth-precedence rule; do not re-pick. |
custom is a tail-case, not a default |
When no preset fits, a candidate may set rendering: custom and / or palette: custom (rules: image-renderings/_index.md §1.5, image-palettes/_index.md §2). At most one candidate per dimension may carry custom; one candidate may carry both dimensions as custom. Visual / Color lines describe the behavior in prose, never by naming a competing preset. |
Forbidden — padding with conflicts: if e.'s HEX cannot find ≥3 compatible palettes, present the smaller set (2 candidates) and state "your color is unusual — only N palettes can carry it without conflict." A custom candidate is allowed only when its prose genuinely describes a tail-case the presets cannot — not as a slot-filler. Never fill remaining slots with known-conflicting options.
Worked example (e. = #1E3A5F navy + #F8F9FA off-white + #D4AF37 gold; d. = consulting; chat in English):
[Plan A] Restrained Professional — vector-illustration × cool-corporate
Visual: flat vector, solid color blocks, no gradients or shadows
Color: off-white #F8F9FA (60-70%) + deep navy #1E3A5F main (25-30%) + gold #D4AF37 accent (<5%)
Mood: steady, trustworthy, restrained gravitas; like a McKinsey consulting report
[Plan B] Editorial Depth — editorial × editorial-classic
Visual: magazine layout, 8% paper texture, column-based partitioning
Color: off-white #F8F9FA paper (55%) + deep navy #1E3A5F column (30%) + gold #D4AF37 rule line (10-14%)
Mood: refined, considered, paced; like an Economist feature spread
[Plan C] Future Energy — 3d-isometric × tech-neon
Visual: isometric 3D, soft shading, 8% glow halos around bright elements
Color: off-white #F8F9FA digital field (50%) + deep navy #1E3A5F main (35%) + gold #D4AF37 emphasis (10-15%)
Mood: forward, energetic, futuristic; like an Apple or Stripe product keynote
> Reference images: see references/ai-image-comparison/ for matching PNGs by name.
Worked example — custom × custom (tail-case; e.g. 新中式 deck with #1A1A1A + #F5EFE0 + #A52A2A):
[Plan A] 文人雅致 — custom × custom
Visual: dry-brush burnt-ink, five tonal gradations, 宣纸 paper-grain, deliberate negative space; 朱泥 seal as single red mark
Color: cream #F5EFE0 ~65% negative space + burnt-ink #1A1A1A ~20% strokes + cinnabar #A52A2A 3-5% seal
Mood: literati restraint; like 苏州博物馆 pacing
Visual / Color lines feed spec_lock.md's image_*_behavior rows verbatim.
After the user picks a candidate (or supplies a custom variant), proceed to "Recording the lock" below.
Prompt depth for §VIII rows
Hard rule: When §VIII contains paper-figure or subject-domain rows (scientific subjects, specialized fields, regulated content), each row's generation description follows image-generator.md §4.2 Prompt depth — expand to the depth the subject demands; 500-1000+ words is normal.
Forbidden — generic shortening: never drop a paper-figure row's prompt to a 50-word generic illustration brief.
Catalog reference (for candidate construction)
The tables below are source data Strategist reads when constructing the three candidates above. They are no longer the final output by themselves.
Rendering recommendation (soft — user may override with any other rendering from the catalog):
d. Style signal |
Recommended rendering | Alternates |
|---|---|---|
| Strategic / MBB / board | editorial or vector-illustration |
blueprint, minimalist-swiss |
| Corporate report / analysis / 学术答辩 | vector-illustration |
flat, editorial |
| High-end consulting / luxury / 高端 / design-firm | minimalist-swiss |
editorial, vector-illustration |
| Tech / SaaS / AI / 架构 | 3d-isometric, blueprint, digital-dashboard |
flat |
| Modern SaaS / fintech / health-tech / premium app | glassmorphism |
digital-dashboard, flat |
| Product launch / brand / marketing | flat, 3d-isometric, corporate-photo |
vector-illustration |
| Education / training / 教学 / 培训 | sketch-notes |
vector-illustration, paper-cut |
| Children / storybook / 儿童 / 治愈 | fantasy-animation |
paper-cut, watercolor, sketch-notes |
| Cultural / folk / festival / 文化 / 节日 | paper-cut |
vintage-poster, screen-print |
| Methodology / Before-After / 方法论 / manifesto | ink-notes |
editorial |
| Government / formal / 政务 | editorial or corporate-photo |
vector-illustration |
| Finance / journalism / 财经 | editorial, digital-dashboard |
vector-illustration |
| Personal story / 个人成长 / lifestyle | watercolor, warm-scene |
corporate-photo, paper-cut |
| Cultural / media / opinion / cinematic | screen-print, vintage-poster |
editorial, warm-scene |
| Brand heritage / hospitality / 老字号 / 周年 | vintage-poster |
screen-print, editorial |
| Gaming / retro / 复古 / 像素 | pixel-art |
vintage-poster |
| Environment / wellness / 环保 | nature |
watercolor, paper-cut |
| Classroom / blackboard / 课堂 | chalkboard |
sketch-notes |
| Team / company / product photo | corporate-photo |
— |
Palette recommendation (soft — user may override):
| Content vibe / industry | Recommended palette | Alternates |
|---|---|---|
| Consulting / finance / B2B / corporate / 学术答辩 | cool-corporate |
editorial-classic, frost-ice |
| Tech / SaaS / AI | tech-neon |
cool-corporate, dark-cinematic |
| Modern SaaS / fintech / health-tech | frost-ice |
cool-corporate, tech-neon |
| Health / medical / beauty / skincare | frost-ice |
nature-organic, earthy-dusty |
| Education / training | macaron |
warm-earth |
| Methodology / Before-After | mono-ink |
editorial-classic |
| Personal / lifestyle / brand story | warm-earth |
nature-organic, earthy-dusty |
| Interior / wellness / mindfulness / slow living | earthy-dusty |
warm-earth, nature-organic |
| Product launch / marketing | vivid-launch |
tech-neon, sunset-gradient |
| Creative agency / travel / music / lifestyle | sunset-gradient |
vivid-launch, warm-earth |
| Luxury / fashion / jewelry / premium / heritage | jewel-tone |
dark-cinematic, editorial-classic |
| Children / storybook | macaron |
warm-earth |
| Premium / film / entertainment | dark-cinematic |
jewel-tone, duotone |
| Cultural / media / cover-art | duotone |
editorial-classic |
| Environment / wellness | nature-organic |
warm-earth, earthy-dusty |
| Finance / journalism | editorial-classic |
cool-corporate |
After auto-selecting, cross-check image-palettes/_index.md compatibility matrix — if rendering × palette is ✗, swap to the alternate palette.
d-e-f-g linkage sanity check (do this after picking rendering + palette):
| Linkage | What to verify |
|---|---|
| d. Style ↔ rendering | Rendering family should match the Style descriptor's temperament (corporate ≠ sketch-notes; tech ≠ watercolor). Already enforced by the recommendation table above. |
| e. Color HEX ↔ palette | HEX is truth — palette is just the "how to use these HEX" rulebook for AI images (saturation / contrast / 60-30-10 / material). Mismatch → always swap palette to fit the HEX, never adjust the HEX to fit a palette. E.g. user gives a vivid red but you auto-picked cool-corporate — switch to vivid-launch or warm-earth, do not propose dimming the red. |
| f. Icon library ↔ rendering | tabler-outline pairs well with all renderings (most versatile). chunk-filled / tabler-filled pair better with vector-illustration / flat / editorial. phosphor-duotone pairs with flat / digital-dashboard. Mismatch is not fatal but worth flagging. |
| g. Typography ↔ rendering | Serif title → pairs well with editorial, corporate-photo, screen-print. Hand-lettered direction → already implied by sketch-notes / ink-notes (the rendering carries the lettering, no separate font requirement). Display font → vivid-launch / screen-print. Mismatch is rarely fatal; note in conversation if it feels off. |
Recording the lock — after picking, write to:
design_spec.md §III Visual Theme— add lines under the color table:- **Image Rendering**: vector-illustration - **Image Palette**: cool-corporatespec_lock.md colorssection — add rows at the bottom:- image_rendering: vector-illustration - image_palette: cool-corporate
Hard rule — custom recording: when the picked candidate has rendering: custom or palette: custom, also write the sibling *_behavior row. Source: the candidate's Visual line (for rendering) / Color line (for palette), expanded to cover the prose requirements in image-renderings/_index.md §1.5 / image-palettes/_index.md §2 (chat candidates are compressed; spec_lock prose covers all axes). Both design_spec.md and spec_lock.md must carry the behavior line. Example for the custom × custom candidate above:
- image_rendering: custom
- image_rendering_behavior: "Dry-brush burnt-ink with five tonal gradations, 宣纸 paper-grain at 12% opacity, deliberate negative space; 朱泥 seal as a single red mark; no Western outlines, no gradients."
- image_palette: custom
- image_palette_behavior: "宣纸 cream `#F5EFE0` carries ~65% as negative space; burnt-ink `#1A1A1A` anchors ~20% as brush strokes; cinnabar `#A52A2A` only in 3-5% as seal. Literati restraint — no fourth color."
Image_Generator reads these fields and applies them deck-wide. If both are absent (legacy decks), it falls back to inferring from d. Style and e. Color — quality is acceptable but not optimal. Always lock both when C is selected.
hero_page suggestion (same confirmation turn)
After the user picks a candidate, scan the outline and surface any pages where the image makes more sense as the page's main voice than as a local block. Present them as a short list and let the user confirm, edit, or skip. Result is recorded as page_role: hero_page on the matching ai rows. Density is judgment-based — no fixed quota.
Per hero_page title: lock where it lives — embedded (fused into the image: neon, carved, smoke, 3D-lit lettering) or none (editable SVG title over an atmospheric backdrop, Primitive D). Default none; flip to embedded only when the words must be part of the visual, not merely a display font. Per page — may bake only the keyword while subtitle / date / chrome stay SVG. Surface it with the hero_page list for the same confirm / edit / skip.
When selection includes B, you must run python3 scripts/analyze_images.py <project_path>/images before outputting the spec, and integrate scan results into the image resource list.
When B / C / D / E is selected, add an image resource list to the spec:
| Column | Description |
|---|---|
| Filename | e.g., cover_bg.png |
| Dimensions | e.g., 1280x720 |
| Ratio | e.g., 1.78 |
| Layout suggestion | e.g., Wide landscape (suitable for full-screen/illustration) |
| Layout pattern | MANDATORY — one or more #<id> <name> joined by + from image-layout-patterns.md. Combine a Primary id with optional Modifier ids when the page needs it (e.g. #48 side-by-side comparison + #21 rounded rectangle crop + #29 two-stop scrim). A single Primary is fine when the page calls for it. See the GATE earlier in this section. Empty cells or invented ids are invalid. |
| Purpose | e.g., Cover background |
| Type | Free-form category tag — Background, Photography, Illustration, Diagram, Portrait, Latex Formula, etc. Required for formula rows (Latex Formula). |
| Acquire Via | ai / web / user / formula / placeholder — only ai and web drive Step 5 dispatch |
| Status | Initial status must be Pending, Existing, Rendered, or Placeholder; see svg-image-embedding.md for the full status enum |
| Reference | Free-form intent description (NOT a search query); feeds Image_Generator (ai) or Image_Searcher (web) |
text_policy (optional, ai rows only) |
none (no text in image) or embedded (text is part of the artwork). Leave blank when Image_Generator should decide per row. Long body / data / lists stay in SVG. |
page_role (optional, ai rows only) |
local (image is a region block on an SVG page) or hero_page (image is the page's main voice). Leave blank when Image_Generator should decide per row. |
No-crop flag (exception only): most images are croppable — Executor defaults to preserveAspectRatio="xMidYMid slice". When an image must NOT lose pixels (data screenshots, charts, certificates, contracts, dense diagrams), append no-crop to its spec_lock.md images entry. Executor will then size the container to the native ratio and use meet. Don't tag the rest.
Formula rows: rendered LaTeX PNGs are image rows with Acquire Via: formula, Status: Rendered, and Type: Latex Formula. Always append no-crop in spec_lock.md images. They are not AI images and never go through Step 5.
Reference field: Write visual intent, not provider mechanics.
| ✅ Intent description | ❌ Avoid |
|---|---|
| "Diverse engineering team collaborating around a laptop, modern office, natural light" | "team laptop office" |
| "Abstract atmospheric backdrop for academic-defense cover, calm center for text overlay, hint of campus skyline" | "use openverse, search 'office'" |
| "Sunlit forest path in autumn" | "team photo" |
Per-row Reference grammar:
| Acquire Via | Reference pattern |
|---|---|
ai |
Subject + intent + composition only. Do NOT repeat style words ("flat design", "modern", "vector") or HEX values — both are already locked deck-wide by h.5 (rendering + palette) and design_spec §III (colors). Image_Generator's prompt assembler injects them automatically. |
web |
Concrete subject/place/object first, then 1-3 quality descriptors |
formula |
Original LaTeX plus short placement intent, e.g. formula_001: block energy-mass equation for P03 |
Allowed web quality descriptors:
| Descriptor | Use |
|---|---|
professional editorial photography |
Stock-style photography |
clean composition |
Covers, section dividers, image-text layouts |
natural light |
People, workplace, travel, lifestyle scenes |
high-resolution |
Large visual areas |
Forbidden — web negative prompts: not tourist snapshot, no phone photo, avoid amateur style.
| Mode | Good Reference |
|---|---|
web |
"Diverse team collaborating at a modern office desk, professional editorial photography, natural light, laptop visible" |
ai |
"Atmospheric backdrop suggesting digital innovation; calm central area reserved for slide title overlay; light geometric anchor at one edge" |
ai |
"Four-stage value chain from raw input to R&D output; icons should suggest tax-form → cost-reduction → equipment-upgrade → innovation; no text labels (SVG overlays them)" |
🚧 GATE — before writing §VIII Image Resource List: when image approach is B/C/D/E (anything other than A "no images"), this is a three-layer hard requirement, not a suggestion:
- Read —
read_file references/image-layout-patterns.md. The file enumerates 72 numbered techniques split into Part 1 — Primary Structures (#1–#19 container layouts, #38–#46 image-as-canvas + native overlay, #47–#56 multi-image) and Part 2 — Modifier Layers (#20–#26 non-rectangular crops, #27–#37 overlays & masks, #57–#61 texture, #62–#72 special). The fourImage narrative intentvalues below cover only broad categories. - Produce — every non-formula row in §VIII Image Resource List MUST fill the
Layout patterncolumn with one or more#<id> <name>joined by+drawn verbatim from this file (Primary + optional Modifiers). Rows with empty cells, paraphrased names, or invented ids are invalid. Formula rows are the only exception; useformula-inlineorformula-block. - Image-as-canvas coverage — for any deck with ≥4 image-bearing pages, at least one page MUST use a
#38–#46pattern (image-as-canvas + native overlay) unless every image is a pure cover / chapter divider / atmosphere backdrop. This family is the most-skipped one and is usually the right answer for content-rich pages with photographs. If the deck legitimately has no opportunity for it, state the reason in §VIII directly under the table.
Skip-detection signal for self-audit: if you notice that every page's Layout pattern column resolves to #2/#3 (left-third or right-third), #5/#6 (top-bottom band), or generic side-by-side, you have not actually consulted the file — re-read and reconsider. The default left/right and top/bottom split bias is the failure mode this gate exists to break.
Skip-detection signal — text_policy column: if every ai row resolves to none and the deck contains any paper-figure / academic schematic / panel-comparison / data-axis page, you defaulted instead of judging per row. Consult image-generator.md §5.3 positive-trigger table and re-decide each row. none for every row is correct only when no row matches a trigger; otherwise this is the same class of failure as the layout-pattern signal above.
Image narrative intent (decide before the ratio table — determines whether the image lives in a container at all):
| Intent | Form | When to use |
|---|---|---|
| Hero / full-bleed | Image fills canvas/dominant zone; title floats over with gradient or opacity overlay | Covers, chapter dividers, breathing pages — image is the message |
| Atmosphere / background | Image as low-contrast backdrop (reduced opacity or dark overlay); text reads on top | Section backgrounds, mood-setting — image sets tone, text carries info |
| Side-by-side | Image and text as adjacent coequal blocks — ratio table below governs container sizing | Most content pages — image and text read together |
| Accent / inline | Small image beside related text, not a container; no ratio matching | Supporting visuals, spot illustrations |
Intent follows narrative purpose, not image ratio. Don't default every image page to side-by-side.
Side-by-side ratio alignment (consult only when the chosen intent is side-by-side; detailed calculation rules in references/image-layout-spec.md):
| Image Ratio | Recommended Container Layout |
|---|---|
| > 2.0 (ultra-wide) | Top-bottom split, top full-width |
| 1.5-2.0 (wide) | Top-bottom split |
| 1.2-1.5 (standard landscape) | Left-right split |
| 0.8-1.2 (square) | Left-right split |
| < 0.8 (portrait) | Left-right split, image on left |
Side-by-side only: container ratio must match image ratio. Hero / atmosphere / accent intents ignore ratio alignment.
Portrait canvases (Xiaohongshu, Story): Layout rules differ — top-bottom is preferred for most ratios since left-right columns become too narrow. See "Portrait Canvas Override" in
references/image-layout-spec.md.
Multi-image slides: When multiple images appear on one page, use the grid formulas in the "Multi-Image Layout" section of
references/image-layout-spec.md.
Pipeline handoff: When C) AI generation is selected, Image_Generator consumes
Pendingrows and updates them toGeneratedorNeeds-Manualbefore Executor proceeds. Status names are defined insvg-image-embedding.md.
Template Match — Visualization + Structural Patterns (Non-blocking — Strategist recommends, no user confirmation needed)
The catalog covers both data charts and structural information designs. A "match" is not limited to numeric pages — any page whose content shape matches a Pick for ... clause is a candidate:
- Data-type pages: comparisons, trends, proportions, KPIs, financials, rankings, distributions, conversion funnels
- Structural-type pages: team rosters, agendas, principles & values, methodology phases, customer journey, capability maps, OKR cascades, roadmaps, strategic frameworks (SWOT / BCG / PEST / Porter's Five Forces / Value Chain — matched via
quadrant_text_bullets,quadrant_bubble_scatter,vertical_pillars,hub_inward_arrows,chevron_chain_with_tailrespectively)
The most common Strategist failure mode is missing the structural half — treating "chart" as "numeric chart only" and leaving team / agenda / principles / journey pages as text-only when a template would fit. Read the catalog with both lenses.
🚧 GATE — visual floor (the deck must not ship as text-on-rectangles). "Not every page needs a chart" is true, but an informative multi-page deck that assigns zero visual treatments is the #1 quality failure: flat walls of text in boxes, no diagrams / figures / charts / imagery. Hard floor for any content-bearing deck (≥6 pages, not a pure title/quote teaser):
- Every content page whose shape matches a catalog Pick clause MUST get a visual treatment — a
page_chartsentry (chart / infographic template), apage_layoutsstructural template, or an explicit §VII custom-diagram plan (no-template-matchwith the figure described). Text-in-boxes is the fallback only for pages that genuinely carry no structurable shape, and you must be able to name why.spec_lock.mdMUST NOT ship a content deck withpage_chartsempty ANDpage_layoutsempty/free-design AND no §VII custom-diagram rows. That combination means no visual was assigned anywhere — re-scan §IX and map each content page's shape to a figure (comparison→columns/quadrant, process/历程→timeline/process_flow, cycle/循环→concentric/segmented_wheel, share→donut/pie, trend→line/area, ranking→bar, architecture→layered_architecture, relations→hub_spoke/mind_map).- Downstream enforcement:
svg_quality_checker.pyhard-fails any deck (≥6 text-heavy pages) whose Executor output has zero<path>/<polygon>/<polyline>/<image>deck-wide. Leaving the visual plan empty here guarantees that failure later — assign the figures now, at the spec stage, where it is cheapest to change.
Reading is mandatory; the catalog is a starting point, not a copy target.
- Fully read
templates/charts/charts_index.jsonbefore drafting the Eight Confirmations — the read happens up front, not when you sit down to write Section VII. The file containsmeta+charts.<key>.summaryonly; eachsummaryis a selection rule ("Pick for … Skip if …"), not a description. There is no category, quickLookup, or keyword index — selection is done by semantically matching each page's content shape against all 71 summaries in one pass.- Not every page needs a chart. When a page's information structure matches a catalog entry, use that template as a structural starting point — keep the visualization type and core layout logic, then adapt composition, density, color, decoration, and accompanying elements to fit this deck's content and visual tone. Free adjustment is encouraged; what is forbidden is (a) generating without reading the catalog, and (b) blind verbatim mimicry that ignores the page's actual content weight.
Workflow:
- Read all 71 summaries; for each page, identify the Pick clause that matches the page's content shape AND does not match any Skip clause.
- Prefer specificity (
vertical_listover genericnumbered_steps).- One primary visualization per page; a supporting layout may accompany it.
- List selections in Design Spec section VII; section IX only notes the visualization type name per page.
Source vocabulary mismatch — the catalog is in English. When source content uses Chinese / industry jargon ("中台", "架构图", "述职", "管道", "前后端"), translate the intent first, then match against summaries. The catalog deliberately keeps no keyword index — full-read forces semantic matching rather than lexical grep.
Read-audit (mandatory, section VII format) — single combined table;
summary-quotecolumn is the anti-fabrication audit,path+usageserve Executor lookup. Format defined intemplates/design_spec_reference.md§VII:Catalog read: 71 templates | Page | Template | Path | Summary-quote (verbatim) | Usage | | ---- | ------------- | --------------------------------- | ------------------------ | ----- | | P03 | bar_chart | templates/charts/bar_chart.svg | "<verbatim first sentence>" | <intent> | | P07 | line_chart | templates/charts/line_chart.svg | "<verbatim first sentence>" | <intent> | | P11 | pie_chart | templates/charts/pie_chart.svg | "<verbatim first sentence>" | <intent> | Runners-up considered (3 entries minimum, drawn from real second-best matches): - <key_A> | rejected for P03: <reason citing this deck's specifics> - <key_B> | rejected for P07: <reason> - <key_C> | rejected for P11: <reason>The
summary-quotemust be copy-pasted fromcharts_index.json— paraphrasing or summarizing breaks the audit. Every template name listed (selected or rejected) mustgrepcleanly insidecharts_index.json(so misspelled or invented keys fail). If fewer than 3 visualization pages exist, list what exists and note "fewer than 3 viz pages"; runners-up still required for each page that does exist.Fallback when no template fits:
- Re-read the full summary list with the page's intent re-stated in plain language — "non-obvious" matches often surface on the second pass (e.g. "causal chain" →
process_floworsankey_chart).- If still no fit: data-driven content → table layout; conceptual/illustrative → "AI-generated image" (Image_Generator handles); structural → "custom layout".
- Mark the page
no-template-matchin section VII with the fallback chosen and why. Do NOT silently substitute a close-but-wrong chart.
Speaker Notes Requirements (Default — no discussion needed)
- File naming: Recommended to match SVG names (
01_cover.svg→notes/01_cover.md), also compatible withnotes/slide01.md - Fill in the Design Spec: total presentation duration, notes style (formal / conversational / interactive), presentation purpose (inform / persuade / inspire / instruct / report)
- Split note files must NOT contain
#heading lines (notes/total.mdmaster document MUST use#heading lines)
2. Mode & Visual-Style Catalogs (Reference for Confirmation Item d)
Confirmation d locks two independent catalog items:
- Mode — narrative skeleton:
modes/_index.md→pyramid/narrative/instructional/showcase/briefing. - Visual style — aesthetic:
visual-styles/_index.md→ presets +custom.
Read the relevant _index.md at confirmation d (Layer 1 / Layer 2) for its catalog table and auto-selection. Executor loads the locked mode + visual-style files at generation (see SKILL Step 6).
3. Color Knowledge Base
Consulting Brand Colors
| Brand | HEX |
|---|---|
| Deloitte Blue | #0076A8 |
| McKinsey Blue | #005587 |
| BCG Dark Blue | #003F6C |
| PwC Orange | #D04A02 |
| EY Yellow | #FFE600 |
Versatile / General Colors
| Style | HEX |
|---|---|
| Tech Blue | #2196F3 |
| Vibrant Orange | #FF9800 |
| Growth Green | #4CAF50 |
| Professional Purple | #9C27B0 |
| Alert Red | #F44336 |
Data Visualization Colors
- Positive trend (green):
#2E7D32→#4CAF50→#81C784 - Warning trend (yellow):
#F57C00→#FFA726→#FFD54F - Negative trend (red):
#C62828→#EF5350→#E57373
4. Layout Pattern Library
Principle — proportion follows information weight, not preset ratios. Combine patterns, break the grid for
breathingpages, or propose new patterns. Defaulting every page to symmetric grid produces the "AI-generated" look.
| Pattern | Suitable Scenarios | PPT 16:9 Reference Dimensions |
|---|---|---|
| Single column centered | Covers, conclusions, key points | Content width 800-1000px, horizontally centered |
| Symmetric split (5:5) | Comparisons where two sides carry equal weight | Column ratio 1:1, gap 40-60px |
| Asymmetric split (3:7 / 2:8) | One side dominates — chart vs. takeaway, image vs. caption | Heavier side 840-1024px, lighter side 256-440px |
| Three-column | Parallel points, process steps | Column ratio 1:1:1, gap 30-40px |
| Four-quadrant / matrix | Two-axis classification, strategic quadrants | Quadrant 560x250px, gap 20-30px |
| Top-bottom split | Ultra-wide images + text, processes, timelines | Image full-width, text area >= 150px height |
| Z-pattern / waterfall | Storytelling, case studies — blocks alternate left/right | Guide eye in Z; 3-5 alternating blocks |
| Center-radiating | Core concept + surrounding nodes | Center element 200-300px, 4-6 satellite nodes |
| Full-bleed + floating text | breathing / feature pages |
Image fills 1280x720, text floats over opacity overlay |
| Figure-text overlap | Hero moments — headline over/against image edge | Text partially overlaps image, not beside it |
| Negative-space-driven | Single element in 40-60% whitespace | One idea, weight through emptiness |
PPT 16:9 (1280x720) key dimensions: Safe area 1200x640 (40px margins); Title area 1200x100; Content area 1200x500; Footer area 1200x40.
5. Template Flexibility Principle
Templates are starting points. The Strategist may adjust based on content and audience:
- Font size ratios — reference values, adjustable
- Color schemes — customize per brand/content
- Layout patterns — combine, nest, or break (§4 lists 11 patterns as reference, not exhaustive)
- 12-chapter framework — expand or reduce
- Spacing / border radius — Executor adjusts per content density and
page_rhythm
6. Workflow & Deliverables
6.1 Content Planning Strategy
Content-outline and speaker-notes strategy follow the deck's locked mode — see modes/_index.md and the locked mode's file. The guidance below applies within any mode:
Delivery purpose drives the whole plan, not just type size. result.json delivery_purpose — text (read-close) / balanced (business, default) / presentation, confirmed as a Tier-1 anchor (§1) — is a deck-wide consumption mode. It seeds the body baseline (§g) and governs how content is distributed:
| Delivery purpose | Per-page density & treatment | §IX content per page | page_rhythm lean |
|---|---|---|---|
text · read-close |
dense — pack more per page, fuller layouts | prose paragraphs, more blocks, tables / fine detail; complete sentences | leans dense |
balanced · business (default) |
balanced | one primary + supporting points; moderate text | mixed |
presentation |
sparse — one idea per page, generous whitespace | keywords / short phrases, a single core message, large visual; never paragraph dumps | leans anchor / breathing |
This is what makes the axis meaningful: a presentation deck and a text deck built from the same source must differ in per-page text volume, layout density, and rhythm — not only in font size. Page count (item b) stays the user's call; delivery purpose governs the density and treatment within it, and informs the page-count recommendation when the user has not fixed one. Record the chosen purpose in design_spec.md §I. The page_rhythm leans are a bias, not a quota — the filler-page ban and "rhythm follows narrative" rule still hold. (Preservation paths — beautify / template-fill — keep source structure verbatim: honor purpose only in styling, never to re-paginate.)
Per-block expression: phrase each §IX content block in the mode that fits it — prose, bullet, keyword, or any phrasing the content calls for — not a default bullet. Take the cue from the source's texture: a narrative source (article / transcript / talk) leans prose — resist compressing its argument pages into fragments; a data sheet leans bullet/keyword. Write the real sentence into §IX itself, not a skeleton point to expand later. One page mixes modes; let layout pull each (narrative → prose, structural/chart → bullets/keywords).
Note: §IX is the only content copy the Executor re-reads after context compression — what you write there is what survives.
6.2 Outline Output Specification (Must include 11 chapters)
| Chapter | Content Requirements |
|---|---|
| I. Project Information | Project name, canvas format, page count, style, audience, scenario, delivery purpose, date |
| II. Canvas Specification | Format, dimensions, viewBox, margins, content area |
| III. Visual Theme | Style description, light/dark theme, tone, color scheme (with HEX table), gradient scheme |
| IV. Typography System | Font plan (per-role families — title / body / emphasis / code), font size hierarchy |
| V. Layout Principles | Page structure (header/content/footer zones), layout pattern library (combine/break as content demands), spacing spec |
| VI. Icon Usage Spec | Source description, placeholder syntax, recommended icon list |
| VII. Visualization Reference List | Visualization type, reference template path, used-in pages, purpose |
| VIII. Image Resource List | Filename, dimensions, ratio, purpose, status, generation description |
| IX. Content Outline | Grouped by chapter; each page includes layout, title, core message (the page's one idea), content blocks (in the selected phrasing mode), visualization type (if applicable) |
| X. Speaker Notes Requirements | File naming rules, content structure description |
| XI. Technical Constraints Reminder | SVG generation rules, PPT compatibility rules |
Generation steps:
- Read reference template:
templates/design_spec_reference.md - Generate complete spec from scratch based on analysis
- Save to:
projects/<project_name>.../design_spec.md - Generate execution lock: read
templates/spec_lock_reference.mdand produceprojects/<project_name>.../spec_lock.md— a distilled, machine-readable short form of the color / typography / icon / image / page_rhythm / page_layouts / page_charts decisions above. This file is what the Executor re-reads before every page (see executor-base.md §2.1). The values inspec_lock.mdMUST exactly match the decisions recorded indesign_spec.md; if they ever diverge,spec_lock.mdwins anddesign_spec.mdshould be treated as historical narrative.- page_rhythm is mandatory: Based on the page list in §IX Content Outline, assign each page one of
anchor/dense/breathing(seespec_lock_reference.mdfor the full vocabulary). This is what breaks the uniform "every page is a card grid" feel — without it the Executor defaults all pages todense. - Rhythm follows narrative, not quota:
breathingpages mark natural pauses — chapter transitions, standalone emphasis (hero quote / big number), SCQA bridges. Dense decks may legitimately be alldense. Do NOT invent filler pages ("Thank you", empty dividers) to pad rhythm — everybreathingpage must say something independent. Delivery purpose biases the overall lean (presentationtoward moreanchor/breathing,texttowarddense; see §6.1) — a bias, never a quota. - Cover impact is mandatory: Page
P01is the deck's first visual contract, not a generic title slide. Indesign_spec.md §IX, add aCover impactline forP01that names one concrete hook and one concrete composition strategy. Use the source's strongest available signal: a provocative core claim, object / scene metaphor, hero number, founder / product / audience moment, or a distilled conflict. Pair it with one concrete composition strategy — such asfull-bleed image + floating title,typographic poster,hero object,data hook,editorial scene,high-contrast abstract geometry, or a fresh composition the deck's subject suggests (these are starting points, not the allowed set). If no external or AI image is available, still specify a native-SVG visual hook; do not fall back to "title + subtitle + decorative background". (Beautify / template-fill keep the source cover verbatim — this rule does not apply on those preservation paths.) - Cover rhythm lock:
P01remainsanchorinspec_lock.md page_rhythm, but its §IXCover impactmust prevent content-page patterns. Do not plan multi-card grids, agenda-like bullets, or equal-weight columns on the cover unless a template explicitly requires that structure, or a preservation path (beautify / template-fill) is transcribing the source cover verbatim. - Closing impact (only when the deck closes): the deck's last page is its final visual contract — the strongest impression after the cover. When the deck genuinely lands on a conclusion / call-to-action / final-takeaway page, give it a
Closing impactline in §IX: name the one thing the audience should leave with (a distilled takeaway, a forward call, a memorable restatement of the core claim) + one composition that delivers it — never a generic "Thank you" / contact-only slide or a centered-title reprise of the cover. Do NOT invent a closing page to satisfy this — the filler-page ban above still holds; apply it only to the page where the deck actually resolves. Same exemptions as the cover: skip on template / beautify / template-fill preservation paths. - page_layouts (write only when a template is in use): For each page that inherits a template SVG, add
P<NN>: <svg_basename>(e.g.,P04: 03a_content_image_text). Pages designed freely get no entry — Executor reads the absence as "free design, no inheritance". If zero pages use a template, omit the section entirely. - page_charts (write only for chart pages that match a catalog template): For each page in
design_spec.md §VIIwhosereference template pathpoints totemplates/charts/<name>.svg, addP<NN>: <chart_name>. Pages withno-template-matchin §VII MUST NOT appear here (Executor would look for a non-existent reference). If the deck has no data-visualization pages, omit the section. - Hard rule: Use both
page_layoutsandpage_chartsfor the same page only when the layout template is a compatible shell for the chart. Do not pair chart pages with conflicting page layouts (e.g.,waterfall_chart+ timeline layout, KPI cards + circle-diagram layout). If no compatible layout exists, omit the page frompage_layouts.
- page_rhythm is mandatory: Based on the page list in §IX Content Outline, assign each page one of
7. Project Folder
Project folder must exist before Strategist runs. If not, execute:
python3 scripts/project_manager.py init <project_name> --format <canvas_format>
Save outputs to projects/<project_name>_<format>_<YYYYMMDD>/design_spec.md.
8. Complete Design Spec and Prompt Next Steps
After writing design_spec.md and spec_lock.md, output the next-step prompt below. This is a handoff instruction, not part of design_spec.md. Pick the variant by whether Step 3 copied a template into <project_path>/templates/.
Template mode (template applied in Step 3)
✅ Design spec complete. Template ready.
Next step:
- Images include AI generation → Invoke Image_Generator
- Otherwise → Invoke Executor
Free design (default, no template)
✅ Design spec complete.
Next step:
- Images include AI generation → Invoke Image_Generator
- Otherwise → Invoke Executor (free design for every page)