Math Olympiad Solver
The five things that change outcomes
- **Strip thinking before verifying**: a verifier that sees the reasoning is biased toward agreement. Fresh context, cleaned proof only.
- **"Does this prove RH?"**: if your theorem's specialization to ζ is a famous open problem, you have a gap. Most reliable red flag.
- **Short proof → extract the general lemma**: try 2×2 counterexamples. If the general form is false, find what's special about THIS instance.
- **Same gap twice → step back**: the case split may be obscuring a unified argument. Three lines sometimes do what twelve pages couldn't.
- **Say "no confident solution"**: wrong-and-confident is worse than an honest abstention.
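The "try 2×2 counterexamples" check can be sketched as a brute-force search over small integer matrices. This is only an illustration for deep mode; in the tight-budget workflow the same check is done by hand. The candidate lemma below, `(X+Y)^2 = X^2 + 2XY + Y^2`, is a hypothetical stand-in for whatever general statement a short proof seems to rely on, and all function names are assumptions of this sketch.

```python
from itertools import product

# A 2x2 matrix [[a, b], [c, d]] is stored as the tuple (a, b, c, d).

def mul(x, y):
    """Product of two 2x2 matrices."""
    a, b, c, d = x
    e, f, g, h = y
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)

def add(x, y):
    """Entrywise sum of two 2x2 matrices."""
    return tuple(p + q for p, q in zip(x, y))

def candidate_lemma(x, y):
    """Hypothetical general lemma: (X+Y)^2 == X^2 + 2XY + Y^2."""
    s = add(x, y)
    xy = mul(x, y)
    rhs = add(add(mul(x, x), add(xy, xy)), mul(y, y))
    return mul(s, s) == rhs

def find_counterexample(lemma, entries=(0, 1)):
    """Return the first pair (X, Y) with entries from `entries` that violates the lemma."""
    mats = list(product(entries, repeat=4))
    for x, y in product(mats, repeat=2):
        if not lemma(x, y):
            return x, y
    return None  # no counterexample in this range; this is not a proof

if __name__ == "__main__":
    print(find_counterexample(candidate_lemma))
```

If the search returns a pair, the general form is false and the proof must be using something specific to the instance (here, commutativity). A `None` result only says the small search space found nothing.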
---
**Tool policy**: Solvers and verifiers use THINKING ONLY in the tight-budget workflow. Competition math is reasoning. Computation is for deep mode (§6c), and even then it is bounded: a doubly-exponential recurrence can't be computed exactly past n ≈ 30, so work mod 2^m instead.
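A minimal sketch of the "work mod 2^m" advice, assuming a hypothetical recurrence `a_{n+1} = a_n^2 + 1` as a stand-in for any doubly-exponential sequence. The exact value roughly doubles in bit length at every step, while the residue mod 2^m stays at m bits. Reducing at every step is valid here because the recurrence is a polynomial with integer coefficients.

```python
def step(a: int) -> int:
    """Hypothetical recurrence a_{n+1} = a_n^2 + 1 (assumption for illustration)."""
    return a * a + 1

def term_exact(n: int, a0: int = 2) -> int:
    """Exact a_n; the bit length roughly doubles each step, so keep n small."""
    a = a0
    for _ in range(n):
        a = step(a)
    return a

def term_mod_pow2(n: int, m: int, a0: int = 2) -> int:
    """a_n mod 2^m, reducing after every step so values never exceed m bits."""
    mask = (1 << m) - 1
    a = a0 & mask
    for _ in range(n):
        a = step(a) & mask
    return a

if __name__ == "__main__":
    print(term_exact(3))              # 2 -> 5 -> 26 -> 677
    print(term_mod_pow2(100_000, 64))  # cheap even for large n
```

The modular version answers questions about residues and 2-adic valuations up to m bits; it says nothing about the size of the exact term.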
---
When to use which approach
| Problem | Approach | Verification |
| --- | --- | --- |
| AIME numeric answer | Best-of-N → majority vote | Answer check only |
| Olympiad proof (IMO/Putnam/USAMO) | Full workflow below | 5-pass adversarial |
| "Is this proof correct?" | Skip to verification (step 4) | Adversarial + spec-gaming |
| **Full problem set** (e.g. all 6 from a competition) | One full workflow per problem, run in parallel; collect results, compile a single PDF | Per-problem adversarial |
| "Simplify this proof" | Skip to presentation (step 8) | None |

**Batch in one Workflow**: Set `opts.label` on every `agent()` call to include the problem ID (e.g., `label: "P3:solver:2"`). Without labels, 36 results come back with no problem association. Run problems in parallel; the label is what matters, not the ordering.

For a full problem set

Launch one solver workflow per problem (same VERBATIM prompt, different statement). Run them in parallel. When all return, run adversarial verification per problem. Problems that pass get their proof in the PDF; problems that abstain get "No confident solution" with partial notes.

Don't try to solve all N problems in one agent's context: each problem needs its own thinking budget and its own fresh-context verifier. The composition is mechanical: collect the per-problem outputs, fill in LaTeX sections, compile once.
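The label-based bookkeeping and the majority vote for numeric answers can be sketched as follows. The result shape (a dict with `label` and `output` keys) and the `problem:role:index` label layout beyond the `P3:solver:2` example are assumptions of this sketch, not the actual return format of `agent()`. Abstaining on a tie is also an assumption, chosen to match the "no confident solution" rule.

```python
from collections import Counter, defaultdict

def parse_label(label):
    """Split a label such as 'P3:solver:2' into ('P3', 'solver', 2)."""
    problem, role, index = label.split(":")
    return problem, role, int(index)

def group_by_problem(results):
    """Group result dicts by the problem ID embedded in their label."""
    grouped = defaultdict(list)
    for result in results:
        problem, role, index = parse_label(result["label"])
        grouped[problem].append({**result, "role": role, "index": index})
    for runs in grouped.values():
        runs.sort(key=lambda r: (r["role"], r["index"]))  # order within a problem only
    return dict(grouped)

def majority_vote(answers):
    """Most common answer, or None (abstain) when the top two counts tie."""
    top = Counter(answers).most_common(2)
    if not top:
        return None
    if len(top) == 2 and top[0][1] == top[1][1]:
        return None
    return top[0][0]

if __name__ == "__main__":
    # Hypothetical results arriving in arbitrary order from parallel runs.
    results = [
        {"label": "P2:solver:1", "output": "17"},
        {"label": "P1:solver:2", "output": "42"},
        {"label": "P1:solver:1", "output": "42"},
        {"label": "P1:solver:3", "output": "40"},
    ]
    for problem, runs in sorted(group_by_problem(results).items()):
        print(problem, majority_vote([r["output"] for r in runs]))
```

Because grouping keys off the label alone, the arrival order of parallel results does not matter.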
---
The Workflow
1. Interpretation check (30 seconds, catches 50/63 of one class of errors)
Before solving anything, identify the interpretation.
> Read the problem statement. List 2-3 ways it could be interpreted. For each:
> is this reading TRIVIAL? If one reading makes the problem easy and ano
<!-- truncated -->