Swarm
Process many independent items in parallel. create builds a table handle; run fans work out across rows and merges results back. One row = one unit of work — swarm handles batching automatically.
Flow
- Create. Build a table from a source — files, a glob pattern, or
pre-parsed records. One row per item. Returns a handle.
- Run. Dispatch an
instructiontemplate across rows. Results are merged
back into the table. Returns { completed, failed, skipped, failures }.
- Aggregate. Use
rows()and plain JS to count, filter, or summarize.
Do not spawn additional subagents for aggregation.
- Retry. Re-run with
filter: { column: "<col>", exists: false }to
reprocess only failed rows.
Choosing a source
glob / filePaths — one file = one row. Use when each file is an independent unit of work. Each row gets { id, file }; the subagent reads the file itself via the {file} placeholder.
tasks — pass pre-built records directly. Use when the data lives inside a file (JSONL, CSV, JSON array). Read and parse the file first inside eval, then pass the records. One record = one row — do not group multiple items into a single row.
For small files (under ~500 lines), parse and create in one block:
const { create } = await import("@/skills/swarm");
const raw = await tools.readFile({ file_path: "/data.jsonl" });
const records = raw.trim().split("\n").map(l => JSON.parse(l));
const table = await create({ tasks: records });
console.log(table);
For large files, read in chunks of 500 lines to avoid truncation:
const { create } = await import("@/skills/swarm");
let records = [];
let offset = 0;
while (true) {
const chunk = await tools.readFile({ file_path: "/data.txt", offset, limit: 500 });
const lines = chunk.split("\n").filter(l => l.trim());
for (const l of lines) { records.push({ id: `r${records.length}`, text: l }); }
if (lines.length < 500) break;
offset += 500;
}
const table = await create({ tasks: records });
console.log(table);
When the file is too large to parse and dispatch in one eval call, split across two blocks. Only the block that calls swarm functions needs the import:
// eval 1: parse only — no swarm import needed
const raw = await tools.readFile({ file_path: "/data.jsonl" });
globalThis.records = raw.trim().split("\n").map(l => JSON.parse(l));
console.log(`Parsed ${globalThis.records.length} records`);
// eval 2: create and dispatch
const { create, run } = await import("@/skills/swarm");
const table = await create({ tasks: globalThis.records });
const result = await run(table.id, {
instruction: "Classify {text}",
responseSchema: {
type: "object",
properties: { label: { type: "string" } },
required: ["label"],
},
});
console.log(result);
Passing filePaths: ["/data.jsonl"] would produce a table with one row pointing at the file — not one row per record inside it.
When to use subagentType
Omit subagentType for classification, extraction, labeling, and any task where a single model call with structured output is sufficient. This is the default and is significantly cheaper and faster — each dispatch is a direct model call, no tools, no iteration.
Set subagentType when the task requires tools, file access, or multi-step reasoning. Each dispatch runs a full agentic loop with the named subagent.
// Direct model call — classification, no tools needed
await run(table.id, {
instruction: "Classify {text}",
responseSchema: { type: "object", properties: { label: { type: "string" } }, required: ["label"] },
});
// Subagent — needs to read files and reason over multiple steps
await run(table.id, {
subagentType: "reviewer",
instruction: "Review {file} for security issues.",
responseSchema: { type: "object", properties: { finding: { type: "string" } }, required: ["finding"] },
});
Instruction + context
instruction is a per-item template with {column} placeholders. Placeholders are resolved by the framework — your column names appear in prompts as references to the values listed alongside, never as raw template syntax. Subagents do the work — do not process items yourself in JS and write the results into rows.
context is free-form prose prepended to every subagent prompt. Use it for shared background: domain terms, classification rules, examples, etc.
const { create, run } = await import("@/skills/swarm");
const table = await create({ glob: "src/**/*.ts" });
const r = await run(table.id, {
subagentType: "reviewer",
instruction: "Review {file} for security issues. List findings or write 'no issues'.",
context: "TypeScript Express backend using Prisma ORM. Focus on injection, auth bypass, path traversal.",
responseSchema: {
type: "object",
properties: { review: { type: "string" } },
required: ["review"],
},
});
console.log(r);
// → { completed: 45, failed: 2, skipped: 0, failures: [...] }
Structured output
responseSchema is required. Schema properties become top-level columns on each row and constrain what subagents can return.
const { run } = await import("@/skills/swarm");
await run(table.id, {
instruction: "Classify: {text}",
responseSchema: {
type: "object",
properties: {
sentiment: { type: "string", enum: ["positive", "negative", "neutral"] },
},
required: ["sentiment"],
},
});
// Row after: { id: "r1", text: "...", sentiment: "positive" }
Batching
By default, swarm auto-batches to keep total dispatches under 10. For small tables (≤10 rows) each row gets its own subagent call. For larger tables, rows are grouped automatically.
Set batchSize to control grouping:
- Number — uniform batch size for all rows.
batchSize: 1forces per-row
dispatch; batchSize: 20 groups in twenties.
- Function —
(row, rowCount) => number. Returns the desired batch size
for each row. Rows with the same batch size are grouped together, then chunked. Allows mixed dispatch where some rows go solo and others batch.
const { create, run } = await import("@/skills/swarm");
const table = await create({ tasks: items });
// Complex items get individual attention; simple ones batch together
await run(table.id, {
instruction: "Analyze {text}",
responseSchema: {
type: "object",
properties: { analysis: { type: "string" } },
required: ["analysis"],
},
batchSize: (row) => (row.token_count > 1000 ? 1 : 10),
});
Batch sizes are clamped to [1, 50] after evaluation.
Aggregation
After run(), use rows() and plain JS — no additional subagents needed.
const { rows } = await import("@/skills/swarm");
const data = await rows(table.id, { columns: ["sentiment"] });
const counts = {};
data.forEach(r => { counts[r.sentiment] = (counts[r.sentiment] || 0) + 1 });
console.log(counts);
// → { positive: 120, negative: 45, neutral: 35 }
Chaining passes
run updates the table in place — chain calls to accumulate columns.
const { create, run } = await import("@/skills/swarm");
const table = await create({ tasks: interviews });
await run(table.id, {
instruction: "Classify sentiment of {text}",
responseSchema: {
type: "object",
properties: { sentiment: { type: "string", enum: ["positive", "negative", "neutral"] } },
required: ["sentiment"],
},
});
await run(table.id, {
filter: { column: "sentiment", equals: "negative" },
instruction: "Summarize why {text} had negative sentiment.",
responseSchema: {
type: "object",
properties: { summary: { type: "string" } },
required: ["summary"],
},
});
Action-only tasks
When subagents perform actions (write a file, apply a fix) rather than return data, use a simple schema with a status or marker field. The exists: false filter still works for retries.
const { create, run } = await import("@/skills/swarm");
const fixedSchema = {
type: "object",
properties: { fixed: { type: "string" } },
required: ["fixed"],
};
const table = await create({ glob: "src/**/*.ts" });
await run(table.id, {
subagentType: "fixer",
instruction: "Add missing JSDoc to all exported functions in {file}.",
responseSchema: fixedSchema,
});
// retry any that failed
await run(table.id, {
subagentType: "fixer",
instruction: "Add missing JSDoc to all exported functions in {file}.",
responseSchema: fixedSchema,
filter: { column: "fixed", exists: false },
});
Filtering
{ column: "status", equals: "done" }
{ column: "status", notEquals: "done" }
{ column: "category", in: ["A", "B"] }
{ column: "result", exists: false } // not yet processed
{ and: [filter1, filter2] }
{ or: [filter1, filter2] }
Technical notes
- Only import
@/skills/swarmin blocks where you call swarm functions.
Data preparation (reading files, parsing, storing in globalThis) does not need the import. Destructure only what you use: { create }, { run }, { create, run }, etc.
- Console output is capped at ~5 KB. Never log raw file contents —
log only counts and short samples.
- **
readFileinsideevalreturns raw content — no line-number
prefixes.** Request at most 500 lines per call. For files with more than 500 lines, loop with incrementing offset.
- When building a table from a file, read it inside
eval. Data read
inside the sandbox stays there; it never enters the agent's context window.
- Never write to
.swarm/directly. Always usecreate(). - Everything the subagent needs must be in
instruction+context.
Subagents can't see the agent's context.
- Row ids must be unique.
create()rejects sources that produce
duplicate ids. For tasks, that's a caller-side responsibility; for glob / filePaths, ids are auto-disambiguated by parent directory.
- Unknown columns fail fast. If
instructionreferences{foo}and
no matched row provides foo, run() throws before any subagent is dispatched.
API Reference
create(source)
Create a table. Returns a handle { id, count, columns }.
| Source | Description |
|---|---|
{ glob: "src/*/.ts" } or { glob: ["src/*/.ts", "lib/*/.ts"] } | Match files by one or more patterns. Columns: id, file |
{ filePaths: ["a.ts", "b.ts"] } | Explicit file list. Columns: id, file |
{ tasks: [{ id: "t1", text: "..." }] } | Custom rows. Each must have id |
run(tableId, options)
Dispatch work across rows. Returns { completed, failed, skipped, failures }.
| Option | Default | Description |
|---|---|---|
instruction | (required) | Template with {column} placeholders |
responseSchema | (required) | JSON Schema (type: "object") — properties become row columns |
context | — | Prose prepended to every subagent prompt |
filter | — | Only dispatch matching rows |
subagentType | — | Name of subagent to dispatch to. When set, runs a full agentic loop. When omitted, runs a direct model call |
batchSize | auto | Number or (row, rowCount) => number. Auto caps dispatches at 10; 1 = per-row; function = per-row sizing |
concurrency | 10 | Max concurrent subagent dispatches (clamped to 1–10) |
rows(tableId, options?)
Retrieve rows. Use for inspection and JS-based aggregation.
| Option | Description |
|---|---|
filter | Only return matching rows |
columns | Project to specific columns |
limit | Max rows returned |

