Claude Code · Community plugin
Autoresearch Agent
Autonomous experiment loop that optimizes any file by a measurable metric. 5 slash commands, 8 evaluators, configurable loop intervals (10min to monthly).
What this plugin covers
This page keeps a stable Remote OpenClaw URL for the upstream plugin while preserving the original source content below. The shell stays consistent, and the body can vary as much as the upstream SKILL.md or README varies.
Source files and registry paths
Source path
engineering/autoresearch-agent
Entry file
Not available
Manifest file
engineering/autoresearch-agent/.claude-plugin/plugin.json
Repository
alirezarezvani/claude-skills
Format
json-plugin
Original source content
# Autoresearch Agent: Claude Code Instructions

This plugin runs autonomous experiment loops that optimize any file by a measurable metric.

## Commands

Use the `/ar:` namespace for all commands:

- `/ar:setup`: Set up a new experiment interactively
- `/ar:run`: Run a single experiment iteration
- `/ar:loop`: Start an autonomous loop with user-selected interval
- `/ar:status`: Show dashboard and results
- `/ar:resume`: Resume a paused experiment

## How it works

You (the AI agent) are the experiment loop. The scripts handle evaluation and git rollback.

1. You edit the target file with ONE change
2. You commit it
3. You call `run_experiment.py --single`, which evaluates and prints KEEP/DISCARD/CRASH
4. You repeat

Results persist in `results.tsv` and git log. Sessions can be resumed.

## When to use each command

### Starting fresh

```
/ar:setup
```

Creates the experiment directory, config, program.md, results.tsv, and git branch.

### Running one iteration at a time

```
/ar:run engineering/api-speed
```

Read history, make one change, evaluate, report result.

### Autonomous background loop

```
/ar:loop engineering/api-speed
```

Prompts for interval (10min, 1h, daily, weekly, monthly), then creates a recurring job.

### Checking progress

```
/ar:status
```

Shows the dashboard across all experiments with metrics and trends.

### Resuming after context limit or break

```
/ar:resume engineering/api-speed
```

Reads results history, checks out the branch, and continues where you left off.

## Agents

- **experiment-runner**: Spawned for each loop iteration. Reads config, results history, decides what to try, edits target, commits, evaluates.

## Key principle

**One change per experiment. Measure everything. Compound improvements.**

The agent never modifies the evaluator. The evaluator is ground truth.
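
The keep-or-discard decision described under "How it works" can be illustrated with a minimal Python sketch. This is not the plugin's code: the internals of `run_experiment.py` are not shown in the source, so the function names `decide` and `run_loop`, the `lower_is_better` flag, and the use of `None` to signal a failed evaluation are all assumptions made for illustration. The sketch only shows how kept changes could compound by becoming the new baseline.

```python
from typing import Iterable, List, Optional, Tuple


def decide(best: float, candidate: Optional[float], lower_is_better: bool = True) -> str:
    """Hypothetical verdict for one iteration: KEEP, DISCARD, or CRASH."""
    if candidate is None:
        # Assumption: a missing metric means the evaluator failed to run.
        return "CRASH"
    improved = candidate < best if lower_is_better else candidate > best
    return "KEEP" if improved else "DISCARD"


def run_loop(
    baseline: float,
    measurements: Iterable[Optional[float]],
    lower_is_better: bool = True,
) -> Tuple[float, List[Tuple[Optional[float], str]]]:
    """Apply decide() to each measurement, carrying the best value forward."""
    best = baseline
    history: List[Tuple[Optional[float], str]] = []
    for value in measurements:
        verdict = decide(best, value, lower_is_better)
        if verdict == "KEEP":
            # A kept change becomes the baseline for the next experiment.
            best = value
        history.append((value, verdict))
    return best, history


if __name__ == "__main__":
    # Made-up latency numbers (ms) purely for demonstration.
    best, history = run_loop(100.0, [95.0, 97.0, None, 90.0])
    for value, verdict in history:
        print(value, verdict)
    print("best:", best)
```

In this sketch a discarded or crashed iteration leaves the baseline untouched, which mirrors the role the source assigns to git rollback: only changes that improve the metric remain on the branch.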
Related Claude Code plugins
claude-skills
Agenthub
Multi-agent collaboration plugin for Claude Code. Spawn N parallel subagents that compete on code optimization, content drafts, research approaches, or any problem that benefits from diverse solutions. Evaluate by metric or LLM judge, merge the winner. 7 slash commands, agent templates, git DAG orchestration, message board coordination.
claude-skills
Behuman
Self-Mirror consciousness loop for human-like AI responses. Adds inner dialogue (Self → Mirror → Conscious Response) to make AI output feel authentic, not robotic. Zero dependencies — pure prompt technique.
claude-skills
Code Tour
Create CodeTour .tour files — persona-targeted, step-by-step walkthroughs that link to real files and line numbers. Supports 10 developer personas (vibecoder, new joiner, architect, security reviewer, etc.), all CodeTour step types, and SMIG description formula.
claude-skills
Data Quality Auditor
Audit datasets for completeness, consistency, accuracy, and validity. 3 stdlib-only Python tools: data profiler with DQS scoring, missing value analyzer with MCAR/MAR/MNAR classification, and multi-method outlier detector.
claude-skills
Demo Video
Create polished demo videos from screenshots and scene descriptions. Orchestrates playwright, ffmpeg, and edge-tts to produce product walkthroughs, feature showcases, and marketing teasers with story structure, scene design system, and narration guidance.
claude-skills
Docker Development
Docker and container development agent skill and plugin for Dockerfile optimization, docker-compose orchestration, multi-stage builds, and container security hardening. Covers build performance, layer caching, and production-ready container patterns.