OpenClaw · Skill
Zerox
Convert various document formats to Markdown using the zerox library and GPT-4o vision.
Install
Start with the primary install command. Alternate entrypoints are included below for ClawHub and OpenClaw CLI users.
Primary command
clawhub install otacu/zerox
ClawHub installer
npx clawhub@latest install otacu/zerox
OpenClaw CLI
openclaw skills install otacu/zerox
Direct OpenClaw install
openclaw install otacu/zerox
What this skill does
Convert various document formats to Markdown using the zerox library and GPT-4o vision.
Why it matters
Uses GPT-4o vision to extract text from scanned documents that traditional OCR tools misread or fail on entirely.
Typical use cases
- Extract text from a scanned contract PDF
- Convert a PPTX slide deck to Markdown for editing
- Pull content from a DOCX report into plain Markdown text
- Process a batch of invoice images into structured text
- Convert a legacy PDF manual to searchable Markdown
Source instructions
Zerox Document Converter
Convert various document formats to Markdown using the zerox library and GPT-4o vision.
Supported Formats
- PDF (scanned and text-based)
- Microsoft Word (DOCX)
- Microsoft PowerPoint (PPTX)
- Images (PNG, JPG, etc.)
- And more via OCR
Convert Document (Foreground)
For small files that convert in under 30 seconds:
node {baseDir}/scripts/convert.mjs <filePath> [outputPath]
Examples
# Convert PDF - saves to {baseDir}/output/document.md by default
node {baseDir}/scripts/convert.mjs "/path/to/document.pdf"
# Convert PDF with custom output path
node {baseDir}/scripts/convert.mjs "/path/to/document.pdf" "/path/to/output.md"
# Convert Word document - saves to {baseDir}/output/document.md
node {baseDir}/scripts/convert.mjs "/path/to/document.docx"
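The two default-path examples above suggest that the output file takes the input's base name with a .md extension. The following is a minimal sketch of that naming rule, assuming it holds in general (the source does not state it). `default_output_path` and `BASE_DIR` are hypothetical names, with `BASE_DIR` standing in for {baseDir}.

```shell
# Hypothetical helper, not part of the skill's scripts.
# Prints the default output path that the examples above imply.
default_output_path() {
  base_dir=$1
  file_path=$2
  name=$(basename "$file_path")   # e.g. document.pdf
  stem=${name%.*}                 # drop the last extension, e.g. document
  printf '%s/output/%s.md\n' "$base_dir" "$stem"
}

# BASE_DIR is an assumed stand-in for {baseDir}.
BASE_DIR="/opt/skills/zerox"
default_output_path "$BASE_DIR" "/path/to/document.pdf"
# prints /opt/skills/zerox/output/document.md
```

If two inputs share a base name (for example document.pdf and document.docx), this rule maps both to the same file, so passing an explicit outputPath is the safer choice in that case.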
Convert Document (Background)
For large files or scanned PDFs that take minutes:
node {baseDir}/scripts/convert-bg.mjs <filePath> [outputPath]
Features
- Runs conversion in background (no timeout issues)
- Logs progress to {baseDir}/output/convert-bg.log
- Sends macOS notification when complete
- Detached from terminal (safe to close)
Examples
# Convert large scanned PDF in background
node {baseDir}/scripts/convert-bg.mjs "/path/to/scanned-document.pdf"
# Monitor progress
tail -f {baseDir}/output/convert-bg.log
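Choosing between the foreground and background scripts can be automated. The sketch below uses file size as a rough proxy for conversion time. The 5 MiB cutoff and the `pick_converter` helper are assumptions made for illustration, not part of the skill. Size alone will not flag a small scanned PDF, which may still take minutes.

```shell
# Hypothetical helper: prints the script name to use for a given file.
# The default threshold (5 MiB) is an assumed value; tune it to your documents.
pick_converter() {
  file_path=$1
  threshold=${2:-5242880}
  # wc -c may pad its output with spaces on some systems, so strip them.
  size=$(wc -c < "$file_path" | tr -d ' ')
  if [ "$size" -gt "$threshold" ]; then
    echo "convert-bg.mjs"
  else
    echo "convert.mjs"
  fi
}

# Possible usage, with BASE_DIR standing in for {baseDir}:
#   node "$BASE_DIR/scripts/$(pick_converter "$file")" "$file"
```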
Requirements
APIYI_API_KEY: Your OpenAI-compatible API key (environment variable)
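Because the scripts need this variable, checking for it before a conversion starts can save a wasted run. This is a minimal sketch; `require_api_key` is a hypothetical helper, and how the scripts themselves behave when the key is missing is not documented here.

```shell
# Hypothetical pre-flight check, not part of the skill's scripts.
# Returns 0 if APIYI_API_KEY is set and non-empty, 1 otherwise.
require_api_key() {
  if [ -z "${APIYI_API_KEY:-}" ]; then
    echo "APIYI_API_KEY is not set" >&2
    return 1
  fi
}

# Possible usage, with BASE_DIR standing in for {baseDir}:
#   require_api_key && node "$BASE_DIR/scripts/convert.mjs" "/path/to/document.pdf"
```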
Notes
- The conversion uses GPT-4o vision to extract text, so it works even with scanned documents
- Large documents and scanned PDFs can take several minutes to process; use the background script for those
- Output is plain Markdown text