OpenClaw · Skill

Zerox

Convert various document formats to Markdown using the zerox library and GPT-4o vision.

Image & Video Generation
v0.1.0
VirusTotal: Suspicious

Install

Start with the primary install command. Alternate entrypoints are included below for ClawHub and OpenClaw CLI users.

Primary command

clawhub install otacu/zerox

ClawHub installer

npx clawhub@latest install otacu/zerox

OpenClaw CLI

openclaw skills install otacu/zerox

Direct OpenClaw install

openclaw install otacu/zerox

What this skill does

Converts PDF, DOCX, PPTX, and image files to Markdown. Each document is passed through the zerox library, which uses GPT-4o vision to read the page content.

Why it matters

Uses GPT-4o vision to extract text from scanned documents that traditional OCR tools misread or fail on entirely.

Typical use cases

  • Extract text from a scanned contract PDF
  • Convert a PPTX slide deck to Markdown for editing
  • Pull content from a DOCX report into a plain text format
  • Process a batch of invoice images into structured text
  • Convert a legacy PDF manual to searchable Markdown

Source instructions

Zerox Document Converter

Convert various document formats to Markdown using the zerox library and GPT-4o vision.

Supported Formats

  • PDF (scanned and text-based)
  • Microsoft Word (DOCX)
  • Microsoft PowerPoint (PPTX)
  • Images (PNG, JPG, etc.)
  • And more via OCR

Convert Document (Foreground)

For small files that convert in under about 30 seconds:

node {baseDir}/scripts/convert.mjs <filePath> [outputPath]

Examples

# Convert PDF - saves to {baseDir}/output/document.md by default
node {baseDir}/scripts/convert.mjs "/path/to/document.pdf"

# Convert PDF with custom output path
node {baseDir}/scripts/convert.mjs "/path/to/document.pdf" "/path/to/output.md"

# Convert Word document - saves to {baseDir}/output/document.md
node {baseDir}/scripts/convert.mjs "/path/to/document.docx"
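
The examples above suggest that the default output name is the input file name with its extension replaced by .md, placed under {baseDir}/output. The helper below is a minimal sketch of that convention, useful for predicting where a result will land. It is not taken from convert.mjs: the function name default_output_path and the base directory /opt/zerox are hypothetical stand-ins.

```shell
# Sketch: predict the default Markdown path for a given input file.
# base_dir stands in for {baseDir}; the function name is hypothetical.
default_output_path() {
  base_dir=$1
  input_file=$2
  name=$(basename "$input_file")   # "/path/to/document.pdf" -> "document.pdf"
  stem=${name%.*}                  # strip the last extension -> "document"
  printf '%s/output/%s.md\n' "$base_dir" "$stem"
}

# "/opt/zerox" is an illustrative base directory, not a documented location.
default_output_path "/opt/zerox" "/path/to/document.pdf"
# prints /opt/zerox/output/document.md
```

Under this convention document.pdf and document.docx both map to document.md, so passing an explicit outputPath is the safer choice when converting files that share a name.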

Convert Document (Background)

For large files or scanned PDFs that take minutes:

node {baseDir}/scripts/convert-bg.mjs <filePath> [outputPath]

Features

  • Runs conversion in background (no timeout issues)
  • Logs progress to {baseDir}/output/convert-bg.log
  • Sends macOS notification when complete
  • Detached from terminal (safe to close)

Examples

# Convert large scanned PDF in background
node {baseDir}/scripts/convert-bg.mjs "/path/to/scanned-document.pdf"

# Monitor progress
tail -f {baseDir}/output/convert-bg.log
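
When a background conversion is started from a script rather than an interactive terminal, polling for the output file can stand in for watching the log or waiting for the macOS notification. The sketch below assumes that the Markdown file appearing at its expected path signals completion. The source does not confirm when the file is written, so treat that as an assumption. The function name wait_for_file and its default timeout are hypothetical.

```shell
# Sketch: block until a file exists or a timeout (in seconds) elapses.
# Assumption: the output file appearing means the background job is done;
# the source does not state when convert-bg.mjs writes the file.
wait_for_file() {
  target=$1
  timeout_s=${2:-600}   # hypothetical default of 10 minutes
  waited=0
  while [ ! -e "$target" ]; do
    if [ "$waited" -ge "$timeout_s" ]; then
      echo "Timed out after ${timeout_s}s waiting for $target" >&2
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  echo "Found $target after about ${waited}s"
}

# Example call, with the real paths substituted in by hand:
#   wait_for_file "<baseDir>/output/scanned-document.md" 900
```

A one-second polling interval keeps the loop simple. For very long conversions a longer sleep would reduce needless wake-ups.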

Requirements

  • APIYI_API_KEY: Your OpenAI-compatible API key (environment variable)
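
Because the key is read from the environment, a missing variable may only surface once a conversion is already under way. The guard below is a minimal sketch of a check to run first. The function name require_api_key is hypothetical and is not part of the skill.

```shell
# Sketch: fail fast when the API key is missing from the environment.
# require_api_key is a hypothetical helper, not something the skill ships.
require_api_key() {
  if [ -z "${APIYI_API_KEY:-}" ]; then
    echo "APIYI_API_KEY is not set; export it before running the converter" >&2
    return 1
  fi
}

# Example: only start the conversion when the key is present.
#   require_api_key && node <baseDir>/scripts/convert.mjs "/path/to/document.pdf"
```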

Notes

  • The conversion uses GPT-4o vision to extract text, so it works even with scanned documents
  • Large documents and scanned PDFs can take minutes to process; use the background script for them
  • Output is plain Markdown text
