OpenClaw · Skill

CLI

CLI for speech-to-text transcription via the Speechall API. Supports multiple providers (OpenAI, Deepgram, AssemblyAI, Google, Gemini, Groq, ElevenLabs, Cloudflare, and more).

Coding Agents & IDEs
v0.1.1
VirusTotal: Benign

Install

Start with the primary install command. Alternate entrypoints are included below for ClawHub and OpenClaw CLI users.

Primary command

clawhub install atacan/speechall-cli

ClawHub installer

npx clawhub@latest install atacan/speechall-cli

OpenClaw CLI

openclaw skills install atacan/speechall-cli

Direct OpenClaw install

openclaw install atacan/speechall-cli

What this skill does

CLI for speech-to-text transcription via the Speechall API. Supports multiple providers (OpenAI, Deepgram, AssemblyAI, Google, Gemini, Groq, ElevenLabs, Cloudflare, and more).

Why it matters

Provides access to speech-to-text models from OpenAI, Deepgram, AssemblyAI, Google, and others through a single CLI without separate SDKs or account integrations for each provider.

Typical use cases

  • Transcribing a recorded interview to a text file
  • Generating SRT subtitles for a recorded presentation
  • Identifying speakers in a multi-person meeting recording
  • Processing domain-specific audio with custom terminology
  • Listing available STT models to compare provider options

Source instructions

speechall-cli

Installation

Homebrew (macOS and Linux)

brew install Speechall/tap/speechall

Without Homebrew: Download the binary for your platform from https://github.com/Speechall/speechall-cli/releases and place it on your PATH.

Verify

speechall --version

Authentication

An API key is required. Provide it via environment variable (preferred) or flag:

export SPEECHALL_API_KEY="your-key-here"
# or
speechall --api-key "your-key-here" audio.wav

The user can create an API key at https://speechall.com/console/api-keys.
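
A script that calls the CLI repeatedly can check for the key once, up front. The following is a minimal sketch, assuming the environment-variable route described above is used; the function name require_speechall_key is hypothetical and not part of the CLI.

```shell
# Hypothetical helper (not part of speechall): fail early when no key is exported.
require_speechall_key() {
  if [ -z "${SPEECHALL_API_KEY:-}" ]; then
    echo "SPEECHALL_API_KEY is not set; export it or pass --api-key" >&2
    return 1
  fi
}

# Usage:
#   require_speechall_key && speechall audio.wav
```

The check only confirms that the variable is non-empty; it does not verify that the key is valid.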

Commands

transcribe (default)

Transcribe an audio or video file. This is the default subcommand, so speechall audio.wav is equivalent to speechall transcribe audio.wav.

speechall <file> [options]

Options:

| Flag | Description | Default |
| --- | --- | --- |
| --model <provider.model> | STT model identifier | openai.gpt-4o-mini-transcribe |
| --language <code> | Language code (e.g. en, tr, de) | API default (auto-detect) |
| --output-format <format> | Output format (text, json, verbose_json, srt, vtt) | API default |
| --diarization | Enable speaker diarization | off |
| --speakers-expected <n> | Expected number of speakers (use with --diarization) | |
| --no-punctuation | Disable automatic punctuation | |
| --temperature <0.0-1.0> | Model temperature | |
| --initial-prompt <text> | Text prompt to guide model style | |
| --custom-vocabulary <term> | Terms to boost recognition (repeatable) | |
| --ruleset-id <uuid> | Replacement ruleset UUID | |
| --api-key <key> | API key (overrides SPEECHALL_API_KEY env var) | |

Examples:

# Basic transcription
speechall interview.mp3

# Specific model and language
speechall call.wav --model deepgram.nova-2 --language en

# Speaker diarization with SRT output
speechall meeting.wav --diarization --speakers-expected 3 --output-format srt

# Custom vocabulary for domain-specific terms
speechall medical.wav --custom-vocabulary "myocardial" --custom-vocabulary "infarction"

# Transcribe a video file (macOS extracts audio automatically)
speechall presentation.mp4
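
The single-file examples above extend naturally to a batch job. This is a sketch only: transcribe_dir is a hypothetical helper, the .wav extension and directory layout are illustrative, and it assumes the CLI exits with a non-zero status when a transcription fails, which the documentation above does not state.

```shell
# Hypothetical helper (not part of speechall): write one .srt per .wav file.
# Usage: transcribe_dir <input-dir> <output-dir>
transcribe_dir() {
  in_dir=$1
  out_dir=$2
  mkdir -p "$out_dir" || return 1
  for f in "$in_dir"/*.wav; do
    [ -e "$f" ] || continue            # the glob matched nothing
    name=$(basename "$f" .wav)
    # Uses only the documented flags; stops at the first failing file
    # (assumes a non-zero exit status on failure).
    speechall "$f" --diarization --output-format srt > "$out_dir/$name.srt" || return 1
  done
}
```

Calling transcribe_dir recordings subtitles would leave subtitles/<name>.srt next to each input name. Because the transcript is written to stdout, each run is redirected to its own file.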

models

List available speech-to-text models. Outputs JSON to stdout. Filters combine with AND logic.

speechall models [options]

Filter flags:

| Flag | Description |
| --- | --- |
| --provider <name> | Filter by provider (e.g. openai, deepgram) |
| --language <code> | Filter by supported language (tr matches tr, tr-TR, tr-CY) |
| --diarization | Only models supporting speaker diarization |
| --srt | Only models supporting SRT output |
| --vtt | Only models supporting VTT output |
| --punctuation | Only models supporting automatic punctuation |
| --streamable | Only models supporting real-time streaming |
| --vocabulary | Only models supporting custom vocabulary |

Examples:

# List all available models
speechall models

# Models from a specific provider
speechall models --provider deepgram

# Models that support Turkish and diarization
speechall models --language tr --diarization

# Pipe to jq for specific fields
speechall models --provider openai | jq '.[].identifier'
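
The filtered model list can feed straight into a transcription. The sketch below assumes, as the jq example above suggests, that the output is a JSON array of objects with an identifier field; first_diarization_model is a hypothetical helper, and jq must be installed.

```shell
# Hypothetical helper (not part of speechall): print the identifier of the first
# model that supports the given language and diarization, or nothing if none match.
first_diarization_model() {
  # '// empty' suppresses the literal "null" that jq would print for an empty list.
  speechall models --language "$1" --diarization | jq -r '.[0].identifier // empty'
}

# Usage:
#   model=$(first_diarization_model tr)
#   [ -n "$model" ] && speechall meeting.wav --model "$model" --diarization
```

Taking the first entry is an arbitrary choice for illustration; the order of the returned list is not documented here, so a real script may want to select by provider or name instead.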

Tips

  • On macOS, video files (.mp4, .mov, etc.) are automatically converted to audio before upload.
  • On Linux, pass audio files directly (.wav, .mp3, .m4a, .flac, etc.).
  • Output goes to stdout. Redirect to save: speechall audio.wav > transcript.txt
  • Errors go to stderr, so piping stdout is safe.
  • Run speechall --help, speechall transcribe --help, or speechall models --help to see all valid enum values for model identifiers, language codes, and output formats.
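
Since the transcript and the errors travel on separate streams, a wrapper can store each in its own file. This is a minimal sketch under two assumptions: save_transcript is a hypothetical helper, and the CLI is assumed to exit with a non-zero status on failure, which the tips above do not confirm.

```shell
# Hypothetical helper (not part of speechall): keep a transcript only on success.
# Usage: save_transcript <audio-file> <transcript-file> <error-log>
save_transcript() {
  # stdout (the transcript) goes to a temporary file, stderr to the error log.
  if speechall "$1" > "$2.partial" 2> "$3"; then
    mv "$2.partial" "$2"
  else
    rm -f "$2.partial"
    echo "transcription failed; see $3" >&2
    return 1
  fi
}
```

Writing to a .partial file first means a failed run never leaves a truncated transcript under the final name.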
