OpenClaw · Skill

Vibevoice

Local text-to-speech using Microsoft's VibeVoice model. Generates natural Spanish voice audio, perfect for WhatsApp voice messages.

Web & Frontend Development
v1.0.0
VirusTotal: Suspicious

Install

Start with the primary install command. Alternate entrypoints are included below for ClawHub and OpenClaw CLI users.

Primary command

clawhub install javier887/vibevoice

ClawHub installer

npx clawhub@latest install javier887/vibevoice

OpenClaw CLI

openclaw skills install javier887/vibevoice

Direct OpenClaw install

openclaw install javier887/vibevoice

What this skill does

Local text-to-speech using Microsoft's VibeVoice model. Generates natural Spanish voice audio, perfect for WhatsApp voice messages.

Why it matters

Runs locally with no API key or internet connection, so there are no per-request costs and audio stays on your machine.

Typical use cases

  • Send WhatsApp voice messages without recording your voice
  • Convert long Spanish text to audio for review
  • Generate voice audio files from text scripts
  • Test different Spanish or English voice styles locally
  • Create audio output from text files in batch

Source instructions

VibeVoice TTS

Local text-to-speech using Microsoft's VibeVoice model. Generates natural Spanish voice audio, perfect for WhatsApp voice messages.

Quick Start

# Basic usage
{baseDir}/scripts/vv.sh "Hola, esto es una prueba" -o /tmp/audio.ogg

# From file
{baseDir}/scripts/vv.sh -f texto.txt -o /tmp/audio.ogg

# Different voice
{baseDir}/scripts/vv.sh "Texto" -v en-Wayne -o /tmp/audio.ogg

# Adjust speed (0.5-2.0)
{baseDir}/scripts/vv.sh "Texto" -s 1.2 -o /tmp/audio.ogg

Configuration

SettingDefaultDescription
Voicesp-Spk1_manSpanish male voice (slight Mexican accent)
Speed1.1515% faster than normal
Format.oggOpus codec for WhatsApp

Available Voices

Spanish:

  • sp-Spk1_man - Male, slight Mexican accent (default)

English:

  • en-Wayne - Male
  • en-Denise - Female
  • Other voices in ~/VibeVoice/demo/voices/streaming_model/

Output Formats

  • .ogg - Opus codec (WhatsApp compatible, recommended)
  • .mp3 - MP3 format
  • .wav - Uncompressed WAV

For WhatsApp

Always use .ogg format with asVoice=true in the message tool:

# Generate
{baseDir}/scripts/vv.sh "Tu mensaje aquí" -o /tmp/mensaje.ogg

# Send via message tool
message action=send channel=whatsapp to="+34XXXXXXXXX" filePath=/tmp/mensaje.ogg asVoice=true

Requirements

  • GPU: NVIDIA with ~2GB VRAM
  • VibeVoice: Installed at ~/VibeVoice
  • ffmpeg: For audio conversion
  • Python 3.10+: With torch, torchaudio

Performance

  • RTF: ~0.24x (generates faster than realtime)
  • 1 minute of audio ≈ 15 seconds to generate

Notes

  • First run loads model (~10s), subsequent runs are faster
  • Audio rule: Only send voice if user requests it or speaks via audio
  • Keep text under 1500 chars for best quality

Related OpenClaw skills

Browse all →
Featured slot

Your product here

Reserve this slot to reach operators and coding-agent buyers.

Shown where builders are actively comparing tools and deployment options.

Advertise