OpenClaw · Skill
SSML
Speech generation via Zvukogram API with SSML markup support.
Install
Start with the primary install command. Alternate entrypoints are included below for ClawHub and OpenClaw CLI users.
Primary command
clawhub install erview/zvukogram
ClawHub installer
npx clawhub@latest install erview/zvukogram
OpenClaw CLI
openclaw skills install erview/zvukogram
Direct OpenClaw install
openclaw install erview/zvukogram
What this skill does
Speech generation via Zvukogram API with SSML markup support.
Why it matters
SSML support with per-word stress control and English transcription aliases gives more accurate Russian-language output than basic TTS APIs.
Typical use cases
- Converting written articles to audio files
- Adding voice notifications to automated pipelines
- Recording podcast narration with multiple voices
- Voicing news scripts with proper pronunciation control
- Generating audio from long-form text documents
Source instructions
Zvukogram TTS
Speech generation via Zvukogram API with SSML markup support.
Requirements
To use this skill, you need:
- Zvukogram API token — get it at https://zvukogram.com/
- Zvukogram account email
Setup
Create the config directory:
mkdir -p ~/.config/zvukogram
Then create the file ~/.config/zvukogram/config.json with the following contents:
{
"token": "your_api_token_here",
"email": "your_email@example.com"
}
Or use environment variables:
export ZVUKOGRAM_TOKEN=your_api_token_here
export ZVUKOGRAM_EMAIL=your_email@example.com
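The two configuration options above can be combined in one lookup. The following is a minimal sketch, not the skill's own code: `load_credentials` is a hypothetical helper, and the precedence (environment variables first, then the config file) is an assumption rather than documented behavior.

```python
import json
import os
from pathlib import Path

# Config location taken from the Setup section.
DEFAULT_CONFIG = Path.home() / ".config" / "zvukogram" / "config.json"


def load_credentials(config_path=DEFAULT_CONFIG, env=None):
    """Return (token, email). Hypothetical helper, not part of the skill."""
    env = os.environ if env is None else env
    token = env.get("ZVUKOGRAM_TOKEN")
    email = env.get("ZVUKOGRAM_EMAIL")
    if token and email:
        return token, email
    # Assumption: the config file fills in whatever the environment lacks.
    config_path = Path(config_path)
    if config_path.is_file():
        data = json.loads(config_path.read_text(encoding="utf-8"))
        token = token or data.get("token")
        email = email or data.get("email")
    if not token or not email:
        raise RuntimeError("Zvukogram token and email are not configured")
    return token, email
```

Calling `load_credentials()` with no arguments reads the real environment and the default path; passing `env={}` forces the file to be used, which is convenient in tests.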
Quick Start
# Simple TTS
python3 scripts/tts.py --text "Hello, world!" --voice Алена --output hello.mp3
# With +20% speed
python3 scripts/tts.py --text "Fast text" --voice Алена --speed 1.2 --output fast.mp3
# Check balance
python3 scripts/balance.py
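For pipelines that call the script from Python, the commands above can be assembled programmatically. This is a rough sketch that uses only the flags shown in Quick Start (`--text`, `--voice`, `--speed`, `--output`); `build_tts_command` and `synthesize` are hypothetical names, and the script path is assumed to be relative to the skill directory.

```python
import subprocess
import sys


def build_tts_command(text, voice, output, speed=None, script="scripts/tts.py"):
    """Build the argv list for scripts/tts.py without running it."""
    cmd = [sys.executable, script, "--text", text, "--voice", voice, "--output", output]
    if speed is not None:
        cmd += ["--speed", str(speed)]
    return cmd


def synthesize(text, voice, output, speed=None):
    """Run the script and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(build_tts_command(text, voice, output, speed), check=True)
```

Passing arguments as a list avoids shell quoting problems with Cyrillic voice names and text that contains quotes.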
Features
- TTS generation — text to speech
- SSML support — stress marks, pauses, speed
- Audio merging — combine fragments via ffmpeg
- Transcription — proper pronunciation of English words
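The audio merging feature relies on ffmpeg. How the skill invokes it is not shown here, so the sketch below is one common approach, using ffmpeg's concat demuxer with stream copy. The helper names are hypothetical, and stream copy assumes all fragments share the same codec and parameters.

```python
import subprocess
from pathlib import Path


def write_concat_list(fragments, list_path):
    """Write an ffmpeg concat-demuxer list file, one fragment per line."""
    lines = []
    for fragment in fragments:
        # Inside single quotes, a literal quote is written as '\'' .
        escaped = str(fragment).replace("'", "'\\''")
        lines.append(f"file '{escaped}'")
    list_path = Path(list_path)
    list_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return list_path


def concat_command(list_path, output):
    """Return the ffmpeg argv that joins the listed fragments without re-encoding."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", str(list_path), "-c", "copy", str(output)]


def merge_fragments(fragments, output, list_path="fragments.txt"):
    """Join fragments into one file; requires ffmpeg on PATH."""
    subprocess.run(concat_command(write_concat_list(fragments, list_path), output), check=True)
```

Relative paths in the list file are resolved relative to the list file's own location, so keeping it next to the fragments is the simplest layout.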
SSML Markup
Stress Marks
Put + before the stressed vowel:
З+амок — stress on "a"
зам+ок — stress on "o"
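When stress positions are known by syllable number, the + marker can be inserted programmatically. This is a small illustrative sketch; `mark_stress` is a hypothetical helper and is not part of the skill.

```python
RUSSIAN_VOWELS = set("аеёиоуыэюяАЕЁИОУЫЭЮЯ")


def mark_stress(word, syllable):
    """Insert '+' before the vowel of the given 1-based syllable."""
    seen = 0
    for index, char in enumerate(word):
        if char in RUSSIAN_VOWELS:
            seen += 1
            if seen == syllable:
                return word[:index] + "+" + word[index:]
    raise ValueError(f"{word!r} has fewer than {syllable} vowels")


print(mark_stress("замок", 1))  # з+амок
print(mark_stress("замок", 2))  # зам+ок
```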
Aliases (Transcription)
<sub alias="Оупен Эй Ай">OpenAI</sub>
<sub alias="Самсунг">Samsung</sub>
<sub alias="+Альтман">Альтман</sub>
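Wrapping every English term by hand gets tedious in longer texts. The sketch below applies a small alias dictionary automatically; `apply_aliases` and `ALIASES` are hypothetical names, and the entries are the transcriptions this document lists.

```python
import re
from xml.sax.saxutils import quoteattr

# Transcriptions taken from this document; extend as needed.
ALIASES = {
    "OpenAI": "Оупен Эй Ай",
    "Samsung": "Самсунг",
    "GPT": "Джи Пи Ти",
}


def apply_aliases(text, aliases=ALIASES):
    """Wrap each known term in <sub alias="...">...</sub>, matching whole words only."""
    if not aliases:
        return text
    # Longest terms first so a longer term wins over a shorter one it contains.
    terms = sorted(aliases, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b")

    def wrap(match):
        term = match.group(1)
        return f"<sub alias={quoteattr(aliases[term])}>{term}</sub>"

    return pattern.sub(wrap, text)
```

The function expects plain text: running it over text that already contains `<sub>` tags would wrap the inner terms a second time.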
Speed
<prosody rate="1.2">20% faster</prosody>
<prosody rate="fast">Fast text</prosody>
Pauses
<break time="500ms"/>
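The speed and pause tags above are easy to generate from code. A minimal sketch follows, with hypothetical helper names; it does no XML escaping, on the assumption that the text passed in may already contain other SSML markup.

```python
from xml.sax.saxutils import quoteattr


def prosody(text, rate):
    """Wrap text in <prosody rate="...">; rate may be a number or a keyword such as "fast"."""
    return f"<prosody rate={quoteattr(str(rate))}>{text}</prosody>"


def pause(milliseconds):
    """Return a <break/> tag of the given length in milliseconds."""
    return f'<break time="{int(milliseconds)}ms"/>'


def join_with_pauses(parts, milliseconds=500):
    """Join text fragments with a pause between each pair."""
    return pause(milliseconds).join(parts)


print(join_with_pauses([prosody("Fast text", 1.2), "Normal text"]))
# <prosody rate="1.2">Fast text</prosody><break time="500ms"/>Normal text
```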
Available Voices
- Алена — female, neutral (recommended)
- Андрей — male, neutral (recommended)
- Александра — female, soft
- Антон — male, business
Full list: see references/VOICES.md
Examples
See references/EXAMPLES.md for:
- Dialogs and podcasts
- News voiceover
- Voice notifications
- Long texts
Transcription
See references/TRANSCRIPTION.md for proper pronunciation:
- OpenAI → Оупен Эй Ай
- GPT → Джи Пи Ти
- Samsung → Самсунг
- Altman → +Альтман
SSML Reference
- Full, agent-readable reference (recommended): references/SSML.md
- say-as modes with extra patterns: references/say-as.md
- Pronunciation & transcription patterns (+, <sub>, <phoneme>): references/pronunciation-patterns.md
- Podcast-oriented SSML patterns: references/podcast-examples.md
- Quick lookup: references/SSML_CHEATSHEET.md
- Official Zvukogram SSML docs: https://zvukogram.com/node/ssml/
Troubleshooting
See references/TROUBLESHOOTING.md for:
- API errors
- Audio issues
- Diagnostics
API Reference
- API contract (endpoints, parameters, responses): references/API.md
- Choosing /text vs /longtext vs chunking: references/chunking-and-method-choice.md
API Limits / gotchas
- /text: max 1000 characters per request
- /longtext and /subs: up to 1,000,000 characters
- Multi-voice in API: generate and merge fragments (one request per voice). Do not rely on <voice> wrappers.
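To stay under the 1000-character limit of /text, longer input can be split at sentence boundaries before sending. The sketch below is one possible chunker, not the skill's implementation: `split_for_text_endpoint` is a hypothetical name, and it counts raw characters on the assumption that markup counts toward the limit. It is not SSML-aware, so a hard split could cut through a tag.

```python
import re

TEXT_LIMIT = 1000  # per-request limit for /text, as listed above


def split_for_text_endpoint(text, limit=TEXT_LIMIT):
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?…])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request and the resulting audio fragments merged, which is the same generate-and-merge pattern recommended above for multi-voice output.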
Links
- API docs: https://zvukogram.com/node/api/
- Voice rating: https://zvukogram.com/rating/
- Support: https://t.me/zvukogram