OpenClaw · Skill
Azure AI Transcription Py
Client library for Azure AI Transcription (speech-to-text) with real-time and batch transcription.
Install
Start with the primary install command. Alternate entrypoints are included below for ClawHub and OpenClaw CLI users.
Primary command
clawhub install thegovind/azure-ai-transcription-py
ClawHub installer
npx clawhub@latest install thegovind/azure-ai-transcription-py
OpenClaw CLI
openclaw skills install thegovind/azure-ai-transcription-py
Direct OpenClaw install
openclaw install thegovind/azure-ai-transcription-py
What this skill does
Client library for Azure AI Transcription (speech-to-text) with real-time and batch transcription.
Why it matters
Combines batch and real-time transcription with built-in diarization in a single client, removing the need to stitch together separate Azure services or third-party speaker separation tools.
Typical use cases
- Transcribing recorded meeting audio with speaker labels
- Generating subtitle files from video recordings
- Real-time captioning of live audio streams
- Processing large call center recordings stored in blob storage
- Converting interview audio to searchable text
Source instructions
Azure AI Transcription SDK for Python
Client library for Azure AI Transcription (speech-to-text) with real-time and batch transcription.
Installation
pip install azure-ai-transcription
Environment Variables
TRANSCRIPTION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
TRANSCRIPTION_KEY=<your-key>
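Checking these variables up front gives a clearer error than a failed request later. The following is a minimal sketch; the helper name load_transcription_settings and the https:// check are assumptions for illustration, not part of the SDK.

```python
import os

# The two variable names come from the section above.
REQUIRED_VARS = ("TRANSCRIPTION_ENDPOINT", "TRANSCRIPTION_KEY")


def load_transcription_settings(environ=None):
    """Return (endpoint, key), or raise if a variable is missing or malformed.

    Hypothetical helper: `environ` defaults to os.environ and can be
    replaced with a plain dict in tests.
    """
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    # Normalise a trailing slash so the endpoint compares consistently.
    endpoint = env["TRANSCRIPTION_ENDPOINT"].rstrip("/")
    if not endpoint.startswith("https://"):
        raise ValueError("TRANSCRIPTION_ENDPOINT must be an https:// URL")
    return endpoint, env["TRANSCRIPTION_KEY"]
```

Call it once at startup and pass the returned values to the client constructor.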
Authentication
Use subscription key authentication (DefaultAzureCredential is not supported for this client):
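import os

from azure.ai.transcription import TranscriptionClient
from azure.core.credentials import AzureKeyCredential

# Wrap the key in AzureKeyCredential rather than passing a bare string.
client = TranscriptionClient(
    endpoint=os.environ["TRANSCRIPTION_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["TRANSCRIPTION_KEY"]),
)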
Transcription (Batch)
job = client.begin_transcription(
    name="meeting-transcription",
    locale="en-US",
    content_urls=["https://<storage>/audio.wav"],
    diarization_enabled=True,
)
result = job.result()
print(result.status)
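With diarization enabled, a transcript is usually easier to read once consecutive phrases from the same speaker are merged into turns. The sketch below works on plain dictionaries; the "speaker" and "text" keys are a hypothetical shape chosen for illustration, since this document does not show the SDK's result model, so map the real result fields onto it before use.

```python
from itertools import groupby


def speaker_turns(phrases):
    """Merge consecutive phrases from the same speaker into one turn.

    `phrases` is a list of dicts with hypothetical keys "speaker" and
    "text", in spoken order. Returns a list of (speaker, text) tuples.
    """
    turns = []
    # groupby only groups adjacent items, which is what a "turn" means here.
    for speaker, group in groupby(phrases, key=lambda p: p["speaker"]):
        turns.append((speaker, " ".join(p["text"] for p in group)))
    return turns


# Illustrative data only, not real service output.
sample = [
    {"speaker": 1, "text": "Good morning."},
    {"speaker": 1, "text": "Shall we start?"},
    {"speaker": 2, "text": "Yes, go ahead."},
]

for speaker, text in speaker_turns(sample):
    print(f"Speaker {speaker}: {text}")
```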
Transcription (Real-time)
stream = client.begin_stream_transcription(locale="en-US")
stream.send_audio_file("audio.wav")
for event in stream:
    print(event.text)
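When audio arrives faster than it can be sent, an unbounded buffer grows without limit. One common way to apply backpressure is a bounded queue between the producer and the sender, sketched below with the standard library only. The function pump_chunks and its send callback are hypothetical; in practice send would wrap whatever call the streaming session exposes for pushing audio.

```python
import queue
import threading


def pump_chunks(chunks, send, max_pending=8):
    """Feed audio chunks to `send` through a bounded queue.

    The producer blocks on put() once `max_pending` chunks are waiting,
    so it can never run arbitrarily far ahead of the sender.
    Returns the number of chunks queued.
    """
    pending = queue.Queue(maxsize=max_pending)
    done = object()  # sentinel that tells the worker to stop
    errors = []

    def consume():
        while True:
            item = pending.get()
            if item is done:
                return
            if errors:
                continue  # keep draining so the producer never blocks forever
            try:
                send(item)
            except Exception as exc:
                errors.append(exc)

    worker = threading.Thread(target=consume, daemon=True)
    worker.start()

    queued = 0
    for chunk in chunks:
        pending.put(chunk)  # blocks while the queue is full
        queued += 1
    pending.put(done)
    worker.join()

    if errors:
        raise errors[0]
    return queued
```

A small max_pending keeps latency and memory low; a larger value smooths out bursts at the cost of buffering more audio.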
Best Practices
- Enable diarization when multiple speakers are present
- Use batch transcription for long files stored in blob storage
- Capture timestamps for subtitle generation
- Specify the locale (for example, en-US) to improve recognition accuracy
- Handle streaming backpressure for real-time transcription
- Close transcription sessions when complete
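As an illustration of the timestamp advice above, the sketch below turns captured start and end offsets into SubRip (SRT) subtitle text. It assumes each cue is a (start_ms, end_ms, text) tuple with offsets in milliseconds; that tuple shape and the function names are hypothetical, so convert the offsets your transcription results provide into this form first.

```python
def srt_timestamp(ms):
    """Format a millisecond offset as HH:MM:SS,mmm, the SRT timestamp form."""
    if ms < 0:
        raise ValueError("offset must be non-negative")
    hours, rem = divmod(int(ms), 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(cues):
    """Build SRT text from an iterable of (start_ms, end_ms, text) cues."""
    blocks = []
    # SRT cues are numbered from 1 and separated by a blank line.
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


# Illustrative cues only.
print(to_srt([(0, 1500, "Hello"), (1500, 3000, "World")]))
```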