WatsonX Audio Transcription

Overview

| Property | Details |
|---|---|
| Description | WatsonX audio transcription using Whisper models for speech-to-text |
| Provider Route on LiteLLM | watsonx/ |
| Supported Operations | /v1/audio/transcriptions |
| Link to Provider Doc | IBM WatsonX.ai |

Quick Start

LiteLLM SDK

transcription.py
import litellm

# Open the audio in binary mode; the context manager closes the file afterwards.
with open("audio.mp3", "rb") as audio_file:
    response = litellm.transcription(
        model="watsonx/whisper-large-v3-turbo",
        file=audio_file,
        api_base="https://us-south.ml.cloud.ibm.com",
        api_key="your-api-key",
        project_id="your-project-id",
    )

print(response.text)
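
The snippet above hard-codes its credentials. A minimal sketch of the same call that reads them from the environment instead is shown here, assuming the variable names used in the proxy configuration on this page (WATSONX_APIKEY, WATSONX_URL, WATSONX_PROJECT_ID). The helper names `watsonx_credentials` and `transcribe` are hypothetical and not part of LiteLLM.

```python
import os


def watsonx_credentials(env=os.environ):
    """Map WatsonX environment variables to litellm.transcription keyword arguments."""
    # Variable names are taken from the proxy config on this page (an assumption
    # for SDK use; adjust them to whatever your deployment actually sets).
    names = {
        "api_key": "WATSONX_APIKEY",
        "api_base": "WATSONX_URL",
        "project_id": "WATSONX_PROJECT_ID",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {param: env[var] for param, var in names.items()}


def transcribe(path, model="watsonx/whisper-large-v3-turbo"):
    """Transcribe one audio file using credentials from the environment."""
    import litellm  # imported lazily so the helper above works without litellm installed

    with open(path, "rb") as audio_file:
        return litellm.transcription(model=model, file=audio_file, **watsonx_credentials())


if __name__ == "__main__":
    print(transcribe("audio.mp3").text)
```

Failing early on a missing variable tends to give a clearer error than letting the request reach the provider with an empty key.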

LiteLLM Proxy

config.yaml
model_list:
  - model_name: whisper-large-v3-turbo
    litellm_params:
      model: watsonx/whisper-large-v3-turbo
      api_key: os.environ/WATSONX_APIKEY
      api_base: os.environ/WATSONX_URL
      project_id: os.environ/WATSONX_PROJECT_ID

Request
curl http://localhost:4000/v1/audio/transcriptions \
  -H "Authorization: Bearer sk-1234" \
  -F file="@audio.mp3" \
  -F model="whisper-large-v3-turbo"
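
The same proxy request can also be sketched in Python. This assumes the `openai` package is installed and that the proxy endpoint accepts requests from the OpenAI client, which the /v1/audio/transcriptions path suggests. The environment variable names LITELLM_PROXY_URL and LITELLM_PROXY_KEY and both helper functions are hypothetical; the defaults mirror the curl example.

```python
import os


def proxy_settings(env=os.environ):
    """Return client settings for the proxy, defaulting to the curl example's values."""
    # LITELLM_PROXY_URL and LITELLM_PROXY_KEY are illustrative names, not LiteLLM settings.
    return {
        "base_url": env.get("LITELLM_PROXY_URL", "http://localhost:4000/v1"),
        "api_key": env.get("LITELLM_PROXY_KEY", "sk-1234"),
    }


def transcribe_via_proxy(path, model="whisper-large-v3-turbo"):
    """Send an audio file to the proxy; `model` is the model_name from config.yaml."""
    from openai import OpenAI  # imported lazily; requires the `openai` package

    client = OpenAI(**proxy_settings())
    with open(path, "rb") as audio_file:
        return client.audio.transcriptions.create(model=model, file=audio_file)


if __name__ == "__main__":
    print(transcribe_via_proxy("audio.mp3").text)
```

Note that the client addresses the model by the proxy's model_name (whisper-large-v3-turbo), not by the watsonx/ route, since the proxy resolves the provider from config.yaml.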

Supported Parameters

| Parameter | Type | Description |
|---|---|---|
| model | string | Model ID (e.g., watsonx/whisper-large-v3-turbo) |
| file | file | Audio file to transcribe |
| language | string | Language code (e.g., en) |
| prompt | string | Optional prompt to guide transcription |
| temperature | float | Sampling temperature (0-1) |
| response_format | string | One of json, text, srt, verbose_json, vtt |
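
As a rough illustration of how the optional parameters listed above might be combined, the sketch below validates them locally before passing them to `litellm.transcription`. The helper `build_transcription_options` is hypothetical, the range and format checks simply restate the table, and the environment variable names are borrowed from the proxy configuration on this page.

```python
import os

# Allowed values as listed in the parameter table.
RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}


def build_transcription_options(language=None, prompt=None, temperature=None,
                                response_format=None):
    """Validate the optional parameters and drop any that were not set."""
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    if response_format is not None and response_format not in RESPONSE_FORMATS:
        raise ValueError("response_format must be one of: " + ", ".join(sorted(RESPONSE_FORMATS)))
    options = {
        "language": language,
        "prompt": prompt,
        "temperature": temperature,
        "response_format": response_format,
    }
    return {name: value for name, value in options.items() if value is not None}


if __name__ == "__main__":
    import litellm  # only needed when actually sending the request

    options = build_transcription_options(language="en", temperature=0.0,
                                          response_format="verbose_json")
    with open("audio.mp3", "rb") as audio_file:
        response = litellm.transcription(
            model="watsonx/whisper-large-v3-turbo",
            file=audio_file,
            api_key=os.environ["WATSONX_APIKEY"],
            api_base=os.environ["WATSONX_URL"],
            project_id=os.environ["WATSONX_PROJECT_ID"],
            **options,
        )
    print(response)
```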