Environmental Variables
Configuring the Engine
¶
export ASR_ENGINE=openai_whisper
export ASR_ENGINE=faster_whisper
Configuring the Model
¶
export ASR_MODEL=base
Available ASR_MODELs are tiny
, base
, small
, medium
, large
, large-v1
, large-v2
, large-v3
, turbo
and large-v3-turbo
.
For English-only applications, the .en
models tend to perform better, especially for the tiny.en
and base.en
models. We observed that the difference becomes less significant for the small.en
and medium.en
models.
Configuring the Model Path
¶
export ASR_MODEL_PATH=/data/whisper
Configuring the Model Unloading Timeout
¶
export MODEL_IDLE_TIMEOUT=300
Defaults to 0
. After no activity for this period (in seconds), unload the model until it is requested again. Setting
0
disables the timeout, keeping the model loaded indefinitely.
Configuring the SAMPLE_RATE
¶
export SAMPLE_RATE=16000
Defaults to 16000
. Default sample rate for audio input. 16 kHz
is commonly used in speech-to-text
tasks.