Python Script to Write STT Output to a Text File
To output the Speech-to-Text (STT) transcription into a text file using Vosk and Python, you can modify the AGI script or any standalone Python script to write the transcription to a file after processing the audio.
Here is an example Python script that takes an audio file, transcribes it using Vosk, and then writes the transcription to a text file.
Python Script for STT and Output to a Text File
#!/usr/bin/env python3
import json
import wave

from vosk import Model, KaldiRecognizer

# Path to Vosk model (Hindi model in this case)
model = Model("path_to_vosk_model_hindi")  # Change this path to your model

# Path to the audio file
audio_file = "/tmp/asterisk-call.wav"  # Change this to your audio file location

# Path to the output text file where transcription will be stored
output_file = "/tmp/transcription_output.txt"


def transcribe_audio():
    # Open the WAV audio file (closed automatically when the block ends)
    with wave.open(audio_file, "rb") as wf:
        # Ensure the audio format is compatible (16kHz, mono, 16-bit PCM)
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2 or wf.getframerate() != 16000:
            print("ERROR: Unsupported audio format")
            return

        # Initialize the recognizer with the Vosk model and audio frame rate
        rec = KaldiRecognizer(model, wf.getframerate())

        # Collect the text of each recognized segment
        segments = []

        # Read the audio data and process it in chunks
        while True:
            data = wf.readframes(4000)
            if len(data) == 0:
                break
            if rec.AcceptWaveform(data):
                segments.append(json.loads(rec.Result()).get("text", ""))

        # Flush the recognizer so the last utterance is not lost
        segments.append(json.loads(rec.FinalResult()).get("text", ""))

    # Join non-empty segments with spaces and save them to a text file
    transcription = " ".join(s for s in segments if s)
    with open(output_file, "w", encoding="utf-8") as f:
        f.write(transcription)
    print(f"Transcription saved to {output_file}")


if __name__ == "__main__":
    transcribe_audio()
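Each call to rec.Result() returns a JSON string whose "text" field holds one recognized segment. The following is a minimal sketch of combining several such strings into a single transcription. The helper name join_segments and the sample payloads are hypothetical and only illustrate the idea; they are not real recognizer output.

```python
import json


def join_segments(result_strings):
    """Combine the "text" fields of several Vosk-style JSON results."""
    texts = []
    for raw in result_strings:
        text = json.loads(raw).get("text", "")
        if text:  # skip segments where nothing was recognized
            texts.append(text)
    # Separate segments with a space so words do not run together
    return " ".join(texts)


# Made-up payloads for illustration only, not real recognizer output
sample_results = ['{"text": "hello world"}', '{"text": ""}', '{"text": "good morning"}']
combined = join_segments(sample_results)
print(combined)
```

Joining with a space matters because each segment ends at a pause in the audio; without a separator, the last word of one segment would be fused with the first word of the next.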
Key Components of the Python Script for STT and Output to a Text File:
- Vosk Model:
  - The script loads the Hindi model (vosk-model-small-hi-0.22) or any other Vosk model you’re using.
  - Ensure you specify the correct path to the model on your system.
- Audio File:
  - The script takes an input WAV file (/tmp/asterisk-call.wav).
  - This file should be in the 16kHz, mono, 16-bit PCM format.
- Output File:
  - The transcribed text will be saved to /tmp/transcription_output.txt (you can change this path if needed).
- Error Checking:
  - The script checks if the audio file format is correct for Vosk to process (1 channel, 16kHz, 16-bit PCM).
- Writing Transcription to a File:
  - After transcribing the audio, it writes the resulting text to a specified text file (output_file).
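When the format check fails, it can help to know which property is wrong. The following is a minimal sketch using only the standard wave module. The helper name wav_format_problems and its default parameters are assumptions chosen to match the 16kHz, mono, 16-bit PCM requirement described above; it is not part of Vosk.

```python
import wave


def wav_format_problems(path, rate=16000, channels=1, sample_width=2):
    """Return a list of human-readable mismatches; empty if the file matches."""
    problems = []
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != channels:
            problems.append(f"channels: expected {channels}, got {wf.getnchannels()}")
        if wf.getsampwidth() != sample_width:
            # Sample width is reported in bytes, so 2 means 16-bit audio
            problems.append(f"sample width: expected {sample_width} bytes, got {wf.getsampwidth()}")
        if wf.getframerate() != rate:
            problems.append(f"sample rate: expected {rate} Hz, got {wf.getframerate()}")
    return problems
```

For example, calling wav_format_problems("/tmp/asterisk-call.wav") on an 8kHz recording would return a one-item list naming the sample rate, which tells you the file needs resampling before transcription.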
How to Run:
1. Make sure the required Vosk model is downloaded and the vosk Python package is installed:
pip install vosk
2. Run the script with appropriate permissions:
python3 vosk_transcription_to_file.py
3. The transcription will be saved to /tmp/transcription_output.txt, and you can view the output using any text editor.
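The saved file can also be checked from Python instead of a text editor. The following is a small sketch that assumes the file was written as UTF-8 (the default on most Linux systems), which matters for Hindi text. The helper name read_transcription is hypothetical.

```python
from pathlib import Path


def read_transcription(path):
    """Read a saved transcription as UTF-8 and return (text, word_count)."""
    text = Path(path).read_text(encoding="utf-8").strip()
    # Splitting on whitespace gives a rough word count for a quick sanity check
    return text, len(text.split())
```

Calling read_transcription("/tmp/transcription_output.txt") after a run returns the text together with a rough word count; a count of zero suggests the recognizer produced no output for the audio.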