Python Script to Write STT Output to a Text File

To write the Speech-to-Text (STT) transcription to a text file using Vosk and Python, you can modify the AGI script, or any standalone Python script, so that it writes the transcription to a file after processing the audio.

Here is an example Python script that takes an audio file, transcribes it using Vosk, and then writes the transcription to a text file.

Python Script for STT and Output to a Text File

The script needs Python 3. If it is not installed yet, download and install Python from https://www.python.org/downloads/

#!/usr/bin/env python3
import wave
import json
from vosk import Model, KaldiRecognizer

# Path to Vosk model (Hindi model in this case)
model = Model("path_to_vosk_model_hindi")  # Change this path to your model

# Path to the audio file
audio_file = "/tmp/asterisk-call.wav"  # Change this to your audio file location

# Path to the output text file where transcription will be stored
output_file = "/tmp/transcription_output.txt"

def transcribe_audio():
    # Open the WAV audio file; the context manager closes it when done
    with wave.open(audio_file, "rb") as wf:
        # Ensure the audio format is compatible (16kHz, mono, 16-bit PCM)
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2 or wf.getframerate() != 16000:
            print("ERROR: Unsupported audio format (expected 16kHz, mono, 16-bit PCM)")
            return

        # Initialize the recognizer with the Vosk model and audio frame rate
        rec = KaldiRecognizer(model, wf.getframerate())

        # Collect the text of each recognized utterance
        segments = []

        # Read the audio data and process it in chunks
        while True:
            data = wf.readframes(4000)
            if len(data) == 0:
                break
            if rec.AcceptWaveform(data):
                result_json = json.loads(rec.Result())
                segments.append(result_json.get('text', ''))

        # Flush the recognizer; without this the last utterance is lost
        final_json = json.loads(rec.FinalResult())
        segments.append(final_json.get('text', ''))

    # Join the non-empty utterances with spaces so words do not run together
    transcription = " ".join(s for s in segments if s)

    # Save the transcription as UTF-8 (needed for Hindi and other non-ASCII text)
    with open(output_file, 'w', encoding='utf-8') as f:
        f.write(transcription)

    print(f"Transcription saved to {output_file}")

if __name__ == "__main__":
    transcribe_audio()
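
The result-handling step of the script can be looked at in isolation. Vosk's Result() and FinalResult() each return a JSON string with a "text" field, so the job is to parse each string and join the non-empty pieces. The sketch below is only an illustration: the helper name join_results is hypothetical (it is not part of Vosk), and it assumes you have already collected the JSON strings in a list.

```python
import json


def join_results(result_strings):
    """Combine Vosk-style JSON result strings into one transcription.

    Hypothetical helper: each item is expected to look like '{"text": "..."}'.
    """
    pieces = []
    for raw in result_strings:
        text = json.loads(raw).get("text", "")
        if text:  # skip empty results produced by silence
            pieces.append(text)
    # Separate utterances with a single space so words do not run together
    return " ".join(pieces)
```

Joining with a space matters because each result holds one utterance; plain string concatenation would glue the last word of one utterance to the first word of the next.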

Key Components of the Python Script for STT and Output to a Text File:

  1. Vosk Model:
    • The script loads the Hindi model (vosk-model-small-hi-0.22) or any other Vosk model you’re using.
    • Ensure you specify the correct path to the model on your system.
  2. Audio File:
    • The script takes an input WAV file (/tmp/asterisk-call.wav).
    • This file must be 16 kHz, mono, 16-bit PCM; the script rejects any other format.
  3. Output File:
    • The transcribed text will be saved to /tmp/transcription_output.txt (you can change this path if needed).
  4. Error Checking:
    • The script checks if the audio file format is correct for Vosk to process (1 channel, 16kHz, 16-bit PCM).
  5. Writing Transcription to a File:
    • After transcribing the audio, it writes the resulting text to a specified text file (output_file).
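
The format check in item 4 only reports that something is wrong, not what. As a rough sketch, the same check can be written so that it names each mismatch, using only the standard wave module. The function name wav_format_problems and its default parameters are assumptions chosen for this example; they mirror the 16 kHz, mono, 16-bit PCM requirement described above.

```python
import wave


def wav_format_problems(path, rate=16000, channels=1, sample_width=2):
    """Return a list of format mismatches; an empty list means the file is usable.

    Hypothetical helper. sample_width is in bytes, so 2 means 16-bit samples.
    """
    problems = []
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != channels:
            problems.append(f"channels: expected {channels}, got {wf.getnchannels()}")
        if wf.getsampwidth() != sample_width:
            problems.append(
                f"sample width: expected {sample_width} bytes, got {wf.getsampwidth()}"
            )
        if wf.getframerate() != rate:
            problems.append(f"sample rate: expected {rate} Hz, got {wf.getframerate()}")
    return problems
```

Calling wav_format_problems("/tmp/asterisk-call.wav") before transcribing and printing the returned list shows at a glance whether the recording needs to be resampled or downmixed first.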

How to Run:

  1. Make sure the required Vosk model is downloaded and the vosk Python package is installed:

pip install vosk

  2. Save the script as vosk_transcription_to_file.py and run it with appropriate permissions:

python3 vosk_transcription_to_file.py

  3. The transcription will be saved to /tmp/transcription_output.txt, and you can view the output using any text editor.
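
If you would rather check the result from Python than open an editor, a small reader is enough. This is a minimal sketch: the function name read_transcription is hypothetical, and it assumes the file is UTF-8 encoded, which is the usual default on Linux and matters for Hindi text.

```python
from pathlib import Path


def read_transcription(path="/tmp/transcription_output.txt"):
    """Return the saved transcription, or None if the file is missing or empty."""
    p = Path(path)
    if not p.is_file():
        return None
    # Decode explicitly as UTF-8 so non-ASCII text is read correctly
    text = p.read_text(encoding="utf-8").strip()
    return text or None
```

A None result usually means the script stopped at the audio format check or the recording contained no recognizable speech.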
