Speech to Text Converter

Abstract

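Speech to Text Converter is a Python project that transcribes speech captured from a microphone. It uses the SpeechRecognition library, whose recognize_google method sends the recorded audio to Google's Web Speech API and returns the transcript. The application demonstrates voice recognition, basic error handling, and a simple CLI.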

Prerequisites

  • Python 3.8 or above
  • A code editor or IDE
  • Basic understanding of speech recognition and NLP
  • Required libraries: SpeechRecognition, pyaudio (needed for microphone access), gtts (installed here, but not imported by the code on this page)

Before you Start

Install Python and the required libraries:

Install dependencies
pip install SpeechRecognition pyaudio gtts
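
Before moving on, it can help to confirm that the packages are importable. The following is a minimal sketch and not part of the original project: check_dependencies is a hypothetical helper name, and the mapping assumes the usual import names (speech_recognition, pyaudio, gtts) for the packages installed above.

```python
from importlib.util import find_spec

# pip package name -> module name used in an import statement (assumed mapping).
REQUIRED_MODULES = {
    "SpeechRecognition": "speech_recognition",
    "pyaudio": "pyaudio",
    "gtts": "gtts",
}


def check_dependencies():
    """Return {pip_name: bool} telling whether each module can be located."""
    return {
        pip_name: find_spec(module) is not None
        for pip_name, module in REQUIRED_MODULES.items()
    }


# Print a short report; nothing is imported, only looked up.
for name, installed in check_dependencies().items():
    print(f"{name}: {'ok' if installed else 'missing'}")
```

Any package reported as missing can be installed again with the pip command above.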

Getting Started

Create a Project

  1. Create a folder named speech-to-text-converter.
  2. Open the folder in your code editor or IDE.
  3. Create a file named speech_to_text_converter.py.
  4. Copy the code below into your file.

Write the Code

Speech to Text Converter
import speech_recognition as sr
 
class SpeechToTextConverter:
    def __init__(self):
        self.recognizer = sr.Recognizer()
 
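    def convert(self):
        """Record one phrase from the default microphone and print the transcript."""
        with sr.Microphone() as source:
            print("Say something...")
            audio = self.recognizer.listen(source)
        # The microphone is released before recognition starts.
        try:
            # recognize_google sends the audio to Google's Web Speech API (needs internet).
            text = self.recognizer.recognize_google(audio)
            print(f"Recognized: {text}")
        except sr.UnknownValueError:
            print("Error: could not understand the audio")
        except sr.RequestError as e:
            print(f"Error: recognition request failed: {e}")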
 
    def demo(self):
        self.convert()
 
if __name__ == "__main__":
    print("Speech to Text Converter Demo")
    converter = SpeechToTextConverter()
    # converter.demo()  # Uncomment to record from the microphone; as written, only the banner is printed
 

Example Usage

Run speech to text
python speech_to_text_converter.py

Explanation

Key Features

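  • Voice Recognition: Transcribes microphone audio with the SpeechRecognition library's recognize_google method.
  • Text Processing: Returns the transcript as a plain string, ready for further cleaning or processing.
  • Error Handling: Catches recognition failures and reports them instead of crashing.
  • CLI Interface: Interactive prompt with convert and exit commands.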
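
The listing above prints the transcript exactly as the recognizer returns it, so the text-processing step is not shown. A minimal sketch of what such a cleanup step might look like follows; clean_transcript is a hypothetical helper, and the rules it applies (collapsing whitespace, capitalizing the first letter) are assumptions chosen for illustration.

```python
import re


def clean_transcript(text: str) -> str:
    """Collapse runs of whitespace and capitalize the first character."""
    collapsed = re.sub(r"\s+", " ", text).strip()
    if not collapsed:
        return ""
    # Only the first character is changed; the rest keeps the recognizer's casing.
    return collapsed[0].upper() + collapsed[1:]


print(clean_transcript("  hello   world \n"))
```

A helper like this could be applied to the recognized text before it is printed or returned.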

Code Breakdown

  1. Import Libraries and Setup Converter
speech_to_text_converter.py
import speech_recognition as sr
  2. Speech Recognition and Text Processing Functions
speech_to_text_converter.py
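def speech_to_text():
    """Record one phrase from the microphone and return its transcript ("" on failure)."""
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Speak now...")
        audio = r.listen(source)
    try:
        # Sends the recording to Google's Web Speech API; requires internet access.
        return r.recognize_google(audio)
    except sr.UnknownValueError:
        print("Error: could not understand the audio")
    except sr.RequestError as e:
        print(f"Error: recognition request failed: {e}")
    return ""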
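
The recording step above waits indefinitely for speech and does not account for background noise. One possible refinement is sketched below, using the library's adjust_for_ambient_noise method and the timeout and phrase_time_limit arguments of listen. The function name capture_phrase and the default values are assumptions. The recognizer and source are passed in so the function can be tried with stand-ins when no microphone is attached.

```python
def capture_phrase(recognizer, source, timeout=5.0, phrase_time_limit=10.0):
    """Calibrate for background noise, then record a single phrase.

    recognizer: an object that behaves like sr.Recognizer.
    source:     an opened audio source such as sr.Microphone.
    """
    # Sample the background for half a second to set the energy threshold.
    recognizer.adjust_for_ambient_noise(source, duration=0.5)
    # timeout: seconds to wait for speech to begin.
    # phrase_time_limit: maximum seconds recorded for one phrase.
    return recognizer.listen(
        source, timeout=timeout, phrase_time_limit=phrase_time_limit
    )


# Intended use with the real library (requires a microphone):
#   import speech_recognition as sr
#   r = sr.Recognizer()
#   with sr.Microphone() as source:
#       audio = capture_phrase(r, source)
```

With a timeout set, listen raises sr.WaitTimeoutError when no speech starts in time, so a caller would likely want to catch that exception as well.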
  3. CLI Interface and Error Handling
speech_to_text_converter.py
def main():
    print("Speech to Text Converter")
    while True:
        cmd = input('> ')
        if cmd == 'convert':
            print(speech_to_text())
        elif cmd == 'exit':
            break
        else:
            print("Unknown command. Type 'convert' or 'exit'.")
 
if __name__ == "__main__":
    main()
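
The loop above is tied to input(), print(), and the microphone, which makes it hard to exercise automatically. A sketch of the same loop with those pieces injected is shown below. The name run_cli and its parameters are hypothetical, and treating end of input like exit is an added assumption rather than behavior of the original.

```python
def run_cli(transcribe, read=input, write=print):
    """Run the convert/exit loop with injectable I/O.

    transcribe: zero-argument callable returning a transcript
                (speech_to_text in the real program).
    read/write: stand-ins for input() and print().
    """
    write("Speech to Text Converter")
    while True:
        try:
            # Normalize so "Convert " and "convert" are treated alike.
            cmd = read("> ").strip().lower()
        except EOFError:
            # Treat end of input (for example Ctrl-D) like 'exit'.
            break
        if cmd == "convert":
            write(transcribe())
        elif cmd == "exit":
            break
        else:
            write("Unknown command. Type 'convert' or 'exit'.")
```

In the real program this would be called as run_cli(speech_to_text). In a test, a lambda returning a fixed string can stand in for the microphone.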

Features

  • Speech to Text: Voice recognition and text processing
  • Modular Design: Separate functions for each task
  • Error Handling: Manages invalid inputs and exceptions
  • Production-Ready: Scalable and maintainable code

Next Steps

Enhance the project by:

  • Integrating with advanced speech models
  • Supporting multiple languages
  • Creating a GUI for conversion
  • Adding real-time transcription
  • Unit testing for reliability
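
Two of these steps, multiple languages and unit testing, can be sketched together. The example below assumes a hypothetical transcribe wrapper that forwards a language tag to recognize_google, which accepts a language argument, and tests it with a mock recognizer so that no microphone or network access is needed.

```python
import unittest
from unittest.mock import Mock


def transcribe(recognizer, audio, language="en-US"):
    """Return the transcript of `audio` for the given language tag.

    `recognizer` is expected to behave like sr.Recognizer; the tag is a
    code such as "en-US" or "fr-FR".
    """
    return recognizer.recognize_google(audio, language=language)


class TranscribeTests(unittest.TestCase):
    def test_passes_language_to_recognizer(self):
        recognizer = Mock()
        recognizer.recognize_google.return_value = "bonjour"
        self.assertEqual(
            transcribe(recognizer, b"audio", language="fr-FR"), "bonjour"
        )
        recognizer.recognize_google.assert_called_once_with(
            b"audio", language="fr-FR"
        )

    def test_defaults_to_us_english(self):
        recognizer = Mock()
        transcribe(recognizer, b"audio")
        recognizer.recognize_google.assert_called_once_with(
            b"audio", language="en-US"
        )
```

Saved in a file such as test_transcribe.py (a hypothetical name), the tests can be run with python -m unittest.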

Educational Value

This project teaches:

  • NLP: Speech recognition and text processing
  • Software Design: Modular, maintainable code
  • Error Handling: Writing robust Python code

Real-World Applications

  • Accessibility Tools
  • Voice Assistants
  • AI Platforms

Conclusion

Speech to Text Converter demonstrates how to build a small speech-to-text tool in Python with the SpeechRecognition library. Its modular design makes it easy to extend and adapt for real-world applications in accessibility, AI, and more. For more advanced projects, visit Python Central Hub.
