Python: get text-to-speech voice audio data

Question

I need your help. At the moment I use "engine.say()" from "pyttsx3" so that my program "speaks" to me. That already works, but now I want an audio visualizer for this voice. How can I do that?

Example

import pyttsx3

engine = pyttsx3.init()

engine.say("Hello World")
engine.runAndWait()

And I want something like this.

What I have

import pyaudio
import matplotlib.pyplot as plt
import numpy as np

mic = pyaudio.PyAudio()
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 5000
CHUNK = 3000  # samples per read; int(RATE / 20) would give 50 ms chunks
# Input-only stream: the samples are plotted, not played back.
stream = mic.open(format=FORMAT, channels=CHANNELS, rate=RATE, input=True, frames_per_buffer=CHUNK)

plt.ion()  # interactive mode, so the figure updates without blocking
fig, ax = plt.subplots(figsize=(14, 6))
x = np.arange(CHUNK)  # one x position per sample, matching the x limits below
ax.set_ylim(-500, 500)
ax.set_xlim(0, CHUNK)
line, = ax.plot(x, np.zeros(CHUNK))

while True:
    # Read one chunk; do not raise if drawing falls behind the input buffer.
    data = stream.read(CHUNK, exception_on_overflow=False)
    data = np.frombuffer(data, np.int16)
    line.set_ydata(data)
    fig.canvas.draw()
    fig.canvas.flush_events()
    plt.pause(0.01)

This already visualizes my microphone audio, but how can I make the voice the source? I hope you can help me. Thank you very much!

AzyCrw4282 · Accepted Answer

I think what you need is Realtime_PyAudio_FFT.

One of its benefits is that it:

Starts a stream_reader that pulls live audio data from any source using PyAudio (soundcard, microphone, ...)

Since you'll be playing the audio with pyttsx3, it can capture the audio from the soundcard and show a live visualization. For your case, this is a far better option than picking the voice up through the microphone.
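To make "from the soundcard" more concrete, here is a minimal sketch of how a loopback input device might be located with plain PyAudio. It is not how Realtime_PyAudio_FFT selects its source. The helper names find_loopback_device and list_devices, and the name fragments in LOOPBACK_HINTS, are assumptions made for illustration; whether such a device exists at all, and what it is called, depends on the operating system and audio drivers.

```python
# Hypothetical name fragments that often mark a loopback ("what you hear") input.
# These are assumptions; real device names depend on the OS and drivers.
LOOPBACK_HINTS = ("stereo mix", "monitor", "loopback")


def find_loopback_device(devices, hints=LOOPBACK_HINTS):
    """Return the index of the first input-capable device whose name
    contains one of the hints, or None if no device matches."""
    for info in devices:
        name = str(info.get("name", "")).lower()
        has_input = info.get("maxInputChannels", 0) > 0
        if has_input and any(hint in name for hint in hints):
            return info["index"]
    return None


def list_devices():
    """Collect PyAudio's device info dicts.

    pyaudio is imported here, not at module level, so the matching
    logic above can be tried on a machine without audio hardware.
    """
    import pyaudio

    pa = pyaudio.PyAudio()
    try:
        return [pa.get_device_info_by_index(i) for i in range(pa.get_device_count())]
    finally:
        pa.terminate()
```

If find_loopback_device(list_devices()) returns an index, it can be passed as input_device_index to the mic.open(...) call from the question, so that the existing plot reads the speaker output instead of the microphone. If it returns None, a loopback device probably has to be enabled or installed first.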

Also, if the speech and the visualization run on the same thread, you may need threading or multiprocessing so that the plot can update while the voice is playing. Here's a good guide on that: threading or multiprocessing.
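A minimal sketch of the threading idea, assuming the speech call is moved off the main thread so that the plotting loop can keep running. The helpers speak and start_background are hypothetical names, not part of pyttsx3. On some platforms the speech driver may not behave well outside the main thread; in that case multiprocessing is the alternative.

```python
import threading


def speak(text):
    """Speak the text with pyttsx3 and block until it has finished.

    pyttsx3 is imported here so the threading helper below can be
    tried without a speech engine installed.
    """
    import pyttsx3

    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()


def start_background(target, *args):
    """Run target(*args) in a daemon thread and return the thread."""
    worker = threading.Thread(target=target, args=args, daemon=True)
    worker.start()
    return worker
```

With this, worker = start_background(speak, "Hello World") returns immediately, and the drawing loop from the question can run as "while worker.is_alive():" instead of "while True:", so the plot stops once the voice has finished.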
