user7519553

Python Speech Recognition response very slow

I have a script that uses the IBM Speech to Text API through the speech_recognition library in Python. It is very slow: it takes about 5 seconds to respond to me. I suspect the while True loop is burning CPU, but I am not sure.

Here is the code:

import speech_recognition as sr
import pyttsx
import time
import requests

engine = pyttsx.init()

time_check = time.strftime("%H")
time_check = int(time_check)

# current weather

url = "http://api.openweathermap.org/data/2.5/weather?lat=59.13&lon=10.22&APPID=8f605c186309e3d8f60bb7b2f31ba75c&units=metric"

r = requests.get(url)
response_dict = r.json()

#daily forecast

url_daily = "http://api.openweathermap.org/data/2.5/forecast/daily?q=Sandefjord&mode=JSON&units=metric&cnt=7&appid=8f605c186309e3d8f60bb7b2f31ba75c"

r_daily = requests.get(url_daily)
response_dict_daily = r_daily.json()
days = response_dict_daily["list"]
tomorrow = days[1]





#weather variables

main = response_dict["main"]

wind = response_dict["wind"]

description = response_dict["weather"]



url_1 = "https://newsapi.org/v1/articles?source=techcrunch&apiKey=d35bd4b8699b444fbcd3661e39c0bf49"

f = requests.get(url_1)
response = f.json()
articles = response["articles"]
article_1 = articles[0]
article_2 = articles[1]
article_3 = articles[2]
article_4 = articles[3]

title_1 = article_1["title"]
title_2 = article_2["title"]
title_3 = article_3["title"]
title_4 = article_4["title"]





while True:

    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Say something!")
        audio = r.listen(source)

    try:
        if "hello" in r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD):
            if time_check < 9:
                engine.say("Good morning sir, did you sleep well?")
                engine.runAndWait()
            elif time_check < 16:
                engine.say("Good afternoon sir")
                engine.runAndWait()
            else:
                engine.say("Good evening sir")
                engine.runAndWait()

        if "update" in r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD):
            print("Initializing news listing...")
            engine.say(title_1)
            engine.say(title_2)
            engine.say(title_3)
            engine.say(title_4)
            engine.runAndWait()

        if "time" in r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD):
            time_say = time.strftime("%H:%M")
            engine.say(time_say)
            engine.runAndWait()

        if "date" in r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD):
            time_saydate = time.strftime("%B %d")
            engine.say(time_saydate)
            engine.runAndWait()

        if "weather" in r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD):
            temp = main["temp"]
            windspeed = wind["speed"]
            # "weather" is a list; its first entry is the primary condition
            current = description[0]
            umbrella = current["main"]
            engine.say("Here is a quick overview of current weather. The temperature is at " + str(temp) + " degrees. The wind speed is at " + str(windspeed) + " meters per second. Overall it is " + current["description"])
            engine.runAndWait()

            if umbrella == "Rain":
                engine.say("I recommend you bring an umbrella sir, it is raining")
                engine.runAndWait()

        if "tomorrow" in r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD):
            temp = tomorrow["temp"]
            temp_1 = temp["day"]
            temp_1 = int(temp_1)
            desc = tomorrow["weather"]
            desc_1 = desc[0]
            desc_2 = desc_1["description"]

            engine.say("The weather for tomorrow is " + str(temp_1) + " degrees. The description is " + desc_2)
            engine.runAndWait()

            # descriptions can be phrases such as "light rain", so match on the word
            if "rain" in desc_2.lower():
                engine.say("It is going to rain, I would recommend bringing an umbrella")
                engine.runAndWait()

        print("Watson thinks you said " + r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD))
    except sr.UnknownValueError:
        print("IBM Speech to Text could not understand audio")
        engine.say("Couldn't recognize, please try again sir")
        engine.runAndWait()
    except sr.RequestError as e:
        print("Could not request results from IBM Speech to Text service; {0}".format(e))

Upvotes: 2

Views: 6067

Answers (1)

Nikolay Shmyrev

Reputation: 25210

The fundamental problem is that you record the audio first and only then send it to the server. Recording takes time, and then sending the data and waiting for the response takes more time on top of that.
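A related cost is visible in the question's loop: every keyword check calls r.recognize_ibm again, and each call is a separate HTTP round trip for the same audio. The following is a minimal sketch, not the asker's code, of transcribing once and dispatching on the result. The names handle_utterance, transcribe and handlers are hypothetical; transcribe stands in for whatever wraps the real recognizer call.

```python
def handle_utterance(transcribe, audio, handlers):
    """Transcribe `audio` once, then run every handler whose keyword appears.

    `transcribe` is a hypothetical callable taking the audio and returning text;
    `handlers` maps a keyword to a zero-argument function.
    """
    text = transcribe(audio)  # the only network round trip per utterance
    fired = []
    for keyword, handler in handlers.items():
        if keyword in text:
            handler()
            fired.append(keyword)
    return text, fired


if __name__ == "__main__":
    calls = []

    def fake_transcribe(audio):
        # Stand-in for the real recognizer; records how often it is called.
        calls.append(audio)
        return "what is the time and date"

    spoken = []
    handlers = {
        "hello": lambda: spoken.append("greeting"),
        "time": lambda: spoken.append("time"),
        "date": lambda: spoken.append("date"),
    }
    text, fired = handle_utterance(fake_transcribe, b"pcm", handlers)
    print(text, fired, len(calls))
```

In the question's script, transcribe could be a small lambda around r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD), so the service is contacted once per utterance instead of once per keyword. This reduces the number of requests but does not remove the record-then-upload delay described above.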

If you want good interaction, you need to stream the audio to the server over WebSockets instead of sending it with HTTP requests. That way the server decodes while you are still speaking, and by the time the recording ends it can already send back a decoding result.

The speech_recognition library is not well designed for this; you should use the IBM interfaces directly, together with the Python websocket module. A Python example is here:

https://github.com/watson-developer-cloud/speech-to-text-websockets-python
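To make the streaming idea concrete, here is a rough sketch of the order of messages on such a connection: a JSON "start" message, then raw audio in small binary chunks while recording continues, then a JSON "stop" message. The message fields used here ("action", "content-type", "interim_results") are my understanding of the IBM WebSocket interface and should be checked against the linked example. The function names, and the send_text and send_binary callables that would wrap a real websocket client, are hypothetical.

```python
import json


def watson_stream_messages(pcm_chunks, rate=16000):
    """Yield (is_binary, payload) pairs in the order they would be sent.

    Assumption: the service expects a JSON start message, binary audio
    frames, and a JSON stop message, as in the linked example.
    """
    start = {
        "action": "start",
        "content-type": "audio/l16;rate=%d" % rate,
        "interim_results": True,
    }
    yield False, json.dumps(start)
    for chunk in pcm_chunks:
        if chunk:  # skip empty reads from the audio source
            yield True, chunk
    yield False, json.dumps({"action": "stop"})


def chunked(data, size):
    """Split a bytes object into pieces of at most `size` bytes."""
    for i in range(0, len(data), size):
        yield data[i:i + size]


def stream(send_text, send_binary, pcm_chunks, rate=16000):
    """Push the message sequence through two hypothetical send callables."""
    for is_binary, payload in watson_stream_messages(pcm_chunks, rate):
        if is_binary:
            send_binary(payload)
        else:
            send_text(payload)


if __name__ == "__main__":
    # Dry run: print what would be sent instead of opening a connection.
    stream(print, lambda b: print(len(b), "bytes"), chunked(b"\x00" * 10, 4))
```

The point of the design is that pcm_chunks can be a generator reading from the microphone, so each chunk is sent as soon as it is captured instead of after the whole utterance has been recorded.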

Upvotes: 2
