Python speech recognition responds very slowly

2024-10-01 · haluo1

I have a script that uses the IBM Speech to Text API through Python's speech_recognition library. It is very slow: it takes about 5 seconds to reply to me. I suspect the while True loop is burning CPU, but I am not sure.

Here is the code:

import speech_recognition as sr 
import pyttsx 
import time 
import requests 
 
engine = pyttsx.init() 
 
time_check = time.strftime("%H") 
time_check = int(time_check) 
 
# current weather 
 
url = "http://api.openweathermap.org/data/2.5/weather?lat=59.13&lon=10.22&APPID=8f605c186309e3d8f60bb7b2f31ba75c&units=metric"
 
r = requests.get(url) 
response_dict = r.json() 
 
#daily forecast 
 
url_daily = "http://api.openweathermap.org/data/2.5/forecast/daily?q=Sandefjord&mode=JSON&units=metric&cnt=7&appid=8f605c186309e3d8f60bb7b2f31ba75c"
 
r_daily = requests.get(url_daily) 
response_dict_daily = r_daily.json() 
days = response_dict_daily["list"] 
tomorrow = days[1] 
 
 
 
 
 
#weather variables 
 
main = response_dict["main"] 
 
wind = response_dict["wind"] 
 
description = response_dict["weather"] 
 
 
 
url_1 = "https://newsapi.org/v1/articles?source=techcrunch&apiKey=d35bd4b8699b444fbcd3661e39c0bf49" 
 
f = requests.get(url_1) 
response = f.json() 
articles = response["articles"] 
article_1 = articles[0] 
article_2 = articles[1] 
article_3 = articles[2] 
article_4 = articles[3] 
 
title_1 = article_1["title"] 
title_2 = article_2["title"] 
title_3 = article_3["title"] 
title_4 = article_4["title"] 
 
 
 
 
 
while True:

    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Say something!")
        audio = r.listen(source)

    # IBM_USERNAME and IBM_PASSWORD are not defined in the post;
    # they are assumed to be set elsewhere.
    try:
        if "hello" in r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD):
            if time_check < 9:
                engine.say("Good morning sir, did you sleep well?")
                engine.runAndWait()
            elif time_check < 16:
                engine.say("Good afternoon sir")
                engine.runAndWait()
            else:
                engine.say("Good evening sir")
                engine.runAndWait()

        if "update" in r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD):
            print("Initializing news listing...")
            engine.say(title_1)
            engine.say(title_2)
            engine.say(title_3)
            engine.say(title_4)
            engine.runAndWait()

        if "time" in r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD):
            time_say = time.strftime("%H:%M")
            engine.say(time_say)
            engine.runAndWait()

        if "date" in r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD):
            time_saydate = time.strftime("%B %d")
            engine.say(time_saydate)
            engine.runAndWait()

        if "weather" in r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD):
            temp = main["temp"]
            windspeed = wind["speed"]
            # "weather" is a list of condition dicts; use the first entry.
            current = description[0]
            # Numbers must be converted before concatenating with strings.
            engine.say("Here is a quick overview of current weather. The temperature is at " + str(temp) + " degrees. The wind speed is at " + str(windspeed) + " meters per second. Overall it is " + current["description"])

            if current["main"] == "Rain":
                engine.say("I recommend you bring an umbrella sir, it is raining")
            engine.runAndWait()

        if "tomorrow" in r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD):
            temp = tomorrow["temp"]
            temp_1 = int(temp["day"])
            desc = tomorrow["weather"]
            desc_1 = desc[0]
            desc_2 = desc_1["description"]

            engine.say("The weather for tomorrow is " + str(temp_1) + " degrees. The description is " + desc_2)
            engine.runAndWait()

            # Descriptions look like "light rain", so test for a substring.
            if "rain" in desc_2.lower():
                engine.say("It is going to rain, I would recommend bringing an umbrella")
                engine.runAndWait()

        print("Watson thinks you said " + r.recognize_ibm(audio, username=IBM_USERNAME, password=IBM_PASSWORD))
    except sr.UnknownValueError:
        print("IBM Speech to Text could not understand audio")
        engine.say("Couldn't recognize, please try again sir")
        engine.runAndWait()
    except sr.RequestError as e:
        print("Could not request results from IBM Speech to Text service; {0}".format(e))

Please consider the following approach:

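Recording the audio first and then sending it to the server is fundamentally the wrong approach. Recording takes time, and then uploading the data and waiting for the response takes time as well.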
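The script in the question pays that round-trip cost several times per utterance, because each `recognize_ibm` call uploads the audio and waits for a reply, and the loop makes one such call per keyword. Below is a minimal sketch of transcribing once and then matching keywords locally. `pick_commands`, `KEYWORDS`, and `fake_transcribe` are hypothetical names for illustration, not part of speech_recognition, and the stand-in recognizer exists only so the sketch runs offline:

```python
# Keywords taken from the script in the question.
KEYWORDS = ("hello", "update", "time", "date", "weather", "tomorrow")


def pick_commands(transcribe, audio, keywords=KEYWORDS):
    """Call the (slow) recognizer exactly once, then match keywords locally.

    `transcribe` is any callable mapping audio to text. In the question's
    script it could be something like:
    lambda a: r.recognize_ibm(a, username=IBM_USERNAME, password=IBM_PASSWORD)
    """
    text = transcribe(audio).lower()  # the only network round trip
    return text, [word for word in keywords if word in text]


# Stand-in recognizer that records its calls (an assumption for the demo).
calls = []


def fake_transcribe(audio):
    calls.append(audio)
    return "Hello what is the weather tomorrow"


text, commands = pick_commands(fake_transcribe, b"raw-audio")
print(commands)    # keywords found in the single transcript
print(len(calls))  # how many times the recognizer was invoked
```

This does not remove the record-then-upload delay itself, but it avoids repeating that delay for every `if` branch.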

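If you want good interactivity, you need to stream the audio to the server over WebSockets instead of sending it in an HTTP request. The server then sends back the decoded result as soon as the recording ends.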

The speech_recognition library is not well designed for this; you should use the IBM interface directly together with a Python websocket module. A Python example is here:

https://github.com/watson-developer-cloud/speech-to-text-websockets-python
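To illustrate the streaming idea, here is a rough sketch of the message sequence. It assumes the Watson Speech to Text WebSocket convention of a JSON `start` message, binary audio frames, and a JSON `stop` message. The `send_text` and `send_binary` callables are placeholders for whatever websocket client is used, and the chunk size and content type are illustrative assumptions. The linked repository remains the reference for the real connection and authentication details:

```python
import json


def chunk_audio(pcm_bytes, chunk_size=4096):
    """Yield fixed-size slices of raw audio so they can be sent as they arrive."""
    for start in range(0, len(pcm_bytes), chunk_size):
        yield pcm_bytes[start:start + chunk_size]


def stream_utterance(send_text, send_binary, chunks,
                     content_type="audio/l16;rate=16000"):
    """Send one utterance: a start message, audio frames, then a stop message.

    send_text / send_binary are hypothetical hooks onto an open websocket.
    """
    send_text(json.dumps({"action": "start", "content-type": content_type}))
    sent = 0
    for chunk in chunks:
        send_binary(chunk)  # the server can begin decoding before recording ends
        sent += 1
    send_text(json.dumps({"action": "stop"}))
    return sent


# Offline demonstration: collect the messages in a list instead of a socket.
outbox = []
frames = stream_utterance(outbox.append, outbox.append,
                          chunk_audio(b"\x00" * 10000))
print(frames, len(outbox))
```

In a real client the chunks would come from the microphone while the user is still speaking, so most of the upload overlaps with the recording rather than starting after it.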