Reputation: 61
I'm trying to run speech-to-text recognition on a .wav file I have, using Google, Google_Cloud, and Houndify.
I've noticed that the latter two return profanities as spoken, but the Google speech recognizer censors them, for example f***, s***.
This is a problem for me because I want to run a sentiment analysis with LIWC, and the program assigns no profanity weight to censored words like f***.
I've tried both of the following.
(1) Turning profanity filter off
recognizer_instance.recognize_google(audio_data: AudioData, key: Union[str, None] = None, language: str = "en-US", pfilter: Union[0, 1], show_all: bool = False) -> Union[str, Dict[str, Any]]
https://github.com/Uberi/speech_recognition/blob/master/reference/library-reference.rst
(2) Remove profanity censor from Google Speech Recognition
But neither of them solved the problem:
r.recognize_google(example_audio)
---> what the f*** is wrong with you
But then,
r.recognize_google(example_audio, pfilter=0)
Gives
TypeError Traceback (most recent call last)
<ipython-input-21-b158a03c879c> in <module>
----> 1 r.recognize_google(example_audio, pfilter=0)
TypeError: recognize_google() got an unexpected keyword argument 'pfilter'
How should I solve this problem?
I know that many solutions on Stack Overflow refer to the recognizer for the Google Cloud API. I already have Google_Cloud (r.recognize_google_cloud) working, so I want a solution for recognize_google, not Google Cloud. I want to compare the results.
Upvotes: 6
Views: 1281
Reputation: 1
You need to open __init__.py (in the speech_recognition package),
find
def recognize_google(self, audio_data, key=None, language="en-US", show_all=False):
and edit to
def recognize_google(self, audio_data, key=None, language="en-US", show_all=False, pfilter=1):
Next, find
url = "http://www.google.com/speech-api/v2/recognize?{}".format(urlencode({
"client": "chromium",
"lang": language,
"key": key,
}))
and edit to
url = "http://www.google.com/speech-api/v2/recognize?{}".format(urlencode({
"client": "chromium",
"lang": language,
"key": key,
"pFilter": pfilter,
}))
and
r.recognize_google(example_audio, pfilter=0)
will start working
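To see what the edit above changes in the request, here is a minimal standalone sketch of the same URL construction. The function name build_recognize_url and the placeholder key are made up for illustration and are not part of speech_recognition; only the URL and the dictionary keys are taken from the snippets above.

```python
from urllib.parse import urlencode


def build_recognize_url(language="en-US", key=None, pfilter=1):
    """Build the request URL the same way the edited snippet does.

    Hypothetical helper for illustration only; it is not a function
    in the speech_recognition package.
    """
    query = urlencode({
        "client": "chromium",
        "lang": language,
        "key": key,
        "pFilter": pfilter,  # 0 asks for unfiltered text, 1 for filtered
    })
    return "http://www.google.com/speech-api/v2/recognize?{}".format(query)


# "YOUR_KEY" is a placeholder, not a real API key.
url = build_recognize_url(key="YOUR_KEY", pfilter=0)
print(url)
```

With pfilter=0 the query string ends in pFilter=0, which is the only difference from the unedited request.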
Upvotes: 0
Reputation: 1
I am hitting the same thing. Looking at the code on GitHub here https://github.com/Uberi/speech_recognition/blob/master/speech_recognition/__init__.py I can see that the pfilter parameter is supported, as the documentation suggests. However, the version I got from pip install, which also claims to be 3.8.1, simply does not have pfilter.
Looking at the implementation, the parameter only affects whether "pfilter": 0 | 1 is added to the dictionary for the request, so editing your local copy to add this to the dictionary is one route forward.
Very frustrating to have this sort of inconsistency :(
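A quick way to tell which variant you have installed, without opening the source, is to inspect the function signature. This is only a sketch: accepts_keyword is a helper name I made up, and the two functions below are stand-ins that mimic the old and new signatures so the example runs without the library.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        return True
    # A **kwargs catch-all would also accept the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Stand-ins for the two signatures discussed in this thread (not the real library).
def recognize_without_pfilter(audio_data, key=None, language="en-US", show_all=False):
    pass


def recognize_with_pfilter(audio_data, key=None, language="en-US", pfilter=0, show_all=False):
    pass


print(accepts_keyword(recognize_without_pfilter, "pfilter"))  # False
print(accepts_keyword(recognize_with_pfilter, "pfilter"))     # True
```

With the real package you would pass sr.Recognizer.recognize_google (after import speech_recognition as sr) instead of the stand-ins, and only pass pfilter=0 when the check returns True.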
Upvotes: 0