KKKM

Reputation: 61

Could anyone help me turn the profanity filter off for Google speech recognizer?

I'm trying to run speech-to-text recognition on a .wav file I have, using Google, Google_Cloud, and Houndify.

I've noticed that the latter two have no problem with profanities, but the Google speech recognizer censors such words, for example f*** and s***.

This creates a problem for me because I want to do a sentiment analysis with LIWC, and the program assigns no profanity weights to censored words like f***.

I've tried both of the following.

(1) Turning profanity filter off

recognizer_instance.recognize_google(audio_data: AudioData, key: Union[str, None] = None, language: str = "en-US", pfilter: Union[0, 1], show_all: bool = False) -> Union[str, Dict[str, Any]]

https://github.com/Uberi/speech_recognition/blob/master/reference/library-reference.rst

(2) Remove profanity censor from Google Speech Recognition

But none of them solved the problem.

r.recognize_google(example_audio)

---> what the f*** is wrong with you

But then,

r.recognize_google(example_audio, pfilter=0)

Gives

TypeError                                 Traceback (most recent call last)
<ipython-input-21-b158a03c879c> in <module>
----> 1 r.recognize_google(example_audio, pfilter=0)

TypeError: recognize_google() got an unexpected keyword argument 'pfilter'

How should I solve this problem?

I know that many solutions posted on Stack Overflow refer to the recognizer for the Google Cloud API. I already have Google_Cloud (r.recognize_google_cloud) working, so I want a solution for recognize_google, not Google Cloud. I want to compare the results.

Upvotes: 6

Views: 1281

Answers (2)

KIRUYXAN

Reputation: 1

You need to open __init__.py in your installed speech_recognition package,

find

def recognize_google(self, audio_data, key=None, language="en-US", show_all=False):

and edit to

def recognize_google(self, audio_data, key=None, language="en-US", show_all=False, pfilter=1):

Next, find

url = "http://www.google.com/speech-api/v2/recognize?{}".format(urlencode({
            "client": "chromium",
            "lang": language,
            "key": key,
        }))

and edit to

url = "http://www.google.com/speech-api/v2/recognize?{}".format(urlencode({
            "client": "chromium",
            "lang": language,
            "key": key,
            "pFilter": pfilter,
        }))

and

r.recognize_google(example_audio, pfilter=0)

will start working
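To see what the edit above changes in the request, here is a minimal standalone sketch of the URL construction. The helper name build_recognize_url and the placeholder key "YOUR_KEY" are made up for illustration and are not part of speech_recognition; the pfilter values follow the library reference, where 0 means no filter and 1 means censored output.

```python
from urllib.parse import urlencode


def build_recognize_url(language="en-US", key=None, pfilter=1):
    """Hypothetical helper: build the request URL the way the edited
    recognize_google would, including the pFilter query parameter."""
    params = {
        "client": "chromium",
        "lang": language,
        "key": key,
        "pFilter": pfilter,  # 0 = no profanity filter, 1 = censor with asterisks
    }
    return "http://www.google.com/speech-api/v2/recognize?{}".format(urlencode(params))


# "YOUR_KEY" is a placeholder, not a real API key.
url = build_recognize_url(key="YOUR_KEY", pfilter=0)
print(url)
```

If the printed URL ends in pFilter=0, the edited function should be sending the unfiltered setting along with the request.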

Upvotes: 0

FoodyBorris

Reputation: 1

I am hitting the same thing. Looking at the code on GitHub at https://github.com/Uberi/speech_recognition/blob/master/speech_recognition/__init__.py, I can see that the pfilter parameter is supported, as the documentation suggests, but the version I got from pip install, which also claims to be 3.8.1, simply lacks pfilter.

However, looking at the implementation, the parameter only controls whether "pFilter": 0 | 1 is added to the dictionary for the request, so one route forward is to edit your local copy to add it to the dictionary.
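Before editing anything, it may help to confirm whether your installed copy accepts pfilter at all. The following is a rough sketch using the standard inspect module. The helper accepts_keyword and the two stand-in functions are hypothetical and only mimic the two signatures discussed here, so the snippet runs without speech_recognition installed.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        return True
    # A **kwargs parameter would also accept the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Stand-ins for the two signatures (not the real library functions).
def pip_style(audio_data, key=None, language="en-US", show_all=False):
    pass


def github_style(audio_data, key=None, language="en-US", pfilter=0, show_all=False):
    pass


print(accepts_keyword(pip_style, "pfilter"))     # False
print(accepts_keyword(github_style, "pfilter"))  # True
```

With the library installed, the same check would presumably be accepts_keyword(sr.Recognizer.recognize_google, "pfilter") after import speech_recognition as sr; a False result would match the TypeError shown in the question.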

Very frustrating to have this sort of inconsistency :(

Upvotes: 0
