Reputation: 1979

Using discord.py to filter out bad words, no other commands will work

I'm trying to create a filter for bad words on my discord bot using discord.py. Here's my code so far:

with open('badwords.txt') as file:
    file = file.read().split()
@bot.event
async def on_message(message):

    channel = bot.get_channel(my_log_channel)
    mybot = bot.get_user(my_bot_id)
    #.. some custom embed ..

    if message.author is mybot:
        return

    for badword in file:
        if badword in message.content.lower():
            await message.delete()
            await channel.send(embed=embed)

There are two main problems so far. The bigger one is that no other commands will execute.

I tried adding

    else:
        await bot.process_commands(message)

to the end of the second if statement, but then every command is getting executed twice.

The second problem I'm having is that if I blacklist, for example, the word "ass", the bot also deletes messages containing words like "pass". I would like to avoid this.

I would really appreciate some help on this, I'm kinda new to discord bots and I'm stuck here. Thanks in advance!

Upvotes: 1

Answers (3)

Turkio

Reputation: 1

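I didn't use a text file; I made a list of filtered words instead. This is my code:

filtered_words = ["Example1", "Example 2", "ex 3"]

@bot.event
async def on_message(msg):
    for word in filtered_words:
        if word in msg.content:
            await msg.delete()
            break  # stop at the first match so the message is deleted only once

    # Hand the message on to the command handlers exactly once
    await bot.process_commands(msg)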

Upvotes: 0

Jacob Lee

Reputation: 4700

While a simple for loop works for finding bad words, people get creative and may use spacing to still say them, for example, v e r y b a d w o r d. Your bot wouldn't detect this and so couldn't delete it.

Code

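import re
import string

# Characters that may sit between the letters of a banned word
separators = string.punctuation+string.digits+string.whitespace
# Characters that, directly before or after the word, mean it is part of a longer word
excluded = string.ascii_letters

word = "badword"
# Allow any run of separators between consecutive letters of the word
formatted_word = f"[{separators}]*".join(list(word))
regex_true = re.compile(formatted_word, re.IGNORECASE)
regex_false = re.compile(fr"([{excluded}]+{word})|({word}[{excluded}]+)", re.IGNORECASE)

# Profane only if the (possibly spaced-out) word is found and is not embedded in another word
profane = False
if regex_true.search(message.content) is not None\
    and regex_false.search(message.content) is None:
    profane = True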

Analysis

separators contains characters which can be inserted between the letters of a banned word while still being marked as profane.

excluded contains characters which, when they appear directly before or after a banned word, cause the match to be marked as not profane, because the word is then treated as part of a longer word.

separators and excluded can be modified based on which characters should or shouldn't be allowed to divide profane words.

word is a sample bad word which will be tested to determine the validity of the regular expression.

formatted_word is the pattern built by joining the letters of word with a character set that matches zero or more of the characters in separators.

regex_true returns a match object if the banned word is detected, with or without any of the characters in separators between its letters. Note that the number of characters separating each pair of letters is independent of the others, so 'w!o@#r$%^d' would return a match if 'word' is a banned word.

regex_false returns a match object when the banned word itself is detected and one or more of the characters in excluded directly precede or trail it. This implies that the word is actually part of another word, e.g. 'ass' in 'pass'.

If regex_true returns a match object and regex_false does not, then the detected match should be considered to contain a banned word, given the above criteria.

Examples

With word = "word" and the above code, below are the results for various test messages. Remember that profane = True only if regex_true finds a match and regex_false does not.

>>> regex_true.search("word")
<re.Match object; span=(0, 4), match='word'>
>>> regex_false.search("word")
>>> #profane=True

>>> regex_true.search("aworda")
<re.Match object; span=(1, 5), match='word'>
>>> regex_false.search("aworda")
<re.Match object; span=(0, 5), match='aword'>
>>> #profane=False

>>> regex_true.search("w1o~r d")
<re.Match object; span=(0, 7), match='w1o~r d'>
>>> regex_false.search("w1o~r d")
>>> #profane=True
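The same check can be wrapped in a function that handles a whole list of banned words. This is only a sketch of the approach above, not a drop-in replacement: the names `build_patterns` and `is_profane` are made up for illustration, and `re.escape` is added as an assumption so that punctuation in the separators or in a banned word is always treated literally.

```python
import re
import string

# Escaped so that characters such as ']' or '\' stay literal inside the character set.
SEPARATORS = re.escape(string.punctuation + string.digits + string.whitespace)

def build_patterns(word):
    """Return (regex_true, regex_false) for one banned word (hypothetical helper)."""
    # Letters of the word, with any run of separators allowed between them.
    spaced = f"[{SEPARATORS}]*".join(re.escape(ch) for ch in word)
    regex_true = re.compile(spaced, re.IGNORECASE)
    # The plain word with letters glued to its front or back, e.g. 'ass' in 'pass'.
    escaped = re.escape(word)
    regex_false = re.compile(rf"[a-z]+{escaped}|{escaped}[a-z]+", re.IGNORECASE)
    return regex_true, regex_false

def is_profane(text, words):
    """True if any banned word is found and is not embedded in a longer word."""
    for word in words:
        regex_true, regex_false = build_patterns(word)
        if regex_true.search(text) and not regex_false.search(text):
            return True
    return False
```

Inside on_message this could be called as `is_profane(message.content, file)`. It keeps the limitations of the original approach: because regex_false is tested against the whole message, a message containing both "pass" and a standalone "ass" is not flagged.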

Upvotes: 3

Sujit

Reputation: 1782

If your file is empty, the body of the for loop never runs at all. The repeated commands come from where the else was placed: attached to the if inside the for loop, it runs once for every bad word that is not in the message, so process_commands is called once per non-matching word instead of once per message.
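The effect can be reproduced without Discord. The sketch below assumes the else was attached to the if inside the loop, which is how I read the question; `count_dispatches` is a made-up name, and the counter stands in for the call to bot.process_commands.

```python
def count_dispatches(content, badwords):
    """Count how often the command dispatch would run for one message."""
    dispatches = 0  # stands in for `await bot.process_commands(message)`
    for badword in badwords:
        if badword in content.lower():
            pass  # the real handler would delete the message here
        else:
            dispatches += 1  # runs once for every word that does NOT match
    return dispatches

print(count_dispatches("!ping", ["foo", "bar"]))  # 2
```

With two words in the list and neither of them in the message, the dispatch runs twice, which matches the behaviour described in the question.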

So, try this:

with open('badwords.txt') as file:
    file = file.read().split()

@bot.event
async def on_message(message):
    channel = bot.get_channel(my_log_channel)
    mybot = bot.get_user(my_bot_id)
    #.. some custom embed ..

    if message.author == mybot:
        return

    flag = False
    for badword in file:
        if badword in message.content.lower():
            await message.delete()
            flag = True
            break  # stop at the first match so the message is deleted only once

    if not flag:
        await bot.process_commands(message)
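The loop above still uses a substring test, so "ass" would also match "pass", which is the second problem in the question. One possible way around that is to match whole words only, using the regex word boundary `\b`. This is a minimal sketch; `compile_badwords` is a hypothetical helper, not part of discord.py.

```python
import re

def compile_badwords(words):
    """Build one case-insensitive pattern that matches any listed word as a whole word."""
    if not words:
        return re.compile(r"(?!)")  # empty list: a pattern that never matches
    alternatives = "|".join(re.escape(w) for w in words)
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)

pattern = compile_badwords(["ass", "badword"])
print(bool(pattern.search("You shall not pass")))  # False
print(bool(pattern.search("what an ASS")))         # True
```

In the handler, `if pattern.search(message.content):` would then replace the for loop. Unlike the separator-based regex in the other answer, this does not catch spaced-out spellings such as "a s s".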