Reputation: 53
I'm trying to create a function that censors words in a string. It's kinda working, with a few quirks.
This is my code:
def censor(sentence):
    badwords = 'apple orange banana'.split()
    sentence = sentence.split()
    for i in badwords:
        for words in sentence:
            if i in words:
                pos = sentence.index(words)
                sentence.remove(words)
                sentence.insert(pos, '*' * len(i))
    print " ".join(sentence)

sentence = "you are an appletini and apple. new sentence: an orange is a banana. orange test."
censor(sentence)
And the output:
you are an ***** and ***** new sentence: an ****** is a ****** ****** test.
Some punctuation is gone and the word "appletini" is replaced wrongly.
How can this be fixed?
Also, is there any simpler way of doing this kind of thing?
Upvotes: 1
Views: 4245
Reputation: 122091
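The specific problems are that:
1. you replace the whole whitespace-separated token, so any punctuation attached to a matched word is lost; and
2. you use the length of the bad word, not the length of the matched word, when inserting the '*'s.
I would switch the loop order around, so you only process the sentence once, and use enumerate rather than remove and insert: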
def censor(sentence):
    badwords = ("test", "word")  # consider making this an argument too
    sentence = sentence.split()
    for index, word in enumerate(sentence):
        if any(badword in word for badword in badwords):
            sentence[index] = "".join(['*' if c.isalpha() else c for c in word])
    return " ".join(sentence)  # return rather than print
Testing str.isalpha will replace only upper- and lower-case letters with asterisks. Demo:
>>> censor("Censor these testing words, will you? Here's a test-case!")
"Censor these ******* *****, will you? Here's a ****-****!"
# ^ note length ^ note punctuation
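For comparison, here is a minimal sketch of the same approach with the bad words passed in as an argument, run on the sentence from the question. The name censor_words and its badwords parameter are made up for illustration; the logic is otherwise the enumerate-based version above.

```python
def censor_words(sentence, badwords):
    """Mask every letter of any whitespace-separated token containing a bad word."""
    tokens = sentence.split()
    for index, token in enumerate(tokens):
        # Substring test, so "appletini" matches "apple" and is masked in full.
        if any(badword in token for badword in badwords):
            # Keep non-letters (punctuation, digits) and mask only letters.
            tokens[index] = "".join("*" if c.isalpha() else c for c in token)
    return " ".join(tokens)


question_sentence = ("you are an appletini and apple. new sentence: "
                     "an orange is a banana. orange test.")
result = censor_words(question_sentence, ("apple", "orange", "banana"))
# result == "you are an ********* and *****. new sentence: an ****** is a ******. ****** test."
```

Note that "appletini" is now masked with nine asterisks and the full stops survive. Because the sentence is split and re-joined on whitespace, runs of spaces in the input collapse to a single space.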
Upvotes: 2
Reputation: 2224
Try:
for i in bad_word_list:
    sentence = sentence.replace(i, '*' * len(i))
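To see what this does on the question's input, here is a small sketch that wraps the loop in a function. The name censor_replace is hypothetical, and bad_word_list is assumed to be a plain list of strings.

```python
def censor_replace(sentence, bad_word_list):
    """Replace each occurrence of a bad word with asterisks of the same length."""
    for bad_word in bad_word_list:
        # str.replace works on substrings, so only the matched part is masked.
        sentence = sentence.replace(bad_word, "*" * len(bad_word))
    return sentence


question_sentence = ("you are an appletini and apple. new sentence: "
                     "an orange is a banana. orange test.")
replaced = censor_replace(question_sentence, ["apple", "orange", "banana"])
# replaced == "you are an *****tini and *****. new sentence: an ****** is a ******. ****** test."
```

Punctuation and spacing are untouched because the string is never split, but a bad word embedded in a longer word is only partly masked ("appletini" becomes "*****tini"). Whether that is acceptable depends on what you want censored.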
Upvotes: 0