Reputation: 174
I am trying to open a file and censor certain words in it. The words to censor come from a list. This is my code:
# These are the emails you will be censoring.
# The open() function is opening the text file that the emails are contained in
# and the .read() method is allowing us to save their contents to the following variables:
email_one = open("email_one.txt", "r").read()
email_two = open("email_two.txt", "r").read()
email_three = open("email_three.txt", "r").read()
email_four = open("email_four.txt", "r").read()
# Write a function that can censor a specific word or phrase from a body of text,
# and then return the text.
# Mr. Cloudy has asked you to use the function to censor all instances
# of the phrase learning algorithms from the first email, email_one.
# Mr. Cloudy doesn’t care how you censor it, he just wants it done.
def censor_words(text, censor):
    if censor in text:
        text = text.replace(censor, '*' * len(censor))
    return text
#print(censor_words(email_one, "learning algorithms"))
# Write a function that can censor not just a specific word or phrase from a body of text,
# but a whole list of words and phrases, and then return the text.
# Mr. Cloudy has asked that you censor all words and phrases from the following list in email_two.
def censor_words_in_list(text):
    proprietary_terms = ["she", "personality matrix", "sense of self",
                         "self-preservation", "learning algorithm", "her", "herself"]
    for x in proprietary_terms:
        if x.lower() in text.lower():
            text = text.replace(x, '*' * len(x))
    return text
out_file = open("output.txt", "w")
out_file.write(censor_words_in_list(email_two))
This is the text before it is passed to the function:
Good Morning, Board of Investors,
Lots of updates this week. The learning algorithms have been working better than we could have ever expected. Our initial internal data dumps have been completed and we have proceeded with the plan to connect the system to the internet and wow! The results are mind blowing.
She is learning faster than ever. Her learning rate now that she has access to the world wide web has increased exponentially, far faster than we had though the learning algorithms were capable of.
Not only that, but we have configured her personality matrix to allow for communication between the system and our team of researchers. That's how we know she considers herself to be a she! We asked!
How cool is that? We didn't expect a personality to develop this early on in the process but it seems like a rudimentary sense of self is starting to form. This is a major step in the process, as having a sense of self and self-preservation will allow her to see the problems the world is facing and make hard but necessary decisions for the betterment of the planet.
We are a-buzz down in the lab with excitement over these developments and we hope that the investors share our enthusiasm.
Till next month, Francine, Head Scientist
This is the same text after being run through my code:
Good Morning, Board of Investors,
Lots of updates this week. The ******************s have been working better than we could have ever expected. Our initial internal data dumps have been completed and we have proceeded with the plan to connect the system to the internet and wow! The results are mind blowing.
She is learning faster than ever. Her learning rate now that *** has access to the world wide web has increased exponentially, far faster than we had though the ******************s were capable of.
Not only that, but we have configured *** ****************** to allow for communication between the system and our team of researc***s. That's how we know *** considers ***self to be a ***! We asked!
How cool is that? We didn't expect a personality to develop this early on in the process but it seems like a rudimentary ************* is starting to form. This is a major step in the process, as having a ************* and ***************** will allow *** to see the problems the world is facing and make hard but necessary decisions for the betterment of the planet.
We are a-buzz down in the lab with excitement over these developments and we hope that the investors share our enthusiasm.
Till next month, Francine, Head Scientist
The problem I need to fix: the word researchers is partially censored when it should not be, because the code finds the substring her inside researchers. How can I fix this?
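The behaviour can be reproduced in isolation. This is only a minimal sketch, and the variable name `word` is made up for illustration:

```python
# Both the `in` operator and str.replace match substrings, not whole words.
word = "researchers"

found = "her" in word                  # substring test, ignores word boundaries
censored = word.replace("her", "***")  # replaces the substring wherever it occurs

print(found)     # True
print(censored)  # researc***s
```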
Upvotes: 0
Views: 487
Reputation: 492
Use the regular expression module (re) and the word boundary anchor \b, which matches only at the start or end of a word, so her no longer matches inside researchers:
import re
def censor_words_in_list(text):
    regex = re.compile(
        r'\bshe\b|\bpersonality matrix\b|\bsense of self\b'
        r'|\bself-preservation\b|\blearning algorithms\b|\bher\b|\bherself\b',
        re.IGNORECASE)
    matches = regex.finditer(text)
    # find location of matches in text
    for match in matches:
        # find how many @ should be used based on length of match
        span = match.span()[1] - match.span()[0]
        replace_string = '@' * span
        # substitution expression based on match; re.escape guards against
        # regex metacharacters in the matched text
        expression = r'\b{}\b'.format(re.escape(match.group()))
        text = re.sub(expression, replace_string, text, flags=re.IGNORECASE)
    return text
email_one = open("email_one.txt", "r").read()
out_file = open("output.txt", "w")
out_file.write(censor_words_in_list(email_one))
out_file.close()
Output (I have used the @ symbol instead of * because Stack Overflow treats asterisks as bold/italic markup, so text bounded by three asterisks displays incorrectly):
Good Morning, Board of Investors,
Lots of updates this week. The @@@@@@@@@@@@@@@@@@@ have been working better than we could have ever expected. Our initial internal data dumps have been completed and we have proceeded with the plan to connect the system to the internet and wow! The results are mind blowing.
@@@ is learning faster than ever. @@@ learning rate now that @@@ has access to the world wide web has increased exponentially, far faster than we had though the @@@@@@@@@@@@@@@@@@@ were capable of.
Not only that, but we have configured @@@ @@@@@@@@@@@@@@@@@@ to allow for communication between the system and our team of researchers. That's how we know @@@ considers @@@@@@@ to be a @@@! We asked!
How cool is that? We didn't expect a personality to develop this early on in the process but it seems like a rudimentary @@@@@@@@@@@@@ is starting to form. This is a major step in the process, as having a @@@@@@@@@@@@@ and @@@@@@@@@@@@@@@@@ will allow @@@ to see the problems the world is facing and make hard but necessary decisions for the betterment of the planet.
We are a-buzz down in the lab with excitement over these developments and we hope that the investors share our enthusiasm.
Till next month, Francine, Head Scientist
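As a possible variation, the pattern could be built from the list of terms instead of being written out by hand, and all replacements done in a single re.sub pass with a callback. This is only a sketch, not the code used above: the names `PROPRIETARY_TERMS`, `censor_terms`, `mask` and `sample` are made up for illustration, and the terms are copied from the question's list (so it contains "learning algorithm", singular).

```python
import re

# Terms copied from the question's list (assumption: singular "learning algorithm").
PROPRIETARY_TERMS = ["she", "personality matrix", "sense of self",
                     "self-preservation", "learning algorithm", "her", "herself"]

def censor_terms(text, terms, mask="*"):
    """Replace whole-word occurrences of each term with a same-length mask."""
    # Try longer terms first so a longer phrase wins at the same position.
    ordered = sorted(terms, key=len, reverse=True)
    # re.escape keeps any regex metacharacters in a term literal;
    # \b on both sides restricts matches to whole words.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE)
    # The callback sees each match, so the mask always has the matched length.
    return pattern.sub(lambda m: mask * len(m.group()), text)

sample = "She asked the researchers about her learning algorithm herself."
print(censor_terms(sample, PROPRIETARY_TERMS))
```

Because of the trailing \b, a term such as "learning algorithm" will not match the plural "learning algorithms"; plural forms would need to be listed separately or handled in the pattern.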
Upvotes: 1