Reputation: 2401
This is what I came up with before getting stuck (NB: the source of the text is The Economist):
import random
import re
text = 'One calculation by a film consultant implies that half of Hollywood productions with budgets over one hundred million dollars lose money.'
nbofwords = len(text.split())
words = text.split()
randomword = random.choice(words)
randomwordstr = str(randomword)
Step 1 works : Delete the random word from the original text
replaced1 = re.sub(randomwordstr, '', text)
replaced2 = re.sub(' +', ' ', replaced1)
Step 2 works : Select a defined number of random words
nbofsamples = 3
randomitems = random.choices(population=words, k=nbofsamples)
gives, e.g. ['over', 'consultant', 'One']
Step 3 works : Delete from the original text one element of that list of random words thanks to its index
replaced3 = re.sub(randomitems[1], '', text)
replaced4 = re.sub(' +', ' ', replaced3)
deletes the word 'consultant'
Step 4 fails : Delete from the original text all the elements of that list of random words thanks to their index.
The best I can figure out is :
replaced5 = re.sub(randomitems[0],'',text)
replaced6 = re.sub(randomitems[1],'',replaced5)
replaced7 = re.sub(randomitems[2],'',replaced6)
replaced8 = re.sub(' +', ' ', replaced7)
print(replaced8)
It works (all 3 words have been deleted), but it is clumsy and inefficient (I would have to rewrite it if I changed the nbofsamples variable).
How can I iterate over my list of random words (step 2) to delete those words from the original text?
Thanks in advance
Upvotes: 1
Views: 2095
Reputation: 10890
Note that as long as you are not using any regular expressions but just replacing simple strings with others (or with nothing), you don't need re:
for r in randomitems:
    text = text.replace(r, '')
print(text)
To replace only the first occurrence, you can simply pass the desired number of occurrences to the replace function:
text = text.replace(r, '', 1)
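Put together, the loop above might look like the sketch below. The helper name remove_words and the fixed word list are only illustrative assumptions (a fixed list is used instead of random.choices so the result is reproducible), and the split/join step is an addition to tidy the double spaces left behind.

```python
text = ('One calculation by a film consultant implies that half of '
        'Hollywood productions with budgets over one hundred million '
        'dollars lose money.')

def remove_words(text, words_to_remove):
    """Remove the first occurrence of each word, then tidy the spacing."""
    for word in words_to_remove:
        # The third argument limits the replacement to one occurrence.
        text = text.replace(word, '', 1)
    # Deleting a word leaves two adjacent spaces; split/join collapses them.
    return ' '.join(text.split())

# Hypothetical sample standing in for the output of random.choices
result = remove_words(text, ['over', 'consultant', 'One'])
print(result)
```

Keep in mind that str.replace matches substrings, so a short word such as 'a' would also be removed from inside a longer word like 'calculation'.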
Upvotes: 1
Reputation: 112
To delete the words in a list from a string, just use a for loop. It iterates through each item in the list, assigning the value of the item to a loop variable (here I used "i", but it can be any valid variable name), and executes the code in the loop body until there are no more items in the list. Here's the bare-bones version of a for loop:
items = []  # named "items" so the built-in list type is not shadowed
for i in items:
    print(i)
In your case you want to remove the words specified in the list from a string, so just plug the variable "i" into the same method you've been using to remove the words. You also need a variable that is updated on every iteration; otherwise the loop would only remove the last word in the list from the string. After that you can print the output. This code will work for a list of any length.
r = text  # start from the original text
for i in randomitems:
    replaced4 = re.sub(i, '', r)
    r = replaced4  # feed the result into the next iteration
print(replaced4)
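If you stay with re, one possible variation (not the loop above, but a single combined pattern) is sketched below. The function name remove_tokens is hypothetical. It assumes the words came from text.split(), so each one is a whitespace-delimited token, and it uses re.escape so that characters such as the '.' in 'money.' are matched literally. Unlike the loop, it removes every standalone occurrence of each token.

```python
import re

text = ('One calculation by a film consultant implies that half of '
        'Hollywood productions with budgets over one hundred million '
        'dollars lose money.')

def remove_tokens(text, tokens):
    """Remove every whitespace-delimited occurrence of the given tokens."""
    if not tokens:
        return text
    # re.escape keeps special characters (e.g. '.') literal.
    alternation = '|'.join(re.escape(t) for t in tokens)
    # (?<!\S) and (?!\S) demand whitespace or a string edge on each side,
    # so 'a' is not removed from inside 'calculation'.
    pattern = rf'(?<!\S)(?:{alternation})(?!\S)'
    stripped = re.sub(pattern, '', text)
    # Collapse the leftover runs of whitespace.
    return re.sub(r'\s+', ' ', stripped).strip()

# Hypothetical sample standing in for the output of random.choices
cleaned = remove_tokens(text, ['over', 'consultant', 'money.'])
print(cleaned)
```

The lookarounds are used instead of \b because \b does not match after a trailing period at the end of the string.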
Upvotes: 2