Reputation: 409
I have a list of words. It's pretty large (len(list) ~ 70,000). I'm currently using this code:
    replacement = "bla"
    for word in data:
        if (word in unique_words):
            word = replacement
This code takes a while to run. Is there a quicker way to do this?
Upvotes: 0
Views: 1059
Reputation: 133574
Use a set for unique_words. Sets are considerably faster than lists for determining whether an item is in them (see Python Sets vs Lists).
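As a rough sketch of the idea, assuming unique_words starts out as a list: convert it to a set once, then test membership against the set. The function name replace_words and the sample data are made up for illustration.

```python
def replace_words(data, unique_words, replacement="bla"):
    """Return a new list where every word found in unique_words is replaced."""
    # One-time conversion; each membership test is then O(1) on average
    # instead of a linear scan through a list.
    lookup = set(unique_words)
    return [replacement if word in lookup else word for word in data]


# Hypothetical sample data, not from the question.
data = ["apple", "pear", "apple", "kiwi"]
unique_words = ["pear", "kiwi"]
result = replace_words(data, unique_words)
print(result)
```

This builds a new list rather than modifying data in place, so the original stays available if you still need it.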
Also, it's only a stylistic issue, but I think you should drop the parentheses in the if statement. It looks cleaner.
Upvotes: 6
Reputation: 174622
The code you have posted doesn't actually do any replacement. Here is a snippet that does:
    for key, word in enumerate(data):
        if word in unique_words:
            data[key] = replacement
Here's a more compact way:
    new_list = [replacement if word in unique_words else word for word in data]
I think unique_words is an odd name for the variable considering its use; perhaps it should be search_list?
Edit:
After your comment, perhaps this is better:
    from collections import Counter
    c = Counter(data)
    # Words that occur exactly once; a set keeps the membership test below fast
    only_once = {k for k, v in c.items() if v == 1}
    # Now replace all occurrences of these words with something else
    for k, v in enumerate(data):
        if v in only_once:
            data[k] = replacement
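A possible variant, assuming the goal really is to replace every word that occurs exactly once: the Counter can be queried directly, so no separate collection of single-occurrence words is needed. The name replace_singletons and the sample list are only illustrative.

```python
from collections import Counter


def replace_singletons(data, replacement="bla"):
    """Return a new list where words occurring exactly once are replaced."""
    counts = Counter(data)  # single pass to count every word
    # Counter lookups are dictionary lookups, so each check is O(1) on average.
    return [replacement if counts[word] == 1 else word for word in data]


# Hypothetical sample data.
words = ["a", "b", "a", "c"]
replaced = replace_singletons(words)
print(replaced)
```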
Upvotes: 4