I have downloaded the corpus Reuters from the NLTK library and want to store 10 random documents with more than 50 elements in a new variable.
I have already downloaded the corpus and written the following code, but it runs continuously without stopping:
import random

import nltk
nltk.download('reuters')
nltk.download('punkt')
from nltk.corpus import reuters

sample_data = []
for i in range(len(reuters.sents())):
    sent = random.choice(reuters.sents())
    if len(sent) <= 50:  # Skips the sentence if it contains 50 or fewer elements
        pass
    else:
        sample_data.append(sent)
    while len(sample_data) == 10:
        break
Is there a more efficient way of writing this so that the program completes my commands?
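One thing worth noting about the loop in the question: the `break` sits inside the `while`, so it only leaves that inner `while` and never the outer `for`. The `for` loop therefore keeps going for every sentence in the corpus, and each iteration builds a fresh `reuters.sents()` view for `random.choice`, which is probably what makes it look like it never finishes. A minimal sketch of an alternative, assuming it is acceptable to filter the corpus once and then draw the sample with `random.sample`, could look like this. The helper name `sample_long_items` and its parameters are made up for illustration and are not part of NLTK.

```python
import random


def sample_long_items(items, min_len=50, k=10, seed=None):
    """Return k random items from `items` that have more than `min_len` elements.

    `sample_long_items` is a hypothetical helper, not an NLTK function.
    """
    # Single pass over the data: keep only the items that are long enough.
    candidates = [item for item in items if len(item) > min_len]
    if len(candidates) < k:
        raise ValueError(
            f"only {len(candidates)} items have more than {min_len} elements"
        )
    # random.Random(seed) gives a reproducible sample when a seed is passed.
    rng = random.Random(seed)
    # sample() draws k distinct positions, so no candidate is picked twice.
    return rng.sample(candidates, k)


# Usage with the corpus from the question (needs the downloaded Reuters data):
# from nltk.corpus import reuters
# sample_data = sample_long_items(reuters.sents(), min_len=50, k=10)
```

If whole documents are wanted rather than sentences, the same helper should work on `[reuters.words(fid) for fid in reuters.fileids()]`, since each entry there is the token sequence of one file.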
Upvotes: 0
Views: 175