Reputation: 85
I have a list of words like so:
wordlist = ['i', 'would', 'like', 'to', 'go', 'to', 'the', 'store', '<s>', 'i', "'d", 'like', 'to', 'go', 'to', 'a', 'fancy', 'restaurant','<s>']
I want to make a list of sentences. This is the code I'm using:
sentence = []
start = []
end = []
wordlist = [word.replace('.','<s>') for word in wordlist]
for word in wordlist:
    end = word['<s>']
for word in wordlist:
    sentence = word[0][end]
    sentence.append([])
I'm trying to get a list like this:
sentence=[['i', 'would', 'like', 'to', 'go', 'to', 'the', 'store', '<s>'], ['i', "'d", 'like', 'to', 'go', 'to', 'a', 'fancy', 'restaurant','<s>'], ...etc]
My idea is to mark the end of each sentence with '<s>' and tell my sentence list to start a new list after each '<s>'. Anything will help, thank you.
Upvotes: 0
Views: 2085
Reputation: 273
You don't have to replace '.' strings with '<s>' to keep track of when sentences end. If you want to end sentences at '<s>', you can just check for it each time you add a word to your current sentence, like so:
sentences = []
current_sentence = []
for word in wordlist:
    current_sentence.append(word)
    if word == '<s>':
        sentences.append(current_sentence)
        current_sentence = []
print(sentences)
Here, I replaced your sentence list with sentences. This will keep track of all of the sentences that you make from your word list. current_sentence will keep track of all of the words in your current sentence. When you reach a '<s>', this code adds your current sentence list to sentences, then resets current_sentence to an empty list.
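One thing the loop above does not cover is a word list that does not end with '<s>': the leftover words would never be added. Here is a minimal sketch of the same idea wrapped in a function. The name split_sentences and the marker parameter are my own, and keeping the trailing words as a final sentence is an assumption about what you would want:

```python
def split_sentences(words, marker='<s>'):
    """Group words into sentences, closing each sentence at `marker`."""
    sentences = []
    current = []
    for word in words:
        current.append(word)
        if word == marker:
            # End of sentence: store it and start a fresh list.
            sentences.append(current)
            current = []
    # Assumption: words after the last marker form a final, unterminated sentence.
    if current:
        sentences.append(current)
    return sentences


print(split_sentences(['i', 'would', 'like', '<s>', 'to', 'go', '<s>', 'home']))
# [['i', 'would', 'like', '<s>'], ['to', 'go', '<s>'], ['home']]
```

If you would rather drop unterminated words, remove the final if block.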
Upvotes: 1
Reputation: 5464
Append each word to a list, then store that list and reset it once you find your end marker, which in this case is '<s>':
wordlist = ['i', 'would', 'like', 'to', 'go', 'to', 'the', 'store', '<s>', 'i', "'d", 'like', 'to', 'go', 'to', 'a', 'fancy', 'restaurant','<s>']
results = []
result = []
for word in wordlist:
    if word == '<s>':
        result.append(word)
        results.append(result)
        result = []
    else:
        result.append(word)
Final output in results:
[['i', 'would', 'like', 'to', 'go', 'to', 'the', 'store', '<s>'],
['i', "'d", 'like', 'to', 'go', 'to', 'a', 'fancy', 'restaurant', '<s>']]
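If you prefer something closer to the original attempt (find where each sentence ends, then cut the list there), slicing also works. This is only a sketch: slice_sentences is a hypothetical name, and it assumes that words after the last '<s>' can be ignored:

```python
def slice_sentences(words, marker='<s>'):
    """Cut `words` into slices that each end at `marker`."""
    # Index of every end-of-sentence marker.
    ends = [i for i, word in enumerate(words) if word == marker]
    sentences = []
    start = 0
    for end in ends:
        # end + 1 so the marker itself stays in the sentence.
        sentences.append(words[start:end + 1])
        start = end + 1
    # Assumption: anything after the last marker is left out.
    return sentences


wordlist = ['i', 'would', 'like', 'to', 'go', '<s>', 'i', "'d", 'like', 'to', 'go', '<s>']
print(slice_sentences(wordlist))
# [['i', 'would', 'like', 'to', 'go', '<s>'], ['i', "'d", 'like', 'to', 'go', '<s>']]
```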
Upvotes: 0
Reputation: 4606
You can create an iterator from wordlist with iter() and then use a while loop with try/except to step through it, building the sublists that are appended to your final list.
a = iter(wordlist)
res = []
temp = []
while True:
    try:
        b = next(a)
        if b != '<s>':
            temp.append(b)
        else:
            temp.append(b)
            res.append(temp)
            temp = []
    except StopIteration:
        break
print(res)
# [['i', 'would', 'like', 'to', 'go', 'to', 'the', 'store', '<s>'], ['i', "'d", 'like', 'to', 'go', 'to', 'a', 'fancy', 'restaurant', '<s>']]
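Note that a for loop calls next() and catches StopIteration for you, so the same logic can be written without the explicit try/except. As a rough sketch, it can also be made a generator that hands back one sentence at a time. The name iter_sentences is made up for this example, and it assumes words after the last '<s>' are discarded:

```python
def iter_sentences(words, marker='<s>'):
    """Yield one sentence (a list of words) each time `marker` is reached."""
    current = []
    for word in words:  # the for loop handles StopIteration internally
        current.append(word)
        if word == marker:
            yield current
            current = []
    # Assumption: words after the last marker are not yielded.


wordlist = ['i', 'would', 'like', 'to', 'go', '<s>', 'i', "'d", 'like', 'to', 'go', '<s>']
res = list(iter_sentences(wordlist))
print(res)
# [['i', 'would', 'like', 'to', 'go', '<s>'], ['i', "'d", 'like', 'to', 'go', '<s>']]
```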
Upvotes: 0