ChicJaab

Reputation: 85

Making a list in a list: making lists of words from words in a word list

I have a list of words like so:

wordlist = ['i', 'would', 'like', 'to', 'go', 'to', 'the', 'store', '<s>', 'i', "'d", 'like', 'to', 'go', 'to', 'a', 'fancy', 'restaurant','<s>']

I want to make a list of sentences. This is the code I'm using:

sentence = []
start = []
end = []

wordlist = [word.replace('.','<s>') for word in wordlist]

for word in wordlist:
    end = word['<s>']


for word in wordlist:
    sentence = word[0][end]
    sentence.append([])

I'm trying to get a list like this: sentence=[['i', 'would', 'like', 'to', 'go', 'to', 'the', 'store', '<s>'], ['i', "'d", 'like', 'to', 'go', 'to', 'a', 'fancy', 'restaurant', '<s>'], ...etc]

My idea is marking the end of a sentence with '<s>' and telling my sentence list to create a new list after '<s>'. Anything will help, thank you.

Upvotes: 0

Views: 2085

Answers (3)

Samantha

Reputation: 273

You don't have to replace '.' strings with '<s>' to keep track of when sentences end, since your word list already contains '<s>'. If you want to end sentences at '<s>', you can just check for it each time you add a word to your current sentence, like so:

sentences = []
current_sentence = []

for word in wordlist:
    current_sentence.append(word)
    if word == '<s>':
        sentences.append(current_sentence)
        current_sentence = []

print(sentences)

Here, I replaced your sentence list with sentences. This will keep track of all of the sentences that you make from your word list. current_sentence will keep track of all of the words in your current sentence. When you reach a '<s>', this code adds your current sentence list to sentences, then resets current_sentence to an empty list.
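
If it helps, the same loop can be wrapped in a small helper. This is only a sketch: the function name split_sentences and the end_marker parameter are made up for illustration, and keeping leftover words that never reach a final '<s>' is an assumption about what you want, not something from the question.

```python
def split_sentences(words, end_marker='<s>'):
    """Group words into sentences, closing a sentence at each end_marker."""
    sentences = []
    current_sentence = []
    for word in words:
        current_sentence.append(word)
        if word == end_marker:
            sentences.append(current_sentence)
            current_sentence = []
    # Assumption: words after the last marker still count as a sentence.
    if current_sentence:
        sentences.append(current_sentence)
    return sentences


wordlist = ['i', 'would', 'like', '<s>', 'to', 'go']
print(split_sentences(wordlist))
# [['i', 'would', 'like', '<s>'], ['to', 'go']]
```

If you would rather drop an unfinished sentence, remove the final if block.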

Upvotes: 1

BernardL

Reputation: 5464

Append each word to a temporary list, then add that list to your results and reset it once you reach the end marker, which in this case is '<s>':

wordlist = ['i', 'would', 'like', 'to', 'go', 'to', 'the', 'store', '<s>', 'i', "'d", 'like', 'to', 'go', 'to', 'a', 'fancy', 'restaurant','<s>']
results = []
result = []

for word in wordlist:
    if word == '<s>':
        result.append(word)
        results.append(result)
        result = []
    else:
        result.append(word)

Final output in results:

[['i', 'would', 'like', 'to', 'go', 'to', 'the', 'store', '<s>'],
 ['i', "'d", 'like', 'to', 'go', 'to', 'a', 'fancy', 'restaurant', '<s>']]

Upvotes: 0

vash_the_stampede

Reputation: 4606

You can create an iterator from wordlist and then use a while loop with try/except to step through it, building sublists that are appended to your final list.

a = iter(wordlist)
res = []
temp = []

while True:
    try:
        b = next(a)
        if b != '<s>':
            temp.append(b)
        else:
            temp.append(b)
            res.append(temp)
            temp = []

    except StopIteration:
        break

print(res)
# [['i', 'would', 'like', 'to', 'go', 'to', 'the', 'store', '<s>'], ['i', "'d", 'like', 'to', 'go', 'to', 'a', 'fancy', 'restaurant', '<s>']]
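
For comparison, a for loop calls next() and handles StopIteration behind the scenes, so the same grouping can also be written as a generator. This is a rough sketch rather than a drop-in replacement; the name iter_sentences and the end_marker parameter are hypothetical.

```python
def iter_sentences(words, end_marker='<s>'):
    """Yield one sentence (a list of words) each time end_marker is seen."""
    temp = []
    for b in words:  # the for loop calls next() and stops on StopIteration
        temp.append(b)
        if b == end_marker:
            yield temp
            temp = []


wordlist = ['i', 'would', 'like', 'to', 'go', '<s>',
            'i', "'d", 'like', 'to', 'stay', '<s>']
res = list(iter_sentences(wordlist))
print(res)
# [['i', 'would', 'like', 'to', 'go', '<s>'], ['i', "'d", 'like', 'to', 'stay', '<s>']]
```

Like the while loop above, this discards any trailing words that are not followed by '<s>'.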

Upvotes: 0
