AKM

Reputation: 3

Read 4 sentences from a text file and append all the words to a new list without repeating any word

I have been working on a program that reads 4 sentences from a .txt file and appends all the words to a new, initially empty list.

My code is as follows:

fname = raw_input("Enter file name: ")
fh = open(fname)
lst = list()
for line in fh:
    line = line.rstrip()
    words = line.split()
    words.sort()
    if words not in lst:
      lst.append(words)
      print lst

And I got the following results:

[['But', 'breaks', 'light', 'soft', 'through', 'what', 'window', 'yonder']]
[['But', 'breaks', 'light', 'soft', 'through', 'what', 'window', 'yonder'], ['It', 'Juliet', 'and', 'east', 'is', 'is', 'sun', 'the', 'the']]
[['But', 'breaks', 'light', 'soft', 'through', 'what', 'window', 'yonder'], ['It', 'Juliet', 'and', 'east', 'is', 'is', 'sun', 'the', 'the'], ['Arise', 'and', 'envious', 'fair', 'kill', 'moon', 'sun', 'the']]
[['But', 'breaks', 'light', 'soft', 'through', 'what', 'window', 'yonder'], ['It', 'Juliet', 'and', 'east', 'is', 'is', 'sun', 'the', 'the'], ['Arise', 'and', 'envious', 'fair', 'kill', 'moon', 'sun', 'the'], ['Who', 'already', 'and', 'grief', 'is', 'pale', 'sick', 'with']]

What can I do to obtain the following instead?

['Arise', 'But', 'It', 'Juliet', 'Who', 'already', 'and', 'breaks', 'east', 'envious', 'fair', 'grief', 'is', 'kill', 'light', 'moon', 'pale', 'sick', 'soft', 'sun', 'the', 'through', 'what', 'window', 'with', 'yonder']

The sentences are:

But soft what light through yonder window breaks
It is the east and Juliet is the sun
Arise fair sun and kill the envious moon
Who is already sick and pale with grief

Upvotes: 0

Views: 1037

Answers (5)

cicada_3301

Reputation: 1

I was doing the same assignment. The code I have used is as follows:

fname = input("Enter file name: ")
fh = open(fname)
lst = list()
for line in fh:
    line = line.rstrip()
    words = line.split()
    for word in words:
        if word not in lst:
            lst.append(word)
lst.sort()
print(lst)

Upvotes: 0

ekhumoro

Reputation: 120588

A set can be used to remove duplicates, and the split method will split on any kind of whitespace - including line-endings. So this task can be reduced to a quite simple one-liner:

lst = sorted(set(open(fname).read().split()))
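As a rough sketch of how that one-liner behaves on the four sentences from the question, the snippet below wraps it in two small functions. The function names are made up for illustration, the code assumes Python 3 (the question itself uses Python 2), and the file variant uses a with block so the file is closed after reading, which the one-liner does not do explicitly.

```python
def unique_sorted_words(text):
    """Return the distinct whitespace-separated words in text, sorted."""
    # split() with no argument splits on any whitespace, including newlines
    return sorted(set(text.split()))


def unique_sorted_words_from_file(path):
    """Same as above, but read the text from a file (hypothetical helper)."""
    with open(path) as fh:  # the with block closes the file for us
        return unique_sorted_words(fh.read())


# The four sentences from the question, one per line
sample = (
    "But soft what light through yonder window breaks\n"
    "It is the east and Juliet is the sun\n"
    "Arise fair sun and kill the envious moon\n"
    "Who is already sick and pale with grief\n"
)

result = unique_sorted_words(sample)
print(result)
```

Note that sorted compares strings by code point, so capitalized words such as 'Arise' and 'But' come before lowercase ones such as 'already', which matches the ordering in the question's desired output.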

Upvotes: 0

robbawebba

Reputation: 349

You are splitting each line into a list of words correctly using line.split(), but you are not iterating through the new list named words that you just created. You are instead comparing the list words as an object to the contents of lst, and then appending words as an object to lst. This causes lst to be a list of lists, as you've shown in the results you've been receiving.

To get the flat list of words you're looking for, you'll have to iterate through words and add each word individually, as long as it is not already in lst:

for word in words:
    if word not in lst:
        lst.append(word)
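To make the difference concrete, here is a small side-by-side sketch, assuming Python 3 and two hard-coded lines in place of the file (the names lines, nested and flat are made up for illustration). The first loop repeats the question's approach and ends up with a list of lists; the second loops over the individual words.

```python
lines = ["It is the east", "Juliet is the sun"]

# The question's approach: the membership test and the append both
# operate on the whole list of words, so each line becomes one element.
nested = []
for line in lines:
    words = line.split()
    if words not in nested:
        nested.append(words)

# Iterating over the words adds each string individually instead.
flat = []
for line in lines:
    for word in line.split():
        if word not in flat:
            flat.append(word)

print(nested)  # [['It', 'is', 'the', 'east'], ['Juliet', 'is', 'the', 'sun']]
print(flat)    # ['It', 'is', 'the', 'east', 'Juliet', 'sun']
```

Calling flat.sort() afterwards would put the words in the order shown in the question's desired output.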

Edit: I found another question/answer about the same problem, probably for the same class assignment.

Upvotes: 0

ted

Reputation: 14714

You want to use a set, which keeps only unique elements:

my_string = "But soft what light through yonder window breaks It is the east and Juliet is the sun Arise fair sun and kill the envious moon Who is already sick and pale with grief"
lst = sorted(set(my_string.split()))

This gives you what you want: set removes the duplicate words, and sorted turns the result back into an ordered list (a set on its own has no defined order). You can call set on strings, lists, etc. See sets in Python 3.5.

Upvotes: 1

Batman

Reputation: 8917

The easiest way is to use a set and add each word to it.

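file_name = raw_input("Enter file name: ")
with open(file_name, 'r') as fh:
    all_words = set()  # a set silently ignores words it already contains
    for line in fh:
        line = line.rstrip()
        words = line.split()
        for word in words:
            all_words.add(word)
# sets are unordered, so sort to get the list shown in the question
print(sorted(all_words))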
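As a possible variation on the loop above, the inner for loop can be replaced by set.update, which adds every item of an iterable in one call. The sketch below assumes Python 3 and takes any iterable of lines (an open file works) rather than prompting for a file name; the function name collect_unique_words is made up for illustration.

```python
def collect_unique_words(lines):
    """Collect the distinct words from an iterable of lines, sorted."""
    all_words = set()
    for line in lines:
        # update() adds each word from the split line; duplicates are ignored
        all_words.update(line.split())
    return sorted(all_words)


# Two of the question's lines, standing in for an open file
sample_lines = [
    "But soft what light through yonder window breaks\n",
    "It is the east and Juliet is the sun\n",
]

unique_words = collect_unique_words(sample_lines)
print(unique_words)
```

With a real file this would be called as collect_unique_words(fh) inside a with open(...) as fh: block. Because split() already discards trailing newlines, the rstrip() call is not needed here.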

Upvotes: 0
