Padawan

Reputation: 728

Python filtering and sorting through a list

I just can't seem to get this to work. I need to open the file ranger.txt, read each line, and split each line into a list of words. For each word, check whether it is already in the list; if it is not, add it. At the end of the program, sort the resulting words and print them in alphabetical order.

The result should be: ["a", "and", "buckle", "C130", "count", "door", "down", "four", "Gonna", "Jump", "little", "out", "ranger", "Recon", "right", "rollin", "Shuffle", "Stand", "strip", "take", "the", "to", "trip", "up"]

I can get the individual lists to print, and even one word from each list, but that's it.

rangerHandle = open("ranger.txt")
count = 0
rangerList = list()

for line in rangerHandle:
    line = line.rstrip()
    #print line works at this point
    words = line.split() # split breaks string makes another list
    #print words works at this point
    if words[count] not in words: 
        rangerList.append(words[count])        
        count += 1
    print rangerList

ranger.txt file is:

C130 rollin down the strip
Recon ranger
Gonna take a little trip
Stand up, buckle up,
Shuffle to the door
Jump right out and count to four

And if you're going to neg vote, please at least give an explanation.

Upvotes: 1

Views: 1734

Answers (2)

Julien Spronck

Reputation: 15433

First, it is better to use the with ... syntax when dealing with files, because the file is then closed automatically when the block exits (https://docs.python.org/2/tutorial/inputoutput.html).

Second, if I were you, I would use sets (https://docs.python.org/2/library/sets.html) instead of lists. A set cannot contain the same element twice, so you do not need to check whether a word is already present. For each line, I create a new set from the words on that line and merge it into the accumulated set using the union method.

words = set()
with open("ranger.txt") as f:
    for line in f:
        newset = set(line.rstrip().split())
        words = words.union(newset)
words = sorted(words)  # this line transforms the set into a sorted list
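
For comparison, here is a minimal sketch of the list-based approach described in the question. The check `words[count] not in words` in the original code can never succeed, because `words[count]` is taken from `words` itself; the membership test has to run against the accumulated list instead. The function name `unique_words` and the inline `sample` lines are only illustrative stand-ins, not part of the original code.

```python
def unique_words(lines):
    """Collect each distinct word once, in first-seen order."""
    seen = []
    for line in lines:
        for word in line.split():
            # Compare against the accumulated list, not the current line's words.
            if word not in seen:
                seen.append(word)
    return seen


# Hypothetical sample standing in for two lines of ranger.txt.
sample = ["C130 rollin down the strip\n", "Shuffle to the door\n"]
result = sorted(unique_words(sample))
print(result)
```

With the real file, the same function should accept the file object directly, for example `with open("ranger.txt") as f: result = sorted(unique_words(f))`. Note that a plain `sorted` places capitalised words before lowercase ones.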

Upvotes: 2

JuniorCompressor

Reputation: 20025

We can build the list without checking for duplicates and remove them afterwards by converting the list to a set. We then sort the set case-insensitively:

with open("ranger.txt") as f:
    l = [w for line in f for w in line.strip().split()]
print(sorted(set(l), key=lambda s: s.lower()))

Result:

[
    'a', 'and', 'buckle', 'C130', 'count', 'door', 'down', 'four', 
    'Gonna', 'Jump', 'little', 'out', 'ranger', 'Recon', 'right', 
    'rollin', 'Shuffle', 'Stand', 'strip', 'take', 'the', 'to', 'trip',
    'up,'
]
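
The result above still contains 'up,' with its trailing comma, whereas the list in the question has "up". One possible way to handle that, sketched here under the assumption that only leading and trailing punctuation needs to be removed, is to strip each word with `string.punctuation` before adding it to the set. The helper name `clean_sorted_words` and the inline `sample` lines are illustrative and not part of the original answer.

```python
import string


def clean_sorted_words(lines):
    """Return distinct words, stripped of surrounding punctuation, sorted case-insensitively."""
    words = set()
    for line in lines:
        for w in line.split():
            # Remove punctuation at either end of the word, e.g. "up," -> "up".
            w = w.strip(string.punctuation)
            if w:  # skip tokens that were punctuation only
                words.add(w)
    return sorted(words, key=str.lower)


# Hypothetical sample standing in for two lines of ranger.txt.
sample = ["Stand up, buckle up,\n", "Shuffle to the door\n"]
cleaned = clean_sorted_words(sample)
print(cleaned)
```

Punctuation inside a word, such as the apostrophe in "don't", is left untouched by this approach.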

Upvotes: 3
