user3350669

Reputation: 47

How to sort specific info in a file

I have a pre-made text file that has people's names and scores in it. Each person has three scores, separated by tabs.

John    12    13    21
Zack    14    19    12
Tim     18    22    8
Jill    13    3     22

Now, my goal is to sort the names alphabetically with only the highest score displayed, so that it looks like this:

Jill   22
John   21
Tim    22
Zack   19

Once the file has been sorted, I want to print it in the Python shell. I have put the code in functions because I am going to integrate it into other code that I have written.

from operator import itemgetter

def highscore():
    file1 = open("file.txt","r")
    file1.readlines()
    score1 = file1(key=itemgetter(1))
    score2 = file1(key=itemgetter(2))
    score3 = file1(key=itemgetter(3))


def class1alphabetical():
    with open('file.txt') as file1in:
        lines = [line.split('/t') for line in file1in]
        lines.sort()
    with open('file.txt', 'w') as file1out:
        for el in lines:
            file1out.write('{0}\n'.format(' '.join(el)))
    with open('file.txt','r') as fileqsort:
        for line in file1sort:
            print(line[:-1])
        file1sort.close

classfilealphabetical()

I have used info from other questions, such as "Sorting information from a file in python" and "Python : Sort file by arbitrary column, where column contains time values".

However, I am still stuck on what to do now.

Upvotes: 3

Views: 152

Answers (4)

jfs

Reputation: 414079

There are two distinct tasks:

  1. keep only the top score
  2. sort lines by name alphabetically

Here's a standalone script that removes all scores from each line except the highest one:

#!/usr/bin/env python3
import sys
import fileinput

try:
    sys.argv.remove('--inplace') # don't modify file(s) unless asked
except ValueError:
    inplace = False
else:
    inplace = True # modify the files given on the command line

if len(sys.argv) < 2:
    sys.exit('Usage: keep-top-score [--inplace] <file>')

for line in fileinput.input(inplace=inplace):
    name, *scores = line.split() # split on whitespace (not only tab)
    if scores:
        # keep only the top score
        top_score = max(scores, key=int)
        print(name, top_score, sep='\t')
    else:
        print(line, end='') # print as is

Example:

$ python3 keep_top_score.py class6Afile.txt

To print the lines sorted by name:

$ sort -k1 class6Afile.txt

The result of the sort command depends on your current locale; e.g., you could use LC_ALL=C to sort by byte values.

Or, if you want a Python solution:

#!/usr/bin/env python
import sys
from io import open

filename = sys.argv[1] 
with open(filename) as file:
    lines = file.readlines() # read lines

# sort by name
lines.sort(key=lambda line: line.partition('\t')[0])

with open(filename, 'w') as file:
    file.writelines(lines) # write the sorted lines

The names are sorted as Unicode text here. You could provide the explicit character encoding used in the file; otherwise the default encoding (based on your locale) is used.
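As a minimal sketch of passing the encoding explicitly: the example below assumes the file is UTF-8, which the question does not state, and uses a hypothetical helper name `sort_lines_by_name`. The temporary file is there only to keep the sketch self-contained.

```python
import os
import tempfile


def sort_lines_by_name(filename, encoding="utf-8"):
    """Sort a tab-separated file in place by its first field.

    The encoding default is an assumption; pass whatever the file really uses.
    """
    with open(filename, encoding=encoding) as file:
        lines = file.readlines()
    # the key is everything before the first tab, i.e. the name
    lines.sort(key=lambda line: line.partition("\t")[0])
    with open(filename, "w", encoding=encoding) as file:
        file.writelines(lines)


# Demo on a throwaway file that contains a non-ASCII name.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("Zoë\t19\nJill\t22\n")

sort_lines_by_name(tmp.name)

with open(tmp.name, encoding="utf-8") as file:
    sorted_text = file.read()
os.remove(tmp.name)

print(sorted_text, end="")
```

Reading and writing with the same explicit encoding keeps the result independent of the locale the script happens to run under.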

Example:

$ python sort_inplace_by_name.py class6Afile.txt

Result

Jill    22
John    21
Tim     22
Zack    19
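
The two tasks can also be combined in a single pass. This is only a sketch, not part of the scripts above: the function name `top_scores_by_name` is hypothetical, and it assumes every useful line is a name followed by whitespace-separated integer scores.

```python
def top_scores_by_name(lines):
    """Return (name, top_score) pairs sorted by name (by code point)."""
    results = []
    for line in lines:
        fields = line.split()  # split on any whitespace, not only tabs
        if len(fields) < 2:
            continue  # skip blank lines and names without scores
        name, scores = fields[0], fields[1:]
        results.append((name, max(int(score) for score in scores)))
    return sorted(results)


# Sample rows taken from the question.
sample = [
    "John\t12\t13\t21\n",
    "Zack\t14\t19\t12\n",
    "Tim\t18\t22\t8\n",
    "Jill\t13\t3\t22\n",
]

for name, top in top_scores_by_name(sample):
    print(name, top, sep="\t")
```

Because the function takes any iterable of lines, it could be fed an open file object just as well as a list.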

Upvotes: 0

JL Peyret

Reputation: 12144

Whoa, you seem to be making things a bit too complicated.

This is a rough idea.

# this will get your folks in alphabetical order (each line starts with the name)
with open("file.txt") as f:
    lines = f.readlines()
lines.sort()

for line in lines:
    # now, on each line, you want to split (that itemgetter is too complicated
    # and blows up if there are not exactly 3 grades)

    # split() with no parameter splits on any run of spaces and \t characters
    fields = line.split()
    if not fields:
        continue  # skip blank lines
    name, grades = fields[0], fields[1:]

    # cast your grades to integers
    grades = [int(grade) for grade in grades]

    # sort and pick the last one
    grades.sort()
    highest = grades[-1]

    # or... use max as suggested
    highest = max(grades)

    # write to output file....

Another piece of advice: open your files with context managers (with statements); they can be nested. Closing resources is a major part of well-behaved programs.

with open("/temp/myinput.txt","r") as fi:
    ....
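
To make the nesting concrete, here is a minimal sketch that reads one file and writes another inside a single `with` statement, which behaves like two nested ones. The file names are hypothetical and live in a temporary directory so the example is self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    in_path = os.path.join(tmpdir, "scores.txt")       # hypothetical input file
    out_path = os.path.join(tmpdir, "highscores.txt")  # hypothetical output file

    # Create some sample input first.
    with open(in_path, "w") as fo:
        fo.write("Zack\t14\t19\t12\nJill\t13\t3\t22\n")

    # Both files are closed automatically when the block ends,
    # even if an exception is raised inside it.
    with open(in_path) as fi, open(out_path, "w") as fo:
        for line in sorted(fi):  # lines start with the name, so this sorts by name
            name, *grades = line.split()
            fo.write("{}\t{}\n".format(name, max(map(int, grades))))

    with open(out_path) as fi:
        result = fi.read()

print(result, end="")
```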

Upvotes: 2

Totem

Reputation: 7349

Once you have your lines split into fields and stored in a sorted list, try this:

output = ["{} {}".format(i[0], max(i[1:], key=int)) for i in lines]

for i in output:
    print(i)

Jill 22
John 21
Tim 22
Zack 19

output is a list created using a list comprehension.

The curly brackets ('{}') are replaced by the arguments passed to str.format(), the str in this case being "{} {}".

The max function takes a keyword argument 'key', as seen above, which lets you specify a function to apply to each item of the iterable given to max (the iterable in this case being i[1:]) before comparing. I used int because all the items in the list were strings (containing numbers) and had to be compared as ints.
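
A small sketch of why `key=int` matters here: without it, `max` compares the strings character by character, so `'8'` beats `'22'`. The sample list is Tim's row from the question.

```python
scores = ["18", "22", "8"]  # scores read from a file are strings

as_text = max(scores)             # lexicographic string comparison
as_number = max(scores, key=int)  # compared by integer value, returned as a string

print(as_text)    # 8
print(as_number)  # 22
```

Note that `max` with a key still returns the original item, so `as_number` is the string `'22'`, which is convenient for formatting it straight into the output.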

Upvotes: 0

James Mills

Reputation: 19030

This is quite easy to do with some builtin functions and an iteration:

Code:

#!/usr/bin/env python


from operator import itemgetter


scores = """\
John\t12\t13\t21\n
Zack\t14\t19\t12\n
Tim\t18\t22\t8\n
Jill\t13\t3\t22"""


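# split the text into non-empty lines, then each line into tab-separated fields
datum = [x.split("\t") for x in filter(None, scores.split("\n"))]
# iterate over the rows sorted by name (the first field)
for data in sorted(datum, key=itemgetter(0)):
    name, scores = data[0], map(int, data[1:])
    max_score = max(scores)
    # print() with a single argument works on both Python 2 and Python 3
    print("{0:s} {1:d}".format(name, max_score))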

Output:

$ python -i scores.py 
Jill 22
John 21
Tim 22
Zack 19
>>> 

Upvotes: 0
