030

Reputation: 11679

Unable to sort multiple lines on last string of line

Introduction

Follow up to this Q&A.

Aim: sort multiple lines on last string

Method

A sample file was created to test sorting multiple lines on the last string of each line.

Sample

aa - http://gggggggggg
bb - http://bbbbbbbbbb
cc - http://aaaaaaaaaa
aa - http://cccccccccc
bb - http://bbbbbbbbbb
cc - http://iiiiiiiiii
bb - http://bbbbbbbbbb
aa - http://ffffffffff
bb - http://bbbbbbbbbb

Code

fp = "C:\\sample.txt"
fp2 = "C:\\sample2.txt"

with open(fp, "r+") as f:
    lines = f.readlines()
    lines.sort()
    print(lines)

with open(fp2, "r+") as f2:
    f2.write("\n".join(lines))

Results

The sorting is based on the first string of each line, rather than on the last.

Current

aa - http://cccccccccc

aa - http://ffffffffff

aa - http://gggggggggg

bb - http://bbbbbbbbbb
bb - http://bbbbbbbbbb

bb - http://bbbbbbbbbb

bb - http://bbbbbbbbbb

cc - http://aaaaaaaaaa

cc - http://iiiiiiiiii

Expected

cc - http://aaaaaaaaaa

bb - http://bbbbbbbbbb
bb - http://bbbbbbbbbb

bb - http://bbbbbbbbbb

bb - http://bbbbbbbbbb


aa - http://cccccccccc

aa - http://ffffffffff

aa - http://gggggggggg

cc - http://iiiiiiiiii 

Upvotes: 0

Views: 264

Answers (3)

afeldspar

Reputation: 1353

The answer is an idiom called "decorate, sort, undecorate."

Loop through your lines, and for each one make a tuple of "last string", "complete line".

Sort that list of tuples.

Return only the "complete line" portion of each tuple.

Example:

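fp = "C:\\sample.txt"
fp2 = "C:\\sample2.txt"

with open(fp, "r") as f:
    # splitlines() drops the trailing newlines, so join() below does not double them
    lines = f.read().splitlines()
    line_tuples = [(i.split()[-1], i) for i in lines]  # decorate: (last word, full line)
    line_tuples.sort()                                 # sort on the last word first
    lines = [i[-1] for i in line_tuples]               # undecorate: keep only the full line
    print(lines)

with open(fp2, "w") as f2:  # "w" creates the file if it does not exist
    f2.write("\n".join(lines))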

The two list comprehensions, on either side of line_tuples.sort(), may be hard to read because they do so much in so little space. The first one says "for each string in lines, create a matching tuple in line_tuples that starts with just the last word of the string." That last word becomes the sorting key in the next line, with the complete line as the tie-breaker. The second list comprehension says "go through all those tuples, extract just the original lines, and put them back in a list."
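To see the three steps without touching any files, here is a minimal in-memory sketch. The list `sample` is a hypothetical stand-in for the lines read from the file, and the variable names `decorated` and `undecorated` are mine, not part of the code above.

```python
# In-memory sketch of decorate-sort-undecorate.
# `sample` is an assumed stand-in for the lines of sample.txt.
sample = [
    "aa - http://gggggggggg",
    "bb - http://bbbbbbbbbb",
    "cc - http://aaaaaaaaaa",
    "aa - http://cccccccccc",
]

# Decorate: pair each line with its last whitespace-separated word.
decorated = [(line.split()[-1], line) for line in sample]

# Sort: tuples compare on the last word first, then on the full line.
decorated.sort()

# Undecorate: throw the sort key away and keep the original line.
undecorated = [line for _, line in decorated]

print(undecorated)
```

With this input the line ending in aaaaaaaaaa comes first and the one ending in gggggggggg comes last, regardless of the leading aa/bb/cc label.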

Note that this code should work, but I don't have access to a machine with Python at the moment, so I can't guarantee it.

Upvotes: 0

unutbu

Reputation: 879691

Use the key parameter to specify the proxy value on which to sort. In this case you could split on the '-', and reverse the order of the substrings:

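fp = "C:\\sample.txt"
fp2 = "C:\\sample2.txt"

with open(fp, "r") as f, open(fp2, "w") as f2:
    # Strip trailing newlines so "\n".join() below does not double them
    stripped = (text.rstrip("\n") for text in f)
    # Key is [URL part, label part], so lines sort on the URL first
    lines = sorted(stripped, key=lambda text: text.split('-', 1)[::-1])
    print(lines)
    f2.write("\n".join(lines))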
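It may help to look at what that key actually evaluates to for a single line. The sketch below wraps the same expression in a function with a hypothetical name, `url_then_label`, and uses two made-up lines that share a URL to show how ties fall back to the label.

```python
def url_then_label(text):
    # Hypothetical helper name; same expression as the lambda above.
    # Split once on '-' and reverse, so the URL part comes first.
    return text.split('-', 1)[::-1]

key = url_then_label("cc - http://aaaaaaaaaa")
print(key)

# Two assumed lines with the same URL: the label breaks the tie.
tied = sorted(["bb - http://x", "aa - http://x"], key=url_then_label)
print(tied)
```

Because the key is a list, Python compares the URL parts first and only looks at the label when the URLs are equal.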

Upvotes: 1

Padraic Cunningham

Reputation: 180441

lines.sort(key=lambda x: x.split()[-1])  # sort on last item of each string

For example:

In [11]: s ="cc - http://iiiiiiiiii "

In [12]: s.split()
Out[12]: ['cc', '-', 'http://iiiiiiiiii']  # x.split()[-1] takes the last item of this list

I would also use with open(fp, "r+") as f and with open(fp2, "r+") as f1 to open your files; the with block closes them automatically for you.
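Putting the pieces together, here is a rough end-to-end sketch. It writes to a temporary directory instead of C:\ so it runs anywhere; the file names, the three-line `sample` string, and the choice to open the output with mode "w" are my assumptions, not part of the question.

```python
import os
import tempfile

# Assumed three-line sample, standing in for the real sample.txt.
sample = (
    "aa - http://gggggggggg\n"
    "bb - http://bbbbbbbbbb\n"
    "cc - http://aaaaaaaaaa\n"
)

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "sample.txt")
    dst = os.path.join(tmp, "sample2.txt")

    with open(src, "w") as f:
        f.write(sample)

    # Read without trailing newlines, then sort on the last item of each line.
    with open(src) as f:
        lines = f.read().splitlines()
    lines.sort(key=lambda x: x.split()[-1])

    # Mode "w" creates the output file; one newline is added per line.
    with open(dst, "w") as f2:
        f2.write("\n".join(lines) + "\n")

    with open(dst) as f2:
        result = f2.read()

print(result)
```

One caveat with this key: an empty line makes x.split() return an empty list, so x.split()[-1] raises IndexError. If the input may contain blank lines, filtering them out before sorting is one way around that.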

Upvotes: 4
