Wiki

Reputation: 3

How do I sort the lines of a text file by a number inside each line, from highest to lowest, in Python?

I have a file that looks like this:

[email protected]:h1annah!! - Number of visits: 132 - True - False - True
[email protected]:joh22nny!! - Number of visits: 14814 - True - False - False
[email protected]:gin55er.! - Number of visits: 15 - True - False - False

My objective is to order it like this:

[email protected]:joh22nny!! - Number of visits: 14814 - True - False - False
[email protected]:h1annah!! - Number of visits: 132 - True - False - True
[email protected]:gin55er.! - Number of visits: 15 - True - False - False

That is, the lines should be ordered by number of visits, from highest to lowest.

I've found a solution, which looks like this:

with open('file.txt') as f, open('file2.txt', 'w') as f2:
    f2.writelines(sorted(f.readlines(),
                         key=lambda s: int(s.rsplit(' ')[-1].strip()),
                         reverse=True))

However, this only works when the last space-separated field on each line is an integer.

So it won't work with the files I need it to.

My problem is extracting the numerical value after "Number of visits" and sorting the lines by it in descending order, without removing anything from the file.

Sorry if this is wordy; I don't speak English well.

Upvotes: 0

Views: 71

Answers (2)

nigh_anxiety

Reputation: 2326

This solution uses the re module.

The regex pattern I used, r"Number of visits: (\d*) -", is longer than it needs to be and could be reduced to r": (\d*) -", but I wanted it to be clear and explicit about which digits it captures.
If you aren't familiar with re (regular expressions), the parentheses form a capture group: whatever matches the pattern inside them can be retrieved separately from the full match. \d* matches zero or more consecutive digits; \d+ would require at least one.
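
To make the capture group concrete, here is a minimal sketch run on a single line. The sample line is made up for illustration (the address and password are placeholders), and the variable names are not part of the solution below.

```python
import re

# Hypothetical sample line; the address and password are placeholders.
line = "user@example.com:pass - Number of visits: 132 - True - False - True"

match = re.search(r"Number of visits: (\d*) -", line)

whole_match = match.group(0)   # the entire text matched by the pattern
visits_text = match.group(1)   # only the part captured by the parentheses
visits = int(visits_text)      # convert the captured digits to an integer

print(whole_match)
print(visits)
```

group(0) is the full match including the label, while group(1) holds only the digits, which is why the solution converts group(1) to an int.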

Each line and its extracted value are then put into a tuple and stored in the list data. I chose to convert the value with int() at this point, but it could also be done inside the sort key function instead.

import re

infile = "test.txt"
outfile = "output.txt"
data = []

# Read from the input file and use regex.
with open(infile, 'r') as fp:
    for line in fp:
        # Use re.search to capture the digits we want.
        match = re.search(r"Number of visits: (\d*) -", line)
        # Lines with no visit count get -1 so they sort last instead of crashing.
        visits = int(match.group(1)) if match and match.group(1) else -1
        # Save data array of tuples with (line, integer)
        data.append((line, visits))

# Sort the list of tuples by the integers.
data.sort(key=lambda e: e[1], reverse=True)

# Write just the lines to the output file.
with open(outfile, 'w') as fp:
    for line in data:
        fp.write(line[0])
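
As mentioned, the conversion can also live in the sort key. A minimal sketch of that variant follows; visit_count and sort_lines_by_visits are names invented here, the sample lines are placeholders, and the pattern uses \d+ so that an empty count does not match. Treating lines without a count as -1 is an assumption about what you want, not something the question specifies.

```python
import re

# Compiled once; \d+ requires at least one digit after the label.
VISITS_PATTERN = re.compile(r"Number of visits: (\d+) -")


def visit_count(line):
    """Return the visit count found in line, or -1 if there is none."""
    match = VISITS_PATTERN.search(line)
    return int(match.group(1)) if match else -1


def sort_lines_by_visits(lines):
    """Return the lines ordered from most to fewest visits."""
    return sorted(lines, key=visit_count, reverse=True)


# Hypothetical sample data; the addresses are placeholders.
sample = [
    "a@example.com:pw1 - Number of visits: 132 - True - False - True\n",
    "b@example.com:pw2 - Number of visits: 14814 - True - False - False\n",
    "c@example.com:pw3 - Number of visits: 15 - True - False - False\n",
]

for sorted_line in sort_lines_by_visits(sample):
    print(sorted_line, end="")
```

With a real file you would pass the open file object (or f.readlines()) to sort_lines_by_visits and hand the result to writelines. If the last line of the input has no trailing newline, it will run into the next line after sorting, so you may want to add one first.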

Upvotes: 0

Ryan Zhang

Reputation: 1920

Make the sort key the field that holds the integer, so the lines are sorted by it. Here, that is the field at index 5 when the line is split on spaces (this assumes the email:password part contains no spaces). Simply change -1 to 5, like this:

with open('file.txt') as f, open('file2.txt', 'w') as f2:
    f2.writelines(sorted(f.readlines(),
                         key=lambda s: int(s.rsplit(' ')[5].strip()),
                         reverse=True))
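
To see why index 5 is the right one, here is a small sketch that splits one line and looks at the fields. The sample lines are made up (placeholder address and passwords), and visits_after_label is a hypothetical helper, shown only as one possible way to avoid depending on a fixed position.

```python
# Hypothetical sample line; the address and password are placeholders.
line = "user@example.com:pass - Number of visits: 132 - True - False - True\n"

fields = line.split(' ')
# fields[0] is "email:password", fields[1] is "-", fields[2] to fields[4]
# are "Number", "of", "visits:", so the count sits at index 5.
visits = int(fields[5])


def visits_after_label(s):
    """Hypothetical helper: read the count that follows the label text."""
    return int(s.split('Number of visits: ')[1].split(' ')[0])


# A password containing a space shifts every field by one, so index 5 no
# longer holds the count, while the label-based lookup still finds it.
shifted = "user@example.com:my pass - Number of visits: 7 - True - False - True\n"
shifted_fields = shifted.split(' ')

print(visits)
print(shifted_fields[5])
print(visits_after_label(shifted))
```

If your lines always have the same layout, the fixed index is the simplest option; the label-based variant is only worth it if the part before the label can vary.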

Upvotes: 1
