Reputation: 3
So, I have a file which looks like this:
[email protected]:h1annah!! - Number of visits: 132 - True - False - True
[email protected]:joh22nny!! - Number of visits: 14814 - True - False - False
[email protected]:gin55er.! - Number of visits: 15 - True - False - False
My objective is to order it like this
[email protected]:joh22nny!! - Number of visits: 14814 - True - False - False
[email protected]:h1annah!! - Number of visits: 132 - True - False - True
[email protected]:gin55er.! - Number of visits: 15 - True - False - False
That is, the lines should be ordered by number of visits, from highest to lowest.
I've found a solution, which looks like this.
with open('file.txt') as f, open('file2.txt', 'w') as f2:
    f2.writelines(sorted(f.readlines(),
                         key=lambda s: int(s.rsplit(' ')[-1].strip()),
                         reverse=True))
However, this only works if the integer is the last field on the line, so it won't work with the files I need it for.
My problem is extracting the numerical value from "Number of visits" and sorting the lines by it in descending order, without removing anything from the file.
Sorry if this is wordy, I'm not a native English speaker.
Upvotes: 0
Views: 71
Reputation: 2326
This solution uses the re module.
The regex pattern I used, r"Number of visits: (\d*) -", is actually longer than it needs to be and could be reduced to r": (\d*) -", but I wanted it to be clear and explicit about which digits it should capture.
If you aren't familiar with re/regular expressions, the parentheses indicate that whatever matches the pattern inside them should be captured separately from the rest of the match. \d* matches any number of consecutive digits (zero or more).
Each line and its extracted value are then put into a tuple and stored in the list data. I chose to convert the value to an int at this point, but it could also be done as part of the sort's lambda function instead.
import re

infile = "test.txt"
outfile = "output.txt"
data = []

# Read from the input file and use regex.
with open(infile, 'r') as fp:
    for line in fp:
        # Use re.search to capture the digits we want.
        match = re.search(r"Number of visits: (\d*) -", line)
        # Lines without a visit count get -1 so they are kept and sort last.
        count = int(match.group(1)) if match and match.group(1) else -1
        # Save data array of tuples with (line, integer)
        data.append((line, count))

# Sort the list of tuples by the integers.
data.sort(key=lambda e: e[1], reverse=True)

# Write just the lines to the output file.
with open(outfile, 'w') as fp:
    for line, _ in data:
        fp.write(line)
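For comparison, the same idea can be sketched more compactly by moving the regex into a key function and letting sorted() do the rest. The helper names visit_count and sort_by_visits and the sample lines are made up for illustration, and the sketch uses \d+ (at least one digit) rather than \d*; lines without a visit count are assumed to belong at the end.

```python
import re

# Assumed format: the count follows the literal label "Number of visits:".
VISITS = re.compile(r"Number of visits: (\d+)")

def visit_count(line):
    """Return the visit count in the line, or -1 if the label is missing."""
    match = VISITS.search(line)
    return int(match.group(1)) if match else -1

def sort_by_visits(lines):
    """Return the lines ordered from most to fewest visits."""
    return sorted(lines, key=visit_count, reverse=True)

# Hypothetical sample data in the same shape as the question's file.
sample = [
    "a@example.com:pw1 - Number of visits: 132 - True - False - True\n",
    "b@example.com:pw2 - Number of visits: 14814 - True - False - False\n",
    "c@example.com:pw3 - Number of visits: 15 - True - False - False\n",
]

for line in sort_by_visits(sample):
    print(line, end="")
```

To apply it to a file, pass the open input file to sort_by_visits and hand the result to writelines() on the output file, as in the question's original snippet.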
Upvotes: 0
Reputation: 1920
Define the key to be the field that holds the integer, so the lines are sorted by that integer. Here, it is at index 5 when the line is split on spaces (assuming the e-mail and password contain no spaces). Simply change -1 to 5, like this:
with open('file.txt') as f, open('file2.txt', 'w') as f2:
    f2.writelines(sorted(f.readlines(),
                         key=lambda s: int(s.rsplit(' ')[5].strip()),
                         reverse=True))
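A fixed index only works while every line splits into the same number of pieces. The sketch below contrasts it with a hypothetical variant that splits on the " - " separator instead. The function names and sample lines are invented for illustration, and the second variant assumes " - " never appears inside the e-mail or password.

```python
def visits_by_position(line):
    # Sixth space-separated piece (index 5); breaks if the password has a space.
    return int(line.split(' ')[5])

def visits_by_separator(line):
    # Second " - " separated field is "Number of visits: N"; take what follows ":".
    fields = line.split(' - ')
    return int(fields[1].rsplit(':', 1)[1])

# Hypothetical sample lines; the second one has a space in the password.
clean = "user@example.com:pw!! - Number of visits: 132 - True - False - True\n"
spaced = "user@example.com:my pw - Number of visits: 7 - True - False - False\n"

print(visits_by_position(clean))
print(visits_by_separator(clean))
print(visits_by_separator(spaced))
```

Either function can be passed as key= to sorted() in place of the lambda.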
Upvotes: 1