Reputation: 391
I have 2 text files (new.txt and master.txt). Each has different data stored as such:
Cory 12 12:40:12.016221
Suzy 64 12:40:33.404614
Trent 145 12:40:56.640052
(categorized by the first set of numbers appearing on each line)
I have to scan each line of new.txt for the name (e.g. Suzy), check if there is a duplicate in master.txt and, if there isn't, add that line to master.txt, categorized by that line's number (e.g. 64 in Suzy 64 12:40:33.404614).
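To make the line format concrete, here is a minimal sketch of splitting one such line into its fields. The helper name parse_line is hypothetical, and it assumes every line has exactly three whitespace-separated fields with the number second.

```python
def parse_line(line):
    """Split a line such as 'Suzy 64 12:40:33.404614' into its three fields."""
    # Assumption: exactly three whitespace-separated fields per line.
    name, number, timestamp = line.split()
    return name, int(number), timestamp


print(parse_line("Suzy 64 12:40:33.404614"))
```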
I have written the following script, but it falls into a loop of checking the 1st line of new.txt (I know why, I just don't know how to get around closing fileinput.input(new.txt) so that I can then open fileinput.input(master.txt) further down the loop). I feel like I've greatly overcomplicated things for myself, and any help is appreciated.
import fileinput
import re

end_of_file = False
while end_of_file == False:
    for line in fileinput.input('new.txt', inplace=1):
        end_of_file = fileinput.isstdin() #ends while loop if on last line of new.txt
        user_f_line_list = line.split()
        master_f = open('master.txt', 'r')
        master_f_read = master_f.read()
        master_f.close()
        fileinput.close()
        if not re.findall(user_f_line_list[0], master_f_read):
            for line in fileinput.input('master.txt', inplace=1):
                master_line_list = line.split()
                if int(user_f_line_list[1]) <= int(master_line_list[1]):
                    written = False
                    while written == False:
                        written = True
                        print(' '.join(user_f_line_list))
                print(line, end='')
            fileinput.close()
And for reference, master.txt starts with startline 0 and ends with endline 1000000000000000 so that it is impossible for the categorizing to be out of range.
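As a small illustration of why the two sentinel lines help (separate from the script above), the sketch below looks for the first master number that is greater than or equal to a new number. The helper name insert_position and the sample list are made up for the example.

```python
def insert_position(master_numbers, new_number):
    """Return the index of the first master number that is >= new_number."""
    for index, number in enumerate(master_numbers):
        if new_number <= number:
            return index
    # Only reachable if new_number exceeds the endline sentinel.
    raise ValueError("no position found for %r" % new_number)


# startline 0 and endline 1000000000000000 bracket the real entries.
master_numbers = [0, 12, 145, 1000000000000000]
print(insert_position(master_numbers, 64))
```

With the endline sentinel in place, any realistic number finds a position before the last line, so the search never runs off the end of the list.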
Upvotes: 0
Views: 44
Reputation: 489
Some suggestions:
- Read master.txt into a list with readlines().
- Use an OrderedDict from the collections module. It is the same as a regular dict but preserves insertion order. Make each key the unique element, a tuple in this case (e.g. ("Cory", 12)). Make the value whatever comes after.
- Check for duplicates with if key in my_dict:.
- Apply the .sort function to the list with a custom function to specify how to sort.
I won't say it's necessarily shorter than your solution, but it is a lot cleaner.
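The suggestions above could be sketched roughly as follows. This is only one possible reading of them, with a deliberate deviation: the dict is keyed by name alone rather than by a (name, number) tuple, because the question checks for duplicates by name. The function names merge_lines and update_master are hypothetical, and the code assumes every non-blank line has a name followed by an integer. On Python 3.7+ a plain dict also preserves insertion order, so OrderedDict is optional there.

```python
from collections import OrderedDict


def merge_lines(master_lines, new_lines):
    """Return the master lines plus any new lines whose name is not yet present,
    sorted by the number in the second field."""
    entries = OrderedDict()
    for line in master_lines:
        fields = line.split()
        if fields:  # skip blank lines
            entries[fields[0]] = fields
    for line in new_lines:
        fields = line.split()
        # Only add names that master does not already contain.
        if fields and fields[0] not in entries:
            entries[fields[0]] = fields
    merged = list(entries.values())
    # Custom sort key: the integer in the second field of each line.
    merged.sort(key=lambda fields: int(fields[1]))
    return [" ".join(fields) + "\n" for fields in merged]


def update_master(master_path="master.txt", new_path="new.txt"):
    """Read both files fully, merge in memory, then rewrite master once."""
    with open(master_path) as master_f:
        master_lines = master_f.readlines()
    with open(new_path) as new_f:
        new_lines = new_f.readlines()
    with open(master_path, "w") as master_f:
        master_f.writelines(merge_lines(master_lines, new_lines))
```

Because both files are read completely before anything is written, there is no need to keep two fileinput streams open at once, and the startline/endline lines simply sort to the two ends.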
Upvotes: 1