Reputation: 1
I have the following problem: files A and B each contain a list of names and dates of birth, in the format Joe Bloggs 01/01/1901.
File A has all the correct dates of birth. I have been trying to write Python code that runs through file B, compares each name and date of birth against file A, and deletes any entry that matches, leaving behind only the incorrect entries.
This code creates a file with all the names and DOBs that are incorrect, which I can work with, but I want to do exactly the same thing with the correct DOBs instead. Any ideas?
def build_set(filename):
    found = set()
    with open(filename) as f:
        for line in f:
            found.add(tuple(sorted(line.split()[:3])))
    return found

set_more = build_set('Incorrect.txt')
set_del = build_set('Correct.txt')
with open('results.txt', 'w') as out_file:
    for res in (set_more - set_del):
        out_file.write(" ".join(res) + "\n")
Upvotes: 0
Views: 58
Reputation: 9126
If you want the new file to contain only the correct DOBs, why not take the set difference once more?
with open('results.txt', 'w') as out_file:
    for res in (set_del - (set_more - set_del)):
        out_file.write(" ".join(res) + "\n")
Or is it something else that you are looking for?
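One thing to be aware of: set_more - set_del has no elements in common with set_del, so set_del - (set_more - set_del) evaluates to set_del itself, i.e. every entry of Correct.txt. If what you actually want is the correct entry for each person whose DOB is wrong in the other file, a possible approach is to key the records by name instead of comparing whole lines. The following is only a minimal sketch under some assumptions: the last token on each line is the date and everything before it is the name, each name appears at most once per file, and the helper names parse_records and corrected_entries and the sample data are made up for illustration.

```python
def parse_records(lines):
    """Map each name to its date of birth (assumes the last token is the date)."""
    records = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        records[" ".join(parts[:-1])] = parts[-1]
    return records


def corrected_entries(correct_lines, other_lines):
    """Return 'name dob' strings taken from the correct data, for every
    name whose date of birth in the other data differs."""
    correct = parse_records(correct_lines)
    other = parse_records(other_lines)
    return [
        f"{name} {correct[name]}"
        for name, dob in other.items()
        if name in correct and correct[name] != dob
    ]


# Hypothetical sample data standing in for Correct.txt and Incorrect.txt
correct_sample = ["Joe Bloggs 01/01/1901", "Jane Doe 02/02/1902"]
other_sample = ["Joe Bloggs 01/01/1901", "Jane Doe 03/03/1903"]
print(corrected_entries(correct_sample, other_sample))
# -> ['Jane Doe 02/02/1902']
```

Both functions accept any iterable of lines, so open file objects for Correct.txt and Incorrect.txt can be passed in place of the sample lists. Unlike the sorted-tuple version, this keeps the name and date in their original order when the results are written out.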
Upvotes: 1