user3536059

Reputation: 1

Comparing and extracting data from two similar files

I have the following problem: files A and B each contain a list of names and dates of birth, in the format Joe Bloggs 01/01/1901.

File A has all the correct dates of birth. I have been trying to write Python code that runs through file B, compares each name and date of birth against file A, and deletes any entry that also appears in file A, leaving behind only the incorrect entries.

The code below creates a file with all the names and DOBs that are incorrect, which I can work with, but I want to do exactly the same thing with the correct DOBs instead. Any ideas?

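def build_set(filename):
    # Collect one tuple per line from the first three whitespace-separated
    # fields (first name, last name, DOB).
    found = set()

    with open(filename) as f:
        for line in f:
            # sorted() makes the match independent of field order, but it also
            # means fields are written back out in sorted order, not file order.
            found.add(tuple(sorted(line.split()[:3])))

    return found

set_more = build_set('Incorrect.txt')  # file B: may contain wrong DOBs
set_del = build_set('Correct.txt')     # file A: all DOBs correct

with open('results.txt', 'w') as out_file:
    # Entries in Incorrect.txt with no identical entry in Correct.txt.
    for res in (set_more - set_del):
        out_file.write(" ".join(res) + "\n")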

Upvotes: 0

Views: 58

Answers (1)

abhink

Reputation: 9126

If you want the new file to contain only the correct DOBs, why not take the set difference once more?

with open('results.txt', 'w') as out_file:
    for res in (set_del - (set_more - set_del)):  
        out_file.write(" ".join(res) + "\n")

Or is it something else that you are looking for?
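
One thing to watch: (set_more - set_del) shares no elements with set_del, so set_del - (set_more - set_del) evaluates to set_del itself, i.e. every entry from Correct.txt. Using set_del - set_more instead would drop the entries that match exactly, but it would also keep people who do not appear in the other file at all. If the aim is the correct entry for each person whose DOB differs between the two files, matching on the name may be closer. Below is a minimal sketch of that idea, not a definitive solution. It assumes the last field on each line is the DOB, everything before it is the name, and each name appears at most once per file; load_records and corrected_entries are hypothetical helper names.

```python
def load_records(lines):
    """Map each name (all fields but the last) to its DOB (the last field)."""
    records = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        *name, dob = parts
        records[" ".join(name)] = dob
    return records


def corrected_entries(correct_lines, incorrect_lines):
    """Return the correct entry for every name whose DOB differs between inputs."""
    correct = load_records(correct_lines)
    suspect = load_records(incorrect_lines)
    return [
        f"{name} {correct[name]}"
        for name, dob in suspect.items()
        # Assumption: names missing from the correct list are skipped.
        if name in correct and correct[name] != dob
    ]
```

Both functions accept any iterable of lines, so open file objects for Correct.txt and Incorrect.txt can be passed in directly and the returned list written to results.txt one entry per line.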

Upvotes: 1
