Reputation: 133
If you have two mainly identical files with 1000s of records, how will you write code to find differences between them. Assume that unix/linux commands are not allowed to be used.
My idea:
Because most of entries are the same, we can sort the entries of the two files, then compare each entry one by one, e.g. entry i in file1 compare to entry i in file2. It is O(n lg n), n is size of file.
Are there O(n) solution ?
Upvotes: 8
Views: 829
Reputation:
Hash tables are your friends.
1
.Now you know which record exists in both files (value 3), which exists in the first file only (value 1), and which exists in the second file only (value 2). And in linear time.
Note: If you're implementing your own hash table, you have to handle growing the size of your table as needed, as well as collisions. I'm sure if you could do that then you wouldn't have a hard time with this question, so use a library.
Upvotes: 3