Reputation: 2619
Imagine you have two text files (say 500 kB to 3 MB each): the first is the original, and the second is an updated version of it. How can I find out what was changed (inserted, deleted) and where in the updated file, relative to the original, the changes took place?
Thanks for your ideas...
Upvotes: 5
Views: 574
Reputation: 455302
Is there any tool or library somewhere?
There are many. Try using diff, a command-line file comparison utility that works fine for small diffs. But if the two files differ a lot, the output of diff will be hard to understand. In that case you can use a visual diff tool such as DiffMerge, Kompare, or vimdiff.
Does this function exist in any well-known text editors?
Many modern editors, such as Vim and Eclipse, have this visual diffing feature.
Does anybody know an algorithm? Or what are the common methods to solve it on the large scale?
It is based on the longest common subsequence algorithm, popularly known as LCS.
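The LCS of the old text and the new text gives the part that has remained unchanged. So the parts of the old text that are not in the LCS are the ones that were deleted or changed, and the parts of the new text that are not in the LCS are the ones that were inserted.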
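To make the LCS idea concrete, here is a minimal Python sketch that diffs two lists of lines with the textbook dynamic-programming table. The function names `lcs_table` and `lcs_diff` are made up for this illustration, and this is not how diff itself is implemented: the table grows with the product of the two line counts, so for multi-megabyte files a real tool is the better choice.

```python
def lcs_table(a, b):
    """Build a table where table[i][j] is the LCS length of a[i:] and b[j:]."""
    rows, cols = len(a), len(b)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    return table


def lcs_diff(old_lines, new_lines):
    """Return (tag, line) pairs: ' ' unchanged, '-' only in old, '+' only in new."""
    table = lcs_table(old_lines, new_lines)
    i = j = 0
    result = []
    while i < len(old_lines) and j < len(new_lines):
        if old_lines[i] == new_lines[j]:
            # Line belongs to the LCS: it is unchanged.
            result.append((" ", old_lines[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            # Skipping the old line keeps the LCS at least as long: a deletion.
            result.append(("-", old_lines[i]))
            i += 1
        else:
            # Otherwise the new line is not in the LCS: an insertion.
            result.append(("+", new_lines[j]))
            j += 1
    # Whatever is left over on either side is pure deletion or insertion.
    result.extend(("-", line) for line in old_lines[i:])
    result.extend(("+", line) for line in new_lines[j:])
    return result


old = ["alpha", "beta", "gamma", "delta"]
new = ["alpha", "gamma", "delta", "epsilon"]
for tag, line in lcs_diff(old, new):
    print(tag, line)
```

With the sample lists above, "beta" is reported as removed and "epsilon" as added, while the other three lines are marked unchanged.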
What would you do if you face this kind of problem?
I'd use one of the visual diff tools mentioned to see what and where the changes were made.
Upvotes: 0
Reputation: 47038
The Unix diff tool does line-by-line differences; there is a GNU tool called wdiff which does word-by-word differences, and it should be available as a package for most Linux distributions or Cygwin.
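If you would rather experiment with word-level differences in code, the following Python sketch gives a rough idea using the standard library's `difflib.SequenceMatcher`. The helper name `word_diff` is hypothetical, the `[-...-]` and `{+...+}` markers only imitate wdiff's default output style, and `SequenceMatcher` uses its own matching heuristic, so results may not match wdiff exactly. Whitespace is also normalized, because the text is split on whitespace.

```python
import difflib


def word_diff(old_text, new_text):
    """Return new_text annotated with word-level deletions and insertions."""
    old_words = old_text.split()
    new_words = new_text.split()
    # autojunk=False disables the popularity heuristic, which can hide
    # matches on frequently repeated words in longer inputs.
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    pieces = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            pieces.extend(old_words[i1:i2])
            continue
        if i1 < i2:  # words present only in the old text
            pieces.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if j1 < j2:  # words present only in the new text
            pieces.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(pieces)


print(word_diff("the quick brown fox", "the slow brown fox jumps"))
# prints: the [-quick-] {+slow+} brown fox {+jumps+}
```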
Classic papers on the algorithm are:
Upvotes: 0
Reputation: 55009
What you're describing sounds exactly like a diff-style tool. This sort of functionality is available in many of the more advanced text editors.
Upvotes: 2
Reputation: 9061
There is an extensive list of file comparison tools on Wikipedia.
If you want to do it programmatically, I've used sed and awk on Unix systems before now, and there are Windows versions. Basically, these kinds of file-processing languages let you read and compare text files line by line and then do something with the differences (for example, save them to a third file).
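The same "compare line by line and save the differences to a third file" idea can be sketched in Python instead of sed or awk. This is only an illustration under stated assumptions: the function name `write_diff` and the file names are made up, the files are assumed to be UTF-8 text ending in a newline, and the comparison is delegated to the standard library's `difflib.unified_diff`.

```python
import difflib
import tempfile
from pathlib import Path


def write_diff(original_path, updated_path, diff_path):
    """Compare two text files line by line and save a unified diff to diff_path."""
    original = Path(original_path).read_text(encoding="utf-8").splitlines(keepends=True)
    updated = Path(updated_path).read_text(encoding="utf-8").splitlines(keepends=True)
    diff_lines = list(
        difflib.unified_diff(
            original,
            updated,
            fromfile=str(original_path),
            tofile=str(updated_path),
        )
    )
    Path(diff_path).write_text("".join(diff_lines), encoding="utf-8")
    return len(diff_lines)  # 0 means the files are identical


# Small demonstration with throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "original.txt").write_text("alpha\nbeta\ngamma\n", encoding="utf-8")
    (tmp / "update.txt").write_text("alpha\nBETA\ngamma\ndelta\n", encoding="utf-8")
    write_diff(tmp / "original.txt", tmp / "update.txt", tmp / "changes.diff")
    diff_text = (tmp / "changes.diff").read_text(encoding="utf-8")

print(diff_text)
```

In the resulting file, lines starting with `-` come only from the original, lines starting with `+` come only from the update, and the `@@` headers give the line ranges where each change occurs.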
Upvotes: 1