lyborko

Reputation: 2619

Comparison of 2 text files: what and where changes were made?

Imagine you have two text files (say 500 kB to 3 MB each): the first is the original, the second is an updated version of it. How can I find out what was changed (inserted or deleted) and where the changes occur in the updated file relative to the original?

  1. Is there any tool or library somewhere?
  2. Is this feature built into any well-known text editors?
  3. Does anybody know an algorithm? Or what are the common methods to solve it on the large scale?
  4. What would you do if you faced this kind of problem?

Thanks for your ideas...

Upvotes: 5

Views: 574

Answers (6)

codaddict

Reputation: 455302

Is there any tool or library somewhere?

There are many. Try diff, a command-line file comparison utility that works fine for small diffs. If the two files differ a lot, though, the output of diff becomes hard to follow. In that case you can use a visual diff tool such as DiffMerge, Kompare or vimdiff.

Is this feature built into any well-known text editors?

Many modern editors and IDEs, such as Vim and Eclipse, have a visual diffing feature.

Does anybody know an algorithm? Or what are the common methods to solve it on the large scale?

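It is based on the longest common subsequence algorithm, commonly known as LCS.

The LCS of the old text and the new text is the part that has remained unchanged. So the parts of the old text that are not in the LCS are the ones that were deleted or changed, and the parts of the new text that are not in the LCS are the ones that were inserted.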
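To make the LCS idea concrete, here is a minimal sketch in Python that compares two lists of lines. The function name `lcs_diff` and the `' '`/`'-'`/`'+'` tags are my own choices for illustration, not taken from any particular tool. It uses the textbook dynamic-programming table, which needs memory proportional to the product of the two line counts, so it is fine for small inputs but would need a more economical variant for files of several MB.

```python
def lcs_diff(old_lines, new_lines):
    """Return (tag, line) pairs: ' ' unchanged, '-' only in old, '+' only in new."""
    n, m = len(old_lines), len(new_lines)

    # table[i][j] = length of the LCS of old_lines[i:] and new_lines[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old_lines[i] == new_lines[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk the table from the start, emitting common lines and differences.
    result = []
    i = j = 0
    while i < n and j < m:
        if old_lines[i] == new_lines[j]:
            result.append((" ", old_lines[i]))   # part of the LCS
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            result.append(("-", old_lines[i]))   # dropped from the old text
            i += 1
        else:
            result.append(("+", new_lines[j]))   # added in the new text
            j += 1

    # Whatever is left on either side is a pure deletion or insertion.
    result.extend(("-", line) for line in old_lines[i:])
    result.extend(("+", line) for line in new_lines[j:])
    return result


if __name__ == "__main__":
    for tag, line in lcs_diff(["a", "b", "c", "d"], ["a", "c", "d", "e"]):
        print(tag, line)
```

The position of each `-` or `+` entry in the result tells you where the change took place; a changed line shows up as a deletion followed by an insertion.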

What would you do if you faced this kind of problem?

I'd use one of the visual diff tools mentioned to see what and where the changes were made.

Upvotes: 0

Pleomax

Reputation: 1

GNU Diffutils http://www.gnu.org/software/diffutils/

Upvotes: 0

Matthew Slattery

Reputation: 47038

The Unix diff tool reports line-by-line differences. There is also a GNU tool called wdiff, which reports word-by-word differences; it should be available as a package for most Linux distributions and for Cygwin.
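To illustrate what a word-by-word comparison looks like, here is a rough sketch using Python's standard `difflib` module. This is a stand-in for the idea, not a reimplementation of wdiff, and the helper name `word_changes` is made up for this example. It splits both texts on whitespace and reports only the word runs that differ.

```python
import difflib


def word_changes(old_text, new_text):
    """Return (tag, old_words, new_words) for every differing run of words."""
    old_words = old_text.split()
    new_words = new_text.split()

    # autojunk=False turns off difflib's heuristic that ignores very
    # frequent items in long sequences, which matters for common words.
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)

    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # tag is 'replace', 'delete' or 'insert'
            changes.append(
                (tag, " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2]))
            )
    return changes


if __name__ == "__main__":
    for change in word_changes("the quick brown fox", "the slow brown fox jumps"):
        print(change)
```

Passing lists of lines instead of lists of words to the same `SequenceMatcher` gives the line-by-line behaviour.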

Classic papers on the algorithm are:

Upvotes: 0

Michael Madsen

Reputation: 55009

What you're describing sounds exactly like a diff-style tool. This sort of functionality is available in many of the more advanced text editors.

Upvotes: 2

amelvin

Reputation: 9061

There is an extensive list of file comparison tools on Wikipedia.

If you want to do it programmatically, I've used sed and awk on Unix systems before now, and there are Windows versions too. Basically, these file processing languages let you read and compare text files line by line and then do something with the differences (for example, save them to a third file).
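As a sketch of the "save the differences to a third file" idea, here is one way to do it in Python with the standard `difflib` module rather than sed or awk. The function name `write_diff` and the choice of UTF-8 are assumptions for the example; it reads both files fully into memory, which should be acceptable for files of a few MB.

```python
import difflib


def write_diff(old_path, new_path, out_path):
    """Write a unified diff of two text files to out_path.

    Returns the number of lines written (0 means the files are identical).
    """
    with open(old_path, encoding="utf-8") as f:
        old_lines = f.readlines()
    with open(new_path, encoding="utf-8") as f:
        new_lines = f.readlines()

    # unified_diff yields header lines, '@@' hunk markers giving the line
    # ranges, and content lines prefixed with ' ', '-' or '+'.
    diff = difflib.unified_diff(
        old_lines, new_lines, fromfile=old_path, tofile=new_path
    )

    written = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for line in diff:
            out.write(line)
            written += 1
    return written
```

The `@@ -a,b +c,d @@` markers in the output file tell you where in each file a group of changes sits, and the `-` and `+` lines tell you what was deleted and inserted.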

Upvotes: 1

Itay Karo

Reputation: 18296

You can try Notepad++. It is an open source text editor that has a plugin for comparing files.

Upvotes: 1
