Mahesh

Reputation: 233

Comparing two files

I have two files.

The contents of file1 are as follows:

===================================================

OUTPUT1:---------
    orange
    india
    US

xx

OUTPUT2:---------

    orange-1
    india-1
    US-1
xx

===================================================

The contents of file2 are as follows:

OUTPUT1:---------
    orange
    india
    US

xx

OUTPUT2:---------
    orange-1
    india-1
    US-2
xx

===================================================

I want the difference between the two files reported as below:

-----------------------
OUTPUT1: No evolution
----------------------
OUTPUT2: Evolution found
Before:US-1
After:US-2
----------------------

Is it possible to write a Perl script that meets the above requirement?

Any help will be much appreciated.

Upvotes: 1

Views: 542

Answers (3)

Dallaylaen

Reputation: 5308

Algorithm::Diff should do the job. It works on arrays (i.e. you can parse whatever input format you like) and generates diff-like output.

However, it might turn out that the LCS (longest common subsequence) algorithm is overkill for this task, in which case you could just go with hash tables instead.
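
A minimal sketch of the hash-table route in plain Perl, assuming the lines of one section have already been read into two arrays. The helper name hash_diff is made up for illustration, and this approach ignores line order and duplicate lines, which may or may not be acceptable for your data:

```perl
use strict;
use warnings;

# Hypothetical helper: returns two array refs, the lines found only in
# @$old and the lines found only in @$new. Order and duplicates are ignored.
sub hash_diff {
    my ($old, $new) = @_;
    my %in_old = map { $_ => 1 } @$old;
    my %in_new = map { $_ => 1 } @$new;
    my @removed = grep { !$in_new{$_} } @$old;
    my @added   = grep { !$in_old{$_} } @$new;
    return (\@removed, \@added);
}

# The OUTPUT2 section from the question, already split into lines.
my ($removed, $added) = hash_diff(
    [qw(orange-1 india-1 US-1)],
    [qw(orange-1 india-1 US-2)],
);
print "Before:$_\n" for @$removed;
print "After:$_\n"  for @$added;
```

For the sample data this prints Before:US-1 followed by After:US-2. If the position of a line matters, or lines can repeat, Algorithm::Diff is probably the safer choice.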

Upvotes: 1

Blender

Reputation: 298176

No perl, but something more awesome: diff!

It compares files:

[blender@arch Desktop]$ diff file1.txt file2.txt 
11c11
<     US-1
---
>     US-2

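11c11 says that line 11 of the first file was changed (the c) and corresponds to line 11 of the second file. Lines marked < come from file1.txt, and lines marked > come from file2.txt.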

Upvotes: 5

djna

Reputation: 55907

Possible in Perl, for sure; it's a pretty powerful language.

The degree of difficulty will be affected by the assumptions we can make about the data. Is it sorted? How big are the files?

If the data is unsorted and the files are too large to be held entirely in memory, then you may need to adopt a pipeline approach: first sorting, then "diffing". In that case, if you have access to Unix-heritage tools such as diff and sort, you may not even need Perl.

Assuming you want to use Perl, I'd suggest looking at the problem in stages:

  1. Identify "records", which span multiple lines. Write code to consume a single file and build a representation of each record.
  2. Solve the sort problem. If need be, build an intermediate file containing the sorted records.
  3. Do the diff across the two sorted files. If you can build a hash of one entire file in memory, this is easy; otherwise you need to fetch records from one file or the other, depending upon which one has the "next" record.
  4. Having identified a change, print out the details in the desired format.
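
The stages above could be sketched roughly as follows. This assumes that both files fit in memory, that every record starts with a header like OUTPUT1:--------- and ends with a line containing only xx, and that lines within a record are compared position by position. The sub names read_records and report are hypothetical, and the sample data is inlined through in-memory filehandles only to keep the example self-contained; with real data you would open file1 and file2 instead.

```perl
use strict;
use warnings;

# Stage 1: parse a filehandle into { record name => [trimmed lines] }.
# Assumption: a record starts at "OUTPUTn:..." and ends at a line "xx".
sub read_records {
    my ($fh) = @_;
    my (%records, $name);
    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /^(OUTPUT\d+):/) {
            $name = $1;
            $records{$name} = [];
        }
        elsif ($line =~ /^xx\s*$/) {
            undef $name;                      # end of the current record
        }
        elsif (defined $name && $line =~ /\S/) {
            $line =~ s/^\s+|\s+$//g;          # trim surrounding whitespace
            push @{ $records{$name} }, $line;
        }
    }
    return \%records;
}

# Stages 2-4: walk the record names in sorted order, compare the lines
# position by position, and build the report text.
sub report {
    my ($old, $new) = @_;
    my %all  = (%$old, %$new);                # union of record names
    my $rule = '-' x 22;
    my $out  = "$rule\n";
    for my $name (sort keys %all) {           # plain string sort
        my @before = @{ $old->{$name} || [] };
        my @after  = @{ $new->{$name} || [] };
        my $last   = $#before > $#after ? $#before : $#after;
        my @changes;
        for my $i (0 .. $last) {
            my $was = defined $before[$i] ? $before[$i] : '';
            my $now = defined $after[$i]  ? $after[$i]  : '';
            push @changes, "Before:$was\nAfter:$now\n" if $was ne $now;
        }
        $out .= @changes
            ? "$name: Evolution found\n" . join('', @changes)
            : "$name: No evolution\n";
        $out .= "$rule\n";
    }
    return $out;
}

# Demo data from the question, inlined so the sketch runs on its own.
my $file1 = join "\n",
    'OUTPUT1:---------', '    orange', '    india', '    US', '', 'xx', '',
    'OUTPUT2:---------', '    orange-1', '    india-1', '    US-1', 'xx', '';
(my $file2 = $file1) =~ s/US-1/US-2/;         # file2 differs in one line

open my $fh1, '<', \$file1 or die "Cannot open in-memory file1: $!";
open my $fh2, '<', \$file2 or die "Cannot open in-memory file2: $!";
print report(read_records($fh1), read_records($fh2));
```

On the sample data this reports OUTPUT1 as "No evolution" and OUTPUT2 as "Evolution found" with Before:US-1 and After:US-2. Note that the record names are sorted as plain strings, so OUTPUT10 would come before OUTPUT2; a numeric sort on the trailing digits would be needed if that matters.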

Upvotes: 0
