user175084

Reputation: 4630

Script to compare 2 files line by line

I have two text files:

File1.txt

dadads 434 43 {"4fsdf":"66db1" fdf1:"5834"}
gsgss 45 0 {"gsdg":"8853" sgdfg:"4631"}
fdf 767 4643 {"klhf":"3455" kgs:"4566"}
.  
.

File2.txt

8853
6437437567
36265
4566
.
.

The output could be two files:

Match.txt

gsgss 45 0 {"gsdg":"8853" sgdfg:"4631"}
fdf 767 4643 {"klhf":"3455" kgs:"4566"}

Non_Match.txt

dadads 434 43 {"4fsdf":"66db1" fdf1:"5834"}

Can someone help me write a bash script for this?

I think I have the logic here, if it helps:

 for (row in File1.txt) {
   bool found = false;
   for (id in File2.txt) {
     if (row contains id) {
       found = true;
       echo row >> Match.txt
       break;
     }
   }
   if (!found) {
     echo row >> Non_Match.txt
   }
 }

Edit Part:

I also have a bash script, but it does not do what I want: it writes only the ID that matches instead of the row that matches.

#!/bin/bash

set -e

file1="File2.txt"
file2="File1.txt"

for id in $(tail -n+1 "${file1}"); do
   if ! grep "${id}" "${file2}"; then
      echo "${id}" >>non_matches.txt
   else
       echo "${id}" >>matches.txt
   fi
done
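
A minimal sketch of the pseudocode above as a bash loop, assuming that "row contains id" means a plain substring match. It iterates over the rows of File1.txt (rather than over the IDs, as the script above does) and appends the whole row to the output file. The scratch directory and the inlined sample data are assumptions, added only to make the sketch self-contained.

```shell
#!/bin/bash
# Sample data (copied from the example above) in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' \
  'dadads 434 43 {"4fsdf":"66db1" fdf1:"5834"}' \
  'gsgss 45 0 {"gsdg":"8853" sgdfg:"4631"}' \
  'fdf 767 4643 {"klhf":"3455" kgs:"4566"}' > File1.txt
printf '%s\n' 8853 6437437567 36265 4566 > File2.txt

# Start with empty output files.
: > Match.txt
: > Non_Match.txt

# Outer loop: one row of File1.txt at a time.
while IFS= read -r row; do
  found=false
  # Inner loop: one ID of File2.txt at a time.
  while IFS= read -r id; do
    # Skip empty IDs, which would match every row.
    [ -z "$id" ] && continue
    # Plain substring test, as in "row contains id".
    if [[ $row == *"$id"* ]]; then
      found=true
      break
    fi
  done < File2.txt
  # Write the whole row, not the ID.
  if $found; then
    printf '%s\n' "$row" >> Match.txt
  else
    printf '%s\n' "$row" >> Non_Match.txt
  fi
done < File1.txt
```

Because this re-reads File2.txt for every row, it is likely to be slow on large inputs.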

Upvotes: 0

Views: 1732

Answers (2)

willeM_ Van Onsem

Reputation: 476574

This sounds a bit like a job for diff, or for wdiff if you want to compare at the word level.

If you run diff on your two files, it prints the following output:

1,3c1,4
< dadads 434 43 {"4fsdf":"66db1" fdf1:"5834"}
< gsgss 45 0 {"gsdg":"8853" sgdfg:"4631"}
< fdf 767 4643 {"klhf":"3455" kgs:"4566"}
---
> 8853
> 6437437567
> 36265
> 4566

This means that the "minimal" way (per line) to turn the first file into the second is to remove every line of the first file and add every line of the second.

If, however, the second file had been:

8853
6437437567
gsgss 45 0 {"gsdg":"8853" sgdfg:"4631"}
36265
4566

The diff output is:

1c1,2
< dadads 434 43 {"4fsdf":"66db1" fdf1:"5834"}
---
> 8853
> 6437437567
3c4,5
< fdf 767 4643 {"klhf":"3455" kgs:"4566"}
---
> 36265
> 4566

So diff no longer asks to remove the second line, because that line now appears in both files.

wdiff does approximately the same, but at the word level:

[-dadads 434 43 {"4fsdf":"66db1" fdf1:"5834"}-]{+8853
6437437567+}
gsgss 45 0 {"gsdg":"8853" sgdfg:"4631"}
[-fdf 767 4643 {"klhf":"3455" kgs:"4566"}-]
{+36265
4566+}

Upvotes: 1

John Kugelman

Reputation: 361585

You could use grep -f to look for search patterns that are listed in a separate file. It'd probably be good to use the -F (fixed strings) and -w (match whole words) flags as well.

grep -Fw  -f File2.txt File1.txt > Match.txt
grep -Fwv -f File2.txt File1.txt > Non_Match.txt
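
As a quick check, here is a sketch that runs those two commands against the sample data from the question. The scratch directory and the inlined copies of File1.txt and File2.txt are assumptions made for the demo; the `.` placeholder lines are left out.

```shell
#!/bin/bash
# Recreate the sample data from the question in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

cat > File1.txt <<'EOF'
dadads 434 43 {"4fsdf":"66db1" fdf1:"5834"}
gsgss 45 0 {"gsdg":"8853" sgdfg:"4631"}
fdf 767 4643 {"klhf":"3455" kgs:"4566"}
EOF

cat > File2.txt <<'EOF'
8853
6437437567
36265
4566
EOF

# -F: treat each ID as a literal string, not a regex
# -w: match whole words only
# -f: read the patterns from File2.txt
# -v: invert the match (rows that contain none of the IDs)
grep -Fw  -f File2.txt File1.txt > Match.txt
grep -Fwv -f File2.txt File1.txt > Non_Match.txt

echo "Match.txt:";     cat Match.txt
echo "Non_Match.txt:"; cat Non_Match.txt
```

On this data the gsgss and fdf rows should land in Match.txt and the dadads row in Non_Match.txt, which is the output asked for in the question.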

Upvotes: 5
