Shivika
Shivika

Reputation: 229

Compare two files and update one with other in shell script

I have two files which are having same key column and based on that I have to update one file with other. Could you please help me to write logic for this in shell script?

Example:

src file:

+-------+-------+
|call_id|call_nm|
+-------+-------+
|    100|     QC|
|    105|     XY|
|    110|     NM|
|    115|     AB|
+-------+-------+

lkp file:

+-------+-------+
|call_id|call_nm|
+-------+-------+
|    100|     QZ|
|    105|     XY|
|    106|     XZ|
|    115|     JQ|
+-------+-------+

In above example I have to match both the files and update src_File based on the value in lkp_file. We have key_column - call_id.

First check key_column and then update the value in src_file. The final output should be like this-

src file:

+-------+-------+
|call_id|call_nm|
+-------+-------+
|    100|     QZ|
|    105|     XY|
|    110|     NM|
|    115|     JQ|
+-------+-------+

You can see value of 100 & 115 have changed based on lkp_file values.

Upvotes: 0

Views: 466

Answers (3)

ufopilot
ufopilot

Reputation: 3975

Using awk:

awk -F\| '{if(NR==FNR){a[$2]=$0}else{if($2 in a)$0=a[$2];print}}' lkp_file src_file
+-------+-------+
|call_id|call_nm|
+-------+-------+
|    100|     QZ|
|    105|     XY|
|    110|     NM|
|    115|     JQ|
+-------+-------+

Upvotes: 1

Dudi Boy
Dudi Boy

Reputation: 4865

Suggesting awk script:

script.awk

BEGIN     {FS = OFS = "|"} # set awk field seperator and output field seperator to |
NR == 1   {footer = $0} # read footer
NR < 4    {header = header ORS $0; next} # read and ignore header
FNR == NR {arr[$2] = $3; next} # read array from 1st file
$2 in arr {arr[$2] = $3} # update arra if found id in file2
END {  # output the formated array
  print header;
  asorti(arr,arrSorted); # sort array indexes in arrSorted
  for (i in arrSorted) { # for each element in sorted array
    if (arrSorted[i] == "") continue; # ignore empty element
    print OFS arrSorted[i] OFS arr[arrSorted[i]] OFS; # print current element
  }
  print footer;
}

running script.awk

awk script.awk src_file.txt lkp_file.txt

output:

+-------+-------+
|call_id|call_nm|
+-------+-------+
|    100|     QZ|
|    105|     XY|
|    110|     NM|
|    115|     JQ|
+-------+-------+

Notice the output is correct and sorted in id order.

Upvotes: 0

Victor Lee
Victor Lee

Reputation: 2658

Try this(back up your original file before exec this command):

grep -Eo '[0-9]{3,3}|[A-Z]{2,2}' lkp | xargs -n2 | xargs -I {} bash -c 'o=$(echo {}); n=(${o}); id=${n[0]}; nm=${n[1]}; sed -i "s/\($id|\ *\).*|/\1$nm|/" src'
  • '[0-9]{3,3}|[A-Z]{2,2}': extract id and nm
  • xargs -n2: create the pairs like 100 QZ
  • -I {} bash -c: handle the pairs to id and nm variable for sed command
  • "s/\($id|\ *\).*|/\1$nm|/": only suitable for your file format like | 100| QC|, just replace the nm before |, or you could use the pattern what you want such as [A-Z]+

Upvotes: 1

Related Questions