VIPIN KUMAR
VIPIN KUMAR

Reputation: 3147

How to find the value of column on the basis of matching column from two files?

file 1 : emp.txt

7839|KING|PRESIDENT||17-Nov-81|5000||10
7698|BLAKE|MANAGER|7839|01-May-81|2850||30
7782|CLARK|MANAGER|7839|09-Jun-81|2450||10
7566|JONES|MANAGER|7839|02-Apr-81|2975||20
7788|SCOTT|ANALYST|7566|19-Apr-87|3000||20
7902|FORD|ANALYST|7566|03-Dec-81|3000||20
7369|SMITH|CLERK|7902|17-Dec-80|800||20
7499|ALLEN|SALESMAN|7698|20-Feb-81|1600|300|30
7521|WARD|SALESMAN|7698|22-Feb-81|1250|500|30
7654|MARTIN|SALESMAN|7698|28-Sep-81|1250|1400|30

file 2 : dept.txt

  10|ACCOUNTING|NEW YORK
  20|RESEARCH|DALLAS
  30|SALES|CHICAGO
  40|OPERATIONS|BOSTON

I want to print below output :

7839|KING|PRESIDENT||17-Nov-81|5000||10|NEW YORK
7698|BLAKE|MANAGER|7839|01-May-81|2850||30|CHICAGO
7782|CLARK|MANAGER|7839|09-Jun-81|2450||10|NEW YORK
7566|JONES|MANAGER|7839|02-Apr-81|2975||20|DALLAS
7788|SCOTT|ANALYST|7566|19-Apr-87|3000||20|DALLAS
7902|FORD|ANALYST|7566|03-Dec-81|3000||20|DALLAS
7369|SMITH|CLERK|7902|17-Dec-80|800||20|DALLAS
7499|ALLEN|SALESMAN|7698|20-Feb-81|1600|300|30|CHICAGO
7521|WARD|SALESMAN|7698|22-Feb-81|1250|500|30|CHICAGO
7654|MARTIN|SALESMAN|7698|28-Sep-81|1250|1400|30|CHICAGO

I tried below awk statement, but it is not printing anything -

awk -F'|' 'NR==FNR {val[$1]=$3; next} $8 in val {print $1,$2,$3,$4,$5,$6,$7,$8,val[$1]}' OFS="|"   dept.txt emp.txt

Any Suggestion ??

Upvotes: 1

Views: 111

Answers (3)

jaypal singh
jaypal singh

Reputation: 77105

Use $NF, which is the value of the last field:

➜ awk '
    BEGIN { FS = OFS = "|" }
    NR==FNR { location[$1] = $NF; next }
    { print (location[$NF] ? $0 OFS location[$NF] : $0) }
' dept.txt emp.txt
7839|KING|PRESIDENT||17-Nov-81|5000||10|NEW YORK
7698|BLAKE|MANAGER|7839|01-May-81|2850||30|CHICAGO
7782|CLARK|MANAGER|7839|09-Jun-81|2450||10|NEW YORK
7566|JONES|MANAGER|7839|02-Apr-81|2975||20|DALLAS
7788|SCOTT|ANALYST|7566|19-Apr-87|3000||20|DALLAS
7902|FORD|ANALYST|7566|03-Dec-81|3000||20|DALLAS
7369|SMITH|CLERK|7902|17-Dec-80|800||20|DALLAS
7499|ALLEN|SALESMAN|7698|20-Feb-81|1600|300|30|CHICAGO
7521|WARD|SALESMAN|7698|22-Feb-81|1250|500|30|CHICAGO
7654|MARTIN|SALESMAN|7698|28-Sep-81|1250|1400|30|CHICAGO

This assumes you still want the entire line regardless if the dept city index exists. If not then please update your question to reflect common use cases and expected output.

Upvotes: 2

rakinhaider
rakinhaider

Reputation: 124

The problem is there are two spaces in front of the matching column. Since you are using '|' as your field separator then each row of the second file is divided as follows.(Using the first row as example.)

  10|ACCOUNTING|NEW YORK

$1="  10"
$2="ACCOUNTING"
$3="NEW YORK"

So you are mapping Accounting with " 10" rather than "10". Thats why you don't get any match in the second file. (Assuming you wanted to use val[$8] rather than val[$1] in the second print command).

Do the following. This will fix your problem.

awk -F'|' 'NR==FNR {sub("  ","",$1);val[$1]=$3; next;} $8 in val {print $1,$2
,$3,$4,$5,$6,$7,$8,val[$8]}' OFS="|"   dept.txt emp.txt

Output:

7839|KING|PRESIDENT||17-Nov-81|5000||10|NEW YORK
7698|BLAKE|MANAGER|7839|01-May-81|2850||30|CHICAGO
7782|CLARK|MANAGER|7839|09-Jun-81|2450||10|NEW YORK
7566|JONES|MANAGER|7839|02-Apr-81|2975||20|DALLAS
7788|SCOTT|ANALYST|7566|19-Apr-87|3000||20|DALLAS
7902|FORD|ANALYST|7566|03-Dec-81|3000||20|DALLAS
7369|SMITH|CLERK|7902|17-Dec-80|800||20|DALLAS
7499|ALLEN|SALESMAN|7698|20-Feb-81|1600|300|30|CHICAGO
7521|WARD|SALESMAN|7698|22-Feb-81|1250|500|30|CHICAGO
7654|MARTIN|SALESMAN|7698|28-Sep-81|1250|1400|30|CHICAGO

Upvotes: 1

Dez
Dez

Reputation: 5838

In your code line, you should call the hash by the column that has the id where you hashed each value, in your case, column 8 is the one that stores the common id for the file you want to print the info out.

awk -F\| 'NR==FNR {val[$1]=$3; next} {print $1, $2, $3, $4, $5, $6, $7, $8, val[$8]};' OFS="|"  dept.txt emp.txt

Upvotes: 1

Related Questions