Reputation: 550
I have two CSV files with the following structure:
File 1:
date,keyword,location,page
2019-04-11,ABC,mumbai,http://www.insurers.com
and so on.
File 2:
date,site,market,location,url
2019-05-12,denmark,de,Frankfurt,http://lufthansa.com
2019-04-11,Netherlands,nl,amsterdam,http://www.insurers.com
The problem is I need to match the dates in both files as well as the URLs. Example:
2019-04-11 and http://www.insurers.com (File 1)
with
2019-04-11 and http://www.insurers.com (File 2)
Edit:
If this condition is satisfied, the keyword (ABC) from File 1 should be inserted into File 2 as a new third column.
Expected Output:
date,site,keyword,market,location,url
2019-04-11,Netherlands,ABC,nl,amsterdam,http://www.insurers.com
I have tried putting the dates and URLs in a map in Java, but there are too many duplicated URLs. So I am looking for a bash, awk, grep or sed solution. Thanks.
Upvotes: 1
Views: 111
Reputation: 203617
$ awk '
BEGIN { FS=OFS="," }                           # read and write comma-separated fields
NR==FNR { m[$1,(NR>1?$4:"url")]=$2; next }     # file1: index keyword by (date,url); the header row maps ("date","url") -> "keyword" so the output header comes out right too
($1,$5) in m { $2=$2 OFS m[$1,$5]; print }     # file2: on a (date,url) match, append the keyword after the site column
' file1 file2
date,site,keyword,market,location,url
2019-04-11,Netherlands,ABC,nl,amsterdam,http://www.insurers.com
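If join(1) is available, here is an equivalent sketch (assuming the files are named file1 and file2 as above, and that no field contains an embedded comma): key each file by "date,url", sort on that key, join, then reorder the columns.

```shell
# Build "date,url<TAB>payload" keyed files, skipping the header rows.
awk -F, 'NR>1 { print $1 "," $4 "\t" $2 }' file1 | sort > keys1   # payload: keyword
awk -F, 'NR>1 { print $1 "," $5 "\t" $0 }' file2 | sort > keys2   # payload: whole file2 line

# Emit the header, then join on the composite key and splice the keyword
# in as the third column of the file2 record.
printf 'date,site,keyword,market,location,url\n'
join -t "$(printf '\t')" keys1 keys2 |
awk -F'\t' 'BEGIN { OFS="," } { split($3, f, ","); print f[1], f[2], $2, f[3], f[4], f[5] }'
```

This is longer than the awk one-liner, but the sort/join step scales to inputs too large to hold in memory.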
Upvotes: 2
Reputation:
Try GNU sed. The first sed turns each File2 line into a sed substitution command keyed on its date and URL; the second sed reads that generated script from stdin (`-f -`) and applies it to File1:
sed -En 's!^([0-9]{4}-[0-9]+-[0-9]+,).+(http://\w.+)!s#^\1([^,]+),[^,]+,\\s*\2#\\1#p!p' File2| sed -Enf - File1 >Result
Upvotes: 0