Yamuna_dhungana
Yamuna_dhungana

Reputation: 663

How to match a column and extract values between two files in Shell

I have two csv files with these values

FileA.csv

CACHE_1005,VB14314690
CACHE_1071,VB69700608
CACHE_1211,VB85967916

FileB.csv

CACHE_1005,unk,HS2000-785_105^1
CACHE_1005,unk,HS2000-785_105^2
CACHE_1005,unk,HS2000-899_107^2
CACHE_1005,unk,HS2000-901_127^1
CACHE_1071,unk,HS2000-152_566^1
CACHE_1071,unk,HS2000-940_103^7
CACHE_1071,unk,HS2000-940_103^8
CACHE_1211,unk,HS2000-152_566^2
CACHE_1211,unk,HS2000-152_566^3
CACHE_1211,unk,HS2000-152_566^4

I want to match first columns in two csv files and extract values of second column in FileA and append to FileB. This is the result I want:

Result.csv

CACHE_1005,unk,HS2000-785_105^1,VB14314690
CACHE_1005,unk,HS2000-785_105^2,VB14314690
CACHE_1005,unk,HS2000-899_107^2,VB14314690
CACHE_1005,unk,HS2000-901_127^1,VB14314690
CACHE_1071,unk,HS2000-152_566^1,VB69700608
CACHE_1071,unk,HS2000-940_103^7,VB69700608
CACHE_1071,unk,HS2000-940_103^8,VB69700608
CACHE_1211,unk,HS2000-152_566^2,VB85967916
CACHE_1211,unk,HS2000-152_566^3,VB85967916
CACHE_1211,unk,HS2000-152_566^4,VB85967916

Upvotes: 1

Views: 377

Answers (2)

stackoverflower
stackoverflower

Reputation: 122

You can read fileA (by while read line or for loop) each line and use sed command to append column 2 to fileB:

#!/bin/bash
for lineA in $(cat FileA.csv); do
        pattern=${lineA%,*}
        extract=${lineA#*,}
        sed -i "/$pattern/s/$/\,$extract/g" FileB.csv
done

Upvotes: 0

thanasisp
thanasisp

Reputation: 5975

This is exactly what the join command is doing.

> join -t, fileB.csv fileA.csv 
CACHE_1005,unk,HS2000-785_105^1,VB14314690
CACHE_1005,unk,HS2000-785_105^2,VB14314690
CACHE_1005,unk,HS2000-899_107^2,VB14314690
CACHE_1005,unk,HS2000-901_127^1,VB14314690
CACHE_1071,unk,HS2000-152_566^1,VB69700608
CACHE_1071,unk,HS2000-940_103^7,VB69700608
CACHE_1071,unk,HS2000-940_103^8,VB69700608
CACHE_1211,unk,HS2000-152_566^2,VB85967916
CACHE_1211,unk,HS2000-152_566^3,VB85967916
CACHE_1211,unk,HS2000-152_566^4,VB85967916

-t sets the separator, also files have to be sorted like in your example, if not, you can join <(sort file2) <(sort file1) without preserving the order, or give a try with --nocheck-order.

Note: assuming you have a simple csv file, like in your example, without nested separators inside quoted fields.

Upvotes: 7

Related Questions