Reputation: 663
I have a text file like this:
test.list
##a
##b
##C
#CHROM 0_62000_1 0_62000_5 0_62070_19 0_62000
I have OLD_SM.list
0_62000_1
0_62000
0_62070_19
and NEW_SM.list
APPLE
BANANA
KIWI
I want to replace each word in test.list that matches an entry in OLD_SM.list with the corresponding entry in NEW_SM.list.
I would prefer a sed command, so I tried something like this, which doesn't work:
paste OLD_SM.list NEW_SM.list | while read OLD_SM NEW_SM; do sed -i "/^#CHROM/s/[[:space:]]${OLD_SM}$/\t${NEW_SM}/g" test.list; done
The result I want:
##a
##b
##C
#CHROM APPLE 0_62000_5 KIWI BANANA
Upvotes: 1
Views: 1327
Reputation: 246837
A slightly different take: build up the sed program as a bash array:
sed_opts=()
while read -r old <&3; read -r new <&4; do
sed_opts+=( -e "s/\\<$old\\>/$new/g" )
done 3< OLD_SM.list 4< NEW_SM.list
sed "${sed_opts[@]}" test.list
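The loop above applies every substitution to every line of the file. Here is a minimal sketch of the same array technique restricted to the #CHROM header line, assuming bash (for arrays) and GNU sed (for \< and \>). The scratch directory and the result variable are only there for illustration and are not part of the original answer:

```shell
# Hypothetical scratch copy of the question's files, so nothing real is touched.
workdir=$(mktemp -d)
printf '%s\n' '##a' '##b' '##C' '#CHROM 0_62000_1 0_62000_5 0_62070_19 0_62000' > "$workdir/test.list"
printf '%s\n' 0_62000_1 0_62000 0_62070_19 > "$workdir/OLD_SM.list"
printf '%s\n' APPLE BANANA KIWI > "$workdir/NEW_SM.list"

# Build one -e expression per old/new pair; the /^#CHROM/ address limits
# each substitution to the header line.
sed_opts=()
while read -r old <&3 && read -r new <&4; do
    sed_opts+=( -e "/^#CHROM/s/\\<$old\\>/$new/g" )
done 3< "$workdir/OLD_SM.list" 4< "$workdir/NEW_SM.list"

# Without -i the result goes to stdout and test.list is left untouched.
result=$(sed "${sed_opts[@]}" "$workdir/test.list")
printf '%s\n' "$result"
```

Because _ counts as a word character, \<0_62000\> does not match inside 0_62000_1 or 0_62000_5.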
Upvotes: 1
Reputation: 141135
With GNU sed you can match the beginning and end of a word with \< and \>. You may first generate a sed script from your input and then pass it to sed. The input must not contain any characters that are special to sed.
script=$(
paste OLD_SM.list NEW_SM.list |
sed 's/\(.*\)\t\(.*\)/s~\\<\1\\>~\2~g/'
)
sed -i "/^#CHROM/{ $script }" test.list
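To see what the generated script looks like, here is a small sketch, assuming GNU sed. It uses a hypothetical scratch directory and leaves out -i so that no real file is edited. The closing brace is put on its own line here, which is just a conservative way to end the block:

```shell
# Hypothetical scratch copy of the question's files.
workdir=$(mktemp -d)
printf '%s\n' '##a' '##b' '##C' '#CHROM 0_62000_1 0_62000_5 0_62070_19 0_62000' > "$workdir/test.list"
printf '%s\n' 0_62000_1 0_62000 0_62070_19 > "$workdir/OLD_SM.list"
printf '%s\n' APPLE BANANA KIWI > "$workdir/NEW_SM.list"

# Turn each "old<TAB>new" line into one s~\<old\>~new~g command.
script=$(
    paste "$workdir/OLD_SM.list" "$workdir/NEW_SM.list" |
    sed 's/\(.*\)\t\(.*\)/s~\\<\1\\>~\2~g/'
)
printf '%s\n' "$script"
# prints:
# s~\<0_62000_1\>~APPLE~g
# s~\<0_62000\>~BANANA~g
# s~\<0_62070_19\>~KIWI~g

# Apply the generated commands to the #CHROM line only, writing to stdout.
result=$(sed "/^#CHROM/{
$script
}" "$workdir/test.list")
printf '%s\n' "$result"
```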
In s/[[:space:]]${OLD_SM}$ the $ matches the end of the line, so the pattern can only ever match the last word on the line. You may instead use s/\(^\|[[:space:]]\)$OLD_SM\([[:space:]]\|$\)/\1$NEW_SM\2/ which matches the beginning of the line or a space, then the word, then a space or the end of the line, and puts the surrounding separators back via the backreferences \1 and \2. Topics to research: regex and backreferences in sed.
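Here is a sketch of the question's loop with that substitution dropped in. It assumes GNU sed (for \| and -i) and names that contain no regex metacharacters. The scratch directory is hypothetical and only keeps the example from touching real files:

```shell
# Hypothetical scratch copy of the question's files.
workdir=$(mktemp -d)
printf '%s\n' '##a' '##b' '##C' '#CHROM 0_62000_1 0_62000_5 0_62070_19 0_62000' > "$workdir/test.list"
printf '%s\n' 0_62000_1 0_62000 0_62070_19 > "$workdir/OLD_SM.list"
printf '%s\n' APPLE BANANA KIWI > "$workdir/NEW_SM.list"

# For each old/new pair, replace the old name only when it is delimited by
# whitespace or a line boundary; \1 and \2 restore the delimiters.
paste "$workdir/OLD_SM.list" "$workdir/NEW_SM.list" | while read -r old new; do
    sed -i "/^#CHROM/s/\(^\|[[:space:]]\)${old}\([[:space:]]\|$\)/\1${new}\2/" "$workdir/test.list"
done

cat "$workdir/test.list"
# last line printed: #CHROM APPLE 0_62000_5 KIWI BANANA
```

Each old name occurs once on the header line, so the g flag is not needed here.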
Upvotes: 3
Reputation: 785246
You may use this paste + awk solution:
awk -v OFS='\t' 'NR == FNR { map[$1]=$2; next} $1 == "#CHROM" {for (i=2; i<=NF; ++i) $i in map && $i=map[$i]} 1' <(paste OLD_SM.list NEW_SM.list) test.list
##a
##b
##C
#CHROM APPLE 0_62000_5 KIWI BANANA
Expanded form:
awk -v OFS='\t' '
NR == FNR {
map[$1] = $2
next
}
$1 == "#CHROM" {
for (i=2; i<=NF; ++i)
$i in map && $i = map[$i]
}
1' <(paste OLD_SM.list NEW_SM.list) test.list
Upvotes: 3