Reputation: 101
I have a tab separated text file. In column 1 and 2 there are family and individual ids that start with a character followed by number as follow:
HG1005 HG1005
HG1006 HG1006
HG1007 HG1007
NA1008 NA1008
NA1009 NA1009
I would like to replace NA with HG in both the columns. I am very new to linux and tried the following code and some others:
awk '{sub("NA","HG",$2)';print}' input file > output file
Any help is highly appreciated.
Upvotes: 1
Views: 1114
Reputation: 626738
The $2
in your call to sub
only replaces the first occurrence of NA
in the second field.
Note that while sed
is more typical for such scenarios:
sed 's/NA/HG/g' inputfile > outputfile
you can still use awk
:
awk '{gsub("NA","HG")}1' inputfile > outputfile
See the online demo.
Since there is no input variable in gsub
(that performs multiple search and replaces) the default $0
is used, i.e. the whole record, the current line, and the code above is equal to awk '{gsub("NA","HG",$0)}1' inputfile > outputfile
.
The 1
at the end triggers printing the current record, it is a shorter variant of print
.
Upvotes: 1
Reputation: 1126
Notice /^NA/
position at the beginning of field:
awk '{for(i=1;i<=NF;i++)if($i ~ /^NA/) sub(/^NA/,"HG",$(i))} 1' file
HG1005 HG1005
HG1006 HG1006
HG1007 HG1007
HG1008 HG1008
HG1009 HG1009
and save it:
awk '{for(i=1;i<=NF;i++)if($i ~ /^NA/) sub(/^NA/,"HG",$(i))} 1' file > outputfile
If you have a tab as separator:
awk 'BEGIN{FS=OFS="\t"} {for(i=1;i<=NF;i++)if($i ~ /^NA/) sub(/^NA/,"HG",$(i))} 1' file > outputfile
Upvotes: 1
Reputation: 133438
Converting my comment to answer now, use gsub
in spite of sub
here. Because it will globally substitute NA to HG here.
awk 'BEGIN{FS=OFS="\t"} {gsub("NA","HG");print}' inputfile > outputfile
OR use following in case you have several fields and you want to perform substitution only in 1st and 2nd fields.
awk 'BEGIN{FS=OFS="\t"} {sub("NA","HG",$1);sub("NA","HG",$2);print}' inputfile > outputfile
Change sub
to gsub
in 2nd code in case multiple occurrences of NA needs to be changed within field itself.
Upvotes: 2