apply dictionary mapping to the column of a file with awk

Question

I have a text file file.txt with several columns (tab separated), and the first column can contain indexes such as 1, 2, and 3. I want to update the first column so that 1 becomes "one", 2 becomes "two", and 3 becomes "three". I created a bash file a.sh containing:

declare -A DICO=( [1]="one" [2]="two" [3]="three" )
awk '{ $1 = ${DICO[$1]}; print }'

But now when I run cat file.txt | ./a.sh I get:

awk: cmd. line:1: { $1 = ${DICO[$1]}; print }
awk: cmd. line:1:         ^ syntax error

I'm not able to fix the syntax. Any ideas? There may also be a better way to do this with bash, but I could not think of another simple approach.

For instance, if the input is a file containing:

2       xxx
2       yyy
1       zzz
3       000
4       bla

The expected output would be:

two     xxx
two     yyy
one     zzz
three   000
UNKNOWN bla

RavinderSingh13 · Accepted Answer

EDIT: Since the OP has now added samples, I have changed the solution to match them.

awk 'BEGIN{split("one,two,three",array,",")} {$1=($1 in array)?array[$1]:"UNKNOWN"} 1' OFS='\t' Input_file

Explanation: Here is the same code with comments added.

awk '
BEGIN{                              ##Start of the BEGIN block, which runs once before any input is read.
  split("one,two,three",array,",")  ##Split the string on commas into array, so array[1]="one", array[2]="two", array[3]="three".
}
{
  $1=($1 in array)?array[$1]:"UNKNOWN" ##If $1 is an index of array, replace it with array[$1]; otherwise replace it with the string UNKNOWN.
}
1                                   ##A condition that is always true with no action, so awk applies its default action and prints the current line.
' OFS='\t' Input_file               ##Set the output field separator to a tab and name the input file.
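
To check the lookup version against the sample from the question, something like the following can be run. It is only a sketch: the temporary file and the -F'\t' option (which splits strictly on tabs, as the question describes the file) are my additions and not part of the command above.

```shell
#!/bin/sh
# Sample input from the question, written tab separated to a temporary file.
input=$(mktemp)
printf '2\txxx\n2\tyyy\n1\tzzz\n3\t000\n4\tbla\n' > "$input"

# -F and OFS keep the data tab separated on input and output (assumption: the file is strictly tab separated).
result=$(awk -F'\t' '
BEGIN { split("one,two,three", array, ",") }     # array[1]="one", array[2]="two", array[3]="three"
{ $1 = ($1 in array) ? array[$1] : "UNKNOWN" }   # exact lookup on the whole first field
1                                                # always true, so the rebuilt line is printed
' OFS='\t' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

On that input this prints the expected output from the question, with UNKNOWN on the line whose first column is 4.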

Since you haven't shown samples, I couldn't test this completely; could you please try the following and let me know if it helps?

awk 'function check(value){gsub(value,array[value],$1)} BEGIN{split("one,two,three",array,",")} {check(1); check(2); check(3)} 1' Input_file

Here is the same solution in multi-line form.

awk '
function check(value){
  gsub(value,array[value],$1)
}
BEGIN{
  split("one,two,three",array,",")
}
{
  check(1)
  check(2)
  check(3)
}
1' OFS='\t' Input_file
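
Note that the check function relies on gsub, which replaces every 1, 2, or 3 found anywhere inside the first column rather than matching the column as a whole, and it leaves unmapped values untouched. A small comparison with the lookup approach might make the difference clearer. This is a sketch: the three-line input and the variable names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical input: an exact key, a multi-digit value, and an unmapped value.
input=$(mktemp)
printf '1\tzzz\n13\tfoo\n4\tbla\n' > "$input"

# gsub-based version: rewrites each 1, 2, or 3 wherever it occurs in the first field.
gsub_result=$(awk -F'\t' '
function check(value) { gsub(value, array[value], $1) }
BEGIN { split("one,two,three", array, ",") }
{ check(1); check(2); check(3) }
1
' OFS='\t' "$input")

# Lookup-based version: maps the first field only when it equals a key exactly.
lookup_result=$(awk -F'\t' '
BEGIN { split("one,two,three", array, ",") }
{ $1 = ($1 in array) ? array[$1] : "UNKNOWN" }
1
' OFS='\t' "$input")

printf 'gsub:\n%s\nlookup:\n%s\n' "$gsub_result" "$lookup_result"
rm -f "$input"
```

With gsub, 13 becomes onethree and 4 stays as 4; with the lookup, both become UNKNOWN. Which behaviour is wanted depends on the data.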

I also tested the code as follows.

Let's say we have the following Input_file:

cat Input_file
1213121312111122243434onetwothree wguwvrwvrwvbvrwvrvr
vkewjvrkmvr13232424

After running the code, the output is:

onetwoonethreeonetwoonethreeonetwooneoneoneonetwotwotwo4three4three4onetwothree wguwvrwvrwvbvrwvrvr
vkewjvrkmvronethreetwothreetwo4two4
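
As for the syntax error in the question: ${DICO[$1]} is bash syntax, and because the awk program is in single quotes, awk receives that text verbatim and cannot parse it. Bash arrays are not visible inside awk. If the mapping should live outside the awk program, one possible approach is to keep it in a small two-column file and load it first. This is a sketch under assumptions: the mapping file, its tab-separated layout, and the names map, data, and dico are all hypothetical.

```shell
#!/bin/sh
# Hypothetical files: a key/word mapping and the sample data from the question.
map=$(mktemp)
data=$(mktemp)
printf '1\tone\n2\ttwo\n3\tthree\n' > "$map"
printf '2\txxx\n2\tyyy\n1\tzzz\n3\t000\n4\tbla\n' > "$data"

mapped=$(awk -F'\t' -v OFS='\t' '
NR == FNR { dico[$1] = $2; next }               # first file only: load key -> word pairs
{ $1 = ($1 in dico) ? dico[$1] : "UNKNOWN" }    # second file: exact lookup on column 1
1                                               # print the rebuilt line
' "$map" "$data")

printf '%s\n' "$mapped"
rm -f "$map" "$data"
```

The NR == FNR test is true only while awk reads the first file, so the mapping is fully loaded before the data lines are processed. It assumes the mapping file is not empty.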
