KGee

Reputation: 373

How to add an extra column based on pattern of other 2 columns

I have a big file like this:

$1   $2  $3 $4  $5  
567 NA  0   0   NA
568 NA  0   0   NA
569 NA  0   0   NA
570 NA  0   0   NA
571 +   1   1   1
572 +   1   2   1
573 +   1   3   1
966 +   1   396 1
967 NA  0   396 NA
968 NA  0   396 NA
969 NA  0   396 NA
8793    +   1   -3599   2
8794    +   1   -3598   2
3277    -   -1  -146    3
3278    -   -1  -147    3
3279    -   -1  -148    3
8795    +   1   -3597   4
8796    +   1   -3596   4
3280    -   -1  -149    5
3281    -   -1  -150    5
3282    -   -1  -151    5
3283    -   -1  -152    6
3284    -   -1  -153    6
3285    -   -1  -154    6
5692    NA  0    0  NA
3286    -   -1  -155    7

I want to add a counter in an extra column, preferably at the end (say $6). Each time the value of $5 changes, the counter should be updated according to $2 on that row: +1 for +, -1 for -, and 0 for NA. The counter then keeps printing the same value until $5 changes again. The expected result looks like this:

$2  $5    $6
NA  NA  0
NA  NA  0
NA  NA  0
NA  NA  0
+   1   1
+   1   1
+   1   1
+   1   1
NA  NA  1
NA  NA  1
NA  NA  1
+   2   2
+   2   2
-   3   1
-   3   1
-   3   1
+   4   2
+   4   2
-   5   1
-   5   1
-   5   1
-   6   0
-   6   0
-   6   0
NA  NA  0
-   7   -1

I typed this:

awk 'BEGIN {v=0; p=0} {if ($2=="-") {v=v-1 ;p=v} if ($2=="NA") {p=n; v=$5;} else {v=$5+1;p=n}; $6=v;$7=p; print}' MyFIle 

But it gives almost the same values as $5.

Upvotes: 0

Views: 107

Answers (1)

karakfa

Reputation: 67467

$ awk 'BEGIN {a["+"]=1;a["-"]=-1;a["NA"]=0} 
       p!=$5 {p=$5; c+=a[$2]} 
             {print $2,$5,c}' file | column -t

NA  NA  0
NA  NA  0
NA  NA  0
NA  NA  0
+   1   1
+   1   1
+   1   1
+   1   1
NA  NA  1
NA  NA  1
NA  NA  1
+   2   2
+   2   2
-   3   1
-   3   1
-   3   1
+   4   2
+   4   2
-   5   1
-   5   1
-   5   1
-   6   0
-   6   0
-   6   0
NA  NA  0
-   7   -1

Or, a golfed version:

$ awk 'p!=$5{c+=$2!="NA"?$2"1":0} {print $2,p=$5,c}' file | column -t
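Since the question asks for the counter as an extra column at the end, here is a minimal sketch of the same idea that keeps every original column and appends the counter as $6. It assumes whitespace-separated input with no header line. The names step, prev and count are only illustrative, and the sample rows are a small subset of the question's data fed in through a here-document.

```shell
out=$(awk '
    # Map each sign in $2 to the amount the counter moves by.
    BEGIN { step["+"] = 1; step["-"] = -1; step["NA"] = 0 }

    # Update the counter only on the first row of each new run of $5.
    $5 != prev { prev = $5; count += step[$2] }

    # Print the whole original row followed by the current counter.
    { print $0, count }
' <<'EOF'
570 NA 0 0 NA
571 + 1 1 1
572 + 1 2 1
967 NA 0 396 NA
8793 + 1 -3599 2
3277 - -1 -146 3
3278 - -1 -147 3
EOF
)

printf '%s\n' "$out"
```

For these seven rows the appended column comes out as 0, 1, 1, 1, 2, 1, 1. Because `print $0, count` joins with the default output separator (a single space), piping through `column -t` as above, or setting `-v OFS='\t'`, may give a tidier layout.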

Upvotes: 1
