shams

Reputation: 162

Replace dot in entire file with NA without altering data structure

I am trying to replace every standalone dot (.) with NA, but my current code also replaces the dot used as a decimal point, e.g. in 0.01. I am using:

cat input.tsv |  sed -r 's/\./NA/g' > replaced.tsv

I have copied one of the input lines here.

1   69091   A   C   M   L   .   1   69091   1   58954   OR4F5   +   ATG 1   0   a   ./. ./. ENSG00000186092 ENST00000335137 ENSP00000334393 1   0.13    0.26702 T   Q8NH21  OR4F5_HUMAN 1   0.0 0.02634 B   0.0 0.01257 B   0.589091    0.05577 N   1.339740    .   .   .   .   .   OR4F5_HUMAN M1L .   .   .   6.76    0.00529 T   -0.38   0.13435 N   NM_001005484.1  M1L 0.12    0.13350 -0.9577 0.39629 T   0.0009  0.00318 T   8   0.00708247797993    0.18931 T   0.109   0.31349 0.823   0.93536 Q8NH21  M1L Loss of sheet (P = 0.0817); Loss of disorder (P = 0.091); Loss of catalytic residue at V2 (P = 0.3992); Loss of solvent accessibility (P = 0.5485); Gain of helix (P = 0.5668)  -1.436194   0.01840 0.003   0.44378301154325944 0.03370 0.02063 0.06083 N   AEFI    c   -1.39413139690747   0.1192561   -1.53570515685522   0.09493324  0.02038 2.2163971633957E-5  0.03550 0.487112    0.13308 0   0.573888    0.26071 0   0.573888    0.22998 0   0.564101    0.26208 0   2.31    -4.63   0.03101 -0.055000   0.11668 -1.983000   0.00506 0.000000    0.06329 0.000000    0.01567 0.2547:0.0:0.5282:0.2171    3.5592  0.07372 .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .   .

Desired output

1    69091   A   C   M   L   NA   1   69091   1   58954   OR4F5   +   ATG 1   0   a ./. ./. ENSG00000186092 ENST00000335137 ENSP00000334393 1   0.13    0.26702 T   Q8NH21  OR4F5_HUMAN 1   0.0 0.02634 B   0.0 0.01257 B   0.589091    0.05577 N   1.339740    NA  NA  NA  NA  NA  OR4F5_HUMAN M1L NA  NA  NA  6.76    0.00529 T   -0.38   0.13435 N   NM_001005484.1  M1L 0.12    0.13350 -0.9577 0.39629 T   0.0009  0.00318 T   8   0.00708247797993    0.18931 T   0.109   0.31349 0.823   0.93536 Q8NH21  M1L Loss of sheet (P = 0.0817); Loss of disorder (P = 0.091); Loss of catalytic residue at V2 (P = 0.3992); Loss of solvent accessibility (P = 0.5485); Gain of helix (P = 0.5668)  -1.436194   0.01840 0.003   0.44378301154325944 0.03370 0.02063 0.06083 N   AEFI    c   -1.39413139690747   0.1192561   -1.53570515685522   0.09493324  0.02038 2.2163971633957E-5  0.03550 0.487112    0.13308 0   0.573888    0.26071 0   0.573888    0.22998 0   0.564101    0.26208 0   2.31    -4.63   0.03101 -0.055000   0.11668 -1.983000   0.00506 0.000000    0.06329 0.000000    0.01567 0.2547:0.0:0.5282:0.2171    3.5592  0.07372 NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA

Upvotes: 1

Views: 48

Answers (1)

karakfa

Reputation: 67537

something like this?

sed -E 's/(^| )\.( |$)/ NA/g' file

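This looks for a dot with a space on each side, also allowing the start or end of the line in place of a space. The match consumes both neighbouring spaces and the replacement puts back only the left one, so one space gets eaten; I chose the right one.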
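If the file is really tab-separated (the .tsv name suggests so, but that is an assumption), or if two dots ever sit next to each other with only a single separator between them, a field-by-field comparison may be more robust than a regex on the whole line. A minimal sketch with awk; the function name replace_dots is made up for illustration:

```shell
# Replace every field that is exactly "." with "NA".
# Assumption: fields are separated by single tabs, as in a .tsv file.
replace_dots() {
  awk 'BEGIN { FS = OFS = "\t" }
       {
         # Only whole fields equal to "." change; 0.01 and ./. are left alone.
         for (i = 1; i <= NF; i++) if ($i == ".") $i = "NA"
         print
       }' "$@"
}
```

Usage would be along the lines of `replace_dots input.tsv > replaced.tsv`. Because each field is compared as a whole, decimal points and values such as ./. are not touched, and the tab layout is kept.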

Upvotes: 3
