M. A.

Reputation: 1

How to use awk and grep combination

I have a file with 10 columns and many lines. I want to add a fixed correction to the 10th column of every line that contains the pattern 'G01'.

For example, in the file below

AS G17  2014  3 31  0  2  0.000000  1   -0.809159910000E-04                     
AS G12  2014  3 31  0  2  0.000000  1    0.195515363000E-03                     
AS G15  2014  3 31  0  2  0.000000  1   -0.171167837000E-03                     
AS G29  2014  3 31  0  2  0.000000  1    0.521982134000E-03                     
AS G07  2014  3 31  0  2  0.000000  1    0.329889640000E-03                     
AS G05  2014  3 31  0  2  0.000000  1   -0.381588767000E-03                     
AS G25  2014  3 31  0  2  0.000000  1    0.203352860000E-04                     
AS G01  2014  3 31  0  2  0.000000  1    0.650180300000E-05                     
AS G24  2014  3 31  0  2  0.000000  1   -0.258444780000E-04                     
AS G27  2014  3 31  0  2  0.000000  1   -0.203691700000E-04   

the 10th column of the line with G01 should be corrected. I've used 'awk' inside a 'while' loop to do that, but it takes a very long time on massive files. I would appreciate it if anybody could suggest a more efficient way.

Upvotes: 0

Views: 1085

Answers (2)

Bertrand Martel

Reputation: 45513

You can use the following (note that assigning to $10 makes awk rebuild the line with single spaces between the fields):

awk '$2 == "G01" {$10="value"}1' file.txt
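
Since the question asks for a fixed correction to be added rather than a constant to be written, here is a minimal sketch of that variant. The correction `corr=1e-06`, the `%.12E` output format and the file name `sample.txt` are assumptions chosen for illustration. Note that `%.12E` writes the number as `d.ddd…E±dd`, not in the `0.ddd…E±dd` layout of the input.

```shell
# Two sample lines taken from the question (sample.txt is a hypothetical name).
printf '%s\n' \
  'AS G25  2014  3 31  0  2  0.000000  1    0.203352860000E-04' \
  'AS G01  2014  3 31  0  2  0.000000  1    0.650180300000E-05' > sample.txt

# Add corr to the 10th field of G01 lines only; the trailing 1 prints every line.
# Assigning to $10 rebuilds the changed line with single spaces between fields.
out=$(awk -v corr=1e-06 '$2 == "G01" { $10 = sprintf("%.12E", $10 + corr) } 1' sample.txt)
printf '%s\n' "$out"
```

The G25 line comes out untouched, while the G01 line ends in `7.501803000000E-06`. The whole file is read once by a single awk process, which is usually much faster than starting awk inside a shell `while` loop for each line.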

To preserve the original whitespace you can use the solution from this post (it needs GNU awk, because the fourth argument of split() is a gawk extension):

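awk '$2 == "G01" {
    # gawk only: b[] receives the separators found between the fields
    n = split($0, a, " ", b)
    a[10] = "value"
    line = b[0]                 # leading whitespace, if any
    for (i = 1; i <= n; i++) {
        line = line a[i] b[i]   # field followed by its original separator
    }
    print line
    next                        # skip the default print below
}
{ print }' file.txt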
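
If gawk is not available, a similar result can be sketched in POSIX awk, under the assumption (true for the sample data) that the 10th field is the last one on the line. The replacement text `value` and the file name `sample2.txt` are placeholders.

```shell
# Sample input; the first line deliberately keeps some trailing blanks.
printf '%s\n' \
  'AS G01  2014  3 31  0  2  0.000000  1    0.650180300000E-05   ' \
  'AS G24  2014  3 31  0  2  0.000000  1   -0.258444780000E-04' > sample2.txt

out2=$(awk '$2 == "G01" && NF == 10 {
    # Match the last field plus any trailing blanks; RSTART is where the field begins.
    match($0, /[^ \t]+[ \t]*$/)
    head = substr($0, 1, RSTART - 1)          # everything before the 10th field
    tail = substr($0, RSTART + length($NF))   # trailing blanks after it
    print head "value" tail
    next
}
{ print }' sample2.txt)
printf '%s\n' "$out2"
```

Because $10 is never assigned, awk does not rebuild the record, so the spacing before and after the replaced field stays exactly as it was.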

Upvotes: 2

Akshay Hegde

Reputation: 16997

the 10th column of the line with G01 should be corrected

The syntax is as follows. It searches the current record/line/row for the regex given inside /../, regardless of which field the match is found in

Either

$ awk '/regex/{ $10 = "somevalue"; print }' infile

OR

The 1 at the end is an always-true pattern, so the default action, print $0, runs for every record/line/row; unlike the first form, this also prints the lines that do not match

$ awk '/regex/{ $10 = "somevalue" }1' infile

OR

$0 means current record/line/row

$ awk '$0 ~ /regex/{ $10 = "somevalue"}1' infile

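So in the current context it will be any of the following (the forms ending in print output only the matching lines, the forms ending in 1 output every line)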

$ awk '/G01/{$10 = "somevalue" ; print }' infile

$ awk '/G01/{$10 = "somevalue" }1' infile

$ awk '$0 ~ /G01/{$10 = "somevalue"; print }' infile

$ awk '$0 ~ /G01/{$10 = "somevalue" }1' infile

If you would like to restrict your search to a specific field/column of the record/line/row, then

$2 means the 2nd field/column and $10 the 10th

$ awk '$2 == "G01" {$10 = "somevalue"; print }' infile

$ awk '$2 == "G01" {$10 = "somevalue" }1' infile
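
To illustrate why the field comparison is the safer choice, here is a small sketch. The satellite ID `G011` and the file name `sample3.txt` are made up for the demonstration and do not come from the question.

```shell
# One real sample line and one hypothetical line whose ID merely contains G01.
printf '%s\n' \
  'AS G01  2014  3 31  0  2  0.000000  1    0.650180300000E-05' \
  'AS G011 2014  3 31  0  2  0.000000  1    0.100000000000E-05' > sample3.txt

# /G01/ is an unanchored regex, so it also matches the hypothetical ID G011.
regex_hits=$(awk '/G01/ { n++ } END { print n + 0 }' sample3.txt)
# $2 == "G01" is an exact string comparison on the 2nd field only.
exact_hits=$(awk '$2 == "G01" { n++ } END { print n + 0 }' sample3.txt)
echo "regex: $regex_hits, exact: $exact_hits"
```

The regex form counts both lines and the exact comparison counts one. The regex would also fire if G01 happened to appear in any other column.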

If you would like to pass a word to awk, either literally or from a shell variable, then

$ awk -v search="G01" -v replace="foo" '$2 == search {$10 = replace }1' infile

and the same using shell variables

$ search_value="G01"
$ new_value="foo"
$ awk -v search="$search_value" -v replace="$new_value" '$2 == search {$10 = replace }1' infile

From the gawk man page

-v var=val

   --assign var=val

    Assign  the  value  val to the variable var, before execution of
    the program begins.  Such variable values are available  to  the
    BEGIN block of an AWK program.
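
One caveat: awk processes backslash escape sequences in a -v value, so a replacement that contains backslashes can arrive altered. A possible alternative, sketched below, is to pass the text through the environment and read it from the ENVIRON array, which is taken literally. The variable names `SEARCH` and `REPLACE`, the replacement text and the file name `sample4.txt` are illustrative only.

```shell
printf '%s\n' 'AS G01  2014  3 31  0  2  0.000000  1    0.650180300000E-05' > sample4.txt

# The environment assignments apply to this one awk invocation only.
# ENVIRON values are not escape-processed, so the backslash in REPLACE survives.
out4=$(SEARCH='G01' REPLACE='a\tb' awk '$2 == ENVIRON["SEARCH"] { $10 = ENVIRON["REPLACE"] } 1' sample4.txt)
printf '%s\n' "$out4"
```

With `-v replace='a\tb'` the 10th field would instead contain a real tab character between `a` and `b`.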

For additional syntax instructions:

"sed & awk" by Dale Dougherty and Arnold Robbins (O'Reilly)

"UNIX Text Processing," by Dale Dougherty and Tim O'Reilly (Hayden Books)

"GAWK: Effective awk Programming," by Arnold D. Robbins (O'Reilly)

http://www.gnu.org/software/gawk/manual/

Upvotes: 1
