Numpty

Reputation: 1481

awk - if column = null

Using awk, how would you test whether a column is 'null' (empty)?

I'm sure there's a set character for this, I just can't find it.

For example,

I've got a string of awk like this:

awk '
$3==24{print "stuff"}
$3==23{print "stuff"}
'

I need to know how to account for blank columns using the same format, so that if $3 is blank, {print "stuff"} runs.

Thanks!

Upvotes: 11

Views: 49594

Answers (2)

Keith Thompson

Reputation: 263267

In default awk processing, there's no such thing as a "blank" column.

Fields are delimited by whitespace, i.e., by one or more whitespace characters (tabs and spaces, basically). So given this input:

this that the_other
foo       bar

for the first line $1, $2, and $3 are this, that, and the_other, respectively, but for the second line bar is $2, regardless of how many blanks there are between the first and second fields.
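A quick way to see this for yourself is to print NF (the number of fields) next to $2 for each line. This is only a sketch; the function name show_fields is made up for the illustration.

```shell
# show_fields is a hypothetical helper name, not part of awk.
# It prints the field count and the second field of every input line,
# using awk's default whitespace splitting.
show_fields() {
    awk '{ print NF ": " $2 }'
}

printf 'this that the_other\nfoo       bar\n' | show_fields
```

This prints "3: that" for the first line and "2: bar" for the second, so the run of blanks on the second line counts as a single separator and $3 is simply absent there.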

You can have empty fields if you specify a different field separator:

$ ( echo 'this:that:the_other' ; echo 'foo::bar' ) | awk -F: '{print $3}'
the_other
bar

Or, if you prefer to set the field separator in the script itself:

$ ( echo 'this:that:the_other' ; echo 'foo::bar' ) | \
    awk 'BEGIN { FS = ":" } {print $3}'
the_other
bar

You can also use a regular expression as the field separator:

$ ( echo 'this that the_other' ; echo 'foo  bar' ) | \
  awk 'BEGIN { FS = "[ ]" } {print $3}'
the_other
bar

(Some very old Awk implementations might not support regular expressions here.)

The regular expression "[ ]" doesn't get the same special treatment that the space character does.

References to the GNU Awk manual:

Default field splitting:

Fields are normally separated by whitespace sequences (spaces, TABs, and newlines), not by single spaces. Two spaces in a row do not delimit an empty field. The default value of the field separator FS is a string containing a single space, " ". If awk interpreted this value in the usual way, each space character would separate fields, so two spaces in a row would make an empty field between them. The reason this does not happen is that a single space as the value of FS is a special case -- it is taken to specify the default manner of delimiting fields.

If FS is any other single character, such as ",", then each occurrence of that character separates two fields. Two consecutive occurrences delimit an empty field. If the character occurs at the beginning or the end of the line, that too delimits an empty field. The space character is the only single character that does not follow these rules.

and Using Regular Expressions to Separate Fields.

But be careful with this; either you'll have to modify the file to use a different separator, or your parsing will be sensitive to the number of blanks between fields ("foo bar" with one blank will be distinct from "foo  bar" with two blanks).

Depending on your application, you might consider parsing lines by column number rather than by awk-recognized fields.
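As a rough sketch of the column-number approach, assuming the data is laid out in fixed-width columns of 10 characters each (that width, and the helper name third_column, are assumptions made up for this example), substr() can pull out the third column by position:

```shell
# third_column is a hypothetical helper; the 10-character column width
# is an assumption about the input layout, not something awk detects.
third_column() {
    awk '{
        col3 = substr($0, 21, 10)        # characters 21-30 hold column 3
        gsub(/^ +| +$/, "", col3)        # strip the padding blanks
        if (col3 == "") print "blank"
        else            print col3
    }'
}

# Three sample rows; the second has an empty middle column,
# the third has an empty third column.
printf '%-10s%-10s%-10s\n' this that the_other foo "" bar foo bar "" | third_column
```

This prints the_other, bar, and blank. Note that on the second row bar is still read as column 3 even though column 2 is empty, which default field splitting cannot do.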

Upvotes: 10

Gilles Quénot

Reputation: 185161

Try this:

awk '
    $3==24{print "stuff"}
    $3==23{print "stuff"}
    !$3{print "null"}
' file.txt

If $3 can legitimately be zero (which awk treats as false), use this test instead so that a zero is not reported as null:

!$3 && $3 != 0{print "null"}
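Another way to write the same check, offered only as a sketch, is to compare $3 against the empty string directly. The example below assumes a colon-separated file so that empty fields can actually occur; the helper name classify and the sample data are invented for the illustration.

```shell
# classify is a hypothetical helper. -F: is an assumption about the input;
# with the default separator a "blank" third field just means NF < 3.
classify() {
    awk -F: '
        $3 == 24 { print "stuff"; next }
        $3 == "" { print "null";  next }   # empty or missing third field
                 { print "other" }
    '
}

printf 'a:b:24\na:b:\na:b:0\na:b\n' | classify
```

This prints stuff, null, other, null. Comparing with "" forces a string comparison, so a field containing 0 is not mistaken for an empty one.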

Upvotes: 11
