Reputation: 43
While going through a piece of code I saw the below command:
grep "r" temp | awk '{FS=","; $0=$0} { print $1,$3}'
temp file contain the pattern like:
1. r,1,5
2. r,4,5
3. ...
I could not understand what does the statement $0=$0
mean in awk command.
Can anyone explain what does it mean?
Upvotes: 4
Views: 5467
Reputation: 203712
When you do $1=$1
(or any other assignment to a field) it causes record recompilation where $0 is rebuilt with every FS replaced with OFS but it does not change NF (unless there was no $1
previously and then NF would change from 0 to 1) or reevaluate the record in any other way.
When you do $0=$0
it causes field splitting where NF, $1, $2, etc. are repopulated based on the current value of FS but it does not change the FSs to OFSs or modify $0 in any other way.
Look:
$ echo 'a-b-c' |
awk -F'-+' -v OFS='-' '
function p() { printf "%d) %d: $0=%s, $2=%s\n", ++c,NF,$0,$2 }
{ p(); $2=""; p(); $1=$1; p(); $0=$0; p(); $1=$1; p() }
'
1) 3: $0=a-b-c, $2=b
2) 3: $0=a--c, $2=
3) 3: $0=a--c, $2=
4) 2: $0=a--c, $2=c
5) 2: $0=a-c, $2=c
Note in the above that even though setting $2 to null resulted in 2 consecutive -
s and the FS of -+
means that 2 -
s are a single separator, they are not treated as such until $0=$0
causes the record to be re-split into fields as shown in output step 4.
The code you have:
awk '{FS=","; $0=$0}'
is using $0=$0
as a cludge to work around the fact that it's not setting FS until AFTER the first record has been read and split into fields:
$ printf 'a,b\nc,d\n' | awk '{print NF, $1}'
1 a,b
1 c,d
$ printf 'a,b\nc,d\n' | awk '{FS=","; print NF, $1}'
1 a,b
2 c
$ printf 'a,b\nc,d\n' | awk '{FS=","; $0=$0; print NF, $1}'
2 a
2 c
The correct solution, of course, is instead to simply set FS BEFORE The first record is read:
$ printf 'a,b\nc,d\n' | awk -F, '{print NF, $1}'
2 a
2 c
To be clear - assigning any value to $0 causes field splitting, it does not cause record recompilation while assigning any value to any field ($1, etc.) causes record recompilation but not field splitting:
$ echo 'a-b-c' | awk -F'-+' -v OFS='#' '{$2=$2}1'
a#b#c
$ echo 'a-b-c' | awk -F'-+' -v OFS='#' '{$0=$0}1'
a-b-c
Upvotes: 7
Reputation: 2845
>>> echo 'a-b-c' | awk -F'-+' -v OFS='#' '{$2=$2}1'
>>> a#b#c
This can be slightly simplified to
mawk 'BEGIN { FS="[-]+"; OFS = "#"; } ($2=$2)'
Rationale being the boolean test that comes afterwards will evaluate to true upon the assignment, so that itself is sufficient to re-gen the fields in OFS and print it.
Upvotes: 0
Reputation: 10039
$0 = $0
is used most often to rebuild the field separation evaluation of a modified entry. Ex: adding a field will change $NF after $0 = $0 where it stay as original (at entry of the line).
in this case, it change every line the field separator by (see @EdMorton comment below for strike) reparse the line with current FS info where a ,
andawk -F ',' { print $1 "," $3 }'
is a lot better coding for the same idea, taking the field separator at begining for all lines (in this case, could be different if separator is modified during process depernding by example of previous line content)
ex:
echo "foo;bar" | awk '{print NF}{FS=";"; print NF}{$0=$0;print NF}'
1
1
2
based on @EdMorton comment and related post (What is the meaning of $0 = $0 in Awk)
echo "a-b-c" |\
awk ' BEGIN{ FS="-+"; OFS="-"}
function p(Ref) { printf "%12s) NF=%d $0=%s, $2=%s\n", Ref,NF,$0,$2 }
{
p("Org")
$2="-"; p( "S2=-")
$1=$1 ; p( "$1=$1")
$2=$2 ; p( "$2=$2")
$0=$0 ; p( "$0=$0")
$2=$2 ; p( "$2=$2")
$3=$3 ; p( "$3=$3")
$1=$1 ; p( "$1=$1")
} '
Org) NF=3 $0=a-b-c, $2=b
S2=-) NF=3 $0=a---c, $2=-
$1=$1) NF=3 $0=a---c, $2=-
$2=$2) NF=3 $0=a---c, $2=-
$0=$0) NF=2 $0=a---c, $2=c
$2=$2) NF=2 $0=a-c, $2=c
$3=$3) NF=3 $0=a-c-, $2=c
$1=$1) NF=3 $0=a-c-, $2=c
Upvotes: 3
Reputation: 16997
$0=$0
is for re-evaluate the fields
For example
akshay@db-3325:~$ cat <<EOF | awk '/:/{FS=":"}/\|/{FS="|"}{print $2}'
1:2
2|3
EOF
# Same with $0=$0, it will force awk to have the $0 reevaluated
akshay@db-3325:~$ cat <<EOF | awk '/:/{FS=":"}/\|/{FS="|"}{$0=$0;print $2}'
1:2
2|3
EOF
2
3
# NF - gives you the total number of fields in a record
akshay@db-3325:~$ cat <<EOF | awk '/:/{FS=":"}/\|/{FS="|"}{print NF}'
1:2
2|3
EOF
1
1
# When we Force to re-evaluate the fields, we get correct 2 fields
akshay@db-3325:~$ cat <<EOF | awk '/:/{FS=":"}/\|/{FS="|"}{$0=$0; print NF}'
1:2
2|3
EOF
2
2
Upvotes: 1