oguz ismail

Reputation: 50750

caret (^) in FS (gawk)

Update

This was a bug, and a fix is now available in the gawk git repository.


I can't understand how a circumflex in FS is interpreted. For example, here is my file:

$ cat file
foo bar
baz quz

I wrote this awk script:

BEGIN{FS="^.";OFS="|"}{$1=$1}1

and was expecting this output:

|oo bar
|az quz

but with gawk I got this:

$ gawk 'BEGIN{FS="^.";OFS="|"}{$1=$1}1' file
||o bar
||z quz

And it gets stranger with more dots:

$ gawk 'BEGIN{FS="^..";OFS="|"}{$1=$1}1' file
||bar
||quz
$ gawk 'BEGIN{FS="^...";OFS="|"}{$1=$1}1' file
||r
||z
$ gawk 'BEGIN{FS="^....";OFS="|"}{$1=$1}1' file
|bar
|quz

I couldn't find an explanation in either the POSIX awk specification or the gawk manual. Can you please help me understand what's going on? What am I missing here?

Upvotes: 6

Views: 405

Answers (1)

kvantour

Reputation: 26471

This is clearly a bug, and probably a memory leak. If you print NF before printing the record, the behaviour is as expected:

$ gawk 'BEGIN{FS="^.";OFS="|"; $0="foo"; $1=$1; print}'
||oo
$ gawk 'BEGIN{FS="^.";OFS="|"; $0="foo"; $1=$1; print NF; print}'
2
|oo
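
On an affected gawk version, and assuming the goal is only to replace the first character of each record, one possible workaround is to avoid an anchored FS altogether and use sub() instead. This is only an illustrative sketch, not something taken from the bug report; the sample input is recreated inline with printf, and plain awk is used because nothing here is gawk-specific:

```shell
# Recreate the sample input from the question and replace the first
# character of every record with "|" without relying on FS at all.
out=$(printf 'foo bar\nbaz quz\n' | awk '{ sub(/^./, "|") } 1')
printf '%s\n' "$out"
# prints:
# |oo bar
# |az quz
```

This gives the output the question was expecting, but it does not rebuild the record from fields, so it is not a drop-in replacement when the fields themselves are needed later.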

Upvotes: 3
