Reputation: 17336
Is it always the case, after modifying a specific field in awk
, that information on the output field separator is lost? What happens if there are multiple field separators and I want them to be recovered?
For example, suppose I have a simple file example
that contains:
a:e:i:o:u
If I just run an awk
script, which takes account of the input field separator, that prints each line in my file, such as running
awk -F: '{print $0}' example
I will see the original line. If however I modify one of the fields directly, e.g. with
awk -F: '{$2=$2"!"; print $0}' example
I do not get back a modified version of the original line, rather I see the fields separated by the default whitespace separator, i.e:
a e! i o u
I can get back a modified version of the original by specifying OFS, e.g.:
awk -F: 'BEGIN {OFS=":"} {$2=$2"!"; print $0}' example
In the case, however, where there are multiple potential field separators but in the case of multiple separators is there a simple way of restoring the original separators?
For example, if example
had both :
and ;
as separators, I could use -F":|;"
to process the file but OFS would no be sufficient to restore the original separators in their relative positions.
More explicitly, if we switched to example2
containing
a:e;i:o;u
we could use
awk -F":|;" 'BEGIN {OFS=":"} {$2=$2"!"; print $0}' example2
(or -F"[:;]"
) to get
a:e!:i:o:u
but we've lost the distinction between : and ; which would have been maintained if we could recover
a:e!;i:o;u
Upvotes: 4
Views: 1007
Reputation: 203189
You need to use GNU awk for the 4th arg to split() which saves the separators, like RT does for RS:
$ awk -F'[:;]' '{split($0,f,FS,s); $2=$2"!"; r=s[0]; for (i=1;i<=NF;i++) r=r $i s[i]; $0=r} 1' file
a:e!;i:o;u
There is no automatically populated array of FS matching strings because of how expensive it'd be in time and memory to store the string that matches FS every time you split a record into fields. Instead the GNU awk folks provided a 4th arg to split() so you can do it yourself if/when you want it. That is the result of a long conversation a few years ago in the comp.lang.awk newsgroup between experienced awk users and gawk providers before all agreeing that this was the best approach.
See split()
at https://www.gnu.org/software/gawk/manual/gawk.html#String-Functions.
Upvotes: 5