Reputation: 6921
I am dealing with a file where fields are separated by a single space.
awk interprets the FS " " as "one or more whitespace characters", which misreads my file when one of the fields is empty.
I tried using "a space not followed by a space" (" (?! )") as FS, but awk does not support negative lookahead. Simple Google queries like "single space field separator awk" only sent me to the manual page explaining the special treatment of FS=" ". I must have missed the relevant manual page...
How can I use a single space as field separator with awk?
Upvotes: 19
Views: 78122
Reputation: 3649
To give a couple of helpful manpage references for this behaviour:
Default Field Splitting explains that " " is the default value, but carries a special meaning:
The default value of the field separator FS is a string containing a single space, " ". If awk interpreted this value in the usual way, each space character would separate fields, so two spaces in a row would make an empty field between them.
The reason this does not happen is that a single space as the value of FS is a special case—it is taken to specify the default manner of delimiting fields.
Regexp Field Splitting explains how to delimit on a single space:
For a less trivial example of a regular expression, try using single spaces to separate fields the way single commas are used. FS can be set to "[ ]" (left bracket, space, right bracket). This regular expression matches a single space and nothing else (see Regular Expressions).
(Added the emphasis and paragraphing.)
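To make the difference between the two settings concrete, here is a minimal sketch. The sample line and the variable names are made up for illustration; the line has three fields, the middle one empty (two consecutive spaces).

```shell
# Hypothetical sample line: "alpha", an empty field, "gamma".
line='alpha  gamma'

# Default FS (" "): runs of whitespace collapse, so the empty field is lost.
default_nf=$(printf '%s\n' "$line" | awk '{ print NF }')

# FS set to the regex "[ ]": every individual space separates fields.
single_nf=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "[ ]" } { print NF }')

echo "default FS: $default_nf fields, FS=[ ]: $single_nf fields"
```

With the default separator awk should report 2 fields, and with "[ ]" it should report 3, the empty middle field included.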
Upvotes: 0
Reputation: 67467
This should work:
$ echo 'a    b' | awk -F'[ ]' '{print NF}'
5
whereas this treats all contiguous whitespace as one separator:
$ echo 'a    b' | awk -F' ' '{print NF}'
2
Based on the comment, this needs special consideration: an empty string and whitespace as field values are very different things, so this data is probably not a good match for whitespace-separated content.
I would suggest preprocessing with cut and changing the delimiters, for example:
$ echo 'a    b' | cut -d' ' -f1,3,5 --output-delimiter=,
a,,b
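A comparable delimiter change can probably also be done in awk itself, without cut. The following is only a sketch: the sample line (a and b separated by four spaces) and the variable names are assumptions for illustration.

```shell
# Hypothetical sample: 'a' and 'b' separated by four single spaces,
# which means three empty fields in between.
line='a    b'

# FS="[ ]" splits on each individual space. Assigning $1 = $1 forces awk
# to rebuild the record with OFS, so every separator becomes a comma.
converted=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "[ ]"; OFS = "," } { $1 = $1; print }')

echo "$converted"
```

Unlike the cut example, which selects fields 1, 3 and 5, this keeps all five fields, so the result should be a,,,,b.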
Upvotes: 33