Reputation: 21
I'm processing a manpage in nroff format with awk to extract the options to each command. I figured out that the options start with \fB, followed by the actual option, and maybe \fP and option arguments and so on.
Example:
\fB\-\-author\fR
I started writing an awk script, specifying FS = "\fB", but it didn't work. I tried escaping the \ by switching to FS = "\\fB", but that didn't work either. What am I doing wrong?
This is my script:
BEGIN {
    FS = "\\f."  # "\\\\f." didn't work either
}
{
    print $2
}
This is the input:
\fB-o\fP
I want $2 to be -o, but it just won't work.
Upvotes: 2
Views: 442
Reputation: 7667
I think I remember running into this once.
The real problem was that some versions of awk insist on FS being a single character.
The way around it, as I recall, was to manually pull the file into GNU Emacs, edit the multicharacter FS down to one character that wasn't used anywhere else in the file, awk that with the appropriate FS, then manually repair it afterwards.
You MIGHT be able to automate this with a couple of sed scripts, one to do the initial recoding, and one to repair it, with the awk step in the middle.
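The recode-then-split idea can be sketched as a small pipeline. This is only an illustration: it assumes that `@` never occurs in the input, and both the placeholder character and the function name `extract_option` are hypothetical choices, not something from the original posts. Because only the extracted field is printed, no repair step is needed in this case.

```shell
# Recode every font escape (\fB, \fP, \fR, ...) to one placeholder character,
# then let awk split on that single character.
# Assumption: "@" does not appear anywhere in the input.
extract_option() {
    sed 's/\\f./@/g' | awk -F'@' '{ print $2 }'
}

# The sample line from the question becomes "@-o@", so $2 is "-o".
printf '%s\n' '\fB-o\fP' | extract_option
```

If the rest of the line has to survive, a second sed call after awk could map `@` back to a font escape, but the information about which escape it was (B, P, or R) is lost in this simple form.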
Upvotes: 0
Reputation: 328556
The field separator FS is meant for CSV-like data. In your case, use a pattern to find the lines that contain options, and then remove the parts that you don't want:
/\\fB/ { ... process option ...}
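One way the "process option" part might be filled in is sketched here. It is a guess at what such an action could look like, not the answerer's code: the function name `list_options` is hypothetical, and it assumes every bold run of interest starts with `\fB` and ends at the next font escape. In a real manpage, bold is also used for text that is not an option, so the output would still need filtering.

```shell
# Print the text of every bold run (\fB ... \fX) found on lines containing \fB.
list_options() {
    awk '/\\fB/ {
        rest = $0
        while ((start = index(rest, "\\fB")) > 0) {
            rest = substr(rest, start + 3)   # drop everything through \fB
            stop = match(rest, /\\f./)       # next font escape ends the bold run
            opt = stop ? substr(rest, 1, stop - 1) : rest
            gsub(/\\-/, "-", opt)            # nroff writes a hyphen as \-
            print opt
            if (!stop) break
            rest = substr(rest, stop)        # keep scanning the same line
        }
    }'
}

printf '%s\n' '\fB\-\-author\fR' | list_options
```

Here `index` does a plain string search, so only the awk string escaping applies to `"\\fB"`, while the regex literals need `\\` to match one backslash.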
Upvotes: 0
Reputation: 258138
It looks like you can accomplish this with 4 backslashes:
$ echo "1\z2\z3" | awk 'BEGIN { FS = "\\\\z" } ; {print $3 $1}'
31
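The script is in single quotes, so bash passes all 4 backslashes to awk unchanged. awk then unescapes twice: once when it reads the string literal, which leaves 2 backslashes, and again when it interprets FS as a regular expression, where those 2 backslashes match a single literal backslash.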
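Applied to the separator from the question, the same four-backslash form would look roughly like this. It is a minimal sketch, assuming an awk that treats a multi-character FS as a regular expression (gawk and mawk do) and a script passed in single quotes; the function name `second_field` is hypothetical.

```shell
# FS is the regex: one literal backslash, "f", then any single character,
# so \fB, \fP and \fR all act as separators.
second_field() {
    awk 'BEGIN { FS = "\\\\f." } { print $2 }'
}

# The line starts with a separator, so $1 is empty and $2 is "-o".
printf '%s\n' '\fB-o\fP' | second_field
```

If the awk program is passed in double quotes instead, the shell halves each pair of backslashes before awk sees them, so the count has to be doubled again.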
Upvotes: 2