printing lines based on pattern matching in multiple fields using awk

Question

Suppose I have an HTML input line like

<li>this is a html input line</li>

I want to filter all such lines from a file, i.e. those that begin with <li> and end with </li>. My idea was to search for the pattern <li> in the first field and the pattern </li> in the last field using the awk command below:

awk '$1 ~ /\<li\>/ ; $NF ~ /\<\/li\>/ {print $0}'

but it looks like there is no provision to match two fields at a time, or I am making a syntax mistake. Could you please help me here?

PS: I am working on a Solaris SunOS machine.

Ed Morton · Accepted Answer

There's a lot going wrong in your script on Solaris:

awk '$1 ~ /\<li\>/ ; $NF ~ /\<\/li\>/ {print $0}'

The default awk on Solaris (and so the one we have to assume you are using, since you didn't state otherwise) is old, broken awk which must never be used. On Solaris use /usr/xpg4/bin/awk. There's also nawk, but it has fewer POSIX features (e.g. no support for character classes).
\<...\> are gawk-specific word boundaries. There is no awk on Solaris that would recognize those. If you were just trying to match literal < and > characters then there's no need to escape them, as they are not regexp metacharacters.
If you want to test for condition 1 and condition 2 you put && between them, not ; which is just the statement terminator in lieu of a newline.
The default action given a true condition is {print $0} so you don't need to explicitly write that code.
/ is the awk regexp delimiter so you do need to escape that in mid-regexp.
The default field separator is white space, so in your posted sample input $1 and $NF will be <li>this and line</li>
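The field-splitting and ; versus && points above can be checked with a minimal sketch. It assumes the sample line was wrapped in <li>...</li> tags; the variable names and the simplified anchored regexps are illustrative and not taken from the question. On Solaris, substitute /usr/xpg4/bin/awk for awk.

```shell
# Hypothetical sample line; the <li>...</li> wrapping is an assumption.
sample='<li>this is a html input line</li>'

# Default field splitting is on whitespace, so the tags stay glued to the
# first and last words.
fields=$(printf '%s\n' "$sample" | awk '{ print $1; print $NF }')
printf '%s\n' "$fields"

# ';' separates two independent rules: each test that succeeds prints the
# line on its own, so a line passing both tests is printed twice.
with_semicolon=$(printf '%s\n' "$sample" | awk '$1 ~ /^<li>/ ; $NF ~ /<\/li>$/ { print $0 }')
printf '%s\n' "$with_semicolon"

# '&&' combines the tests into one condition, so the line is printed once.
with_and=$(printf '%s\n' "$sample" | awk '$1 ~ /^<li>/ && $NF ~ /<\/li>$/')
printf '%s\n' "$with_and"
```

The first command prints <li>this and line</li> on separate lines, which is why a test on $1 or $NF has to allow for the text attached to the tag.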

So if you DID for some reason want to compare multiple fields you could do:

awk '($1 ~ /^<li>.*/) && ($NF ~ /.*<\/li>$/)'

but this is probably what you really want:

awk '/^<li>.*<\/li>/'

in which case you could just use grep:

grep '^<li>.*</li>'
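As a rough check that the field-based test, the whole-line awk regexp, and grep select the same lines, here is a small sketch. The sample file contents and variable names are hypothetical, and the tag is assumed to be <li>. On Solaris, use /usr/xpg4/bin/awk in place of awk.

```shell
# Hypothetical sample file with matching and non-matching lines.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<li>this is a html input line</li>
<p>not a list item</p>
<li>unterminated item
<li>single</li>
EOF

# Field-based test: first field starts with <li>, last field ends with </li>.
by_fields=$(awk '($1 ~ /^<li>.*/) && ($NF ~ /.*<\/li>$/)' "$tmp")

# Whole-line regexp in awk; '/' must be escaped inside /.../.
by_line=$(awk '/^<li>.*<\/li>/' "$tmp")

# The same idea with grep, where '/' needs no escaping.
by_grep=$(grep '^<li>.*</li>' "$tmp")

printf '%s\n' "$by_grep"
rm -f "$tmp"
```

On this sample all three select the first and last lines. A line consisting of a single word, such as <li>single</li>, still passes the field-based test because $1 and $NF are then the same field.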
