Metahuman

Reputation: 202

Regular expression for multiple lines with negation

I have a file made up of multi-line records with the following structure:

Text line1: xxxx
Text line2: x
OS: "MacOS"
NotOS: "Linux"
Text line3:
ID: 12345

OR

Text line1: xxxx
Text line2: x
OS: "MacOS|Linux|Red Hat|Windows|Ubuntu|CentOS|Fedora"
NotOS: "HP-UX"
Text line3:
ID: 12345

I am looking to get all the IDs that have "CentOS" in the OS field but no "Linux" in the NotOS field. I used the following, which works to some extent, but not completely:

grep -n 'OS\:.*[c|C]ent[o|O][s|S][\S+\n\r\s]+VulnID:' filename |\
grep -v '[L|l][I|i][N|n][U|u][X|x]|Amazon|Amazon Linux'

It tends to return this:

15386:OS:             "Linux|AIX|Solaris|VMware|FreeBSD|IRIX|NetBSD|OpenBSD|BSD|Fedora|Ubuntu|Red.*Hat|CentOS|OpenSuSE|SuSE|MacOS|Oracle Enterprise Linux|HP-UX"
15404:OS: "SuSE|Linux|AIX|BSD|CentOS|Solaris|HP-UX"
15527:NotOS: "(Unknown|CentOS|Red Hat|Ubuntu|Oracle Enterprise Linux|Debian|Fedora|AIX|SuSE|Solaris)"
15537:NotOS: "(Unknown|CentOS|Red Hat|Ubuntu|Oracle Enterprise Linux|Debian|Fedora|AIX|SuSE|Solaris)"
15705:OS:             "Solaris|Linux|CentOS"

where the leading numbers are line numbers, but it won't return the lines containing "ID:".

How do I get this done?

Upvotes: 0

Views: 46

Answers (1)

Ed Morton

Reputation: 203995

Keep it simple, just use awk:

$ awk -F': ' '{m[$1]=tolower($2)} $1=="ID" && m["OS"]~/centos/ && m["NotOS"]!~/linux/' file
ID: 12345

or if you just want the number:

$ awk -F': ' '{m[$1]=tolower($2)} $1=="ID" && m["OS"]~/centos/ && m["NotOS"]!~/linux/{print $2}' file
12345

The above was run on this file:

$ cat file
Text line1: xxxx
Text line2: x
OS: "MacOS"
NotOS: "Linux"
Text line3:
ID: 12345
Text line1: xxxx
Text line2: x
OS: "MacOS|Linux|Red Hat|Windows|Ubuntu|CentOS|Fedora"
NotOS: "HP-UX"
Text line3:
ID: 12345
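
One thing to be aware of with the one-liners above: the array m is never emptied, so if a record in the real file had no OS or NotOS line, the value saved from the previous record would still be tested when its ID line is reached. The following is only a sketch of one way to guard against that, assuming every record ends with its ID line as in the sample. The function name centos_ids is hypothetical and exists just to make the snippet reusable:

```shell
# Hypothetical helper: print the ID of every record whose OS field mentions
# "centos" and whose NotOS field does not mention "linux" (case-insensitive).
# Assumes each record ends with its "ID: ..." line.
centos_ids() {
    awk -F': ' '
        # Save every "name: value" line of the current record, lowercased.
        { m[$1] = tolower($2) }

        # The ID line closes the record: test the saved fields, then reset.
        $1 == "ID" {
            if (m["OS"] ~ /centos/ && m["NotOS"] !~ /linux/)
                print $2
            split("", m)    # portable way to empty the array
        }
    ' "$1"
}
```

Calling `centos_ids file` on the sample file above prints the same number as the second one-liner. The only behavioural difference is that a record missing its OS line can no longer match on a stale value.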

Upvotes: 2
