Reputation: 93
I am an AWK novice, and this is by far my most complex AWK attempt to date. I have two files: one with scan data (FILE1.csv) and another with scan dates (FILE2.csv). I need to extract the dates from FILE2.csv and then, for each target, pick the correct date based on which dates are present for that target, and use it to filter FILE1.csv. My current script produces no output. Any help is greatly appreciated!
FILE1.csv
Name,Plugin,Plugin Name,First Discovered,Last Observed,Severity
server1.domain,57608,SMB Signing not required,9/19/2020 20:55,12/3/2022 20:39,Medium
server1.domain,71966,Oracle Java SE Multiple Vulnerabilities (January 2014 CPU),4/22/2021 3:08,12/1/2022 3:14,Critical
server1.domain,94138,Oracle Java SE Multiple Vulnerabilities (October 2016 CPU),4/22/2021 3:08,12/8/2022 3:14,Critical
server2.domain,156032,Apache Log4j Unsupported Version Detection,12/25/2021 3:07,12/8/2022 3:07,Critical
server2.domain,156032,Apache Log4j Unsupported Version Detection,8/31/2022 11:48,11/30/2022 10:16,Critical
server2.domain,156103,Apache Log4j 1.2 JMSAppender Remote Code Execution (CVE-2021-4104),12/25/2021 3:07,12/6/2022 3:07,High
server3.domain,164078,Splunk Enterprise and Universal Forwarder < 9.0 Improper Certificate Validation,10/31/2022 3:13,11/30/2022 10:16,High
server3.domain,166960,Tenable Nessus Agent 10.x < 10.2.1 Multiple Vulnerabilities (TNS-2022-22),11/7/2022 3:14,12/3/2022 3:14,High
server3.domain,168362,VMware Tools 10.x / 11.x / 12.x < 12.1.5 DoS (VMSA-2022-0029),12/5/2022 3:14,12/8/2022 3:14,Low
FILE2.csv
Name,LAST_VULN_AGENT_SCAN,LAST_VULN_NONCRED_SCAN,LAST_VULN_CRED_SCAN
server1.domain,12/8/2022 3:14,12/3/2022 20:39,
server2.domain,,12/8/2022 3:07,
server3.domain,,12/3/2022 3:14,12/8/2022 3:14
DESIRED OUTPUT
Name,Plugin,Plugin Name,First Discovered,Last Observed,Severity
server1.domain,94138,Oracle Java SE Multiple Vulnerabilities (October 2016 CPU),4/22/2021 3:08,12/8/2022 3:14,Critical
server2.domain,156032,Apache Log4j Unsupported Version Detection,12/25/2021 3:07,12/8/2022 3:07,Critical
server3.domain,168362,VMware Tools 10.x / 11.x / 12.x < 12.1.5 DoS (VMSA-2022-0029),12/5/2022 3:14,12/8/2022 3:14,Low
CURRENT SCRIPT
awk -F',' 'NR==FNR{a[$2,$3,$4];next}
if (a[$2] && a[$4]) {
if(a[$2] > a[$4]) {
if ($5 == a[$2])
print $0;
}
else {
if ($5 == a[$4])
print $0;
}
}
else if (a[$2]) {
if ($5 == a[$2])
print $0;
}
else if (a[$4]) {
if ($5 == a[$4])
print $0;
}
else {
if ($5 == a[$3])
print $0;
}' FILE1.csv FILE2.csv
Edit 1: Here is my if/then logic to help understand what I'm doing
if [ ! -z ${LAST_VULN_AGENT_SCAN} ] && [ ! -z ${LAST_VULN_CRED_SCAN} ]; then
# AGENT SCAN DATE AND CRED SCAN DATE ARE NOT NULL
if [ "${AGENT_EPOCH}" -gt "${CRED_EPOCH}" ]; then
# AGENT SCAN DATE IS MORE RECENT THAN CRED SCAN DATE
# USE AGENT SCAN DATE TO FILTER COLUMN 5 (Last Observed)
else
# CRED SCAN DATE IS MORE RECENT THAN AGENT SCAN DATE
# USE CRED SCAN DATE TO FILTER COLUMN 5 (Last Observed)
fi
elif [ ! -z ${LAST_VULN_AGENT_SCAN} ] && [ -z ${LAST_VULN_CRED_SCAN} ]; then
# AGENT SCAN DATE IS NOT NULL AND CRED SCAN DATE IS NULL
# USE AGENT SCAN DATE TO FILTER COLUMN 5 (Last Observed)
elif [ -z ${LAST_VULN_AGENT_SCAN} ] && [ ! -z ${LAST_VULN_CRED_SCAN} ]; then
# CRED SCAN DATE IS NOT NULL AND AGENT SCAN DATE IS NULL
# USE CRED SCAN DATE TO FILTER COLUMN 5 (Last Observed)
else
# USE NONCRED SCAN DATE TO FILTER COLUMN 5 (Last Observed)
fi
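For reference, the branching above can be written as a small runnable shell function. This is only a sketch under stated assumptions: the names pick_filter_date and sortkey are hypothetical and not part of the original post, and instead of epoch seconds it compares zero-padded YYYYMMDDHHMM keys, so it does not depend on any particular date tool.

```shell
# pick_filter_date AGENT NONCRED CRED   (hypothetical helper, for illustration only)
# Prints the scan date that should be used to filter column 5 (Last Observed).
pick_filter_date() {
    awk -v agent="$1" -v noncred="$2" -v cred="$3" '
    # Turn "M/D/YYYY H:MM" into a numeric YYYYMMDDHHMM key that sorts chronologically
    function sortkey(ts,    p) {
        split(ts, p, "[ /:]")      # p[1]=month p[2]=day p[3]=year p[4]=hour p[5]=minute
        return sprintf("%04d%02d%02d%02d%02d", p[3], p[1], p[2], p[4], p[5]) + 0
    }
    BEGIN {
        if (agent != "" && cred != "")       # both present: take the more recent one
            pick = (sortkey(agent) > sortkey(cred)) ? agent : cred
        else if (agent != "")                # only the agent scan date is present
            pick = agent
        else if (cred != "")                 # only the cred scan date is present
            pick = cred
        else                                 # fall back to the noncred scan date
            pick = noncred
        print pick
    }'
}

pick_filter_date "12/8/2022 3:14" "12/3/2022 20:39" ""    # server1 row -> 12/8/2022 3:14
pick_filter_date "" "12/3/2022 3:14" "12/8/2022 3:14"     # server3 row -> 12/8/2022 3:14
```

The three arguments correspond to columns 2, 3 and 4 of a FILE2.csv row; an empty string stands for a missing date.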
Reputation: 34084
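A few issues with OP's current code:
- when processing the first file the code populates the a[] array with a single compound index (a[$2,$3,$4]), but ...
- ... the later tests look up single-index entries (a[$2], a[$3], a[$4]) in the a[] array; net result is that none of the tests will evaluate as 'true'
- FILE1.csv and FILE2.csv are based on a common Name (eg, server1.domain) so there needs to be some sort of comparison of $1 between the two files; more specifically, array indices should probably be based on $1
- the if/else chain sits outside any { ... } action block, which is not valid in awk scripts

Additional items we need to address:
- store the FILE2.csv dates in three arrays, each indexed by $1: dt2[] (re: $2), dt3[] (re: $3) and dt4[] (re: $4)
- when both $2 and $4 are present and $2 > $4 (ie, $2 is the more recent date), we'll create an entry in the greater[] array

Pulling all of this together into our awk script: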
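awk '
BEGIN   { FS=OFS="," }

# 1st file (FILE2.csv): load the scan dates, indexed by Name ($1)
FNR==NR { if (FNR==1)                       # skip header line
              next
          if ($2) dt2[$1]=$2                # LAST_VULN_AGENT_SCAN
          if ($3) dt3[$1]=$3                # LAST_VULN_NONCRED_SCAN
          if ($4) dt4[$1]=$4                # LAST_VULN_CRED_SCAN
          if ($2 && $4) {                   # both dates present: find the more recent one
              # mktime() is not POSIX; it is available in GNU awk (gawk)
              # datespec format is "YYYY MM DD HH MM SS"
              split($2,a,"[ /:]")
              epoch2=mktime(a[3] " " a[1] " " a[2] " " a[4] " " a[5] " 0")
              split($4,a,"[ /:]")
              epoch4=mktime(a[3] " " a[1] " " a[2] " " a[4] " " a[5] " 0")
              if (epoch2 > epoch4)
                  greater[$1]               # flag: $2 is more recent than $4
          }
          next
        }

# 2nd file (FILE1.csv): print the header plus rows whose Last Observed ($5) matches
FNR==1  { printme=1 }                       # set print flag
FNR>1   { printme=0                         # clear print flag
          if ($1 in dt2 && $1 in dt4) {
              if ($1 in greater) {
                  if ($5 == dt2[$1])
                      printme=1
              }
              else if ($5 == dt4[$1]) {
                  printme=1
              }
          }
          else if ($1 in dt2) {
              if ($5 == dt2[$1])
                  printme=1
          }
          else if ($1 in dt4) {
              if ($5 == dt4[$1])
                  printme=1
          }
          else if ($1 in dt3) {
              if ($5 == dt3[$1])
                  printme=1
          }
        }
printme                                     # if print flag == 1 then print current line to stdout
' FILE2.csv FILE1.csv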
This generates:
Name,Plugin,Plugin Name,First Discovered,Last Observed,Severity
server1.domain,94138,Oracle Java SE Multiple Vulnerabilities (October 2016 CPU),4/22/2021 3:08,12/8/2022 3:14,Critical
server2.domain,156032,Apache Log4j Unsupported Version Detection,12/25/2021 3:07,12/8/2022 3:07,Critical
server3.domain,168362,VMware Tools 10.x / 11.x / 12.x < 12.1.5 DoS (VMSA-2022-0029),12/5/2022 3:14,12/8/2022 3:14,Low
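As a footnote on why the original a[$2,$3,$4] lookups never matched: awk joins comma-separated subscripts into one string key (using SUBSEP), so a three-part index creates a single element and no single-part ones. A minimal sketch to reproduce this in isolation, using made-up keys "x", "y" and "z" rather than real data:

```shell
result=$(awk 'BEGIN {
    a["x", "y", "z"]                                   # creates ONE element keyed "x" SUBSEP "y" SUBSEP "z"
    single   = ("x" in a) ? "yes" : "no"               # is there an element with the key "x" alone?
    compound = (("x", "y", "z") in a) ? "yes" : "no"   # is there an element with the full 3-part key?
    print "single-index found: " single
    print "compound-index found: " compound
}')
echo "$result"
# single-index found: no
# compound-index found: yes
```

This is why the script above uses separate arrays (dt2[], dt3[], dt4[]), each indexed by $1 alone.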