Reputation: 37
I'm a bit of a newbie to shell scripting and awk. Could anyone suggest a more efficient and elegant solution than what I'm doing below to perform a key lookup between two files?
Two input files:
File 1 - Contains a single-column key field (server-metricname-minute):
key_column
server026-AckDelayAverage-00:01:00
server026-AckDelayMax-00:01:00
server026-AckSent-00:01:00
server026-DigEnvValidationLatestTime-00:01:00
server026-DigEnvValidationTimeAverage-00:01:00
File 2 - Comma-separated, containing the key field and a number of other fields:
key_column,host,date,minute,metricname, metric value
server026-AckDelayAverage-00:01:00,server026,May 24 2016,00:01:00,AckDelayAverage,942
server026-AckDelayMax-00:01:00,server026,May 24 2016,00:01:00,AckDelayMax,5855
server026-AckSent-00:01:00,server026,May 24 2016,00:01:00,AckSent,49038
My logic is:
Loop through file1
    If key found in File2
        print file1.key, file2.field3, file2.field6 to file3
    else
        print file1.key + 'KEY_NOT_FOUND' text to file3
    fi
So the file3 output should have a row for every record in file1.
The code below seems to work, but could anyone suggest a more efficient and elegant method of achieving this?
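while read key ;
do
    # Look up the key in file2 (empty if there is no match)
    metric_found=`grep "$key" file2`
    if [[ ! -z $metric_found ]]
    then
        # Print key, date (field 3) and metric value (field 6)
        echo "${metric_found}" | awk -F "," '{print $1","$3","$6}'
    else
        echo "${key},KEY_NOT_FOUND"
    fi
done < file1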
Example output from the existing script, based on the sample data:
server026-AckDelayAverage-00:01:00,May 24 2016,942
server026-AckDelayMax-00:01:00,May 24 2016,5855
server026-AckSent-00:01:00,May 24 2016,49038
server026-DigEnvValidationLatestTime-00:01:00,KEY_NOT_FOUND
server026-DigEnvValidationTimeAverage-00:01:00,KEY_NOT_FOUND
thanks..
Upvotes: 3
Views: 960
Reputation: 203413
$ cat tst.awk
BEGIN { FS=OFS="," }
NR==FNR { file2[$1] = $3 OFS $6; next }
FNR>1 { print $1, ($1 in file2 ? file2[$1] : "KEY_NOT_FOUND") }
$ awk -f tst.awk file2 file1
server026-AckDelayAverage-00:01:00,May 24 2016,942
server026-AckDelayMax-00:01:00,May 24 2016,5855
server026-AckSent-00:01:00,May 24 2016,49038
server026-DigEnvValidationLatestTime-00:01:00,KEY_NOT_FOUND
server026-DigEnvValidationTimeAverage-00:01:00,KEY_NOT_FOUND
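In case the NR==FNR idiom is unfamiliar, here is the same logic as a self-contained sketch with comments. The temp directory, the array name seen and the result variable are only there for illustration, and the data is a cut-down copy of the samples in the question.

```shell
# Illustrative setup: recreate trimmed copies of the sample files in a temp dir.
dir=$(mktemp -d)

cat > "$dir/file1" <<'EOF'
key_column
server026-AckDelayAverage-00:01:00
server026-AckSent-00:01:00
server026-DigEnvValidationLatestTime-00:01:00
EOF

cat > "$dir/file2" <<'EOF'
key_column,host,date,minute,metricname, metric value
server026-AckDelayAverage-00:01:00,server026,May 24 2016,00:01:00,AckDelayAverage,942
server026-AckDelayMax-00:01:00,server026,May 24 2016,00:01:00,AckDelayMax,5855
server026-AckSent-00:01:00,server026,May 24 2016,00:01:00,AckSent,49038
EOF

result=$(awk '
BEGIN { FS = OFS = "," }
# First file on the command line (file2): NR equals FNR only here.
# Remember date and value under the key, then move on to the next line.
NR == FNR { seen[$1] = $3 OFS $6; next }
# Second file (file1): skip the header line, then look each key up.
FNR > 1 { print $1, ($1 in seen ? seen[$1] : "KEY_NOT_FOUND") }
' "$dir/file2" "$dir/file1")

printf '%s\n' "$result"
rm -rf "$dir"
```

The gain over the while/grep loop is that each file is read exactly once by a single awk process, instead of starting one grep (and possibly one awk) per key. The lookup is also an exact match on the whole key rather than a regex match anywhere in the line.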
Upvotes: 3
Reputation: 5252
Try this:
awk 'BEGIN{FS=OFS=","}NR==FNR{a[$1]=1;b[$1]=$3;c[$1]=$6;}NR>FNR{if (a[$1]) print $1,b[$1],c[$1]; else print $1,"KEY_NOT_FOUND";}' file2 file1 > file3
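Spread over several lines it may be easier to follow. This is only a sketch with two changes that are not in the one-liner above: it skips the header line of file1 (as written, the one-liner also prints a row for key_column, since that string is the first field of both headers), and it tests membership with "in" instead of the flag array a. The temp directory and the result variable are illustrative, and the data is a cut-down copy of the question's samples.

```shell
# Illustrative setup: a temp directory holding cut-down copies of the sample files.
dir=$(mktemp -d)
printf '%s\n' 'key_column' \
    'server026-AckSent-00:01:00' \
    'server026-DigEnvValidationLatestTime-00:01:00' > "$dir/file1"
printf '%s\n' 'key_column,host,date,minute,metricname, metric value' \
    'server026-AckSent-00:01:00,server026,May 24 2016,00:01:00,AckSent,49038' > "$dir/file2"

awk '
BEGIN { FS = OFS = "," }
# Pass 1 (file2): store date and metric value under each key.
NR == FNR { b[$1] = $3; c[$1] = $6; next }
# Pass 2 (file1): drop the header line (not in the original one-liner).
FNR == 1 { next }
{
    # "in" tests membership without creating an empty array entry.
    if ($1 in b) print $1, b[$1], c[$1]
    else         print $1, "KEY_NOT_FOUND"
}' "$dir/file2" "$dir/file1" > "$dir/file3"

result=$(cat "$dir/file3")
printf '%s\n' "$result"
rm -rf "$dir"
```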
Upvotes: 3