Reputation: 43
I have 2 csv files( 1.csv and 2.csv). Number of columns in both the files are same.
1.csv
Server,Path,I/P Backlog,O/P Backlog
Server1,Yes,3000,200
Server3,Yes,8,0
Server5,Yes,0,0
2.csv
Server,Path,I/P Backlog,O/P Backlog
Server1,Yes,3300,180
Server3,Yes,7,0
Server5,Yes,0,0
Server6,Yes,10,200
I have to compare 3 different columns in both csv.
IN above example, if server name is same, then find the deviation percent of I/P backlog and o/p backlog and create 2 separate columns for input trend and output trend.
if value is same: Nochange. if value has increased then it should increase % (up) if value has decreased: decrease % (down) if new row has been added: New if any row doesnt exist in second report , nothing needs to be done.
So my final sheet should show as below.
Server Path I/P Backlog Input Trend O/P Backlog Output Trend
Server1 Yes 3300 10%(up) 180 20%(down)
Server3 Yes 7 20%(down) 0 No change
Server5 Yes 0 No change 0 No change
Server6 Yes 10 New 200 New
Upvotes: 0
Views: 165
Reputation: 92884
Extended awk
solution:
awk 'function percent(v){
if (v !~ /^[0-9]+(\.[0-9]+)?$/){ return v }
coef=1; dev="down";
if (v >= 1){ coef=-1; dev="up" }
return sprintf("%d%(%s)",(100-v*100) * coef, dev)
}
NR==1{
$4=$4 FS "Input Trend";
$6=$6 FS "Output Trend";
print
}
FNR==1{ next }{ k=$1 FS $2 }
NR==FNR{ a[k]=$3 FS $4; next }
k in a{
split(a[k], vals);
if ($3==vals[1]) ip_diff="No change";
else if (!vals[1]) ip_diff=$3"(up)";
else ip_diff=$3/vals[1];
if ($4==vals[2]) op_diff="No change";
else if (!vals[2]) op_diff=$4"(up)";
else op_diff=$4/vals[2];
print k, $3, percent(ip_diff), $4, percent(op_diff); next
}
{ print $1,$2,$3,"New",$4,"New" }' f1.csv f2.csv
The output:
Server Path I/P Backlog Input Trend O/P Backlog Output Trend
Server1 Yes 3300 10%(up) 180 10%(down)
Server3 Yes 7 12%(down) 0 No change
Server5 Yes 0 No change 0 No change
Server6 Yes 10 New 200 New
Upvotes: 1