Apex
Apex

Reputation: 1096

How to calculate max and min of multiple columns (row wise) using awk

This might be simple - I have a file as below:

df.csv

col1,col2,col3,col4,col5
A,2,5,7,9
B,6,10,2,3
C,3,4,6,8

I want to perform max(col2,col4) - min(col3,col5) but I get an error using max and min in awk and write the result in a new column. So the desired output should look like:

col1,col2,col3,col4,col5,New_col
A,2,5,7,9,2
B,6,10,2,3,3
C,3,4,6,8,2

I used the code below but is does not work - how can I solve this?

awk -F, '{print $1,$2,$3,$4,$5,$(max($7,$9)-min($8,$10))}'

Thank you.

Upvotes: 1

Views: 1150

Answers (3)

Ed Morton
Ed Morton

Reputation: 203324

$ cat tst.awk
BEGIN { FS=OFS="," }
{ print $0, (NR>1 ? max($2,$4) - min($3,$5) : "New_col") }

function max(a,b) {return (a>b ? a : b)}
function min(a,b) {return (a<b ? a : b)}

$ awk -f tst.awk file
col1,col2,col3,col4,col5,New_col
A,2,5,7,9,2
B,6,10,2,3,3
C,3,4,6,8,2

If your actual "which is larger" calculation is more involved than just using >, e.g. if you were comparing dates in some non-alphabetic format or peoples names where you have to compare the surname before the forename and handle titles, etc., then you'd write the functions as:

function max(a,b) {
    # some algorithm to compare the 2 strings
}
function min(a,b) {return (max(a,b) == a ? b : a)}

Upvotes: 2

anubhava
anubhava

Reputation: 785058

You may use this awk:

awk 'BEGIN{FS=OFS=","} NR==1 {print $0, "New_col"; next} {print $0, ($2 > $4 ? $2 : $4) - ($3 < $5 ? $3 : $5)}' df.csv

col1,col2,col3,col4,col5,New_col
A,2,5,7,9,2
B,6,10,2,3,3
C,3,4,6,8,2

A more readable version:

awk '
BEGIN { FS = OFS = "," }
NR == 1 {
   print $0, "New_col"
   next
}
{
   print $0, ($2 > $4 ? $2 : $4) - ($3 < $5 ? $3 : $5)
}' df.csv

Upvotes: 1

Daweo
Daweo

Reputation: 36390

get an error using max and min in awk and write the result in a new column.

No such function are available in awk but for two values you might harness ternary operator, so in place of

max($7,$9)

try

($7>$9)?$7:$9

and in place of

min($8,$10)

try

($8<$10)?$8:$10

Above exploit ?: which might be explained as check?valueiftrue:valueiffalse, simple example, let file.txt content be

100,100
100,300
300,100
300,300

then

awk 'BEGIN{FS=","}{print ($1>$2)?$1:$2}' file.txt

output

100
300
300
300

Also are you sure about 1st $ in $(max($7,$9)-min($8,$10))? By doing so you instructed awk to get value of n-th column, where n is result of computation inside (...).

Upvotes: 0

Related Questions