user1899817

Reputation: 89

Can someone help me get the average of one column using awk, with a condition on another column? This is what I tried:

awk -F, '{if ($2 == 0) awk '{ total += $3; count++ } END { print total/count }' CLN_Tapes_LON; }' /tmp/CLN_Tapes_LON

awk: {if ($2 == 0) awk {
awk:                   ^ syntax error
bash: count++: command not found

Upvotes: 3

Views: 353

Answers (3)

userABC123

Reputation: 1500

Thought I'd try to do this without awk. Awk is clearly the better choice, but it's still a one-liner.

bc<<<"($(grep ' 0 ' file|tee >(wc -l>i)|cut -d\  -f3|tr '\n' '+')0)/"$(<i)

3

It extracts the lines with 0 in the second column using grep. The result is passed to tee, which feeds wc -l (to count the lines, saved in a file named i) and cut (to extract the third column). tr replaces the newlines with "+", and the trailing 0 absorbs the final "+", so the sum ends up over the number of lines (i.e., "(3+3+3+3+0)/4", or 12 / 4). This is then passed to bc.

Upvotes: 0

Zac Thompson

Reputation: 12665

Just for fun, let's look at what's wrong with your original version and transform it into something that works, step by step. Here's your initial version (I'll call it version 0):

awk -F, '{if ($2 == 0) awk '{ total += $3; count++ } END { print total/count }' CLN_Tapes_LON; }' /tmp/CLN_Tapes_LON

The -F, sets the field separator to be the comma character, but your later comment seems to indicate that the columns (fields) are separated by spaces. So let's get rid of it; whitespace-separation is what awk expects by default. Version 1:

awk '{if ($2 == 0) awk '{ total += $3; count++ } END { print total/count }' CLN_Tapes_LON; }' /tmp/CLN_Tapes_LON
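To see why the separator matters, here is a quick check. It is only a sketch, and it assumes the data is space-separated like the sample rows shown elsewhere in this thread; the variable names are made up for illustration.

```shell
# One sample line in the assumed space-separated layout.
line='CLH040 0 3'

# With -F, the line contains no comma, so awk treats it as a single field.
fields_comma=$(printf '%s\n' "$line" | awk -F, '{ print NF }')

# With the default separator, awk splits on runs of whitespace.
fields_default=$(printf '%s\n' "$line" | awk '{ print NF }')

echo "with -F, : $fields_comma field(s)"
echo "default  : $fields_default field(s)"
```

With -F, in place, $2 and $3 do not refer to the columns you meant, so the condition and the sum cannot work as intended.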

You seem to be attempting to nest a call to awk inside your awk program. There's almost never any call for that, and this wouldn't be the way to do it anyway. Let's also get rid of the mismatched quotes while we're at it. Note in passing that you cannot nest single quotes inside another pair of single quotes that way; you'd have to escape them somehow. But there's no need for them at all here. Version 2:

awk '{if ($2 == 0) { total += $3; count++ } END { print total/count } }' /tmp/CLN_Tapes_LON

This is close but not quite right. The END block is only executed once all lines of input have been processed, so it doesn't make sense to have it inside the per-line block with the if. So let's move it outside the braces. I'm also going to tighten up some whitespace. Version 3:

awk '{if ($2==0) {total+=$3; count++}} END{print total/count}' /tmp/CLN_Tapes_LON

Version 3 actually works, and you could stop here. But awk has a handy way of running a block of code only against lines that match a condition: 'condition {code}'. So yours can more simply be written as:

awk '$2==0 {total+=$3; count++} END{print total/count}' /tmp/CLN_Tapes_LON

... which, of course, is pretty much exactly what John1024 suggested.
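One optional refinement, not part of the original question: if no line has 0 in the second column, count is never set and total/count divides by zero. Depending on the awk implementation, that either aborts with an error or prints something like nan. A minimal sketch of a guard, using a made-up temporary file and an arbitrary message:

```shell
# Hypothetical input with no 0 in column 2 (file and contents are assumptions).
demo=$(mktemp)
printf '%s\n' 'CLH010 1 0' 'CLH130 1 40' > "$demo"

# count is never incremented here, so the END block takes the else branch
# instead of dividing by zero.
result=$(awk '$2==0 {total+=$3; count++} END{if (count) print total/count; else print "no matching rows"}' "$demo")
echo "$result"

rm -f "$demo"
```

When at least one row matches, the guarded version prints the same average as the unguarded one.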

Upvotes: 4

John1024

Reputation: 113814

$ awk '$2 == 0 { total += $3; count++;} END { print total/count; }' CLN_Tapes_LON 
3

This assumes that your input file looks like:

$ cat CLN_Tapes_LON 
CLH040 0 3
CLH041 0 3
CLH042 0 3
CLH043 0 3
CLH010 1 0
CLH011 1 0
CLH012 1 0
CLH013 1 0
CLH130 1 40
CLH131 1 40
CLH132 1 40
CLH133 1 40
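To reproduce this without the original file, the sample above can be written to a temporary file first. The temporary file and the variable names are assumptions for illustration; the awk program is the same as above.

```shell
# Recreate the sample input shown above in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
CLH040 0 3
CLH041 0 3
CLH042 0 3
CLH043 0 3
CLH010 1 0
CLH011 1 0
CLH012 1 0
CLH013 1 0
CLH130 1 40
CLH131 1 40
CLH132 1 40
CLH133 1 40
EOF

# Average column 3 over the rows whose column 2 equals 0.
avg=$(awk '$2 == 0 { total += $3; count++ } END { print total/count }' "$sample")
echo "$avg"

rm -f "$sample"
```

Four rows match, each with 3 in the third column, so the total is 12 and the average is 3.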

Upvotes: 0
