Reputation: 469
I have a unix question. I have a file that looks like this:
AAAA 0 1 2 2 0
BBBBB 2 2 2 2 2
CCCCC 1 1 0 1 1
DDDD 0 0 0 0 0
EEEEE 2 2 0 2 2
The file has many thousands of rows like this and is tab-delimited. The first column is a name and the 2nd through 6th columns are the data; only the data columns matter. I need to output all of the lines in which the 2nd-6th columns contain no more than one 0 (zero). For example, I would want the output to look like this:
BBBBB 2 2 2 2 2
CCCCC 1 1 0 1 1
EEEEE 2 2 0 2 2
I have been trying to do this as simply as possible and have tried the following awk command:
awk 'BEGIN{out!=0;}{if($2!=0)out++;if($3!=0)out++;if($4!=0)out++;if($5!=0)out++;if($6!=0)out++;if (out>=4)print;}'
But, when I try this, it just gives me the original input file. I am not sure what is wrong, or if I am even taking a correct approach. Any help would be appreciated.
Upvotes: 2
Views: 1515
Reputation: 67301
A much simpler way to do this is:
awk '{count=0;for(i=2;i<=NF;i++){if($i~/0/)++count;}if(count <=1)print}' file1
Tested below:
> cat file1
AAAA 0 1 2 2 0
BBBBB 2 2 2 2 2
CCCCC 1 1 0 1 1
DDDD 0 0 0 0 0
EEEEE 2 2 0 2 2
sEEEE 2 0 0 0 2
> awk '{count=0;for(i=2;i<=NF;i++){if($i~/0/)++count;}if(count <=1)print}' file1
BBBBB 2 2 2 2 2
CCCCC 1 1 0 1 1
EEEEE 2 2 0 2 2
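One caveat: `$i~/0/` is a regular-expression match, so it counts any field that contains the character 0 (such as 10 or 205), not only fields equal to zero. That makes no difference for the single-digit data shown here. A minimal sketch of the difference, assuming a hypothetical row `FFFF 10 20 1 1 1` that is not in the original file:

```shell
# Hypothetical row: the data columns contain the digit 0 but no zero values.
row='FFFF 10 20 1 1 1'

# Regex match: counts fields that merely contain the character "0".
regex_count=$(printf '%s\n' "$row" |
  awk '{ n = 0; for (i = 2; i <= NF; i++) if ($i ~ /0/) n++; print n }')

# Numeric comparison: counts only fields whose value is zero.
numeric_count=$(printf '%s\n' "$row" |
  awk '{ n = 0; for (i = 2; i <= NF; i++) if ($i == 0) n++; print n }')

echo "regex: $regex_count, numeric: $numeric_count"
```

With the regex test this row would be dropped (two matches), while with `$i == 0` it would be kept.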
>
Upvotes: 0
Reputation: 327
Assuming that the columns conform to a particular format can be dangerous. Below is an easy solution that relies on awk comparisons evaluating to 1 (true) or 0 (false):
awk '($2==0) + ($3==0) + ($4==0) + ($5==0) + ($6==0) <2' file.txt
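To see why this works: each comparison such as `($2==0)` yields 1 when the column is zero and 0 otherwise, so the sum is the number of zero-valued data columns. A small sketch that prints that sum next to each name, using the sample data from the question (the `mktemp` file and the variable names are only for illustration):

```shell
# Write the sample rows to a temporary file (spaces or tabs both work
# with awk's default field splitting).
sample=$(mktemp)
printf '%s\n' 'AAAA 0 1 2 2 0' 'BBBBB 2 2 2 2 2' 'CCCCC 1 1 0 1 1' \
  'DDDD 0 0 0 0 0' 'EEEEE 2 2 0 2 2' > "$sample"

# n is the number of data columns equal to zero on each line.
zero_counts=$(awk '{ n = ($2==0) + ($3==0) + ($4==0) + ($5==0) + ($6==0); print n, $1 }' "$sample")
echo "$zero_counts"
rm -f "$sample"
```

Lines whose count is below 2 (BBBBB, CCCCC and EEEEE) are the ones the one-liner prints.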
Upvotes: 0
Reputation: 54572
One way using perl:
perl -ne 'print if(tr/0/0/ <= 1)' file.txt
I'm assuming the names on each line don't contain digits (specifically 0) and that the data values are single digits, since tr/0/0/ counts every 0 character on the line. Also, if you add the -i flag, perl edits the file in place.
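For comparison, a rough awk counterpart of the same idea, sketched under the same assumptions (no 0 in the names, single-digit values): `gsub` returns the number of replacements it made, so replacing each 0 with itself on a copy of the line counts the 0 characters.

```shell
# Count "0" characters per line; keep lines with at most one.
kept=$(printf '%s\n' 'AAAA 0 1 2 2 0' 'BBBBB 2 2 2 2 2' 'CCCCC 1 1 0 1 1' \
  'DDDD 0 0 0 0 0' 'EEEEE 2 2 0 2 2' |
  awk '{ line = $0; n = gsub(/0/, "0", line) } n <= 1')
echo "$kept"
```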
Upvotes: 0
Reputation: 247192
awk '
{
    nzero = 0
    # stop scanning as soon as a second zero is found
    for (fld = 2; nzero <= 1 && fld <= 6; fld++) {
        if ($fld == 0) nzero++
    }
    if (nzero <= 1) print
}
' filename
Upvotes: 0
Reputation: 1873
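The mistake is that you never reset the out variable for each record, so it keeps accumulating across lines; the BEGIN block runs only once, before any input is read. (Also, out!=0 in the BEGIN block is a "not equals" comparison, not an assignment, so it does not initialize anything.) Resetting out at the start of each record fixes it: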
awk '{out = 0; if($2!=0) out++; if($3!=0) out++; if($4!=0) out++; if($5!=0) out++; if($6!=0) out++; if(out>=4) print}'
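To illustrate the accumulation, here is a small sketch that applies the same counting logic without the reset (written as a loop for brevity) and prints the running value of `out` after each record of the sample data:

```shell
# No "out = 0" here, so the counter carries over from line to line.
running=$(printf '%s\n' 'AAAA 0 1 2 2 0' 'BBBBB 2 2 2 2 2' 'CCCCC 1 1 0 1 1' \
  'DDDD 0 0 0 0 0' 'EEEEE 2 2 0 2 2' |
  awk '{ for (i = 2; i <= 6; i++) if ($i != 0) out++; print $1, out }')
echo "$running"
```

Once the running total reaches 4 it never drops back, so every later line passes the `out>=4` test and gets printed, including DDDD.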
Upvotes: 2