Reputation: 179
I would like to remove every column that contains at least one NA. I used this command:
awk ' $0 !="NA" {print $0}' file
But it does not work. For example, the file looks like this:
1 2 3 NA 6 male
4 6 2 1 NA female
NA 2 2 NA 3 male
7 2 2 7 NA male
I want the output file to be:
2 3 male
6 2 female
2 2 male
2 2 male
Upvotes: 0
Views: 512
Reputation: 782106
You need to make two passes over the data. The first pass saves all the input in an array and records the numbers of the columns that contain NA in a second array. Then, at the end, you print all the saved data, skipping the columns listed in the second array.
awk '{ lines[NR] = $0                                            # first pass: save every line
       for (i = 1; i <= NF; i++) if ($i == "NA") skip[i] = 1 }   # remember columns holding NA
END { for (i = 1; i <= NR; i++) {                                # second pass: reprint saved lines
        nf = split(lines[i], fields); sep = ""
        for (j = 1; j <= nf; j++) if (!(j in skip)) { printf("%s%s", sep, fields[j]); sep = " " }
        printf("\n")
      }
}' inputfile > outputfile
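If the file is too large to keep in memory, another option is to let awk read the file itself twice instead of saving the lines. This is only a minimal sketch: it assumes the input is a regular file that can be opened twice (not a pipe), and the function name drop_na_columns is made up for illustration.

```shell
# drop_na_columns FILE: hypothetical helper, prints FILE without the columns that contain NA
drop_na_columns() {
  awk '
    NR == FNR {                      # first read of the file: record columns containing NA
      for (i = 1; i <= NF; i++) if ($i == "NA") skip[i] = 1
      next
    }
    {                                # second read: print only the columns not recorded
      sep = ""
      for (i = 1; i <= NF; i++) if (!(i in skip)) { printf "%s%s", sep, $i; sep = " " }
      printf "\n"
    }
  ' "$1" "$1"                        # the same file is passed twice on purpose
}
```

Call it as drop_na_columns inputfile > outputfile. The NR == FNR test is true only while awk is reading the first file argument, which is how the two reads are told apart.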
Upvotes: 1