Hanfei Sun

Reputation: 47051

How to implement this in awk or shell?

Input File1:

5 5 NA
NA NA 1
2 NA 2

Input File2:

1 1 1
2 NA 2
3 NA NA
NA 4 4
5 5 5
NA NA 6

Output:

3 NA NA
NA 4 4
NA NA 6

The goal is this: collect every field in file1 that is not NA into a set, then drop every line of file2 that contains a field from this set. Does anyone have ideas about this?

Upvotes: 2

Views: 96

Answers (4)

Scrutinizer

Reputation: 9926

If the column position of the values matters:

awk '
  NR==FNR{
    for(i=1; i<=NF; i++) if($i!="NA") A[i,$i]=1
    next
  }
  {
    for(i=1; i<=NF; i++) if($i!="NA" && A[i,$i]) next
    print
  }
' file1 file2
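
The question's sample data gives the same output whether or not column position is taken into account, so the difference is easier to see on other input. The following is a minimal sketch with made-up data: the file names pos1.txt and pos2.txt, their contents, and the variable names are hypothetical and not from the question. It keys the array on (column, value) in one run and on the value alone in the other.

```shell
# Hypothetical sample data: the value 7 appears in column 1 of the first file,
# and in column 2 (line 1) and column 1 (line 2) of the second file.
tmp=$(mktemp -d)
printf '7 NA NA\n' > "$tmp/pos1.txt"
printf 'NA 7 NA\n7 NA NA\n' > "$tmp/pos2.txt"

# Position-aware: key on (column, value), so "NA 7 NA" survives.
by_column=$(awk '
  NR==FNR { for (i=1; i<=NF; i++) if ($i != "NA") seen[i,$i] = 1; next }
  { for (i=1; i<=NF; i++) if ((i,$i) in seen) next; print }
' "$tmp/pos1.txt" "$tmp/pos2.txt")

# Position-agnostic: key on the value only, so both lines are dropped.
any_column=$(awk '
  NR==FNR { for (i=1; i<=NF; i++) if ($i != "NA") seen[$i] = 1; next }
  { for (i=1; i<=NF; i++) if ($i in seen) next; print }
' "$tmp/pos1.txt" "$tmp/pos2.txt")

printf 'by column:  [%s]\n' "$by_column"
printf 'any column: [%s]\n' "$any_column"
rm -rf "$tmp"
```

Testing membership with `(i,$i) in seen` instead of reading `seen[i,$i]` avoids creating empty array entries as a side effect of the lookup.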

Upvotes: 0

Steve

Reputation: 54392

To add every item that is not 'NA' to the set:

awk -f script.awk file1 file2

Contents of script.awk:

FNR==NR {
    for (i=1;i<=NF;i++) {
        if ($i != "NA") {
            a[$i]++
        }
    }
    next
}

{
    for (j=1;j<=NF;j++) {
        if ($j in a) {
            next
        }
    }
}1

Results:

3 NA NA
NA 4 4
NA NA 6

Alternatively, here's the one-liner:

awk 'FNR==NR { for (i=1;i<=NF;i++) if ($i != "NA") a[$i]++; next } { for (j=1;j<=NF;j++) if ($j in a) next }1' file1 file2
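
To try this without creating the files by hand, here is a small sketch that writes the question's sample data into a temporary directory and runs the same logic with comments. The use of `mktemp` and the variable names `tmp` and `result` are assumptions for illustration only.

```shell
# Recreate the sample input from the question in a scratch directory.
tmp=$(mktemp -d)
printf '5 5 NA\nNA NA 1\n2 NA 2\n' > "$tmp/file1"
printf '1 1 1\n2 NA 2\n3 NA NA\nNA 4 4\n5 5 5\nNA NA 6\n' > "$tmp/file2"

result=$(awk '
  # First file (FNR==NR): remember every field that is not NA.
  FNR==NR { for (i=1; i<=NF; i++) if ($i != "NA") a[$i]++; next }
  # Second file: skip the line as soon as one field is in the set.
  { for (j=1; j<=NF; j++) if ($j in a) next }
  # A bare pattern of 1 is always true, so every line not skipped is printed.
  1
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$result"
rm -rf "$tmp"
```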

Upvotes: 2

Chris Seymour

Reputation: 85775

You could do this with grep:

$ grep -Eo '[0-9]+' file1 | grep -Fwvf - file2
3 NA NA
NA 4 4
NA NA 6
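
The pattern `[0-9]+` assumes that every non-NA value is an unsigned integer. If the values could be arbitrary tokens, one possible variation is to let awk emit each non-NA field on its own line and feed that list to grep as fixed strings. This is only a sketch under that assumption: the sample values `x1`, `y2`, `z3` and the file names `f1` and `f2` are made up.

```shell
# Hypothetical data with non-numeric values.
tmp=$(mktemp -d)
printf 'x1 NA\nNA y2\n' > "$tmp/f1"
printf 'x1 z3\nz3 NA\nNA y2\n' > "$tmp/f2"

# Print every non-NA field of f1 on its own line, then use those lines
# as fixed-string (-F), whole-word (-w) patterns to exclude (-v) from f2.
kept=$(awk '{ for (i=1; i<=NF; i++) if ($i != "NA") print $i }' "$tmp/f1" |
  grep -Fwvf - "$tmp/f2")

printf '%s\n' "$kept"
rm -rf "$tmp"
```

Note that `-w` treats only letters, digits, and underscores as word characters, so values containing other punctuation (for example `1.5`) may still match partially. In that case an awk-based answer is likely the safer choice.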

Upvotes: 2

Kent

Reputation: 195029

awk one-liner:

awk 'NR==FNR{for(i=1;i<=NF;i++)if($i!="NA")a[$i]; next}{for(i=1;i<=NF;i++)if($i in a)next}1' file1 file2

With your data:

kent$  awk 'NR==FNR{for(i=1;i<=NF;i++)if($i!="NA")a[$i]; next}{for(i=1;i<=NF;i++)if($i in a)next}1' file1 file2
3 NA NA
NA 4 4
NA NA 6

Upvotes: 0
