user2525078

Reputation: 53

How to filter a file by the nth word in a line after a pattern?

I've got a large file with different kinds of lines.

The lines I am interested in look like this:

lcl|NC_005966.1_gene_59 scaffold441.6   99.74   390 1   0   1   390 34065   34454   0.0  715
lcl|NC_005966.1_gene_59 scaffold2333.4  89.23   390 42  0   1   390 3114    2725    1e-138   488
lcl|NC_005966.1_gene_60 scaffold441.6   100.00  186 0   0   1   186 34528   34713   1e-95    344

Now I want to get the lines that start with the pattern 'lcl|NC_', but only if the third word (or, more generally, the nth word in the line) is smaller than 100.

(In this case that is the first two lines, since their third words are 99.74 and 89.23.)

The matching lines should then be saved into a new file.

Upvotes: 0

Views: 98

Answers (1)

fedorqui

Reputation: 290095

This should do it:

$ awk '$1 ~ /^lcl\|NC_/ && $3<100' file
lcl|NC_005966.1_gene_59 scaffold441.6   99.74   390 1   0   1   390 34065   34454   0.0  715
lcl|NC_005966.1_gene_59 scaffold2333.4  89.23   390 42  0   1   390 3114    2725    1e-138   488

It checks two things:
- 1st field starting with lcl|NC_: $1 ~ /^lcl\|NC_/ does it. (Thanks Ed Morton for improving the previous $1~"^lcl|NC_")
- 3rd field being <100: $3<100.
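To see why the escaped pipe matters, here is a small sketch. The file name prefix_demo.txt and its two lines are made up for illustration. Unescaped, | means "or" in a regex, so the old pattern also accepts any first field that merely contains NC_.

```shell
# Two hypothetical first fields: one with the real prefix, one that only contains "NC_".
printf '%s\n' 'lcl|NC_1 a 50' 'xxNC_2 b 50' > prefix_demo.txt

# Unescaped: "|" is alternation, so this reads as (^lcl) OR (NC_) and matches both lines.
unescaped=$(awk '$1 ~ "^lcl|NC_" {c++} END {print c+0}' prefix_demo.txt)

# Escaped: "\|" is a literal pipe, so only the line starting with lcl|NC_ matches.
escaped=$(awk '$1 ~ /^lcl\|NC_/ {c++} END {print c+0}' prefix_demo.txt)

echo "unescaped: $unescaped, escaped: $escaped"
```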

To save into a file, you can do:

awk '$1 ~ /^lcl\|NC_/ && $3<100' file > new_file
                                      ^^^^^^^^^^
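Since the question also asks about the nth word in general, one possible variation is to pass the field number and the limit in with -v. This is only a sketch: the variable names n and max and the file names sample.txt and filtered.txt are my own, and the last sample line is invented to show that lines without the prefix are skipped.

```shell
# Sample input: the three lines from the question plus one invented line without the prefix.
printf '%s\n' \
  'lcl|NC_005966.1_gene_59 scaffold441.6 99.74 390 1 0 1 390 34065 34454 0.0 715' \
  'lcl|NC_005966.1_gene_59 scaffold2333.4 89.23 390 42 0 1 390 3114 2725 1e-138 488' \
  'lcl|NC_005966.1_gene_60 scaffold441.6 100.00 186 0 0 1 186 34528 34713 1e-95 344' \
  'other_id scaffold1.1 50.00 100 0 0 1 100 1 100 1e-10 200' > sample.txt

# n   = number of the field to test (hypothetical variable name)
# max = exclusive upper limit for that field (hypothetical variable name)
# $n is the nth field, so changing n=3 to another number tests a different column.
awk -v n=3 -v max=100 '$1 ~ /^lcl\|NC_/ && $n < max' sample.txt > filtered.txt

cat filtered.txt
```

With n=3 and max=100 this should behave like the one-liner above; the pattern and the comparison are unchanged, only the field number and limit are no longer hard-coded.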

Upvotes: 3
