dddxxx

Reputation: 359

Awk OR conditional not working

Input: A tab-separated input file with 15 columns where column 15 is an integer.

Output: The number of lines that satisfy the conditional.

My code:

$ closest-features --closest --no-overlaps --delim '\t' --dist --ec megatrans_enhancers.sorted.bed ../../data/alu_repeats.sorted.bed | awk -v OFS='\t' '{if ($15 <= 1000 || $15 >= -1000) print $0}' | wc -l
1188

The || conditional does not seem to filter anything here: the file has 1188 lines in total, and I know for certain that at least some of them do not satisfy the condition. If I remove the OR and keep only one comparison, then suddenly it works:

$ closest-features --closest --no-overlaps --delim '\t' --dist --ec megatrans_enhancers.sorted.bed ../../data/alu_repeats.sorted.bed | awk -v OFS='\t' '{if ($15 <= 1000) print $0}' | wc -l
926

Not sure what I'm doing wrong. Any advice?

Example Input to Awk command:

chr1    378268  378486  chr1-798_Enhancer       17.2    +       chr1    375923  376219  AluY|SINE|Alu-HOMER529  0       +       E:375923        0.044   -2050
chr1    1079471 1079689 chr1-929_Enhancer       14.6    -       chr1    1071271 1071563 AluSx1|SINE|Alu-HOMER1669       0       -       E:1071271       0.13    -7909
chr1    1080259 1080477 chr1-830_Enhancer       16.7    -       chr1    1071271 1071563 AluSx1|SINE|Alu-HOMER1669       0       -       E:1071271       0.13    -8697
chr1    6611744 6611962 chr1-241_Enhancer       46.6    +       chr1    6611431 6611723 AluSc|SINE|Alu-HOMER10257       0       +       E:6611431       0.089   -22
chr1    6959639 6959857 chr1-58_Enhancer        100.1   -       chr1    6966612 6966911 AluSx|SINE|Alu-HOMER11041       0       -       E:6966612       0.137   6756
chr1    6960593 6960811 chr1-202_Enhancer       51.6    -       chr1    6966612 6966911 AluSx|SINE|Alu-HOMER11041       0       -       E:6966612       0.137   5802
chr1    7447888 7448106 chr1-2_Enhancer 181.9   -       chr1    7449489 7449799 AluSz|SINE|Alu-HOMER11879       0       +       E:7449489       0.119   1384
chr1    10752461        10752679        chr1-131_Enhancer       65.4    -       chr1    10752754        10753065        AluSq2|SINE|Alu-HOMER19455      0       +       E:10752754      0.106      76
chr1    12485694        12485912        chr1-353_Enhancer       36.7    +       chr1    12487328        12487634        AluSx3|SINE|Alu-HOMER23581      0       +       E:12487328      0.085      1417
chr1    12486469        12486687        chr1-141_Enhancer       63.6    +       chr1    12487328        12487634        AluSx3|SINE|Alu-HOMER23581      0       +       E:12487328      0.085      642

Upvotes: 1

Views: 155

Answers (1)

RavinderSingh13

Reputation: 133518

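Use && instead of ||, because the value has to be both greater than or equal to -1000 and less than or equal to 1000. With ||, every number satisfies at least one of the two comparisons, so every line is printed.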

Your_command | awk '$15<=1000 && $15>=-1000{count++} END{print count}'

Add -F"\t" to the awk command above if its input is tab-delimited. There is also no need to pipe into wc -l afterwards: the script counts the lines that satisfy the condition in a variable named count and prints it in the END block, after the last line of input has been read.
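As a rough sketch of the points above, the counting logic could be wrapped in a small shell function. The name count_within and the max variable are hypothetical, chosen only for illustration, and the input is assumed to be tab-delimited with the distance in column 15. Printing count + 0 instead of count is a small addition so that an input with no matching lines yields 0 rather than an empty line.

```shell
# count_within is a hypothetical helper name, not part of any tool.
# Usage: Your_command | count_within 1000
count_within() {
    awk -F'\t' -v max="$1" '
        # Keep rows whose 15th field lies in the closed range [-max, max].
        $15 >= -max && $15 <= max { count++ }
        # count + 0 forces a numeric 0 when nothing matched.
        END { print count + 0 }
    '
}
```

Passing the threshold with -v keeps the awk program free of shell quoting tricks and makes the range easy to change.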

For the sample you provided, the output is 3, which I believe is correct.
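To check this by hand, here is a minimal sketch that feeds only the column-15 values from the ten sample rows in the question into awk, once with || and once with &&. The variable names are made up for the example, and the values sit in field 1 here because the other columns are left out.

```shell
# Column-15 values copied from the ten sample rows in the question.
values='-2050 -7909 -8697 -22 6756 5802 1384 76 1417 642'

# OR: every number is <= 1000 or >= -1000, so every line matches.
or_count=$(printf '%s\n' $values | awk '$1 <= 1000 || $1 >= -1000 {n++} END {print n + 0}')

# AND: only values inside the closed range [-1000, 1000] match.
and_count=$(printf '%s\n' $values | awk '$1 <= 1000 && $1 >= -1000 {n++} END {print n + 0}')

echo "OR: $or_count  AND: $and_count"
```

The OR version counts all 10 values, which mirrors the 1188 out of 1188 result in the question, while the AND version counts only -22, 76 and 642.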

Upvotes: 1
