Reputation: 5059
As the title says, I want to remove the lines from my file in which the first column contains a lowercase letter at any position. Say I have a file like this:
Ar MA0007.3 3051 2.62674e-220 OVER 0 OVER 0.749924 0.0797918 0.6897 0.682167 -0.0615 [13,23] 1
NR3C1 MA0113.3 3051 6.79534e-208 OVER 0 OVER 0.759705 0.0819166 0.699595 0.686309 -0.0665 [13,23] 0.269309
NR3C2 MA0727.1 3051 7.09295e-206 OVER 0 OVER 0.754749 0.0821368 0.694756 0.681845 -0.067 [13,23] 0.0756584
FOXA1 MA0148.3 3051 5.53402e-91 OVER 0 OVER 0.860904 0.0640026 0.827295 0.792912 -0.0303 [-3,7] 1
Foxa2 MA0047.2 3051 3.00085e-87 OVER 0 OVER 0.864018 0.065624 0.83031 0.796327 -0.0223 [1,11] 1
FOXP1 MA0481.2 3051 3.11057e-79 OVER 0 OVER 0.843207 0.0698783 0.809315 0.779508 -0.0375 [16,26] 1
FOXL1 MA0033.2 3051 1.60328e-77 OVER 0 OVER 0.925715 0.0677064 0.892118 0.854536 -0.1102 [-2,8] 1
FOXO6 MA0849.1 3051 8.95861e-73 OVER 0 OVER 0.892953 0.0741376 0.858344 0.824513 -0.0954 [13,23] 1
FOXK1 MA0852.2 3051 2.82502e-72 OVER 0 OVER 0.820987 0.0652885 0.790887 0.76394 -0.0325 [2,12] 1
What I would like it to print is:
NR3C1 MA0113.3 3051 6.79534e-208 OVER 0 OVER 0.759705 0.0819166 0.699595 0.686309 -0.0665 [13,23] 0.269309
NR3C2 MA0727.1 3051 7.09295e-206 OVER 0 OVER 0.754749 0.0821368 0.694756 0.681845 -0.067 [13,23] 0.0756584
FOXA1 MA0148.3 3051 5.53402e-91 OVER 0 OVER 0.860904 0.0640026 0.827295 0.792912 -0.0303 [-3,7] 1
FOXP1 MA0481.2 3051 3.11057e-79 OVER 0 OVER 0.843207 0.0698783 0.809315 0.779508 -0.0375 [16,26] 1
FOXL1 MA0033.2 3051 1.60328e-77 OVER 0 OVER 0.925715 0.0677064 0.892118 0.854536 -0.1102 [-2,8] 1
FOXO6 MA0849.1 3051 8.95861e-73 OVER 0 OVER 0.892953 0.0741376 0.858344 0.824513 -0.0954 [13,23] 1
FOXK1 MA0852.2 3051 2.82502e-72 OVER 0 OVER 0.820987 0.0652885 0.790887 0.76394 -0.0325 [2,12] 1
and what I am using is:
awk '!/[a-z]/' < file.txt
This somehow leaves out the following rows:
NR3C1 MA0113.3 3051 6.79534e-208 OVER 0 OVER 0.759705 0.0819166 0.699595 0.686309 -0.0665 [13,23] 0.269309
NR3C2 MA0727.1 3051 7.09295e-206 OVER 0 OVER 0.754749 0.0821368 0.694756 0.681845 -0.067 [13,23] 0.0756584
Could anyone please help me fix this?
TIA
Upvotes: 2
Views: 130
Reputation: 195109
grep '^[^a-z]\+\s' file
grep is enough for this.
A more portable version, since \s and \+ are GNU extensions: grep -E '^[^a-z]+[[:space:]]' file
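To see the grep approach in action, here is a minimal sketch. It assumes the -E (extended regex) form and forces the C locale so that [a-z] means exactly the ASCII lowercase letters. The three rows are shortened copies of rows from the question, and the variable names sample and kept are only for illustration.

```shell
# Shortened copies of three rows from the question (illustrative only).
sample='Ar MA0007.3 3051 2.62674e-220 OVER
NR3C1 MA0113.3 3051 6.79534e-208 OVER
Foxa2 MA0047.2 3051 3.00085e-87 OVER'

# Keep a line only if it starts with one or more non-lowercase characters
# followed by whitespace. The match can stop at the first space, so lowercase
# letters in later columns (the "e" in the exponent) do not matter.
kept=$(printf '%s\n' "$sample" | LC_ALL=C grep -E '^[^a-z]+[[:space:]]')

printf '%s\n' "$kept"
```

Note that this pattern expects the first column to start at the beginning of the line; with leading whitespace, a field-based awk test is the safer choice.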
Upvotes: 1
Reputation: 133538
The following awk may help you:
awk '$1!~/[a-z]/' Input_file
Explanation: this checks whether $1 (the first field) does NOT match the regex [a-z], i.e. contains no lowercase letters. No action is given, so awk performs the default action, which is to print the current line.
Upvotes: 1
Reputation: 74645
A pattern on its own is tested against the whole line ($0). You need to match against only the first column using $1 ~ /regex/:
awk '!($1 ~ /[a-z]/)' file
or equivalently:
awk '$1 !~ /[a-z]/' file
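The difference between the whole-line test and the first-field test can be sketched as follows. This is a minimal illustration, not the asker's exact setup: the rows are shortened copies of the question's data, the variable names are hypothetical, and LC_ALL=C is set on the assumption that [a-z] should mean only ASCII lowercase letters.

```shell
# Shortened copies of three rows from the question (illustrative only).
sample='Ar MA0007.3 3051 2.62674e-220 OVER
NR3C1 MA0113.3 3051 6.79534e-208 OVER
FOXA1 MA0148.3 3051 5.53402e-91 OVER'

# Whole-line test: the "e" in each exponent is a lowercase letter,
# so every one of these rows matches /[a-z]/ and is dropped.
whole=$(printf '%s\n' "$sample" | LC_ALL=C awk '!/[a-z]/')

# First-field test: only rows whose first column contains a lowercase
# letter are dropped; the default action prints the rest.
first=$(printf '%s\n' "$sample" | LC_ALL=C awk '$1 !~ /[a-z]/')

echo "whole-line test kept: [$whole]"
echo "first-field test kept:"
printf '%s\n' "$first"
```

On this sample the whole-line test keeps nothing, while the first-field test keeps the NR3C1 and FOXA1 rows.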
Upvotes: 1