Vasilis

Reputation: 173

A regular expression to match exactly one character

I have a big file with lines like the following (tab separated):

220995146   A   G   1/1:8:0:0:8:301:-5,-2.40824,0   pass
221020849   G   GGAGAGGCA   1/1:8:0:0:8:229:-5,-2.40824,0   pass

I'm trying to write a conditional statement that keeps only the lines whose second and third columns each contain exactly one character. For example, the second line above should not pass. The regex that I'm using is:

if (($ref =~ m/\w{1}/) && ($allele =~ m/\w{1}/)) {
    print "$mline\n";
}

Unfortunately, this doesn't work. Any suggestions? Thank you very much in advance.

Upvotes: 1

Views: 70

Answers (3)

Toto

Reputation: 91428

A regex is not needed here; you can use the length function:

if (length($ref) == 1 && length($allele) == 1) {
    print $mline,"\n";
}
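
For context, here is a minimal sketch of how the length check might sit inside a complete filter. It assumes the columns are obtained by splitting each line on tabs; the keep_line helper and the hard-coded sample lines are hypothetical, introduced only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the 2nd and 3rd tab-separated
# fields are each exactly one character long.
sub keep_line {
    my ($mline) = @_;
    # Skip the first field; keep the second and third.
    my (undef, $ref, $allele) = split /\t/, $mline;
    return 0 unless defined($ref) && defined($allele);
    return (length($ref) == 1 && length($allele) == 1) ? 1 : 0;
}

# Sample lines from the question, with explicit tab separators.
my @lines = (
    "220995146\tA\tG\t1/1:8:0:0:8:301:-5,-2.40824,0\tpass",
    "221020849\tG\tGGAGAGGCA\t1/1:8:0:0:8:229:-5,-2.40824,0\tpass",
);

for my $mline (@lines) {
    print "$mline\n" if keep_line($mline);
}
```

When reading from a file instead, chomp each line before splitting so that a trailing newline does not end up in the last field.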

Upvotes: 2

Sjoerd

Reputation: 75619

I assume that $allele contains the third column. Your test, $allele =~ m/\w{1}/, only checks whether the string contains a word character somewhere, so it also succeeds on longer strings. Instead, you want to match the whole string. You can do this with the start anchor ^ and the end anchor $:

$allele =~ m/^\w{1}$/

Or just

$allele =~ /^\w$/
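
To see the difference, the following small sketch runs the unanchored and anchored patterns against the two allele values from the question. The is_single helper is a hypothetical name used only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true only when the whole string is one word character.
sub is_single {
    my ($value) = @_;
    return $value =~ /^\w$/ ? 1 : 0;
}

for my $allele ('G', 'GGAGAGGCA') {
    # Unanchored: succeeds as soon as any word character is found.
    my $unanchored = $allele =~ /\w{1}/ ? 'match' : 'no match';
    # Anchored: the single word character must span the whole string.
    my $anchored = is_single($allele) ? 'match' : 'no match';
    print "$allele: unanchored=$unanchored, anchored=$anchored\n";
}
```

Note that $ also matches just before a trailing newline, so "G\n" would still pass. If that matters, chomp the value first or use \z in place of $.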

Upvotes: 2

anubhava

Reputation: 785266

If you're looking for a pure regex solution, then use:

$re = qr/^[^\t]+\t+\w\t+\w\t+.*$/;

This matches lines whose 2nd and 3rd columns each consist of a single word character: each \w must be followed directly by one or more tabs, so a longer field fails to match. Test a line with $mline =~ $re.
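
As a rough sketch of applying a whole-line pattern like this, the example below compiles it with qr// and filters the two sample lines from the question. The line_matches helper and the sample array are hypothetical; in practice the lines would come from the input file.

```perl
use strict;
use warnings;

# Whole-line pattern, compiled once with qr//.
my $re = qr/^[^\t]+\t+\w\t+\w\t+.*$/;

# Hypothetical helper: true when the line matches the pattern.
sub line_matches {
    my ($mline) = @_;
    return $mline =~ $re ? 1 : 0;
}

# Sample lines from the question, with explicit tab separators.
my @lines = (
    "220995146\tA\tG\t1/1:8:0:0:8:301:-5,-2.40824,0\tpass",
    "221020849\tG\tGGAGAGGCA\t1/1:8:0:0:8:229:-5,-2.40824,0\tpass",
);

print "$_\n" for grep { line_matches($_) } @lines;
```

Because the pattern uses \t+, consecutive tabs are treated as one separator, so a line with an empty column would be read as if that column were absent. Replacing \t+ with \t makes the column positions strict.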

Upvotes: 1
