Jordan

Reputation: 311

Using tr/// operator to count letters in a string

I would like to count the number of A's, C's and G's in a sequence or string. I have written the following code.

But when I print the values, only A's get printed. C's and G's are displayed as zero. In the code below I'm evaluating A's first, but if I switch the order by evaluating C's first, I get the values of C, but now A's and G's are printed out as zero.

Can anyone tell me what is wrong with my code? Thanks!

#! /usr/bin/perl

use strict;
use warnings;

open(IN, "200BP_junctions_fasta.faa") or die "Cannot open the file: $!\n";
while(<IN>) 
    next if $_ =~ /\>/;
    my $a = ($_ = tr/A//);
    my $c = ($_ = tr/C//);
    my $g = ($_ = tr/G//);
    print "A:$a, C:$c, G:$g\n";
}

The file looks like the following:

> A_Seq  
ATGCTAGCTAGCTAGCTAGTC  
> B_Seq  
ATGCGATCGATCGATCGATAG  

Upvotes: 3

Views: 4149

Answers (4)

brian d foy

Reputation: 132886

The answer is that you need the binding operator, =~, instead of the assignment operator, =, or that you don't need to bind at all, because tr/// works on the default variable $_ when nothing is bound.

Lately, I've been using printf for these sorts of things:

while( <DATA> ) {
    next if /\>/;
    printf "A:%s C:%s G:%s\n", tr/A//, tr/C//, tr/G//;
    }

I've often wished that tr/// could interpolate so I could write this, which doesn't work:

while( my $line = <DATA> ) {
    next if $line =~ /\>/;
    print "Line is $line\n";
    printf "A:%s C:%s G:%s\n", map { $line =~ tr/$_// } qw(A C G);
    }

Notice that I'd have the extra annoyance of a colliding $_ if I had used the default variable in the while. I know I could use a string eval, but that's not only more of a hassle, it's also lame:

while( my $line = <DATA> ) {
    next if $line =~ /\>/;
    print "Line is $line\n";
    printf "A:%s C:%s G:%s\n", map { eval "\$line =~ tr/$_//" } qw(A C G);
    }

I shouldn't have to know the implementation details, though, so I could move that to a subroutine until I can figure out how to get rid of the eval, although extra subroutine calls may slow down big data munging:

while( my $line = <DATA> ) {
    next if $line =~ /\>/;
    print "Line is $line\n";
    printf "A:%s C:%s G:%s\n", map { count_bases( $line, $_ ) } qw(A C G);
    }

sub count_bases { eval "\$_[0] =~ tr/$_[1]//" }
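
One way to drop the eval is to count regex matches instead of using tr///. This is a different technique from the one above, not a way to make tr/// interpolate, and it is likely slower on large inputs. A minimal sketch, where count_bases_regex is a hypothetical name and the sample line is the second sequence from the question:

```perl
use strict;
use warnings;

# Hypothetical eval-free variant of count_bases: counts regex matches
# rather than relying on tr///.
sub count_bases_regex {
    my ($sequence, $base) = @_;
    # \Q...\E makes the base match literally; assigning the matches to an
    # empty list and reading that in scalar context yields the match count.
    my $count = () = $sequence =~ /\Q$base\E/g;
    return $count;
}

my $line   = "ATGCGATCGATCGATCGATAG";
my @counts = map { count_bases_regex($line, $_) } qw(A C G);
printf "A:%s C:%s G:%s\n", @counts;
```

Because the base is passed in as data, no code is built from strings at run time.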

There's probably some clever way to XOR strings if you don't like tr///, but I've never pursued it long enough to figure it out (not that it would be better than what you are already doing).

Upvotes: 1

perreal

Reputation: 98088

open(IN, "input") or die "Cannot open the file: $!\n";
while(<IN>) {
  next if $_ =~ /\>/;
  my $a = @{[m/(A)/g]};
  my $c = @{[m/(C)/g]};
  my $g = @{[m/(G)/g]};
  print "A:$a, C:$c, G:$g\n";
}

Upvotes: 0

Axeman

Reputation: 29854

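Because '5' doesn't have any 'C's or 'G's in it. You're assigning the return value of tr/A// (the number of A's, which is 5 for the first sequence) to $_, so the later counts run against that number rather than the sequence. If you bind the operation to $_ with $_ =~ tr/A// instead, you'll get the result you want.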
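
A minimal sketch of that effect, using the first sequence from the question as sample data. The variable names are made up for illustration:

```perl
use strict;
use warnings;

# First sequence from the question's input file.
$_ = "ATGCTAGCTAGCTAGCTAGTC";

# Assignment: tr/A// counts the A's in $_, then that count replaces $_.
$_ = tr/A//;
my $after_assignment = $_;          # $_ now holds the count, not the sequence

# Counting C's now runs against the count, which contains no letters.
my $c_after_assignment = tr/C//;

# Binding: the sequence is only read, never overwritten.
my $seq     = "ATGCTAGCTAGCTAGCTAGTC";
my $a_count = ($seq =~ tr/A//);
my $c_count = ($seq =~ tr/C//);

print "after assignment: \$_ = $after_assignment, C = $c_after_assignment\n";
print "with binding: A = $a_count, C = $c_count\n";
```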

But you really don't need to bind to the default variable at all. Binding exists so that you can apply a regex or transliteration to some other variable. You'd be better off writing:

my $a = tr/A//;
my $c = tr/C//;
my $g = tr/G//;

But you can do it like this, too:

$_{$_}++ foreach m/[ACG]/g;
say "A:$_{A}, C:$_{C}, G:$_{G}";

Upvotes: 1

nshew13

Reputation: 3097

Change your $_ = tr/ to $_ =~ tr/. Also, you're missing an open brace for your while.
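
With both changes applied, the loop might look like the sketch below. To keep it self-contained it reads the question's two sample records from an in-memory filehandle rather than 200BP_junctions_fasta.faa, and the names $fasta, $in and @report are only for illustration:

```perl
use strict;
use warnings;

# Sample records from the question, held in a string so no file is needed.
my $fasta = "> A_Seq\nATGCTAGCTAGCTAGCTAGTC\n> B_Seq\nATGCGATCGATCGATCGATAG\n";
open(my $in, '<', \$fasta) or die "Cannot open the in-memory data: $!\n";

my @report;    # collected only so the results can be inspected afterwards
while (<$in>) {                         # opening brace restored
    next if /^>/;                       # skip FASTA header lines
    my $a_count = ($_ =~ tr/A//);       # =~ binds tr/// to $_; returns the count
    my $c_count = ($_ =~ tr/C//);
    my $g_count = ($_ =~ tr/G//);
    push @report, "A:$a_count, C:$c_count, G:$g_count";
}
close($in);

print "$_\n" for @report;
```

The counters are named $a_count and so on rather than $a to avoid shadowing the special $a variable that sort uses.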

Upvotes: 6
