Oluwole
Oluwole

Reputation: 71

Counting DNA Nucleotides using perl 6

Good Afternoon, i am trying to count the number of times the letters A C T G occur in DNA sequence using perl6.i have tried other ways i am just trying to get it done in another way. Here are some of the code i came up with

use v6;

my $default-input = "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC";

sub MAIN(Str $input = $default-input) 
{
    say "{bag($input.comb)<A C G T>}";
}



use v6;

my $default-input = "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC";

sub MAIN($input = $default-input) 
{
    "{<A C G T>.map({ +$input.comb(/$_/) })}".say;

Sample Dataset
AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC

Upvotes: 6

Views: 288

Answers (2)

Matt Oates
Matt Oates

Reputation: 778

Another way is to use the BioInfo modules I'm working on which have a coercion to Bag already for you :)

use v6;
use BioInfo;

my @sequences = `
>seqid
AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC
`;

for @sequences -> $seq {
    say $seq.Bag;
}

In the above code you are importing a special bioinformatics Slang which understands that string literals between `` are FASTA literals. DNA/RNA/Amino-acids are automatically detected and you get a specific class for that. The object has its own .Bag that does what you want. Other than my own modules there is also the BioPerl6 project.

If you want to read from file then the following should work for you:

use v6;
use BioInfo::Parser::FASTA;
use BioInfo::IO::FileParser;

#Spawn an IO thread that parses the file and creates BioInfo::Seq objects on .get
my $seq_file = BioInfo::IO::FileParser.new(file => 'myseqs.fa', parser => BioInfo::Parser::FASTA);

#Print the residue counts per file
while my $seq = $seq_file.get() {
    say $seq.Bag;
}

Upvotes: 3

Brad Gilbert
Brad Gilbert

Reputation: 34120

multi sub MAIN ( \DNA ) {
  my Int %bag = A => 0, C => 0, G => 0, T => 0;

  # doesn't keep the whole thing in memory
  # like .comb.Bag would have
  for DNA.comb {
    %bag{$_}++
  }
  .say for %bag<A C G T> :p;
}

multi sub MAIN ( 'example' ){
  samewith "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC"
}

multi sub MAIN ( Bool :STDIN($)! ){
  samewith $*IN
}

multi sub MAIN ( Str :filename(:$file)! where .IO.f ){
  samewith $file.IO
}
~$ ./test.p6
Usage:
  ./test.p6 <DNA> 
  ./test.p6 example 
  ./test.p6 --STDIN 
  ./test.p6 --filename|--file=<Str>

~$ ./test.p6 example
A => 20
C => 12
G => 17
T => 21

~$ ./test.p6 --STDIN < test.in
A => 20
C => 12
G => 17
T => 21

~$ ./test.p6 --file=test.in
A => 20
C => 12
G => 17
T => 21

Upvotes: 7

Related Questions