wombatp
wombatp

Reputation: 833

Sorting values (for loop)

I'm working on a simple nucleotide counter in python 2.7 and at one of the methods I wrote I'd like to print the g,c,a,t values sorted by the number of times they show up in the gene sheet. what could be the better way to do that? Thanks in advance!

 def counting(self):
    gene = open("BRCA1.txt", "r")
    g = 0
    a = 0
    c = 0
    t = 0
    gene.readline()
    for line in gene:
        line = line.lower()
        for char in line:
            if char == "g":
                g += 1
            if char == "a":
                a += 1
            if char == "t":
                t += 1
            if char == "c":
                c += 1
    print "number of g\'s: %r" % str(g)
    print "number of c\'s: %r" % str(c)
    print "number of d\'s: %r" % str(a)
    print "number of t\'s: %r" % str(t)

Upvotes: 1

Views: 92

Answers (2)

Peter DeGlopper
Peter DeGlopper

Reputation: 37319

Use the collections.Counter class.

from collections import Counter
def counting(self):
    with open("BRCA1.txt", "r") as gene:
        nucleotide_counts = Counter(char for line in gene for char in line.lower().strip())
    for (nucleotide, count) in nucleotide_counts.most_common():
        print "number of %s's: %d" % (nucleotide, count)

If your lines might contain things besides nucleotides, this should work:

from collections import Counter
def counting(self):
    nucleotides = frozenset(('g', 'a', 't', 'c'))
    with open("BRCA1.txt", "r") as gene:
        nucleotide_counts = Counter(char for line in gene for char in line.lower() if char in nucleotides)
    for (nucleotide, count) in nucleotide_counts.most_common():
        print "number of %s's: %d" % (nucleotide, count)

That version doesn't need strip because newlines and other whitespace would be excluded by checking set membership.

Upvotes: 4

shx2
shx2

Reputation: 64318

Use the Counter class.

from collections import Counter
counter = Counter(char for line in gene for char in line.lower() )
for char, count in counter.most_common():
    print "number of %s\'s: %d" % (char, count)

Upvotes: 6

Related Questions