nit

Reputation: 689

Count letters in text

I have a text file as follows:

>s1

MPPRRSIVEVKVLDVQKRRVPNKHYVYIIRVTWSSGATEAIYRRYSKFFDLQMQMLDKFP MEGGQKDPKQRIIPFLPGKILFRRSHIRDVAVKRLIPIDEYCKALIQLPPYISQCDEVLQ FFETRPEDLNPPKEEHIGKKKSGNDPTSVDPMVLEQYVVVADYQKQESSEISLSVGQVVD

>s2

MAEVRKFTKRLSKPGTAAELRQSVSEAVRGSVVLEKAKLVEPLDYENVITQRKTQIYSDP LRDLLMFPMEDISISVIGRQRRTVQSTVPEDAEKRAQSLFVKECIKTYSTDWHVVNYKYE DFSGDFRMLPCKSLRPEKIPNHVFEIDEDCEKDEDSSSLCSQKGGVIKQGWLHKANVNST

. . .

I want to count the letter 'P' in each sequence. The output should be:

> s1:10

> s2:20

To achieve this, I wrote the following Python script:

infile = open("file1.txt", 'r')
out = open("file2.csv", 'w')

for line in infile:
    line = line.strip("\n")
    if line.startswith('>'):
        name = line
    else:
        pattern = line.count('P')
        print '%s:%s' % (name, pattern)
        out.write('%s:%s\n' % (name, pattern))

It reads the file line by line and gives one result per line, as follows:

> s1:2

> s1:3

> s1:5

> s2:10

> s2:10

But I expect the output to be as follows:

> s1:10

> s2:20 . . .

Can anybody help me with how to do this?

Thanks in advance, Ni

Upvotes: 1

Views: 309

Answers (2)

SetSlapShot

Reputation: 70

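infile = open("file1.txt", 'r')
out = open("file2.csv", 'w')

name = None
total = 0
for line in infile:
    line = line.strip("\n")
    if line.startswith('>'):
        # a new header: write the finished sequence, then start a fresh total
        if name is not None:
            out.write('%s:%s\n' % (name, total))
        name = line
        total = 0
    else:
        # add this line's count to the running total for the current sequence
        total += line.count('P')

# this goes outside the for loop: write the last sequence
if name is not None:
    out.write('%s:%s\n' % (name, total))
out.close()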
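
The running-total idea can also be packaged as a small function that keeps one total per header, which makes it easy to check without touching any files. This is only a sketch: the function name `count_per_sequence`, its `letter` parameter, and the sample lines are made up for illustration, and it assumes every sequence starts with a header line beginning with '>'.

```python
from collections import OrderedDict

def count_per_sequence(lines, letter="P"):
    """Return an ordered mapping of header -> total count of `letter`."""
    totals = OrderedDict()   # keeps headers in the order they appear
    name = None              # header of the sequence currently being read
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # a header line starts a new sequence with a total of zero
            name = line
            totals.setdefault(name, 0)
        elif name is not None:
            # a sequence line adds to the total of the current header
            totals[name] += line.count(letter)
    return totals

# hypothetical sample input, shortened from the question's format
sample = [">s1", "MPPRRS", "PKP", ">s2", "MAEVP"]
for name, total in count_per_sequence(sample).items():
    print("%s:%s" % (name, total))
# prints:
# >s1:4
# >s2:1
```

Because an open file is also an iterable of lines, `count_per_sequence(open("file1.txt"))` should work the same way on the real input.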

Upvotes: 1

codemaker

Reputation: 1742

Don't parse the file line by line. Just iterate over the entire file character by character, counting occurrences of the character you are looking for.
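
A minimal sketch of what character-by-character counting could look like, written for Python 3. The function name `count_letter_by_char` is hypothetical, and the sketch assumes that a '>' at the start of a line marks a header, so the count has to restart there to get one total per sequence.

```python
import io

def count_letter_by_char(stream, letter="P"):
    """Return [(header, count), ...], reading one character at a time."""
    results = []          # finished (header, count) pairs
    name = None           # header of the sequence being counted
    header_chars = []     # characters of the header line being read
    in_header = False     # True while inside a '>' header line
    at_line_start = True  # True at file start and right after a newline
    count = 0

    while True:
        ch = stream.read(1)
        if ch == "":      # empty string means end of input
            break
        if in_header:
            if ch == "\n":
                # header line is complete
                name = "".join(header_chars).strip()
                in_header = False
            else:
                header_chars.append(ch)
        elif at_line_start and ch == ">":
            # a new header begins: store the previous sequence's result
            if name is not None:
                results.append((name, count))
            header_chars = [ch]
            in_header = True
            count = 0
        elif ch == letter and name is not None:
            count += 1
        at_line_start = (ch == "\n")

    # handle a header that ends at end of input, then store the last result
    if in_header:
        name = "".join(header_chars).strip()
    if name is not None:
        results.append((name, count))
    return results

# hypothetical in-memory input standing in for the real file
demo = io.StringIO(">s1\nPPA\nAP\n>s2\nPXP PP\n")
print(count_letter_by_char(demo))
```

Reading with `read(1)` keeps the logic explicit but is slow on large files; the line-based `str.count` approach does the same per-character scan in optimized code.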

Upvotes: 1
