Hang Lin

Reputation: 51

Rosalind: REVP failing the given case

I wrote a solution to this challenge. It handles the given example correctly, but fails on the actual dataset.

Challenge: A DNA string is a reverse palindrome if it is equal to its reverse complement. For instance, GCATGC is a reverse palindrome because its reverse complement is GCATGC. For example:

5'...GCATGC...3'

3'...CGTACG...5'

Given:

A DNA string of length at most 1 kbp in FASTA format.

Return:

The position and length of every reverse palindrome in the string having length between 4 and 12. You may return these pairs in any order.

Sample Dataset

>Rosalind_24
TCAATGCATGCGGGTCTATATGCAT

Sample Output

4 6

5 4

6 6

7 4

17 4

18 4

20 6

21 4

My code works for the sample. However, it fails on the actual dataset.

Actual Dataset:

>Rosalind_7901
ATATAGTCGGCTGTCCAGGCAATCGCGAGATGGGGAACGACATCTTGGTACTTTACGGAT
GCCAAGACTTAATATCTGGCCCGGATATGACCGCGAGCACCCCCTACTCGTCTGTCGGTT
TCGGCCGGCATGACCTGTCCTCTTGATAATAGATATAAGTTGCCAACCGCACTATTTCAA
GATCAGATGCCCCAAGGCACAAGGCACAGAAGAATCAGGTACTGAGCAAACAGCGCCCAT
TTGTCAGCGCAACTCCGAGCGACAGGCACAAGTGGTAGTAACATCTGTAGTCTACGAGCG
CGGGACCGATGTAAAAAGCAACGAGAGACGGGGCCGTCGATAGAAAAGCAATGGAGTCCA
TATGGGCACGCTGAGCGTGCCTGTACTAATTTCTATGGGCTACTGGCACTAGGGGCTTAA
GCCCTCGGTTACCGCGCTTTATGAATATAGTTTTCGTGCCAGGAGTGTCTTGTTTCGAGG
AAGCGTGAGCTACACTTAGCACGTCCGGGCTTATTGGAAATTTGTTCAGTCTGTATGCTC
CGCAATATCATGTCGGCGCTCATTCAATGTTGCGTGTAATTTAGACCTCTACTACAGCTG
GGGTTGGAGCGGTCGGTAGTAAGACGTATGATTACGGTTTACATCCCGCCGGCGGACACG
GAACGTGATTTTCAGCATTGTCCCATCGTAGGGATTGGGGCCCTAGTAGGTGTGGGTAGC
ACGTTACATGAAGCTATCCAATGGCGTATATACTCCATCCCATCGGACTAGAAGATTTGA
GGGACCCAGTCATAACTGGTGCAAAATTACGTTACAAAAGCCGAGGATACAGTATA

Actual Output:

1 4 2 4 23 6 24 4 48 4 70 4 73 4 79 4 82 4 86 4 93 4 124 6 125 4 126 6 127 4 131 4 155 4 156 4 184 4 222 4 236 4 251 4 337 4 342 4 389 4 394 4 415 4 423 4 440 4 441 4 452 4 453 4 482 4 496 4 509 4 513 4 526 6 527 4 554 4 558 4 565 4 587 4 604 6 605 4 634 4 656 10 657 8 658 6 659 4 674 4 709 6 710 4 714 4 733 4 739 4 744 4 758 8 759 4 759 6 760 4 761 4 780 4 813 4 818 4 822 4 846 4

Code:

from string import maketrans
table=maketrans('ATCG','TAGC')

protein=open('rosalind_revp.txt','r').read()[14::].strip()

for i in range(len(protein)):
    for ii in range(2,7):
        if protein[i:i+ii]==protein[i+2*ii-1:i+ii-1:-1].translate(table):
            print str(i+1),str(2*ii)

When testing the sample, the fourth line of the code is instead:

protein=open('rosalind_revp.txt','r').read()[12::].strip()

I even checked a number of the position-length pairs by hand, and they all looked correct, yet the result was not accepted. Could anyone tell me where I went wrong?

Upvotes: 2

Views: 613

Answers (1)

JJjj007

Reputation: 1

Here is my solution (it is also in my GitHub repository, linked after the code). I hope it works for you.

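def reverse(l):
    # Despite the name, this returns the complement of l (A<->T, C<->G).
    t=""
    for i in range(len(l)):
        if(l[i]=='A'):
            t=t+'T'
        elif(l[i]=='T'):
            t=t+'A'
        elif(l[i]=='C'):
            t=t+'G'
        elif(l[i]=='G'):
            t=t+'C'
    return t
def rev(d):
    # Return the string reversed.
    return d[::-1]
k=input()  # FASTA header line (unused)
p=input()  # DNA sequence; this assumes it is entered as a single line
for i in range(len(p)):
    # Lengths 4 to 12; a reverse palindrome always has even length.
    for j in range(4,13,2):
        if (i+j<=len(p) and p[i:i+j]==rev(reverse(p[i:i+j]))):
            print(i+1, j)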

https://github.com/jssssv007/stackexcahnge
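A likely cause of the rejected output in the question: the actual dataset appears to be wrapped over several lines, and `.strip()` only removes whitespace at the ends, so the newline characters inside the sequence stay in the string and shift every position after the first line. For example, the TTAA reported at position 70 starts at position 69 once the lines are joined. Below is a minimal Python 3 sketch, assuming a single-record FASTA file; the function names `read_fasta_sequence` and `reverse_palindromes` are made up for this illustration, and the loop assumes `min_len` is even.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ATCG", "TAGC")


def read_fasta_sequence(text):
    """Return the sequence of a single-record FASTA string without line breaks."""
    lines = text.strip().splitlines()
    # Drop the header line (it starts with '>') and join the wrapped lines.
    return "".join(line.strip() for line in lines if not line.startswith(">"))


def reverse_palindromes(seq, min_len=4, max_len=12):
    """Return (1-based position, length) for each reverse palindrome in seq."""
    hits = []
    for i in range(len(seq)):
        # Only even lengths can equal their own reverse complement.
        for length in range(min_len, max_len + 1, 2):
            sub = seq[i:i + length]
            if len(sub) < length:
                break  # ran past the end of the sequence
            if sub == sub.translate(COMPLEMENT)[::-1]:
                hits.append((i + 1, length))
    return hits


# Sample dataset from the question.
sample = ">Rosalind_24\nTCAATGCATGCGGGTCTATATGCAT\n"
for pos, length in reverse_palindromes(read_fasta_sequence(sample)):
    print(pos, length)
```

To run it on the real file, pass the file contents instead of `sample`, for instance `read_fasta_sequence(open('rosalind_revp.txt').read())`. Skipping the header by its `>` prefix also avoids changing the slice offset between `[12::]` and `[14::]` for different record names.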

Upvotes: 0
