Nelson

Reputation: 37

Add "?" to a string when a condition is not met

I was given this dictionary, which maps each 3-letter sequence (codon) to its single-letter code:

genetic_code = {
    'TTT': 'F', 'TTC': 'F', 'TTA': 'L', 'TTG': 'L',
    'TCT': 'S', 'TCC': 'S', 'TCA': 'S', 'TCG': 'S',
    'TAT': 'Y', 'TAC': 'Y', 'TAA': '*', 'TAG': '*',
    'TGT': 'C', 'TGC': 'C', 'TGA': '*', 'TGG': 'W',

    'CTT': 'L', 'CTC': 'L', 'CTA': 'L', 'CTG': 'L',
    'CCT': 'P', 'CCC': 'P', 'CCA': 'P', 'CCG': 'P',
    'CAT': 'H', 'CAC': 'H', 'CAA': 'Q', 'CAG': 'Q',
    'CGT': 'R', 'CGC': 'R', 'CGA': 'R', 'CGG': 'R',

    'ATT': 'I', 'ATC': 'I', 'ATA': 'I', 'ATG': 'M', 
    'ACT': 'T', 'ACC': 'T', 'ACA': 'T', 'ACG': 'T',
    'AAT': 'N', 'AAC': 'N', 'AAA': 'K', 'AAG': 'K',
    'AGT': 'S', 'AGC': 'S', 'AGA': 'R', 'AGG': 'R',

    'GTT': 'V', 'GTC': 'V', 'GTA': 'V', 'GTG': 'V',
    'GCT': 'A', 'GCC': 'A', 'GCA': 'A', 'GCG': 'A',
    'GAT': 'D', 'GAC': 'D', 'GAA': 'E', 'GAG': 'E',
    'GGT': 'G', 'GGC': 'G', 'GGA': 'G', 'GGG': 'G',
}

I am supposed to define a function called translate that takes a string of letters and translates it, three letters at a time, into the corresponding single-letter codes. If the length of the input is not a multiple of 3, I must add "?" for the incomplete group at the end.

Likewise, if the input contains a 3-letter group that is not in the dictionary, I must add "?" for that group.

I tried this:

def translate (c,d):
    AA = ""
    for i in range(0, len(c), 3):
        codon = c[i:i + 3]
        if len(codon) == 0 and codon in genetic_code:
            AA += genetic_code[codon]
        else:
            AA += "?"
    return AA

But I get an AssertionError when I run these checks:

assert translate('GATTACATG', genetic_code) == 'DYM'

assert translate('GATXAAATGA', genetic_code) == 'D?M?'

Why?

Thanks,

Upvotes: 2

Views: 39

Answers (1)

orlp

Reputation: 117771

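You were almost there. The check len(codon) == 0 is never true inside the loop, so every codon falls into the else branch and becomes "?". Compare against 3 instead, and use the parameter d rather than the global genetic_code: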

def translate(c, d):
    AA = ""
    for i in range(0, len(c), 3):
        codon = c[i:i + 3]
        if len(codon) == 3 and codon in d:  # Note difference.
            AA += d[codon]                  # Note difference.
        else:
            AA += "?"
    return AA

However, since genetic_code only contains codons of length 3, the length check is unnecessary:

def translate(c, d):
    AA = ""
    for i in range(0, len(c), 3):
        codon = c[i:i + 3]
        if codon in d:
            AA += d[codon]
        else:
            AA += "?"
    return AA
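
As a quick illustration of why the length check can go (a minimal sketch; the helper name chunks and the table small_table are made up for this example), slicing past the end of a string just returns a shorter string, and that shorter string cannot match any 3-letter key:

```python
def chunks(c):
    # Same slicing as in translate: consecutive pieces of up to 3 characters.
    return [c[i:i + 3] for i in range(0, len(c), 3)]


# Input of length 10, so the last piece holds a single character.
pieces = chunks('GATXAAATGA')
print(pieces)

# Hypothetical, truncated table used only for this illustration.
small_table = {'GAT': 'D', 'ATG': 'M'}

# Membership test per piece: unknown codons and the short tail are both absent.
print([p in small_table for p in pieces])
```

Both the unknown codon and the leftover single letter fail the membership test, so they end up in the else branch without any explicit length comparison.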

But this can be simplified using dict.get with a default value of "?":

def translate(c, d):
    AA = ""
    for i in range(0, len(c), 3):
        codon = c[i:i + 3]
        AA += d.get(codon, "?")
    return AA

And this can be simplified further and made more performant using str.join:

def codons(c):
    return (c[i:i + 3] for i in range(0, len(c), 3))


def translate(c, d):
    return "".join(d.get(codon, "?") for codon in codons(c))
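
A quick check of this last version against the two assertions from the question. This is only a sketch: codon_table is a hypothetical, truncated stand-in for the full genetic_code dictionary, containing just the codons these inputs need.

```python
def codons(c):
    # Yield consecutive pieces of up to 3 characters.
    return (c[i:i + 3] for i in range(0, len(c), 3))


def translate(c, d):
    # Look up each piece; anything missing from d becomes "?".
    return "".join(d.get(codon, "?") for codon in codons(c))


# Assumption: truncated table for illustration, not the full genetic code.
codon_table = {'GAT': 'D', 'TAC': 'Y', 'ATG': 'M'}

assert translate('GATTACATG', codon_table) == 'DYM'
assert translate('GATXAAATGA', codon_table) == 'D?M?'
assert translate('', codon_table) == ''
```

Passing the full genetic_code dictionary as d should behave the same way for these inputs, since the lookup only depends on the codons that actually occur.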

Upvotes: 3
