Benjamin Brooks
Benjamin Brooks

Reputation: 215

decipher( s ) python

Decipher( S ) will be given a string of English text shifted by some amount. Then, decipher should return, to the best of its ability, the original English string, which will be some rotation (possibly 0) of the input S. This means I have to try each possible decoding and estimate how English they are.

My approach is to use letter frequencies:

def letProb( c ):
    """ if c is an alphabetic character,
    we return its monogram probability (for english),
    otherwise we return 1.0 We ignore capitalization.
    Adapted from
    http://www.math.cornell.edu/~mec/2003-2004/cryptography/subs/frequencies.html
    """
    if c == 'e' or c == 'E': return 0.1202
    if c == 't' or c == 'T': return 0.0910
    if c == 'a' or c == 'A': return 0.0812
    if c == 'o' or c == 'O': return 0.0768
    if c == 'i' or c == 'I': return 0.0731
    if c == 'n' or c == 'N': return 0.0695
    if c == 's' or c == 'S': return 0.0628
    if c == 'r' or c == 'R': return 0.0602
    if c == 'h' or c == 'H': return 0.0592
    if c == 'd' or c == 'D': return 0.0432
    if c == 'l' or c == 'L': return 0.0398
    if c == 'u' or c == 'U': return 0.0288
    if c == 'c' or c == 'C': return 0.0271
    if c == 'm' or c == 'M': return 0.0261
    if c == 'f' or c == 'F': return 0.0230
    if c == 'y' or c == 'Y': return 0.0211
    if c == 'w' or c == 'W': return 0.0209
    if c == 'g' or c == 'G': return 0.0203
    if c == 'p' or c == 'P': return 0.0182
    if c == 'b' or c == 'B': return 0.0149
    if c == 'v' or c == 'V': return 0.0111
    if c == 'k' or c == 'K': return 0.0069
    if c == 'x' or c == 'X': return 0.0017
    if c == 'q' or c == 'Q': return 0.0011
    if c == 'j' or c == 'J': return 0.0010
    if c == 'z' or c == 'Z': return 0.0007
    return 1.0

Also I use this formula:

def list_to_str( L ):
    """ L must be a list of characters; then,
    this returns a single string from them
    """
    if len(L) == 0: return ''
    return L[0] + list_to_str( L[1:] )

And this:

def rot(c, n):
    """ rot rotates c, a single character forward by n spots in the 
        alphabet
        input: a single character
        input: a non-negative integer n between 0 and 25
        output: c forward by n spots in the alphabet
    """
    if 'a' <= c <= 'z':
        neword = ord(c) +n
        if neword > ord('z'):
                neword = neword - 26
    elif 'A' <= c <= 'Z':
        neword = ord(c) + n 
            if neword > ord('Z'):
                neword = neword - 26
    else:                       
        neword = ord(c) 
    return chr(neword)

And finally this:

def decipher( S ):
    """
    """
    L = [[rot(c, n) for c in S] for n in range(26)]
    LoL = [[sum(letProb(c)) for letter in encoding] for encoding in L ]
    L = max(LoL)
    return list_to_str(L)

The first two formulas are good, but in the final formula there is something wrong, for sure in the sentence:

LoL = [[sum(letProb(c)) for letter in encoding] for encoding in L ]

TypeError: 'float' object is not iterable

Upvotes: 3

Views: 8304

Answers (2)

Martin Evans
Martin Evans

Reputation: 46759

You might find the following useful. You could use some kind of lookup for the letter probabilities. If a dictionary is used, you can use get to provide your default probability of 1.0 for non A to Z characters.

Secondly, Python's string module has a make_trans function that could be used with translate to perform your rotations.

import string

letter_probs = {
    'e' : 0.1202, 't' : 0.0910, 'a' : 0.0812, 'o' : 0.0768,
    'i' : 0.0731, 'n' : 0.0695, 's' : 0.0628, 'r' : 0.0602,
    'h' : 0.0592, 'd' : 0.0432, 'l' : 0.0398, 'u' : 0.0288,
    'c' : 0.0271, 'm' : 0.0261, 'f' : 0.0230, 'y' : 0.0211,
    'w' : 0.0209, 'g' : 0.0203, 'p' : 0.0182, 'b' : 0.0149,
    'v' : 0.0111, 'k' : 0.0069, 'x' : 0.0017, 'q' : 0.0011,
    'j' : 0.0010, 'z' : 0.0007}

def rotate(text, rotate_by):
    s_from = string.ascii_lowercase + string.ascii_uppercase
    s_to = string.ascii_lowercase[rotate_by:] + string.ascii_lowercase[:rotate_by] + \
           string.ascii_uppercase[rotate_by:] + string.ascii_uppercase[:rotate_by]
    cypher_table = string.maketrans(s_from, s_to)
    return text.translate(cypher_table)

def sum_probs(text):
    global letter_probs
    return sum(letter_probs.get(c.lower(), 1.0) for c in text)

def decipher(text):
    probabilities = []
    for r in range(26):
        rotated = rotate(text, r)
        probabilities.append((sum_probs(rotated), rotated, r))
    return sorted(probabilities)

for probability, decoded_text, rotate_value in decipher("Mjqqt Btwqi"):
    print '{:2.2f}  {:2}  "{}"'.format(probability, rotate_value, decoded_text)

This would display all possibilities in probability order:

1.17  19  "Fcjjm Umpjb"
1.20  16  "Czggj Rjmgy"
1.22   9  "Vszzc Kcfzr"
1.26  20  "Gdkkn Vnqkc"
1.28  13  "Zwddg Ogjdv"
1.29   4  "Qnuux Fxaum"
1.29  15  "Byffi Qilfx"
1.30   8  "Uryyb Jbeyq"
1.31   6  "Spwwz Hzcwo"
1.32   5  "Rovvy Gybvn"
1.32   0  "Mjqqt Btwqi"
1.33  12  "Yvccf Nficu"
1.34   1  "Nkrru Cuxrj"
1.37  23  "Jgnnq Yqtnf"
1.39   7  "Tqxxa Iadxp"
1.40  22  "Ifmmp Xpsme"
1.40   2  "Olssv Dvysk"
1.44  25  "Lipps Asvph"
1.45  17  "Dahhk Sknhz"
1.47  24  "Khoor Zruog"
1.49  11  "Xubbe Mehbt"
1.52   3  "Pmttw Ewztl"
1.56  10  "Wtaad Ldgas"
1.58  21  "Hello World"
1.61  14  "Axeeh Phkew"
1.68  18  "Ebiil Tloia"

As you can see, the answer is near the bottom.

Upvotes: 0

naimetti
naimetti

Reputation: 175

You are trying to sum(letProb(c)) where letProb returns a Float, and sum needs an iterable, a list o a generator, lets say. Try something like this.

LoL = [[sum(letProb(c) for letter in encoding)] for encoding in L ]

I also think that c maybe should be letter, but I don't, if you post complete code, rot function included, maybe I can help you better.

Upvotes: 1

Related Questions