Reputation: 15
I am trying to fix some code that reads a ciphered input file, runs a frequency analysis of its letters, and then decrypts the ciphered text. It mostly works, but the ciphered text is not fully decrypted. Any suggestions on how to fix it?
ETAOIN = 'ETAOINSHRDLCUMWFGYPBVKJXQZ'
LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

cipher = open('cipher.txt', 'r').read()

def getLetterCount(message):
    alphabet = [chr(a + 65) for a in range(26)]
    letter_count = dict((x, 0) for x in alphabet)
    for letter in message.upper():
        if letter in LETTERS:
            letter_count[letter] += 1
    return letter_count

def getFreq(freqPair):
    return freqPair[0]

def getFrequencyOrder(message):
    letterToFreq = getLetterCount(message)
    freqToLetter = {}
    for letter in LETTERS:
        if letterToFreq[letter] not in freqToLetter:
            freqToLetter[letterToFreq[letter]] = [letter]
        else:
            freqToLetter[letterToFreq[letter]].append(letter)
    for freq in freqToLetter:
        freqToLetter[freq].sort(key=ETAOIN.find)
        freqToLetter[freq] = ''.join(freqToLetter[freq])
    print(freqToLetter)
    freq_pairs = list(freqToLetter.items())
    freq_pairs.sort(key=getFreq, reverse=True)
    freqOrder = []
    for freqPair in freq_pairs:
        freqOrder.append(freqPair[1])
    return ''.join(freqOrder)

mostFrequentLetters = getFrequencyOrder(cipher)
plaintext = ""
for letter in cipher:
    i = mostFrequentLetters.find(letter)
    plaintext += ETAOIN[i]
print(plaintext)
Here is the ciphered text. Save it in a text file called cipher.txt to run the code. I am just looking for suggestions on how I could improve this code.
GBTBVAGBFBYVGHQRNZNAARRQFGBERGVERNFZHPUSEBZUVFPUNZORENFSEBZFBPVRGLVNZABGFBYVGNELJUVYFGVERNQNAQJEVGRGUBHTUABOBQLVFJVGUZROHGVSNZNAJBHYQORNYBARYRGUVZYBBXNGGURFGNEFGURENLFGUNGPBZRSEBZGUBFRURNIRAYLJBEYQFJVYYFRCNENGRORGJRRAUVZNAQJUNGURGBHPURFBARZVTUGGUVAXGURNGZBFCURERJNFZNQRGENAFCNERAGJVGUGUVFQRFVTAGBTVIRZNAVAGURURNIRAYLOBQVRFGURCRECRGHNYCERFRAPRBSGURFHOYVZRFRRAVAGURFGERRGFBSPVGVRFUBJTERNGGURLNERVSGURFGNEFFUBHYQNCCRNEBARAVTUGVANGUBHFNAQLRNEFUBJJBHYQZRAORYVRIRNAQNQBERNAQCERFREIRSBEZNALTRARENGVBAFGURERZRZOENAPRBSGURPVGLBSTBQJUVPUUNQORRAFUBJAOHGRIRELAVTUGPBZRBHGGURFRRAIBLFBSORNHGLNAQYVTUGGURHAVIREFRJVGUGURVENQZBAVFUVATFZVYRGURFGNEFNJNXRANPREGNVAERIRERAPRORPNHFRGUBHTUNYJNLFCERFRAGGURLNERVANPPRFFVOYROHGNYYANGHENYBOWRPGFZNXRNXVAQERQVZCERFFVBAJURAGURZVAQVFBCRAGBGURVEVASYHRAPRANGHERARIREJRNEFNZRNANCCRNENAPRARVGUREQBRFGURJVFRFGZNARKGBEGUREFRPERGNAQYBFRUVFPHEVBFVGLOLSVAQVATBHGNYYURECRESRPGVBAANGHERARIREORPNZRNGBLGBNJVFRFCVEVGGURSYBJREFGURNAVZNYFGURZBHAGNVAFERSYRPGRQGURJVFQBZBSUVFORFGUBHENFZHPUNFGURLUNQQRYVTUGRQGURFVZCYVPVGLBSUVFPUVYQUBBQJURAJRFCRNXBSANGHERVAGUVFZNAAREJRUNIRNQVFGVAPGOHGZBFGCBRGVPNYFRAFRVAGURZVAQJRZRNAGURVAGRTEVGLBSVZCERFFVBAZNQROLZNAVSBYQANGHENYBOWRPGFVGVFGUVFJUVPUQVFGVATHVFURFGURFGVPXBSGVZOREBSGURJBBQPHGGRESEBZGURGERRBSGURCBRGGURPUNEZVATYNAQFPNCRJUVPUVFNJGUVFZBEAVATVFVAQHOVGNOYLZNQRHCBSFBZRGJRAGLBEGUVEGLSNEZFZVYYREBJAFGUVFSVRYQYBPXRGUNGNAQZNAAVATGURJBBQYNAQORLBAQOHGABARBSGURZBJAFGURYNAQFPNCRGURERVFNCEBCREGLVAGURUBEVMBAJUVPUABZNAUNFOHGURJUBFRRLRPNAVAGRTENGRNYYGURCNEGFGUNGVFGURCBRGGUVFVFGURORFGCNEGBSGURFRZRAFSNEZFLRGGBGUVFGURVEJNEENAGLQRRQFTVIRABGVGYRGBFCRNXGEHYLSRJNQHYGCREFBAFPNAFRRANGHER
Here is what the output looks like with the original sample: {329: 'AV', 299: 'B', 69: 'C', 3: 'DM', 252: 'E', 297: 'F', 432: 'G', 118: 'H', 36: 'I', 84: 'J', 2: 'K', 81: 'L', 354: 'N', 65: 'O', 107: 'P', 165: 'Q', 568: 'R', 99: 'S', 77: 'T', 281: 'U', 6: 'W', 20: 'X', 154: 'Y', 124: 'Z'} TNYNIOTNSNLITUDEACAOOEEDSTNRETIREASCUMHWRNCHISMHACBERASWRNCSNMIETGIACONTSNLITARGFHILSTIREADAODFRITETHNUYHONBNDGISFITHCEBUTIWACAOFNULDBEALNOELETHICLNNKATTHESTARSTHERAGSTHATMNCEWRNCTHNSEHEAVEOLGFNRLDSFILLSEPARATEBETFEEOHICAODFHATHETNUMHESNOECIYHTTHIOKTHEAT
It should be something like this. A long, uppercase, run-on sentence that fully translates the ciphered text. This is just an example of what it should look like, and not what it should actually translate to : {113: 'A', 104: 'B', 31: 'LC', 0: 'D', 90: 'E', 114: 'UF', 166: 'G', 39: 'H', 13: 'I', 40: 'J', 1: 'MK', 122: 'N', 29: 'O', 41: 'P', 52: 'Q', 224: 'R', 32: 'S', 22: 'T', 120: 'V', 2: 'W', 9: 'X', 50: 'Y', 55: 'Z'} CAPDOUBTFULLITSTOODASTWOSPENTSWIMMERSTHATDOECLINGTOGETHERANDCHOAKETHEIRARTTHEMERCILESSEMACDONWALDWORTHIETOBEAREBELLFORTOTHATTHEMULTIPLYINGVILLANIESOFNATUREDO...
Upvotes: 1
Views: 175
Reputation: 177554
The key point is that this is a Caesar cipher, so the OP's algorithm is wrong. A Caesar cipher only rotates the alphabet, so a frequency analysis of more than the most common letter isn't needed.
The most common letter in the solution text is likely E. The most common letter in the cipher text is R, so if E=R then the Caesar cipher is as follows, where the alphabet is rotated to align R with E:
NOPQRSTUV... -> ABCDEFGHI...
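As a small illustration of that alignment, here is a sketch of the rotation arithmetic. The helper name rotation_from_guess is hypothetical, and the guess that R stands for E is an assumption that only tends to hold for reasonably long English text:

```python
import string

LETTERS = string.ascii_uppercase

def rotation_from_guess(cipher_letter, plain_letter="E"):
    """Return how far the alphabet was rotated, assuming that
    cipher_letter is the encrypted form of plain_letter."""
    return (ord(cipher_letter) - ord(plain_letter)) % 26

# Assumption: R, the most common cipher letter, stands for E.
rotation = rotation_from_guess("R")

# The rotated alphabet lines up position by position with the plain one.
rotated = LETTERS[rotation:] + LETTERS[:rotation]

print(rotation)                         # 13
print(rotated[:9], "->", LETTERS[:9])   # NOPQRSTUV -> ABCDEFGHI
```

A rotation of 13 means this particular text happens to be ROT13, so the same rotation both encrypts and decrypts it.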
Here's code to find the most common letter and translate the cipher. Since this is probably homework, I'll leave it to the OP to write it without the import or the built-in str.maketrans and str.translate:
import string

LETTERS = string.ascii_uppercase

with open('cipher.txt') as f:
    cipher = f.read()

most_common = max(LETTERS, key=cipher.count)  # This letter is probably E

# Find the rotation as the difference between the ordinals of the most
# common letter and E, modulo 26 to give a number from 0 to 25.
rotation = (ord(most_common) - ord('E')) % 26

# Build the translation table
caesar = LETTERS[rotation:] + LETTERS[:rotation]
translation = str.maketrans(caesar, LETTERS)
print(cipher.translate(translation))
Output:
TOGOINTOSOLITUDEAMANNEEDSTORETIREASMUCHFROMHISCHAMBERASFROMSOCIETYIAMNOTSOLITARYWHILSTIREADANDWRITETHOUGHNOBODYISWITHMEBUTIFAMANWOULDBEALONELETHIMLOOKATTHESTARSTHERAYSTHATCOMEFROMTHOSEHEAVENLYWORLDSWILLSEPARATEBETWEENHIMANDWHATHETOUCHESONEMIGHTTHINKTHEATMOSPHEREWASMADETRANSPARENTWITHTHISDESIGNTOGIVEMANINTHEHEAVENLYBODIESTHEPERPETUALPRESENCEOFTHESUBLIMESEENINTHESTREETSOFCITIESHOWGREATTHEYAREIFTHESTARSSHOULDAPPEARONENIGHTINATHOUSANDYEARSHOWWOULDMENBELIEVEANDADOREANDPRESERVEFORMANYGENERATIONSTHEREMEMBRANCEOFTHECITYOFGODWHICHHADBEENSHOWNBUTEVERYNIGHTCOMEOUTTHESEENVOYSOFBEAUTYANDLIGHTTHEUNIVERSEWITHTHEIRADMONISHINGSMILETHESTARSAWAKENACERTAINREVERENCEBECAUSETHOUGHALWAYSPRESENTTHEYAREINACCESSIBLEBUTALLNATURALOBJECTSMAKEAKINDREDIMPRESSIONWHENTHEMINDISOPENTOTHEIRINFLUENCENATURENEVERWEARSAMEANAPPEARANCENEITHERDOESTHEWISESTMANEXTORTHERSECRETANDLOSEHISCURIOSITYBYFINDINGOUTALLHERPERFECTIONNATURENEVERBECAMEATOYTOAWISESPIRITTHEFLOWERSTHEANIMALSTHEMOUNTAINSREFLECTEDTHEWISDOMOFHISBESTHOURASMUCHASTHEYHADDELIGHTEDTHESIMPLICITYOFHISCHILDHOODWHENWESPEAKOFNATUREINTHISMANNERWEHAVEADISTINCTBUTMOSTPOETICALSENSEINTHEMINDWEMEANTHEINTEGRITYOFIMPRESSIONMADEBYMANIFOLDNATURALOBJECTSITISTHISWHICHDISTINGUISHESTHESTICKOFTIMBEROFTHEWOODCUTTERFROMTHETREEOFTHEPOETTHECHARMINGLANDSCAPEWHICHISAWTHISMORNINGISINDUBITABLYMADEUPOFSOMETWENTYORTHIRTYFARMSMILLEROWNSTHISFIELDLOCKETHATANDMANNINGTHEWOODLANDBEYONDBUTNONEOFTHEMOWNSTHELANDSCAPETHEREISAPROPERTYINTHEHORIZONWHICHNOMANHASBUTHEWHOSEEYECANINTEGRATEALLTHEPARTSTHATISTHEPOETTHISISTHEBESTPARTOFTHESEMENSFARMSYETTOTHISTHEIRWARRANTYDEEDSGIVENOTITLETOSPEAKTRULYFEWADULTPERSONSCANSEENATURE
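For comparison, one possible way to do the same thing without the import, str.maketrans, or str.translate is plain ord/chr arithmetic. This is only a sketch: the function names most_common_letter and caesar_decode are made up for illustration, and the guess that the most common letter is E can fail on short inputs, so the demo below passes the rotation of 13 explicitly.

```python
def most_common_letter(text):
    """Return the most frequent A-Z letter in text (case-insensitive).
    Raises ValueError if text contains no letters."""
    counts = {}
    for ch in text.upper():
        if "A" <= ch <= "Z":
            counts[ch] = counts.get(ch, 0) + 1
    return max(counts, key=counts.get)

def caesar_decode(text, rotation):
    """Shift every A-Z letter back by rotation places, wrapping around.
    Non-letters (spaces, newlines, digits) pass through unchanged."""
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            shifted = (ord(ch) - ord("A") - rotation) % 26
            out.append(chr(shifted + ord("A")))
        else:
            out.append(ch)
    return "".join(out)

# First 16 characters of the cipher text from the question.
sample = "GBTBVAGBFBYVGHQR"
print(caesar_decode(sample, 13))   # TOGOINTOSOLITUDE
```

For the full file, the rotation could be estimated as (ord(most_common_letter(cipher)) - ord('E')) % 26 and then passed to caesar_decode.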
Upvotes: 2