grumpster3

Reputation: 1

Python is truncating my file contents

I have set myself a task in Python: encode a long text file as numbers, using 1-26 for the letters of the alphabet and numbers above 26 for non-alphanumeric characters. See the code below:

#open the file,read the contents and print out normally
my_file = open("timemachine.txt")
my_text = my_file.read()
print (my_text)

print ""
print ""

#open the file and read each line, taking out the eol chars
with open("timemachine.txt","r") as myfile:
    clean_text = "".join(line.rstrip() for line in myfile)

#close the file to prevent memory hogging
my_file.close()

#print out the result all in lower case 
clean_text_lower = clean_text.lower()
print clean_text_lower

print ""
print ""

#establish a lowercase alphabet as a list   
my_alphabet_list = []
my_alphabet = """ abcdefghijklmnopqrstuvwxyz.,;:-_?!'"()[]  %/1234567890"""+"\n"+"\xef"+"\xbb"+"\xbf"

#find the index for each lowercase letter or non-alphanumeric
for letter in my_alphabet:
    my_alphabet_list.append(letter)
print my_alphabet_list,
print my_alphabet_list.index

print ""
print ""

#go through the text and find the corresponding letter of the alphabet
for letter in clean_text_lower:
    posn = my_alphabet_list.index(letter)
print posn,

When I run this I expect to get (1) the original text, (2) the text in lower case with the end-of-line characters removed, (3) the alphabet list used for the coding and finally (4) the converted codes. However, I only see the latter part of the original text; if I comment out (4), all of the text is printed. Why?

Upvotes: 0

Views: 59

Answers (1)

Matthew

Reputation: 658

The bit at the end:

for letter in clean_text_lower:
    posn = my_alphabet_list.index(letter)
print posn,

keeps reassigning posn without doing anything with it, and the print sits outside the loop. You will therefore only see my_alphabet_list.index(letter) for the last letter in clean_text_lower.
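As a small illustration of that behaviour, here is a miniature of the same loop. The three-letter sample text and the shortened alphabet are made up for the example and are not from the question:

```python
# Hypothetical miniature of the question's loop; the sample data is assumed.
my_alphabet_list = list(" abc")   # index 0 is the space, 'a' is 1, and so on
clean_text_lower = "cab"

for letter in clean_text_lower:
    posn = my_alphabet_list.index(letter)   # overwritten on every pass

# Only the value from the final iteration (the letter 'b') survives the loop.
print(posn)
```

Because print is called with a single argument, this runs the same way under Python 2 and Python 3.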

To fix this, there are a couple of things you could do. The first that springs to mind is to initialize a list and append each value to it, i.e.:

posns = []
for letter in clean_text_lower:
    posns.append(my_alphabet_list.index(letter))

print posns,
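Another option, sketched here under a few assumptions, is to build a character-to-index lookup once and collect every position in a single pass. The names encode_text and positions_by_char are hypothetical, the alphabet is a shortened version of the one in the question, and mapping unknown characters to -1 is an assumed convention rather than what list.index() does:

```python
def encode_text(text, alphabet):
    """Return the index in `alphabet` of every character in `text`."""
    # Build the lookup once; keep the first occurrence of a repeated character
    # so the result matches what list.index() would return.
    positions_by_char = {}
    for index, char in enumerate(alphabet):
        positions_by_char.setdefault(char, index)
    # -1 marks characters missing from the alphabet (an assumed convention;
    # list.index() would raise ValueError instead).
    return [positions_by_char.get(char, -1) for char in text]


# Shortened, assumed alphabet: the space is index 0, so 'a' to 'z' are 1 to 26.
alphabet = " abcdefghijklmnopqrstuvwxyz.,;:-_?!'\"()[]%/1234567890"
posns = encode_text("The Time Machine".lower(), alphabet)
print(" ".join(str(p) for p in posns))
```

The dictionary avoids rescanning the alphabet for every character, which may matter for a long file, and the single-argument print call works in both Python 2 and Python 3.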

Upvotes: 2
