hariudkmr

Reputation: 337

Need understanding of this python error

I am processing a text log file with the following Python code. It keeps running continuously, even after mline reaches EOF.

myfile = open("560A_HL_Japan_02_04_2016.txt", 'r')
mod_myfile = open("560A_HL_Japan_02_04_2016_modified.txt", "wb")
mfl = myfile.readlines()
mstring=''
for mline in mfl:
    mli = mline.split()
    for l in range(len(mli)):
        if l >= 2:                      #second object
            mstring += mli[l]+' '
    mstring += '\n'
    mod_myfile.write(mstring)
mod_myfile.close()

If I make a slight modification, as in the code below, it executes without any issues:

myfile = open("560A_HL_Japan_02_04_2016.txt", 'r')
mod_myfile = open("560A_HL_Japan_02_04_2016_modified.txt", "wb")
mfl = myfile.readlines()
for mline in mfl:
    mli = mline.split()
    for l in range(len(mli)):
        if l == 2:                      #second object
            mstring = mli[l]+' '
        elif l > 2:
            mstring += mli[l]+' '
    mstring += '\n'
    mod_myfile.write(mstring)
mod_myfile.close()

Upvotes: 1

Views: 58

Answers (1)

snakecharmerb

Reputation: 55834

In your first example, you initialise mstring as the empty string outside of your loops:

mstring = ''

Then in the loop you keep adding to mstring:

mstring += mli[l]+' '

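but mstring is never reinitialised, so it keeps growing with every line. Each iteration also writes the whole accumulated string to the output file, so both the running time and the size of the output grow roughly quadratically with the number of lines. On a large log file this can look as if the program never finishes.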

In your second example, mstring is reset every time l is equal to 2:

if l == 2:                      #second object
    mstring = mli[l]+' '

Because mstring is reset on every line that has at least three fields, it stays roughly one line long, and the second example finishes quickly.
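
A more direct fix for the first example is to reset mstring at the top of each iteration of the outer loop. Here is a minimal sketch of that idea; the function name strip_first_two_fields is made up for illustration, and it takes a list of lines rather than a file so it can be tried on its own:

```python
def strip_first_two_fields(lines):
    """Return each line with its first two whitespace-separated fields dropped."""
    result = []
    for mline in lines:
        mstring = ''                     # reset for every input line
        for field in mline.split()[2:]:  # skip fields 0 and 1
            mstring += field + ' '
        mstring += '\n'
        result.append(mstring)
    return result


sample = ["2016-04-02 10:00:01 sensor ok", "2016-04-02 10:00:02 fan"]
for out_line in strip_first_two_fields(sample):
    print(repr(out_line))
```

Unlike the second example, this also resets mstring for lines with fewer than three fields, so nothing from a previous line is carried over.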

A couple of other observations:

Using += to add strings is not guaranteed to give best performance in all versions of Python. Consider building a list and calling ''.join() when it is complete.
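
As a rough sketch of the join approach (the helper name drop_leading_fields and its skip parameter are assumptions for illustration, not part of the original code), slicing the split fields and joining them avoids both the inner loop and the repeated +=:

```python
def drop_leading_fields(line, skip=2):
    """Drop the first `skip` whitespace-separated fields and rejoin the rest."""
    fields = line.split()
    return ' '.join(fields[skip:]) + '\n'


print(repr(drop_leading_fields("2016-04-02 10:00:01 sensor ok")))
print(repr(drop_leading_fields("only two")))
```

Note that this version does not leave a trailing space before the newline, which the original loop does. In the file-processing loop, the result would be written once per input line, for example with mod_myfile.write(drop_leading_fields(mline)).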

Don't use l as a variable name, it looks like 1 in some fonts.

Upvotes: 2
