ReLisK

Reputation: 55

PYTHON Re: Performance: Starting from a specific line in a text file, read each line, split it on tabs, then access each element.

As per the title, I want to do the following:

  1. Starting from a specific line x, read each line up to the end of the file. N.B. I don't want to use readlines(), as that reads the entire file into memory, and in testing it is very slow on the server I deployed to (it took about 15 minutes, whereas on my very good PC it takes 30 seconds).

  2. When the single line is read, I want to .split(" ") that specific line and load it into a list so I can access each element.

Please see my attempt below (edited to remove sensitive details):

with open(FileName, "w+") as file:
    file.write(FileName + "," + str(Quantity) + "\n")
    # Start from the beginning of the data, read each line and take the specific data
    for x in range(StartCount, Quantity + StartCount):
        os.chdir(FileLocation + country)
        with open(OutputFileName, 'r') as OutputFile:
            for x, line in enumerate(OutputFile):
                OutputFileData = [line.split("  ") for line in OutputFile]

                # Select the data you want from the output file.
                # N.B. OutputFileData[1][:-1] removes an extra part of a column.
                try:
                    FileData = OutputFileData[0] + "," + OutputFileData[1][:-1] + "," + OutputFileData[2]

.... I then go on to append FileData to the file I'm creating.

Note that my code works fine when I use:

with open(OutputFileName, 'r') as OutputFile:
    lines = OutputFile.readlines()
    temp = lines[x]
    OutputFileData = temp.split("   ")

But as I said before, I believe the script is slow on the server because it keeps re-running lines = OutputFile.readlines() on every pass of the loop, re-reading the whole file into memory each time. So when I check the file I'm trying to create, I see it stop at a certain number of lines and then just hang.
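
What I think I need is something that opens the file once and skips ahead lazily, e.g. with itertools.islice. A rough sketch (start_line and output_path are placeholder names, not my real code):

```

import itertools

output_path = "output.txt"  # placeholder path
start_line = 10             # zero-based index of the first line to read

with open(output_path, "r") as fh:
    # islice skips the first start_line lines lazily, without
    # loading the whole file into memory
    for line in itertools.islice(fh, start_line, None):
        fields = line.rstrip("\n").split("\t")  # split the line on tabs
        # access individual elements, e.g. fields[0], fields[1]

```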

Please help me figure out a better way.

Upvotes: 0

Views: 106

Answers (2)

ReLisK

Reputation: 55

Just coming back to say that the issue at the time wasn't actually my code; the server really is just that slow. So I ended up running the code on individual machines and then dropping the data onto the server where it needed to be. This improved performance immensely.

Upvotes: 1

MarkS

Reputation: 1539

How about reading in N lines at a time, processing those as a 'chunk', and then repeating the process? Something like this:

```

textfile = "f:\\mark\\python\\test.txt"


def read_n(file, x):
    # Generator that yields the file x lines at a time.
    with open(file, mode='r') as fh:
        while True:
            # Read up to x lines; readline() returns '' at EOF,
            # so data is empty once the file is exhausted.
            data = ''.join(fh.readline() for _ in range(x))

            if not data:
                break

            yield data


for nlines in read_n(textfile, 5):
    print(nlines)

```

Which yields (from my simple sample file):

abc
123
def
456
ghi

789
jkl
abc
123
def

456
ghi
789
jkl
abc

123
def
456
ghi
789

jkl
abc
123
def
456

ghi
789
jkl

I am merely printing the lines in chunks, but you could perform whatever processing you are doing.
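
For example, to plug in the tab-splitting from the question, you could process each chunk like this (the per-line handling here is just illustrative):

```

for chunk in read_n(textfile, 5):
    for line in chunk.splitlines():
        fields = line.split("\t")  # split each line on tabs
        # do your per-line processing here, e.g. use fields[0]

```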

Upvotes: 0
