Reputation: 1
I have a million CSV files, each with 441 rows and 8 columns. I open each file and check whether any column of row 221 has a value greater than 60. If so, I set that column to "-1" in every row.
For example:
Input
row 220: 65,13,15,27,18,51,20,79
row 221: 25,23,45,67,12,11,23,69
row 222: 12,12,14,15,16,17,19,22
Output
row 220: 65,13,15,-1,18,51,20,-1
row 221: 25,23,45,-1,12,11,23,-1
row 222: 12,12,14,-1,16,17,19,-1
Once I have done the above processing, I write the contents to another file. I do this for all the files.
The code:
import os
import csv
from os import listdir

file_list = []
mypath1 = os.path.join(mypath, dut)  # dut refers to the directory name
out_path1 = os.path.join(mypath1, folder1)
if not os.path.exists(out_path1):
    os.mkdir(out_path1)
for i in listdir(mypath1):
    if i.startswith("PD") and i.endswith(".csv"):
        file_list.append(i)
for j in file_list:
    f = open(os.path.join(mypath1, j), 'r')
    f5 = list(csv.reader(f))
    # collect the columns whose value in row 221 (index 220) exceeds 60
    sec = []
    for col in range(0, 8):
        if int(f5[220][col]) > 60:
            sec.append(col)
    # overwrite those columns with -1 in every row
    for r in range(0, 441):
        for value in sec:
            f5[r][value] = -1
    filename = "temp1_" + j
    f2 = open(os.path.join(out_path1, filename), 'w+')
    f1 = csv.writer(f2)
    f1.writerows(f5)
    f2.close()
    f.close()
flag = 1
flag=1
The code works fine, but processing around 300,000 CSV files takes about an hour (opening a file, doing the above operation, and writing to another file takes approximately 0.01 seconds per file).
Is there any way to speed up this process? I have 20 other directories with the same number of files, so the total time taken would be around 20 hours.
Upvotes: 0
Views: 112
Reputation: 2003
Pandas' pandas.read_csv is faster than csv.reader and should suit your application better. The corresponding function for writing is to_csv.
A comparison can be found here: Fastest Python library to read a CSV file. Reproducing partial statistics from that link (test run on Windows 7):
open_with_python_csv: 1.57318865672 seconds
open_with_pandas_read_csv: 0.371965476805 seconds
read_csv returns a pandas DataFrame, which provides the iloc indexer (integer location) for integer-based indexing (there are many other access methods to suit different requirements). A simple example would look like:
import pandas as pd
df = pd.read_csv("foo.csv")
row5 = df.iloc[4]
col3 = df.iloc[:, 2]
A lot can be done with it, but it would be too broad to cover everything in this answer. I have included the basics, which should solve your problem or at least move it towards resolution.
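Applied to the task in the question, a minimal sketch could look like the following (assuming the CSV files have no header row, hence header=None, and reusing mypath1, out_path1 and file_list from the question's code):
import os
import pandas as pd

for j in file_list:
    # header=None: treat all 441 lines as data, not as column names
    df = pd.read_csv(os.path.join(mypath1, j), header=None)
    # boolean mask of the columns whose value in row 221 (index 220) exceeds 60
    mask = df.iloc[220] > 60
    # set those columns to -1 in every row in one vectorised assignment
    df.loc[:, mask] = -1
    # write without index or header so the output matches the input layout
    df.to_csv(os.path.join(out_path1, "temp1_" + j), header=False, index=False)
The boolean-mask assignment replaces the two nested loops of the original code with a single vectorised operation.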
Upvotes: 3