Ryan Saxe

Reputation: 17869

How to predict how long it will take for Python to run a script?

I want to make sure that I run my program when it is optimal; for example, if it will take 5 hours to complete, I should run it overnight!

I do know this program will end, and theoretically I should be able to base its running time on the size of the input. So here is the actual problem:

I need to open 16 pickled files containing pandas DataFrames that add up to a total of 1.5 GB. Note that I will also need to do this with DataFrames that add up to 20 GB, so what I need is a way of telling how long the following code will take given the total number of gigabytes:

import pickle
import os
def pickleSave(data, pickleFile):
    output = open(pickleFile, 'wb')
    pickle.dump(data, output)
    output.close()
    print "file has been saved to %s" % (pickleFile)
def pickleLoad(pickleFile):
    pkl_file = open(pickleFile, 'rb')
    data = pickle.load(pkl_file)
    pkl_file.close()
    return data
directory = '/Users/ryansaxe/Desktop/kaggle_parkinsons/GPS/'
files = os.listdir(directory)
dfs = [pickleLoad(directory + i) for i in files]
new_file = directory + 'new_file_dataframe'
pickleSave(dfs,new_file)

So now I need to write a function that will look like the following:

def time_fun(data_size_in_gigs):
    #some algorithm here
    print "your code will take ___ hours to run"

I have no clue how to approach this, or if it is even possible. Any ideas?

Upvotes: 3

Views: 4998

Answers (1)

i Code 4 Food

Reputation: 2154

The execution time is entirely dependent on your system: hard drive or SSD, processor, and so on. No one can tell you upfront how long it will take to run on YOUR computer. The only way to get a reasonable estimate is to run your script on sample files that add up to a small size, such as 100 MB, take note of how long it took, and base your estimate on that.

def time_fun(data_size_in_gigs):
    # time (in hours) you measured by hand for a 100 MB (0.1 GB) sample
    benchmark = time_you_manually_tested_for_100mb
    # scale linearly from the 0.1 GB benchmark to the requested size
    time_to_run = data_size_in_gigs / 0.1 * benchmark
    print "your code will take %s hours to run" % time_to_run

Edit: In fact, you may want to save each benchmark as a (size, time) pair in a file, to which you also automatically append a new entry whenever you actually run your script. In your function, you can then retrieve the 2 benchmarks that are closest to the data_size you are currently estimating and estimate off of them, interpolating in proportion to the data_size you need. Each adjacent pair of benchmarks defines a different linear slope, which will be the most accurate for data sizes near it (a code sketch of this idea follows below).

     |                  .
     |                 .
time |               .
     |            .
     |       .
     |_._________________
              size

Just avoid saving two benchmarks that differ by less than, say, 200 MB, as the actual time may vary and entries such as (999 MB, 100 minutes) followed by (1 GB, 95 minutes) could ruin your estimation.

The projection of the line defined by the last two points will be the closest estimate you have for new all-time-high data sizes.
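Here is a rough sketch of that idea. The file name benchmarks.txt, its one-pair-per-line format, and the helper names are all made up for the example; the assumption is that each line holds a size in GB and the measured time in hours, and that at least one benchmark has already been saved:

def load_benchmarks(path='benchmarks.txt'):
    # each line: "<size_in_gigs> <time_in_hours>"
    pairs = []
    with open(path) as f:
        for line in f:
            size, hours = line.split()
            pairs.append((float(size), float(hours)))
    return sorted(pairs)

def estimate_hours(data_size_in_gigs, benchmarks):
    if len(benchmarks) < 2:
        # only one benchmark: fall back to simple proportionality
        size, hours = benchmarks[0]
        return data_size_in_gigs / size * hours
    # pick the two benchmarks closest to the requested size; for an
    # all-time-high size these are simply the last two points, so the
    # line through them is extrapolated
    closest = sorted(benchmarks, key=lambda p: abs(p[0] - data_size_in_gigs))[:2]
    (s1, t1), (s2, t2) = sorted(closest)
    slope = (t2 - t1) / (s2 - s1)
    return t1 + slope * (data_size_in_gigs - s1)

With something like this in place, time_fun above could simply call estimate_hours(data_size_in_gigs, load_benchmarks()) and print the result.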

Upvotes: 3
