Faster method to append in pandas

Question

I am currently modifying a pandas dataframe in a loop structure which looks something like this:

for item in item_list:

    # ... do something to the item ...

    results_df = results_df.append(item)

This code works well when both the items being appended and results_df are small. However, the items I am appending are reasonably large and the loop is long, so it takes a long time to complete: every append copies results_df, and that copy gets more expensive as results_df grows.

One solution I can see is to append the items to lists held in a dictionary, like:

results_dict = {'result_1': [], 'result_2': [], 'result_3': []}
for item in item_list:
    item_1, item_2, item_3 = item

    # ... do something ...

    results_dict['result_1'].append(item_1)
    results_dict['result_2'].append(item_2)
    results_dict['result_3'].append(item_3)

The dataframe can then be built from the resulting dictionary. This is OK but does not seem optimal. Can anyone think of a better solution? N.B. the items in each item in item_list are reasonably large dataframes on which some complex processing takes place, and the length of item_list is of the order of 1000.
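To make the dictionary idea concrete, here is a minimal sketch under stated assumptions. The contents of item_list are hypothetical stand-ins (small tuples of numbers rather than the real, large dataframes), and the per-item processing is left out. The second half shows a related variant that may fit better when each piece is itself a dataframe: collect the pieces in a plain Python list and call pd.concat once after the loop. The names frames and combined_df are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in data: each item is a tuple of three values.
item_list = [(i, i * 2, i * 3) for i in range(5)]

# Dictionary-of-lists approach: appending to a Python list is cheap,
# and the DataFrame is built only once, after the loop.
results_dict = {'result_1': [], 'result_2': [], 'result_3': []}
for item in item_list:
    item_1, item_2, item_3 = item
    # ... per-item processing would go here ...
    results_dict['result_1'].append(item_1)
    results_dict['result_2'].append(item_2)
    results_dict['result_3'].append(item_3)

results_df = pd.DataFrame(results_dict)

# Variant for dataframe-shaped pieces (assumed shape, for illustration):
# gather the pieces in a list, then concatenate a single time.
frames = []
for i in range(3):
    piece = pd.DataFrame({'a': [i, i + 1], 'b': [i * 10, i * 10 + 1]})
    frames.append(piece)

combined_df = pd.concat(frames, ignore_index=True)
```

In both versions the expensive step (constructing or concatenating the dataframe) happens once, rather than on every iteration as it does when results_df is rebuilt inside the loop.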

