Is there a way to make changing DataFrame faster in a loop?

Question

    for index, row in df.iterrows():
        print(index)

        name = row['name']
        new_name = get_name(name)
        row['new_name'] = new_name

        df.loc[index] = row

In this piece of code, my testing shows that the last line makes it very slow. It basically inserts a new column row by row. Maybe I should store all the 'new_name' values in a list and update the df outside of the loop?

jezrael · Accepted Answer

Use Series.apply to run the function on each value of the column; it is faster than iterrows:

    df['new_name'] = df['name'].apply(get_name)
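
As a self-contained sketch of that one-liner: the question does not show `get_name`, so the version below is a hypothetical stand-in, and the sample data is made up for illustration. It also shows that the list-based idea from the question gives the same result with a single column assignment.

```python
import pandas as pd

# Hypothetical stand-in for get_name; the real function is not shown in the question.
def get_name(name):
    return name.strip().title()

# Made-up sample data for illustration only.
df = pd.DataFrame({"name": [" alice ", "BOB", "carol"]})

# One call computes the whole column; there is no per-row df.loc assignment.
df["new_name"] = df["name"].apply(get_name)

# The list-based idea from the question: build the values first,
# then assign the whole column once, outside the loop.
names_list = [get_name(n) for n in df["name"]]
df["new_name_from_list"] = names_list
```

Both variants still call `get_name` once per row, so the gain comes from dropping the per-row `df.loc[index] = row` write, not from the function itself.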

If you want to improve performance further, you need to change the function itself, if possible, but that depends on the function.
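
What "changing the function" looks like depends entirely on what `get_name` does, which the question does not show. As a rough sketch under that caveat, here are two common options, again using a hypothetical `get_name` and made-up data. The first only works if the function is deterministic and depends solely on its input; the second only works if the logic can be expressed with pandas string methods. Whether either is faster should be measured on the real data.

```python
import pandas as pd

# Hypothetical stand-in for get_name; assumed deterministic and side-effect free.
def get_name(name):
    return name.strip().title()

# Made-up sample data with repeated names.
df = pd.DataFrame({"name": ["alice", "BOB", "alice", "BOB", "carol"]})

# Option 1: call the function once per distinct value, then map the results back.
# This helps when the column has many duplicates and get_name is expensive.
lookup = {n: get_name(n) for n in df["name"].unique()}
df["new_name"] = df["name"].map(lookup)

# Option 2: replace the Python function with pandas string methods,
# which is only possible when the logic has a .str equivalent.
df["new_name_str"] = df["name"].str.strip().str.title()
```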
