Reputation: 11
I'm trying to build a data frame based on another one. In order to build the second one, I need to loop over the first data frame, make some changes to the data, and insert it into the second one. I am using a namedtuple for my for loop.
This loop is taking a lot of time to process 2M rows of data. Is there any faster way to do this?
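Roughly, what I'm doing looks like this (a simplified sketch; df1, df2 and the per-row change are placeholders for the real data and logic):
import pandas as pd

# toy stand-in for the first data frame
df1 = pd.DataFrame({'a': range(5), 'b': range(5)})

new_rows = []
# itertuples() yields one namedtuple per row
for row in df1.itertuples(index=False):
    # placeholder change; the real transformation is more involved
    new_rows.append({'a': row.a * 2, 'b': row.b + 1})

df2 = pd.DataFrame(new_rows)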
Upvotes: 0
Views: 1682
Reputation: 630
I'd recommend using the iterrows() method that is built into pandas:
import pandas as pd

data = {'Name': ['John', 'Paul', 'George'], 'Age': [20, 21, 19]}
db = pd.DataFrame(data)
print(f"Dataframe:\n{db}\n")

# iterrows() yields an (index, row) pair for every row; each row is a Series
for index, row in db.iterrows():
    print(f"Row Index:{index}")
    print(f"Row:\n{row}\n")
The output of the above:
Dataframe:
Name Age
0 John 20
1 Paul 21
2 George 19
Row Index:0
Row:
Name John
Age 20
Name: 0, dtype: object
Row Index:1
Row:
Name Paul
Age 21
Name: 1, dtype: object
Row Index:2
Row:
Name George
Age 19
Name: 2, dtype: object
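If the goal is to build a second data frame from the first, one way to use the same loop is to collect the modified rows in a list and construct the new frame once at the end (a sketch; the Age change is just a placeholder for your own transformation):
new_rows = []
for index, row in db.iterrows():
    # placeholder change: bump Age by one
    new_rows.append({'Name': row['Name'], 'Age': row['Age'] + 1})

db2 = pd.DataFrame(new_rows)
print(db2)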
Upvotes: 0
Reputation: 116
Since a pandas DataFrame is organised by columns, row-by-row iteration isn't what it is optimised for. However, this is the way I process each row of a DataFrame:
# 'table' is the source DataFrame; zipping its columns yields one tuple per row
rows = zip(*(table.loc[:, each] for each in table))
for rowNum, record in enumerate(rows):
    # If you want to process each record, do it here; otherwise just print it
    print("Row", rowNum, "records:", record)
Btw, I'd still suggest looking for built-in pandas methods that can process your first dataframe - they will usually be quicker and more effective than a hand-written loop, for example the sketch below. Hope this helps.
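For instance, a whole-column operation like the one below replaces the explicit loop entirely (a sketch; the column name and the + 1 change are only placeholders for whatever your real transformation is):
import pandas as pd

table = pd.DataFrame({'Name': ['John', 'Paul', 'George'], 'Age': [20, 21, 19]})

# Vectorised: apply the change to the whole column at once instead of looping
result = table.assign(Age=table['Age'] + 1)
print(result)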
Upvotes: 1