2Xchampion

Reputation: 656

How to integrate a progress bar in a pandas iterrows for loop

I know that using iterrows in pandas is bad practice, but it is what I have to work with in code left over from previous projects.

I am using a for loop like the one below to iterate through a pandas DataFrame for some data manipulation (I'm on mobile, so forgive the poor formatting):

for index, row in df_temp.iterrows():
    pass  # do stuff

I would now like to wrap a progress bar around this loop to track its progress, given how much data it processes. I found tqdm, but the examples I have seen cover only simple cases. Is there a neat way to restructure my for loop so that a progress bar can be slotted in?

I tried simply keeping a counter and tracking it on each iteration, but that seems counter-intuitive.

Upvotes: 1

Views: 1726

Answers (2)

jar

Reputation: 2908

df.iterrows() returns a generator, which has no length, so tqdm cannot tell how many rows to expect. You can pass the number of rows through the total parameter:

for _, row in tqdm(df.iterrows(), total=df.shape[0]):
    pass  # do stuff

With total set, tqdm shows a filling bar with a percentage and a time estimate instead of just an iteration count.
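For completeness, here is a minimal end-to-end sketch of the same idea. The sample DataFrame, the "value" column, and the running_sum accumulator are made up for illustration; only the tqdm(..., total=...) wrapper comes from the answer above.

```python
import numpy as np
import pandas as pd
from tqdm import tqdm

# Hypothetical sample data standing in for the asker's df_temp.
df_temp = pd.DataFrame({"value": np.arange(1_000)})

running_sum = 0
# total=len(df_temp) gives tqdm the row count it cannot get from the generator.
for index, row in tqdm(df_temp.iterrows(), total=len(df_temp)):
    running_sum += row["value"]  # placeholder for the real per-row work
```

len(df_temp) and df_temp.shape[0] return the same row count, so either works as the total.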

Upvotes: 3

Ian Thompson

Reputation: 3285

import numpy as np
import pandas as pd
from tqdm import tqdm


df = pd.DataFrame(np.random.random(10_000))

# Without total=, tqdm shows an iteration count and rate rather than a filling bar.
for index, row in tqdm(df.iterrows()):
    # do stuff
    row

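To see what wrapping the bare generator does and does not give you, the sketch below compares two bars side by side. It is only an illustration: the variable names are made up, and file=io.StringIO() is there just to keep both bars out of the console while inspecting them.

```python
import io

import numpy as np
import pandas as pd
from tqdm import tqdm

df = pd.DataFrame(np.random.random(1_000))

# file=io.StringIO() only keeps the bars out of the console for this comparison.
bar_without_total = tqdm(df.iterrows(), file=io.StringIO())
bar_with_total = tqdm(df.iterrows(), total=len(df), file=io.StringIO())

print(bar_without_total.total)  # None: the generator has no len()
print(bar_with_total.total)     # the row count passed in

for index, row in bar_with_total:
    pass  # do stuff

rows_seen = bar_with_total.n  # number of iterations tqdm counted
bar_without_total.close()
```

When total is None, tqdm still counts iterations and reports the rate, but it has nothing to compute a percentage or remaining time from.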

Upvotes: 3
