Martin Kunze

Reputation: 1055

Python: Pandas dataframe and for loop - separate row variable outside of loop body

I have some table data (based on a pandas dataframe) in the following form:

Index       Name       Region 1       ...  Region n
Index data  Name data  Region 1 data  ...  Region n data

Now I want to loop through the data rows and, for each row, separate the data of column Name into a string variable and the data of columns Region i for all 1≤i≤n into some kind of array or list.

The way I know is as follows:

for index, row in data.iterrows():
    name = row.values[0]
    regions = row.filter(regex='^Region').values

    # body of loop

In the body of the for loop I never need the variable row again, only name and regions, so the code feels a bit overloaded to me.

My question now is:

Is there some way to make this a little simpler, maybe a for loop of this kind:

for index, name, regions in data():
     body of loop

Upvotes: 0

Views: 309

Answers (1)

Ahmed Elashry

Reputation: 399

First of all, when using pandas, it is better to avoid explicit for loops where you can. Built-in pandas methods are usually faster, and most things you would do in a for loop have a pandas equivalent.
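To illustrate what "using pandas methods" can look like, here is a minimal sketch. The sample data and the row-wise sum are assumptions made up for the example, since the question does not say what the loop body does:

```python
import pandas as pd

# Hypothetical sample data in the shape described in the question.
df = pd.DataFrame(
    {
        "Name": ["A", "B"],
        "Region 1": [1, 4],
        "Region 2": [2, 5],
        "Region 3": [3, 6],
    }
)

# Select all region columns once instead of filtering inside a loop.
region_cols = df.filter(regex="^Region")

# Vectorized row-wise total: one call, no explicit Python loop.
totals = region_cols.sum(axis=1)
```

Whether this applies depends on the loop body; if it cannot be expressed as column operations, a row-wise approach is still needed.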

For your case, you can define what you want to do in a function and pass it to the apply() method of the dataframe. With axis=1, pandas calls the function once per row. For example:

def body_for_loop(row):
    name = row["Name"]
    regions = row.filter(regex='^Region').values
    # body of loop

Now when you want to use it, just call it on your dataframe (df here):

df.apply(body_for_loop, axis=1)
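A runnable sketch of the same idea, under some assumptions: the sample data, the function name summarize_row, and the summary string it returns are all hypothetical stand-ins for the real loop body. The second half shows one possible way to get the loop shape asked for in the question, by zipping the index, the Name column, and the region values:

```python
import pandas as pd

# Hypothetical sample data in the shape described in the question.
df = pd.DataFrame(
    {
        "Name": ["A", "B"],
        "Region 1": [1, 4],
        "Region 2": [2, 5],
        "Region 3": [3, 6],
    }
)


def summarize_row(row):
    """Hypothetical loop body: build a short summary string per row."""
    name = row["Name"]
    regions = row.filter(regex="^Region").values
    return f"{name}: {int(regions.sum())}"


# apply() with axis=1 calls the function once per row
# and collects the return values in a Series.
summaries = df.apply(summarize_row, axis=1)

# Alternative: an explicit loop without the intermediate row variable.
# The region columns are selected once, outside the loop.
region_values = df.filter(regex="^Region").to_numpy()
rows = []
for index, name, regions in zip(df.index, df["Name"], region_values):
    # regions is a NumPy array holding the Region columns of this row.
    rows.append((index, name, regions.tolist()))
```

Note that apply() with axis=1 still visits the rows one at a time, so it mainly tidies up the code. The zip version filters the region columns only once instead of once per row.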

Upvotes: 1
