imkusoh

Reputation: 43

for loop to add new values to column

I am trying to add a new column to my DataFrame that contains the side length (or the radius, for circles) computed from the shape and area of each row.

My original dataset looks like this:

df:

    shape     color   area  
0   square    yellow  9409.0    
1   circle    yellow  4071.5    
2   triangle  blue    2028.0    
3   square    blue    3025.0

But when I coded it like this:

df['side'] = 0
for x in df['shape']:
    if x == 'square':
        df['side'] = np.rint(np.sqrt(df['area'])).astype(int)
    elif x == 'triangle':
        df['side'] = np.rint(np.sqrt((4 * df['area'])/np.sqrt(3))).astype(int)
    elif x == 'circle':
        df['side'] = np.rint(np.sqrt(df['area']/np.pi)).astype(int)

I got:

    shape     color   area    side
0   square    yellow  9409.0  55
1   circle    yellow  4071.5  36    
2   triangle  blue    2028.0  25    
3   square    blue    3025.0  31    

It looks like the loop is applying the formula from the elif x == 'circle' branch to the side column for every row.

Upvotes: 0

Views: 1375

Answers (2)

user7864386

This looks like a good use case for numpy.select, which picks a value for each row depending on which condition (here, which shape) matches:

import numpy as np
df['side'] = np.select([df['shape']=='square', 
                        df['shape']=='circle', 
                        df['shape']=='triangle'], 
                       [np.rint(np.sqrt(df['area'])), 
                        np.rint(np.sqrt(df['area']/np.pi)), 
                        np.rint(np.sqrt((4 * df['area'])/np.sqrt(3)))], 
                       np.nan).astype(int)

It can be written more concisely by creating a mapping from shape to multiplier and then using pandas vectorized operations:

mapping = {'square': 1, 'circle': 1 / np.pi, 'triangle': 4 / np.sqrt(3)}
df['side'] = df['shape'].map(mapping).mul(df['area']).pow(1/2).round(0).astype(int)

Output:

      shape   color    area  side
0    square  yellow  9409.0    97
1    circle  yellow  4071.5    36
2  triangle    blue  2028.0    68
3    square    blue  3025.0    55
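
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the four sample rows from the question by hand (an assumption, since the real dataset is not shown) and checks that the np.select version and the mapping version give the same result. The multipliers come from inverting the area formulas: side = sqrt(A) for a square, radius = sqrt(A / pi) for a circle, and side = sqrt(4A / sqrt(3)) for a triangle, which assumes the triangles are equilateral.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question; the full dataset is assumed to look like this.
df = pd.DataFrame({
    "shape": ["square", "circle", "triangle", "square"],
    "color": ["yellow", "yellow", "blue", "blue"],
    "area": [9409.0, 4071.5, 2028.0, 3025.0],
})

# Approach 1: np.select picks, row by row, the choice whose condition is True.
conditions = [
    df["shape"] == "square",
    df["shape"] == "circle",
    df["shape"] == "triangle",
]
choices = [
    np.sqrt(df["area"]),
    np.sqrt(df["area"] / np.pi),
    np.sqrt(4 * df["area"] / np.sqrt(3)),
]
side_select = np.rint(np.select(conditions, choices, default=np.nan)).astype(int)

# Approach 2: map each shape to its multiplier, then one vectorized expression.
mapping = {"square": 1, "circle": 1 / np.pi, "triangle": 4 / np.sqrt(3)}
side_map = df["shape"].map(mapping).mul(df["area"]).pow(0.5).round(0).astype(int)

df["side"] = side_map
print(side_select.tolist())    # [97, 36, 68, 55]
print(df["side"].tolist())     # [97, 36, 68, 55]
```

Both versions assume every row has one of the three known shapes. A row with any other shape would produce NaN, which does not convert cleanly to int, so such rows would need to be handled first.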

Upvotes: 1

AppleCiderGuy

Reputation: 1287

In your loop, each assignment writes to the whole column rather than to a single row, so every iteration overwrites all rows. You can instead iterate over the rows with the DataFrame's iterrows() method and set one cell at a time:

for i, row in df.iterrows():
    if row['shape'] == 'square':
        df.at[i,'side'] = np.rint(np.sqrt(row['area'])).astype(int)
    elif row['shape'] == 'triangle':
        df.at[i,'side'] = np.rint(np.sqrt((4 * row['area'])/np.sqrt(3))).astype(int)
    elif row['shape'] == 'circle':
        df.at[i,'side'] = np.rint(np.sqrt(row['area']/np.pi)).astype(int)

Note that each assignment targets a single cell: the 'side' column at row index i.
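
To make the difference between the two kinds of assignment concrete, here is a small sketch. It uses the sample rows from the question (rebuilt by hand, which is an assumption about the real data) and applies only the circle formula, once to the whole column and once cell by cell. The names whole and cell are just illustrative copies of the frame.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "shape": ["square", "circle", "triangle", "square"],
    "area": [9409.0, 4071.5, 2028.0, 3025.0],
})

# Whole-column assignment: the right-hand side is computed for every row,
# so all rows get the circle formula regardless of their shape.
whole = df.copy()
whole["side"] = np.rint(np.sqrt(whole["area"] / np.pi)).astype(int)

# Per-cell assignment: only the row at index i is written.
cell = df.copy()
cell["side"] = 0
for i, row in cell.iterrows():
    if row["shape"] == "circle":
        cell.at[i, "side"] = int(np.rint(np.sqrt(row["area"] / np.pi)))

print(whole["side"].tolist())  # [55, 36, 25, 31]
print(cell["side"].tolist())   # [0, 36, 0, 0]
```

The first list matches the unexpected output in the question, which is what you get when the circle formula is assigned to the entire column.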

Also, the suggestion by @enke above will work just fine.

Upvotes: 0
