user14383280

Can't create a function that adds a new column to a DataFrame

I'm trying to make a function that adds a new column with numbering:

x = [i+1 for i in range(len(df))]

def new_column (df, column):
  df.sort_values(by=column, ascending=False)
  df['new_col'] = x

But when I call it, I get an error:

new_column(df, column_name)

NameError: name 'column' is not defined

What did I do wrong?

Upvotes: 0

Views: 304

Answers (2)

Bendriss Jaâfar

Reputation: 78

This is how I did it:

import pandas as pd

df = pd.DataFrame({
    'a': [2.0, 1.0, 3.5, 2.0, 5.0, 3.0, 1.0, 1.0],
    'b': [1.0, -1.0, 3.5, 3.0, 4.0, 2.0, 3.0, 2.0],
    'c': [2.0, 2.0, 2.0, 2.0, -1.0, -1.0, -2.0, -2.0],
})
print(df)

     a    b    c
0  2.0  1.0  2.0
1  1.0 -1.0  2.0
2  3.5  3.5  2.0
3  2.0  3.0  2.0
4  5.0  4.0 -1.0
5  3.0  2.0 -1.0
6  1.0  3.0 -2.0
7  1.0  2.0 -2.0

def new_column(df, column_name, column_value):
    # add (or overwrite) the column on the DataFrame that was passed in
    df[column_name] = column_value
    return df

x = [i+1 for i in range(len(df))]

df2 = new_column(df, "help", x)

The problem with your approach is that you use "column" before it is defined, which is why you get "NameError: name 'column' is not defined".

     a    b    c  help
0  2.0  1.0  2.0     1
1  1.0 -1.0  2.0     2
2  3.5  3.5  2.0     3
3  2.0  3.0  2.0     4
4  5.0  4.0 -1.0     5
5  3.0  2.0 -1.0     6
6  1.0  3.0 -2.0     7
7  1.0  2.0 -2.0     8

Upvotes: 1

jezrael

Reputation: 862611

Use:

import numpy as np
import pandas as pd

np.random.seed(2021)

df = pd.DataFrame({'a': np.random.randint(1, 10, size=10)})
print(df)


def new_column(df, column):
    # sort_values returns a new DataFrame: assign it back or use inplace=True
    df = df.sort_values(by=column, ascending=False)
    # df.sort_values(by=column, ascending=False, inplace=True)
    # add numbering 1..n in the sorted order
    df['new_col'] = range(1, len(df) + 1)
    # return output
    return df

print(new_column(df, 'a'))

   a  new_col
5  9        1
3  7        2
6  7        3
7  7        4
8  7        5
9  7        6
1  6        7
4  6        8
0  5        9
2  1       10
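
For what it's worth, here is a small sketch of why the assignment matters: `sort_values` without `inplace=True` leaves the original DataFrame untouched, so the bare call in the question's function has no effect. The three sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({"a": [5, 9, 1]})

# The sorted result is discarded, so df keeps its original order
df.sort_values(by="a", ascending=False)
unchanged = df["a"].tolist()
print(unchanged)

# Assigning the result keeps the sorted copy; numbering follows the sorted order
sorted_df = df.sort_values(by="a", ascending=False)
sorted_df["new_col"] = range(1, len(sorted_df) + 1)
print(sorted_df)
```

Note that the sorted copy keeps the original index labels, which is why the index in the output above is not in ascending order.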

Upvotes: 1
