Reputation:
I'm trying to make a function that adds new columns with numbering:
x = [i+1 for i in range(len(df))]
def new_column (df, column):
    df.sort_values(by=column, ascending=False)
    df['new_col'] = x
But when I call it, I get an error:
new_column(df, column_name)
NameError: name 'column' is not defined
What did I do wrong?
Upvotes: 0
Views: 304
Reputation: 78
This is how I did it:
import pandas as pd

df = pd.DataFrame({
    'a' : [2.0,1.0,3.5,2.0,5.0,3.0,1.0,1.0],
    'b' : [1.0,-1.0,3.5,3.0,4.0,2.0,3.0,2.0],
    'c' : [2.0,2.0,2.0,2.0,-1.0,-1.0,-2.0,-2.0],
})
     a    b    c
0  2.0  1.0  2.0
1  1.0 -1.0  2.0
2  3.5  3.5  2.0
3  2.0  3.0  2.0
4  5.0  4.0 -1.0
5  3.0  2.0 -1.0
6  1.0  3.0 -2.0
7  1.0  2.0 -2.0
def new_column (df, column_name, column_value):
    df[column_name] = column_value
    return df
x = [i+1 for i in range(len(df))]
df2 = new_column(df, "help", x)
The problem with your approach is that you use "column" before it has been defined, which is why Python reports: "NameError: name 'column' is not defined". The result of my version is:
     a    b    c  help
0  2.0  1.0  2.0     1
1  1.0 -1.0  2.0     2
2  3.5  3.5  2.0     3
3  2.0  3.0  2.0     4
4  5.0  4.0 -1.0     5
5  3.0  2.0 -1.0     6
6  1.0  3.0 -2.0     7
7  1.0  2.0 -2.0     8
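To illustrate the NameError mentioned above, here is a minimal sketch. The function name show_scope and the small sample frame are made up for this example and are not from the question. The idea is that a parameter such as "column" only exists inside the function body, so any line that refers to it outside the function (for instance a line that is not indented under the def) will most likely fail in the same way:

```python
import pandas as pd

df = pd.DataFrame({'a': [2.0, 1.0, 3.5]})

def show_scope(df, column):
    # 'column' is a parameter: it only exists while the function runs
    return df.sort_values(by=column, ascending=False)

# Works: the column name is passed in as a string
sorted_df = show_scope(df, 'a')

# Fails: outside the function there is no variable called 'column'
try:
    df.sort_values(by=column, ascending=False)
except NameError as exc:
    message = str(exc)
    print(message)
```

So when calling the function, pass the column name as a string (or as a variable that has already been assigned), and keep every line that uses the parameter indented inside the def.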
Upvotes: 1
Reputation: 862611
Use:
import numpy as np
import pandas as pd

np.random.seed(2021)
df = pd.DataFrame({'a':np.random.randint(1,10, size=10)})
print (df)
def new_column (df, column):
    #assign back or use inplace=True
    df = df.sort_values(by=column, ascending=False)
    #df.sort_values(by=column, ascending=False, inplace=True)
    #add range
    df['new_col'] = range(1, len(df) + 1)
    #return output
    return df
print (new_column(df, 'a'))
   a  new_col
5  9        1
3  7        2
6  7        3
7  7        4
8  7        5
9  7        6
1  6        7
4  6        8
0  5        9
2  1       10
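The "assign back" comment above matters because sort_values returns a new DataFrame by default and leaves the original untouched. A small sketch of the difference, using a made-up three-row frame rather than the random data above:

```python
import pandas as pd

df = pd.DataFrame({'a': [5, 9, 1]})

# Without assigning back, the sorted result is simply discarded
df.sort_values(by='a', ascending=False)
unchanged = df['a'].tolist()

# Assigning the result keeps the sorted order
sorted_df = df.sort_values(by='a', ascending=False)
# Number the rows 1..n in their new (sorted) order
sorted_df['new_col'] = range(1, len(sorted_df) + 1)

values = sorted_df['a'].tolist()
numbers = sorted_df['new_col'].tolist()
index = sorted_df.index.tolist()
print(sorted_df)
```

Note that the original index labels travel with the rows, which is why the index in the output above is no longer in order. The same applies to the function: it returns a new sorted frame, so the caller should keep the returned value, e.g. df = new_column(df, 'a').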
Upvotes: 1