Reputation: 8033
I have a DataFrame as shown below:
import pandas as pd
d = {'Name':['Alisa','Bobby','jodha','jack','raghu','Cathrine',
'Alisa','Bobby','kumar','Alisa','Alex','Cathrine'],
'Age':[26,24,23,22,23,24,26,24,22,23,24,24],
'Score':[85,63,55,74,31,77,85,63,42,62,89,77]}
df = pd.DataFrame(d,columns=['Name','Age','Score'])
Name Age Score
0 Alisa 26 85
1 Bobby 24 63
2 jodha 23 55
3 jack 22 74
4 raghu 23 31
5 Cathrine 24 77
6 Alisa 26 85
7 Bobby 24 63
8 kumar 22 42
9 Alisa 23 62
10 Alex 24 89
11 Cathrine 24 77
When I run the code below, it works fine and produces the output shown: one new column for each of the selected columns.
a=df.columns[1:]
df[a +'rat'] = df[a]/df[a].sum()
Name Age Score Agerat Scorerat
0 Alisa 26 85 0.091228 0.105853
1 Bobby 24 63 0.084211 0.078456
2 jodha 23 55 0.080702 0.068493
3 jack 22 74 0.077193 0.092154
4 raghu 23 31 0.080702 0.038605
5 Cathrine 24 77 0.084211 0.095890
6 Alisa 26 85 0.091228 0.105853
7 Bobby 24 63 0.084211 0.078456
8 kumar 22 42 0.077193 0.052304
9 Alisa 23 62 0.080702 0.077210
10 Alex 24 89 0.084211 0.110834
11 Cathrine 24 77 0.084211 0.095890
However, when I try to create a min column for each of the columns with the code below, I get the error KeyError: "None of [Index(['Agemin', 'Scoremin'], dtype='object')] are in the [columns]". I expected it to create those columns!
df[a +'min'] = df[a].min()
So, how do we go about creating min(), max(), sum(), etc. columns for each of the columns without having to specify the names of each of the columns?
Upvotes: 2
Views: 57
Reputation: 19947
If you would prefer to do it using your original code, you can do:
import numpy as np
df[a +'min'] = df[a].groupby(by=np.zeros_like(df.index)).transform('min')
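To illustrate why this works where the original assignment did not, here is a minimal sketch on a shortened, three-row copy of the question's data (the names col_min and same_shape are only for illustration). df[a].min() returns a Series with one value per column, whereas grouping every row under a single key and calling transform returns a DataFrame with the same shape as df[a], which is the kind of value the multi-column assignment df[a + 'rat'] = ... already succeeded with.

```python
import numpy as np
import pandas as pd

# Shortened copy of the question's data, for illustration only
df = pd.DataFrame({'Name': ['Alisa', 'Bobby', 'jodha'],
                   'Age': [26, 24, 23],
                   'Score': [85, 63, 55]})
a = df.columns[1:]

# One value per column: a Series indexed by 'Age' and 'Score'
col_min = df[a].min()

# Put every row in the same group (key 0), then broadcast each
# column's minimum back to all rows: a DataFrame shaped like df[a]
same_shape = df[a].groupby(np.zeros(len(df), dtype=int)).transform('min')

# Assigning a DataFrame to several new column names at once
df[a + 'min'] = same_shape
print(df)
```

I pass the string 'min' rather than the built-in min here, since recent pandas versions warn about built-in callables in transform.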
Upvotes: 2
Reputation: 59529
You can use assign to add multiple scalar values from a Series, using ** to unpack it into keyword arguments. Each index label becomes a column name, with its value broadcast to all rows. For a Series, add_suffix appends to the index labels; later I use it on a DataFrame, where it appends to the column names instead.
import pandas as pd
df1 = df.select_dtypes('number')
df = df.assign(**df1.min().add_suffix('min'))
# Name Age Score Agemin Scoremin
#0 Alisa 26 85 22 31
#1 Bobby 24 63 22 31
#2 jodha 23 55 22 31
#...
#10 Alex 24 89 22 31
#11 Cathrine 24 77 22 31
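Since the question also asks about max() and sum(), the same pattern could probably be extended with a loop over aggregation names. This is only a sketch on a shortened copy of the data; the variable names num and func and the choice of aggregations are my own assumptions, not part of the original code.

```python
import pandas as pd

# Shortened copy of the question's data, for illustration only
df = pd.DataFrame({'Name': ['Alisa', 'Bobby', 'jodha'],
                   'Age': [26, 24, 23],
                   'Score': [85, 63, 55]})

# Take the numeric columns once, so the newly added columns
# are not themselves aggregated on later iterations
num = df.select_dtypes('number')

for func in ['min', 'max', 'sum']:
    # num.agg(func) is a Series indexed by column name; add_suffix
    # turns 'Age' into e.g. 'Agemin', and ** passes each entry
    # to assign as a keyword argument
    df = df.assign(**num.agg(func).add_suffix(func))

print(df)
```

Selecting num before the loop matters: calling select_dtypes inside the loop would pick up Agemin, Scoremin and so on as well.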
Personally, I would concat the other result:
df = pd.concat([df, (df1/df1.sum()).add_suffix('rat')], axis=1)
# Name Age Score Agemin Scoremin Agerat Scorerat
#0 Alisa 26 85 22 31 0.091228 0.105853
#1 Bobby 24 63 22 31 0.084211 0.078456
#2 jodha 23 55 22 31 0.080702 0.068493
#...
#10 Alex 24 89 22 31 0.084211 0.110834
#11 Cathrine 24 77 22 31 0.084211 0.095890
Upvotes: 2