Reputation: 5299
I have dataframe shown in below: column name 'Types'shows each types dified
I would like to add another column named 'number' defined as below.
df=pd.DataFrame({'Sex':['M','F','F','M'],'Age':[30,31,33,32],'Types':['A','C','B','D']})
Out[8]:
Age Sex Types
0 30 M A
1 31 F C
2 33 F B
3 32 M D
and I have another male table below; each column represents Types!
(It was difficult to create table for me, Are there another easy way to create?)
table_M = pd.DataFrame(np.arange(20).reshape(4,5),index=[30,31,32,33],columns=["A","B","C","D","E"])
table_M.index.name="Age(male)"
A B C D E
Age(male)
30 0 1 2 3 4
31 5 6 7 8 9
32 10 11 12 13 14
33 15 16 17 18 19
and I have female table below;
table_F = pd.DataFrame(np.arange(20,40).reshape(4,5),index=[30,31,32,33],columns=["A","B","C","D","E"])
table_F.index.name="Age(female)"
A B C D E
Age(female)
30 20 21 22 23 24
31 25 26 27 28 29
32 30 31 32 33 34
33 35 36 37 38 39
so I would like to add 'number' column as shown below;
Age Sex Types number
0 30 M A 0
1 31 F C 27
2 33 F B 36
3 32 M D 13
this number column refer to female and male table. for each age , Type, and Sex. It was too complicated for me. Can I ask how to add 'number' column?
Upvotes: 2
Views: 127
Reputation: 294488
It's easier if you two tables are combined such that you can access the 'Sex'
via an apply
.
table = pd.concat([table_F, table_M], axis=1, keys=['F', 'M'])
accessor = lambda row: table.loc[row.Age, (row.Sex, row.Types)]
df['number'] = df.apply(accessor, axis=1)
df
Upvotes: 3
Reputation: 16619
Another way to do this:
In [60]: df['numbers'] = df.apply(lambda x: table_F.loc[[x.Age]][x.Types].iloc[0] if x.Sex == 'F' else table_M.loc[[x.Age]][x.Types].iloc[0], axis = 1)
In [60]: df
Out[60]:
Age Sex Types numbers
0 30 M A 0
1 31 F C 27
2 33 F B 36
3 32 M D 13
Upvotes: 1
Reputation:
I suggest reshaping your male and female tables:
males = (table_M.stack().to_frame('number').assign(Sex='M').reset_index()
.rename(columns={'Age(male)': 'Age', 'level_1': 'Types'}))
females = (table_F.stack().to_frame('number').assign(Sex='F').reset_index()
.rename(columns={'Age(female)': 'Age', 'level_1': 'Types'}))
reshaped = pd.concat([males, females], ignore_index=True)
Then merge:
df.merge(reshaped)
Out:
Age Sex Types number
0 30 M A 0
1 31 F C 27
2 33 F B 36
3 32 M D 13
What this does is it stacks the columns of Male and Female tables, and assigns an indicator column showing Sex ('M' and 'F'). females.head()
looks like this:
females.head()
Out:
Age Types number Sex
0 30 A 20 F
1 30 B 21 F
2 30 C 22 F
3 30 D 23 F
4 30 E 24 F
and males.head()
:
males.head()
Out:
Age Types number Sex
0 30 A 0 M
1 30 B 1 M
2 30 C 2 M
3 30 D 3 M
4 30 E 4 M
With pd.concat these two are combined into a single DataFrame and merge by default works on the common columns so it looks for the matches in 'Age', 'Sex', 'Types' columns and merge two DataFrames based on that.
One other possibility is to use df.lookup:
df.loc[df['Sex']=='M', 'number'] = table_M.lookup(*df.loc[df['Sex']=='M', ['Age', 'Types']].values.T)
df.loc[df['Sex']=='F', 'number'] = table_F.lookup(*df.loc[df['Sex']=='F', ['Age', 'Types']].values.T)
df
Out:
Age Sex Types number
0 30 M A 0.0
1 31 F C 27.0
2 33 F B 36.0
3 32 M D 13.0
This looks up the males in table_M
, and females in table_F
.
Upvotes: 5