Reputation: 692
I have a main_df
dataframe as below.
user_id main_code sub_1 sub_2
0 03920 AA YA ZA
1 34233 BB YB ZA
2 02342 AA YD ZB
3 32324 CC YA ZA
4 52323 AA YA ZD
5 20932 DD YD ZD
6 02034 BB YA ZA
I am trying to achieve below output dataframe. Selected columns(sub_1 & sub_2) of main_df dataframe unique values count and covert to dataframe columns.
main_code YA YB YD ZA ZB ZD
0 AA 2.0 NaN 1.0 1.0 1.0 1.0
1 BB 1.0 1.0 NaN 2.0 NaN NaN
2 CC 1.0 NaN NaN 1.0 NaN NaN
3 DD NaN NaN 1.0 NaN NaN 1.0
So far I tried as below. I get a different output.
result_df = pd.DataFrame()
for col in ['sub_1','sub_2']:
result_df = pd.concat([result_df, pd.DataFrame(main_df[pd.notnull(main_df[col])]['main_code'].value_counts())], axis=1)
result_df.columns = ['sub_1','sub_2']
It would be helpful someone can guide me. Thank you.
Upvotes: 2
Views: 78
Reputation: 1432
Although right answer is already given. You can use this if you want to follow a customized approach.
a = (pd.pivot_table(df, index='main_code',columns=
['sub_1'],aggfunc=np.count_nonzero).reset_index())
cols = a.columns.droplevel(0).to_list()
cols[0]='main_code'
a.columns=cols
print(a)
Upvotes: 0
Reputation: 3936
Here you go:
pd.merge(
df.pivot_table(index="main_code", columns="sub_1", aggfunc="count")["sub_2"],
df.pivot_table(index="main_code", columns="sub_2", aggfunc="count")["sub_1"],
left_index=True,
right_index=True
)
Upvotes: 0
Reputation: 16683
Use .melt
with .pivot_table
df = df.melt(id_vars='main_code', value_vars=['sub_1', 'sub_2']).pivot_table(index='main_code', columns='value', aggfunc='count').reset_index()
Upvotes: 3