Reputation: 89
This dataframe is categorized in A and B.
and I want to count values based on standard value "0", ">0" and nan.
Pls see the Cateogory A. There isn't 'nan' data.
df :
Category A A A B B
CODE U-01 U-02 U-03 U-04 U-05
n1 0 1 0 2 nan
n2 1 1 0 nan nan
n3 3 0 0 0 2
In the Code,
df1 = (df.unstack()
.mask(lambda x: x.gt(0), '>0')
.fillna('nan')
.groupby(level=[0, 2])
.value_counts()
.unstack([0,2], fill_value=0)
.rename(columns={0:'0'}))
The output drop Category A & nan columns. I want to express to every columns in "0, >0, nan".
Category A B
Standard 0 >0 0 >0 nan
n1 2 1 0 1 1
n2 1 2 0 0 2
n3 1 1 1 1 0
My desired output table would look like:
Category A B
Standard 0 >0 nan 0 >0 nan
n1 2 1 0 0 1 1
n2 1 2 0 0 0 2
n3 1 1 0 1 1 0
Upvotes: 1
Views: 48
Reputation: 863166
First idea is use DataFrame.reindex
by all possible combinations created by MultiIndex.from_product
:
df1 = (df.unstack()
.mask(lambda x: x.gt(0), '>0')
.fillna('nan')
.groupby(level=[0, 2])
.value_counts()
.unstack([0,2], fill_value=0)
.rename(columns={0:'0'})
)
mux = pd.MultiIndex.from_product([df.columns.levels[0], ['0','>0','nan']])
df1 = df1.reindex(mux, axis=1, fill_value=0)
print (df1)
Category A B
0 >0 nan 0 >0 nan
n1 2 1 0 0 1 1
n2 1 2 0 0 0 2
n3 2 1 0 1 1 0
Or you can reshape by stack
and unstack
with convert missing values to 0
:
df1 = (df.unstack()
.mask(lambda x: x.gt(0), '>0')
.fillna('nan')
.groupby(level=[0, 2])
.value_counts()
.unstack([0,2], fill_value=0)
.rename(columns={0:'0'})
.stack()
.fillna(0)
.astype(int)
.unstack()
)
print (df1)
Category A B
0 >0 nan 0 >0 nan
n1 2 1 0 0 1 1
n2 1 2 0 0 0 2
n3 2 1 0 1 1 0
Upvotes: 2