Reputation: 667
I saw a function that is used to convert crosstab values into percentages. The code is:
I am confused about the meaning of ser/float(ser[-1]). What does ser[-1] mean, and how does this code convert the data into percentages?
Upvotes: 2
Views: 845
Reputation: 862511
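It divides each Series that apply passes to the function by its last value, ser[-1], which is converted to float. With axis=1 each Series is a row and the last value is its entry in the All column; with the default axis=0 each Series is a column and the last value comes from the All row. Either way, every count becomes a fraction of the corresponding total.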
You can check it by:
def percConvert(ser):
    print(ser)
    print(ser[-1])
    return ser / float(ser[-1])
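To see the arithmetic in isolation, here is a minimal sketch on a single hand-built Series. The counts are hypothetical and not taken from the question's data. It uses ser.iloc[-1], the explicit positional form of ser[-1], since relying on ser[-1] to fall back to position on a labelled index is deprecated in recent pandas versions.

```python
import pandas as pd

# Hypothetical counts for one row of a crosstab built with margins=True
ser = pd.Series([20, 30, 50], index=["N", "Y", "All"])

total = ser.iloc[-1]          # last element by position: the "All" total (50)
shares = ser / float(total)   # element-wise division of the whole Series

print(shares)
```

Each value is divided by 50, so N becomes 0.4, Y becomes 0.6 and All becomes 1.0.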
The solution can be simplified with the parameter normalize=0 in crosstab (the All column, which would contain only ones, is omitted):
df1 = pd.crosstab(data['Credit_History'],data['Loan_Status'], margins=True, normalize=0)
Sample:
import numpy as np
import pandas as pd

np.random.seed(123)
N = 100
data = pd.DataFrame({'Loan_Status': np.random.choice(['Y','N'], N),
                     'Credit_History': np.random.choice([0., 1.], N)})
#print (data)
def percConvert(ser):
    # divide every value by the last one (the All total)
    return ser / float(ser[-1])
df1 = pd.crosstab(data['Credit_History'],data['Loan_Status'], margins=True, normalize=0)
print (df1)
Loan_Status            N         Y
Credit_History
0.0             0.489362  0.510638
1.0             0.415094  0.584906
All             0.450000  0.550000
df1 = pd.crosstab(data['Credit_History'], data['Loan_Status'], margins=True).apply(percConvert, axis=1)
print (df1)
Loan_Status            N         Y  All
Credit_History
0.0             0.489362  0.510638  1.0
1.0             0.415094  0.584906  1.0
All             0.450000  0.550000  1.0
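As a quick cross-check that both approaches give the same fractions, here is a small sketch on a hypothetical eight-row dataset (not the data from the question). It writes normalize="index", which is the named equivalent of normalize=0, and uses ser.iloc[-1] in place of ser[-1] for explicit positional access.

```python
import numpy as np
import pandas as pd

# Small hypothetical dataset, chosen so the counts are easy to verify by hand
data = pd.DataFrame({
    "Credit_History": [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    "Loan_Status":    ["N", "N", "Y", "N", "Y", "Y", "Y", "Y"],
})

# Approach 1: counts with margins, then divide each row by its "All" total
counts = pd.crosstab(data["Credit_History"], data["Loan_Status"], margins=True)
by_apply = counts.apply(lambda ser: ser / float(ser.iloc[-1]), axis=1)

# Approach 2: let crosstab normalize over each row
by_normalize = pd.crosstab(data["Credit_History"], data["Loan_Status"],
                           margins=True, normalize="index")

# The numbers agree once the constant "All" column is dropped
same = np.allclose(by_apply.drop(columns="All").to_numpy(),
                   by_normalize.to_numpy())
print(same)
```

The only difference between the two results is the extra All column of ones produced by the apply version.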
Upvotes: 5