Psyduck

Reputation: 667

Percentage-converting function in a pandas DataFrame

I saw a function that is used to convert crosstab values into percentages; the code is:

I am really confused about the meaning of ser/float(ser[-1]). What does ser[-1] mean, and how does this code convert the data into percentages?

https://www.analyticsvidhya.com/blog/2016/01/12-pandas-techniques-python-data-manipulation/

Upvotes: 2

Views: 845

Answers (1)

jezrael

Reputation: 862511

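It divides each Series passed to the function by its last value, converted to float. Because the function is applied with axis=1, each Series is one row of the crosstab, and ser[-1] is that row's value in the All column (the row total).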

You can check it by:

def percConvert(ser):
    print (ser)
    print (ser[-1])
    return ser / float(ser[-1])
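
To see the division on a single row in isolation, here is a minimal sketch. The counts are made up for illustration and are not taken from the loan data. It uses ser.iloc[-1] rather than ser[-1], on the assumption that newer pandas releases no longer treat ser[-1] as positional on a label-indexed Series; the result is the same last element either way.

```python
import pandas as pd

# Hypothetical counts for one crosstab row; 'All' holds the row total.
row = pd.Series([23, 24, 47], index=['N', 'Y', 'All'])

total = row.iloc[-1]          # positional access to the last element, i.e. the 'All' value
shares = row / float(total)   # element-wise division of the whole row by its total

print(shares)
```

Each entry becomes its share of the row total, so the All entry itself turns into 1.0.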

But the solution can be simplified with the parameter normalize=0 in crosstab (the All column, which would contain only ones, is then omitted):

df1 = pd.crosstab(data['Credit_History'],data['Loan_Status'], margins=True, normalize=0)

Sample:

import numpy as np
import pandas as pd

np.random.seed(123)
N = 100
data = pd.DataFrame({'Loan_Status': np.random.choice(['Y','N'], N),
                   'Credit_History':np.random.choice([0., 1.], N)})
#print (data)

def percConvert(ser):
    return ser / float(ser[-1])

df1 = pd.crosstab(data['Credit_History'],data['Loan_Status'], margins=True, normalize=0)
print (df1)
Loan_Status            N         Y
Credit_History                    
0.0             0.489362  0.510638
1.0             0.415094  0.584906
All             0.450000  0.550000

# wrap the chained call in parentheses so it can continue on the next line
df1 = (pd.crosstab(data['Credit_History'],data['Loan_Status'], margins=True)
         .apply(percConvert, axis=1))
print (df1)
Loan_Status            N         Y  All
Credit_History                         
0.0             0.489362  0.510638  1.0
1.0             0.415094  0.584906  1.0
All             0.450000  0.550000  1.0
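
A quick way to check that the two approaches agree is to drop the All column from the apply result and compare the remaining values. This is only a sketch on the same random sample data; perc_convert is a renamed copy of percConvert that uses iloc, and normalize='index' is the documented string equivalent of normalize=0.

```python
import numpy as np
import pandas as pd

np.random.seed(123)
N = 100
data = pd.DataFrame({'Loan_Status': np.random.choice(['Y', 'N'], N),
                     'Credit_History': np.random.choice([0., 1.], N)})

def perc_convert(ser):
    # divide every element of the row by the row's last element (the 'All' total)
    return ser / float(ser.iloc[-1])

# built-in row normalization: each row sums to 1, no 'All' column in the result
normalized = pd.crosstab(data['Credit_History'], data['Loan_Status'],
                         margins=True, normalize='index')

# manual version: raw counts with margins, then divide each row by its total
applied = (pd.crosstab(data['Credit_History'], data['Loan_Status'], margins=True)
             .apply(perc_convert, axis=1)
             .drop(columns='All'))

# compare the numeric values of the two tables
same = np.allclose(normalized.to_numpy(), applied.to_numpy())
print(same)
```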

Upvotes: 5
