How to compute the Herfindahl index with DataFrame.apply

Question

I am trying to compute the Herfindahl index using apply. I already have a version that works by converting the DataFrame to a NumPy matrix: the function evalHerfindahlIndex returns the correct Herfindahl index for each row. However, when I try to write an equivalent function (evalHerfindahlIndexForDF) to use with apply, I get a very strange error:

ValueError: ("No axis named 1 for object type ", 'occurred at index A')

The entire code is this:

import pandas as pd
import numpy as np
import datetime


def evalHerfindahlIndex(x):
    soma=np.sum(x,axis=1)
    y=np.empty(np.shape(x))
    for line in range(len(soma)):
        y[line,:]=np.power(x[line,:]/soma[line],2.0)
    hhi=np.sum(y,axis=1)    
    return hhi

def evalHerfindahlIndexForDF(x):
    soma=x.sum(axis=1)

def creatingDataFrame():

    dateList=[]
    dateList.append(datetime.date(2002,1,1))
    dateList.append(datetime.date(2002,2,1))
    dateList.append(datetime.date(2002,1,1))
    dateList.append(datetime.date(2002,1,1))
    dateList.append(datetime.date(2002,2,1))
    raw_data = {'Date': dateList,            
                'Company': ['A', 'B', 'B', 'C' , 'C'],                
                'var1': [10, 20, 30, 40 , 50]}

    df = pd.DataFrame(raw_data, columns = ['Date','Company', 'var1'])
    df.loc[1, 'var1'] = np.nan
    return df


if __name__=="__main__":
    df=creatingDataFrame()
    print(df)
    dfPivot=df.pivot(index='Date', columns='Company', values='var1')
    #print(dfPivot)
    dfPivot=dfPivot.fillna(0)
    dfPivot['Date']=dfPivot.index

    listOfCompanies=list(set(df['Company']))
    # as_matrix() was removed in pandas 1.0; to_numpy() is the replacement
    Pivot=dfPivot[listOfCompanies].to_numpy()
    print(evalHerfindahlIndex(Pivot))
    print(dfPivot)

    print(dfPivot[listOfCompanies].apply(evalHerfindahlIndexForDF))

The dataframe that I am using is dfPivot:

Company        A     B     C        Date
Date                                    
2002-01-01  10.0  30.0  40.0  2002-01-01
2002-02-01   0.0   0.0  50.0  2002-02-01

The correct values of the Herfindahl index, computed with evalHerfindahlIndex, are:

[0.40625 1.     ]
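As a check on these numbers: evalHerfindahlIndex computes, for each row, the sum of squared shares. Writing x_i for one company's value on a given date (notation introduced here only for illustration), the two rows of dfPivot work out as follows:

```latex
\begin{align*}
H &= \sum_i \left( \frac{x_i}{\sum_j x_j} \right)^2 \\
H_{\text{2002-01-01}} &= \left(\tfrac{10}{80}\right)^2 + \left(\tfrac{30}{80}\right)^2 + \left(\tfrac{40}{80}\right)^2
  = 0.015625 + 0.140625 + 0.25 = 0.40625 \\
H_{\text{2002-02-01}} &= \left(\tfrac{0}{50}\right)^2 + \left(\tfrac{0}{50}\right)^2 + \left(\tfrac{50}{50}\right)^2 = 1
\end{align*}
```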

I would like to add these values as an extra column of the DataFrame dfPivot.
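One likely cause of the error: apply defaults to axis=0, so the function receives each column as a Series (column A first, hence "occurred at index A"), and a Series has no axis 1 to sum over. The function as posted also returns nothing. A minimal sketch of one possible fix, assuming the pivoted frame shown above; the names herfindahl_row, df_pivot, and companies are illustrative, not from the original code:

```python
import pandas as pd


def herfindahl_row(row):
    """Return the sum of squared shares for one row (a pandas Series)."""
    shares = row / row.sum()
    return (shares ** 2).sum()


# Rebuild the pivoted frame from the question (values after fillna(0)).
df_pivot = pd.DataFrame(
    {"A": [10.0, 0.0], "B": [30.0, 0.0], "C": [40.0, 50.0]},
    index=pd.Index(["2002-01-01", "2002-02-01"], name="Date"),
)
companies = ["A", "B", "C"]

# axis=1 hands each row to the function, one Series per date.
df_pivot["HHI"] = df_pivot[companies].apply(herfindahl_row, axis=1)

# Vectorized alternative without apply: divide each row by its total,
# square the shares, and sum across the columns.
shares = df_pivot[companies].div(df_pivot[companies].sum(axis=1), axis=0)
hhi_vectorized = (shares ** 2).sum(axis=1)

print(df_pivot)
```

Selecting only the company columns before calling apply keeps the extra Date column out of the sums. The vectorized form should give the same result and tends to be faster on large frames.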
