Longroadahead

Reputation: 381

percentage change in pandas

I have a DataFrame like the following:

      Date         ACH   BABA   BIDU    CEA    CHA   CTRP    EDU    HNP  
0     2000-06-30  $1.00  $3.00  $1.00  $0.00  $0.00  $0.00  $0.00  $0.00   
1     2000-07-03  $3.00  $2.00  $6.20  $1.50  $0.00  $0.00  $0.00 $-0.48   
2     2000-07-04  $5.00  $6.00  $3.00  $0.00  $0.00  $0.00  $0.00  $0.00 

I'm trying to calculate the percentage change of each using:

df_vals = df[[ticker for ticker in tickers]].pct_change()

However, I get the following error:

TypeError: unsupported operand type(s) for /: 'str' and 'str'

I'm assuming I get this error because of the column headings, so it can't do arithmetic on strings. I then tried adding shift (probably wrong too):

df_vals = df[[ticker for ticker in tickers]].shift(1).pct_change()

This returns the same error. Thanks for the help.

Upvotes: 3

Views: 489

Answers (1)

jezrael

Reputation: 862431

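The error comes from the values themselves being strings such as '$1.00', not from the column headings. You need to remove the $ with replace and cast the columns to float first: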

import pandas as pd
from io import StringIO

s = '''\
Date        ACH   BABA   BIDU    CEA    CHA   CTRP    EDU    HNP
2000-06-30  $1.00  $3.00  $1.00  $0.00  $0.00  $0.00  $0.00  $0.00
2000-07-03  $3.00  $2.00  $6.20  $1.50  $0.00  $0.00  $0.00 $-0.48
2000-07-04  $5.00  $6.00  $3.00  $0.00  $0.00  $0.00  $0.00  $0.00'''

# Recreate sample dataframe (pd.compat.StringIO no longer exists in current pandas)
df = pd.read_csv(StringIO(s), sep=r'\s+')

# Set Date as the index (so it is excluded from the calculation) and remove all $
df = df.set_index('Date').replace(r'\$', '', regex=True).astype(float)

# Apply pct change and reset index
df = df.pct_change().reset_index()

print(df)

Returns:

         Date       ACH      BABA      BIDU       CEA  CHA  CTRP  EDU  \
0  2000-06-30       NaN       NaN       NaN       NaN  NaN   NaN  NaN   
1  2000-07-03  2.000000 -0.333333  5.200000       inf  NaN   NaN  NaN   
2  2000-07-04  0.666667  2.000000 -0.516129 -1.000000  NaN   NaN  NaN   

        HNP  
0       NaN  
1      -inf  
2 -1.000000  
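
The same cleanup can also be applied directly to the df[tickers] selection from the question, leaving the Date column where it is. A minimal sketch, assuming tickers is a plain list of the ticker column names and using a two-column subset of the sample data as a stand-in for the real frame:

```python
import pandas as pd

# Stand-in for the question's frame: prices are stored as strings with a "$"
df = pd.DataFrame({
    'Date': ['2000-06-30', '2000-07-03', '2000-07-04'],
    'ACH': ['$1.00', '$3.00', '$5.00'],
    'BABA': ['$3.00', '$2.00', '$6.00'],
})
tickers = ['ACH', 'BABA']  # assumed: list of ticker column names

# Strip the dollar sign and convert only the ticker columns to float
prices = df[tickers].replace(r'\$', '', regex=True).astype(float)

# Percentage change between consecutive rows; the first row is NaN
df_vals = prices.pct_change()
print(df_vals)
```

No shift is needed here, since pct_change already compares each row with the previous one.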

Upvotes: 3
