kev
kev

Reputation: 139

Transform data to growth rates in Python

I have two variables and I want to express one of them (monetary_base) in terms of monthly growth. How can I do that?. In the R language you should first transform the data into time series, in Python is this also the case?

#LLamando a las series que buscamos    
inflacion = llamada_api('https://api.estadisticasbcra.com/inflacion_mensual_oficial')
base_monetaria = llamada_api('https://api.estadisticasbcra.com/base')

#Armando DataFrames
df = pd.DataFrame(inflacion)
df_bm = pd.DataFrame(base_monetaria)

#Renombrando columnas
df = df.rename(columns={'d':'Fecha',
                        'v':'IPC'})

df_bm = df_bm.rename(columns={'d':'Fecha',
                              'v':'base_monetaria'})

#Arreglando tipo de datos
df['Fecha']=pd.to_datetime(df['Fecha'])
df_bm['Fecha']=pd.to_datetime(df_bm['Fecha'])

#Verificando que las fechas esten en formato date
df['Fecha'].dtype
df_bm['Fecha'].dtype

#Filtrando 
df_ipc = df[(df['Fecha'] > '2002-12-31')]
df_bm_filter = df_bm[(df_bm['Fecha'] > '2002-12-31')]

#Graficando
plt.figure(figsize=(14,12))
df_ipc.plot(x = 'Fecha', y = 'IPC')
plt.title('IPC-Mensual', fontdict={'fontsize':20})
plt.ylabel('IPC')
plt.xticks(rotation=45)
plt.show()

The data looks like this

         Fecha  base_monetaria
1748 2003-01-02           29302
1749 2003-01-03           29360
1750 2003-01-06           29524
1751 2003-01-07           29867
1752 2003-01-08           29957
        ...             ...
5966 2020-02-18         1941302
5967 2020-02-19         1941904
5968 2020-02-20         1887975
5969 2020-02-21         1855477
5970 2020-02-26         1807042

The idea is to take the data for the last day of the month and calculate the growth rate with the data for the last day of the previous month.

Upvotes: 0

Views: 787

Answers (3)

kev
kev

Reputation: 139

The problem can be solve creating a column with the lag values of base_monetaria

df_bm_filter['is_month_end'] = df_bm_filter['Fecha'].dt.is_month_end
df_last_date = df_bm_filter[df_bm_filter['is_month_end'] == True]
df_last_date['base_monetaria_lag'] = df_last_date['base_monetaria'].shift(1)
df_last_date['bm_growth'] = (df_last_date['base_monetaria'] - df_last_date['base_monetaria_lag']) / df_last_date['base_monetaria_lag']

Upvotes: 0

XXavier
XXavier

Reputation: 1226

Assuming the base_moetaria is a monthly cumulative value then

df = pd.DataFrame({'Fecha': ['2020-01-31', '2020-02-29', '2020-03-31', '2020-05-31', '2020-04-30', '2020-07-31', '2020-06-30', '2020-08-31', '2020-09-30', '2020-10-31', '2020-11-30', '2020-12-31'],
                   'price': [32132, 54321, 3213121, 432123, 32132, 54321, 32132, 54321, 3213121, 432123, 32132, 54321]})

df['Fecha'] = pd.to_datetime(df['Fecha'])
df.set_index('Fecha', inplace=True)
new_df = df.groupby(pd.Grouper(freq="M")).tail(1).reset_index()
new_df['rate'] = (new_df['price'] -new_df['price'].shift(1))/new_df['price'].shift(1)

The new_df['rate'] will give you the growth rate the way you explained in the comment below

Upvotes: 1

Manuel Carrera
Manuel Carrera

Reputation: 131

You can try something like this

from pandas.tseries.offsets import MonthEnd
import pandas as pd

df = pd.DataFrame({'Fecha': ['2020-01-31', '2020-02-29', '2020-03-31', '2020-05-31', '2020-04-30', '2020-07-31', '2020-06-30', '2020-08-31', '2020-09-30', '2020-10-31', '2020-11-30', '2020-12-31'],
                   'price': ['32132', '54321', '3213121', '432123', '32132', '54321', '32132', '54321', '3213121', '432123', '32132', '54321']})

df['Fecha'] = df['Fecha'].astype('datetime64[ns]')
df['is_month_end'] = df['Fecha'].dt.is_month_end
df = df[df['is_month_end'] == True]

df.sort_values('Fecha',inplace=True)

df.reset_index(drop=True, inplace = True)

def change(x,y):
    try:
        index = df[df['Fecha']==y].index.item()
        last = df.loc[index-1][1]
        return float(x)/float(last)
    except:
        return 0


df['new_column'] = df.apply(lambda row: change(row['price'],row['Fecha']), axis=1)

df.head(12)

Upvotes: 1

Related Questions