jKraut
jKraut

Reputation: 2487

Pandas convert yearly to monthly

I'm working on pulling financial data, in which some is formatted in yearly and other is monthly. My model will need all of it monthly, therefore I need that same yearly value repeated for each month. I've been using this stack post and trying to adapt the code to my data.

Here is my dataframe:

df.head()

   date ticker value
0 1999-12-31  ECB/RA6  1.0
1 2000-12-31  ECB/RA6  4.0
2 2001-12-31  ECB/RA6  2.0
3 2002-12-31  ECB/RA6  3.0
4 2003-12-31  ECB/RA6  2.0

Here is my desired output first 5 rows:

   date ticker value
0 1999-12-31  ECB/RA6  1.0
1 2000-01-31  ECB/RA6  4.0
2 2000-02-28  ECB/RA6  4.0
3 2000-13-31  ECB/RA6  4.0
4 2000-04-30  ECB/RA6  4.0

And my code:

df['date'] = pd.to_datetime(df['date'], format='%Y-%m')
df = df.pivot(index='date', columns='ticker')
start_date = df.index.min() - pd.DateOffset(day=1)
end_date = df.index.max() + pd.DateOffset(day=31)
dates = pd.date_range(start_date, end_date, freq='M')
dates.name = 'date'
df = df.reindex(dates, method='ffill')

df = df.stack('ticker')
df = df.sortlevel(level=1)
df = df.reset_index()

However, it is not repeating the months as expected

Upvotes: 8

Views: 4502

Answers (1)

piRSquared
piRSquared

Reputation: 294198

You want resample

First, you need to set the index so that resample will work. Then you backfill and reset the index.

df.set_index('date').resample('M').bfill().reset_index()

         date   ticker  value
0  1999-12-31  ECB/RA6    1.0
1  2000-01-31  ECB/RA6    4.0
2  2000-02-29  ECB/RA6    4.0
3  2000-03-31  ECB/RA6    4.0
4  2000-04-30  ECB/RA6    4.0
5  2000-05-31  ECB/RA6    4.0
6  2000-06-30  ECB/RA6    4.0
7  2000-07-31  ECB/RA6    4.0
8  2000-08-31  ECB/RA6    4.0
9  2000-09-30  ECB/RA6    4.0
10 2000-10-31  ECB/RA6    4.0
11 2000-11-30  ECB/RA6    4.0
12 2000-12-31  ECB/RA6    4.0
13 2001-01-31  ECB/RA6    2.0
14 2001-02-28  ECB/RA6    2.0
15 2001-03-31  ECB/RA6    2.0
...

To handle this per ticker

df.set_index('date').groupby('ticker', group_keys=False) \
    .resample('M').bfill().reset_index()

Upvotes: 7

Related Questions