Reputation: 640
I'm trying to find if is possible to use data.asfreq(MonthEnd())
with no date_range
created data.
What I'm trying to achive. I run csv query with the following code:
import numpy as np
import pandas as pd
data = pd.read_csv("https://www.quandl.com/api/v3/datasets/FRED/GDPC1.csv?api_key=", parse_dates=True)
data.columns = ["period", "integ"]
data['period'] = pd.to_datetime(data['period'], infer_datetime_format=True)
Then I want to assign frequency to my 'period' column by doing this:
tdelta = data.period[1] - data.period[0]
data.period.freq = tdelta
And some print comands:
print(data)
print(data.period.freq)
print(data.dtypes)
Returns:
..........
270 1948-07-01 2033.2
271 1948-04-01 2021.9
272 1948-01-01 1989.5
273 1947-10-01 1960.7
274 1947-07-01 1930.3
275 1947-04-01 1932.3
276 1947-01-01 1934.5
[277 rows x 2 columns]
-92 days +00:00:00
period datetime64[ns]
integ float64
dtype: object
I can also parse the original 'DATE' column by making it 'index':
data = pd.read_csv("https://www.quandl.com/api/v3/datasets/FRED/GDPC1.csv?api_key=", parse_dates=True, index_col='DATE')
What I want to do is just to covert the quarterly data in to monthly rows. For example:
270 1948-07-01 2033.2
271 1948-06-01 NaN
272 1948-05-01 NaN
273 1948-04-01 2021.9
274 1948-03-01 NaN
275 1948-02-01 NaN
276 1948-01-01 1989.5
......and so on.......
I'm eventually trying to do this by using ts.asfreq(MonthBegin())
and , ts.asfreq(MonthBegin(), method='pad')
. So far unsuccessfully. I have the following error:
NameError: name 'MonthBegin' is not defined
My question is can I use asfreq
if I don't use date_range
to create the frame? Somehow to 'pass' my date column to the function. If this is not the solution is it there any other easy way to convert quarterly to monthly frequency?
Upvotes: 1
Views: 1017
Reputation: 13923
Use a TimeGrouper
:
import pandas as pd
periods = ['1948-07-01', '1948-04-01', '1948-01-01', '1947-10-01',
'1947-07-01', '1947-04-01', '1947-01-01']
integs = [2033.2, 2021.9, 1989.5, 1960.7, 1930.3, 1932.3, 1934.5]
df = pd.DataFrame({'period': pd.to_datetime(periods), 'integ': integs})
df = df.set_index('period')
df = df.groupby(pd.TimeGrouper('MS')).sum().sort_index(ascending=False)
EDIT: You can also use resample
instead of a TimeGrouper
:
df.resample('MS').sum().sort_index(ascending=False)
Upvotes: 2