Reputation: 911
I'm trying to convert a column in a dataframe from month numbers 1-420 (35 years of monthly data, starting in January 1981) to a datetime object.
Sample dataframe:
import pandas as pd
import numpy as np
dates = pd.Series(range(1,421))
df2 = pd.DataFrame(np.random.randn(420,4),index=dates,columns=list('ABCD'))
Convert index to a datetime object:
df2.index = pd.to_datetime(df2.index,unit='M', origin='1981-01-01')
Gives the error:
ValueError: cannot cast unit M
I don't understand why it won't cast the unit 'M': when I try 'd' instead of 'M' it works and goes up daily, so why won't it go up monthly? I got the units from here.
Using 'm', the output looks like this:
A B C D
1981-01-01 00:01:00 0.672397 0.753926 0.865845 0.711594
1981-01-01 00:02:00 0.786754 0.658421 -0.111609 -1.459447
1981-01-01 00:03:00 0.200273 -1.485525 -1.939203 0.921833
1981-01-01 00:04:00 -1.589668 0.109760 -1.349790 -1.951316
1981-01-01 00:05:00 0.133847 -0.359300 -1.246740 -0.835645
1981-01-01 00:06:00 -0.843962 1.222129 -0.121450 -1.223132
1981-01-01 00:07:00 -0.818932 0.731127 0.984731 -1.028384
That goes up in minutes. I want it to go up in months, like this:
A B C D
1981-01-01 00:00:00 0.672397 0.753926 0.865845 0.711594
1981-02-01 00:00:00 0.786754 0.658421 -0.111609 -1.459447
1981-03-01 00:00:00 0.200273 -1.485525 -1.939203 0.921833
Upvotes: 0
Views: 65
Reputation: 8631
You should use date_range:
df2.index = pd.date_range('1981/1/1', periods=len(df2), freq='MS')
Output:
A B C D
1981-01-01 -0.761933 0.726808 0.589712 -1.170934
1981-02-01 0.030521 -0.892427 -1.366809 -1.515724
1981-03-01 -0.282887 1.068047 0.244493 -0.247356
Have a look at offset alias for more information.
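If the month numbers are not guaranteed to be contiguous (so date_range can't simply be laid over them), one alternative is to compute the year and month from each number. This is only a sketch: it assumes month 1 means January 1981, and the variable names are just for illustration.

```python
import numpy as np
import pandas as pd

# Same sample frame as in the question: month numbers 1..420 as the index
months = pd.Series(range(1, 421))
df2 = pd.DataFrame(np.random.randn(420, 4), index=months, columns=list('ABCD'))

# Assumption: month 1 corresponds to January 1981
m = df2.index.to_numpy() - 1          # zero-based month offset
parts = pd.DataFrame({
    'year': 1981 + m // 12,           # whole years elapsed
    'month': m % 12 + 1,              # month within the year, 1..12
    'day': 1,                         # first day of each month
})

# to_datetime assembles dates from year/month/day columns
df2.index = pd.DatetimeIndex(pd.to_datetime(parts))
```

For a contiguous 1..420 index this gives the same dates as the date_range call above.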
EDIT: As the OP said, the 420 months repeat over 200,000 rows. The code below provides the repeated index.
daterange = pd.date_range('1981/1/1', periods=420, freq='MS')
Then expand it to fit your dataframe by repeating it.
# Whole repeats of the range, plus the leftover partial repeat at the end
df2.index = list(daterange) * (len(df2) // len(daterange)) + list(daterange)[:len(df2) % len(daterange)]
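The same repetition can probably be written more compactly by cycling through the range with a modulo. A minimal sketch, assuming a frame of 200,000 rows (the row count and the name n_rows are made up here to mirror the OP's description):

```python
import numpy as np
import pandas as pd

n_rows = 200_000  # assumed size of the OP's dataframe
df2 = pd.DataFrame(np.random.randn(n_rows, 4), columns=list('ABCD'))

daterange = pd.date_range('1981/1/1', periods=420, freq='MS')

# Position i gets month i % 420, so the 420 months cycle until the frame is full
positions = np.arange(len(df2)) % len(daterange)
df2.index = daterange[positions]
```

This avoids building Python lists, which may matter on larger frames.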
Upvotes: 2