Reputation: 1334
I have this dataframe in pandas:
df = pd.read_csv('data_stack.csv',index_col='month',parse_dates=True)
If I look at the index's freq attribute, it is automatically inferred as None:
DatetimeIndex(['2018-09-01', '2018-08-01', '2018-07-01', '2018-06-01',
'2018-05-01', '2018-04-01', '2018-03-01', '2018-02-01',
'2018-01-01', '2017-12-01',
...
'2018-11-01', '2019-01-01', '2018-12-01', '2018-11-01',
'2019-01-01', '2018-12-01', '2018-11-01', '2019-01-01',
'2018-12-01', '2018-11-01'],
dtype='datetime64[ns]', name='month', length=4325, freq=None)
I want to set it to month start ('MS'):
df.index.freq = 'MS'
but I get this error:
ValueError Traceback (most recent call last)
<ipython-input-99-0dc1e7b74d6b> in <module>
----> 1 df.index.freq = 'MS'
~/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/extension.py in fset(self, value)
64
65 def fset(self, value):
---> 66 setattr(self._data, name, value)
67
68 fget.__name__ = name
~/opt/anaconda3/lib/python3.7/site-packages/pandas/core/arrays/datetimelike.py in freq(self, value)
925 if value is not None:
926 value = frequencies.to_offset(value)
--> 927 self._validate_frequency(self, value)
928
929 self._freq = value
~/opt/anaconda3/lib/python3.7/site-packages/pandas/core/arrays/datetimelike.py in _validate_frequency(cls, index, freq, **kwargs)
1001 # message.
1002 raise ValueError(
-> 1003 f"Inferred frequency {inferred} from passed values "
1004 f"does not conform to passed frequency {freq.freqstr}"
1005 )
ValueError: Inferred frequency None from passed values does not conform to passed frequency MS
I looked for similar cases and found this one: pandas.DatetimeIndex frequency is None and can't be set
I tried the approach from that question but got the same error. Could anyone tell me why?
The data is in this repository: https://github.com/jordi-crespo/stack-questions
Upvotes: 0
Views: 4894
Reputation: 30679
The index has no frequency because it contains duplicate values: a regular frequency such as 'MS' requires each timestamp to appear exactly once, at evenly spaced steps. So I think the only way to set a frequency on such an index is to aggregate the data somehow, e.g.
>>> df.resample('MS').mean().index
DatetimeIndex(['2017-01-01', '2017-02-01', '2017-03-01', '2017-04-01',
'2017-05-01', '2017-06-01', '2017-07-01', '2017-08-01',
'2017-09-01', '2017-10-01', '2017-11-01', '2017-12-01',
'2018-01-01', '2018-02-01', '2018-03-01', '2018-04-01',
'2018-05-01', '2018-06-01', '2018-07-01', '2018-08-01',
'2018-09-01', '2018-10-01', '2018-11-01', '2018-12-01',
'2019-01-01'],
dtype='datetime64[ns]', name='month', freq='MS')
This gives you an index with the desired frequency, but I'm not sure whether aggregating is what you really want.
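To check this on your own data, here is a minimal sketch of the diagnosis and the aggregation step. The small frame below is a made-up stand-in for data_stack.csv (the column name `value` and the numbers are assumptions, not taken from the repository); only the shape of the problem, repeated and unsorted month-start dates, is meant to match.

```python
import pandas as pd

# Hypothetical stand-in for data_stack.csv: month-start dates that are
# unsorted and repeated, as in the index shown in the question.
df = pd.DataFrame(
    {"value": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]},
    index=pd.DatetimeIndex(
        ["2018-03-01", "2018-02-01", "2018-01-01",
         "2018-03-01", "2018-02-01", "2018-01-01"],
        name="month",
    ),
)

# Diagnose: does the index contain repeated timestamps?
has_duplicates = not df.index.is_unique
n_duplicates = int(df.index.duplicated().sum())

# Assigning a frequency directly fails on such an index.
try:
    df.index.freq = "MS"
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Aggregating to one row per month yields a regular month-start index.
# mean() is only one possible choice of aggregation.
monthly = df.resample("MS").mean()

print(has_duplicates, n_duplicates)
print(error_message)
print(monthly.index)
```

Whether `mean()` is the right aggregation depends on what the rows represent. If each row is a separate observation for the same month, `sum()` or a `groupby` on another key column may be closer to what you need.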
Upvotes: 1