Reputation: 11895
I have a Pandas DataFrame as below.
df
A B
date_time
2014-07-01 06:03:59.614000 62.1250 NaN
2014-07-01 06:03:59.692000 62.2500 NaN
2014-07-01 06:13:34.524000 62.2500 241.0625
2014-07-01 06:13:34.602000 62.2500 241.5000
2014-07-01 06:15:05.399000 62.2500 241.3750
2014-07-01 06:15:05.399000 62.2500 241.2500
2014-07-01 06:15:42.004000 62.2375 241.2500
2014-07-01 06:15:42.082000 62.2375 241.3750
2014-07-01 06:15:42.082000 62.2375 240.2500
I want to convert this to regular 1-minute intervals, but I get the error below:
new = df.asfreq('1Min')
>>error: cannot reindex from a duplicate axis
Now, I understand why this is happening. Since my time granularity is high (in milliseconds) but irregular, I get multiple readings per minute, even per second. So I tried to combine these millisecond readings to minutes and get rid of duplicates as below.
# try to convert the index to minutes and drop duplicates
df['index'] = df.index
df['minute_index']= df['index'].apply( lambda x: x.strftime('%Y-%m-%d %H:%M'))
df.drop_duplicates(subset='minute_index', keep='last', inplace=True)
df_by_minute = df.set_index('minute_index')
df_by_minute
A B index
minute_index
2014-07-01 06:03 62.2500 NaN 2014-07-01 06:03:59.692000
2014-07-01 06:13 62.2500 241.50 2014-07-01 06:13:34.602000
2014-07-01 06:15 62.2375 240.25 2014-07-01 06:15:42.082000
# now change the frequency to 1 minute but I just get NaNs (!)
df_by_minute.asfreq('1Min')
A B index
2014-07-01 06:03:00 NaN NaN NaT
2014-07-01 06:04:00 NaN NaN NaT
2014-07-01 06:05:00 NaN NaN NaT
2014-07-01 06:06:00 NaN NaN NaT
2014-07-01 06:07:00 NaN NaN NaT
2014-07-01 06:08:00 NaN NaN NaT
2014-07-01 06:09:00 NaN NaN NaT
2014-07-01 06:10:00 NaN NaN NaT
2014-07-01 06:11:00 NaN NaN NaT
2014-07-01 06:12:00 NaN NaN NaT
2014-07-01 06:13:00 NaN NaN NaT
2014-07-01 06:14:00 NaN NaN NaT
2014-07-01 06:15:00 NaN NaN NaT
As you can see, this does not work. Can someone help? What I am trying to achieve is a function that returns A or B as of a given DateTime, where DateTime is in 1Min increments.
Upvotes: 8
Views: 9698
Reputation: 398
To change the time frequency of a Pandas DataFrame, use the resample() method. It can convert the frequency from seconds to minutes, hours, days, years, etc. It works with a DatetimeIndex, and also with a datetime column (passed through the on parameter).
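As a minimal sketch of both cases, assuming a small hypothetical frame modelled on the question's data (the name raw and the three rows are made up for illustration), resampling on a datetime column and on a DatetimeIndex could look like this:

```python
import pandas as pd

# Hypothetical sample modelled on the question's data.
raw = pd.DataFrame({
    "date_time": pd.to_datetime([
        "2014-07-01 06:03:59.614",
        "2014-07-01 06:03:59.692",
        "2014-07-01 06:13:34.524",
    ]),
    "A": [62.125, 62.25, 62.25],
})

# Case 1: the timestamps live in an ordinary column, so pass it via on=.
by_column = raw.resample("1min", on="date_time")["A"].last()

# Case 2: the timestamps are the index (a DatetimeIndex).
by_index = raw.set_index("date_time")["A"].resample("1min").last()

print(by_column)
```

Both calls bin the readings into 1-minute buckets and keep the last reading in each bucket; minutes with no readings come out as NaN.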
Upvotes: 1
Reputation: 1485
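I think resample, not asfreq, fits your needs:
new = df.resample('1min').mean()
Instead of .mean(), you can also use .last() or .first(). (In older pandas versions this was written as df.resample('T', how='mean'); the how argument has since been removed.)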
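To illustrate on the data from the question, here is a sketch that rebuilds the frame by hand, keeps the last reading in each minute, and forward-fills the empty minutes. The forward fill is an assumption about what "as of" should mean here, not something the question confirms.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame, including the duplicated timestamps.
idx = pd.to_datetime([
    "2014-07-01 06:03:59.614", "2014-07-01 06:03:59.692",
    "2014-07-01 06:13:34.524", "2014-07-01 06:13:34.602",
    "2014-07-01 06:15:05.399", "2014-07-01 06:15:05.399",
    "2014-07-01 06:15:42.004", "2014-07-01 06:15:42.082",
    "2014-07-01 06:15:42.082",
])
df = pd.DataFrame({
    "A": [62.125, 62.25, 62.25, 62.25, 62.25, 62.25,
          62.2375, 62.2375, 62.2375],
    "B": [np.nan, np.nan, 241.0625, 241.5, 241.375, 241.25,
          241.25, 241.375, 240.25],
}, index=idx)
df.index.name = "date_time"

# resample copes with duplicate timestamps; asfreq does not.
# last() keeps the final non-NaN reading per minute,
# ffill() carries it forward over minutes without readings.
per_minute = df.resample("1min").last().ffill()
print(per_minute)
```

Note that each row is labelled with the start of its minute, so the 06:13 row reflects readings taken during 06:13, not only those before 06:13:00.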
Upvotes: 6