Filter DatetimeIndex vector to a certain frequency with pandas

Question

I'm using findatapy to acquire FX rates from DukasCopy, and the package works great. This is the script I'm using:

from findatapy.market import Market, MarketDataRequest, MarketDataGenerator

market = Market(market_data_generator=MarketDataGenerator())

md_request = MarketDataRequest(start_date='01 Feb 2017', finish_date='03 Feb 2017', category='fx', fields=['bid', 'ask'], freq='tick', data_source='dukascopy', tickers=['EURUSD'])

df = market.fetch_market(md_request)

print(df)
print(len(df))
print(df.index)
print(len(df.index))

I'm only interested in the points that have an hourly frequency (00:00:00, 01:00:00, 02:00:00 and so on). This means that after filtering, I should only get 24 points per day.

This is the output I get.

df:

df.index:

What I'd like to do now, but I'm completely clueless about how to do it, is to filter the index using an hourly frequency and then select the corresponding points.

I think that what I should do is to create an array with Pandas that has the desired index and use that to slice my main array, but how can I do that? Can pandas.date_range help me create this 'desired' array? Or is there a much simpler way of doing this?

Thanks for your time.

cs95 · Accepted Answer

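You can just group the rows by hour with a pd.Grouper object (the replacement for pd.TimeGrouper, which has been removed from pandas) and extract the first row of each hour group, something like this:

import pandas as pd

df = df.groupby(pd.Grouper(freq='1h')).head(1)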
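To see what the hourly grouping does without downloading anything, here is a minimal sketch on made-up data. The synthetic ticks, the 7-second spacing and the column names `bid` and `ask` are assumptions for illustration, not actual findatapy output. It also shows `resample(...).first()` as a possible alternative when you want the result labelled with the exact hour (00:00:00, 01:00:00, ...) rather than with the timestamp of the first tick in that hour.

```python
import numpy as np
import pandas as pd

# Synthetic ticks every 7 seconds covering 1-2 Feb 2017
# (made-up data standing in for the downloaded tick DataFrame).
index = pd.date_range("2017-02-01 00:00:03", periods=24685, freq="7s")
rng = np.random.default_rng(0)
bid = 1.08 + rng.normal(0, 1e-5, len(index)).cumsum()
ticks = pd.DataFrame({"bid": bid, "ask": bid + 0.0001}, index=index)

# First tick observed in each hourly bin; the original tick timestamps are kept.
first_per_hour = ticks.groupby(pd.Grouper(freq="1h")).head(1)

# Alternative: same values, but each row is labelled with the start of its hour.
on_the_hour = ticks.resample("1h").first()

print(len(first_per_hour), len(on_the_hour))
print(first_per_hour.index[:3])
print(on_the_hour.index[:3])
```

Both results hold one row per hour that contains at least one tick. Real tick data rarely has a quote at exactly hh:00:00, which is why taking the first tick of each hour is usually more practical than matching against an exact `pd.date_range` index.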
