Aquiles Páez

Reputation: 573

Filter DatetimeIndex vector to a certain frequency with pandas

I'm using findatapy to acquire FX rates from Dukascopy, and the package works great. This is the script I'm using:

from findatapy.market import Market, MarketDataRequest, MarketDataGenerator

market = Market(market_data_generator=MarketDataGenerator())

md_request = MarketDataRequest(start_date='01 Feb 2017', finish_date='03 Feb 2017', category='fx', fields=['bid', 'ask'], freq='tick', data_source='dukascopy', tickers=['EURUSD'])

df = market.fetch_market(md_request)

print(df)
print(len(df))
print(df.index)
print(len(df.index))

I'm only interested in the points that fall on the hour (00:00:00, 01:00:00, 02:00:00 and so on). This means that after filtering, I should get only 24 points per day.

Now, this is what I get as output for df and df.index (both were posted as screenshots, which are not reproduced here).

What I'd like to do now, but am completely clueless about how to do, is filter the index at an hourly frequency and then select the corresponding points.

I think that what I should do is to create an array with Pandas that has the desired index and use that to slice my main array, but how can I do that? Can pandas.date_range help me create this 'desired' array? Or is there a much simpler way of doing this?

Thanks for your time.

Upvotes: 1

Views: 340

Answers (1)

cs95

Reputation: 402553

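You can group the rows by hour with pd.Grouper and extract the first row of each hour group. (pd.TimeGrouper did the same job in older pandas versions, but it was deprecated and later removed; recent versions also prefer the lowercase alias '1h' over '1H'.) Note that this returns the first tick recorded in each hour, which may fall slightly after the exact hour mark. Something like this:

df = df.groupby(pd.Grouper(freq='1h')).head(1)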
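To see what the hourly grouping does without downloading anything, here is a minimal sketch on made-up tick data. The column names "bid" and "ask" and all the prices are hypothetical stand-ins, since the real findatapy column labels are not shown in the question. The sketch uses pd.Grouper with a pd.offsets.Hour() object, which I'm assuming is an acceptable substitute for the frequency string and avoids any dependence on alias spelling.

```python
import pandas as pd

# Hypothetical tick timestamps standing in for the Dukascopy data
index = pd.to_datetime([
    "2017-02-01 00:00:00.250",
    "2017-02-01 00:17:41.100",
    "2017-02-01 00:59:59.900",
    "2017-02-01 01:00:03.500",
    "2017-02-01 01:30:00.000",
    "2017-02-01 02:45:10.000",
])

# Hypothetical column names and prices, for illustration only
ticks = pd.DataFrame(
    {
        "bid": [1.0790, 1.0792, 1.0795, 1.0796, 1.0801, 1.0805],
        "ask": [1.0791, 1.0793, 1.0796, 1.0797, 1.0802, 1.0806],
    },
    index=index,
)

# Bucket the rows into one-hour bins on the DatetimeIndex and keep
# the first tick that falls inside each bin
hourly = ticks.groupby(pd.Grouper(freq=pd.offsets.Hour())).head(1)

print(hourly)
```

The result keeps the original tick timestamps (here one row each for hours 0, 1 and 2), so the rows are the first observations within each hour rather than values stamped exactly at 00:00:00, 01:00:00 and so on.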

Upvotes: 1
