Reputation: 1543
I have a dataframe of eurusd 5-minute data and I have another dataframe of Non-farms payroll data. I want to create a new dataframe which has all of the rows of the eurusd 5-min data for the NFP dates. See below:
#NFP Days (Dataframe #1)
nfp_dates = forex_calendar.loc[forex_calendar['Event'] == '(United States) Nonfarm Payrolls']
#Import csv files into a Pandas dataframes and convert to Pandas datetime and set to index (Dataframe #2)
eurusd_ask = pd.read_csv('EURUSD_Candlestick_5_m_ASK_01.01.2012-05.08.2017.csv')
eurusd_ask.index = pd.to_datetime(eurusd_ask.pop('Gmt time'))
Desired Dataframe (Dataframe #3) I want a new dataframe called "eurusd_ask_nfp" that would have all of the rows of 5-minute data (entire day) from eurusd_ask on the NFP dates (i.e., 2012-01-06, 2012-02-03, 2012-03-09, etc.)
Open High Low Close Volume
Gmt time
2012-01-06 00:00:00 1.23627 1.23627 1.23532 1.23553 860.47
2012-01-06 00:05:00 1.23553 1.23553 1.23509 1.23532 698.33
2012-01-06 00:10:00 1.23533 1.23535 1.23518 1.23533 338.22
2012-01-06 00:15:00 1.23533 1.23535 1.23511 1.23518 480.64
2012-01-06 00:20:00 1.23518 1.23525 1.23493 1.23508 551.73
2012-01-06 00:25:00 1.23509 1.23537 1.23501 1.23512 349.78
2012-01-06 00:30:00 1.23510 1.23520 1.23471 1.23481 659.66
... ... ... ... ... ...
2012-02-03 00:00:00 1.33271 1.33271 1.33210 1.33225 870.59
2012-02-03 00:05:00 1.33225 1.33248 1.33212 1.33227 943.04
2012-02-03 00:10:00 1.33226 1.33226 1.33189 1.33212 659.45
2012-02-03 00:15:00 1.33213 1.33272 1.33212 1.33268 788.32
2012-02-03 00:20:00 1.33265 1.33333 1.33264 1.33315 953.20
2012-02-03 00:25:00 1.33315 1.33322 1.33282 1.33286 678.37
2012-02-03 00:30:00 1.33286 1.33291 1.33260 1.33260 442.39
2012-02-03 00:35:00 1.33261 1.33293 1.33250 1.33268 621.53
2012-02-03 00:40:00 1.33268 1.33292 1.33257 1.33288 511.00
2012-02-03 00:45:00 1.33288 1.33300 1.33240 1.33240 554.76
2012-02-03 00:50:00 1.33240 1.33240 1.33191 1.33225 839.80
... ... ... ... ... ...
2012-03-09 00:00:00 1.25739 1.25748 1.25710 1.25711 439.46
2012-03-09 00:05:00 1.25712 1.25732 1.25709 1.25715 388.29
2012-03-09 00:10:00 1.25715 1.25733 1.25711 1.25726 291.65
2012-03-09 00:15:00 1.25729 1.25731 1.25685 1.25692 626.85
2012-03-09 00:20:00 1.25692 1.25696 1.25662 1.25663 394.68
2012-03-09 00:25:00 1.25662 1.25677 1.25648 1.25652 302.73
2012-03-09 00:30:00 1.25652 1.25665 1.25645 1.25646 457.76
2012-03-09 00:35:00 1.25646 1.25659 1.25629 1.25633 423.00
2012-03-09 00:40:00 1.25633 1.25649 1.25632 1.25641 173.08
2012-03-09 00:45:00 1.25642 1.25671 1.25630 1.25665 263.99
... ... ... ... ... ...
etc., etc.
Upvotes: 0
Views: 333
Reputation: 1543
I was able to figure this out in case someone wants the answer, here you go:
nfp_dates = forex_calendar.loc[forex_calendar['Event'] == '(United States) Nonfarm Payrolls']
nfp_dates = nfp_dates.reset_index()
nfp_dates = nfp_dates['Date'].dt.date
nfp_5min = eurusd_ask[np.isin(eurusd_ask.index.date, nfp_dates)]
This returned what I wanted
Gmt time Open High Low Close Volume
2012-06-01 00:00:00 1.27925 1.27977 1.27925 1.27965 919.94
2012-06-01 00:05:00 1.27966 1.27971 1.27939 1.27958 668.37
2012-06-01 00:10:00 1.27958 1.27984 1.27929 1.27930 732.95
2012-06-01 00:15:00 1.27930 1.27956 1.27913 1.27921 732.04
2012-06-01 00:20:00 1.27922 1.27960 1.27918 1.27942 552.55
2012-06-01 00:25:00 1.27940 1.27940 1.27920 1.27928 384.76
2012-06-01 00:30:00 1.27929 1.27983 1.27929 1.27957 682.91
2012-06-01 00:35:00 1.27957 1.27963 1.27921 1.27927 572.12
2012-06-01 00:40:00 1.27928 1.27938 1.27889 1.27901 794.94
2012-06-01 00:45:00 1.27901 1.27901 1.27878 1.27887 535.03
Upvotes: 0
Reputation: 402603
Taking a stab at this, but you should likely be able to use Index.isin
to query all records where the dates match.
idx = nfp_dates.index
eurusd_ask_nfp = eurusd_ask[eurusd_ask.index.isin(idx)]
Upvotes: 1