Reputation: 1339
How to select multiple rows of a dataframe by list of dates
import numpy as np
import pandas as pd
dates = pd.date_range('20130101', periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
In[1]: df
Out[1]:
A B C D
2013-01-01 0.084393 -2.460860 -0.118468 0.543618
2013-01-02 -0.024358 -1.012406 -0.222457 1.906462
2013-01-03 -0.305999 -0.858261 0.320587 0.302837
2013-01-04 0.527321 0.425767 -0.994142 0.556027
2013-01-05 0.411410 -1.810460 -1.172034 -1.142847
2013-01-06 -0.969854 0.469045 -0.042532 0.699582
myDates = ["2013-01-02", "2013-01-04", "2013-01-06"]
I want to select the rows of df whose index dates appear in myDates, so the output should be:
A B C D
2013-01-02 -0.024358 -1.012406 -0.222457 1.906462
2013-01-04 0.527321 0.425767 -0.994142 0.556027
2013-01-06 -0.969854 0.469045 -0.042532 0.699582
Upvotes: 9
Views: 6656
Reputation: 1352
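If you have a time series whose index contains hours and minutes (e.g. 2022-03-07 09:03:00+00:00 instead of 2022-03-07) and you want to filter by date only, ignoring the time of day, you can use the following:
df.loc[np.isin(df.index.date, myDates)]
Here myDates must hold datetime.date objects, because df.index.date is an array of datetime.date values; a list of strings can be converted with pd.to_datetime(myDates).date.
df.loc[df.index.date.isin(myDates)] does not work: Python raises an AttributeError saying 'numpy.ndarray' object has no attribute 'isin', because df.index.date returns a NumPy array rather than a pandas Index. This is why np.isin is used instead.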
This is an old post but I think this can be useful to a lot of people (such as myself).
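A minimal sketch of this approach on intraday data. The frame df_intraday, its single column A, and the 12-hour spacing are made up for illustration; the strings in myDates are converted to datetime.date objects first so that they compare equal to the values in index.date.

```python
import numpy as np
import pandas as pd

# Hypothetical intraday data: one row every 12 hours, timezone-aware (UTC).
idx = pd.date_range("2013-01-01 09:03", periods=12,
                    freq=pd.Timedelta(hours=12), tz="UTC")
df_intraday = pd.DataFrame({"A": np.arange(12)}, index=idx)

myDates = ["2013-01-02", "2013-01-04", "2013-01-06"]

# index.date is a NumPy array of datetime.date objects, so the strings are
# converted to datetime.date objects before the membership test.
wanted = pd.to_datetime(myDates).date

# Keep every row whose calendar date is in the list, whatever its time of day.
selected = df_intraday.loc[np.isin(df_intraday.index.date, wanted)]
print(selected)
```

Each listed date matches two rows here, since the time of day is ignored.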
Upvotes: 0
Reputation: 32095
Convert your list into a DatetimeIndex:
df.loc[pd.to_datetime(myDates)]
A B C D
2013-01-02 -0.047710 -1.827593 -0.944548 -0.149460
2013-01-04 1.437924 0.126788 0.641870 0.198664
2013-01-06 0.408820 -1.842112 -0.287346 0.071397
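One caveat with label-based selection: .loc raises a KeyError if any requested date is missing from the index. The sketch below assumes a hypothetical list whose last date is not in df, and keeps only the dates that are present by intersecting with the index first.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20130101", periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))

# Hypothetical list in which the last date is absent from the index.
myDates = ["2013-01-02", "2013-01-04", "2013-01-09"]
wanted = pd.to_datetime(myDates)

# Selecting by label fails as soon as one label is missing.
try:
    df.loc[wanted]
    raised = False
except KeyError:
    raised = True

# Keep only the requested dates that actually exist in the index.
present = df.index.intersection(wanted)
subset = df.loc[present]
print(subset)
```

If missing dates should simply be skipped, the boolean-mask approach with isin is an alternative that never raises for absent labels.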
Upvotes: 4
Reputation: 214957
You can use the index.isin() method to create a logical (boolean) index for subsetting:
df[df.index.isin(myDates)]
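A self-contained sketch of the mask approach, using the frame from the question. Converting the strings with pd.to_datetime before calling isin is a choice made here so that the comparison does not depend on implicit string parsing; the names mask, selected and others are illustrative.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("20130101", periods=6)
df = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))
myDates = ["2013-01-02", "2013-01-04", "2013-01-06"]

# Boolean mask: True where the index value is one of the requested dates.
mask = df.index.isin(pd.to_datetime(myDates))

selected = df[mask]   # rows on the listed dates
others = df[~mask]    # every other row
print(selected)
```

Negating the mask with ~ gives the complementary selection, which label-based .loc does not offer directly.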
Upvotes: 9