Reputation: 1070
In the flowing data frame in Pandas, I want to extract columns corresponding dates between '03/01'
and '06/01'
. I don't want to use the index at all, as my input would be a start and end dates. How could I do so ?
A B
0 01/01 56
1 02/01 54
2 03/01 66
3 04/01 77
4 05/01 66
5 06/01 72
6 07/01 132
7 08/01 127
Upvotes: 0
Views: 472
Reputation: 3677
First create a list of the dates you need using daterange. I'm adding the year 2000 since you need to supply a year for this to work, im then cutting it off to get the desired strings. In real life you might want to pay attention to the actual year due to things like leap-days.
date_start = '03/01'
date_end = '06/01'
dates = [x.strftime('%m/%d') for x in pd.date_range('2000/{}'.format(date_start),
'2000/{}'.format(date_end), freq='D')]
dates is now equal to:
['03/01',
'03/02',
'03/03',
'03/04',
.....
'05/29',
'05/30',
'05/31',
'06/01']
Then simply use the isin argument and you are done
df = df.loc[df.A.isin(dates)]
df
If your columns is a datetime column I guess you can skip the strftime part in th list comprehension to get the right result.
Upvotes: 2
Reputation: 351
You are welcome to use boolean masking, i.e.:
df[(df.A >= start_date) && (df.A <= end_date)]
Inside the bracket is a boolean array of True
and False
. Only rows that fulfill your given condition (evaluates to True
) will be returned. This is a great tool to have and it works well with pandas and numpy.
Upvotes: 1