Daniel Arges

Reputation: 365

Manipulate/copy dataframe (pandas) and maintain only the last day of each month

I have the following dataframe dt:

           date  USDBRL
0    2000-01-03  1.8011
1    2000-01-04  1.8337
2    2000-01-05  1.8544
3    2000-01-06  1.8461
4    2000-01-07  1.8281
        ...     ...
5212 2020-10-01  5.6441
5213 2020-10-02  5.6464
5214 2020-10-05  5.6299
5215 2020-10-06  5.5205
5216 2020-10-07  5.6018

How can I filter dt, or create a new dataframe from it, so that it contains only the rows for the last day of each month?

Upvotes: 3

Views: 90

Answers (2)

Anurag Reddy

Reputation: 1215

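As an alternative, you can build a list of month-end dates with pd.date_range and keep only the rows whose date is in that list. Note that freq='M' (spelled 'ME' from pandas 2.2 on) generates calendar month-ends, so a month whose last calendar day is missing from your data (a weekend or holiday, say) will not be matched.

dates = pd.to_datetime(dt['date'])
required_datelist = pd.date_range(start=dates.min(), end=dates.max(), freq='M')
output = dt[dates.isin(required_datelist)]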
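
To see where matching against calendar month-ends can fall short, here is a small sketch. The frame below is a made-up sample (the USDBRL values are placeholders, not real quotes), and I pass pd.offsets.MonthEnd() instead of a frequency string only to sidestep the 'M'/'ME' alias difference between pandas versions.

```python
import pandas as pd

# Hypothetical sample: the last two trading days of three months.
# The USDBRL values are illustrative placeholders.
dt = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-01-30", "2020-01-31",
        "2020-02-27", "2020-02-28",
        "2020-03-30", "2020-03-31",
    ]),
    "USDBRL": [4.25, 4.28, 4.47, 4.50, 5.19, 5.20],
})

# Calendar month-ends spanning the data: 2020-01-31, 2020-02-29, 2020-03-31.
month_ends = pd.date_range(
    dt["date"].min(), dt["date"].max(), freq=pd.offsets.MonthEnd()
)

# Keep only rows whose date is a calendar month-end.
output = dt[dt["date"].isin(month_ends)]
print(output)
```

February is missing from the result: 2020-02-29 was a Saturday, so the last row the data has for that month is 2020-02-28, which is not in month_ends. If the goal is "last available day per month", grouping by month is the safer route.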

Upvotes: 0

Quang Hoang

Reputation: 150755

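You can use the .dt.to_period('M') accessor to get the month of each row, then duplicated(keep='last') to flag every row except the last one in each month, and finally negate that mask for boolean indexing:

months = pd.to_datetime(dt['date']).dt.to_period('M')

# duplicated(keep='last') is False only for the last row of each month
out = dt.loc[~months.duplicated(keep='last')]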
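
Here is a minimal, self-contained run of the duplicated idea on a made-up sample (the USDBRL values are placeholders). It assumes the rows are already sorted by date. Note the ~ in front of the mask: duplicated(keep='last') marks every row except the last one of each month, so the mask has to be inverted to keep those last rows.

```python
import pandas as pd

# Hypothetical sample, sorted by date; values are illustrative placeholders.
dt = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-01-30", "2020-01-31",
        "2020-02-27", "2020-02-28",
        "2020-03-30", "2020-03-31",
    ]),
    "USDBRL": [4.25, 4.28, 4.47, 4.50, 5.19, 5.20],
})

# Month of each row as a Period (2020-01, 2020-02, ...).
months = dt["date"].dt.to_period("M")

# True for the last row seen in each month, False for the earlier ones.
is_last_in_month = ~months.duplicated(keep="last")

out = dt.loc[is_last_in_month]
print(out)  # rows 1, 3 and 5: 2020-01-31, 2020-02-28, 2020-03-31
```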

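Another approach is groupby().idxmax(), which picks the index label of the latest date within each month. This is a bit slower, but safer if your data is not sorted by date (it assumes the date column is already datetime):

out = dt.loc[dt.groupby(months)['date'].idxmax()]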
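
A quick sketch of the groupby variant on deliberately shuffled rows, to show that it does not rely on the order of the data. The sample frame and its USDBRL values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample with rows out of date order; values are placeholders.
dt = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-02-28", "2020-01-30", "2020-03-31",
        "2020-01-31", "2020-02-27", "2020-03-30",
    ]),
    "USDBRL": [4.50, 4.25, 5.20, 4.28, 4.47, 5.19],
})

months = dt["date"].dt.to_period("M")

# For each month, idxmax returns the index label of the row with the latest date.
last_idx = dt.groupby(months)["date"].idxmax()

out = dt.loc[last_idx]
print(out)  # rows 3, 0 and 2: 2020-01-31, 2020-02-28, 2020-03-31
```

The result comes back ordered by month because groupby sorts its keys by default; add .sort_index() if you would rather keep the original row order.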

Upvotes: 2
