Pedro Braz

Reputation: 2401

Get the last date of each month in a list of dates in Python

I'm using Python 2.7, PyCharm and Anaconda.

I have a list of dates and I'd like to retrieve the last date of each month present in the array.

Are there any functions or libraries that could help me to do this?

I read the dates from a CSV file and stored them as datetime.

I have the following code:

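import datetime
import xlrd

# dates holds the Excel serial date numbers read from the CSV file;
# startDate and endDate are datetime objects defined elsewhere.
Dates = []
Dates1 = []
for date in dates:
    # Convert the Excel serial number (datemode 0 = 1900-based) to a tuple.
    temp = xlrd.xldate_as_tuple(int(date), 0)
    Dates1.append(datetime.datetime(temp[0], temp[1], temp[2]))

# Keep only the dates inside [startDate, endDate].
for date in Dates1:
    if startDate <= date <= endDate:
        Dates.append(date)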

To make it clear, suppose I have:

Dates = [2015-01-20, 2015-01-15, 2015-01-17, 2015-02-21, 2015-02-06] 

(Consider it being in datetime format.)

The list I'd like to retrieve is:

[2015-01-20, 2015-02-21]

So far I've searched around, especially on Stack Overflow, but I could only find answers on how to get the last calendar date of each month, not the last date of each month from a user-specified list.

Upvotes: 3

Views: 6986

Answers (3)

Emmanuel Wildiers

Reputation: 1

This can be done in one line with pandas (note that it groups by month number only, so it assumes all the dates fall in the same year):

import datetime
import pandas as pd

dates = [datetime.date(2023, 3, 21), datetime.date(2023, 3, 1), datetime.date(2023, 2, 22), datetime.date(2023, 2, 14)]
eom_dates = set(pd.Series(dates,index=dates).groupby(lambda x:x.month).max())

This results in:

{datetime.date(2023, 2, 22), datetime.date(2023, 3, 21)}
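
For comparison, here is a minimal pandas-free sketch of the same idea. It keys on (year, month) rather than on the month alone, so the same month in different years stays separate. The helper name last_date_per_month is made up for illustration.

```python
import datetime


def last_date_per_month(dates):
    """Return the latest date present for each (year, month), in chronological order."""
    latest = {}
    for d in dates:
        key = (d.year, d.month)
        # Keep the largest date seen so far for this year/month pair.
        if key not in latest or d > latest[key]:
            latest[key] = d
    return [latest[key] for key in sorted(latest)]


dates = [datetime.date(2023, 3, 21), datetime.date(2023, 3, 1),
         datetime.date(2023, 2, 22), datetime.date(2023, 2, 14)]
eom_dates = last_date_per_month(dates)
print(eom_dates)
```

Unlike the set above, this returns a list sorted by year and month.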

Upvotes: 0

leroyJr

Reputation: 1160

Pandas handles this task well. Load your CSV into a DataFrame, group by month, and take the maximum date in each group with an aggregate function:

import pandas as pd

# Load the file, parsing the date column as datetimes
df = pd.read_csv('/path/to/file/', parse_dates=['my_date_field'])
df.index = df['my_date_field']              # set the dataframe index with your date
# Group by month (alternatively use MS for Month Start).
# pd.TimeGrouper has been removed from pandas; pd.Grouper replaces it.
# Newer pandas versions spell the month-end frequency 'ME'.
dfg = df.groupby(pd.Grouper(freq='M'))

# Finally, find the max date in each month
dfg.agg({'my_date_field': 'max'})

# To specifically coerce the results of the groupby to a list
# (dropna() skips months that contain no dates):
dfg.agg({'my_date_field': 'max'})['my_date_field'].dropna().tolist()
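
If the dates are already in memory rather than in a file, a small sketch along the same lines might look like this. The sample values are the ones from the question, and the variable names raw, s and last_per_month are illustrative only. Grouping on a monthly period keeps different years apart and sidesteps the frequency-alias differences between pandas versions.

```python
import pandas as pd

# Sample dates taken from the question, as ISO strings.
raw = ['2015-01-20', '2015-01-15', '2015-01-17', '2015-02-21', '2015-02-06']
s = pd.Series(pd.to_datetime(raw))

# Group by year-month period and keep the largest timestamp in each group.
last_per_month = s.groupby(s.dt.to_period('M')).max().tolist()
print(last_per_month)
```

The result is a list of pandas Timestamp objects, one per month that actually occurs in the data.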

Upvotes: 5

Michael Laszlo

Reputation: 12239

For year y and month m, calendar.monthrange(y, m)[1] returns the day number of the last day of the month.
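
As a quick illustration, calendar.monthrange returns a pair: the weekday of the first day of the month (Monday is 0) and the number of days in the month. The variable names below are just for this example.

```python
import calendar

# monthrange returns (weekday of the 1st, number of days in the month).
first_weekday, days_in_feb_2015 = calendar.monthrange(2015, 2)
days_in_feb_2016 = calendar.monthrange(2016, 2)[1]

print(days_in_feb_2015)  # 28
print(days_in_feb_2016)  # 29, because 2016 is a leap year
```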

The following script takes a list of datetime objects called dates and builds a new list, month_last_dates, containing datetime objects for the last day of each month in which the members of dates fall.

import datetime
import calendar

tuples = [(2015, 8, 1), (2015, 9, 16), (2015, 10, 4)]
dates = [datetime.datetime(y, m, d) for y, m, d in tuples]

month_last_dates = len(dates) * [None]
for i, date in enumerate(dates):
  y, m, d = date.year, date.month, date.day
  last = calendar.monthrange(y, m)[1]
  print y, m, last  # Output for testing purposes.
  month_last_dates[i] = datetime.datetime(y, m, last)

Here is an equivalent script written more concisely with the help of a list comprehension:

import datetime
import calendar

tuples = [(2015, 8, 1), (2015, 9, 16), (2015, 10, 4)]
dates = [datetime.datetime(y, m, d) for y, m, d in tuples]

month_last_dates = [datetime.datetime(date.year, date.month,
      calendar.monthrange(date.year, date.month)[1]) for date in dates]

# Output for testing purposes.
for date in month_last_dates:
  print date.year, date.month, date.day

In your case, given the list Dates, you can make a new list like this:

last_dates = [datetime.datetime(date.year, date.month,
      calendar.monthrange(date.year, date.month)[1]) for date in Dates]
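
Note that this yields the calendar month end for every entry in Dates. If what you want instead is the latest date that actually occurs in Dates for each month, as in the question's example, one possible sketch is to sort the list and group it by (year, month). The sample values come from the question; the names year_month and last_present are illustrative.

```python
import datetime
from itertools import groupby

Dates = [datetime.datetime(2015, 1, 20), datetime.datetime(2015, 1, 15),
         datetime.datetime(2015, 1, 17), datetime.datetime(2015, 2, 21),
         datetime.datetime(2015, 2, 6)]


def year_month(d):
    """Grouping key: the (year, month) pair of a datetime."""
    return (d.year, d.month)


# groupby only merges adjacent items, so sort first. After sorting,
# the last element of each group is the latest date in that month.
last_present = [list(group)[-1]
                for _, group in groupby(sorted(Dates), key=year_month)]
print(last_present)
```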

Upvotes: 4
