Alex Kinman

Reputation: 2605

Pandas giving me the wrong max date in a date time column?

I have a dataframe with a date column:

data['Date']

0        1/1/14
1        1/8/14
2       1/15/14
3       1/22/14
4       1/29/14
         ...   
255    11/21/18
256    11/28/18
257     12/5/18
258    12/12/18
259    12/19/18

But when I try to get the max date from that column, I get:

test_data.Date.max()

'9/9/15'

Any idea why this would happen?

Upvotes: 1

Views: 2637

Answers (3)

Ethan King

Reputation: 151

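Your dates are probably stored as strings, so max() compares them character by character rather than chronologically, which is why '9/9/15' comes out on top. Convert the column from string to datetime first; then max() should work.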

test = pd.DataFrame(['1/1/2010', '2/1/2011', '3/4/2020'], columns=['Dates'])

      Dates
0  1/1/2010
1  2/1/2011
2  3/4/2020

pd.to_datetime(test['Dates'], format='%m/%d/%Y').max()
Timestamp('2020-03-04 00:00:00')

If you only want the date part without the time, use .dt.date:

pd.to_datetime(test['Dates'], format='%m/%d/%Y').dt.date.max()
datetime.date(2020, 3, 4)

References: the format code table in the Python docs, and to_datetime in the pandas docs.
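
To see why the string column returns '9/9/15', here is a minimal sketch using a small made-up sample in the question's M/D/YY style (the values and variable names are assumptions for illustration, not the asker's data). It compares max() on the raw strings with max() on the parsed dates:

```python
import pandas as pd

# Hypothetical sample mirroring the question's M/D/YY strings.
dates = pd.Series(['1/1/14', '9/9/15', '11/21/18', '12/19/18'])

# As strings, max() compares character by character, so '9...' beats '1...'.
string_max = dates.max()

# After parsing, max() compares actual points in time.
parsed_max = pd.to_datetime(dates, format='%m/%d/%y').max()

print(string_max)  # 9/9/15
print(parsed_max)  # 2018-12-19 00:00:00
```

Note the lowercase %y here, since the question's years have two digits; the uppercase %Y used above is for four-digit years.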

Upvotes: 0

Isa

Reputation: 1

.max() treats the values as dates (which is what you want) only if they are datetime objects. Building upon Seshadri's response, try:

type(data['Date'][1])

If it is a datetime object, this returns:

pandas._libs.tslibs.timestamps.Timestamp

If not, you can convert that column to datetime like so:

data['Date'] = pd.to_datetime(data['Date'], format='%m/%d/%y')

The format argument tells pandas exactly how to parse the strings; here %y matches the two-digit year. See the full list of format codes in the Python docs.
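
As an alternative to inspecting a single element, you could check the whole column at once. This is a rough sketch on a made-up three-row frame (the sample values are assumptions); it uses pandas.api.types.is_datetime64_any_dtype before and after the conversion:

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Hypothetical stand-in for the question's dataframe.
data = pd.DataFrame({'Date': ['1/1/14', '1/8/14', '12/19/18']})

# Before conversion the column holds strings, not datetimes.
before = is_datetime64_any_dtype(data['Date'])

# Parse with an explicit format: month/day/two-digit year.
data['Date'] = pd.to_datetime(data['Date'], format='%m/%d/%y')
after = is_datetime64_any_dtype(data['Date'])

print(before, after)       # False True
print(data['Date'].max())  # 2018-12-19 00:00:00
```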

Upvotes: 0

Celius Stingher

Reputation: 18367

Clearly the column is of type object, i.e. it holds strings. Convert it with pd.to_datetime() and then call max():

data['Date'] = pd.to_datetime(data['Date'], errors='coerce')  # you might need to pass format
print(data['Date'].max())
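
One thing to keep in mind with errors='coerce' is that values pandas cannot parse become NaT silently, and max() skips them. A small sketch, assuming a made-up column with one bad entry (the sample values and the n_bad name are illustrative only), shows how you might count those before trusting the result:

```python
import pandas as pd

# Hypothetical column with one value that is not a valid date.
data = pd.DataFrame({'Date': ['1/1/14', 'not a date', '12/19/18']})

# Unparseable entries are turned into NaT instead of raising an error.
parsed = pd.to_datetime(data['Date'], format='%m/%d/%y', errors='coerce')

# Count how many rows failed to parse.
n_bad = int(parsed.isna().sum())

print(n_bad)         # 1
print(parsed.max())  # 2018-12-19 00:00:00
```

If n_bad is larger than expected, the format string probably does not match the data.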

Upvotes: 3
