randombraziliandude
randombraziliandude

Reputation: 23

Pandas to_datetime Format does nothing- python

EDIT: Sorry guys, fixed the code and added more examples.

Trying to format dates from a xlsx file but nothing happens. Here's my code:

RAW DATA:

import pandas as pd
import numpy as np
import datetime as dt

df = pd.read_excel("Open_PO.xlsx")

df['Need By Date - Atual'].values

RAW DATA Output:

array(['2020-05-22T00:00:00.000000000', '2020-06-03T00:00:00.000000000',
       '2020-06-03T00:00:00.000000000', ...,
       '2022-10-28T00:00:00.000000000', '2022-10-28T00:00:00.000000000',
       '2022-12-22T00:00:00.000000000'], dtype='datetime64[ns]')

FORMATTING ATTEMPT:

df['Need By Date - Atual'] = pd.to_datetime(df["Need By Date - Atual"],format = '%d-%b-%y').dt.date

df['Need By Date - Atual'].values

OUTPUT:

array([datetime.date(2020, 5, 22), datetime.date(2020, 6, 3),
       datetime.date(2020, 6, 3), ..., datetime.date(2022, 10, 28),
       datetime.date(2022, 10, 28), datetime.date(2022, 12, 22)],
      dtype=object)

As you can see, the format paramater doesnt work.

Already tried adding dayfirst=True, astype function, and pretty much everything people posted in other questions.

Cant use the date in str format because it messes up the sort_values function I use later in the code.

What can I do??

Upvotes: 1

Views: 318

Answers (1)

Whole Brain
Whole Brain

Reputation: 2167

Pandas Series are not numpy arrays

This line already did exactly what you wanted:

df['Need By Date - Atual'] = pd.to_datetime(df["Need By Date - Atual"],format = '%d-%b-%y').dt.date
# -> type pandas.Series

But this line shows you a numpy array from your pandas Series:

df['Need By Date - Atual'].values 
# -> type numpy.ndarray

Since Series and formatted datetimes are specific to pandas, the values will be casted into regular datetimes, known by numpy.

How to have the result as strings

This is probably what you wanted to achieve :

pd.to_datetime(dd[0]).dt.strftime("%d/%m/%Y").values

This time (no pun intedned), the visual format will be applied as strings, no as datetime.

How to format datetimes for excel

You probably want to use the argument date_format from pandas.ExcelWriter:

with ExcelWriter('path_to_file.xlsx', date_format='DD-MM-YY') as writer:
    df.to_excel(writer, sheet_name='some_date_sheet')

Source: pandas' ExcelWriter documentation

Upvotes: 2

Related Questions