dsilva

Reputation: 105

Display Python datetime without hours, minutes and seconds in Pandas

I have a date string and want to convert it to the date type:

I tried pd.to_datetime with the format I want, but the result still includes the time.

import pandas as pd

df = pd.DataFrame({
    'date': ['2010-12-30 23:57:10+00:00', '2010-12-30 23:52:41+00:00','2010-12-30 23:43:04+00:00','2010-12-30 23:37:30+00:00','2010-12-30 23:31:39+00:00'],
    'text' : ['El odontólogo Barreda, a un paso de quedar en …','Defederico es el nuevo refuerzo de Independien..','Israel: ex presidente Katzav declarado culpabl…'
        , 'FMI estima que la recuperación asimétrica de l…','¿Quién fue el campeón argentino del año? Votá …']
}) 

df["new date"] = pd.to_datetime(df["date"], format="%Y-%m-%d")

This is the output it returns:

2010-12-30 23:57:10+00:00

and I need to remove the time part:

23:57:10+00:00

Upvotes: 1

Views: 3293

Answers (2)

Mustafa Aydın

Reputation: 18315

Well, it's a datetime object, so it has to keep the time information. However, there is a Period datatype that might fit here: it represents a span of time instead of a point in time:

df["new date"] = pd.to_datetime(df["date"]).dt.to_period(freq="D")

which converts the values to daily periods:

>>> df["new date"]

0    2010-12-30
1    2010-12-30
2    2010-12-30
3    2010-12-30
4    2010-12-30
Name: new date, dtype: period[D]

Note that these are not strings, so you can still perform .dt-based operations on them.
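As a small sketch of that point, the snippet below builds a daily-period Series and reads date parts through .dt. The two sample strings and the variable names are made up for illustration. The warning filter is there because pandas may warn that converting timezone-aware values to periods drops the timezone.

```python
import warnings

import pandas as pd

# Hypothetical sample data in the same format as the question
dates = pd.Series(["2010-12-30 23:57:10+00:00", "2011-01-02 08:15:00+00:00"])

with warnings.catch_warnings():
    # Converting tz-aware timestamps to periods drops the timezone,
    # which pandas may report as a UserWarning
    warnings.simplefilter("ignore", UserWarning)
    periods = pd.to_datetime(dates).dt.to_period(freq="D")

print(periods.dtype)              # period[D]
print(periods.dt.year.tolist())   # [2010, 2011]
print(periods.dt.month.tolist())  # [12, 1]
print(periods.dt.day.tolist())    # [30, 2]
```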


If you do need the datetime type, though, you can .normalize() the timestamps: this sets them all to midnight, signaling that the time component is immaterial:

>>> df["new date"] = pd.to_datetime(df["date"]).dt.normalize()
>>> df["new date"]

0   2010-12-30 00:00:00+00:00
1   2010-12-30 00:00:00+00:00
2   2010-12-30 00:00:00+00:00
3   2010-12-30 00:00:00+00:00
4   2010-12-30 00:00:00+00:00
Name: new date, dtype: datetime64[ns, UTC]

Note that after normalization, pandas normally hides the all-zero time in the display when the original timestamps carry no timezone information (the part after the "+"). Yours do carry it, so the zeros appear in the output as well. To get rid of them in that case, chain .dt.tz_convert(tz=None), which removes the timezone information and with it the all-zero time from the output. The result is still of datetime type.
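To make the display difference concrete, here is a minimal sketch comparing a timezone-naive and a timezone-aware Series after normalization. The sample strings and variable names are assumptions for illustration, not taken from the question's DataFrame.

```python
import pandas as pd

# Same timestamps, once without and once with a UTC offset
naive = pd.to_datetime(pd.Series(["2010-12-30 23:57:10", "2010-12-30 23:52:41"]))
aware = pd.to_datetime(
    pd.Series(["2010-12-30 23:57:10+00:00", "2010-12-30 23:52:41+00:00"])
)

naive_midnight = naive.dt.normalize()          # time set to 00:00:00, no timezone
aware_midnight = aware.dt.normalize()          # time set to 00:00:00, still UTC
stripped = aware_midnight.dt.tz_convert(None)  # timezone information removed

print(naive_midnight)  # dates only
print(aware_midnight)  # zeros and the +00:00 offset are shown
print(stripped)        # dates only again
```

All three results keep a datetime dtype, so the .dt accessor remains available on each of them.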


Lastly, if this is only for display purposes, you can use .strftime to shape the values into the desired format:

>>> df["new date"] = pd.to_datetime(df["date"]).dt.strftime("%Y-%m-%d")
>>> df["new date"]

0    2010-12-30
1    2010-12-30
2    2010-12-30
3    2010-12-30
4    2010-12-30
Name: new date, dtype: object

As you can see, the datatype is "object", i.e., strings. This prevents datetime-based operations: unlike the first two alternatives, df["new date"].dt.month no longer works.
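The following sketch illustrates that limitation, assuming a single made-up timestamp: the formatted strings reject the .dt accessor, and parsing them again with pd.to_datetime brings it back. The names stamps, as_text and dt_failed are hypothetical.

```python
import pandas as pd

stamps = pd.to_datetime(pd.Series(["2010-12-30 23:57:10+00:00"]))
as_text = stamps.dt.strftime("%Y-%m-%d")  # plain strings from here on

print(as_text.tolist())  # ['2010-12-30']

dt_failed = False
try:
    as_text.dt.month  # .dt is only available for datetime-like values
except AttributeError as err:
    dt_failed = True
    print("no .dt on strings:", err)

# Parsing the strings again restores datetime behaviour
print(pd.to_datetime(as_text).dt.month.tolist())  # [12]
```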

Upvotes: 2

Corralien

Reputation: 120539

To keep a datetime dtype and its dt accessor, you can use dt.normalize() to reset the time part to midnight, then dt.tz_convert(None) to remove the timezone information:

df['new date'] = pd.to_datetime(df["date"]).dt.normalize().dt.tz_convert(None)

Output

>>> df['new date']
0   2010-12-30
1   2010-12-30
2   2010-12-30
3   2010-12-30
4   2010-12-30
Name: new date, dtype: datetime64[ns]

Upvotes: 0
