Sergey Kazantsev

Reputation: 306

How to parse a date column as datetimes, not objects in Pandas?

I'd like to create a DataFrame from a CSV file with one datetime-typed column.

According to the article, this code should create the needed DataFrame:

df = pd.read_csv('data/data_3.csv', parse_dates=['date'])
df.info()
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column   Non-Null Count  Dtype         
---  ------   --------------  -----         
 0   date     3 non-null      datetime64[ns]
 1   product  3 non-null      object        
 2   price    3 non-null      int64         
dtypes: datetime64[ns](1), int64(1), object(1)
memory usage: 200.0+ bytes

But when I follow exactly the same steps, I get an object-typed date column:

df = pd.read_csv(path, parse_dates=['published_at']) 
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100000 entries, 0 to 99999
Data columns (total 6 columns):
 #   Column           Non-Null Count   Dtype  
---  ------           --------------   -----  
 0   name             100000 non-null  object 
 1   salary_from      48041 non-null   float64
 2   salary_to        53029 non-null   float64
 3   salary_currency  64733 non-null   object 
 4   area_name        100000 non-null  object 
 5   published_at     100000 non-null  object 
dtypes: float64(2), object(4)
memory usage: 4.6+ MB

I have tried several ways to parse the datetime column and still can't get a DataFrame with a datetime dtype. How can I parse the column as a datetime type (not object)?

Upvotes: 0

Views: 620

Answers (1)

johnjohn

Reputation: 892

When loading the csv, have you tried:

df = pd.read_csv(path, parse_dates=['published_at'], infer_datetime_format=True)

And/or when converting to datetime:

df['published_at'] = pd.to_datetime(df['published_at'], utc=True)
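
One possible explanation is that the column mixes timestamps with different UTC offsets. This is only an assumption, since the question does not show the raw data. In that case read_csv leaves the column as object, and converting with utc=True normalizes everything to a single timezone-aware datetime dtype. Below is a minimal sketch with made-up sample rows (the values and the in-memory CSV are hypothetical, not taken from the question):

```python
import io

import pandas as pd

# Hypothetical sample data: two timestamps with different UTC offsets.
csv_text = (
    "name,published_at\n"
    "a,2022-07-05T18:19:30+0300\n"
    "b,2022-07-05T18:19:30+0500\n"
)

# Read the column as plain strings first, then convert explicitly.
df = pd.read_csv(io.StringIO(csv_text))

# utc=True converts every value to UTC, so the column gets one
# timezone-aware datetime dtype instead of staying as object.
df["published_at"] = pd.to_datetime(df["published_at"], utc=True)

print(df["published_at"].dtype)
```

If this turns out to be the cause, the original local offsets are lost after the conversion, so keep a copy of the raw strings if you need them. Also note that infer_datetime_format is deprecated as of pandas 2.0, so on recent versions the to_datetime approach is likely the more useful of the two.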

Upvotes: 2
