frank

Reputation: 3598

Convert 6 digit int to yyyymm in pandas

I made a file that had three date columns:

df = pd.DataFrame({'yyyymm':[199501],'yyyy':[1995],'mm':[1],'Address':['AL1'],'Number':[12]})
    yyyymm  yyyy    mm  Address Number
0   199501  1995    1   AL1     12

and saved it as a file:

df.to_csv('complete.csv')

I read in the file with:

df=pd.read_csv('complete.csv')

and my 3 date columns are converted to ints, not dates.

I tried to convert them back to dates with:

df['yyyymm']=df['yyyymm'].astype(str).dt.strftime('%Y%m')
df['yyyy']=df['yyyy'].dt.strftime('%Y')
df['mm']=df['mm'].dt.strftime('%m')

with the error:

AttributeError: Can only use .dt accessor with datetimelike values

Very odd, as the command I used to make the datetime column was:

df['yyyymm']=df['col2'].dt.strftime('%Y%m')

Am I missing something? How can I convert the 6 digit column back to a yyyymm datetime, the 4 digit column to a yyyy datetime, and the mm column back to a datetime?

Upvotes: 2

Views: 1104

Answers (1)

willeM_ Van Onsem

Reputation: 476557

The columns yyyymm, yyyy and mm are read back as integers. Calling .astype(str) converts them to strings, but the .dt accessor only exists on datetime-like Series, so neither an integer column nor a string column has it.
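To see the failure in isolation, here is a minimal sketch. It assumes the CSV content can be simulated in memory with io.StringIO instead of the complete.csv file from the question; the csv_text and errors names are only illustrative.

```python
import io
import pandas as pd

# Simulate reading the saved file back (in-memory stand-in for complete.csv).
csv_text = "yyyymm,yyyy,mm,Address,Number\n199501,1995,1,AL1,12\n"
df = pd.read_csv(io.StringIO(csv_text))

# The three date-like columns come back as plain integers.
print(df.dtypes)

# .dt is only defined for datetime-like Series, so the integer column
# and its string version both raise AttributeError.
errors = []
for series in (df['yyyymm'], df['yyyymm'].astype(str)):
    try:
        series.dt.strftime('%Y%m')
    except AttributeError as exc:
        errors.append(str(exc))
        print(type(exc).__name__, exc)
```

Both attempts fail for the same reason: the values were never parsed into datetimes in the first place.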

You can use pd.to_datetime(..) [pandas-doc] to parse these into datetime values:

df['yyyymm'] = pd.to_datetime(df['yyyymm'].astype(str), format='%Y%m')

Indeed, this gives us:

>>> pd.to_datetime(df['yyyymm'].astype(str), format='%Y%m')
0   1995-01-01
Name: yyyymm, dtype: datetime64[ns]

The same can be done for the yyyy and mm columns:

>>> pd.to_datetime(df['yyyy'].astype(str), format='%Y')
0   1995-01-01
Name: yyyy, dtype: datetime64[ns]
>>> pd.to_datetime(df['mm'].astype(str), format='%m')
0   1900-01-01
Name: mm, dtype: datetime64[ns]
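Once a column holds real datetimes, .dt.strftime works again. The sketch below shows the round trip on a small frame rebuilt from the question's data. It also shows one possible way to combine the yyyy and mm columns into a single date, on the assumption that the first day of the month is an acceptable placeholder. The names dates, as_yyyymm, as_yyyy, as_mm and combined are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'yyyymm': [199501], 'yyyy': [1995], 'mm': [1]})

# Parse the six-digit integer into a datetime column.
dates = pd.to_datetime(df['yyyymm'].astype(str), format='%Y%m')

# Now .dt is available, and strftime turns the dates back into text.
as_yyyymm = dates.dt.strftime('%Y%m')
as_yyyy = dates.dt.strftime('%Y')
as_mm = dates.dt.strftime('%m')

# Alternative: build the date from the separate year and month columns.
# pd.to_datetime accepts a frame with 'year', 'month' and 'day' columns;
# day=1 is an assumed placeholder since the data has no day.
parts = df[['yyyy', 'mm']].rename(columns={'yyyy': 'year', 'mm': 'month'})
combined = pd.to_datetime(parts.assign(day=1))

print(as_yyyymm.iloc[0], as_yyyy.iloc[0], as_mm.iloc[0], combined.iloc[0])
```

Note that strftime returns strings, not datetimes, so the formatted columns will again lack .dt; keep the parsed column around if further date operations are needed.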

Upvotes: 1
