doctorer

Reputation: 1772

Convert pandas DataFrame column to np.datetime64

I want to add an np.datetime64 column to a pandas DataFrame and use it as the index. The DataFrame was read from a .csv file containing columns for year, month, day, hour and minute. I have combined the separate columns to make a column of datetime strings.

import numpy as np
import pandas as pd

filename = 'test.csv'
# read the five date/time columns as strings so the zero padding is kept
df = pd.read_csv(filename, header=0, usecols=[2, 3, 4, 5, 6],
                 names=['y', 'm', 'd', 'h', 'min'],
                 dtype={'y': 'str', 'm': 'str', 'd': 'str', 'h': 'str', 'min': 'str'})
# combine the parts into one datetime string per row
df['datetimetext'] = (df['y'] + '-' + df['m'] + '-' + df['d'] + ' '
                      + df['h'] + ':' + df['min'] + ':00')

So the dataframe looks like this:

      y   m   d   h min         datetimetext
0  1993  09  06  00  30  1993-09-06 00:30:00
1  1993  09  06  01  00  1993-09-06 01:00:00
2  1993  09  06  01  30  1993-09-06 01:30:00
3  1993  09  06  02  00  1993-09-06 02:00:00
4  1993  09  06  02  30  1993-09-06 02:30:00
...

Now I want to add a column with the datetime stored as np.datetime64.

I want to write:

df['datetime'] = np.datetime64(df['datetimetext'])

but that raises an error:

ValueError: Could not convert object to NumPy datetime

Do I need to iterate through each row of the dataframe, or is there a more elegant solution?

Upvotes: 4

Views: 9917

Answers (1)

piRSquared

Reputation: 294218

The easiest way with what you have is:

df['datetime'] = pd.to_datetime(df['datetimetext'])

See the pandas documentation for pd.to_datetime.
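
As a minimal, self-contained sketch of the conversion above: np.datetime64 constructs a single scalar value, whereas pd.to_datetime converts a whole column in one call, so no row-by-row loop is needed. The sample strings are copied from the question; the explicit format argument and the set_index call are additions for illustration, not part of the original one-liner.

```python
import pandas as pd

# Sample rows copied from the question's datetimetext column
df = pd.DataFrame({'datetimetext': ['1993-09-06 00:30:00',
                                    '1993-09-06 01:00:00',
                                    '1993-09-06 01:30:00']})

# Vectorised conversion of the whole column; format is optional here,
# but stating it avoids relying on format inference
df['datetime'] = pd.to_datetime(df['datetimetext'],
                                format='%Y-%m-%d %H:%M:%S')

# The resulting column has a NumPy datetime64-based dtype
print(df['datetime'].dtype)

# Use the new column as the index, as the question asks
df = df.set_index('datetime')
print(df.index)
```

After this, df.index is a DatetimeIndex, so label-based time selection such as df.loc['1993-09-06 01:00:00'] should work.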

However, if your columns are named appropriately, pd.to_datetime can assemble the datetimes directly from the component columns, without building the string first. I renamed your columns using rename:

m = dict(y='year', m='month', d='day', h='h', min='m')
# rename columns and get rid of datetimetext
df = df[['y', 'm', 'd', 'h', 'min']].rename(columns=m)
df


Next, I'll perform the conversion and assign the result to the index in one step:

df.index = pd.to_datetime(df)
df

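
For reference, here is a self-contained sketch of the same column-assembly approach. It uses the key names listed in the pandas documentation (year, month, day, hour, minute) rather than the abbreviations above. The sample values are copied from the question; the names and parts variables and the astype(int) step are assumptions added here to keep the input unambiguous.

```python
import pandas as pd

# Sample rows copied from the question (all columns read as strings)
df = pd.DataFrame({'y': ['1993', '1993', '1993'],
                   'm': ['09', '09', '09'],
                   'd': ['06', '06', '06'],
                   'h': ['00', '01', '01'],
                   'min': ['30', '00', '30']})

# Map the question's column names to the documented component names
names = {'y': 'year', 'm': 'month', 'd': 'day', 'h': 'hour', 'min': 'minute'}

# Rename and convert the zero-padded strings to integers
parts = df.rename(columns=names).astype(int)

# Assemble one timestamp per row and use the result as the index
df.index = pd.to_datetime(parts)
print(df.index)
```

The original columns stay in df unchanged; only the index is replaced.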

Upvotes: 6
