DevLeb2022
DevLeb2022

Reputation: 685

how to convert column dtypes for DateTime and Time in pandas

I have a dataframe that includes columns with time and Date as objects dtypes

I want to convert the columns to Time and DateTime dtypes

When i try to convert it display the below error for the Date:

TypeError: 'Series' object is not callable

code:

import pandas as pd

df=pd.DataFrame( {
    'time': ['22:00:00', 'nan', '00:45:00', 'nan'],
    'date': ['2020-09-30T00:00:00+03:00','0001-01-01T00:00:00+02:22','2020-10-01T00:00:00+03:00','0001-01-01T00:00:00+02:22']
     })            
df["date"]=df["date"].fillna("00/00/0000")
df["date"] = pd.to_datetime(df["date"],errors='coerce',format='%d%m%y')
                     
df["time"]=df["time"].fillna("00:00:00")
df["time"] = pd.to_timedelta(df["time"].str.strip())
        
print(df.info())

How to fix this error??

Upvotes: 1

Views: 2685

Answers (2)

Utsav
Utsav

Reputation: 5918

I'm not 100% sure what you are looking for.

To answer the question:
I want to convert the columns to Time and DateTime dtypes -- There is nothing like Time datatype in pandas.

Closest we can get is Timedelta or datetime object.

One option is below to get the time object

filled the time as per the requirement.

df['date'] = pd.to_datetime(df.date, errors='coerce')
df['time'] = pd.to_datetime(df.time).dt.time.ffill()

Output

    time        date
0   22:00:00    2020-09-30 00:00:00+03:00
1   22:00:00    NaT
2   00:45:00    2020-10-01 00:00:00+03:00
3   00:45:00    NaT

Second Option is Timedelta

This is little bit annoying as it add 0 days at the start.
To extract only the time from it we will have to convert it to string.(Not sure if that is the requirement)

df['time'] = pd.to_timedelta(df.time).ffill().astype(str).str.split().str[-1]

Output

    time        date
0   22:00:00    2020-09-30 00:00:00+03:00
1   22:00:00    NaT
2   00:45:00    2020-10-01 00:00:00+03:00
3   00:45:00    NaT

Upvotes: 0

Matteo Zanoni
Matteo Zanoni

Reputation: 4152

There are a few problems with the code:

  • when you create the dataframe the date column has one less value than the time column
  • the fillna method would just replace Null (or nan) values not empty strings
  • "00/00/0000" is not a valid date for python's date time
  • your date format is slightly off, you are missing the / separating the fields and I believe that your dates are with month before day (see "09/30/2020")

Here a proposal that fixes those errors:

import pandas as pd

df = pd.DataFrame(
    {
        "time": ["10:00:00 PM", None, "12:45:00 AM", None, "1:00:00 AM"],
        "date": ["9/30/2020", "10/1/2020", None, "10/1/2020", None],
    }
)
df["date"] = df["date"].fillna("01/01/1970")
df["date"] = pd.to_datetime(df["date"], errors="coerce", format="%m/%d/%Y")

df["time"] = df["time"].fillna("00:00:00")
df["time"] = pd.to_timedelta(df["time"].str.strip())

print(df.info())

This code runs and the final df has no null values.

Upvotes: 1

Related Questions