Josh Friedlander

Reputation: 11657

Test if Pandas column is datetime type

I'm trying to fillna per column with a suitable value. My goal is to find each column's type at the highest level of generality: at the moment, each column is either numeric (int/float), string, or pandas Timestamp. I understand that I can detect numeric or string columns using numpy.issubdtype and the type hierarchy, but I haven't found a way to detect Timestamp. My solution uses iloc[0] and isinstance, but is there something better? Here is my code, roughly:

import numpy as np
import pandas as pd

def fill_missing(df):
    for col in df:
        if np.issubdtype(df[col].dtype, np.number):
            df[col] = df[col].fillna(-1)
        # Inspects only the first value; this is the check I'd like to replace
        elif isinstance(df[col].iloc[0], pd.Timestamp):
            df[col] = df[col].fillna(pd.to_datetime('1900-01-01'))
        else:
            df[col] = df[col].fillna('NaN')
    return df

(Note that I can't use df.loc[0, col] because my index doesn't always contain 0.)

Upvotes: 4

Views: 7123

Answers (1)

bexi

Reputation: 1216

For me, np.issubdtype(df[col].dtype, np.datetime64) does what you want.

So taking everything together, we have:

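import numpy as np
import pandas as pd

def df_fillna(df):
    for col in df:
        if np.issubdtype(df[col].dtype, np.number):
            # Numeric columns (int/float)
            df[col] = df[col].fillna(-1)
        elif np.issubdtype(df[col].dtype, np.datetime64):
            # Datetime columns
            df[col] = df[col].fillna(pd.to_datetime('1900-01-01'))
        else:
            # Everything else is treated as a string column
            df[col] = df[col].fillna('NaN')
    # Return only after every column has been processed
    return df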

An example. Input:

df_test = pd.DataFrame()
df_test['dates'] = [pd.to_datetime("2009-7-23"), pd.to_datetime("2011-7-7"), pd.NaT]
df_test = df_fillna(df_test)

Output:

       dates
0 2009-07-23
1 2011-07-07
2 1900-01-01
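
As a possible alternative, pandas also ships dtype predicates in pandas.api.types, which take a Series or dtype directly. The sketch below uses is_numeric_dtype and is_datetime64_any_dtype in place of np.issubdtype; the latter is meant to cover timezone-aware datetime columns too, which np.issubdtype may not accept because they use a pandas extension dtype. The function name fillna_by_kind and the demo frame are made up for illustration, and the fill values are the ones from the question.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype


def fillna_by_kind(df):
    """Return a copy of df with missing values filled per column kind."""
    out = df.copy()
    for col in out.columns:
        if is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(-1)
        elif is_datetime64_any_dtype(out[col]):
            # Use the column's timezone if it has one (None for naive columns)
            tz = getattr(out[col].dtype, "tz", None)
            out[col] = out[col].fillna(pd.Timestamp("1900-01-01", tz=tz))
        else:
            out[col] = out[col].fillna("NaN")
    return out


# Hypothetical demo data: one column of each kind, each with a missing value
dates = pd.to_datetime(["2009-07-23", None])
df_demo = pd.DataFrame({
    "number": [1.0, None],
    "dates": dates,
    "dates_utc": dates.tz_localize("UTC"),
    "text": ["a", None],
})
filled = fillna_by_kind(df_demo)
```

Because the function works on a copy, df_demo keeps its missing values and filled holds the result.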

Upvotes: 6
