MethJordan

Reputation: 37

Pandas - Lower of Dates when not null

I have a dataframe with many timestamp columns. I'm trying to get the lower of two dates, but only when both columns are not null. For example:

  Internal Review       Imported Date     Lower Date
1 2/9/2018 19:44
2 2/15/2018 1:20        2/13/2018 2:18    2/13/2018 2:18
3 2/7/2018 23:17        2/12/2018 9:34    2/7/2018 23:17
4                       2/12/2018 9:25
5 2/1/2018 20:57        2/12/2018 9:24    2/1/2018 20:57

If I wanted the lower of Internal Review and Imported Date, rows one and four would not return any value because one of their dates is missing, while the other rows would return the lower date because both columns contain dates. I know .min(axis=1) will return a date, but it also returns one when one of the columns is null, which is the problem.

I tried adapting something similar to this function:

def business_days(start, end):
    mask = pd.notnull(start) & pd.notnull(end)
    start = start.values.astype('datetime64[D]')[mask]
    end = end.values.astype('datetime64[D]')[mask]
    result = np.empty(len(mask), dtype=float)
    result[mask] = np.busday_count(start, end)
    result[~mask] = np.nan
    return result

and tried

def GetLowestDays(col1, col2, df):
    df = df.copy()
    Start = col1.copy().notnull()
    End = col2.copy().notnull()
    Col3 = [Start, End].min(axis=1)
    return col3

But I simply get an "AttributeError: 'list' object has no attribute 'min'"

Upvotes: 0

Views: 1668

Answers (1)

Nicolas M.

Reputation: 1478

The following code should do the trick:

df['Lower Date'] = df[( df['Internal Review'].notnull() ) & ( df['Imported Date'].notnull() )][['Internal Review','Imported Date']].min(axis=1)

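The new column is filled with the row-wise minimum only where both dates are not null. Rows where either date is null are filtered out before the minimum is taken, so after index alignment they are left empty (NaT) in the new column.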
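To see this on the data from the question, here is a minimal sketch. The sample frame is rebuilt by hand from the table above (empty cells as None), and the names cols, both_present and alt are just illustrative; it also shows an equivalent variant using .where, which is an alternative spelling and not part of the original one-liner.

```python
import pandas as pd

# Sample data rebuilt from the table in the question; empty cells become NaT.
df = pd.DataFrame(
    {
        "Internal Review": pd.to_datetime(
            ["2018-02-09 19:44", "2018-02-15 01:20", "2018-02-07 23:17",
             None, "2018-02-01 20:57"]
        ),
        "Imported Date": pd.to_datetime(
            [None, "2018-02-13 02:18", "2018-02-12 09:34",
             "2018-02-12 09:25", "2018-02-12 09:24"]
        ),
    }
)
df.index = range(1, 6)  # match the row labels 1..5 used in the question

cols = ["Internal Review", "Imported Date"]

# True only for rows where both dates are present.
both_present = df[cols].notnull().all(axis=1)

# Row-wise minimum over the filtered rows. Assignment aligns on the index,
# so rows excluded by the mask are left as NaT in the new column.
df["Lower Date"] = df.loc[both_present, cols].min(axis=1)

# Equivalent variant: take the minimum everywhere (nulls are skipped by
# default), then blank out rows where either date was missing.
alt = df[cols].min(axis=1).where(both_present)

print(df)
```

Rows 1 and 4 end up as NaT, and rows 2, 3 and 5 get the earlier of the two dates. As for the AttributeError in the question: [Start, End] is a plain Python list, and .min(axis=1) is a DataFrame method, so the columns need to be selected from the dataframe (as in df[cols]) before calling it.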

Nicolas

Upvotes: 1
