Reputation: 37
I have a dataframe with many timestamp columns. I'm trying to get the lower of two dates, but only if both columns are not null. For example:
Internal Review Imported Date Lower Date
1 2/9/2018 19:44
2 2/15/2018 1:20 2/13/2018 2:18 2/13/2018 2:18
3 2/7/2018 23:17 2/12/2018 9:34 2/7/2018 23:17
4 2/12/2018 9:25
5 2/1/2018 20:57 2/12/2018 9:24 2/1/2018 20:57
If I wanted the lower of Internal Review and Imported Date, rows one and four would not return any value, but the other rows would return the lower date because both columns contain dates. I know .min(axis=1) will return a date, but the columns can be null, and in that case it still returns the one non-null date, which is the problem.
I tried copying something similar to here:
    def business_days(start, end):
        mask = pd.notnull(start) & pd.notnull(end)
        start = start.values.astype('datetime64[D]')[mask]
        end = end.values.astype('datetime64[D]')[mask]
        result = np.empty(len(mask), dtype=float)
        result[mask] = np.busday_count(start, end)
        result[~mask] = np.nan
        return result
and tried
    def GetLowestDays(col1, col2, df):
        df = df.copy()
        Start = col1.copy().notnull()
        End = col2.copy().notnull()
        Col3 = [Start, End].min(axis=1)
        return col3
But I simply get an "AttributeError: 'list' object has no attribute 'min'"
Upvotes: 0
Views: 1668
Reputation: 1478
The following code should do the trick:
df['Lower Date'] = df[( df['Internal Review'].notnull() ) & ( df['Imported Date'].notnull() )][['Internal Review','Imported Date']].min(axis=1)
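The filter keeps only the rows where both columns are non-null, so the new column is filled with the row-wise minimum for those rows. The rows that were filtered out get NaT when the result is aligned back to the dataframe's index.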
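For reference, here is a minimal, self-contained sketch of the same idea wrapped in a helper, roughly what the GetLowestDays attempt in the question was aiming for. The function name lower_if_both_present is hypothetical, and the sample data only mimics the question's table: which column is empty in rows 1 and 4 is an assumption, since the post does not make that clear.

```python
import pandas as pd

# Hypothetical sample data modelled on the question's table.
# Assumption: row 1 has only "Internal Review", row 4 has only "Imported Date".
df = pd.DataFrame(
    {
        "Internal Review": ["2/9/2018 19:44", "2/15/2018 1:20",
                            "2/7/2018 23:17", None, "2/1/2018 20:57"],
        "Imported Date": [None, "2/13/2018 2:18", "2/12/2018 9:34",
                          "2/12/2018 9:25", "2/12/2018 9:24"],
    },
    index=[1, 2, 3, 4, 5],
)

# Parse the strings into datetimes; missing values become NaT.
for col in ["Internal Review", "Imported Date"]:
    df[col] = pd.to_datetime(df[col], format="%m/%d/%Y %H:%M")


def lower_if_both_present(frame, col1, col2):
    """Return the row-wise minimum of two columns, or NaT if either is null."""
    pair = frame[[col1, col2]]
    # True only for rows where both columns hold a value.
    both_present = pair.notna().all(axis=1)
    # min(axis=1) skips nulls by default, so blank out the rows
    # where one of the two values was missing.
    return pair.min(axis=1).where(both_present)


df["Lower Date"] = lower_if_both_present(df, "Internal Review", "Imported Date")
print(df)
```

Note that the question's attempt failed because .min(axis=1) was called on a plain Python list; selecting both columns from the dataframe, as above, gives a DataFrame that does have that method.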
Nicolas
Upvotes: 1