Reputation: 1730
Here is the DataFrame column and its dtype:
df['Hours'].head()
Output:
0 00:00:00
1 00:00:00
2 11:38:00
3 08:40:00
Name: Hours, dtype: timedelta64[ns]
I want to conditionally create another column from it, so that the result looks like this:
Hours     Test
00:00:00  N/A
00:00:00  N/A
11:38:00  02:38:00
08:40:00  Under Worked
The logic I tried is:
if df['Hours'] == '00:00:00':
    df['Test'] = 'N/A'
elif (df['Hours'].dt.total_seconds()//3600) < 9:
    df['Test'] = 'Under Worked'
else:
    df['Test'] = (df['Hours'].dt.total_seconds()//3600)-9
But it gives me this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I also tried using np.select:
conditions = [
(str(df['Hours']) == '0 days 00:00:00'),
(df['Hours'].dt.total_seconds()//3600) < 9]
choices = ['NA', 'Worked Under']
df['Test'] = np.select(conditions, choices, default=(df['Hours'].dt.total_seconds()//3600)-9)
This is the error I get:
ValueError: list of cases must be same length as list of conditions
How can it be solved?
Upvotes: 1
Views: 42
Reputation: 862681
Use:
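import numpy as np
import pandas as pd
df1['Hours'] = pd.to_timedelta(df1['Hours'])
# Element-wise masks; np.select uses the first condition that matches each row
conditions = [df1['Hours'] == pd.Timedelta(0), df1['Hours'] < pd.Timedelta(9, unit='h')]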
choices = ['N/A', 'Under Worked']
s = df1['Hours'].sub(pd.Timedelta(9, unit='h')).astype(str).str[7:15]
df1['OT'] = np.select(conditions, choices, default=s)
print (df1)
      Hours          Test            OT
0  00:00:00           N/A           N/A
1  00:00:00           N/A           N/A
2  11:38:00      02:38:00      02:38:00
3  08:40:00  Under Worked  Under Worked
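For reference, here is a minimal self-contained sketch of the same np.select approach. The sample values are copied from the question. The helper names (nine_hours, overtime_secs, overtime) are only illustrative, and the overtime string is built from total seconds rather than by slicing the string form of the timedelta, which is a swapped-in formatting step, not the code above. The conditions are element-wise boolean masks, which is why this avoids the "truth value of a Series is ambiguous" error that a plain if/elif raises.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (assumed to arrive as strings).
df1 = pd.DataFrame({"Hours": ["00:00:00", "00:00:00", "11:38:00", "08:40:00"]})
df1["Hours"] = pd.to_timedelta(df1["Hours"])

nine_hours = pd.Timedelta(hours=9)  # illustrative name for the threshold

# Element-wise boolean masks; np.select picks the first match per row,
# so the "zero hours" test must come before the "under 9 hours" test.
conditions = [
    (df1["Hours"] == pd.Timedelta(0)).to_numpy(),
    (df1["Hours"] < nine_hours).to_numpy(),
]
choices = ["N/A", "Under Worked"]

# Overtime in whole seconds; negative values are clipped to 0 because
# those rows are overridden by the conditions above anyway.
overtime_secs = (
    (df1["Hours"] - nine_hours).dt.total_seconds().clip(lower=0).astype(int).tolist()
)
overtime = np.array(
    [f"{s // 3600:02d}:{s % 3600 // 60:02d}:{s % 60:02d}" for s in overtime_secs],
    dtype=object,
)

# Rows matching no condition fall through to the formatted overtime string.
df1["OT"] = np.select(conditions, choices, default=overtime)
print(df1)
```

Building the default as an object array keeps every branch a string, so np.select does not have to reconcile text labels with numeric values.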
Upvotes: 2