Reputation: 1754
I want to create a column from another column whose dtype is datetime. The details are below:
df['finish']
0 2019-01-28 15:53:48
1 2019-01-28 17:11:15
2 2019-01-28 17:12:14
3 2019-01-28 17:12:15
4 2019-01-28 17:12:41
Name: finish, dtype: datetime64[ns]
df['finish'].map(lambda x: 30 if x<='2019-02-01 21:00:00' else 5)
TypeError: Cannot compare type 'Timestamp' with type 'str'
Upvotes: 1
Views: 35
Reputation: 863651
If you compare the whole column against the value in the vectorized pandas way, converting the string to a datetime is not necessary, because pandas handles this comparison:
df['new'] = np.where(df['finish'] <='2019-02-01 21:00:00', 30, 5)
print (df)
finish new
0 2019-01-28 15:53:48 30
1 2019-01-28 17:11:15 30
2 2019-01-28 17:12:14 30
3 2019-01-28 17:12:15 30
4 2019-01-28 17:12:41 30
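All five rows above fall before the cutoff, so only the value 30 appears. As a minimal self-contained sketch, the following rebuilds the sample column and appends one hypothetical later row (not in the original data) so that both branches of np.where are exercised. The cutoff is written as an explicit pd.Timestamp here, which is an assumption made for clarity rather than a requirement.

```python
import numpy as np
import pandas as pd

# The five timestamps from the question, plus one hypothetical row
# after the cutoff (added only to show the "else" branch).
df = pd.DataFrame({
    "finish": pd.to_datetime([
        "2019-01-28 15:53:48",
        "2019-01-28 17:11:15",
        "2019-01-28 17:12:14",
        "2019-01-28 17:12:15",
        "2019-01-28 17:12:41",
        "2019-02-02 09:00:00",  # hypothetical, not in the original data
    ])
})

# Explicit Timestamp cutoff; the comparison runs on the whole column at once.
cutoff = pd.Timestamp("2019-02-01 21:00:00")
df["new"] = np.where(df["finish"] <= cutoff, 30, 5)
print(df)
```

The comparison produces a boolean Series, and np.where picks 30 where it is True and 5 elsewhere.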
Your solution failed because it compares scalars: the lambda function is called for each value in a loop, so each Timestamp is compared with a plain string. This approach is also not recommended, because it is slow. If you still want to use it, convert the string to a Timestamp or datetime:
df['new'] = df['finish'].map(lambda x: 30 if x<=pd.Timestamp('2019-02-01 21:00:00') else 5)
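If the element-wise approach is kept, one possible refinement is to parse the cutoff once outside the lambda, so the string is not converted again for every row. This is only a sketch: the names finish, cutoff, and new are illustrative, and the second timestamp is a hypothetical value added to show both outcomes.

```python
import pandas as pd

# Small illustrative Series; the second value is hypothetical.
finish = pd.Series(pd.to_datetime([
    "2019-01-28 15:53:48",
    "2019-02-02 09:00:00",  # hypothetical row after the cutoff
]), name="finish")

# Parse the cutoff string a single time, outside the lambda.
cutoff = pd.Timestamp("2019-02-01 21:00:00")

# Each element is a Timestamp, so Timestamp <= Timestamp is well defined.
new = finish.map(lambda ts: 30 if ts <= cutoff else 5)
print(new.tolist())
```

This is still a Python-level loop over the values, so it does not remove the per-row overhead of map.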
Performance:
#[5000 rows x 1 columns]
df = pd.concat([df] * 1000, ignore_index=True)
In [165]: %timeit df['new1'] = np.where(df['finish'] <='2019-02-01 21:00:00', 30, 5)
465 µs ± 64.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [166]: %timeit df['new2'] = df['finish'].map(lambda x: 30 if x<=pd.Timestamp('2019-02-01 21:00:00') else 5)
22.4 ms ± 228 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Upvotes: 1