Jack

Reputation: 1754

How to create column based on datetime values?

I want to create a column based on another column whose dtype is datetime. The details are below:

df['finish']

0   2019-01-28 15:53:48
1   2019-01-28 17:11:15
2   2019-01-28 17:12:14
3   2019-01-28 17:12:15
4   2019-01-28 17:12:41
Name: finish, dtype: datetime64[ns]

df['finish'].map(lambda x: 30 if x<='2019-02-01 21:00:00' else 5)

TypeError: Cannot compare type 'Timestamp' with type 'str'

Upvotes: 1

Views: 35

Answers (1)

jezrael

Reputation: 863651

If you compare the whole column with a value in the vectorized pandas way, it is not necessary to convert the string to a datetime, because pandas handles this comparison:

df['new'] = np.where(df['finish'] <='2019-02-01 21:00:00', 30, 5)
print (df)
               finish  new
0 2019-01-28 15:53:48   30
1 2019-01-28 17:11:15   30
2 2019-01-28 17:12:14   30
3 2019-01-28 17:12:15   30
4 2019-01-28 17:12:41   30

Your solution failed because map calls the lambda function for each value in a loop, so it compares scalars, and a scalar Timestamp has to be compared with a datetime, not a string.

This approach is also not recommended, because it is slow. If you still want it, the solution is to convert the string to a Timestamp or datetime:

df['new'] = df['finish'].map(lambda x: 30 if x<=pd.Timestamp('2019-02-01 21:00:00') else 5)
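For anyone who wants to reproduce both approaches side by side, here is a minimal self-contained sketch. The frame is rebuilt from the five rows shown in the question; the column names `new_vec` and `new_map` and the `cutoff` variable are only illustrative and are not from the original post.

```python
import numpy as np
import pandas as pd

# Rebuild the sample column from the question as datetime64[ns].
df = pd.DataFrame({
    "finish": pd.to_datetime([
        "2019-01-28 15:53:48",
        "2019-01-28 17:11:15",
        "2019-01-28 17:12:14",
        "2019-01-28 17:12:15",
        "2019-01-28 17:12:41",
    ])
})

# Convert the cutoff once, outside the lambda, so it is not re-parsed per row.
cutoff = pd.Timestamp("2019-02-01 21:00:00")

# Vectorized: one comparison over the whole column.
df["new_vec"] = np.where(df["finish"] <= cutoff, 30, 5)

# Element-wise: the lambda receives a scalar Timestamp for each row.
df["new_map"] = df["finish"].map(lambda x: 30 if x <= cutoff else 5)

print(df)
```

Both columns should hold the same values; the only difference is how the comparison is carried out.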

Performance:

#[5000 rows x 1 columns]
df = pd.concat([df] * 1000, ignore_index=True)

In [165]: %timeit df['new1'] = np.where(df['finish'] <='2019-02-01 21:00:00', 30, 5)
465 µs ± 64.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [166]: %timeit df['new2'] = df['finish'].map(lambda x: 30 if x<=pd.Timestamp('2019-02-01 21:00:00') else 5)
22.4 ms ± 228 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
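The timings above come from IPython's %timeit. Outside IPython, a rough equivalent can be sketched with the standard timeit module, as below. The helper names `vectorized` and `elementwise` and the repeat count of 20 are my own choices, and the absolute numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Five sample rows from the question, repeated 1000 times (5000 rows).
base = pd.DataFrame({
    "finish": pd.to_datetime([
        "2019-01-28 15:53:48",
        "2019-01-28 17:11:15",
        "2019-01-28 17:12:14",
        "2019-01-28 17:12:15",
        "2019-01-28 17:12:41",
    ])
})
big = pd.concat([base] * 1000, ignore_index=True)
cutoff = pd.Timestamp("2019-02-01 21:00:00")


def vectorized():
    # Single column-wide comparison.
    return np.where(big["finish"] <= cutoff, 30, 5)


def elementwise():
    # Python-level call of the lambda for every row.
    return big["finish"].map(lambda x: 30 if x <= cutoff else 5)


# Total seconds for 20 calls of each variant.
t_vec = timeit.timeit(vectorized, number=20)
t_map = timeit.timeit(elementwise, number=20)
print(f"np.where: {t_vec:.4f} s, map: {t_map:.4f} s (20 runs each)")
```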

Upvotes: 1
