Reputation: 61
I have a column in a dataframe with two data types, like this:
25 3037205
26 2019-09-04 19:54:57
27 2019-09-09 17:55:45
28 2019-09-16 21:40:36
29 3037206
30 2019-09-06 14:49:41
31 2019-09-11 17:17:11
32 3037207
33 2019-09-11 17:19:04
I'm trying to slice it and build a new data frame like this:
26 3037205 2019-09-04 19:54:57
27 3037205 2019-09-09 17:55:45
28 3037205 2019-09-16 21:40:36
29 3037206 2019-09-06 14:49:41
30 3037206 2019-09-11 17:17:11
31 3037207 2019-09-11 17:19:04
I can't find how to slice between numbers "no datetype".
Some ideas?
Thx!
Upvotes: 5
Views: 243
Reputation: 18367
I'm not sure if this is the most efficient way of solving the issue, but it seems to get the job done. I've added the option to rename the second column (since its name is not specified) after the #:
import pandas as pd
import numpy as np
data = {'dates':[3037205,'2019-09-04 19:54:57','2019-09-09 17:55:45','2019-09-16 21:40:36',3037206,'2019-09-06 14:49:41','2019-09-11 17:17:11',3037207,'2019-09-11 17:19:04']}
df = pd.DataFrame(data)
df['mask'] = np.where(df['dates'].str.isnumeric(),df['dates'],np.nan)
df['mask_2'] = np.where(df['dates'].str.isnumeric(),np.nan,df['dates'])
df['mask'] = df['mask'].fillna(method='ffill')
df = df.dropna(subset=['mask_2']).drop(columns=['mask_2'])#.rename(columns={'mask':'desired_name'})
print(df)
Output:
dates mask
1 2019-09-04 19:54:57 3037205
2 2019-09-09 17:55:45 3037205
3 2019-09-16 21:40:36 3037205
5 2019-09-06 14:49:41 3037206
6 2019-09-11 17:17:11 3037206
8 2019-09-11 17:19:04 3037207
Upvotes: 2
Reputation: 150735
Another approach:
s = pd.to_numeric(df['col1'], errors='coerce')
df.assign(val=s.ffill().astype(int)).loc[s.isnull()]
Output:
col1 val
26 2019-09-04 19:54:57 3037205
27 2019-09-09 17:55:45 3037205
28 2019-09-16 21:40:36 3037205
30 2019-09-06 14:49:41 3037206
31 2019-09-11 17:17:11 3037206
33 2019-09-11 17:19:04 3037207
Upvotes: 4