Reputation: 1150
I am trying to duplicate the rows of my pandas DataFrame and add a column holding a minute-by-minute time sequence between the FROM and TO columns.
For example, I have this data frame.
ID FROM TO
A 15:30 15:33
B 16:40 16:44
C 15:20 15:22
What I want the output to be is
ID FROM TO time
A 15:30 15:33 15:30
A 15:30 15:33 15:31
A 15:30 15:33 15:32
A 15:30 15:33 15:33
B 16:40 16:44 16:40
B 16:40 16:44 16:41
B 16:40 16:44 16:42
B 16:40 16:44 16:43
B 16:40 16:44 16:44
C 15:20 15:22 15:20
C 15:20 15:22 15:21
C 15:20 15:22 15:22
In R, I could do this:
new_df = setDT(df)[, .(ID, FROM, TO, time=seq(FROM,TO,by="mins")), by=1:nrow(df)]
but I am having trouble finding the Python equivalent.
Thank you in advance!
Upvotes: 0
Views: 190
Reputation: 30605
Here's an approach similar to @chrisz's, using concat and iterrows along with date_range, all in a single step:
df = pd.concat([
    pd.DataFrame({
        'ID': row.ID,
        'FROM': row.FROM,
        'TO': row.TO,
        'TIME': pd.Series(pd.date_range(row.FROM, row.TO, freq='60s').time).astype(str).str[:5]
    })
    for _, row in df.iterrows()
])
TIME FROM ID TO
0 15:30 15:30 A 15:33
1 15:31 15:30 A 15:33
2 15:32 15:30 A 15:33
3 15:33 15:30 A 15:33
0 16:40 16:40 B 16:44
1 16:41 16:40 B 16:44
2 16:42 16:40 B 16:44
3 16:43 16:40 B 16:44
4 16:44 16:40 B 16:44
0 15:20 15:20 C 15:22
1 15:21 15:20 C 15:22
2 15:22 15:20 C 15:22
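For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The sample frame is rebuilt from the question's data, and `ignore_index=True` is my addition (it is not in the snippet above), assuming you would rather have one running index than an index that restarts for every ID. The names `pieces` and `expanded` are just illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'ID': ['A', 'B', 'C'],
    'FROM': ['15:30', '16:40', '15:20'],
    'TO': ['15:33', '16:44', '15:22'],
})

# One small frame per input row: the scalar ID/FROM/TO values are
# broadcast against the per-minute TIME series.
pieces = [
    pd.DataFrame({
        'ID': row.ID,
        'FROM': row.FROM,
        'TO': row.TO,
        'TIME': pd.Series(
            pd.date_range(row.FROM, row.TO, freq='60s').time
        ).astype(str).str[:5],
    })
    for _, row in df.iterrows()
]

# ignore_index=True replaces the per-piece 0..n index with a single 0..11 index.
expanded = pd.concat(pieces, ignore_index=True)
print(expanded)
```

Since iterrows builds one frame per row, this is probably best suited to small inputs.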
Upvotes: 1
Reputation: 51165
Two steps solve your problem.
First, pd.date_range with apply and strftime:
df['duration'] = df.apply(
    lambda row: [
        i.strftime('%H:%M')
        for i in pd.date_range(
            row['FROM'], row['TO'], freq='60s'
        )
    ],
    axis=1)
ID FROM TO duration
0 A 15:30 15:33 [15:30, 15:31, 15:32, 15:33]
1 B 16:40 16:44 [16:40, 16:41, 16:42, 16:43, 16:44]
2 C 15:20 15:22 [15:20, 15:21, 15:22]
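To see what the lambda produces for a single row, here is a small sketch in isolation. `minute_labels` is a hypothetical helper name used only for illustration; it does the same thing as the list comprehension above.

```python
import pandas as pd

def minute_labels(start, end):
    """Return 'HH:MM' strings for every minute from start to end, inclusive."""
    # date_range includes both endpoints by default, so TO itself is kept.
    return [ts.strftime('%H:%M') for ts in pd.date_range(start, end, freq='60s')]

print(minute_labels('15:30', '15:33'))
# ['15:30', '15:31', '15:32', '15:33']
```

Because the strings carry no date, pandas fills in the current date when parsing them; only the time part survives the strftime call.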
Second, apply with stack:
df.set_index(['ID', 'FROM', 'TO']) \
.duration.apply(pd.Series) \
.stack().reset_index(level=3, drop=True) \
.reset_index() \
.set_index('ID')
# Result
FROM TO 0
ID
A 15:30 15:33 15:30
A 15:30 15:33 15:31
A 15:30 15:33 15:32
A 15:30 15:33 15:33
B 16:40 16:44 16:40
B 16:40 16:44 16:41
B 16:40 16:44 16:42
B 16:40 16:44 16:43
B 16:40 16:44 16:44
C 15:20 15:22 15:20
C 15:20 15:22 15:21
C 15:20 15:22 15:22
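As an alternative to the stack step, newer pandas versions (0.25 and later) have DataFrame.explode, which unpacks a list column directly. This is a different technique from the one shown above, sketched here on the question's sample data. The column name `time` and the variable `result` are my choices, and the list column is built with a plain comprehension over zip instead of apply.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'ID': ['A', 'B', 'C'],
    'FROM': ['15:30', '16:40', '15:20'],
    'TO': ['15:33', '16:44', '15:22'],
})

# Step 1: one list of 'HH:MM' strings per row.
df['time'] = [
    [ts.strftime('%H:%M') for ts in pd.date_range(start, end, freq='60s')]
    for start, end in zip(df['FROM'], df['TO'])
]

# Step 2: explode repeats ID/FROM/TO once per list element.
result = df.explode('time').reset_index(drop=True)
print(result)
```

This also leaves the new column with a proper name, so there is no column called 0 to rename afterwards.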
Upvotes: 1