Reputation: 1001
Problem statement:
I have a dataframe like so:
Description Type x y z
0 Branch Actuated 0 0 0
1 Forward Actuated 7.07 7.07 0
2 Backwards Actuated 7.07 -2.93 0
3 Forward Actuated 17.07 -2.93 0
4 Backwards Actuated 10 -10 0
5 Forward Actuated 17.07 -17.07 0
6 EOL Actuated 7.07 -17.07 0
7 Forward Actuated -7.07 -7.07 0
8 Backwards Actuated -7.07 2.93 0
9 Forward Actuated -17.07 2.93 0
10 Backwards Actuated -10 10 0
11 Forward Actuated -17.07 17.07 0
12 EOL Actuated -7.07 17.07 0
13 Forward Actuated -0 10 0
14 Forward Actuated -0 20 0
15 Forward Actuated 10 0 0
When an EOL
row is encountered, the algorithm needs to reverse find the first Branch
row and insert it after the EOL
ro like so:
Description Type x y z
0 Branch Actuated 0 0 0
1 Forward Actuated 7.07 7.07 0
2 Backwards Actuated 7.07 -2.93 0
3 Forward Actuated 17.07 -2.93 0
4 Backwards Actuated 10 -10 0
5 Forward Actuated 17.07 -17.07 0
6 EOL Actuated 7.07 -17.07 0
0 Branch Actuated 0 0 0 <--
7 Forward Actuated -7.07 -7.07 0
8 Backwards Actuated -7.07 2.93 0
9 Forward Actuated -17.07 2.93 0
10 Backwards Actuated -10 10 0
11 Forward Actuated -17.07 17.07 0
12 EOL Actuated -7.07 17.07 0
0 Branch Actuated 0 0 0 <--
13 Forward Actuated -0 10 0
14 Forward Actuated -0 20 0
15 Forward Actuated 10 0 0
Note: reverse search should be based on the original dataframe not the augmented one.
Question: How to effifciently (resource lax and quick) accomplish this?
Upvotes: 2
Views: 629
Reputation: 765
def function1(dd: pd.DataFrame,dd1: pd.DataFrame):
return pd.concat([dd,dd1]) if dd.Description.eq("EOL").any() else dd
dd1=df1.query("Description=='Branch'").head(1)
df1.groupby(df1.Description.shift().eq('EOL').cumsum(), as_index=1, group_keys=0).apply(function1,dd1=dd1)
Description Type x y z
0 Branch Actuated 0 0 0
1 Forward Actuated 7.07 7.07 0
2 Backwards Actuated 7.07 -2.93 0
3 Forward Actuated 17.07 -2.93 0
4 Backwards Actuated 10 -10 0
5 Forward Actuated 17.07 -17.07 0
6 EOL Actuated 7.07 -17.07 0
0 Branch Actuated 0 0 0
7 Forward Actuated -7.07 -7.07 0
8 Backwards Actuated -7.07 2.93 0
9 Forward Actuated -17.07 2.93 0
10 Backwards Actuated -10 10 0
11 Forward Actuated -17.07 17.07 0
12 EOL Actuated -7.07 17.07 0
0 Branch Actuated 0 0 0
13 Forward Actuated -0 10 0
14 Forward Actuated -0 20 0
15 Forward Actuated 10 0 0
Upvotes: 0
Reputation: 863341
Use merge_asof
for get last matched rows, join together by concat
, DataFrame.sort_index
and last DataFrame.reset_index
with drop=True
for prevent duplicated index values:
df1 = df[df['Description'] == 'Branch']
print (df1)
Description Type x y z
0 Branch Actuated 0.0 0.0 0
df2 = df[df['Description'] == 'EOL']
print (df2)
Description Type x y z
6 EOL Actuated 7.07 -17.07 0
12 EOL Actuated -7.07 17.07 0
df3=pd.merge_asof(df2,df1,left_index=True,right_index=True,suffixes=('_',''))[df1.columns]
print (df3)
Description Type x y z
6 Branch Actuated 0.0 0.0 0
12 Branch Actuated 0.0 0.0 0
df = pd.concat([df, df3]).sort_index().reset_index(drop=True)
print (df)
Description Type x y z
0 Branch Actuated 0.00 0.00 0
1 Forward Actuated 7.07 7.07 0
2 Backwards Actuated 7.07 -2.93 0
3 Forward Actuated 17.07 -2.93 0
4 Backwards Actuated 10.00 -10.00 0
5 Forward Actuated 17.07 -17.07 0
6 EOL Actuated 7.07 -17.07 0
7 Branch Actuated 0.00 0.00 0
8 Forward Actuated -7.07 -7.07 0
9 Backwards Actuated -7.07 2.93 0
10 Forward Actuated -17.07 2.93 0
11 Backwards Actuated -10.00 10.00 0
12 Forward Actuated -17.07 17.07 0
13 EOL Actuated -7.07 17.07 0
14 Branch Actuated 0.00 0.00 0
15 Forward Actuated -0.00 10.00 0
16 Forward Actuated -0.00 20.00 0
17 Forward Actuated 10.00 0.00 0
Upvotes: 2