Reputation: 596
I have a dataframe that looks like:
df_test = pd.DataFrame({
0: ['Property1', 1.2, 1.5, 2.6],
1: ["Std", 0.1,0.01,0.02],
2: ["Rep", 3,3,3],
3: ["Property2", 3.1,18.2,10.66],
4: ["Property3", 22,33,44],
5: ["Rep", 3,3,3],
6: ["Property4", 6,12,14.23],
7: ["Property5", 3.1,18.2,10.66],
8: ["Std", 1,0.2,0.66],
})
0 1 2 3 4 5 6 7 8
0 Property1 Std Rep Property2 Property3 Rep Property4 Property5 Std
1 1.2 0.1 3 3.1 22 3 6 3.1 1
2 1.5 0.01 3 18.2 33 3 12 18.2 0.2
3 2.6 0.02 3 10.66 44 3 14.23 10.66 0.66
I want to insert columns if the value in df[1,i] is not ["Std", "Rep"] and df[1,i+1] is not "Std".
If I'm iterating over the columns (starting from the end) it would look like:
while i < j:
if df.loc[1,i] != "Std" and df.loc[1,i] != "Rep":
if df.loc[1,i+1] != "Std":
df.insert(i+1,"Std",np.nan)
df.columns = pd.RangeIndex(df.columns.size)
df.loc[1,i+1]="Std"
i+=1
j = len(df.columns)-1
I'm trying to do with using select but I don't know how to set conditions that depend on two consecutive columns like I did in the while loop.
Expected result:
0 1 2 3 4 5 6 7 8 9 \
0 Property1 Std Rep Property2 Std Property3 Std Rep Property4 Std
1 1.2 0.1 3 3.1 nan 22 nan 3 6 nan
2 1.5 0.01 3 18.2 nan 33 nan 3 12 nan
3 2.6 0.02 3 10.66 nan 44 nan 3 14.23 nan
10 11
0 Property5 Std
1 3.1 1
2 18.2 0.2
3 10.66 0.66
Is there a way to vectorize this loop?
Upvotes: 0
Views: 42
Reputation: 323316
Let us try
s = df.iloc[0,:]
s = s.str.startswith('Property')&s.shift(-1).ne('Std')
out = pd.concat([df,pd.DataFrame([['Std',np.nan,np.nan,np.nan]]*s.sum(),
index = s.index[s],
columns=df.index).T],axis=1).sort_index(level=0,axis=1)
out
0 1 2 3 3 ... 5 6 6 7 8
0 Property1 Std Rep Property2 Std ... Rep Property4 Std Property5 Std
1 1.2 0.1 3 3.1 NaN ... 3 6 NaN 3.1 1
2 1.5 0.01 3 18.2 NaN ... 3 12 NaN 18.2 0.2
3 2.6 0.02 3 10.66 NaN ... 3 14.23 NaN 10.66 0.66
[4 rows x 12 columns]
Upvotes: 1