Reputation: 611
I have a dataframe df_in, which contains column names that start with pi and pm.
df_in = pd.DataFrame([[1,2,3,4,"",6,7,8,9],["",1,32,43,59,65,"",83,97],["",51,62,47,58,64,74,86,99],[73,51,42,67,54,65,"",85,92]], columns=["piabc","pmed","pmrde","pmret","pirtc","pmere","piuyt","pmfgf","pmthg"])
If a row in column name which starts with pi is blank, make the same rows of columns which starts with pm also blank till we have a new column which starts with pi. And repeat the same process for other columns also.
Expected Output:
df_out = pd.DataFrame([[1,2,3,4,"","",7,8,9],["","","","",59,65,"","",""],["","","","",58,64,74,86,99],[73,51,42,67,54,65,"","",""]], columns=["piabc","pmed","pmrde","pmret","pirtc","pmere","piuyt","pmfgf","pmthg"])
How to do it?
Upvotes: 1
Views: 178
Reputation: 862601
You can create groups by compare columns names by str.startswith
with cumulative sum and then compare values by empty spaces in groupby
for mask used for set empty spaces in DataFrame.mask
:
g = df_in.columns.str.startswith('pi').cumsum()
df = df_in.mask(df_in.eq('').groupby(g, axis=1).transform(lambda x: x.iat[0]), '')
#first for me failed in pandas 1.2.3
#df = df_in.mask(df_in.eq('').groupby(g, axis=1).transform('first'), '')
print (df)
piabc pmed pmrde pmret pirtc pmere piuyt pmfgf pmthg
0 1 2 3 4 7 8 9
1 59 65
2 58 64 74 86 99
3 73 51 42 67 54 65
Upvotes: 1