Reputation:
I given the sample of data-frame
Name value A Msg_A Invalid_A Msg_B Invalid_B Msg_C Invalid_C
0 j 3 A Not VAid XYZ True VAlid False Not Valid_c True
1 B 4 B VAid ABC False notVAlidb TRue Not Valid_c True
i want df like this
Name value A Msg_A Invalid_A Msg_B Invalid_B Msg_C Invalid_C Message
0 J 3 A Not VAid XYZ True VAlid False Not Valid_c True Not VAid XYZ,Not Valid_c
1 B 4 B VAid ABC False notVAlidb TRue Not Valid_c True notVAlidb,Not Valid_c
i want to merge the msg_ in message column in if invalid column contains True
shape : print(df.filter(like='Invalid_').values[:,None].shape)
--> (4, 1, 8)
i try with
df['message']=(df.filter(like='Status_')
.where(df.filter(like='Invalid_').values[:,None].reshape(4,8))
.apply(lambda x:','.join(x.dropna()),1))
Upvotes: 1
Views: 60
Reputation: 863291
Use wide_to_long
for reshape, then filter rows if Invalid
is True
and last aggregate join
:
df1 = (pd.wide_to_long(df.reset_index(),
stubnames=['Msg','Invalid'],
sep='_',
suffix='\w+',
i=['index','Name','value','A'],
j='tmp')
.reset_index())
df1 = df1[df1['Invalid']]
df['message'] = df1.groupby('index')['Msg'].agg(', '.join)
print (df)
Name value A Msg_A Invalid_A Msg_B Invalid_B Msg_C \
0 j 3 A Not VAid XYZ True VAlid False Not Valid_c
1 B 4 B VAid ABC False notVAlidb True Not Valid_c
Invalid_C message
0 True Not VAid XYZ, Not Valid_c
1 True notVAlidb, Not Valid_c
Yopu can generate all columns not starting by Msg
or Invalid
to parameter i
:
cols = df.columns[~df.columns.str.startswith(('Msg','Invalid'))].tolist()
df1 = (pd.wide_to_long(df.reset_index(),
stubnames=['Msg','Invalid'],
sep='_',
suffix='\w+',
i=['index'] + cols,
j='tmp')
.reset_index())
df1 = df1[df1['Invalid']]
df['message'] = df1.groupby('index')['Msg'].agg(', '.join)
Your solution:
There is necessary same number of columns between both df1
and df2
, so possible set df2.columns
by df1.columns
:
df1 = df.filter(like='Msg_')
df2 = df.filter(like='Invalid_')
#set columnsnames
df2.columns = df1.columns
df['message'] = df1.where(df2).apply(lambda x:','.join(x.dropna()),1)
print (df)
Name value A Msg_A Invalid_A Msg_B Invalid_B Msg_C \
0 j 3 A Not VAid XYZ True VAlid False Not Valid_c
1 B 4 B VAid ABC False notVAlidb True Not Valid_c
Invalid_C message
0 True Not VAid XYZ,Not Valid_c
1 True notVAlidb,Not Valid_c
Upvotes: 1