Reputation: 497
I am trying to compare several pairs of columns and fill another column with the names of all the pairs that do not match, separated by commas.
Input:
A B C D E
0 f e b a d
1 c b a c b
2 f f a b c
3 d c c d c
4 f b b b e
5 b a f c d
Expected output:
A B C D E MATCHES
0 f e b a d AD, BC Unmatched
1 c b a c b BC Unmatched
2 f f a b c AD, BC Unmatched
3 d c c d c ALL MATCHED
4 f b b b e AD Unmatched
5 b a f c d AD, BC Unmatched
The code below raises an error when I use it inside a function. It works fine when I run the same statements on their own, outside any function.
def test(x):
    try:
        for idx in df.index:
            unmatch_list = []
            if not df.loc[idx, 'A'] == df.loc[idx, 'D']:
                unmatch_list.append('AD')
            if not df.loc[idx, 'B'] == df.loc[idx, 'C']:
                unmatch_list.append('BC')
            # etcetera...
            if len(unmatch_list):
                unmatch_string = ', '.join(unmatch_list) + ' Unmatched'
            else:
                unmatch_string = 'ALL MATCHED'
            df.loc[idx, 'MATCHES'] = unmatch_string
    except ValueError:
        pass  # handler body was not shown in the post
The error is raised when processing this line:
    if not df.loc[idx, 'A'] == df.loc[idx, 'D']:
Error: pandas.core.indexing.IndexingError: Too many indexers
Any suggestions?
Upvotes: 1
Views: 851
Reputation: 862841
How is the function called?
It works for me if I add return df and pass the DataFrame to the function:
def test(df):
    try:
        for idx in df.index:
            unmatch_list = []
            if not df.loc[idx, 'A'] == df.loc[idx, 'D']:
                unmatch_list.append('AD')
            if not df.loc[idx, 'B'] == df.loc[idx, 'C']:
                unmatch_list.append('BC')
            # etcetera...
            if len(unmatch_list):
                unmatch_string = ', '.join(unmatch_list) + ' Unmatched'
            else:
                unmatch_string = 'ALL MATCHED'
            df.loc[idx, 'MATCHES'] = unmatch_string
    except ValueError:
        print('error')
    return df

df = test(df)
print(df)
A B C D E MATCHES
0 f e b a d AD, BC Unmatched
1 c b a c b BC Unmatched
2 f f a b c AD, BC Unmatched
3 d c c d c ALL MATCHED
4 f b b b e AD Unmatched
5 b a f c d AD, BC Unmatched
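The question does not show how test was called, so the following is only a guess at the cause. One common way to get "Too many indexers" is to use two-part .loc indexing on a Series, for example when the name df inside the function ends up referring to a single row instead of the whole DataFrame. A minimal sketch that reproduces the message, assuming pandas 1.5 or later (where IndexingError is importable from pandas.errors) and a small made-up frame:

```python
import pandas as pd
from pandas.errors import IndexingError  # public location since pandas 1.5

# Small illustrative frame, not the full data from the question
df = pd.DataFrame({'A': ['f', 'c'], 'D': ['a', 'c']})

row = df.loc[0]      # a single row is a Series indexed by column name
value = row['A']     # one indexer is fine on a Series

try:
    row.loc[0, 'A']  # two indexers on a one-dimensional object
    message = None
except IndexingError as exc:
    message = str(exc)

print(message)
```

If that is what happens, either pass the whole DataFrame to the function, or index the row with a single key such as x['A'].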
A solution with apply is also possible, but the function has to be changed so that it works on a single row:
def test(x):
    try:
        unmatch_list = []
        if not x['A'] == x['D']:
            unmatch_list.append('AD')
        if not x['B'] == x['C']:
            unmatch_list.append('BC')
        # etcetera...
        if len(unmatch_list):
            unmatch_string = ', '.join(unmatch_list) + ' Unmatched'
        else:
            unmatch_string = 'ALL MATCHED'
    except ValueError:
        print('error')
    return unmatch_string

df['MATCHES'] = df.apply(test, axis=1)
print(df)
A B C D E MATCHES
0 f e b a d AD, BC Unmatched
1 c b a c b BC Unmatched
2 f f a b c AD, BC Unmatched
3 d c c d c ALL MATCHED
4 f b b b e AD Unmatched
5 b a f c d AD, BC Unmatched
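If more pairs need to be checked, the repeated if blocks can be replaced by a list of column pairs. This is not part of the answer above, just a sketch of the same idea: the helper name label_mismatches and the pairs argument are made up for illustration, and the frame is rebuilt from the sample input in the question.

```python
import pandas as pd

# Sample input from the question
df = pd.DataFrame({
    'A': list('fcfdfb'),
    'B': list('ebfcba'),
    'C': list('baacbf'),
    'D': list('acbdbc'),
    'E': list('dbcced'),
})

def label_mismatches(frame, pairs):
    """Return one label per row listing the column pairs that differ."""
    # One boolean column per pair, True where the two columns differ
    mismatch = pd.DataFrame({a + b: frame[a] != frame[b] for a, b in pairs})
    labels = []
    for flags in mismatch.to_numpy():
        names = [name for name, flag in zip(mismatch.columns, flags) if flag]
        labels.append(', '.join(names) + ' Unmatched' if names else 'ALL MATCHED')
    return labels

df['MATCHES'] = label_mismatches(df, [('A', 'D'), ('B', 'C')])
print(df)
```

The column comparisons are done on whole columns at once, and only the string joining is done row by row, so adding another pair means adding one tuple to the list.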
Upvotes: 1