work_python

Reputation: 119

How to find the odd one out across two columns in a pandas dataframe?

I have a pandas dataframe as below:

[image: screenshot of the dataframe, not reproduced here]

All rows that share an Individual ID should have the same 'combined' value. E.g., ID 123456789 should have red,blue,orange,white in every one of its rows.

How do I detect an odd row, such as the bottom one? ID 425656565 should have orange,orange,orange,green, but the last row has red in there instead.

I have been trying to create lists, reset indices, and so on, but I can't figure out how to find the odd values.

The code to create this dataframe is as below:

df.groupby(['Group','Individual ID','Account','combined']).count().reset_index()

Additionally, the order of the values within 'combined' is expected to stay the same.

Thanks

Upvotes: 0

Views: 439

Answers (1)

SomeDude

Reputation: 14238

I would do it like below:

import pandas as pd
# Count occurrences of each (Individual ID, combined) pair; the count ends up in the last column
df_counts = df[['Individual ID', 'combined']].value_counts(ascending=True).reset_index()
df_counts.columns = df_counts.columns.tolist()[:-1] + ['odd']
df = df.merge(df_counts, on=['Individual ID', 'combined'])
# A pair that occurs only once is the odd one out
df['odd'] = df['odd'] == 1

Output:

   Group  Individual ID  Account                    combined    odd
0    458      123456789    45877       red,blue,orange,white  False
1    458      123456789    55998       red,blue,orange,white  False
2    458      123456789    55663       red,blue,orange,white  False
3    458      787878787    44778         blue,blue,blue,blue  False
4    458      787878787    22225         blue,blue,blue,blue  False
5    458      787878787    22236         blue,blue,blue,blue  False
6    458      425656565    47778  orange,orange,orange,green  False
7    458      425656565    59886  orange,orange,orange,green  False
8    458      425656565    11111     orange,orange,red,green   True
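
Note that counting pairs flags any combination that occurs exactly once, so it would miss an odd value that appears twice and would flag an ID that has only a single row. If that matters, a minimal sketch of an alternative is below. It assumes the most common 'combined' string per Individual ID is the correct one, which is my assumption and not something stated in the question. The sample data is typed in from the output table above.

```python
import pandas as pd

# Sample data copied from the output table above.
df = pd.DataFrame({
    "Group": [458] * 9,
    "Individual ID": [123456789] * 3 + [787878787] * 3 + [425656565] * 3,
    "Account": [45877, 55998, 55663, 44778, 22225, 22236, 47778, 59886, 11111],
    "combined": ["red,blue,orange,white"] * 3
                + ["blue,blue,blue,blue"] * 3
                + ["orange,orange,orange,green"] * 2
                + ["orange,orange,red,green"],
})

# Most frequent 'combined' string within each Individual ID, broadcast to every row.
# On a tie, mode() returns the candidates sorted and the first one is taken.
expected = df.groupby("Individual ID")["combined"].transform(lambda s: s.mode().iloc[0])

# A row is odd when its string differs from its group's most common string.
# Comparing the whole string also catches a change in the order of the values.
df["odd"] = df["combined"] != expected

odd_rows = df[df["odd"]]
```

Here odd_rows holds only the rows that disagree with the rest of their ID group, so it can be inspected or exported directly.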

Upvotes: 1
