IronMaiden

Reputation: 564

Remove duplicates spread across columns

I have a data frame that looks like this:

Temp_1 Temp_2 Temp_3 Temp_4 Temp_5   Air
23      23     23      23     23   Oxygen
24      27     56      48     39   Nitrogen
18      18     18      18     18   Hydrogen
47      53     67      73     25   Neon

I want to remove the rows that have the same value across all the Temperature columns, so that the output looks something like this:

Temp_1 Temp_2 Temp_3 Temp_4 Temp_5   Air
24      27     56      48     39   Nitrogen
47      53     67      73     25   Neon

Upvotes: 1

Views: 59

Answers (2)

Corralien

Reputation: 120391

You can also check whether every Temp_X value in a row equals the row mean: keep a row if any value (any) is not equal (ne) to that mean (mean).

>>> df[df.filter(like='Temp').apply(lambda x: x.ne(x.mean()).any(), axis=1)]

   Temp_1  Temp_2  Temp_3  Temp_4  Temp_5       Air
1      24      27      56      48      39  Nitrogen
3      47      53      67      73      25      Neon
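The mean check above can be reproduced end to end with a minimal sketch. The DataFrame construction is an assumption, rebuilt from the table printed in the question (integer values, default index), and the names `keep` and `result_mean` are only illustrative. Comparing against a mean relies on exact float equality, which holds for these small integers but may not hold for arbitrary float data.

```python
import pandas as pd

# Assumed reconstruction of the question's sample data.
df = pd.DataFrame({
    "Temp_1": [23, 24, 18, 47],
    "Temp_2": [23, 27, 18, 53],
    "Temp_3": [23, 56, 18, 67],
    "Temp_4": [23, 48, 18, 73],
    "Temp_5": [23, 39, 18, 25],
    "Air": ["Oxygen", "Nitrogen", "Hydrogen", "Neon"],
})

# Select only the Temp_X columns.
temp = df.filter(like="Temp")

# A row is kept if at least one of its values differs from the row mean.
keep = temp.apply(lambda row: row.ne(row.mean()).any(), axis=1)

result_mean = df[keep]
print(result_mean)
```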

Upvotes: 0

akuiper

Reputation: 214927

Just check whether all of the Temp columns are equal to one of them, e.g. compare every Temp column to the first one and drop the rows where they all match:

temp = df.filter(like='Temp')
# compare each Temp column to the first one, row by row; keep rows where not all match
df[~temp.eq(temp.iloc[:, 0], axis=0).all(axis=1)]

#   Temp_1  Temp_2  Temp_3  Temp_4  Temp_5       Air
#1      24      27      56      48      39  Nitrogen
#3      47      53      67      73      25      Neon
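For anyone who wants to run this, here is a self-contained sketch of the first-column comparison. The helper name `drop_constant_rows` and its `like` parameter are hypothetical, and the DataFrame is an assumed reconstruction of the question's table. The `nunique` mask at the end is a different technique, included only as a cross-check; it should agree on data without missing values.

```python
import pandas as pd


def drop_constant_rows(df, like="Temp"):
    """Drop rows whose matching columns all hold the same value.

    Hypothetical helper for illustration; `like` is passed to DataFrame.filter.
    """
    temp = df.filter(like=like)
    first = temp.iloc[:, 0]
    # axis=0 aligns `first` with the row index, so each column is
    # compared element-wise against the first Temp column.
    all_same = temp.eq(first, axis=0).all(axis=1)
    return df[~all_same]


# Assumed reconstruction of the question's sample data.
df = pd.DataFrame({
    "Temp_1": [23, 24, 18, 47],
    "Temp_2": [23, 27, 18, 53],
    "Temp_3": [23, 56, 18, 67],
    "Temp_4": [23, 48, 18, 73],
    "Temp_5": [23, 39, 18, 25],
    "Air": ["Oxygen", "Nitrogen", "Hydrogen", "Neon"],
})

result = drop_constant_rows(df)
print(result)

# Cross-check with an alternative mask: more than one distinct value per row.
alt_mask = df.filter(like="Temp").nunique(axis=1) > 1
```

Unlike the mean comparison, this check uses only equality between stored values, so it does not depend on floating-point arithmetic.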

Upvotes: 2
