Kristo_R

Reputation: 322

Compare dataframe values to a specific column within the same row

I want to compare the values in every row of a dataframe to a specific column in the same row. I managed to do it by iterating over all rows, which works fine for smaller datasets but starts to cause problems as the number of rows and columns grows.

Is there a more efficient way to accomplish this with pandas?

Example of my current solution:

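import numpy as np
import pandas as pd

# Note: np.array stores this mixed list as strings, so every value in df is a string
data = np.array([['Identifier','N1','N2','N3','N4','mean'],
                ['Row1',1,2,3,4,2.5],
                ['Row2',5,4,3,2,3.5],
                ['Row3',1,5,1,5,3],
                ['Row4',1,2,3,10,4]
                ])

df = pd.DataFrame(data=data[1:,1:],
                 index=data[1:,0],
                 columns=data[0,1:])
df.head()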

result:

        N1  N2  N3  N4  mean
Row1    1   2   3   4   2.5
Row2    5   4   3   2   3.5
Row3    1   5   1   5   3
Row4    1   2   3   10  4
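One thing to watch in the setup above: np.array turns a list that mixes strings and numbers into an array of strings, so the frame holds text rather than numbers. A minimal sketch of building the same values with numeric dtypes, assuming numeric comparison is what is wanted (the name df_num is only illustrative and not part of the original code):

```python
import pandas as pd

# Same values as the example, built from a dict so each column keeps a numeric dtype.
df_num = pd.DataFrame(
    {
        "N1": [1, 5, 1, 1],
        "N2": [2, 4, 5, 2],
        "N3": [3, 3, 1, 3],
        "N4": [4, 2, 5, 10],
    },
    index=["Row1", "Row2", "Row3", "Row4"],
)

# Compute the row mean instead of typing it in by hand.
df_num["mean"] = df_num[["N1", "N2", "N3", "N4"]].mean(axis=1)

print(df_num.dtypes)
```

Checking df.dtypes on the original frame is a quick way to see whether the columns are numeric or not.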

To turn this into a boolean dataframe, I do the following:

# new dataframe with the same index and columns as df
df_bools = pd.DataFrame().reindex_like(df)
df_bools["mean"] = df["mean"]

# iterate over the rows and compare each value to that row's mean
for row_pos, (index, row) in enumerate(df.iterrows()):
    for col_pos, value in enumerate(row.iloc[:-1]):
        df_bools.iloc[row_pos, col_pos] = value > row["mean"]

df_bools.head()

and the desired result:

        N1      N2      N3      N4      mean
Row1    False   False   True    True    2.5
Row2    True    True    False   False   3.5
Row3    False   True    False   True    3
Row4    False   False   False   False   4

Upvotes: 1

Views: 38

Answers (1)

BENY

Reputation: 323226

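IIUC, you can use gt with axis=0 to compare every value column against the mean column of the same row:

df.iloc[:, :4] = df.iloc[:, :4].gt(df['mean'], axis=0)
df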
Out[1015]: 
         N1     N2     N3     N4 mean
Row1  False  False   True   True  2.5
Row2   True   True  False  False  3.5
Row3  False   True  False   True    3
Row4  False  False  False  False    4
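Note that the example data are strings, so gt compares them as text here, which is why Row4/N4 comes out False ('10' > '4' is False as a string comparison). A minimal sketch of the same idea on numeric data, assuming a numeric comparison is intended; the names value_cols and df_bools are illustrative, and the result goes into a new frame rather than overwriting df:

```python
import numpy as np
import pandas as pd

data = np.array([['Identifier', 'N1', 'N2', 'N3', 'N4', 'mean'],
                 ['Row1', 1, 2, 3, 4, 2.5],
                 ['Row2', 5, 4, 3, 2, 3.5],
                 ['Row3', 1, 5, 1, 5, 3],
                 ['Row4', 1, 2, 3, 10, 4]])
df = pd.DataFrame(data=data[1:, 1:], index=data[1:, 0], columns=data[0, 1:])

# The array above holds strings, so convert every column to a numeric dtype first.
df = df.apply(pd.to_numeric)

# Compare all columns except "mean" against "mean"; axis=0 aligns on the row index.
value_cols = df.columns.drop("mean")
df_bools = df[value_cols].gt(df["mean"], axis=0)
df_bools["mean"] = df["mean"]

print(df_bools)
# With numeric values, Row4/N4 is True because 10 > 4.
```

Selecting the columns by name instead of by position (iloc[:, :4]) keeps the code working if more value columns are added later.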

Upvotes: 1
