lizard6

Reputation: 65

Iterate over df to compare values

I have a large data frame and I need to compare the first entry in column one to the first entry in column two. Then the first entry in column one to the first entry in column three, etc. Basically I want to see if the two values are >0, then do something. Is there a way to do this using a pandas dataframe?

0  22 0 44 5 6
1  12 3 56 0 0
2  0  0 1  0 0
3  1  2 0  0 0

So I want to see if 22 and 0 are both >0. Then if 22 and 44 are both >0. Then if 22 and 5 are both >0. Then if 22 and 6 are both >0. Then if 12 and 3 are both >0. Then if 12 and 56 are both >0. ...and so on. If a pair has both elements >0 then my code will do something else.

I assume there is an easy way to iterate over rows and columns that I am just missing.

Upvotes: 3

Views: 1306

Answers (2)

jezrael

Reputation: 862581

Use DataFrame.rolling with axis=1 to slide a window of two adjacent columns across each row, compare the values against the condition, and use any to test whether at least one value in the window is True. If both values must be True, change any to all.

The rolling result has NaNs in the first column, so (I hope) it is necessary to use DataFrame.shift to move the NaNs to the last column, then replace them with DataFrame.fillna and cast to boolean:

print (df)
    a  b   c  d  e
0  22  0  44  5  6
1  12  3  56  0  0
2   0  0   1  0  0
3   1  2   0  0  0


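def f(x):
    # x is a NumPy array holding one window of two adjacent values
    print (x)
    mask = x > 0
    print (mask)
    # True if at least one value in the window is > 0
    return mask.any()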


df = df.rolling(2, axis=1).apply(f, raw=True).shift(-1, axis=1).fillna(0).astype(bool)
print (df)
       a     b      c      d      e
0   True  True   True   True  False
1   True  True   True  False  False
2  False  True   True  False  False
3   True  True  False  False  False
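
The same adjacent-column comparison can also be sketched without rolling, by slicing the boolean mask into "left" and "right" neighbours. This is only an illustration, not part of the original answer: the names positive, left, right, either and both are made up here, and the result has one column fewer because the last column has no right-hand neighbour.

```python
import pandas as pd

df = pd.DataFrame(
    [[22, 0, 44, 5, 6], [12, 3, 56, 0, 0], [0, 0, 1, 0, 0], [1, 2, 0, 0, 0]],
    columns=list("abcde"),
)

# True wherever a value is > 0
positive = df > 0

# Pair every column with its right-hand neighbour: (a, b), (b, c), (c, d), (d, e)
left = positive.iloc[:, :-1]
right = positive.iloc[:, 1:]

# Equivalent of mask.any(): at least one value in the pair is > 0
either = pd.DataFrame(
    left.to_numpy() | right.to_numpy(), index=df.index, columns=left.columns
)

# Equivalent of mask.all(): both values in the pair are > 0
both = pd.DataFrame(
    left.to_numpy() & right.to_numpy(), index=df.index, columns=left.columns
)

print(either)
print(both)
```

Here either should match columns a to d of the output above, and both is the variant to use if each pair needs both values > 0.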

Upvotes: 1

JimminyCricket

Reputation: 381

Have a look at this suggestion; it might work for you.

import pandas as pd
data = [[22, 0, 44, 5, 6], [12, 3, 56, 0, 0], [0, 0, 1, 0, 0], [1, 2, 0, 0, 0]]
df = pd.DataFrame(data)

# we ignore the first column since we will use it to compare
columns = df.columns[1:]

# new dataframe for convenience purposes
new_df = df.copy(deep=True)

# for each remaining column, store True where both the first column and that column are above zero, else False
for i in columns:
    new_df.iloc[:, i] = (df.iloc[:, 0] > 0) & (new_df.iloc[:, i] > 0)

# check results
new_df
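
If the loop is not needed, the same first-column-versus-the-rest check can be sketched in one step with NumPy broadcasting. This is a hedged alternative rather than a replacement for the code above: the names positive, first, others, both_positive and pairs are chosen here for illustration, and the result only contains the compared columns (1 to 4), not the first one.

```python
import numpy as np
import pandas as pd

data = [[22, 0, 44, 5, 6], [12, 3, 56, 0, 0], [0, 0, 1, 0, 0], [1, 2, 0, 0, 0]]
df = pd.DataFrame(data)

# True wherever a value is > 0
positive = df > 0
first = positive.iloc[:, 0]    # mask for the first column
others = positive.iloc[:, 1:]  # masks for every other column

# Broadcast the first-column mask across the remaining columns
both_positive = pd.DataFrame(
    others.to_numpy() & first.to_numpy()[:, None],
    index=df.index,
    columns=others.columns,
)

# (row label, column label) for every pair where both values are > 0
rows, cols = np.nonzero(both_positive.to_numpy())
pairs = [(df.index[r], others.columns[c]) for r, c in zip(rows, cols)]

print(both_positive)
print(pairs)
```

The pairs list gives the positions where "do something else" would apply, so any follow-up logic can loop over just those instead of every cell.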

Upvotes: 0
