Reputation: 45
I don't know how to ask this, so I made up an example. The goal is to compare the value of columns A & B to the values of C,D and E separately. If the value is the same or higher (>=) then return '1' else '0'.
A B C D E
1 101 103 100 101 99
2 102 100 101 99 103
3 100 99 103 100 98
The desired outcome would be this:
A B
C D E C D E
1 1 1 1 1 1
1 1 0 0 1 0
0 1 1 0 0 1
Ideally I would like a function for this, what's the best way to approach this?
Upvotes: 3
Views: 1309
Reputation: 25239
Direct compare A
and B
to C, D, E
, and concat
them together
A = df[['C','D','E']].le(df[['A']].values).astype(int)
B = df[['C','D','E']].le(df[['B']].values).astype(int)
pd.concat([A,B], axis=1, keys=['A','B'])
Out[650]:
A B
C D E C D E
1 1 1 1 1 1 1
2 1 1 0 0 1 0
3 0 1 1 0 0 1
Upvotes: 1
Reputation: 106
You could use pandas.DataFrame.apply
For example, in your case with df
being this dataframe
A B C D E
1 101 103 100 101 99
2 102 100 101 99 103
3 100 99 103 100 98
you could perform the following operation to compare A to C,D and E:
df[['C', 'D', 'E']].apply(lambda x: df['A'] >= x)
This will give you:
C D E
True True True
True True False
False True True
With True = 1
and False = 0
. You could do the same to compare B to C,D and E. I hope this helps.
Upvotes: 1
Reputation: 150735
How about broadcasting:
compares = df[['A','B']].values[:,:,None] >= df[['C','D','E']].values[:,None, :]
compares = compares.reshape(len(df),-1)
pd.DataFrame(compares,
columns=pd.MultiIndex.from_product((list('AB'),list('CDE')))
).astype(int)
Output:
A B
C D E C D E
0 1 1 1 1 1 1
1 1 1 0 0 1 0
2 0 1 1 0 0 1
Upvotes: 1
Reputation: 42886
You can use with le
which checks for every column if its equal or less than A
and then B
:
cols = ['A', 'B']
compare = ['C', 'D', 'E']
for col in cols:
print(col)
df[compare].le(df[col], axis=0).astype(int)
output
A
C D E
0 1 1 1
1 1 1 0
2 0 1 1
B
C D E
0 1 1 1
1 0 1 0
2 0 0 1
It would be more elegant to make a dict with the column names as key:
cols = ['A', 'B']
compare = ['C', 'D', 'E']
dct = {col: df[compare].le(df[col], axis=0).astype(int) for col in cols}
print(dct['A'], '\n')
print(dct['B'])
output
C D E
0 1 1 1
1 1 1 0
2 0 1 1
C D E
0 1 1 1
1 0 1 0
2 0 0 1
Upvotes: 1