Reputation: 85
I am working with a dataframe in pandas which holds numeric data.
E.g:
import numpy as np
import pandas as pd
d = {'col1': [1, 2, 3, 2], 'col2': [3, 4, 1, 2], 'col3': [1, 3, 4, 1]}
df = pd.DataFrame(data=d)
What I want to do is compare each element in a row with the element in the third (last) column of that same row: for row n, check whether element < last element of row n, and return True / False (or 1 / 0).
#Desired Output:
resDf = {'col1':[False,True,True,False],'col2':[False,False,True,False]}
What I have done so far is use apply like this:
resultBoolDf = df.iloc[:,:-1].apply(lambda x: np.where(x < df.col3,1,0),axis = 0)
This does not seem to work; I assume the comparison is not iterating correctly. Could somebody give me a tip on how to solve this? Thanks!
Upvotes: 3
Views: 123
Reputation: 862691
Use DataFrame.lt to compare against the last column, selected here by position:
df1 = df.iloc[:,:-1].lt(df.iloc[:, -1], axis=0)
# alternatively, select the last column by label
#df1 = df.iloc[:,:-1].lt(df.col3, axis=0)
print (df1)
    col1   col2
0  False  False
1   True  False
2   True   True
3  False  False
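For reference, here is a self-contained sketch of the same comparison using the sample data from the question (with the missing bracket in the col3 list filled in). The variable name `last` is only for illustration:

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'col1': [1, 2, 3, 2],
                   'col2': [3, 4, 1, 2],
                   'col3': [1, 3, 4, 1]})

# Last column as a Series, selected by position
last = df.iloc[:, -1]

# axis=0 matches the Series to the row index, so every row is
# compared against its own last-column value
df1 = df.iloc[:, :-1].lt(last, axis=0)
print(df1)
```

The axis=0 argument is the important part: it tells pandas to match the Series against the rows rather than against the column labels.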
Finally, if you need 0/1 instead of booleans, convert to integers with DataFrame.astype:
df1 = df.iloc[:,:-1].lt(df.iloc[:, -1], axis=0).astype(int)
# alternatively, select the last column by label
#df1 = df.iloc[:,:-1].lt(df.col3, axis=0).astype(int)
print (df1)
   col1  col2
0     0     0
1     1     0
2     1     1
3     0     0
Your numpy.where approach can also be used. It returns a plain NumPy array, so pass the result to the DataFrame constructor to restore the index and column labels:
arr = np.where(df.iloc[:,:-1].lt(df.col3, axis=0),1,0)
df1 = pd.DataFrame(arr, index=df.index, columns = df.columns[:-1])
print (df1)
   col1  col2
0     0     0
1     1     0
2     1     1
3     0     0
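If you prefer to stay in NumPy for the comparison itself, a rough sketch using broadcasting could look like the following. This is a variation rather than the method above; the names `values` and `last` are made up for illustration, and it assumes all columns are numeric:

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'col1': [1, 2, 3, 2],
                   'col2': [3, 4, 1, 2],
                   'col3': [1, 3, 4, 1]})

values = df.iloc[:, :-1].to_numpy()   # shape (4, 2): all columns but the last
last = df.iloc[:, [-1]].to_numpy()    # shape (4, 1): kept 2-D so it broadcasts per row

# (4, 2) < (4, 1) broadcasts the last column across each row
arr = (values < last).astype(int)

# Rebuild a DataFrame with the original index and column labels
df1 = pd.DataFrame(arr, index=df.index, columns=df.columns[:-1])
print(df1)
```

Selecting the last column with a list ([-1]) keeps it two-dimensional, which is what lets the comparison line up row by row.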
Upvotes: 2