Reputation: 171
I have a data frame with a single row of numerical values. I want to know whether any of those values is greater than 2 and, if so, create a new column containing the word 'Difference'.
Col_,F_1,F_2
1,5,0
My dataframe is diff_df. Here is one thing I tried:
c = diff_df > 2
if c.any():
    diff_df['difference'] = 'Difference'
If I print c, I get:
Col_,F_1,F_2
False,True,False
I have tried c.all() and many variations of other things. Clearly my inexperience is holding me back, and Google is not helping in this regard. Everything I try raises "The truth value of a Series (or DataFrame) is ambiguous. Use a.any(), a.all()....". Any help would be appreciated.
Upvotes: 0
Views: 50
Reputation: 323
In addition to David's response, you may also try this:
# >= 1 so the check also passes when more than one value exceeds 2
if (df > 2).astype(int).sum(axis=1).values[0] >= 1:
    df['difference'] = 'Difference'
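A self-contained sketch of this counting approach, assuming the one-row frame from the question (the variable name n_over is only for illustration). It counts the values above 2 in the first row and adds the column when there is at least one:

```python
import pandas as pd

# One-row frame with the values from the question
df = pd.DataFrame({"Col_": [1], "F_1": [5], "F_2": [0]})

# Booleans sum as 0/1, so this is the number of values above 2 in the first row
n_over = int((df > 2).sum(axis=1).iloc[0])

# Add the column when at least one value exceeds 2
if n_over >= 1:
    df["difference"] = "Difference"

print(df)
```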
Upvotes: 0
Reputation: 26676
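Use the .loc accessor with .gt() to build the row mask and, in the same statement, create and populate the new column. Pass axis=1 by keyword, since recent pandas versions no longer accept it positionally in .any():
df.loc[df.gt(2).any(axis=1), "difference"] = 'Difference'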
   Col_  F_1  F_2  difference
0     1    5    0  Difference
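To see that the mask is evaluated row by row, here is a small sketch with a second row added. That second row (1, 2, 0) is made up for illustration and is not part of the question's data. Rows with no value above 2 are left as NaN in the new column:

```python
import pandas as pd

# First row is from the question; the second row is a made-up example
df = pd.DataFrame({"Col_": [1, 1], "F_1": [5, 2], "F_2": [0, 0]})

# Boolean mask: True for each row that has any value greater than 2
mask = df.gt(2).any(axis=1)

# Assign only on matching rows; non-matching rows get NaN in the new column
df.loc[mask, "difference"] = "Difference"

print(df)
```

Because the condition lives inside .loc, no if statement is needed, so the "truth value is ambiguous" error cannot occur.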
Upvotes: 1
Reputation: 16673
Since it is only one row, take the .max().max() of the dataframe. The first .max() gives the maximum of each column, and the second .max() takes the maximum across those column maxima:
if diff_df.max().max() > 2: diff_df['difference']='Difference'
Output:
   Col_  F_1  F_2  difference
0     1    5    0  Difference
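The same double reduction also explains the error in the question: c.any() returns one boolean per column, which is still a Series, so if cannot evaluate it. A minimal sketch using the frame from the question (per_column, overall and overall_max are illustrative names):

```python
import pandas as pd

# One-row frame with the values from the question
diff_df = pd.DataFrame({"Col_": [1], "F_1": [5], "F_2": [0]})

c = diff_df > 2
per_column = c.any()       # Series: one boolean per column, ambiguous in an if
overall = c.any().any()    # single boolean, safe to use in an if

# Equivalent scalar check via the maximum of the column maxima
overall_max = diff_df.max().max()

if overall_max > 2:
    diff_df["difference"] = "Difference"

print(diff_df)
```

Either if c.any().any(): or if diff_df.max().max() > 2: reduces the frame to a single value before the test.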
Upvotes: 2