SpanishBoy
SpanishBoy

Reputation: 2225

Evaluate multiple complex conditional statements in pandas

I am trying to perform conditional logic with this dataframe.

In[5]: df = pd.DataFrame({'WINNER': [2, 2, 0], 'PREDICTED': [2, 1, 0], 'HOME': [5.25, 2.2, 1.25], 'DRAW': [4.5, 3.2, 5], 'AWAY': [1.53, 3.4, 8]})
In[6]: df
Out[6]: 
   AWAY  DRAW  HOME  PREDICTED  WINNER
0  1.53   4.5  5.25          2       2
1  3.40   3.2  2.20          1       2
2  8.00   5.0  1.25          0       0

Using the following rules, I want to calculate a new profit column.

In[14]: df.loc[(df["WINNER"] == df["PREDICTED"]) & (df["PREDICTED"] == 0), "PROFIT"] = df['HOME'] * 10
In[16]: df.loc[(df["WINNER"] == df["PREDICTED"]) & (df["PREDICTED"] == 1), "PROFIT"] = df['DRAW'] * 10
In[17]: df.loc[(df["WINNER"] == df["PREDICTED"]) & (df["PREDICTED"] == 2), "PROFIT"] = df['AWAY'] * 10

I almost got the correct result:

   AWAY  DRAW  HOME  PREDICTED  WINNER  PROFIT
0  1.53   4.5  5.25          2       2    15.3
1  3.40   3.2  2.20          1       2     NaN
2  8.00   5.0  1.25          0       0    12.5

Is it possible to simplify the code in line #14-17?

How can I get table below such that NaN => -10?

AWAY  DRAW  HOME  PREDICTED  WINNER  PROFIT
0  1.53   4.5  5.25          2       2    15.3
1  3.40   3.2  2.20          1       2     -10
2  8.00   5.0  1.25          0       0    12.5

EDIT: I like proposed by imp9 solution with little modifications

categories = ['HOME', 'DRAW', 'AWAY']
df['PROFIT'] = -10
for count, col in enumerate(categories):
   df.loc[df.query('WINNER == PREDICTED == @count').index, "PROFIT"] += df[col] * 10

Upvotes: 0

Views: 533

Answers (2)

ilyas patanam
ilyas patanam

Reputation: 5324

categories = ['HOME', 'DRAW', 'AWAY']
for count, col in enumerate(categories):
    df.loc[df.query('WINNER == PREDICTED == @count').index, "PROFIT"] = df[col] * 10
df.fillna({'PROFIT': -10}, inplace = True)
  • Use a for loop with enumerate to incrementally fill in the profit column.
  • Use df.query to further simplify writing the conditional logic. The query must be passed as a string and variables should be preceded with @.
  • df.fillna() lets you replace all the NaN values in the dataframe.

EDIT: used a dictionary so only Nan in PROFIT column is filled.

Upvotes: 2

Parfait
Parfait

Reputation: 107652

Consider numpy's np.where() to reflect a nested if/then/else. The false argument would be -10:

df["PROFIT"] = np.where((df["WINNER"] == df["PREDICTED"]) & (df["PREDICTED"] == 0), 
                         df['HOME'] * 10,
                 np.where((df["WINNER"] == df["PREDICTED"]) & (df["PREDICTED"] == 1), 
                          df['DRAW'] * 10,
                   np.where((df["WINNER"] == df["PREDICTED"]) & (df["PREDICTED"] == 2), 
                             df['AWAY'] * 10, -10)))

Upvotes: 1

Related Questions