Zephyr

Reputation: 1352

If / Else flow control in pandas dataframe

I am new to Python and am working on a logical statement. My objective is to count the goals scored by each team (i.e. if a team scored a goal, I assign it 1 and the opponent is assigned -1). Below is a snapshot of the data.

I wrote the logical statement as follows:

if data['team'] == data['hometeam_team1']:
    data['run_score'] = 1
else:
    data['run_score'] = -1

but it threw a ValueError:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Can anyone advise? Your help is much appreciated. Thanks, Aung

Upvotes: 2

Views: 442

Answers (2)

jpp

Reputation: 164693

The benefit of using pandas is vectorised calculations. In other words, you very rarely need explicit for loops or if / else clauses to perform a calculation on each row.

Instead you can perform calculations on pd.Series objects. In this example, one efficient solution is to use numpy.where which acts like a vectorised if / else clause:

import numpy as np

data['run_score'] = np.where(data['team'] == data['hometeam_team1'], 1, -1)
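
As a minimal sketch of how this behaves, assuming a small made-up DataFrame (the rows below are hypothetical since the original data was not posted; only the column names come from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    'team': ['A', 'B', 'A', 'B'],
    'hometeam_team1': ['A', 'A', 'A', 'A'],
})

# Comparing two columns yields a boolean Series, one value per row.
is_home = data['team'] == data['hometeam_team1']

# np.where picks 1 where the mask is True and -1 everywhere else.
data['run_score'] = np.where(is_home, 1, -1)

print(data['run_score'].tolist())  # [1, -1, 1, -1]
```

This also shows why the original if statement fails: the comparison produces a whole Series of True / False values rather than a single boolean, so Python cannot decide which branch to take.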

Upvotes: 2

YOLO

Reputation: 21719

I am not sure if this will work since you haven't provided any data, but this is the general approach for this kind of problem. You can use the apply function here.

data['run_score'] = data.apply(lambda row: 1 if row['team'] == row['hometeam_team1'] else -1, axis=1)
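
Since no data was posted, here is a rough sketch on made-up rows (the values are hypothetical; only the column names come from the question) to check that the row-wise approach gives the expected result:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    'team': ['X', 'Y', 'Y'],
    'hometeam_team1': ['X', 'X', 'Y'],
})

# axis=1 passes each row to the lambda as a Series, so the comparison
# inside the lambda is between two scalars and a plain if / else works.
data['run_score'] = data.apply(
    lambda row: 1 if row['team'] == row['hometeam_team1'] else -1,
    axis=1,
)

# Same result via the vectorised route, for comparison.
vectorised = np.where(data['team'] == data['hometeam_team1'], 1, -1)

print(data['run_score'].tolist())  # [1, -1, 1]
```

Note that apply with axis=1 calls the lambda once per row in Python, so on large frames it is usually much slower than a vectorised comparison, although it is easier to extend to more complicated per-row logic.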

Upvotes: 1
