okpython
okpython

Reputation: 29

filtering data in pandas where string is in multiple columns

I have a dataframe that looks like this:

team_1  score_1 team_2  score_2
AUS     2       SCO     1
ENG     1       ARG     0
JPN     0       ENG     2

I can retreive all the data from a single team by using: #list specifiying team of interest

team = ['ENG']

#slice the dataframe to show only the rows where the column 'Team 1' or 'Team 2' value is in the specified string list 'team'

df.loc[df['team_1'].isin(team) | df['team_2'].isin(team)]
team_1  score_1 team_2  score_2
ENG     1       ARG     0
JPN     0       ENG     2

How can I now return only the score for my 'team' such as:

team    score
ENG     1
ENG     2

Maybe creating an index to each team so as to filter out? Maybe encoding the team_1 and team_2 columns to filter out?

Upvotes: 0

Views: 286

Answers (2)

sammywemmy
sammywemmy

Reputation: 28644

Melt the columns, filter for values in team, compute the sum of the scores column, and filter for only teams and score:

 team = ["ENG"]

(
    df
    .melt(cols, value_name="team")
    .query("team in @team")
    .assign(score=lambda x: x.filter(like="score").sum(axis=1))
    .loc[:, ["team", "score"]]
)

    team    score
1   ENG        1
5   ENG        2

Upvotes: 0

Maryam
Maryam

Reputation: 720

new_df_1 = df[df.team_1 =='ENG'][['team_1', 'score_1']]
new_df_1 =new_df_1.rename(columns={"team_1":"team", "score_1":"score"})
#   team  score
#  0  ENG      1

new_df_2 = df[df.team_2 =='ENG'][['team_2', 'score_2']]
new_df_2 = new_df_2.rename(columns={"team_2":"team", "score_2":"score"})
#  team  score
# 1  ENG      2

then concat two dataframe:

pd.concat([new_df_1, new_df_2])

the output is :

 team  score
0  ENG      1
1  ENG      2

Upvotes: 1

Related Questions