ninjaisfast
ninjaisfast

Reputation: 17

How to add pandas column values based on key from a dictionary in Python?

My dataframe looks like this: Name

I want to add the league the club plays in, and the country that league is based in, as new columns, for every row.

I initially tried this using dictionaries with the clubs/countries, and returning the key for a value:

club_country_dict = {'La Liga':['Real Madrid','FC Barcelona'],'France Ligue 1':['Paris Saint-Germain']}

key_list=list(club_country_dict.keys())
val_list=list(club_country_dict.values())

But this ran into issues since each of my keys values is actually a list, rather than a single value.

I then tried some IF THEN logic, by just having standalone variables for each league, and checking if the club value was in each variable:

la_Liga = ['Real Madrid','FC Barcelona']

for row in data:
      if data['Club'] in la_Liga:
            data['League'] = 'La Liga'

Apologies for the messy question. Basically I'm looking to add a two new columns to my dataset, 'League' and 'Country', based on the 'Club' column value. I'm not sure what's the easiest way to do this but I've hit walls trying to different ways. Thanks in advance.

Upvotes: 0

Views: 2489

Answers (4)

Valdi_Bo
Valdi_Bo

Reputation: 31011

To cope with the "list problem" in club_country_dict, convert it to the following Series:

league_club = pd.Series(club_country_dict.values(), index=club_country_dict.keys(),
    name='Club').explode()

The result is:

La Liga                   Real Madrid
La Liga                  FC Barcelona
France Ligue 1    Paris Saint-Germain
Name: Club, dtype: object

You should also have a "connection" between the league name and its country (another Series):

league_country = pd.Series({'La Liga': 'Spain', 'France Ligue 1': 'France'}, name='Country')

Of course, add here other leagues of interest with their countries.

The next step is to join them into club_details DataFrame, with Club as the index:

club_details = league_club.to_frame().join(league_country).reset_index()\
    .rename(columns={'index':'League'}).set_index('Club')

The result is:

                             League Country
Club                                       
Paris Saint-Germain  France Ligue 1  France
Real Madrid                 La Liga   Spain
FC Barcelona                La Liga   Spain

Then, assuming that your first DataFrame is named player, generate the final result:

result = player.join(club_details, on='Club')

The result is:

                Name                 Club          League Country
0  Cristiano Ronaldo          Real Madrid         La Liga   Spain
1           L. Messi         FC Barcelona         La Liga   Spain
2             Neymar  Paris Saint-Germain  France Ligue 1  France

Upvotes: 1

Amir saleem
Amir saleem

Reputation: 1496

Try pandas replace feature for Series.

df = pd.DataFrame({"Name" : ['Cristiano', 'L. Messi', "Neymar"], 'Club' : ["Real Madrid", "FC Barcelona", "Paris Saint-Germain"]})

df:

    Name        Club
0   Cristiano   Real Madrid
1   L. Messi    FC Barcelona
2   Neymar      Paris Saint-Germain

Now add new column:

club_country_dict = {'Real Madrid': 'La Liga',
                     'FC Barcelona' : "La Liga",
                     'Paris Saint-Germain': 'France Ligue 1'}
df['League'] = df.Club.replace(club_country_dict)

df:

    Name        Club                   League
0   Cristiano   Real Madrid            La Liga
1   L. Messi    FC Barcelona           La Liga
2   Neymar      Paris Saint-Germain    France Ligue 1

Upvotes: 1

Hanumanth Reddy Aredla
Hanumanth Reddy Aredla

Reputation: 407

Here is one of the simple way to solve it. Use Pandas apply function on rows https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.apply.html

import pandas as pd

df = pd.DataFrame({"name": ["CR", "Messi", "neymar"], "club": ["RM", "BR", "PSG"]})

country = {"BR": "Spain", "PSG": "France", "RM": "Spain"}

df["country"] = df.apply(lambda row: country[row.club], axis=1)

print(df)

Output:

     name club country
0      CR   RM   Spain
1   Messi   BR   Spain
2  neymar  PSG  France

Upvotes: 1

Kay
Kay

Reputation: 2332

You could convert the dictionary to a data frame and then merge:

df = pd.DataFrame({"Name": ["CR", "Messi", "neymar"], "Club": ["Real Madrid", "FC Barcelona", "Paris Saint-Germain"]})

df.merge(pd.DataFrame(club_country_dict.items(), columns=['League', 'Club']).explode('Club'),
         on = 'Club', how='left')

enter image description here

Upvotes: 3

Related Questions