Solanum tuberosum
Solanum tuberosum

Reputation: 27

How to change value in pandas DataFrame dependent on if clause in a for loop

I would like to analyse a list of orchids (input_df) if it contains orchid species that are on one of six lists. I import these lists from an xlsx file with six sheets as dictionary containing the six lists as DataFrames (orchid_checklists).

import pandas as pd
orchid_checklists = pd.read_excel('\\orchid_checklists.xlsx', sheet_name=None)
input_df = pd.read_excel('\\input.xlsx')
input_df['Orchideen-Checkliste'] = ''

With the following for loop with if condition I am trying to add the name of the Checklist into the field corresponding to the item in input_df['Input Name'] in the column ['Orchideen-Checkliste'] to visualize to what checklist one should refer.

for item in input_df['Input Name']:
    for list_name, sheet in orchid_checklists.items():
        genus = item.split(' ')[0]
        if genus in sheet['referenced'].values:
            input_df['Orchideen-Checkliste'] = list_name
            
            
        else:
            pass

In my test input list there is one species called "Bulbophyllum pachyrachis" that should be found. Unfortunately the name of the list "CL_Bulbophyllum" is put into all rows. I can´t figure out why.

In the next step I want to check if the species name is also in the column "exceptions" in either of my checklists. In that case that would not be the correct checklist. In these cases the full species name (e.g. "Aerangis ellisii", see CL_App_I and CL_III below) is found in the column "referenced" of another list. I haven´t started coding this exception, because I am still stuck on the part before, but any pointers how to approach this are highly welcomed.

This is the input data:

    Input Name          Orchideen-Checkliste
0   Sobralia madisonii  
1   Stelis cocornaensis
2   Stelis gelida
3   Braemia vittata
4   Brassia escobariana
5   Aspasia silvana
6   Bulbophyllum maximum
7   Bulbophyllum pachyrachis
8   Chondroscaphe amabilis
9   Dresslerella hispida
10  Elleanthus sodiroi
11  Maxillaria mathewsii

orchid_checklists:

CL_III
       referenced        exceptions
0        Aerangis  Aerangis ellisii
1       Angraecum               NaN
2     Ascocentrum               NaN
3        Bletilla               NaN
4      Brassavola               NaN
5        Calanthe               NaN
6       Catasetum               NaN
7        Miltonia               NaN
8    Miltoniopsis               NaN
9      Renanthera               NaN
10  Renantherella               NaN
11  Rhynchostylis               NaN
12  Rossioglossum               NaN
13          Vanda               NaN
14      Vandopsis               NaN

CL_App_I
                referenced  exceptions
0            Paphiopedilum         NaN
1            Phragmipedium         NaN
2         Aerangis ellisii         NaN
3       Cattleya jongheana         NaN
4          Cattleya lobata         NaN
5      Dendrobium cruentum         NaN
6  Mexipedium xerophyticum         NaN
7         Peristeria elata         NaN
8   Renanthera imshootiana         NaN

CL_Bulbophyllum
      referenced  exceptions
0     Acrochaene         NaN
1   Bulbophyllum         NaN
2      Chaseella         NaN
3   Codonosiphon         NaN
4        Drymoda         NaN
5      Monomeria         NaN
6    Monosepalum         NaN
7   Pedilochilus         NaN
8   Succoglossum         NaN
9        Sunipia         NaN
10         Trias         NaN

Thank you in advance for your help!

Upvotes: 0

Views: 49

Answers (1)

wwii
wwii

Reputation: 23783

input_df['Orchideen-Checkliste'] = list_name

Assigns a value to every item of that column because you did not specify a row indexer.

Without changing your process too much: enumerate the items in input_df['Input Name'] when iterating and use the enumeration to specify the row for the assignment.

for index,item in enumerate(input_df['Input Name']):
    for list_name, sheet in orchid_checklists.items():
        genus = item.split(' ')[0]
        if genus in sheet['referenced'].values:
            input_df.loc[index,'Orchideen-Checkliste'] = list_name

Upvotes: 2

Related Questions