Jason_Leto

Reputation: 65

How to add to an empty list each time a condition is met in a for loop in Python?

I have a for loop that iterates over countries. Each country column holds either a yes or a no for each animal, and I want the corresponding animal added to a list each time there is a yes. For example, I have a list like this:

Countries = ['Germany','France'..etc etc]

My DataFrame looks something like this:

animal  Germany  France  
Rabbit    yes       yes
Bear      no        yes
...

I want a list that contains each animal once for every selected country in which it has a yes. So for the example above, I would want

animal_list = [Rabbit, Rabbit, Bear]

My main code looks something like the attempt below, but it doesn't work. Is there a clean way of doing this?

 Countries = ['Germany','France'..etc etc]
 animals_list = []
 for country in Countries:
   animal_list = animal_list.append(df[df[country] == 'yes'],'animal'])

The for loop is required, so I can't do this directly with pandas alone.

Upvotes: 0

Views: 758

Answers (3)

Jason_Leto

Reputation: 65

I found a very simple solution which seems to do the trick for me.

Countries = ['Germany','France'..etc etc]
animals_list = []
for country in Countries:
   animals = list(df.loc[df[country] == 'yes', 'animal'])
   animals_list = animals_list + animals
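For reference, here is a self-contained sketch of the same loop run against the two-row table from the question. The DataFrame contents and the two-country list are assumptions copied from that example, not the full data, and `extend` is swapped in for list concatenation.

```python
import pandas as pd

# Sample data assumed from the example table in the question
df = pd.DataFrame({
    "animal": ["Rabbit", "Bear"],
    "Germany": ["yes", "no"],
    "France": ["yes", "yes"],
})
countries = ["Germany", "France"]

animals_list = []
for country in countries:
    # The boolean mask keeps rows where this country is 'yes';
    # .loc then picks the 'animal' column for those rows
    animals = df.loc[df[country] == "yes", "animal"].tolist()
    # extend() appends the items in place
    animals_list.extend(animals)

print(animals_list)  # ['Rabbit', 'Rabbit', 'Bear']
```

Note that `list.append` and `list.extend` both return `None`, which is why assigning their result back to the list (as in the original attempt) loses the data.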

Upvotes: 0

Danial

Reputation: 432

Suppose you have a DataFrame like this:

import pandas as pd

# Values follow the example table in the question
data = {'animal': ['Rabbit', 'Bear'],
        'Germany': ['yes', 'no'],
        'France': ['yes', 'yes']
       }
df = pd.DataFrame(data)

If the wanted countries are given in a list:

# In Python, try to use lowercase, underscore-separated names for your variables (PEP 8)

countries = ['Germany', 'France']

Then you can select those columns:

# Select the countries that you want
df_subset = df[df.columns.intersection(countries)]

And calculate the number of yes values for each animal:

animals_index_to_num_yes = df_subset.eq('yes').sum(axis=1)
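To make the intermediate result concrete, the sketch below prints the boolean mask and the per-row counts. It assumes the two-row table from the question; the names `yes_mask` and `num_yes` are introduced here only for illustration.

```python
import pandas as pd

# Sample data assumed from the example table in the question
df = pd.DataFrame({
    "animal": ["Rabbit", "Bear"],
    "Germany": ["yes", "no"],
    "France": ["yes", "yes"],
})
countries = ["Germany", "France"]

# Keep only the country columns that actually exist in df
df_subset = df[df.columns.intersection(countries)]

# eq('yes') gives True/False per cell; summing along axis=1
# counts the True values in each row (True counts as 1)
yes_mask = df_subset.eq("yes")
num_yes = yes_mask.sum(axis=1)

print(yes_mask)
print(num_yes.tolist())  # [2, 1]
```

The counts are indexed by the same row labels as `df`, which is what lets them be looked up per animal afterwards.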

In this way the list can be created very easily:

animals_list = []

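# Series.items() replaces iteritems(), which was removed in pandas 2.0
for index, animal in df['animal'].items():
    occurrences = animals_index_to_num_yes.get(index)
    # Repeat the animal once per yes in its row
    animals_list.extend(
        [animal] * occurrences
    )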

Notes:

  1. Try to avoid for loops in pandas where possible. In general, built-in methods perform better because they are vectorized. See this excellent answer for more.
  2. In your case, because the order of the animals in the output list matters, I'm not sure the loop can be avoided, so I used a for loop.
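If the loop requirement could be relaxed, two loop-free variants might look like the sketch below. This is not part of the approach above, only an illustration under the assumption that the data matches the question's two-row table: `Series.repeat` yields animals grouped row by row, while `DataFrame.melt` yields them grouped country by country.

```python
import pandas as pd

# Sample data assumed from the example table in the question
df = pd.DataFrame({
    "animal": ["Rabbit", "Bear"],
    "Germany": ["yes", "no"],
    "France": ["yes", "yes"],
})
countries = ["Germany", "France"]

# Row-by-row order: repeat each animal once per 'yes' in its row
num_yes = df[countries].eq("yes").sum(axis=1)
by_animal = df["animal"].repeat(num_yes.to_numpy()).tolist()

# Country-by-country order: reshape to long form (one row per
# animal/country pair), then keep only the 'yes' rows
long_df = df.melt(id_vars="animal", value_vars=countries)
by_country = long_df.loc[long_df["value"] == "yes", "animal"].tolist()

print(by_animal)
print(by_country)
```

For this small sample both orderings happen to coincide, but in general they can differ, so which one fits depends on the order the output list needs.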

Upvotes: 2

bharatadk

Reputation: 639

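animals_list = []
# Column names must match the DataFrame exactly (case-sensitive)
country_list = ['Germany', 'France']

# Walk the rows in order, checking each selected country per row
for i in range(len(df)):
    for country in country_list:
        if df[country].iloc[i] == 'yes':
            animals_list.append(df.animal.iloc[i])

print(animals_list)

Output: ['Rabbit', 'Rabbit', 'Bear']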

Upvotes: 0
