still_learning

Reputation: 806

Loop for iterating through two lists is not working

I am trying to use a loop to iterate through two lists. Unfortunately, the second for loop does not work: it only checks the first item in the list and not the rest. Could you please tell me why?

Thanks

Lists:

low_cars_engines=['Audi', 'Bentley', 'Bugatti', 'Porsche', 'Skoda']
low_planes_engines=['Pratt & Whitney','Rolls-Royce','GE Aviation']

I would like to add two more columns (Cars and Planes) to my original dataset based on if statements:

import re

df['Cars'] = pd.Series(index = df.index, dtype='object')
df['Planes'] = pd.Series(index = df.index, dtype='object')

for index, row in df.iterrows():
    value = row['Engine to check']
    for x in low_cars_engines:
        if x in value:
            print(x)
            df.at[index,'Cars'] = 'Yes' # need to keep df.at[index, '_']
            break
        else: 
            df.at[index,'Cars'] = 'No' # need to keep df.at[index, '_']
            break

for index, row in df.iterrows():
    value = row['Engine to check']
    for x in low_planes_engines:
        if x in value:
            df.at[index,'Planes'] = 'Yes'
            break
        else: 
            df[index,'Planes'] = 'No'
            break

print(df)

The first for loop works fine, but the second does not: a row in the 'Engine to check' column is never marked Yes, even when its value is in the list low_planes_engines (it always gives me No).

Could you please tell me what is wrong, and whether it would be possible to use only one for loop rather than two? I would prefer to keep the same structure, or at least keep df.at[index,'_']. Right now the second loop prints/checks only the first item of the list low_planes_engines (i.e. Pratt & Whitney) and does not go through the rest.

Since the dataset is similar to:

Audi
CFM International
Rolls-Royce
Bentley
Volkswagen
Toyota
Suzuki
Porsche

and it does not include that element, all the rows under Planes are set to No.

Upvotes: 0

Views: 88

Answers (2)

DYZ

Reputation: 57033

You should not use explicit loops when you work with pandas. DataFrames are designed for vectorized, column-wise operations, not for row-by-row access. You need a little NumPy, though:

import numpy as np
df['Cars']   = np.where(df['Engine to check'].isin(low_cars_engines), 'Yes', 'No') 
df['Planes'] = np.where(df['Engine to check'].isin(low_planes_engines), 'Yes', 'No')

Result:

#     Engine to check Cars Planes
# 0               Audi  Yes     No
# 1  CFM International   No     No
# 2        Rolls-Royce   No    Yes
# 3            Bentley  Yes     No
# 4         Volkswagen   No     No
# 5             Toyota   No     No
# 6             Suzuki   No     No
# 7            Porsche  Yes     No

You probably should not use "Yes" and "No," either. Use the boolean values True and False instead, as they are easier to filter and combine later:

df['Cars']   = df['Engine to check'].isin(low_cars_engines) 
df['Planes'] = df['Engine to check'].isin(low_planes_engines)
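One caveat: isin tests for exact equality, whereas the loop in the question used `x in value`, which is a substring test. If the column can hold longer strings that merely contain a name, something like the sketch below may be closer to the original intent. The last sample row ('Rolls-Royce Trent 900') is a made-up value added only to show the difference, and the names `pattern`, `exact` and `partial` are just for illustration.

```python
import re
import pandas as pd

low_planes_engines = ['Pratt & Whitney', 'Rolls-Royce', 'GE Aviation']

# Sample data; the last row is hypothetical and not from the question.
df = pd.DataFrame({'Engine to check': ['Audi',
                                       'CFM International',
                                       'Rolls-Royce',
                                       'Rolls-Royce Trent 900']})

# Escape each name so it is matched literally, then join the names
# with "|" so the regex matches if any one of them occurs.
pattern = '|'.join(re.escape(name) for name in low_planes_engines)

# Exact match: the whole cell must equal one of the names.
exact = df['Engine to check'].isin(low_planes_engines)

# Substring match: the cell only has to contain one of the names.
partial = df['Engine to check'].str.contains(pattern)

print(pd.DataFrame({'exact': exact, 'partial': partial}))
```

Only the last row differs between the two: isin rejects it, while str.contains accepts it.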

Finally, if everything in the DataFrame is strictly a car or a plane, only one column is required. The other will be the complement.
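If you would rather keep the loop and df.at[index, '_'] as in the question, here is a minimal sketch of a single-loop version. The likely culprit in the original code is the `break` inside the `else:` branch: it leaves the inner loop after the very first comparison, so only the first name in each list is ever tested. The sketch sidesteps that by defaulting both columns to 'No' and using any(), which tries every name before giving up. The sample DataFrame is rebuilt from the values listed in the question.

```python
import pandas as pd

low_cars_engines = ['Audi', 'Bentley', 'Bugatti', 'Porsche', 'Skoda']
low_planes_engines = ['Pratt & Whitney', 'Rolls-Royce', 'GE Aviation']

# Sample data taken from the question.
df = pd.DataFrame({'Engine to check': ['Audi', 'CFM International',
                                       'Rolls-Royce', 'Bentley',
                                       'Volkswagen', 'Toyota',
                                       'Suzuki', 'Porsche']})

# Start with 'No' everywhere; the loop only flips matches to 'Yes'.
df['Cars'] = 'No'
df['Planes'] = 'No'

for index, row in df.iterrows():
    value = row['Engine to check']
    # any() stops at the first match, but reports False only after
    # every name in the list has been compared.
    if any(x in value for x in low_cars_engines):
        df.at[index, 'Cars'] = 'Yes'
    if any(x in value for x in low_planes_engines):
        df.at[index, 'Planes'] = 'Yes'

print(df)
```

This should give the same Cars and Planes columns as the np.where version, just more slowly on large frames.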

Upvotes: 3

jcaliz

Reputation: 4021

You have an additional trailing space in the column name here:

row['Engine to check ']

Try changing it to:

row['Engine to check']

Upvotes: 1
