Reputation: 806
I am trying to use a loop to iterate through two lists. Unfortunately, the second for loop does not work: it only checks the first item in the list and skips the rest. Could you please tell me why?
Thanks
Lists:
low_cars_engines=['Audi', 'Bentley', 'Bugatti', 'Porsche', 'Skoda']
low_planes_engines=['Pratt & Whitney','Rolls-Royce','GE Aviation']
I would like to add two more columns (Cars and Planes) to my original dataset based on if statements:
import pandas as pd

df['Cars'] = pd.Series(index=df.index, dtype='object')
df['Planes'] = pd.Series(index=df.index, dtype='object')

for index, row in df.iterrows():
    value = row['Engine to check']
    for x in low_cars_engines:
        if x in value:
            print(x)
            df.at[index, 'Cars'] = 'Yes'  # need to keep df.at[index, '_']
            break
        else:
            df.at[index, 'Cars'] = 'No'  # need to keep df.at[index, '_']
            break

for index, row in df.iterrows():
    value = row['Engine to check']
    for x in low_planes_engines:
        if x in value:
            df.at[index, 'Planes'] = 'Yes'
            break
        else:
            df.at[index, 'Planes'] = 'No'
            break

print(df)
The first for loop works fine, but the second does not: no row is ever marked Yes under Planes, even when its 'Engine to check' value is in the list low_planes_engines (I always get No).
Could you please tell me what is wrong, and whether it would be possible to use only one for loop rather than two? I would prefer to keep the same structure, or at least keep df.at[index,'_']. Right now the second loop checks (and prints) only the first item of the list low_planes_engines (i.e. Pratt & Whitney) and does not go through the rest.
Since the dataset is similar to:
Audi
CFM International
Rolls-Royce
Bentley
Volkswagen
Toyota
Suzuki
Porsche
and it does not include that element, all the rows under Planes are set to No.
Upvotes: 0
Views: 88
Reputation: 57033
Avoid explicit loops when you work with pandas: DataFrames are designed for vectorized, column-wise operations rather than row-by-row access. You need some NumPy, though:
import numpy as np
df['Cars'] = np.where(df['Engine to check'].isin(low_cars_engines), 'Yes', 'No')
df['Planes'] = np.where(df['Engine to check'].isin(low_planes_engines), 'Yes', 'No')
Result:
#      Engine to check Cars Planes
# 0               Audi  Yes     No
# 1  CFM International   No     No
# 2        Rolls-Royce   No    Yes
# 3            Bentley  Yes     No
# 4         Volkswagen   No     No
# 5             Toyota   No     No
# 6             Suzuki   No     No
# 7            Porsche  Yes     No
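One caveat: isin matches whole cell values, whereas the loop in the question tests substrings (x in value). If a cell can hold a longer string such as 'Audi A4', a substring match is probably closer to the original intent. A minimal sketch, assuming the column name and lists from the question; the sample rows and the helper name contains_any are made up for illustration:

```python
import re

import numpy as np
import pandas as pd

low_cars_engines = ['Audi', 'Bentley', 'Bugatti', 'Porsche', 'Skoda']
low_planes_engines = ['Pratt & Whitney', 'Rolls-Royce', 'GE Aviation']

# Hypothetical sample data; the real DataFrame comes from the question.
df = pd.DataFrame({'Engine to check': ['Audi A4', 'CFM International',
                                       'Rolls-Royce Trent', 'Toyota']})


def contains_any(series, names):
    """Return a boolean Series: True where a cell contains any of the names."""
    # re.escape keeps characters such as '&' from being read as regex syntax.
    pattern = '|'.join(re.escape(name) for name in names)
    return series.str.contains(pattern, regex=True, na=False)


df['Cars'] = np.where(contains_any(df['Engine to check'], low_cars_engines), 'Yes', 'No')
df['Planes'] = np.where(contains_any(df['Engine to check'], low_planes_engines), 'Yes', 'No')
print(df)
```

With exact values like those shown in the question, isin and this substring match should give the same columns; they only differ once cells contain extra text.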
You probably should not use "Yes" and "No" either. Use the boolean values True and False instead, as they are easier to work with later:
df['Cars'] = df['Engine to check'].isin(low_cars_engines)
df['Planes'] = df['Engine to check'].isin(low_planes_engines)
Finally, if everything in the DataFrame is strictly a car or a plane, only one column is required. The other will be the complement.
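If the row-by-row structure with df.at[index, '_'] has to stay, as the question requests, the likely culprit is the else branch attached to the if: it writes 'No' and breaks out of the inner loop as soon as the first name fails to match, so only 'Pratt & Whitney' is ever tested. A minimal sketch of a single loop that settles on 'No' only after every name has been tried, using the built-in any(); the sample rows are assumed from the question's description:

```python
import pandas as pd

low_cars_engines = ['Audi', 'Bentley', 'Bugatti', 'Porsche', 'Skoda']
low_planes_engines = ['Pratt & Whitney', 'Rolls-Royce', 'GE Aviation']

# Sample data assumed from the rows listed in the question.
df = pd.DataFrame({'Engine to check': ['Audi', 'CFM International', 'Rolls-Royce',
                                       'Bentley', 'Volkswagen', 'Toyota',
                                       'Suzuki', 'Porsche']})

df['Cars'] = pd.Series(index=df.index, dtype='object')
df['Planes'] = pd.Series(index=df.index, dtype='object')

# One pass over the rows fills both columns.
for index, row in df.iterrows():
    value = row['Engine to check']
    # any() walks the whole list and is True as soon as one name matches;
    # 'No' is chosen only when no name in the list matched.
    is_car = any(x in value for x in low_cars_engines)
    is_plane = any(x in value for x in low_planes_engines)
    df.at[index, 'Cars'] = 'Yes' if is_car else 'No'
    df.at[index, 'Planes'] = 'Yes' if is_plane else 'No'

print(df)
```

This keeps the df.at assignments but is still slower than the vectorized versions on large DataFrames.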
Upvotes: 3
Reputation: 4021
You have an additional space here
row['Engine to check ']
Try changing it to
row['Engine to check']
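A mistyped key like this normally raises a KeyError, so comparing the key against df.columns is a quick way to confirm the diagnosis. A small sketch with a made-up two-row DataFrame; stripping whitespace from the column labels is an optional extra for the case where the labels themselves carry stray spaces, not something the question requires:

```python
import pandas as pd

# Hypothetical data with the column name used in the question.
df = pd.DataFrame({'Engine to check': ['Audi', 'Rolls-Royce']})

# Membership tests on df.columns compare the labels exactly.
has_key_with_space = 'Engine to check ' in df.columns  # note the trailing space
has_key_clean = 'Engine to check' in df.columns
print(list(df.columns), has_key_with_space, has_key_clean)

# Optional: normalize labels that picked up stray spaces (e.g. from a CSV header).
df.columns = df.columns.str.strip()
```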
Upvotes: 1