Reputation: 357
Below is my code:
years_list = set()
for i in range(0,indicators_csv.shape[0]) :
    if (indicators_csv['CountryCode'].str.contains('USA')) :
        years_list.append(indicator_csv.iloc[i].Year)
Here indicators_csv is a CSV file with a 'CountryCode' column. When I run this, I get the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
on the if line. I also tried:
if (indicators_csv['CountryCode'] == ('USA'))
but I get the same error.
I googled it and found some answers about numeric comparisons or about and/or, but nothing like my case.
Upvotes: 1
Views: 416
Reputation: 12417
If I understood you correctly and you want to iterate over the df instead of using a vectorised approach, you can use:
years_list = []
for index, row in indicators_csv.iterrows():
    if ('USA' in row['CountryCode']):
        years_list.append(row['Year'])
Input:
CountryCode Year
0 USA 1980
1 UK 1990
2 FR 1984
3 USA 2000
Output:
[1980L, 2000L]
Upvotes: 2
Reputation: 10359
You should try to avoid iterating over pandas objects as much as possible - it's much slower than the native vectorised operations. Your issue is that indicators_csv['CountryCode'].str.contains('USA') checks if 'USA' is in 'CountryCode' for every row, so you end up with a column of True and False entries, and an if statement cannot reduce a whole column to a single True or False.
What you want to do instead is filter the dataframe to just those rows that contain 'USA' and then convert the 'Year' column from that frame to a list. You can do all of this directly in one operation (split across two lines for readability):
years_list = indicators_csv[indicators_csv['CountryCode'].str.contains('USA')]\
    ['Year'].tolist()
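As a minimal sketch of the same filter, here it is on a small made-up frame that mirrors the sample input from the other answer. The data, the intermediate name mask, and the regex=False argument are assumptions for illustration, not part of the original code:

```python
import pandas as pd

# Hypothetical sample data standing in for the real indicators CSV.
indicators_csv = pd.DataFrame({
    'CountryCode': ['USA', 'UK', 'FR', 'USA'],
    'Year': [1980, 1990, 1984, 2000],
})

# Boolean mask: one True/False entry per row.
mask = indicators_csv['CountryCode'].str.contains('USA', regex=False)

# Keep the 'Year' values of the rows where the mask is True.
years_list = indicators_csv.loc[mask, 'Year'].tolist()
print(years_list)  # [1980, 2000]
```

If the codes should match exactly rather than merely contain 'USA', indicators_csv['CountryCode'] == 'USA' gives the same kind of mask.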
Upvotes: 1
Reputation: 149
The error is raised because you are using a Series of boolean values in an if clause, which expects a single boolean.
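A small sketch of what that means in practice. The Series flags is a hypothetical stand-in for the result of .str.contains('USA'); .any() and .all() are the reductions the error message itself suggests:

```python
import pandas as pd

# Hypothetical boolean Series, like the result of .str.contains('USA').
flags = pd.Series([True, False, False, True])

try:
    if flags:  # ambiguous: which of the four values should decide?
        pass
except ValueError as err:
    message = str(err)
    print(message)

# Reduce the Series to a single boolean explicitly.
print(flags.any())  # True, because at least one entry is True
print(flags.all())  # False, because not every entry is True
```

Note that .any() answers "does any row match?", which is a different question from "which rows match?". Collecting the matching years still requires filtering with the mask.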
Upvotes: 0