Reputation: 11
industry_usa = f500["industry"][f500["country"] == "USA"].value_counts().head(2)
Here f500 is a dataframe whose columns include industry and country. Why do we need to put the two column selections side by side when creating the industry_usa series? Please explain.
Upvotes: 0
Views: 38
Reputation: 3294
I will break it down for you:
f500["industry"]: This selects the column named "industry" and returns it as a Series.
f500["country"] == "USA": This returns a boolean Series (a mask) that is True for every row whose country is "USA" and False for every other row.
f500["industry"][f500["country"] == "USA"]: As you might have guessed, this is ordinary boolean indexing in pandas. The first pair of brackets picks the "industry" column and the second pair filters its rows with the mask, so you get only the industries of the rows where the country is "USA".
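.value_counts(): This counts how many times each unique value occurs and sorts the counts in descending order, much like the Counter class in Python's collections module. The trailing .head(2) then keeps only the two most frequent industries.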
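To make the steps above concrete, here is a minimal sketch that runs each one separately. The small f500 dataframe below is a made-up stand-in (its companies and values are assumptions for illustration, not the real dataset), and the intermediate variable names are mine.

```python
import pandas as pd

# Hypothetical miniature stand-in for the real f500 dataframe
f500 = pd.DataFrame({
    "company": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "country": ["USA", "China", "USA", "USA", "Japan", "USA", "USA", "USA"],
    "industry": ["Banking", "Energy", "Retail", "Banking",
                 "Motor Vehicles", "Banking", "Retail", "Energy"],
})

# Step 1: select the "industry" column as a Series
industry = f500["industry"]

# Step 2: build a boolean mask, True where the country is "USA"
mask = f500["country"] == "USA"

# Step 3: keep only the industries of the USA rows
usa_industries = industry[mask]

# Step 4: count each unique industry (sorted by count, descending)
counts = usa_industries.value_counts()

# Step 5: keep the two most frequent industries
industry_usa = counts.head(2)

print(industry_usa)
```

With this sample data, the mask keeps six of the eight rows, and the final series holds the counts for Banking (3) and Retail (2).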
NOTE: Interestingly, you could also swap the order and write f500[f500["country"] == "USA"]["industry"], which filters the rows first and then selects the column. You still get the same result.
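If you want to check the note about ordering yourself, a sketch like the following compares both forms. The dataframe is again a made-up stand-in for f500. The third form, using .loc, is not from the question; I include it only because it does the row filter and column selection in a single indexing step, which is a common way to write this.

```python
import pandas as pd

# Hypothetical miniature stand-in for the real f500 dataframe
f500 = pd.DataFrame({
    "country": ["USA", "China", "USA", "Japan", "USA"],
    "industry": ["Banking", "Energy", "Retail", "Motor Vehicles", "Banking"],
})

mask = f500["country"] == "USA"

# Column first, then rows (the form used in the question)
column_then_rows = f500["industry"][mask]

# Rows first, then column (the swapped order)
rows_then_column = f500[mask]["industry"]

# Rows and column in one step with .loc
with_loc = f500.loc[mask, "industry"]

# Series.equals compares both the values and the index
same = column_then_rows.equals(rows_then_column) and column_then_rows.equals(with_loc)
print(same)
```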
Upvotes: 1