Reputation: 73
I have a DataFrame with several columns (8-10). One of them is the year column and another is the arrival column ('Year' and 'Arrived' in the code below). The year column contains data from three years: 2018, 2019 and 2020. I want to find the sum of arrivals for the year 2019. I thought this would be pretty basic, but I am not getting the right results. Could someone show me how to approach this?
I've heard df.loc could be used, but I am unsure how to apply it here.
Current code:
df=pd.read_excel('xyz.xlsx')
while df['Year'== '2019']:
arrived= df['Arrived'].sum()
print(arrived)
Upvotes: 3
Views: 10664
Reputation: 2135
Another approach, in case you want the sum for every year, is to use the groupby operation:
per_year = df.groupby('Year')['Arrived'].sum()
This would give you a series, and you could then see the value for 2019 specifically with:
per_year['2019']
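As a minimal sketch of the groupby approach, here is a self-contained version on made-up data (the sample values are hypothetical and stand in for xyz.xlsx). One assumption worth checking: read_excel will often load a numeric year column as integers, in which case the lookup key is 2019 rather than '2019'. The key has to match the dtype of the Year column.

```python
import pandas as pd

# Hypothetical sample data standing in for the contents of xyz.xlsx
df = pd.DataFrame({
    "Year": [2018, 2019, 2019, 2020],
    "Arrived": [10, 20, 30, 40],
})

# One sum per year; the result is a Series indexed by Year
per_year = df.groupby("Year")["Arrived"].sum()

# Year is an integer column here, so the label is the integer 2019
total_2019 = per_year.loc[2019]
print(total_2019)
```

If df['Year'].dtype reports object (strings), use per_year.loc['2019'] instead.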
Upvotes: 2
Reputation: 10960
The first argument to loc selects the rows (here, a boolean filter on Year) and the second selects the column:
df.loc[df['Year'] == '2019', 'Arrived'].sum()
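For illustration, the same filter on made-up data (the sample values are hypothetical). Since it is not clear from the question whether Year was read as integers or strings, this sketch converts the column to strings before comparing. That conversion is an assumption added here, not part of the one-liner above.

```python
import pandas as pd

# Hypothetical sample data standing in for the contents of xyz.xlsx
df = pd.DataFrame({
    "Year": [2018, 2019, 2019, 2020],
    "Arrived": [10, 20, 30, 40],
})

# Boolean mask: True for rows whose Year is 2019.
# astype(str) lets the comparison work for integer or string years.
mask = df["Year"].astype(str) == "2019"

# Rows chosen by the mask, column chosen by name, then summed
arrived_2019 = df.loc[mask, "Arrived"].sum()
print(arrived_2019)
```

If the comparison matches no rows, the sum is 0, which is a common symptom of comparing an integer column against the string '2019'.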
Upvotes: 5