Abrar Sayeed

Reputation: 73

Filtering a pandas DataFrame by one column and getting the sum of values in another column

I have a DataFrame with multiple columns (8-10), one of which is the year column. Another is the arrival column. The year column contains data from three years: 2018, 2019 and 2020. I want to find the sum of arrivals for the year 2019. I thought it would be pretty basic, but I am not getting the right results. Could someone show me how to approach this?

I've heard df.loc could be used but am unsure how to approach that.

Current code:

df=pd.read_excel('xyz.xlsx')
while df['Year'== '2019']:
    arrived= df['Arrived'].sum()
    print(arrived)

Upvotes: 3

Views: 10664

Answers (2)

aiguofer

Reputation: 2135

Another approach here, in case you wanted to have the sum for every year, would be to use the groupby operation:

per_year = df.groupby('Year')['Arrived'].sum()

This would give you a series, and you could then see the value for 2019 specifically with:

per_year['2019']
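
For illustration, here is a minimal, self-contained sketch of the groupby approach. The sample data is made up, and it assumes the Year column holds integers (as it often does when read from Excel), so the lookup key is 2019 rather than '2019'. If your column really holds strings, keep the quotes.

```python
import pandas as pd

# Hypothetical sample data standing in for the contents of the Excel file.
df = pd.DataFrame({
    "Year": [2018, 2019, 2019, 2020],
    "Arrived": [10, 20, 30, 40],
})

# One total per distinct year; the result is a Series indexed by Year.
per_year = df.groupby("Year")["Arrived"].sum()

# The index holds integers here, so look the year up with an integer label.
total_2019 = per_year.loc[2019]
print(total_2019)
```

Checking df['Year'].dtype first shows which kind of key the lookup needs.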

Upvotes: 2

Vishnudev Krishnadas

Reputation: 10960

The first argument to loc selects the rows (here, a boolean filter on the Year column) and the second selects the column.

df.loc[df['Year'] == '2019', 'Arrived'].sum()
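
As a rough sketch of how this behaves, the example below builds a small made-up frame and applies the same loc pattern. It assumes the Year column holds integers, which is one possible reason for wrong results: comparing an integer column with the string '2019' matches no rows, so the sum comes out as 0.

```python
import pandas as pd

# Hypothetical sample data standing in for the contents of the Excel file.
df = pd.DataFrame({
    "Year": [2018, 2019, 2019, 2020],
    "Arrived": [10, 20, 30, 40],
})

# Boolean mask: True for the rows whose Year equals 2019.
mask = df["Year"] == 2019

# Rows are chosen by the mask, the column by its label, then summed.
arrived_2019 = df.loc[mask, "Arrived"].sum()
print(arrived_2019)

# With an integer Year column, a string key selects nothing,
# and the sum of the empty selection is 0.
mismatched = df.loc[df["Year"] == "2019", "Arrived"].sum()
print(mismatched)
```

If the column does hold strings, the quoted form from the answer is the one to use.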

Upvotes: 5
