Rémi Rémi
Rémi Rémi

Reputation: 21

Create the equivalent of sumif function in R

I would like to create a new variable in a dataframe in R, essentially using the equivalent of a "sumif" function in excel.

I have a dataframe which looks like this (but is much larger):

Country   Year       Number of IMF programs during the year
  AFG     2000                       1
  ARG     2000                       1
  AFG     2001                       0
  ARG     2001                       1

I would like to create a new variable "number of IMF programs over the period": the sum of the number of programs in the country, over the period.

In the example, we can see from the column "IMF programs during the year" that Afghanistan had 1 program over the period (1 in 2000, 0 in 2001), and Argentina had 2. so the table would look like this:

Country   Year  IMF programs during the year    N of programs over the period
  BRA     2000                 1                               1
  ARG     2000                 1                               2
  BRA     2001                 0                               1
  ARG     2001                 1                               2

I have seen tutorials, but they only show how to create subsets of observations matching certain criteria, and then calculate sums within the subsets. I would be able to do this, but it is not what I need. I would need another variable...

The intuition behind the function would be: sum "number of programs in the year" if ("country" = "the country that corresponds to this row").

Would you be able to provide me with some guidance ?

Many thanks for your help !

Upvotes: 2

Views: 243

Answers (2)

akrun
akrun

Reputation: 887128

We can use aggregate from base R

aggregate(IMFPrograms ~ country, df,  FUN  = sum)

If we need to create a column, use ave

df$NPrograms <- with(df, ave(IMFPrograms, country, FUN = sum))

Or using rowsum

rowsum(df$IMFPrograms, df$country)

Upvotes: 1

Reeza
Reeza

Reputation: 21274

library(tidyverse)

want <- df %>%
        group_by(country) %>%
        mutate(NPrograms = sum(IMFPrograms))

Upvotes: 3

Related Questions