887

Reputation: 619

Count observations with a certain value in a group?

I am working with the following data frame:

Year  Month      Day   X      Y
2018  January    1     4.5    6
2018  January    4     3.2    8.1
2018  January    11    1.1    2.3
2018  February   7     5.4    2.2
2018  February   15    1.5    4.4
2019  January    3     8.6    2.3
2019  January    22    1.1    2.5
2019  January    23    5.5    7.8
2019  February   5     6.9    1.1
2019  February   10    1.8    1.3

I want to create a new column that gives, for each year and month, the number of observations where X is greater than Y.

Year  Month      Day   X      Y       XGreaterThanYCount
2018  January    1     4.5    6             0
2018  January    4     3.2    8.1           0
2018  January    11    1.1    2.3           0
2018  February   7     5.4    2.2           1
2018  February   15    1.5    4.4           1
2019  January    3     8.6    2.3           1
2019  January    22    1.1    2.5           1
2019  January    23    5.5    7.8           1
2019  February   5     6.9    1.1           2
2019  February   10    1.8    1.3           2

I tried computing a logical test, df$XYTest <- df$X > df$Y, and then using it inside mutate:

df <- df %>%
  group_by(Year, Month) %>%
  mutate(XGreaterThanYCount = count(XYTest = TRUE))

But I can't get it to work, and I'm not sure this is a good strategy anyway.

Upvotes: 1

Views: 53

Answers (2)

jay.sf

Reputation: 72813

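With ave from base R, which applies a function within each group and returns a vector the same length as the input: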

dat <- transform(dat, XgreaterY=ave(X > Y, Year, Month, FUN=sum))
dat
#    Year    Month Day   X   Y XgreaterY
# 1  2018  January   1 4.5 6.0         0
# 2  2018  January   4 3.2 8.1         0
# 3  2018  January  11 1.1 2.3         0
# 4  2018 February   7 5.4 2.2         1
# 5  2018 February  15 1.5 4.4         1
# 6  2019  January   3 8.6 2.3         1
# 7  2019  January  22 1.1 2.5         1
# 8  2019  January  23 5.5 7.8         1
# 9  2019 February   5 6.9 1.1         2
# 10 2019 February  10 1.8 1.3         2

Data:

dat <- structure(list(Year = c(2018L, 2018L, 2018L, 2018L, 2018L, 2019L, 
2019L, 2019L, 2019L, 2019L), Month = c("January", "January", 
"January", "February", "February", "January", "January", "January", 
"February", "February"), Day = c(1L, 4L, 11L, 7L, 15L, 3L, 22L, 
23L, 5L, 10L), X = c(4.5, 3.2, 1.1, 5.4, 1.5, 8.6, 1.1, 5.5, 
6.9, 1.8), Y = c(6, 8.1, 2.3, 2.2, 4.4, 2.3, 2.5, 7.8, 1.1, 1.3
)), class = "data.frame", row.names = c(NA, -10L))

Upvotes: 0

AnilGoyal

Reputation: 26218

library(dplyr)

df <- df %>%
  group_by(Year, Month) %>%
  # X > Y is a logical vector; sum() counts its TRUE values within each group
  mutate(XGreaterThanYCount = sum(X > Y))
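
A short note on why this works where the attempt in the question did not: as far as I can tell, dplyr's count() expects a data frame as its first argument and returns one row per group, so it is not suited to producing a value inside mutate. Summing a logical vector, on the other hand, counts its TRUE values, and inside a grouped mutate that single number is recycled to every row of the group. Below is a minimal, self-contained sketch using the data from the question; the name result and the trailing ungroup() are my additions, not part of the original answer.

```r
library(dplyr)

# Data copied from the question
df <- data.frame(
  Year  = rep(c(2018L, 2019L), each = 5),
  Month = rep(c("January", "February", "January", "February"),
              times = c(3, 2, 3, 2)),
  Day   = c(1L, 4L, 11L, 7L, 15L, 3L, 22L, 23L, 5L, 10L),
  X     = c(4.5, 3.2, 1.1, 5.4, 1.5, 8.6, 1.1, 5.5, 6.9, 1.8),
  Y     = c(6, 8.1, 2.3, 2.2, 4.4, 2.3, 2.5, 7.8, 1.1, 1.3)
)

result <- df %>%
  group_by(Year, Month) %>%
  # sum() treats TRUE as 1 and FALSE as 0, so this counts rows with X > Y
  mutate(XGreaterThanYCount = sum(X > Y)) %>%
  # drop the grouping so later steps are not grouped by accident
  ungroup()

result
```

The separate df$XYTest column from the question is not needed, since the comparison can be written directly inside sum().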

Upvotes: 1
