Reputation: 619
I am working with the following data frame:
Year Month    Day X   Y
2018 January  1   4.5 6
2018 January  4   3.2 8.1
2018 January  11  1.1 2.3
2018 February 7   5.4 2.2
2018 February 15  1.5 4.4
2019 January  3   8.6 2.3
2019 January  22  1.1 2.5
2019 January  23  5.5 7.8
2019 February 5   6.9 1.1
2019 February 10  1.8 1.3
I want to create a new column that gives, for each year and month, the number of observations where X is greater than Y. The desired output is:
Year Month    Day X   Y   XGreaterThanYCount
2018 January  1   4.5 6   0
2018 January  4   3.2 8.1 0
2018 January  11  1.1 2.3 0
2018 February 7   5.4 2.2 1
2018 February 15  1.5 4.4 1
2019 January  3   8.6 2.3 1
2019 January  22  1.1 2.5 1
2019 January  23  5.5 7.8 1
2019 February 5   6.9 1.1 2
2019 February 10  1.8 1.3 2
I tried to create a logical test column:
df$XYTest <- df$X > df$Y
and then use it inside mutate:
df <- df %>%
group_by(Year, Month) %>%
mutate(XGreaterThanYCount = count(XYTest = TRUE))
But I can't seem to make it work and I'm not sure this is a good strategy anyway.
Upvotes: 1
Views: 53
Reputation: 72813
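With ave from base R, which applies a function within each group and returns a vector the same length as the input: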
dat <- transform(dat, XgreaterY=ave(X > Y, Year, Month, FUN=sum))
dat
# Year Month Day X Y XgreaterY
# 1 2018 January 1 4.5 6.0 0
# 2 2018 January 4 3.2 8.1 0
# 3 2018 January 11 1.1 2.3 0
# 4 2018 February 7 5.4 2.2 1
# 5 2018 February 15 1.5 4.4 1
# 6 2019 January 3 8.6 2.3 1
# 7 2019 January 22 1.1 2.5 1
# 8 2019 January 23 5.5 7.8 1
# 9 2019 February 5 6.9 1.1 2
# 10 2019 February 10 1.8 1.3 2
Data:
dat <- structure(list(Year = c(2018L, 2018L, 2018L, 2018L, 2018L, 2019L,
2019L, 2019L, 2019L, 2019L), Month = c("January", "January",
"January", "February", "February", "January", "January", "January",
"February", "February"), Day = c(1L, 4L, 11L, 7L, 15L, 3L, 22L,
23L, 5L, 10L), X = c(4.5, 3.2, 1.1, 5.4, 1.5, 8.6, 1.1, 5.5,
6.9, 1.8), Y = c(6, 8.1, 2.3, 2.2, 4.4, 2.3, 2.5, 7.8, 1.1, 1.3
)), class = "data.frame", row.names = c(NA, -10L))
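As a rough sketch of why the ave call above produces a count: X > Y is a logical vector, and sum() treats TRUE as 1 and FALSE as 0, so summing within each Year/Month group counts the rows where X exceeds Y. The small data frame d below is just the 2018 rows of the example data, rebuilt for illustration, and as.integer() is added here only to make the coercion explicit.

```r
# Illustration only: the five 2018 rows from the example data
d <- data.frame(
  Year  = c(2018L, 2018L, 2018L, 2018L, 2018L),
  Month = c("January", "January", "January", "February", "February"),
  X     = c(4.5, 3.2, 1.1, 5.4, 1.5),
  Y     = c(6, 8.1, 2.3, 2.2, 4.4)
)

# Logical comparison: TRUE where X exceeds Y
is_greater <- d$X > d$Y

# sum() coerces TRUE/FALSE to 1/0, so this is the overall count of TRUEs
total <- sum(is_greater)

# ave() applies FUN within each Year/Month group and returns a vector
# as long as the input, so the result can be stored as a column
counts <- ave(as.integer(is_greater), d$Year, d$Month, FUN = sum)
counts
```

Here the three January rows get 0 and the two February rows get 1, matching the first five rows of the output above.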
Upvotes: 0
Reputation: 26218
library(dplyr)
# X > Y is a logical vector; sum() counts the TRUEs within each Year/Month group
df <- df %>%
  group_by(Year, Month) %>%
  mutate(XGreaterThanYCount = sum(X > Y))
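For context, the original attempt likely fails because dplyr's count() works on a whole data frame rather than on a single column, so it is not suited to use inside mutate(); sum() over a logical vector does the counting instead. A self-contained sketch, assuming only the 2019 rows of the question's data (rebuilt here for brevity) and adding ungroup() as an optional extra step:

```r
library(dplyr)

# Illustration only: the five 2019 rows from the question's data
df <- data.frame(
  Year  = c(2019L, 2019L, 2019L, 2019L, 2019L),
  Month = c("January", "January", "January", "February", "February"),
  Day   = c(3L, 22L, 23L, 5L, 10L),
  X     = c(8.6, 1.1, 5.5, 6.9, 1.8),
  Y     = c(2.3, 2.5, 7.8, 1.1, 1.3)
)

res <- df %>%
  group_by(Year, Month) %>%
  # sum() of the logical X > Y counts the TRUEs in each group
  mutate(XGreaterThanYCount = sum(X > Y)) %>%
  # drop the grouping so later operations are not applied per group
  ungroup()

res$XGreaterThanYCount
```

No intermediate XYTest column is needed, since the comparison can be written directly inside sum().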
Upvotes: 1