Reputation: 796
Given this dataframe:
require(dplyr)
require(ggplot2)
require(forcats)
class <- c(1, 4,1,3, 2, 2,4, 1, 4, 5, 2, 4, 2,2,2)
prog <- c("Bac2", "Bac2","Bac2","Bac", "Master", "Master","Bac", "Bac", "DEA", "Doctorat",
"DEA", "Bac", "DEA","DEA","Bac")
mydata <- data.frame(height = class, prog)
res <- mydata %>% group_by(prog, height) %>%
  tally() %>% mutate(prop = n / sum(n))
I need to create a new column "new" for each value of "prog". If no row in the group has prop > 0.5, "new" should be 99 for every row of that group.
If a row in the group has prop > 0.5, "new" should be the "height" value that corresponds to the maximum prop in that group.
Desired output:
prog height n prop new
<chr> <dbl> <int> <dbl> <dbl>
1 Bac 1 1 0.2 99
2 Bac 2 1 0.2 99
3 Bac 3 1 0.2 99
4 Bac 4 2 0.4 99
5 Bac2 1 2 0.667 1
6 Bac2 4 1 0.333 1
7 DEA 2 3 0.75 2
8 DEA 4 1 0.25 2
9 Doctorat 5 1 1 5
10 Master 2 2 1 2
Upvotes: 0
Views: 38
Reputation: 52004
Use group_by on prog, then ifelse:
library(dplyr)
res %>%
group_by(prog) %>%
mutate(new = ifelse(any(prop > 0.5), height[prop > 0.5], 99))
Output:
# A tibble: 10 × 5
# Groups: prog [5]
prog height n prop new
<chr> <dbl> <int> <dbl> <dbl>
1 Bac 1 1 0.2 99
2 Bac 2 1 0.2 99
3 Bac 3 1 0.2 99
4 Bac 4 2 0.4 99
5 Bac2 1 2 0.667 1
6 Bac2 4 1 0.333 1
7 DEA 2 3 0.75 2
8 DEA 4 1 0.25 2
9 Doctorat 5 1 1 5
10 Master 2 2 1 2
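A possible variant, as a sketch rather than a replacement for the code above: since the condition is a single TRUE/FALSE per group, a plain if/else with which.max states the "height at the max prop" rule directly. The names res2 and the use of count() in place of group_by() plus tally() are choices made for this example only.

```r
library(dplyr)

# Rebuild the example data from the question
class <- c(1, 4, 1, 3, 2, 2, 4, 1, 4, 5, 2, 4, 2, 2, 2)
prog <- c("Bac2", "Bac2", "Bac2", "Bac", "Master", "Master", "Bac", "Bac",
          "DEA", "Doctorat", "DEA", "Bac", "DEA", "DEA", "Bac")
mydata <- data.frame(height = class, prog)

# Count each (prog, height) pair, then compute proportions within prog
res <- mydata %>%
  count(prog, height) %>%
  group_by(prog) %>%
  mutate(prop = n / sum(n))

# One scalar decision per group: the height at the largest prop if that
# prop exceeds 0.5, otherwise 99. The scalar is recycled over the group.
res2 <- res %>%
  mutate(new = if (max(prop) > 0.5) height[which.max(prop)] else 99) %>%
  ungroup()

print(res2)
```

Both versions give the same "new" column on this data. The ifelse form works because ifelse returns a result as long as its test, so with a length-one test it keeps only the first element of height[prop > 0.5]; within a group at most one row can have prop > 0.5, so that first element is the one wanted.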
Upvotes: 2