Tpellirn

Reputation: 796

How to filter columns in a data frame in R?

Given this dataframe:

 library(dplyr)
 library(ggplot2)
 library(forcats)
 class <- c(1, 4, 1, 3, 2, 2, 4, 1, 4, 5, 2, 4, 2, 2, 2)
 prog <- c("Bac2", "Bac2", "Bac2", "Bac", "Master", "Master", "Bac", "Bac", "DEA", "Doctorat",
           "DEA", "Bac", "DEA", "DEA", "Bac")
 mydata <- data.frame(height = class, prog)
 res <- mydata %>%
   group_by(prog, height) %>%
   tally() %>%
   mutate(prop = n / sum(n))

I need to create a new column, "new", computed per value of "prog". If no row in the group has prop > 0.5, "new" should be 99 for every row of that group.
If some row in the group has prop > 0.5, "new" should be the "height" value of the row with the maximum prop.

Desired output:

   prog     height     n  prop   new
   <chr>     <dbl> <int> <dbl> <dbl>
 1 Bac           1     1 0.2      99
 2 Bac           2     1 0.2      99
 3 Bac           3     1 0.2      99
 4 Bac           4     2 0.4      99
 5 Bac2          1     2 0.667     1
 6 Bac2          4     1 0.333     1
 7 DEA           2     3 0.75      2
 8 DEA           4     1 0.25      2
 9 Doctorat      5     1 1         5
10 Master        2     2 1         2

Upvotes: 0

Views: 38

Answers (1)

Maël

Reputation: 52004

Group by prog and use ifelse() on whether any prop in the group exceeds 0.5:

library(dplyr)
res %>% 
  group_by(prog) %>% 
  mutate(new = ifelse(any(prop > 0.5), height[prop > 0.5], 99))

Output:

# A tibble: 10 × 5
# Groups:   prog [5]
   prog     height     n  prop   new
   <chr>     <dbl> <int> <dbl> <dbl>
 1 Bac           1     1 0.2      99
 2 Bac           2     1 0.2      99
 3 Bac           3     1 0.2      99
 4 Bac           4     2 0.4      99
 5 Bac2          1     2 0.667     1
 6 Bac2          4     1 0.333     1
 7 DEA           2     3 0.75      2
 8 DEA           4     1 0.25      2
 9 Doctorat      5     1 1         5
10 Master        2     2 1         2
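
As a possible variant (a sketch, not part of the original answer): because the test any(prop > 0.5) has length one, ifelse() returns only the first element of height[prop > 0.5]. That is fine here, since at most one row per group can have prop > 0.5. A plain if/else with which.max() states the "height at the maximum prop" rule from the question more directly. The name res2 is only illustrative, and count() is assumed to be an acceptable stand-in for group_by() plus tally().

```r
library(dplyr)

# Rebuild the example data from the question
class <- c(1, 4, 1, 3, 2, 2, 4, 1, 4, 5, 2, 4, 2, 2, 2)
prog <- c("Bac2", "Bac2", "Bac2", "Bac", "Master", "Master", "Bac", "Bac",
          "DEA", "Doctorat", "DEA", "Bac", "DEA", "DEA", "Bac")
mydata <- data.frame(height = class, prog)

# Count each (prog, height) pair, then compute proportions within prog
res <- mydata %>%
  count(prog, height) %>%
  group_by(prog) %>%
  mutate(prop = n / sum(n))

# Per group: height at the largest prop if that prop exceeds 0.5, else 99
res2 <- res %>%
  mutate(new = if (max(prop) > 0.5) height[which.max(prop)] else 99) %>%
  ungroup()

print(res2)
```

With a strict > comparison, a group whose largest prop is exactly 0.5 gets 99, which matches the rule as stated in the question.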

Upvotes: 2
