Alex

Reputation: 163

How do I group a data frame so that the new columns contain "Yes"/"No" values?

I have a data set that I would like to group in an extended way. So far my code is:

library(dplyr)

df <- data.frame(id=1,year=c(2012:2018))
df <- rbind(df, data.frame(id=2,year=c(2012:2015)))
df <- rbind(df, data.frame(id=3,year=c(2015:2018)))

df %>% group_by(id) %>% summarise(n=n())

The results are:

# A tibble: 3 x 2
     id     n
  <dbl> <int>
1     1     7
2     2     4
3     3     4

I would like code that groups the data so that the resulting table looks like this:

     id     n  yr_2012  yr_2013  yr_2014  yr_2015  yr_2016  yr_2017  yr_2018
  <dbl> <int>   <char>   <char>   <char>   <char>   <char>   <char>   <char>
1     1     7      Yes      Yes      Yes      Yes      Yes      Yes      Yes
2     2     4      Yes      Yes      Yes      Yes
3     3     4                                 Yes      Yes      Yes      Yes

Upvotes: 0

Views: 358

Answers (2)

akrun

Reputation: 887048

We can use pivot_wider from tidyr, with add_count supplying the per-id row count n:

library(dplyr)
library(tidyr)
library(stringr)
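df %>% 
   # add n, the number of rows per id, without collapsing the rows
   add_count(id) %>%
   # turn each year into the name of its future column
   mutate(year = str_c('yr_', year)) %>% 
   # one column per year; present cells become "Yes", absent ones ""
   pivot_wider(names_from = year, values_from = year,
      values_fn = function(x) if(length(x) > 0) "Yes", values_fill = "")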

-output

# A tibble: 3 x 9
     id     n yr_2012 yr_2013 yr_2014 yr_2015 yr_2016 yr_2017 yr_2018
  <dbl> <int> <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>  
1     1     7 "Yes"   "Yes"   "Yes"   Yes     "Yes"   "Yes"   "Yes"  
2     2     4 "Yes"   "Yes"   "Yes"   Yes     ""      ""      ""     
3     3     4 ""      ""      ""      Yes     "Yes"   "Yes"   "Yes"  
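For comparison, the same table can be sketched without tidyr by cross-tabulating id against year in base R. This is only an illustrative alternative, not part of the answer above; the names tab, flags and result are made up for the example.

```r
# Rebuild the example data from the question
df <- data.frame(id = 1, year = 2012:2018)
df <- rbind(df, data.frame(id = 2, year = 2012:2015))
df <- rbind(df, data.frame(id = 3, year = 2015:2018))

# Count rows per (id, year) pair; unclass() leaves a plain integer matrix
tab <- unclass(table(df$id, df$year))

# Map counts to "Yes" / "" while keeping the matrix shape and dimnames
flags <- ifelse(tab > 0, "Yes", "")
colnames(flags) <- paste0("yr_", colnames(flags))

# Assemble id, the per-id row count n, and one flag column per year
# (result is a hypothetical name used only in this sketch)
result <- data.frame(
  id = as.numeric(rownames(tab)),
  n = as.vector(rowSums(tab)),
  flags,
  row.names = NULL,
  stringsAsFactors = FALSE
)
print(result)
```

Because table() builds the full id by year grid, years that are missing for an id show up as zero counts, which is what lets ifelse() fill them with "".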

Upvotes: 2

Vinícius Félix

Reputation: 8811

Libraries

library(tidyverse)

Code

df %>% 
   #Creating variable n
   add_count(id) %>% 
   # Auxiliary variable
   mutate(aux = "Yes") %>% 
   pivot_wider(
      #Defining the column that will name the new columns
      names_from = year,
      #Defining a prefix
      names_prefix = "yr_",
      # Defining the column that will be used in the cells
      values_from = aux,
      # Defining the replacement value for missing (NA) cells
      values_fill = "No"
      )

Output

# A tibble: 3 x 9
     id     n yr_2012 yr_2013 yr_2014 yr_2015 yr_2016 yr_2017 yr_2018
  <dbl> <int> <chr>   <chr>   <chr>   <chr>   <chr>   <chr>   <chr>  
1     1     7 Yes     Yes     Yes     Yes     Yes     Yes     Yes    
2     2     4 Yes     Yes     Yes     Yes     No      No      No     
3     3     4 No      No      No      Yes     Yes     Yes     Yes 
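Since the question title asks for logical values, one possible variation is to store real TRUE/FALSE flags instead of strings. This is a sketch under that assumption, not something stated in the question body or in either answer; the column name present is hypothetical.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question
df <- data.frame(id = 1, year = 2012:2018)
df <- rbind(df, data.frame(id = 2, year = 2012:2015))
df <- rbind(df, data.frame(id = 3, year = 2015:2018))

result <- df %>%
  # n = number of rows per id, kept on every row
  add_count(id) %>%
  # hypothetical helper column: every observed (id, year) pair is TRUE
  mutate(present = TRUE) %>%
  pivot_wider(
    names_from = year,
    names_prefix = "yr_",
    values_from = present,
    # pairs that never occur become FALSE instead of NA
    values_fill = FALSE
  )

print(result)
```

Logical columns can then be used directly in conditions, for example filter(result, yr_2016) to keep the ids observed in 2016, and they can be converted to "Yes"/"No" later with ifelse() if a display version is needed.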

Upvotes: 1
