Nicolas

Reputation: 45

Split information from two columns, R, tidyverse

I've got some data in two columns:

# A tibble: 16 x 2
   code  niveau
   <chr>  <dbl>
 1 A          1
 2 1          2
 3 2          2
 4 3          2
 5 4          2
 6 5          2
 7 B          1
 8 6          2
 9 7          2

My desired output is:

# A tibble: 16 x 3
   code  niveau cat  
   <chr>  <dbl> <chr>
 1 A          1 A    
 2 1          2 A    
 3 2          2 A    
 4 3          2 A    
 5 4          2 A    
 6 5          2 A    
 7 B          1 B    
 8 6          2 B  

Is there a tidy way to convert this data without looping through it?

Here is some dummy data:

library(tibble)  # provides tibble()
data <- tibble(code = c('A', 1, 2, 3, 4, 5, 'B', 6, 7, 8, 9, 'C', 10, 11, 12, 13), niveau = c(1, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2))

desired_output <- tibble(code = c('A', 1, 2, 3, 4, 5, 'B', 6, 7, 8, 9, 'C', 10, 11, 12, 13), niveau = c(1, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2),
                         cat = c(rep('A', 6), rep('B', 5), rep('C', 5)))


Upvotes: 0

Views: 44

Answers (2)

akrun

Reputation: 887951

We can use str_detect from stringr to set every code that contains a digit to NA, then fill from tidyr to carry each letter down:

library(dplyr)
library(stringr)
library(tidyr)
data %>%
     mutate(cat = replace(code, str_detect(code, '\\d'), NA)) %>% 
     fill(cat)
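
To make the two steps easier to follow, here is a rough sketch that splits the pipe above into an intermediate and a final result. The names step1 and result are only illustrative, and the data is rebuilt from the question so the snippet runs on its own.

```r
library(tibble)
library(dplyr)
library(stringr)
library(tidyr)

# Dummy data from the question; c() coerces the numbers to character.
data <- tibble(
  code   = c("A", 1, 2, 3, 4, 5, "B", 6, 7, 8, 9, "C", 10, 11, 12, 13),
  niveau = c(1, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2)
)

# Step 1: copy code into cat, but blank out every value that contains a digit.
# Only the letter rows ("A", "B", "C") keep a value; the rest become NA.
step1 <- data %>%
  mutate(cat = replace(code, str_detect(code, "\\d"), NA))

# Step 2: fill() carries the last non-NA value downward (its default direction).
result <- step1 %>%
  fill(cat)
```

Note that this relies on the category rows containing no digits at all; a code such as "A1" would also be set to NA by the '\\d' pattern.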

Upvotes: 0

Ronak Shah

Reputation: 389325

You can create a new column cat that copies code but sets every value containing a number to NA. We can then use fill to replace each missing value with the previous non-NA value.

library(dplyr)
data %>% mutate(cat = replace(code, grepl('\\d', code), NA)) %>% tidyr::fill(cat)

# A tibble: 16 x 3
#   code  niveau cat  
#   <chr>  <dbl> <chr>
# 1 A          1 A    
# 2 1          2 A    
# 3 2          2 A    
# 4 3          2 A    
# 5 4          2 A    
# 6 5          2 A    
# 7 B          1 B    
# 8 6          2 B    
# 9 7          2 B    
#10 8          2 B    
#11 9          2 B    
#12 C          1 C    
#13 10         2 C    
#14 11         2 C    
#15 12         2 C    
#16 13         2 C  
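
If you want to confirm this against the desired_output from the question, a quick check along these lines should do. It is only a sketch: result is an illustrative name, and the comparison is limited to the cat column rather than the whole tibble.

```r
library(tibble)
library(dplyr)

# Dummy data and desired output as given in the question.
data <- tibble(
  code   = c("A", 1, 2, 3, 4, 5, "B", 6, 7, 8, 9, "C", 10, 11, 12, 13),
  niveau = c(1, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 1, 2, 2, 2, 2)
)
desired_output <- tibble(
  code   = data$code,
  niveau = data$niveau,
  cat    = c(rep("A", 6), rep("B", 5), rep("C", 5))
)

# Same approach as above: NA out the numeric codes, then fill downward.
result <- data %>%
  mutate(cat = replace(code, grepl("\\d", code), NA)) %>%
  tidyr::fill(cat)

# Compare the new column with the expected one.
identical(result$cat, desired_output$cat)
```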

Upvotes: 1
