Reputation: 71
Here's a sample of my data:
k <- structure(list(Required.field = c("yes", "yes", "yes"),
Choices = c("2, Féminin | 1, Masculin", "1, Oui | 0, Non | 99, Je ne sais pas", "1, Oui | 0, Non")),
row.names = c(5L, 10L, 15L), class = "data.frame")
> k
   Required.field                              Choices
5             yes             2, Féminin | 1, Masculin
10            yes 1, Oui | 0, Non | 99, Je ne sais pas
15            yes                      1, Oui | 0, Non
What I'd like to have is something like this:
> result
   Required.field    Number                       Value
5             yes    c(2,1)        c(Féminin, Masculin)
10            yes c(1,0,99) c(Oui, Non, Je ne sais pas)
15            yes    c(1,0)                 c(Oui, Non)
Here's the code I wrote, which doesn't do the job correctly:
k$test = strsplit(k$choice,c(" | "), fixed = T)
bbl = k %>%
mutate(number = str_extract_all(test, "[0-9]+")) %>% #get only digits
mutate(value = str_extract(test, "[aA-zZ].*")) #get only letters
Why exactly is it not working?
Upvotes: 1
Views: 52
Reputation: 887158
We may use
library(dplyr)
library(stringr)
k %>%
  mutate(Number = str_extract_all(Choices, "\\d+"),
         Value = str_extract_all(Choices, "[^0-9,| ]+"))
-output
   Required.field                              Choices   Number                    Value
5             yes             2, Féminin | 1, Masculin     2, 1        Féminin, Masculin
10            yes 1, Oui | 0, Non | 99, Je ne sais pas 1, 0, 99 Oui, Non, Je, ne, sais, pas
15            yes                      1, Oui | 0, Non     1, 0                 Oui, Non
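Note that the character class above also splits on spaces, so "Je ne sais pas" comes out as four separate values. If multi-word labels should stay whole, one possible variant is to capture everything between the ", " that follows a code and the next " |" (or the end of the string). This is only a sketch, assuming every choice has the form `code, label` and that labels never contain `|`; the name `res` is just for illustration:

```r
library(dplyr)
library(stringr)

# Sample data from the question
k <- data.frame(
  Required.field = c("yes", "yes", "yes"),
  Choices = c("2, Féminin | 1, Masculin",
              "1, Oui | 0, Non | 99, Je ne sais pas",
              "1, Oui | 0, Non"),
  row.names = c(5L, 10L, 15L),
  stringsAsFactors = FALSE
)

res <- k %>%
  mutate(
    # runs of digits immediately followed by a comma (the codes)
    Number = lapply(str_extract_all(Choices, "\\d+(?=,)"), as.integer),
    # text after "<digit>, " up to the next " |" or the end of the string
    Value = str_extract_all(Choices, "(?<=\\d, )[^|]+?(?= \\||$)")
  )

res$Value[[2]]
```

Both new columns are list columns, so each row holds a vector of codes and a vector of labels.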
Upvotes: 0
Reputation: 52004
Here's a solution with tidyr and dplyr functions:
library(tidyr)
library(dplyr)
# k is the data frame from the question
k %>%
  mutate(id = 1:n()) %>%
  separate_rows(Choices, sep = " \\| ") %>%
  separate(Choices, into = c("Number", "Value"), sep = ", ", convert = TRUE) %>%
  group_by(id) %>%
  summarise(Required.field = unique(Required.field),
            across(c(Number, Value), list))
output
  id Required.field Number   Value
1  1 yes            2, 1     Féminin, Masculin
2  2 yes            1, 0, 99 Oui, Non, Je ne sais pas
3  3 yes            1, 0     Oui, Non
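As for why the attempt in the question fails, two things seem likely. First, `k$choice` does not match the column `Choices` (names are case-sensitive), so `strsplit()` receives `NULL`. Second, `str_extract()` expects a character vector rather than the list that `strsplit()` returns, and it keeps only the first match. For comparison, here is a base R sketch of the same split-then-extract idea, applied to each piece separately. It assumes every piece looks like `code, label`; the name `pieces` is only illustrative:

```r
# Sample data from the question
k <- data.frame(
  Required.field = c("yes", "yes", "yes"),
  Choices = c("2, Féminin | 1, Masculin",
              "1, Oui | 0, Non | 99, Je ne sais pas",
              "1, Oui | 0, Non"),
  row.names = c(5L, 10L, 15L),
  stringsAsFactors = FALSE
)

# Split each row into its "code, label" pieces (one character vector per row)
pieces <- strsplit(k$Choices, " | ", fixed = TRUE)

# Within each piece, the text before the first comma is the code
# and the text after it is the label
k$Number <- lapply(pieces, function(p) as.integer(sub(",.*$", "", p)))
k$Value  <- lapply(pieces, function(p) trimws(sub("^[^,]*,", "", p)))

k$Value[[2]]
```

Because the split happens on " | " first, multi-word labels such as "Je ne sais pas" stay intact.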
Upvotes: 3