Reputation: 347
I am having trouble splitting a simple character column into 3 columns depending on the content of the column. A very simple example:
data <- data.frame(x = c("GUIC01", "GUI02"))
> data
x
1 GUIC01
2 GUI02
I want to create the new columns to produce this:
> desired
x Parc TipusBassa Num
1 GUIC01 GUI C 01
2 GUI02 GUI <NA> 02
Basically, if the cell has a C in the middle, it must create a column that says so and split the rest of the cell's content. So far I have tried this approach:
data<-if_else(nchar(data$x) == 5,
separate(data, into = c('Parc','Num'), sep = c(3)),
separate(data, into = c('Parc', 'TipusBassa','Num'), sep = c(3,4)))
What am I missing? Thanks a lot!
Upvotes: 1
Views: 76
Reputation: 389325
You can use tidyr::extract and pass a regex whose capture groups are extracted into separate columns.
tidyr::extract(data, x, c('Parc', 'TipusBassa', 'Num'),
'([A-Z]{3})([A-Z]?)([0-9]{2})', remove = FALSE)
# x Parc TipusBassa Num
#1 GUIC01 GUI C 01
#2 GUI02 GUI 02
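Note that the optional group yields an empty string for GUI02, whereas the desired output shows NA there. A minimal sketch of one way to close that gap, assuming dplyr is available alongside tidyr (the name result is only illustrative):

```r
library(tidyr)
library(dplyr)

data <- data.frame(x = c("GUIC01", "GUI02"))

# Split x with one capture group per new column; the middle group is optional
result <- tidyr::extract(data, x, c("Parc", "TipusBassa", "Num"),
                         "([A-Z]{3})([A-Z]?)([0-9]{2})", remove = FALSE)

# An optional group that matches nothing comes back as "", so map "" to NA
result <- mutate(result, TipusBassa = na_if(TipusBassa, ""))
```

The pattern assumes every code is three uppercase letters, an optional single letter, and two digits, as in the example data.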
Upvotes: 1
Reputation: 522752
We can use the base string functions here:
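# Third character from the end: keep it only when it is "C", otherwise NA
data$TipusBassa <- ifelse(sub("^.*(.).{2}$", "\\1", data$x) == "C", "C", NA)
# Last two characters are the number
data$Num <- sub("^.*(..)$", "\\1", data$x)
data
x TipusBassa Num
1 GUIC01 C 01
2 GUI02 <NA> 02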
Data:
data <- data.frame(x = c("GUIC01", "GUI02"))
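The code above fills only the flag and number columns. A possible base R sketch that also recovers Parc, assuming every value is three uppercase letters, an optional single letter, and two digits (the pattern variable is introduced here for illustration):

```r
data <- data.frame(x = c("GUIC01", "GUI02"))

# One pattern, three capture groups: 3 letters, optional letter, 2 digits
pattern <- "^([A-Z]{3})([A-Z]?)([0-9]{2})$"

# Pull each capture group into its own column
data$Parc       <- sub(pattern, "\\1", data$x)
data$TipusBassa <- sub(pattern, "\\2", data$x)
data$Num        <- sub(pattern, "\\3", data$x)

# The optional group is "" when absent; replace that with NA
data$TipusBassa[data$TipusBassa == ""] <- NA
```

Values that do not match the pattern are left unchanged by sub, so it may be worth checking the input with grepl(pattern, data$x) first.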
Upvotes: 1