Al14
Al14

Reputation: 1814

Separate to split one column to multiple

I want to split the ID column to 3 columns (called "A", "B","C") based on string indices. The first column has to report the item 1 only, the second items 2:3, last column 4:7.

    ID     Data
mw21e10     878         
mw61e10     908          
mw61e10    1073  

I have defined vectors with the string positions desired as below

a <- c(1,2,4)
b <- c(1,3,7)

I am trying to use the defined vectors into the function separate

df<-mydata %>%
  separate(code, c("A", "B", "C"), c(c(1,2,4), c(1,3,7))))


 

Upvotes: 2

Views: 72

Answers (2)

akrun
akrun

Reputation: 887048

We can use substring based on the 'a', 'b' vector and then use unnest_wider to change the list column to three columns

library(dplyr)
library(tidyr)
mydata %>%
   rowwise %>% 
   mutate(new = list(substring(ID, a, b))) %>%
   ungroup %>% 
   unnest_wider(c(new), names_repair = ~c(names(mydata), 'A', 'B', 'C'))

-output

# A tibble: 3 x 5
#  ID       Data A     B     C    
#  <chr>   <int> <chr> <chr> <chr>
#1 mw21e10   878 m     w2    1e10 
#2 mw61e10   908 m     w6    1e10 
#3 mw61e10  1073 m     w6    1e10 

Or if we need to use separate, make sure to only specify the splitting position instead of a range from two vectors. Based on the output from substr, it is at position 1, 3

mydata %>% 
      separate(ID, into = c("A", "B", "C"), c(1, 3 ), remove = FALSE)

-output

#       ID A  B    C Data
#1 mw21e10 m w2 1e10  878
#2 mw61e10 m w6 1e10  908
#3 mw61e10 m w6 1e10 1073

Or using base R with Map

mydata[c('A', 'B', 'C')] <- Map(substring, mydata$ID, 
                         MoreArgs = list(first = a, last = b))

data

mydata <- structure(list(ID = c("mw21e10", "mw61e10", "mw61e10"), Data = c(878L, 
908L, 1073L)), class = "data.frame", row.names = c(NA, -3L))

Upvotes: 1

tmfmnk
tmfmnk

Reputation: 39858

One dplyr and purrr option could be:

df %>%
 bind_cols(map2_dfc(.x = c(1,2,4),
                    .y = c(1,3,7),
                    ~ df %>%
                     transmute(col = substr(ID, .x, .y))) %>%
            set_names(c("A", "B", "C")))

       ID Data A  B    C
1 mw21e10  878 m w2 1e10
2 mw61e10  908 m w6 1e10
3 mw61e10 1073 m w6 1e10

Upvotes: 0

Related Questions