Split into mutiple columns by a certain format

Question

I would like to split a certain format of data from one column into multiple columns. Below are my sample data :

df = data.frame(id=c(1,2),data=c('^apple:1%2%3^orange:3%4%5',
                                   '^apple:4%5%6^orange:7%8%9'))
#    id    data
#    1    ^apple:1%2%3^orange:3%4%5
#    2    ^apple:4%5%6^orange:7%8%9

which will then gives the following output

 id   data_apple_A  data_apple_B  data_apple_C  data_orange_A  data_orange_B  data_orange_C
 1       1               2              3             3             4              5
 2       4               5              6             7             8              9

I am using the following code to try and get the values for A,B and C(they could be decimal values):

cSplit(df, "data", sep="\^", fixed = FALSE,direction= "long")[, c('valA','valB','valC')
        :=(str_extract(data, "\d+\.*\d*")),str_extract(data, "(?<=%)\d+\.*\d*"),
        str_extract(data, "(?<=%)\d+\.*\d*$") ][]

but was getting the following error:

Error in `[.data.table`(cSplit(df, "data", sep = "\^", fixed = FALSE,  : 
  Provide either 'by' or 'keyby' but not both

Do note that apple and orange is just a placeholder. it could be characters and the number of ^(str): could also be different for each row. Also, A B C are fixed. for every ^(str): , there will only be 3 decimal or non decimal numbers in the format of 1%2%3.

Any help would be appreciated.

Split into mutiple columns by a certain format

Answers (1)

Related Questions