Column Split based on certain separator format

Question

I have a format to separate where I will have this data:

df = data.frame(id=c(1,2),name=c('A~B~C','A~B~D'),value=c('1~2~3','1~~2'))
id   name      value
1    A~B~C     1~2~3
2    A~B~D     1~~2

which is expected to have the following output where the column name is the original column name followed by the text in the name column:

id   value_A   value_B   value_C  value_D
1      1         2          3      
2      1                            2

I manage to achieve the splitting for the by using many nested for loops to process on my data row by row. It works on small sample data but once the data gets huge, the time is an issue.

Also,there could be more than 1 value columns, but they all should map into the same name column. Example output:

id   value_A   value_B   value_C  value1_A   value1_B   value1_C
1      1         2          3       1           2          3
2      1         2          3       1           2          3

akrun · Accepted Answer

Here is an option using cSplit/dcast. Split the rows into 'long' format with cSplit and dcast it to 'wide' format

library(splitstackshape)
dcast(cSplit(df, c('name','value'), '~', 'long')[!is.na(value)], id ~ paste0('value_', name))
#   id value_A value_B value_C value_D
#1:  1       1       2       3      NA
#2:  2       1      NA      NA       2

Column Split based on certain separator format

Answers (2)

Related Questions