Reputation:
I have messy data in which each record is a string containing 1-3 codes.
library(data.table)
data <- data.table(ID = c(1, 2), text = c("3TC ABC DTG", "3TC DTG ABC"))
Unfortunately, the codes are not written in alphabetical order, and I would like them to be. Both records should become:
3TC ABC DTG
I tried messing around with splitting the string
data[, c("text1", "text2", "text3") := tstrsplit(text, " ", fixed = TRUE)]
but cannot find a way to sort and combine these three :/
I also thought about reshaping, but then my dcast seems to have trouble:
data_long <- melt(data,
                  id.vars = c("ID"),
                  measure.vars = c("text1", "text2", "text3"),
                  na.rm = TRUE)
result <- dcast(data,
                ID ~ variable,
                function (x) paste(x, collapse = " "))
Any way around it?
Upvotes: 0
Views: 61
Reputation: 27732
You were very close. Try:
data[, text_new := unlist( lapply( strsplit( text, " " ),
function(x) paste0( sort(x), collapse = " "))) ]
   ID        text    text_new
1:  1 3TC ABC DTG 3TC ABC DTG
2:  2 3TC DTG ABC 3TC ABC DTG
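If you would rather stay with the reshaping idea from the question, here is a minimal sketch of one way it could work. It assumes the same `data` as above; the names `code`, `sorted` and `text_sorted` are only illustrative. Instead of `melt`/`dcast`, it goes to long format with `strsplit` by ID, sorts within each ID, and joins the result back:

```r
library(data.table)

data <- data.table(ID = c(1, 2), text = c("3TC ABC DTG", "3TC DTG ABC"))

# Long format: one row per (ID, code); works for records with 1-3 codes
data_long <- data[, .(code = unlist(strsplit(text, " ", fixed = TRUE))), by = ID]

# Sort the codes within each ID and collapse them back into a single string
sorted <- data_long[, .(text_sorted = paste(sort(code), collapse = " ")), by = ID]

# Update join: add the sorted string to the original table, matched on ID
data[sorted, text_sorted := i.text_sorted, on = "ID"]
```

This is more verbose than the one-liner, but the long table can be handy if you also want to count or filter individual codes.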
Upvotes: 1