Reputation: 77
I want to remove duplicate values within comma-separated strings.
This works for a single column:
df$column <- sapply(strsplit(df$column, ",", fixed = TRUE),
                    function(x) paste(unique(x), collapse = ","))
When I try to use it on multiple columns, I always get a "non-character argument" error.
Upvotes: 2
Views: 732
Reputation: 887148
strsplit() only accepts character vectors, so we need to wrap the column in as.character() if it is a factor:
sapply(strsplit(as.character(df$column), ",", fixed = TRUE),
       function(x) paste(unique(x), collapse = ","))
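A minimal sketch of why the error shows up, using a made-up factor `f` as a stand-in for a column that was read in as a factor (the data is an assumption, not from the question). It catches the error from strsplit() and then repeats the call after converting to character:

```r
# Hypothetical factor, standing in for a data frame column stored as factor
f <- factor(c("a,b,a", "c,c,d"))

# strsplit() refuses non-character input; capture the error message instead of stopping
res <- tryCatch(strsplit(f, ",", fixed = TRUE),
                error = function(e) conditionMessage(e))
print(res)

# Converting to character first lets the split and de-duplication run
dedup <- sapply(strsplit(as.character(f), ",", fixed = TRUE),
                function(x) paste(unique(x), collapse = ","))
print(dedup)
```

Checking the columns with sapply(df, class) beforehand should show which ones are factors.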
To apply this to multiple columns, loop over the columns of interest with lapply(), apply the same function to each one, and assign the result back to those columns:
colsOfInterest <- c('column1', 'column2')
df[colsOfInterest] <- lapply(df[colsOfInterest], function(x)
  sapply(strsplit(as.character(x), ",", fixed = TRUE),
         function(y) paste(unique(y), collapse = ",")))
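For a self-contained check, here is a small sketch on invented data. The data frame contents and the helper name `dedupe_csv` are assumptions for illustration only. It swaps sapply() for vapply() so that each result is guaranteed to be a single string, but is otherwise the same approach:

```r
# Hypothetical example data; factor columns reproduce the situation in the question
df <- data.frame(
  column1 = c("a,b,a", "x,x"),
  column2 = c("1,2,2,1", "3"),
  other   = c("k,k", "m"),
  stringsAsFactors = TRUE
)

# Illustrative helper: split on commas, drop repeated values, join again
dedupe_csv <- function(x) {
  vapply(strsplit(as.character(x), ",", fixed = TRUE),
         function(y) paste(unique(y), collapse = ","),
         character(1))
}

colsOfInterest <- c("column1", "column2")
df[colsOfInterest] <- lapply(df[colsOfInterest], dedupe_csv)
print(df)
```

Columns not listed in colsOfInterest (here `other`) are left untouched. Note that values are compared exactly, so "a, b, a" with spaces after the commas would keep " a" alongside "a".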
Upvotes: 3