Reputation: 343
I searched high and low on here, and tried the duplicated and unique functions for what I'm about to ask, but couldn't get anything to work. Let's say I have a data frame named company with a variable state. After I collapse the rows, one of the state values looks like this:
PA;PA;PA;TX;TX
How can I remove the duplicates inside the cell (and across the entire vector, for that matter) so that it looks like this:
PA;TX
I have no problem removing duplicate rows, but can't seem to do the same within the cells themselves.
Upvotes: 3
Views: 2417
Reputation: 5335
This works for a single string:
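x <- "PA;PA;PA;TX;TX"
x2 <- strsplit(x, ";")               # list holding one character vector of pieces
x3 <- unlist(x2)                     # flatten to a plain character vector
x4 <- unique(x3)                     # keep the first occurrence of each value
x5 <- paste(x4, collapse = ";")      # glue back together: "PA;TX"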
If you want to do it for the whole vector company$state, you could roll all that up into one call to sapply:
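# as.character() covers a factor column (strsplit needs character input); USE.NAMES = FALSE drops the names sapply would otherwise attach
sapply(as.character(company$state), function(x) paste(unique(unlist(strsplit(x, ";"))), collapse = ";"), USE.NAMES = FALSE)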
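For anyone who wants to try this end to end, here is a minimal sketch of the same idea. The data frame contents are made up for illustration, and dedupe_cell is a hypothetical helper name, not something from the question. It uses vapply rather than sapply only so that the result is guaranteed to be a character vector.

```r
# Toy stand-in for the asker's data frame; these values are invented for the example.
company <- data.frame(
  state = c("PA;PA;PA;TX;TX", "NY", "CA;TX;CA"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split one cell on `sep`, drop repeats, glue it back together.
dedupe_cell <- function(x, sep = ";") {
  parts <- strsplit(x, sep, fixed = TRUE)[[1]]
  paste(unique(parts), collapse = sep)
}

# as.character() guards against `state` being a factor, which strsplit() rejects.
company$state <- vapply(as.character(company$state), dedupe_cell,
                        character(1), USE.NAMES = FALSE)

company$state
```

unique keeps the first occurrence of each value, so the original order within a cell is preserved. One caveat: an NA cell would come back as the string "NA" after paste, so missing values would need separate handling.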
Upvotes: 7