Reputation: 28169
I have a dataset that contains spaces and other punctuation characters. I'm trying to replace the spaces and special characters with "_". This leaves runs of multiple "_" strung together, so I'd like to collapse those too, using the following function as described here:
removeSpace <- function(x){
  class1 <- class(x)
  x <- as.character(x)
  x <- gsub(" |&|-|/|'|(|)",'_', x) # convert special characters to _
  x <- gsub("([_])\\1+","\\1", x) # convert multiple _ to single _
  if(class1 == 'character'){
    return(x)
  }
  if(class1 == 'factor'){
    return(as.factor(x))
  }
}
The issue is that instead of replacing only the spaces and special characters with "_", it puts "_" between every character (e.g. "test" -> "t_e_s_t").
What am I doing wrong?
Upvotes: 3
Views: 2784
Reputation: 4614
You don't need to run two separate replacements to accomplish this. Just put a + quantifier in your match pattern.
Match: [-/&'() ]+
Replace with: _
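Also note that I used a character set instead of switching between each option with |. This is generally a better approach when matching one of multiple individual characters. Inside a character set, ( and ) are literal characters, whereas the unescaped (|) in your pattern is a group of two empty alternatives, which matches the empty string at every position. That is where the extra "_" come from.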
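As a rough sketch, the function from the question could be rewritten around that single pattern along these lines. The is.factor() check and the fall-through for other input types are my own assumptions about the intended behaviour, not something the question specifies.

```r
# Sketch of removeSpace() using one character-set pattern with a + quantifier.
removeSpace <- function(x) {
  # One or more of: - / & ' ( ) or a space. The "-" is literal here
  # because it is the first character inside the brackets.
  pattern <- "[-/&'() ]+"
  if (is.factor(x)) {
    # Assumption: factors should come back as factors, as in the question.
    return(as.factor(gsub(pattern, "_", as.character(x))))
  }
  # Assumption: anything else is treated as character.
  gsub(pattern, "_", as.character(x))
}

removeSpace("test")                    # "test"
removeSpace("salt & pepper (ground)")  # "salt_pepper_ground_"
```

One difference from the two-step version: an underscore already present in the data next to a replaced character is not merged (for example "a_ b" becomes "a__b"). If that matters, adding _ to the character set should cover it.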
Upvotes: 10