Reputation: 3020
I have this string, for example:
str <- "T gwed is atyrt mtt yfdgfg grter effgf y"
I want to remove the single-letter words from this string ('T' at the start and 'y' at the end in this case), so the output should be:
"gwed is atyrt mtt yfdgfg grter effgf"
I tried this:
str <- gsub("[A-Za-z] ", "", str)
But it gives this as a result.
[1] "gweiatyrmtyfdgfgrtey"
The pattern also matches the last letter of every word when it is followed by a space (as in "gwed "), so it strips those letters and merges all the words of the string.
How do I achieve this?
Also, I have a large text with thousands of such strings (not just a single string), so please keep that in mind when answering.
Upvotes: 1
Views: 2627
Reputation: 14450
str <- "T gwed is atyrt mtt yfdgfg grter effgf y"
gsub(" ?\\<[[:alpha:]]\\> ?", "", str)
## [1] "gwed is atyrt mtt yfdgfg grter effgf"
You need to use the special sequences that denote word boundaries, i.e., \\< (start of a word) and \\> (end of a word). The " ?" (a space followed by ?) on each side means that a single space next to the single letter is also removed, if present. See ?regex for more.
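Since the question mentions thousands of strings: gsub() is vectorized over its input, so the same call works on a whole character vector. One caveat, as far as I can tell: the pattern above removes a space on both sides of the letter, so a single letter in the middle of a string would glue its two neighbours together ("foo a bar" becomes "foobar"). Below is a minimal sketch of a two-step variant that avoids this, assuming the words are separated by plain spaces. The vector texts and the names no_singles and cleaned are made up for illustration.

```r
# Hypothetical sample input; any character vector works, because gsub()
# processes every element of its input.
texts <- c("T gwed is atyrt mtt yfdgfg grter effgf y",
           "foo a bar",
           "x")

# Step 1: delete single-letter words but leave the surrounding spaces alone.
no_singles <- gsub("\\b[[:alpha:]]\\b", "", texts, perl = TRUE)

# Step 2: collapse runs of spaces into one, then trim both ends.
cleaned <- trimws(gsub(" +", " ", no_singles))
cleaned
```

Note that \\b also treats an apostrophe as a word boundary, so the "t" in "don't" would be removed as well. If that matters for your text, the pattern needs to be adapted.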
Upvotes: 3
Reputation: 121568
Another option without using regular expressions:
xx <- unlist(strsplit(str, " "))
paste(xx[nchar(xx) > 1], collapse = " ")
[1] "gwed is atyrt mtt yfdgfg grter effgf"
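For many strings at once, unlist() would mix the words of all strings together, so the split result has to be processed per string. A minimal sketch of that, assuming a character vector named texts (a made-up name for illustration):

```r
# Hypothetical sample input with more than one string.
texts <- c("T gwed is atyrt mtt yfdgfg grter effgf y",
           "foo a bar")

# strsplit() returns a list with one character vector of words per string.
words <- strsplit(texts, " ", fixed = TRUE)

# Keep the words longer than one character and glue each string back together.
cleaned <- vapply(words,
                  function(w) paste(w[nchar(w) > 1], collapse = " "),
                  character(1))
cleaned
```

Keep in mind that this drops every one-character token, including single digits or punctuation marks, not only letters.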
Upvotes: 1