Reputation: 63
I am going through strings of data looking for Instagram usernames, and I have been able to use regex to remove almost all of the unnecessary characters. However, I can't figure out how to remove the "'s" trailing the words.
I can remove every other special character with regex, but for this one I either remove the apostrophe and not the s, or skip over it entirely.
Input:
[1] "@kyrieirving’s" "@jaytatum0"
What I run:
> follower.list <- gsub("[^[:alnum:][:blank:]@_]", "", follower.list)
What I get:
[1] "@kyrieirvings" "@jaytatum0"
What I want:
[1] "@kyrieirving" "@jaytatum0"
Upvotes: 6
Views: 1623
Reputation: 627292
Use
['’]s\b|[^[:alnum:][:blank:]@_]
See the regex demo.
Details
- ['’]s\b - ' or ’ and then s at the end of a word
- | - or
- [^[:alnum:][:blank:]@_] - any char but an alphanumeric, horizontal whitespace, @ or _ char

> x <- c("@kyrieirving’s", "@jaytatum0")
> gsub("['’]s\\b|[^[:alnum:][:blank:]@_]", "", x)
[1] "@kyrieirving" "@jaytatum0"
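As a rough sketch of how the pattern above behaves on a few more inputs, the snippet below wraps it in a small function. The name `clean_handles` and the third handle are made up for illustration, and the curly apostrophe is written as `\u2019` only so that it does not depend on the file encoding; it is the same character as ’ in the pattern above.

```r
# Hypothetical helper wrapping the pattern from the answer above.
# "\u2019" is the curly apostrophe (’), written as an escape for portability.
clean_handles <- function(x) {
  gsub("['\u2019]s\\b|[^[:alnum:][:blank:]@_]", "", x)
}

# The third element is an invented example with a straight apostrophe
# and a trailing "!" to exercise both branches of the alternation.
handles <- c("@kyrieirving\u2019s", "@jaytatum0", "@some_user's!")
cleaned <- clean_handles(handles)
print(cleaned)
```

The possessive branch is listed first in the alternation, so the apostrophe and the s are removed together before the character class gets a chance to remove the apostrophe alone.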
Upvotes: 5
Reputation: 973
follower.list <- c("@kyrieirving’s", "@jaytatum0")
# Removes a curly-apostrophe ’s only when it is the very end of the string
gsub("’s$", "", follower.list)
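One thing that may be worth checking with a `$`-anchored pattern like this is what happens when the possessive is not the last thing in the string. The sketch below compares `$` with a `\b` word boundary; the vector `x` and its second element are invented for illustration, and `\u2019` stands for the curly apostrophe ’.

```r
# Invented sample: the second element has text after the possessive.
x <- c("@kyrieirving\u2019s", "@kyrieirving\u2019s post")

# "$" matches only at the end of the whole string,
# so the second element is left unchanged.
end_only <- gsub("\u2019s$", "", x)

# "\\b" matches at any word boundary after the s,
# so the possessive is removed in both elements.
word_end <- gsub("\u2019s\\b", "", x)

print(end_only)
print(word_end)
```

If each element is known to hold a single username and nothing else, the `$` form should be enough; the `\b` form is the more general choice when usernames sit inside longer text.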
Upvotes: 3