Reputation: 175
I have a spreadsheet of tweets and want to isolate username mentions in Google Sheets. Somehow, regexps that work in R or other languages are not doing the job there.
An example:
RT @Neromoto: @cazainfractor inconsciente agresiva y poco ciudadana conductora
Desired output:
@Neromoto
@cazainfractor
I have tried this: REGEXEXTRACT(B1,(^|[^@\w])@(\w{1,15})\b)
.
Upvotes: 1
Views: 354
Reputation: 627335
First of all, your (^|[^@\w])@(\w{1,15})\b
regex pattern must be put inside a string literal, i.e. double quotes. Then note that every capturing group will be output, you may want to make the first group non-capturing by replacing (
with (?:
. Also, the last \b
is redundant, after the last \w
matched, there will be either the end of string, or the non-word char.
I'd rather suggest
=REGEXEXTRACT(B1,"\B@\w{1,15}")
Or
=REGEXREPLACE(B1,"(\B@\w{1,15})\s*|.","$1 ")
Details:
\B
- a non-word boundary (that is, before @
, there can be either start of string or a non-word char)@
- a @
char\w{1,15}
- 1 to 15 word chars (if you do not care about the length, replace {1,15}
with +
)And the second regex details:
(\B@\w{1,15})\s*
- Group 1 capturing @
at the non-word boundary position, 1 to 15 word chars and then 0+ whitespaces (in the replacement, the $1
backreference inserts the found mentions back into the resulting string)|
- or.
- any 1 char.Upvotes: 2