Reputation: 63
Example data from a tweet:
I always meet @gEmbul at #kampus we always open the site https://www.youtube.com/ facebook# :) @007
The data is a string. I want to match mentions (the @ symbol), hashtags (the # symbol), any URL, and special characters.
I want to match a hashtag whether the # comes in front of the word or behind it.
This is my code:
var data = "I always meet @gEmbul at #kampus we always open the site https://www.youtube.com/ facebook# :) @007";
function clean(data) {
    data = data.replace(/(?:https?|ftp):\/\/[\n\S]+/g, '')
        .replace(/\B\@\w\w+\b/g, '')
        .replace(/\B\#\w\w+\b/g, '');
    return data;
}
console.log(clean(data));
I want it to return:
i always meet at we always open site
Thanks.
Upvotes: 1
Views: 128
Reputation: 627498
I suggest shrinking the pattern a bit: the 2 regexes you have differ in just 1 char, which can be handled with a [@#]
character class, and since you remove the matches, you may just combine the regexps with the |
alternation operator:
var data = "I always meet @gEmbul at #kampus we always open the site https://www.youtube.com/ facebook# :) @007";
function clean(data) {
    data = data.replace(/(?:https?|ftp):\/\/[\n\S]+|\B[@#]\w+\b|\b\w+[@#]\B|\B[^\w\s]{2,}\B/g, '');
    return data;
}
document.body.innerHTML = clean(data);
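When the result goes to console.log instead of into HTML, the removed tokens leave runs of spaces behind (a browser collapses those when it renders innerHTML). A minimal sketch, assuming collapsed whitespace is what is wanted; cleanAndTrim and TOKEN_RE are hypothetical names introduced here, not part of the code above:

```javascript
// Same combined pattern as in clean() above.
const TOKEN_RE = /(?:https?|ftp):\/\/[\n\S]+|\B[@#]\w+\b|\b\w+[@#]\B|\B[^\w\s]{2,}\B/g;

// Hypothetical helper: remove the tokens, then squeeze whitespace runs and trim the ends.
function cleanAndTrim(text) {
  return text
    .replace(TOKEN_RE, '')
    .replace(/\s+/g, ' ')
    .trim();
}

const tweet = "I always meet @gEmbul at #kampus we always open the site https://www.youtube.com/ facebook# :) @007";
console.log(cleanAndTrim(tweet));
// → "I always meet at we always open the site"
```

The extra .replace(/\s+/g, ' ') also turns newlines and tabs into single spaces, so drop it if line breaks in the input must survive.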
Details:
- (?:https?|ftp):\/\/[\n\S]+ - a regex that matches a URL that may span across newlines
- | - or
- \B[@#]\w+\b - a @ or # followed with 1+ word chars (as a whole word)
- | - or
- \b\w+[@#]\B - 1+ word chars followed with @ or # (as a whole word)
- | - or
- \B[^\w\s]{2,}\B - a non-word boundary, 2 or more chars other than word and whitespace, and again a non-word boundary. Remove \B to match 2 or more non-whitespace/non-word chars in any context.
Upvotes: 1
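To see which alternative is responsible for which removal, each branch of the pattern can be run on its own against the sample tweet. This is only an illustrative sketch; the branch labels (url, leadingTag, trailingTag, symbols, symbolsAnywhere) are names made up here and do not appear in the answer:

```javascript
const sample = "I always meet @gEmbul at #kampus we always open the site https://www.youtube.com/ facebook# :) @007";

// One entry per alternative of the combined regex, plus the variant without \B.
const branches = {
  url: /(?:https?|ftp):\/\/[\n\S]+/g,
  leadingTag: /\B[@#]\w+\b/g,      // @word or #word
  trailingTag: /\b\w+[@#]\B/g,     // word@ or word#
  symbols: /\B[^\w\s]{2,}\B/g,     // 2+ symbols between non-word boundaries
  symbolsAnywhere: /[^\w\s]{2,}/g, // same, with the \B guards removed
};

// Collect the matches of every branch; match() returns null when nothing is found.
const found = {};
for (const [label, re] of Object.entries(branches)) {
  found[label] = sample.match(re) || [];
}
console.log(found);
// url:             ["https://www.youtube.com/"]
// leadingTag:      ["@gEmbul", "#kampus", "@007"]
// trailingTag:     ["facebook#"]
// symbols:         [":)"]
// symbolsAnywhere: ["://", ":)"]
```

The last two rows show the effect of \B: without it the branch also picks up the :// inside the URL, which does not matter in the combined pattern only because the URL branch consumes that text first.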