LDK

Reputation: 165

Regex Valid Twitter Mention

I'm trying to find a regex that matches only if a tweet is a true mention. To be a mention, the string can't start with "@", can't contain "RT" (case insensitive), and the "@" must start a word.

The desired output for each example is given in its comment.

Some examples:

// Print whether each string matches the given pattern.
function search($strings, $regexp) {
    foreach ($strings as $string) {
        echo "Sentence: \"$string\" <- " .
        (preg_match($regexp, $string) ? "MATCH" : "NO MATCH") . "\n";
    }
}

$strings = array(
"Hi @peter, I like your car ", // <- MATCH
"@peter I don't think so!", // <- NO MATCH: the string starts with @, so it's a reply
"Helo!! :@ how are you!", // <- NO MATCH: the @ is not followed by a word; we need @(word)
"Yes @peter i'll eat them this evening! RT @peter: hey @you, do you want your pancakes?", // <- NO MATCH: the string contains "RT"/"rt", so it's a retweet
"Helo!! [email protected] how are you!", // <- NO MATCH: the @ doesn't start the word
"@peter is the best friend you could imagine. RT @juliet: @you do you know if @peter it's awesome?" // <- NO MATCH: it starts with @ (a reply) and contains RT
);
echo "Example 1:\n";
search($strings,  "/(?:[[:space:]]|^)@/i");

Current output:

Example 1:
Sentence: "Hi @peter, I like your car " <- MATCH
Sentence: "@peter I don't think so!" <- MATCH
Sentence: "Helo!! :@ how are you!" <- NO MATCH
Sentence: "Yes @peter i'll eat them this evening! RT @peter: hey @you, do you want your pancakes?" <- MATCH
Sentence: "Helo!! [email protected] how are you!" <- MATCH
Sentence: "@peter is the best friend you could imagine. RT @juliet: @you do you know if @peter it's awesome?" <- MATCH

EDIT:

I need it as a regex because it will be used in MySQL and other languages too. I am not trying to extract any username. I only want to know whether the string is a mention or not.

Upvotes: 13

Views: 13214

Answers (6)

Akshay Kasar

Reputation: 156

A simple pattern that works correctly even when the scraping tool has appended some special characters: (?<![\w])@[\S]*\b. This worked for me.

Upvotes: 0

csuwldcat

Reputation: 8249

This regexp might work a bit better: /\B\@([\w\-]+)/gim

Here's a jsFiddle example of it in action: http://jsfiddle.net/2TQsx/96/
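
As a rough illustration of what \B does in this pattern, the sketch below runs it over a few of the question's strings. `extractMentions` is a hypothetical helper name, and the string containing someone@example is a made-up stand-in for an address-like word, not one of the original examples.

```javascript
// Pattern from this answer: \B only matches where "@" is NOT glued
// to a preceding word character, so "someone@example" is skipped.
const mentionPattern = /\B\@([\w\-]+)/gim;

// Hypothetical helper: returns the captured usernames (without the "@").
function extractMentions(text) {
  // matchAll needs the g flag and iterates over a copy of the regex,
  // so the shared pattern's lastIndex is left untouched.
  return Array.from(text.matchAll(mentionPattern), (m) => m[1]);
}

console.log(extractMentions("Hi @peter, I like your car "));         // ["peter"]
console.log(extractMentions("Helo!! :@ how are you!"));              // []
console.log(extractMentions("Helo!! someone@example how are you!")); // [] (stand-in string)
console.log(extractMentions("@peter I don't think so!"));            // ["peter"]
```

Note that the last call still returns a username: the pattern finds mentions, but on its own it does not reject replies (a leading "@") or retweets.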

Upvotes: 13

Jose Fernandez

Reputation: 74

Twitter has published the regex they use in their twitter-text library. They have other language versions posted as well on GitHub.

Upvotes: 0

jpotts18

Reputation: 5111

I have found that this is the best way to find mentions inside a string in JavaScript. I don't know exactly how I would handle the RTs, but I think this might help with part of the problem.

var str = "@jpotts18 what is up man? Are you hanging out with @kyle_clegg";
var pattern = /@[A-Za-z0-9_-]*/g;
str.match(pattern);
// ["@jpotts18", "@kyle_clegg"]
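
One possible way to cover the RT part, sketched here under the assumption that the check does not have to be a single regex: test the reply and retweet rules separately, then look for a mention. `isMention` is a hypothetical name, and the mention pattern uses + instead of * so that a bare "@" does not count.

```javascript
// Hypothetical helper: applies the question's three rules one at a time.
function isMention(tweet) {
  // Rule 1: a tweet that starts with "@" is a reply, not a mention.
  if (tweet.startsWith("@")) return false;
  // Rule 2: a standalone "RT" (any case) marks a retweet.
  if (/\bRT\b/i.test(tweet)) return false;
  // Rule 3: "@" must start a word, i.e. follow whitespace, and be
  // followed by at least one username character.
  return /\s@[A-Za-z0-9_-]+/.test(tweet);
}

console.log(isMention("Hi @peter, I like your car "));   // true
console.log(isMention("@peter I don't think so!"));      // false
console.log(isMention("Helo!! :@ how are you!"));        // false
console.log(isMention("Yes @peter! RT @peter: hey"));    // false
```

Splitting the rules keeps each regex short, at the cost of not being usable as one pattern in a SQL query.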

Upvotes: 3

Jacob Eggers

Reputation: 9322

Here's a regex that should work:

/^(?!.*\bRT\b)(?:.+\s)?@\w+/i

Explanation:

/^             //Start of the string.
(?!.*\bRT\b)   //Verify that RT (as a whole word) is not in the string.
(?:.+\s)?      //Optionally match one or more chars followed by whitespace.
                  //Note: (?: ) makes the group non-capturing.
@\w+           //Find @ followed by one or more word chars.
/i             //Make it case insensitive.
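
A quick check of this regex against the question's examples, written as a sketch in JavaScript. Because the group before the @ is optional, a string that starts with @ still matches. `strictRegex` is a hypothetical variant (not part of this answer) that adds a (?!@) lookahead and makes the whitespace before the @ mandatory. The fifth string is a made-up stand-in for the address-like example.

```javascript
// Regex from this answer, unchanged.
const answerRegex = /^(?!.*\bRT\b)(?:.+\s)?@\w+/i;
// Hypothetical stricter variant: reject a leading "@" and require
// whitespace directly before the "@".
const strictRegex = /^(?!@)(?!.*\bRT\b).*\s@\w+/i;

const examples = [
  "Hi @peter, I like your car ",
  "@peter I don't think so!",
  "Helo!! :@ how are you!",
  "Yes @peter i'll eat them this evening! RT @peter: hey @you, do you want your pancakes?",
  "Helo!! someone@example how are you!", // stand-in string
  "@peter is the best friend you could imagine. RT @juliet: @you do you know if @peter it's awesome?",
];

const answerResults = examples.map((s) => answerRegex.test(s));
const strictResults = examples.map((s) => strictRegex.test(s));

console.log(answerResults); // [true, true, false, false, false, false]
console.log(strictResults); // [true, false, false, false, false, false]
```

Only the second example differs: the reply "@peter I don't think so!" matches the original regex but not the variant.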

Upvotes: 9

asgerhallas

Reputation: 17724

I guess something like this will do it:

^(?!.*?RT\s).+\s@\w+

Roughly translated to:

At the beginning of the string, look ahead to check that RT followed by whitespace is not present, then find one or more characters followed by whitespace, an @, and at least one letter, digit or underscore.
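
Two details of this pattern are worth checking, shown here as a sketch in JavaScript: without a case-insensitive flag a lowercase "rt" is not rejected, and without word boundaries a word ending in "RT" is. `adjustedPattern` is a hypothetical tweak (adding \b and the i flag), and both test strings are made up for illustration.

```javascript
// Pattern from this answer, unchanged (case-sensitive, no word boundaries).
const answerPattern = /^(?!.*?RT\s).+\s@\w+/;
// Hypothetical tweak: whole-word RT check, case-insensitive.
const adjustedPattern = /^(?!.*?\bRT\b).+\s@\w+/i;

const lowercaseRetweet = "Yes @peter rt @you: hello"; // made-up example
const wordEndingInRT = "What a START @peter";         // made-up example

console.log(answerPattern.test(lowercaseRetweet));   // true  (lowercase rt slips through)
console.log(answerPattern.test(wordEndingInRT));     // false ("START " contains "RT ")
console.log(adjustedPattern.test(lowercaseRetweet)); // false
console.log(adjustedPattern.test(wordEndingInRT));   // true
```

Both patterns reject a string that starts with @ and has no later mention, because at least one character plus whitespace is required before the @.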

Upvotes: 1
