J-Rou

Reputation: 2286

How to match a pound (#) symbol in a regex in php (for hashtags)

Very simple, I need to match the # symbol using a regex. I'm working on a hashtag detector.

I've tried searching on Google and on Stack Overflow. One related post is here, but since the author wanted to remove the # symbol from the string, he didn't use a regex.

I've tried the regexes /\b\#\w\w+/ and /\b#\w\w+/, but neither works. If I remove the #, the pattern detects the word.

Upvotes: 13

Views: 45834

Answers (5)

will

Reputation: 5061

For what it's worth, I only managed to match a hash (#) character as a string. In awk, the parser strips comments first, so the only syntax I found that can 'hold' a # is

"#"

So in my case I skipped comment-only lines with:

$1 == "#" { next; }

I also attempted to make the hash a regex:

HASH_PATTERN = "^#"

$1 ~ HASH_PATTERN { next; }

This also works, so I think you can put the whole expression in a string such as HASH_PATTERN.

The string comparison works quite well. It isn't a perfect solution, just a starting point.

Upvotes: 0

webbiedave

Reputation: 48897

You don't need to escape it (it's probably the \b that's throwing it off):

if (preg_match('/^\w+#(\w+)/', 'abc#def', $matches)) {
    print_r($matches);
}

/* output of $matches:
Array
(
    [0] => abc#def
    [1] => def
)
*/

Upvotes: 7

Lasse Nielsen

Reputation: 41

Going by the comment on the earlier answer, you want to avoid matching x#x. In that case, you don't need \b but \B:

\B#(\w\w+)

(if you really need two-or-more word characters after the #).

\B means NON-word-boundary. Since # is not a word character, \B# matches exactly when the previous character is not a word character either (or when the # is at the start of the string).
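To see the effect of \B, here is a minimal PHP sketch. The helper name find_hashtags and the sample string are made up for illustration and are not part of the original answer:

```php
<?php
// Hypothetical helper: returns the hashtag words (without the #) that are
// not glued onto a preceding word character.
function find_hashtags(string $text): array
{
    // \B before # requires that the previous character is NOT a word
    // character, or that the # sits at the very start of the string.
    preg_match_all('/\B#(\w\w+)/', $text, $matches);

    // $matches[1] holds the contents of the first capture group.
    return $matches[1];
}

// x#y and foo#bar are skipped because a word character precedes the #.
print_r(find_hashtags('#php is fun, see x#y and foo#bar but also (#regex)'));
```

With this input, only php and regex should be reported.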

Upvotes: 4

Niet the Dark Absol

Reputation: 324640

# does not have any special meaning in a regex, unless you use it as the delimiter. So just put it straight in and it should work.

Note that \b detects a word boundary, and in #abc, the word boundary is after the # and before the abc, not before the #. Therefore, the \b is superfluous and you just need #\w\w+.
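A quick way to check this in PHP, using made-up sample strings (a sketch, not a full hashtag detector):

```php
<?php
// A leading # has no word boundary in front of it, so \b# cannot match here.
$atStart = preg_match('/\b#\w\w+/', '#hashtag');

// Between "c" and "#" there IS a word boundary, so \b# matches here.
$afterWord = preg_match('/\b#\w\w+/', 'abc#hashtag');

// Without \b, the plain # matches as expected.
$plain = preg_match('/#\w\w+/', '#hashtag', $m);

var_dump($atStart);   // int(0)
var_dump($afterWord); // int(1)
var_dump($m[0]);      // string(8) "#hashtag"
```

So the question's original patterns only match when the # directly follows a word character, which is the opposite of what a hashtag detector wants.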

Upvotes: 13

Highway of Life

Reputation: 24341

You could use the regex /\#(\w+)/ to capture just the hashtag word, or /\#\w+/ to match the entire hashtag including the hash.

Upvotes: 0
