Jeroen Steen

Reputation: 541

Regex for splitting hashtags, but ignore

I have a string where I want to match hashtags after a certain "context", for example |product.

After |product, I want to match each hashtag that follows it, up to the next | context.

So this is my full string |product#houtprint#laserprint|materiaal#hout.

And this is my regex pattern so far: \|product(?<product>#[^\|]+). I now get a single match on #houtprint#laserprint, but I want to match them separately, so #houtprint and #laserprint.

This is also my PHP part:

preg_match_all("~\|".$context."(?<".$context.">#[^\|]+)~", $tags_string, $matches);

How can I ensure that I get the products as separate matches?

Upvotes: 1

Views: 67

Answers (1)

Wiktor Stribiżew

Reputation: 627488

You need a \G-based boundary so that preg_match_all can match consecutive hashtags (those that follow one another directly after the substring you specify):

(?:\|product|\G(?!\A))(?<product>#[^|#]+)

Not sure you really need the named capture group here.

See the regex demo

Details:

  • (?:\|product|\G(?!\A)) - either |product substring (\|product) or the end of the previous successful match (\G(?!\A)) (these branches may be swapped for better performance)
  • (?<product>#[^|#]+) - a "product" named capturing group that matches
    • # - a hash symbol
    • [^|#]+ - one or more chars other than | and #.
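
To see what each part of the first group contributes, here is a small comparison sketch. It is not from the original answer: the $stray input is a made-up string, and an unnamed group is used instead of the named one for brevity.

```php
<?php
$str = '|product#houtprint#laserprint|materiaal#hout';

// Without the \G branch, only the hashtag directly after |product matches,
// because every match has to start with the literal |product.
preg_match_all('/\|product(#[^|#]+)/', $str, $noG);
print_r($noG[1]);    // only #houtprint

// Hypothetical input that starts with a hashtag before any context.
$stray = '#stray|product#houtprint';

// Without (?!\A), \G also matches at the very start of the string,
// so the leading #stray is picked up although no |product precedes it.
preg_match_all('/(?:\|product|\G)(#[^|#]+)/', $stray, $loose);
print_r($loose[1]);  // #stray and #houtprint

// With (?!\A), \G can only continue from the end of a previous match.
preg_match_all('/(?:\|product|\G(?!\A))(#[^|#]+)/', $stray, $strict);
print_r($strict[1]); // only #houtprint
```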

PHP demo:

$re = '/(?:\|product|\G(?!\A))(?<product>#[^|#]+)/';
$str = '|product#houtprint#laserprint|materiaal#hout';
preg_match_all($re, $str, $matches);
print_r($matches["product"]);
// => Array ( [0] => #houtprint [1] => #laserprint )
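
Since the question builds the pattern from a $context variable, the same idea could be wrapped in a small helper. This is only a sketch: the function name hashtags_for_context is made up, and it assumes an unnamed group is acceptable (a context value is not guaranteed to be a valid group name). preg_quote escapes any regex metacharacters in the context, including the ~ delimiter.

```php
<?php
// Hypothetical helper, not part of the original answer.
function hashtags_for_context(string $context, string $tags_string): array
{
    // Either the literal "|<context>" or the end of the previous match,
    // followed by one hashtag captured in group 1.
    $re = '~(?:\|' . preg_quote($context, '~') . '|\G(?!\A))(#[^|#]+)~';
    preg_match_all($re, $tags_string, $matches);
    return $matches[1];
}

$tags_string = '|product#houtprint#laserprint|materiaal#hout';
print_r(hashtags_for_context('product', $tags_string));   // #houtprint, #laserprint
print_r(hashtags_for_context('materiaal', $tags_string)); // #hout
```

Returning group 1 keeps the |product prefix out of the results, which $matches[0] would include for the first hashtag.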

Upvotes: 2
