user1040418

Reputation: 3

How to match a "tag" list using regular expressions & PHP

I have a form input field that accepts multiple "tags" from a user, a bit like the one on this site! So, for example a user could enter something like:

php mysql regex

...which would be nice & simple to split into separate tags, as I could explode() on the spaces. I would end up with:

array('php', 'mysql', 'regex')

However, things get a little more complicated, as the user can separate tags with commas or spaces & use double quotes for multi-word tags.

So a user could also input:

php "mysql" regex, "zend framework", another "a, tag with punc $^&!)(123 *note the comma"

All of which would be valid. This should produce:

array('php', 'mysql', 'regex', 'zend framework', 'another', 'a, tag with punc $^&!)(123 *note the comma')

I don't know how to write a regular expression that would firstly match everything in double quotes, then explode the string on commas or spaces & finally match everything else. I guess I would use preg_match_all() for this?

Could anyone point me in the right direction!? Many thanks.

Upvotes: 0

Views: 343

Answers (1)

chimericdream

Reputation: 432

Try this regex out. I tested it against your string, and it correctly pulled out the individual tags:

("([^"]+)"|\s*([^,"\s]+),?\s*)

This code:

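$string = 'php "mysql" regex, "zend framework", another "a, tag with punc $^&!)(123 *note the comma"';
// The outer ( ) serve as the PCRE delimiters here, so group 1 captures a quoted tag
// and group 2 captures an unquoted tag.
$re = '("([^"]+)"|\s*([^,"\s]+),?\s*)';
$matches = array();
preg_match_all($re, $string, $matches);
var_dump($matches);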

Yielded the following result for me:

array(3) {
  [0]=>
  array(6) {
    [0]=>
    string(4) "php "
    [1]=>
    string(7) ""mysql""
    [2]=>
    string(8) " regex, "
    [3]=>
    string(16) ""zend framework""
    [4]=>
    string(9) " another "
    [5]=>
    string(44) ""a, tag with punc $^&!)(123 *note the comma""
  }
  [1]=>
  array(6) {
    [0]=>
    string(0) ""
    [1]=>
    string(5) "mysql"
    [2]=>
    string(0) ""
    [3]=>
    string(14) "zend framework"
    [4]=>
    string(0) ""
    [5]=>
    string(42) "a, tag with punc $^&!)(123 *note the comma"
  }
  [2]=>
  array(6) {
    [0]=>
    string(3) "php"
    [1]=>
    string(0) ""
    [2]=>
    string(5) "regex"
    [3]=>
    string(0) ""
    [4]=>
    string(7) "another"
    [5]=>
    string(0) ""
  }
}
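
In the dump above, the quoted tags end up in group 1 and the unquoted tags in group 2, so they still need merging into one flat list like the one asked for. Here is a minimal sketch of one way to do that, assuming the same pattern and the PREG_SET_ORDER flag of preg_match_all(). The function name parse_tags is hypothetical and only used for illustration.

```php
<?php
// Hypothetical helper: merges the quoted and unquoted captures into one list.
function parse_tags(string $input): array
{
    // Same pattern as above; the outer ( ) are the delimiters.
    $re = '("([^"]+)"|\s*([^,"\s]+),?\s*)';
    // PREG_SET_ORDER gives one array per match instead of one array per group.
    preg_match_all($re, $input, $matches, PREG_SET_ORDER);

    $tags = array();
    foreach ($matches as $m) {
        // $m[2] holds an unquoted tag; otherwise $m[1] holds the quoted one.
        $tags[] = (isset($m[2]) && $m[2] !== '') ? $m[2] : $m[1];
    }
    return $tags;
}

$string = 'php "mysql" regex, "zend framework", another "a, tag with punc $^&!)(123 *note the comma"';
print_r(parse_tags($string));
```

Each match sets exactly one of the two groups, so picking whichever is non-empty keeps the tags in input order. Note that this sketch does not deduplicate tags or handle an unclosed quote in any special way.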

Hope that helps.

Upvotes: 2
