Wizard

Reputation: 11295

PHP regex: find all capitalized words in a string

I want to find all capitalized words in a string using a PHP regex:

$string = "test sample test: 2015. ŽYDRŪNAS PAVARDENIS";

preg_match_all('/\b([A-Z-][\p{L}\pL]+)\b/', $string, $matches);

var_dump($matches);

Output:

array(2) {
  [0]=>
  array(2) {
    [0]=>
    string(8) "YDRŪNAS"
    [1]=>
    string(10) "PAVARDENIS"
  }
  [1]=>
  array(2) {
    [0]=>
    string(8) "YDRŪNAS"
    [1]=>
    string(10) "PAVARDENIS"
  }
}

Why does the character 'Ž' disappear from the first match?

How can I modify the regular expression so that UTF-8 characters are not dropped?


Upvotes: 3

Views: 1062

Answers (1)

hek2mgl

Reputation: 158280

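You need the u modifier (PCRE_UTF8) when working with UTF-8 strings. Without it, PCRE treats the subject as a sequence of single bytes, so the two-byte 'Ž' cannot match [A-Z] and is left out of the match. The regex can also be simplified with the [:upper:] character class, which under the u modifier matches all uppercase Unicode letters.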

Like this:

$string = "test sample test: 2015. ŽYDRŪNAS PAVARDENIS";

preg_match_all("/[[:upper:]]+/u", $string, $matches);
var_dump($matches);

Output:

array(1) {
  [0]=>
  array(2) {
    [0]=>
    string(10) "ŽYDRŪNAS"
    [1]=>
    string(10) "PAVARDENIS"
  }
}
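
The string(10) reported for "ŽYDRŪNAS" is a byte count, not a character count, which hints at why the original pattern lost the 'Ž'. A minimal sketch of that, assuming the script file itself is saved as UTF-8 (the variable name $word is only illustrative):

```php
<?php
$word = "ŽYDRŪNAS";

// strlen() counts bytes: 6 ASCII letters plus 2 bytes each for "Ž" and "Ū".
var_dump(strlen($word));      // int(10)

// "Ž" (U+017D) is encoded in UTF-8 as the two bytes 0xC5 0xBD.
var_dump(bin2hex("Ž"));       // string(4) "c5bd"

// Without /u the pattern is tested against the first byte (0xC5),
// which is not in the ASCII range A-Z, so there is no match.
$withoutU = preg_match('/^[A-Z]/', $word);

// With /u the whole character "Ž" is tested and counts as uppercase.
$withU = preg_match('/^[[:upper:]]/u', $word);

var_dump($withoutU, $withU);
```

The last two calls should return 0 and 1 respectively on a PHP build whose PCRE has Unicode property support.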

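
Note that [[:upper:]]+ matches any run of uppercase letters, so it would also pick up the leading "T" of a mixed-case word such as "Test". If only fully uppercase words are wanted, one possible variant (a sketch, not part of the original answer) uses the Unicode property \p{Lu} with lookarounds. The sample string below is altered to start with "Test" to show the difference:

```php
<?php
// Assumed sample input: same as above, but beginning with a mixed-case word.
$string = "Test sample test: 2015. ŽYDRŪNAS PAVARDENIS";

// \p{Lu}      any Unicode uppercase letter
// (?<!\p{L})  the run must not be preceded by another letter
// (?!\p{L})   the run must not be followed by another letter
// /u          treat pattern and subject as UTF-8
preg_match_all('/(?<!\p{L})\p{Lu}+(?!\p{L})/u', $string, $matches);

var_dump($matches[0]);
```

Here "Test" is skipped because its "T" is followed by a lowercase letter, while "ŽYDRŪNAS" and "PAVARDENIS" are matched in full.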

Upvotes: 5
