Spinnenzunge

Reputation: 55

Regular Expression filtering all translation functions

I am working on a web interface that provides the same functionality as poEdit.

I want to walk through all .php files in a specified folder and search every line for translation calls. For this I would like to use a regular expression that scans the current line of the PHP file and returns the translation text parameter and the domain parameter.

My function looks like this:

__('This is my translation', 'domain');

Because I defined a default for the domain parameter, the function __() can also be called like this:

__('this is my translation');

In PHP I tried to use the function preg_match_all(), but I can't get my regex together.

Here is an example of a possible line in the script and the output array I would like to receive with the preg_match_all() function:

echo __('Hello World'); echo __('Some domain specific translation', 'mydomain');

Array output:

Array
(
    [0] => Array
        (
            [0] => Hello World
        )

    [1] => Array
        (
            [0] => Some domain specific translation
            [1] => mydomain
        )
)

Can anyone help me out with the regex and the preg_match_all() flags?

Thank you guys.

Upvotes: 1

Views: 535

Answers (2)

Nameless

Reputation: 2366

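Something like this should work. With PREG_SET_ORDER, each entry of the result holds the captures of one call. The array_shift() is needed because element 0 of every entry always contains the full match, and AFAIK there is no flag to exclude it.

if (preg_match_all('/__\(\s*\'((?:[^\']|(?<=\\\)\')+)\'(?:\s*,\s*\'((?:[^\']|(?<=\\\)\')+)\')?\s*\)/us', $data, $result, PREG_SET_ORDER)) {
  // Drop the full match (element 0) from every entry.
  foreach ($result as &$item) {
    array_shift($item);
  }
  unset($item); // break the reference left over from the loop
  var_dump($result);
}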

It correctly finds calls like __('lorem \' ipsum', 'my\'domain'). It would fail on __('lorem \\') though.
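
As a rough sketch of how this pattern could be applied to the example line from the question, the snippet below wraps it in a helper. The function name extract_translations() is hypothetical and not part of the answer, and the sketch assumes PHP 7.4 or later (arrow functions) and PREG_SET_ORDER so that each entry corresponds to one __() call.

```php
<?php
// Hypothetical helper (an illustration only): returns one entry per __() call,
// holding the translation text and, if present, the domain.
function extract_translations(string $data): array
{
    $pattern = '/__\(\s*\'((?:[^\']|(?<=\\\)\')+)\'(?:\s*,\s*\'((?:[^\']|(?<=\\\)\')+)\')?\s*\)/us';

    // PREG_SET_ORDER groups the captures per match instead of per group.
    if (!preg_match_all($pattern, $data, $matches, PREG_SET_ORDER)) {
        return [];
    }

    // Drop element 0 (the full match) from every entry and reindex.
    return array_map(fn(array $m): array => array_slice($m, 1), $matches);
}

// Example line taken from the question.
$line = "echo __('Hello World'); echo __('Some domain specific translation', 'mydomain');";
$result = extract_translations($line);
print_r($result);
```

For this line the first entry holds only the text, because the optional domain group did not take part in the match, and the second entry holds both the text and mydomain.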

Upvotes: 1

Tomalak

Reputation: 338228

The regex you would need for this is fairly complex.

__\(\s*(['"])((?:(?!(?<!\\)\1).)+)\1(?:,\s*(['"])((?:(?!(?<!\\)\3).)+)\3)?\s*\)

The translation text and the domain would be in groups 2 and 4. For example,

__('This is my translation', 'domain');

would produce these groups:

  1. '
  2. This is my translation
  3. '
  4. domain

and this

__('This is my \'translation\'', "domain");

would produce these groups:

  1. '
  2. This is my \'translation\'
  3. "
  4. domain
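
A minimal sketch of how this regex might be used with preg_match_all() in PHP follows. The helper name find_translation_calls(), the 'text'/'domain' keys, and the use of null for a missing domain are assumptions made for illustration; the question does not say what the default domain is. The pattern is kept in a nowdoc so that it needs no PHP-level escaping.

```php
<?php
// The regex from above, verbatim, wrapped in / delimiters.
$pattern = <<<'REGEX'
/__\(\s*(['"])((?:(?!(?<!\\)\1).)+)\1(?:,\s*(['"])((?:(?!(?<!\\)\3).)+)\3)?\s*\)/
REGEX;

// Hypothetical helper: returns one ['text' => ..., 'domain' => ...] entry per __() call.
function find_translation_calls(string $pattern, string $source): array
{
    $calls = [];
    if (preg_match_all($pattern, $source, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $calls[] = [
                'text'   => $m[2],
                // Group 4 is absent when no domain argument was passed;
                // null is an assumed placeholder for the default domain.
                'domain' => $m[4] ?? null,
            ];
        }
    }
    return $calls;
}

// Sample source line combining both call styles.
$line = <<<'LINE'
echo __('Hello World'); echo __('This is my \'translation\'', "domain");
LINE;

$calls = find_translation_calls($pattern, $line);
print_r($calls);
```

Note that the captured text still contains the backslashes of escaped quotes, so it may need unescaping before it is written to a translation catalogue.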

Upvotes: 1
