sandelius

Reputation: 523

PHP regex optimize

I've got a regular expression that matches everything between < and >, as in <anything>. I'm currently using this:

'@<([\w]+)>@'

It works, but I suspect there might be a better way to do it?

/ Tobias

Upvotes: 1

Views: 301

Answers (3)

Silver Light

Reputation: 45932

You would be better off using PHP string functions for this task. It will likely be faster and is not too complex.

For example:

$string = "abcd<xyz>ab<c>d";

$curr_offset = 0;
$matches = array();

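// Find the first '<'; strpos() returns false when there is none.
$opening_tag_pos = strpos($string, '<', $curr_offset);

while ($opening_tag_pos !== false)
{
    $curr_offset = $opening_tag_pos;
    $closing_tag_pos = strpos($string, '>', $curr_offset);
    if ($closing_tag_pos === false) {
        break; // unmatched '<': stop instead of looping forever
    }
    // Copy the text between '<' and '>' (exclusive).
    $matches[] = substr($string, $opening_tag_pos + 1, $closing_tag_pos - $opening_tag_pos - 1);

    // Continue searching after the '>' that was just consumed.
    $curr_offset = $closing_tag_pos;
    $opening_tag_pos = strpos($string, '<', $curr_offset);
}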

/*
     $matches = Array ( [0] => xyz [1] => c ) 
*/

Of course, if you are trying to parse HTML or XML, use an HTML or XML parser instead.
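As a rough illustration of the parser route, here is a minimal sketch using PHP's DOM extension. The XML string is a made-up sample, and it assumes the input is well-formed XML and that the DOM extension is enabled; real HTML would need loadHTML() and more care.

```php
<?php
// Assumed sample input: a small, well-formed XML document.
$xml = '<root><item>one</item><item>two</item><note/></root>';

$doc = new DOMDocument();
$doc->loadXML($xml);

// Collect the name of every element, in document order.
$names = [];
foreach ($doc->getElementsByTagName('*') as $element) {
    $names[] = $element->nodeName;
}

print_r($names);
```

Unlike a regex or a strpos() loop, the parser understands nesting, attributes and self-closing tags, so the tag names come back without any manual string slicing.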

Upvotes: 0

jensgram

Reputation: 31498

If "anything" means "anything except a > character", then you can use:

@<([^>]+)>@

Testing will show if this performs better or worse.
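One way to run that test is a small timing harness, sketched here. The sample string and the iteration count are assumptions, and the helper names extract_tags and time_pattern are hypothetical, not part of any library. Timings depend on the PHP version and the input, so treat them as a rough comparison only.

```php
<?php
// Hypothetical helper: return the captured group of every match.
function extract_tags(string $pattern, string $subject): array
{
    preg_match_all($pattern, $subject, $m);
    return $m[1];
}

// Hypothetical helper: seconds taken to run the pattern $iterations times.
function time_pattern(string $pattern, string $subject, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        preg_match_all($pattern, $subject, $m);
    }
    return microtime(true) - $start;
}

$subject = 'abcd<xyz>ab<c>d<a b>';   // assumed sample input
$word    = '@<([\w]+)>@';            // pattern from the question
$negated = '@<([^>]+)>@';            // pattern from this answer

// The two patterns are not equivalent: only the second accepts "a b".
$word_matches    = extract_tags($word, $subject);     // xyz, c
$negated_matches = extract_tags($negated, $subject);  // xyz, c, a b

printf("word:    %.4fs\n", time_pattern($word, $subject, 100000));
printf("negated: %.4fs\n", time_pattern($negated, $subject, 100000));
```

Because the two patterns accept different inputs, it is worth comparing what they match before comparing how fast they run.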

Also, are you sure that you need to optimize? Does your original regex do what it should?

Upvotes: 1

Spencer Hakim

Reputation: 1553

\w doesn't match everything as you said, by the way, just [a-zA-Z0-9_] by default. Assuming you were using "everything" in a loose manner and \w is what you want, you don't need square brackets around the \w. Otherwise it's fine.
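A quick check of both points, as a sketch with a made-up sample string: the bracketed and unbracketed forms return the same matches, and names containing a hyphen or a space are skipped because \w does not cover those characters.

```php
<?php
$subject = '<abc> <a-b> <x_1> <a b>';   // assumed sample input

preg_match_all('@<([\w]+)>@', $subject, $with_brackets);
preg_match_all('@<(\w+)>@', $subject, $without_brackets);

// Both patterns capture exactly the same names.
var_dump($with_brackets[1] === $without_brackets[1]);

// Only names made of letters, digits and underscores are captured:
// "abc" and "x_1" match, "a-b" and "a b" do not.
print_r($without_brackets[1]);
```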

Upvotes: 1
