Ice

Reputation: 169

Tricky Question: How to order results from multiple regexes

I currently combine 3 different regular expressions in one preg_match call, using the or sign | to separate them. This works perfectly. However, the first and second regex produce the same type of output, e.g. [0] Source Text, [1] Number Amount, [2] Name, whereas the last one, since its source text is arranged differently, results in [0] Source Text, [1] Name, [2] Number Amount.

    preg_match('/^Guo (\d+) Cars @(\w+)|^AV (\d+) Cars @(\w+)|^@(\w+) (\d+) [#]?av/i', $source, $output);

Since Name can be numeric, I can't do a simple check to see which value is numeric. Is there a way I can either switch the order in the regex or identify which regex it matched? Speed is of the essence here, so I didn't want to use 3 separate preg_match statements (and more are to come).

Upvotes: 2

Views: 390

Answers (3)

Martijn Laarman

Reputation: 13536

Three separate regular expressions don't have to be slower. One big pattern can mean a lot of backtracking for the regular expression engine. The key to regular expression optimisation is to make the engine fail as soon as possible. Did you do some benchmarking with them pulled apart?
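
As a rough sketch of what pulling them apart could look like: the helper below tries each anchored pattern in turn and normalises the capture order, which also tells you which pattern matched. The function name `parse_separate` and the sample strings are made up for illustration, and the timing loop is only a starting point for your own benchmark, not a measurement.

```php
<?php
// Hypothetical helper: try each anchored pattern in turn and normalise the
// capture order so callers always get ['amount' => ..., 'name' => ...].
function parse_separate(string $source): ?array
{
    // Each entry: [pattern, index of the amount group, index of the name group].
    $patterns = [
        ['/^Guo (\d+) Cars @(\w+)/i', 1, 2],
        ['/^AV (\d+) Cars @(\w+)/i', 1, 2],
        ['/^@(\w+) (\d+) #?av/i', 2, 1],
    ];
    foreach ($patterns as [$pattern, $amountIdx, $nameIdx]) {
        if (preg_match($pattern, $source, $m)) {
            return ['amount' => $m[$amountIdx], 'name' => $m[$nameIdx]];
        }
    }
    return null; // none of the patterns matched
}

// Rough timing only; real numbers depend on the input mix and the PHP build.
$samples = ['Guo 12 Cars @bob', 'AV 7 Cars @42', '@alice 3 #av', 'no match'];
$start = microtime(true);
for ($i = 0; $i < 100000; $i++) {
    parse_separate($samples[$i % count($samples)]);
}
printf("separate patterns: %.3f s\n", microtime(true) - $start);
```

Timing the combined pattern over the same inputs would give a like-for-like comparison.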

In your case you can make use of PCRE's named captures, (?<name>match something here), and replace with ${name} instead of \1. I'm not 100% certain this works for preg_replace. I do know for certain that preg_match stores named captures correctly, though.

PCRE needs to be compiled with the PCRE_DUPNAMES option for that to be useful in your case (as in RoBorg's post). I'm not sure if PHP's compiled PCRE DLL file has that option set.

Upvotes: 3

Gumbo

Reputation: 655499

I don’t know which version of PCRE first supported the duplicate subpattern numbers syntax (?| … ), but try this regular expression:

    /^(?|Guo (\d+) Cars @(\w+)|AV (\d+) Cars @(\w+)|@(\w+) (\d+) #?av)/i

So:

    $source = '@abc 123 av';
    preg_match('/^(?|Guo (\\d+) Cars @(\\w+)|AV (\\d+) Cars @(\\w+)|@(\\w+) (\\d+) #?av)/i', $source, $output);
    var_dump($output);
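
To make the effect of the (?| … ) group concrete, here is a small sketch that runs the same pattern against two inputs. The sample strings and the variable names `$first` and `$third` are only illustrative. Every alternative fills indexes 1 and 2, but each keeps its own left-to-right order, so the third form still puts the name at index 1.

```php
<?php
$pattern = '/^(?|Guo (\d+) Cars @(\w+)|AV (\d+) Cars @(\w+)|@(\w+) (\d+) #?av)/i';

preg_match($pattern, 'Guo 12 Cars @bob', $first);
preg_match($pattern, '@abc 123 av', $third);

// Both results have exactly three elements, because (?| ... ) restarts
// group numbering at 1 in every alternative.
var_dump($first); // [0] full match, [1] '12',  [2] 'bob'
var_dump($third); // [0] full match, [1] 'abc', [2] '123'
```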

Upvotes: 0

Greg

Reputation: 321766

You could use named capture groups:

    // The leading (?J) lets PCRE accept the same group name in several alternatives.
    preg_match('/(?J)^Guo (?P<number_amount>\d+) Cars @(?P<name>\w+)|^AV (?P<number_amount>\d+) Cars @(?P<name>\w+)|^@(?P<name>\w+) (?P<number_amount>\d+) [#]?av/i', $source, $output);
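
A short sketch of reading the result by name rather than by position. It assumes the inline (?J) option is available, since PCRE rejects repeated group names without it, and the sample inputs are made up for illustration.

```php
<?php
// (?J) is PCRE's inline switch for allowing duplicate group names.
$pattern = '/(?J)^Guo (?P<number_amount>\d+) Cars @(?P<name>\w+)'
         . '|^AV (?P<number_amount>\d+) Cars @(?P<name>\w+)'
         . '|^@(?P<name>\w+) (?P<number_amount>\d+) #?av/i';

foreach (['Guo 12 Cars @bob', 'AV 7 Cars @42', '@abc 123 av'] as $source) {
    if (preg_match($pattern, $source, $output)) {
        // The named keys are the same whichever alternative matched.
        printf("%s => name=%s, amount=%s\n", $source, $output['name'], $output['number_amount']);
    }
}
```

The numeric indexes in $output still differ between alternatives, so only the named keys should be relied on here.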

Upvotes: 3
