Reputation: 4099
I need a fairly complex regex to accomplish the following:
> replace numbers in a string, e.g. 700, 12.43, with a label (format: {NUMBER:xx})
> ignore: when the number is between {braces}, e.g. {7}, {7th}
> ignore: when any character is attached to the number, e.g. G3, 7x, 1/2
> except: when preceded by $, e.g. $840
> except: when followed by .!?:, e.g. 33! 45.65? 4...
Taken all together:
Buy 4 {5} G3 Mac computers for 80% at $600 or 2 for 1/2 price: 200...
dollar. Twice - 2x - as cheap!
Desired output:
Buy {NUMBER:4} {5} G3 Mac computers for 80% at
$ {NUMBER:600} or {NUMBER:2} for 1/2 price:
{NUMBER:200}... dollar. Twice - 2x - as cheap!
I now have this:
preg_replace("/(?<!{)(?>[0-9]+(?:\.[0-9]+)?)(?!})/", " {NUMBER:$0} ", $string);
which outputs:
Buy {NUMBER:4} {5} G {NUMBER:3} Mac computers for {NUMBER:80} % at
$ {NUMBER:600} or {NUMBER:2} for {NUMBER:1} / {NUMBER:2} price:
{NUMBER:200} ... dollar. Twice - {NUMBER:2} x - as cheap!
In other words, the ignore rules and their exceptions aren't working yet, and I don't know how to implement them properly. Can anyone help me out?
Upvotes: 1
Views: 178
Reputation: 336168
This works for your test cases and follows your rules, assuming that braces are correctly matched and unnested:
$result = preg_replace(
    '/(?<!\{)            # Assert no preceding {
    (?<![^\s$])          # Assert no preceding non-whitespace except $
    \b                   # Match start of number
    (\d+(?:\.\d+)?+)     # Match number (optional decimal part)
    \b                   # Match end of number
    (?![^{}]*\})         # Assert that next brace is not a closing brace
    (?![^\s.!?:,])       # Assert no following non-whitespace except .!?:,
    /x',
    '{NUMBER:\1}', $string);
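As a quick sanity check, here is a minimal sketch that runs the examples from the question through a compact, single-line pattern built from the same lookarounds. The helper name `label_numbers` is hypothetical, and the set of allowed trailing characters (`.!?:,`) is an assumption that combines the question's list with a comma.

```php
<?php
// Hypothetical helper: wraps free-standing numbers in {NUMBER:...}.
function label_numbers(string $text): string
{
    // Assumed trailing punctuation set: . ! ? : ,
    $pattern = '/(?<![^\s$])\b(\d+(?:\.\d+)?+)\b(?![^{}]*\})(?![^\s.!?:,])/';
    return preg_replace($pattern, '{NUMBER:$1}', $text);
}

// The individual examples listed in the question's rules.
$cases = ['700', '12.43', '{7}', '{7th}', 'G3', '7x', '1/2', '$840', '33!', '45.65?', '4...'];
foreach ($cases as $case) {
    printf("%-8s => %s\n", $case, label_numbers($case));
}

// The full sample sentence (joined onto one line here).
$sample = 'Buy 4 {5} G3 Mac computers for 80% at $600 or 2 for 1/2 price: 200... dollar. Twice - 2x - as cheap!';
echo label_numbers($sample), "\n";
// Buy {NUMBER:4} {5} G3 Mac computers for 80% at ${NUMBER:600} or {NUMBER:2} for 1/2 price: {NUMBER:200}... dollar. Twice - 2x - as cheap!
```

Only the free-standing numbers and the ones covered by the `$` and punctuation exceptions should come back wrapped; everything else should pass through unchanged.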
Upvotes: 2
Reputation: 65274
$string="Buy 4 {5} G3 Mac computers for 80% at \$600 or 2 for 1/2 price: 200... \ndollar. Twice - 2x - as cheap!";
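// Lookarounds test the neighbouring characters without consuming them; the decimal part is \.[0-9]+
$pattern='/(?<![^\s$])([0-9]+(?:\.[0-9]+)?)(?![^\s.!?:,])/';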
//$count=preg_match_all($pattern, $string, $matches);
//echo "$count\n";
//print_r($matches[1]);
echo preg_replace($pattern,"{NUMBER:\$1}",$string);
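One thing to watch with patterns of this kind: any delimiter character that is matched outside a lookaround becomes part of the match, so preg_replace replaces it along with the number. The sketch below compares two simplified, hypothetical patterns (integers only, helper names invented for illustration): one that consumes the surrounding characters and one that only asserts them.

```php
<?php
// Consuming form: the character before and after the number is matched,
// so it disappears when the match is replaced.
function label_consuming(string $text): string
{
    return preg_replace('/[\s$]([0-9]+)[\s.!?:,]/', '{NUMBER:$1}', $text);
}

// Lookaround form: the neighbours are only tested, never replaced.
function label_lookaround(string $text): string
{
    return preg_replace('/(?<![^\s$])([0-9]+)(?![^\s.!?:,])/', '{NUMBER:$1}', $text);
}

$text = 'at $600 or 2 for';
echo label_consuming($text), "\n";  // at {NUMBER:600}or{NUMBER:2}for
echo label_lookaround($text), "\n"; // at ${NUMBER:600} or {NUMBER:2} for

// Neighbouring numbers that share a separator: the consuming form skips every other one.
$list = ' 1 2 3 ';
echo label_consuming($list), "\n";  // {NUMBER:1}2{NUMBER:3}
echo label_lookaround($list), "\n"; // " {NUMBER:1} {NUMBER:2} {NUMBER:3} "
```

The consuming form also cannot match a number at the very start or end of the string, because there is no character there for the bracket expressions to match.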
Upvotes: 1