Jquestions

Reputation: 1730

Using Regex with product pricing trailing currency symbol

I'm still learning, and regex is mind-numbing stuff. But I have a working regex for preg_match in PHP that picks up product prices following the currency symbol £ or the code GBP. This may be helpful to others, as I couldn't find a working example that handles all the variants (thousands separators, decimals, etc.). Any improvements to the regex are welcome!

My question, though: why does the array contain 3 instances of every number? And what's the meaning of the "2" that follows each one?

(?<=\£|GBP)((\d{1,6}(,\d{3})*)|(\d+))(\.\d{2})?

Function:

function website($url) {

    // Declare the global before it is used, so every match lands in the shared array.
    global $website_prices;

    $xml = new DOMDocument();
    // @ suppresses the warnings loadHTMLFile raises on malformed HTML.
    if (@$xml->loadHTMLFile($url)) {

        $xpath = new DOMXPath( $xml );
        $textNodes = $xpath->query( '//text()' );

        foreach ( $textNodes as $textNode ) {

            if ( preg_match('/(?<=\£|GBP)((\d{1,6}(,\d{3})*)|(\d+))(\.\d{2})?/', $textNode->nodeValue, $matches, PREG_OFFSET_CAPTURE ) ) {

                $website_prices[] = $matches;
            }
        }
    }
}

print_r is dumping:

    [3] => Array
    (
        [0] => Array
            (
                [0] => 545
                [1] => 2
            )

        [1] => Array
            (
                [0] => 545
                [1] => 2
            )

        [2] => Array
            (
                [0] => 545
                [1] => 2
            )

    )

Upvotes: 1

Views: 574

Answers (1)

m87

Reputation: 4523

Your current regex has a lot of unnecessary grouping. The following regex would be suitable in your case:

(?<=£|GBP)[\d.,]+

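As for the three copies and the "2": as far as I can tell, both come from the grouping and from the PREG_OFFSET_CAPTURE flag in the question's code. preg_match puts the whole match in index 0 and then one entry per capture group, and with that flag every entry becomes a pair of matched text and byte offset. Here is a minimal sketch; the sample string '£545' is my assumption of what one of your text nodes looks like, and it assumes the script is saved as UTF-8, where £ takes two bytes.

```php
<?php
// Pattern copied from the question; '£545' is an assumed sample text node.
$re   = '/(?<=\£|GBP)((\d{1,6}(,\d{3})*)|(\d+))(\.\d{2})?/';
$text = '£545';

// With PREG_OFFSET_CAPTURE each entry is [matched text, byte offset].
// Index 0 is the whole match, 1 is the outer group, 2 is the first inner group.
// The groups that did not take part in the match come last and are left out.
preg_match($re, $text, $withOffsets, PREG_OFFSET_CAPTURE);
print_r($withOffsets);

// Without the flag each entry is just the matched text.
preg_match($re, $text, $plain);
print_r($plain);

// If only the price is needed, index 0 is enough.
$price = $plain[0];
echo $price, "\n";
```

So the offset is 2 because the digits start right after the two bytes of £, and dropping the flag (or reading only index 0) gets rid of the extra noise.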

PHP

(implementation)

<?php
   $re = '/(?<=£|GBP)[\d.,]+/';
   $str = '£545 £5450 £54.20 £5450 £545,620 £545,620.96
           GBP545 GBP5450 GBP54.20 GBP5450 GBP545,620 GBP545,620.96';
   preg_match_all($re, $str, $matches);
   print_r($matches);
?>

(output)

Array
(
    [0] => Array
        (
            [0] => 545
            [1] => 5450
            [2] => 54.20
            [3] => 5450
            [4] => 545,620
            [5] => 545,620.96
            [6] => 545
            [7] => 5450
            [8] => 54.20
            [9] => 5450
            [10] => 545,620
            [11] => 545,620.96
        )
)
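One caveat: [\d.,]+ accepts any run of digits, dots and commas, so a price followed by punctuation in running text picks up the trailing character. If that matters for your pages, a stricter pattern with non-capturing groups might be worth trying. This is only a sketch: the sample sentence is made up, the variable names are mine, and it assumes prices use comma thousands separators and exactly two decimal places, as in the question.

```php
<?php
// Hypothetical sample text: prices followed by a comma or a full stop.
$str = 'Was £545, now GBP1,299.50. Deposit: £54.20.';

// Loose pattern from above: any run of digits, dots and commas.
$loose = '/(?<=£|GBP)[\d.,]+/';

// Stricter sketch: either comma-grouped thousands or plain digits,
// then an optional two-digit decimal part. (?: ) groups do not capture,
// so the result holds only the whole matches.
$strict = '/(?<=£|GBP)(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d{2})?/';

preg_match_all($loose, $str, $looseMatches);
preg_match_all($strict, $str, $strictMatches);

print_r($looseMatches[0]);   // includes the trailing punctuation
print_r($strictMatches[0]);  // prices only
```

The comma-grouped alternative is listed first so that a value like 1,299.50 is tried as a whole before the plain-digits alternative gets a chance.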

Upvotes: 1
