luke_mclachlan

Reputation: 1054

Scraping a page to retrieve prices, currency code messing things up

I'm scraping a page using PHP Simple HTML DOM Parser and I want to retrieve the price. It has been going well except for one page, where the HTML reads:

<p class="was-price">Was: &#163;220.00</p>

I want to scrape the part that reads 220.00, and I'm not sure how to retrieve it. So far I have been using preg_replace() successfully to strip unwanted text from a string, but this is the first time I have come across a currency symbol written as a numeric HTML entity (&#163;).

Today is the first day I have used preg_replace(), and it's confusing to say the least. Can it be used to remove currency symbols written this way, or should I be looking at another method? Thanks

Upvotes: 0

Views: 50

Answers (1)

Igor Savinkin

Reputation: 6277

Use html_entity_decode() to decode the HTML entities first, so that &#163; becomes £. Then apply preg_replace() to the decoded string.

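$str = '<p class="was-price">Was: &#163;220.00</p>';
// Decode entities such as &#163; into the characters they stand for (here, £).
$str = html_entity_decode($str, ENT_QUOTES, 'UTF-8');
echo $str; // <p class="was-price">Was: £220.00</p>
// Now apply your existing pattern to the decoded string.
preg_replace(...);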
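
To make the second step concrete, here is a minimal sketch of one way to pull the number out after decoding. The helper name extractPrice() and the regular expression are my own assumptions, not something from the question, and it uses preg_match() to capture the number rather than preg_replace() to strip everything else. Decoding first matters because the entity &#163; itself contains the digits 163, which a digits-only pattern would otherwise pick up.

```php
// Hypothetical helper: the name and the pattern are illustrative assumptions.
function extractPrice(string $html): ?string
{
    // Drop the tags, then turn entities such as &#163; into real characters (£).
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');

    // Capture the first number, allowing thousands separators
    // and an optional decimal part.
    if (preg_match('/\d[\d,]*(?:\.\d+)?/', $text, $m) !== 1) {
        return null; // no number found
    }

    // Remove thousands separators so the result is a plain numeric string.
    return str_replace(',', '', $m[0]);
}

echo extractPrice('<p class="was-price">Was: &#163;220.00</p>'), PHP_EOL; // 220.00
```

If you are already taking the element's text from the DOM parser, you can pass that text in directly; strip_tags() simply leaves it unchanged when there are no tags.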

Upvotes: 1
