Pulling element/data from an external website with PHP and cURL

Question

I'm trying to pull in an element from an external website using PHP and cURL.

The page I'm trying to pull content from is: http://www.stayclassy.org/fundraise?fcid=231864
The element I'm targeting is the dollar amount under the "Raised So Far" list item, at the top of the right column (right now the value is $10).

Here is the code I'm using to extract the data:

    define("TARGET", "http://www.stayclassy.org/fundraise?fcid=231864");

    $curl = curl_init();
    curl_setopt($curl, CURLOPT_URL, TARGET);
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);

    if(!($results = curl_exec($curl))) {
        print("{ \"total\": \"$0.00\" }");
        return;
    }

    $pattern = '/\ \$(.+?) \<\/li\>\<\/a\>/';
    preg_match_all($pattern, $results, $matches);

    $total = $matches[1][0];
    $total = str_replace(",", "", $total);

    printf("{ \"total\": \"$%s\" }", formatMoney($total, true));

    function formatMoney($number, $fractional=false)
    {
        if ($fractional) {
            $number = sprintf('%.2f', $number);
        }
        while (true) {
            $replaced = preg_replace('/(-?\d+)(\d\d\d)/', '$1,$2', $number);
            if ($replaced != $number) {
                $number = $replaced;
            } else {
                break;
            }
        }
        return $number;
    }

The issue I'm having is that the list item/element I'm targeting doesn't have a unique ID or class. In fact, the dollar amount is located in a separate list item without a class.

I was wondering how to target a specific list item in an unordered list using the code above, particularly when it doesn't have a class. Any ideas?

Gareth Cornish · Accepted Answer

Targeting the specific item requires that you identify a unique string around it. To do this you just expand further and further out until you find a string you can identify that only occurs once. So, the line you want is:

    <li>$10</li>

but this is not unique at all. So we expand the string by adding the previous line as well:

    <li>Raised so far:</li>
    <li>$10</li>

and bingo, this string is unique for your needs. The string is fairly constant except for your amount, so it will be easy to use. So you need a regular expression that finds this string. I'd use something like this:

    $pattern = '/<li>Raised so far:<\/li>\s*<li>\$(\d+)<\/li>/';

You don't need to use preg_match_all because you only expect to get one match:

    preg_match($pattern, $results, $matches);
    $total = $matches[1];
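
One thing worth guarding against: if the page layout changes, preg_match finds nothing and $matches[1] is undefined. Here is a minimal sketch of a safer version. The function name extractTotal and the sample markup are made up for illustration, and the pattern assumes the label and the amount sit in adjacent plain `<li>` elements, which may not be exactly what the real page serves.

```php
<?php
// Hypothetical helper: returns the captured digits, or null when nothing matches.
function extractTotal(string $html): ?string
{
    // Assumption: label and amount are in adjacent <li> elements without attributes.
    $pattern = '/<li>Raised so far:<\/li>\s*<li>\$(\d+)<\/li>/';

    // preg_match returns 1 on a match, 0 on no match, and false on error.
    if (preg_match($pattern, $html, $matches) !== 1) {
        return null;
    }
    return $matches[1];
}

// Made-up fragment standing in for the fetched page.
$sample = <<<'HTML'
<ul>
  <li>Raised so far:</li>
  <li>$10</li>
</ul>
HTML;

// Fall back to a default when the pattern does not match.
$total = extractTotal($sample) ?? '0.00';
echo json_encode(['total' => '$' . $total]), "\n"; // prints {"total":"$10"}
```

Using json_encode here instead of a hand-built string is just a convenience; it takes care of the quoting for you.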

Your other options include loading the page with a DOMDocument, and then using XPath or getElementById to parse the DOM. But that may be a little too much effort for this task.
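
If you do want to go the DOM route, it could look roughly like the sketch below. It assumes the dom extension is available and that the amount is the first `<li>` sibling after the one containing the label; the function name extractTotalWithXPath is made up. Note that XPath's contains() is case-sensitive, so the label text has to match what the page actually serves.

```php
<?php
// Hypothetical helper: returns the text of the <li> after the label, or null.
function extractTotalWithXPath(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML rejects an empty string
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Assumption: the amount is the first <li> sibling following the label's <li>.
    $nodes = $xpath->query(
        '//li[contains(normalize-space(.), "Raised so far")]/following-sibling::li[1]'
    );
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

// Made-up fragment standing in for the fetched page.
$fragment = '<ul><li>Raised so far:</li><li>$10</li></ul>';
echo extractTotalWithXPath($fragment) ?? 'not found', "\n"; // prints $10
```

This survives whitespace and attribute changes better than a regular expression, at the cost of parsing the whole page.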

Also, I'd use file_get_contents to fetch the contents of the remote site. But that's just me.
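
For reference, a file_get_contents version might look like this. It only works for URLs when allow_url_fopen is enabled in php.ini; the fetchPage name and the 10 second timeout are arbitrary choices of mine, not anything from your code.

```php
<?php
// Hypothetical helper: returns the response body, or null if the fetch failed.
function fetchPage(string $url, int $timeoutSeconds = 10): ?string
{
    // The "http" context options apply to http:// and https:// URLs.
    $context = stream_context_create([
        'http' => ['timeout' => $timeoutSeconds],
    ]);

    // @ suppresses the warning emitted on failure; the false return is checked instead.
    $body = @file_get_contents($url, false, $context);
    return $body === false ? null : $body;
}

// Usage (makes a real network request, so it is left commented out here):
// $results = fetchPage('http://www.stayclassy.org/fundraise?fcid=231864');
```

cURL gives you more control (redirects, headers, error codes), so sticking with it is perfectly reasonable too.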

UPDATE: To handle thousands separators as well, modify your pattern as follows:

    $pattern = '/<li>Raised so far:<\/li>\s*<li>\$([\d\.,]+)<\/li>/';
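
Once the capture can contain separators, you will probably want to turn it into a number before formatting it again. The sketch below assumes US-style amounts ("," for thousands, "." for decimals) and adjacent plain `<li>` elements; parseRaisedAmount is a made-up name. PHP's built-in number_format could stand in for the custom formatMoney function from the question.

```php
<?php
// Hypothetical helper: returns the amount as a float, or null when nothing matches.
function parseRaisedAmount(string $html): ?float
{
    // Assumption: label and amount are in adjacent <li> elements without attributes.
    $pattern = '/<li>Raised so far:<\/li>\s*<li>\$([\d.,]+)<\/li>/';
    if (preg_match($pattern, $html, $matches) !== 1) {
        return null;
    }
    // Drop the thousands separators before converting to a number.
    return (float) str_replace(',', '', $matches[1]);
}

// Made-up fragment standing in for the fetched page.
$amount = parseRaisedAmount('<li>Raised so far:</li> <li>$1,234.5</li>');

// number_format adds the thousands separators and pads to two decimals.
echo json_encode(['total' => '$' . number_format($amount ?? 0.0, 2)]), "\n";
// prints {"total":"$1,234.50"}
```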
