Dasa

Reputation: 297

Scraping only specific, identifiable elements from XML

I have the following code to extract the currency name, currency code, country and rate. The problem is that it grabs all the currencies; I would like to get only certain ones, i.e. only USD, GBP and CAD.

<?php

$xchange = new SimpleXMLElement('http://www.bankisrael.gov.il/currency.xml', NULL, TRUE);

echo "
<table>
        <tr>
                <th>n</th>
                <th>cc</th>
                <th>c</th>
                <th>r</th>

        </tr>";

foreach($xchange as $curr) // loop through the currencies
{

        echo "
        <tr>
                <td>{$curr->NAME}</td>
                <td>{$curr->CURRENCYCODE}</td>
                <td>{$curr->COUNTRY}</td>
                <td>{$curr->RATE}</td>
        </tr>";
}

echo '</table>';
?>

Upvotes: 0

Views: 331

Answers (5)

Touchpad

Reputation: 722

I do this as follows:

function xml ($url) {
  $text = file_get_contents($url) or die("ERROR: Unable to read file");

  $p = xml_parser_create();

  // Keep tag names in their original case instead of upper-casing them
  xml_parser_set_option($p, XML_OPTION_CASE_FOLDING, 0);
  xml_parse_into_struct($p, $text, $vals, $index);
  xml_parser_free($p);

  // 'Title', 'var1', 'var2' and 'var3' are placeholder tag names
  $info = array();
  for($i=0; $i<count($index['Title']); $i++) {
    foreach(array('var1', 'var2', 'var3') as $key) {
      $info[$i][$key] = $vals[$index[$key][$i]]['value'];
    }
  }

  return $info;
}

Upvotes: 0

Doug

Reputation: 6442

Simply replace

foreach($xchange as $curr) // loop through our books
{
        echo "
        <tr>
                <td>{$curr->NAME}</td>
                <td>{$curr->CURRENCYCODE}</td>
                <td>{$curr->COUNTRY}</td>
                <td>{$curr->RATE}</td>
        </tr>";
}

with

foreach($xchange as $curr) // loop through the currencies
{
    // The codes in the feed are upper case, so compare against upper-case strings
    if (in_array((string) $curr->CURRENCYCODE, array('USD', 'GBP', 'CAD'), true)) {
            echo "
            <tr>
                    <td>{$curr->NAME}</td>
                    <td>{$curr->CURRENCYCODE}</td>
                    <td>{$curr->COUNTRY}</td>
                    <td>{$curr->RATE}</td>
            </tr>";
    }
}

Upvotes: 0

Samuel Herzog

Reputation: 3611

Code speaking

$xchange = new SimpleXMLElement('http://www.bankisrael.gov.il/currency.xml', NULL, TRUE);
$filterCurrencies = array( 'USD', 'GBP' );

// Build an XPath predicate such as: text()="USD" or text()="GBP"
$filter = implode( ' or ', array_map( function($code) { return 'text()="'.$code.'"'; }, $filterCurrencies ) );

$xpathQuery = '//CURRENCYCODE[%filter%]/parent::*';
$xpathQuery = str_replace('%filter%', $filter, $xpathQuery);

$currencies = $xchange->xpath($xpathQuery);

/** I know you already have the code to echo it ... the code is tested, feel free to copy & paste **/

Step by Step

Okay, first of all, you're using a SimpleXML object to read in the data from the Bank of Israel. I suggest leveraging this object to do most of the work (which is usually faster than filtering in PHP, although SimpleXML isn't the best choice for performance).

So what do we want to accomplish? Getting the data of a <CURRENCY> based on the content of its <CURRENCYCODE> element. To a web designer this should sound like CSS, but that isn't quite right. To a web developer holding an XML document it should sound like XPath, which is the ideal choice! Fortunately, SimpleXML lets us use XPath, so we'll build a query:

XPath basics:
//CURRENCYCODE will select any <CURRENCYCODE> element
//CURRENCYCODE/parent::* will select each <CURRENCYCODE> element's parent (<CURRENCY>), which is where our data is
//CURRENCYCODE[text()="JPY"] will select only <CURRENCYCODE> elements whose text equals JPY exactly

Now we add our list of requirements:

$filterCurrencies = array( 'USD', 'GBP' ); // we want US dollars and British pounds
$filter = implode( ' or ', array_map( function($token) { return 'text()="'.$token.'"'; }, $filterCurrencies ) );
// this builds a string like 'text()="USD" or text()="GBP"': each currency code is mapped to a text() comparison, and the comparisons are glued together with a logical "or"

Now the only thing left to do is to integrate that with our XPath template:

$xpath = '//CURRENCYCODE[%filter%]/parent::*';
$xpath = str_replace('%filter%', $filter, $xpath);

$currencies = $xchange->xpath($xpath);

Happy looping!
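
For anyone who wants to try the query without hitting the live feed, here is a minimal sketch of the same idea run against an inline sample. The root element name <CURRENCIES> and all the values are assumptions made up for illustration; only the child element names come from the question's code.

```php
<?php
// Inline sample standing in for the live feed. The root name and the values
// are illustrative assumptions, not real data from the bank.
$sample = <<<XML
<CURRENCIES>
  <CURRENCY><NAME>Dollar</NAME><CURRENCYCODE>USD</CURRENCYCODE><COUNTRY>USA</COUNTRY><RATE>3.5</RATE></CURRENCY>
  <CURRENCY><NAME>Pound</NAME><CURRENCYCODE>GBP</CURRENCYCODE><COUNTRY>Great Britain</COUNTRY><RATE>5.5</RATE></CURRENCY>
  <CURRENCY><NAME>Yen</NAME><CURRENCYCODE>JPY</CURRENCYCODE><COUNTRY>Japan</COUNTRY><RATE>4.2</RATE></CURRENCY>
</CURRENCIES>
XML;

$xchange = new SimpleXMLElement($sample);
$wanted  = array('USD', 'GBP');

// Turn each wanted code into a text() comparison and join them with "or".
$conditions = array_map(function ($code) {
    return 'text()="' . $code . '"';
}, $wanted);
$xpath = '//CURRENCYCODE[' . implode(' or ', $conditions) . ']/parent::*';

// Each match is the <CURRENCY> parent of a matching <CURRENCYCODE>.
$codes = array();
foreach ($xchange->xpath($xpath) as $currency) {
    $codes[] = (string) $currency->CURRENCYCODE;
    echo $currency->NAME, ' (', $currency->CURRENCYCODE, '): ', $currency->RATE, "\n";
}
```

The codes are interpolated into the query as-is, so this only makes sense for a fixed, trusted list such as the one above.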

Upvotes: 0

Berry Langerak

Reputation: 18859

This filters out everything that isn't in $filter; a currency is displayed only if its code is in $filter. I've tested it, and it works like a charm.

<?php

/**
 * This is an array indicating which currencies you want to show.
 */
$filter = array(
    'usd',
    'gbp',
    'cad'
);

$root = simplexml_load_file( 'http://www.bankisrael.gov.il/currency.xml' );

echo "
<table>
    <tr>
        <th>n</th>
        <th>cc</th>
        <th>c</th>
        <th>r</th>
    </tr>\n";

foreach( $root->CURRENCY as $currency ) {
    if( in_array( strtolower( (string) $currency->CURRENCYCODE ), $filter ) ) {
        echo "
    <tr>
        <td>{$currency->NAME}</td>
        <td>{$currency->CURRENCYCODE}</td>
        <td>{$currency->COUNTRY}</td>
        <td>{$currency->RATE}</td>
    </tr>\n";
    }
}

echo "</table>\n";

Upvotes: 1

James C

Reputation: 14159

This piece of code should do the job in a much cleaner way:

<?php

$a = getCurrencyData('http://www.bankisrael.gov.il/currency.xml');
print_r($a);

function getCurrencyData($url) {

    $raw = file_get_contents($url);
    $xml = new SimpleXMLElement($raw);

    $ret = array();

    foreach($xml->CURRENCY as $currency) {
        $currency = (array) $currency;
        $ret[$currency['CURRENCYCODE']] = $currency;
    }

    return $ret;
}

You can now get at the currencies you want by using the currency code, e.g. $a['GBP'].
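
If you want several currencies at once, one possible approach is to intersect the keyed array with the list of wanted codes. The sketch below uses a hand-written stand-in for the array that getCurrencyData() returns; the values are made up for illustration, and $wanted and $selected are names introduced here.

```php
<?php
// Stand-in for the array returned by getCurrencyData(), keyed by currency
// code. The values are illustrative, not real rates.
$a = array(
    'USD' => array('NAME' => 'Dollar', 'CURRENCYCODE' => 'USD', 'COUNTRY' => 'USA', 'RATE' => '3.5'),
    'JPY' => array('NAME' => 'Yen', 'CURRENCYCODE' => 'JPY', 'COUNTRY' => 'Japan', 'RATE' => '4.2'),
    'GBP' => array('NAME' => 'Pound', 'CURRENCYCODE' => 'GBP', 'COUNTRY' => 'Great Britain', 'RATE' => '5.5'),
    'CAD' => array('NAME' => 'Dollar', 'CURRENCYCODE' => 'CAD', 'COUNTRY' => 'Canada', 'RATE' => '3.4'),
);

$wanted = array('USD', 'GBP', 'CAD');

// array_flip() turns the codes into keys, so array_intersect_key() keeps
// only the entries of $a whose key is one of the wanted codes.
$selected = array_intersect_key($a, array_flip($wanted));

foreach ($selected as $code => $currency) {
    echo $code, ': ', $currency['RATE'], "\n";
}
```

Codes that are missing from the feed are simply skipped, so no isset() check is needed per currency.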

Upvotes: 4
