William Entriken
William Entriken

Reputation: 39253

PHP query XML with SQL

I am doing some web scraping and come across several tables of data that I want to query against. Currently I'm up to:

$url = 'http://finance.yahoo.com/q/op?s=QQQQ&m=2012-04';
$html = @DOMDocument::loadHTMLFile($url); 
$xml = simplexml_import_dom($html); 
$results = $xml->xpath('//table[@class="yfnc_datamodoutline1"]');
var_dump($results);

Produces results: http://pastebin.com/6p3L2Kcc

This is well-ordered HTML table data, with TH and TD and everything. I'd like to use it like this:

$sql = 'SELECT Last,Open_Int FROM TABLE1 WHERE Last>25 AND Symbol LIKE "%C%"';
$results = $xmltable->sql($sql);
while($result = $results->fetch_assoc())
  echo $result['Last'] . " -- " . $result['Open_Int'] . "\n";

Without any creativity, I can write classes to parse that HTML table, take the first row, create a table in sqlite, select other rows and turn them into insert statements. But, do you know a better way to do this, or is there some powerful PHP function that I'm not seeing?

Update: Perhaps the scope here is too big. I'd be happy with a link to a library or advice on getting an HTML table in to a (proper) XML table.

Upvotes: 0

Views: 1430

Answers (1)

Ken Downs
Ken Downs

Reputation: 4827

The answer depends on your larger needs. Here are three questions that can flesh those out:

1) How often is the data read vs. written?

2) Do you keep old versions or is only the latest required?

3) Will the data be compared to other data?

In one case let's say the answer to #1 is "many more reads" and the answer to #3 is "yes". In this case it might be well worthwhile to put the XML results into a SQL table for frequent and flexible querying.

However, in another case, let's say the answer to #2 is "no" and the answer to #3 is "no" -- you just keep the latest retrieval and don't compare it to anything. In this case you can just stick into a file and retrieve it as needed for display (#1 becomes kind of irrelevant).

EDIT in response to question in comment: Assuming you want to put it into a database, the display you link to shows a nested set of objects/arrays. You "walk the tree" to peel out the nested objects, strip off their properties and issue individual inserts to the particular tables.

Upvotes: 1

Related Questions