Reputation: 1620
OK, I'm trying to build an XML feed from this HTML table using PHP Simple HTML DOM Parser.
<table>
<tr><td colspan="5"><strong>Saturday October 15 2011</strong></td></tr>
<tr><td>Team 1</td> <td>vs</td> <td>Team 7</td> <td>3:00 pm</td></tr>
<tr><td>Team 2</td> <td>vs</td> <td>Team 12</td> <td>3:00 pm</td></tr>
<tr><td>Team 3</td> <td>vs</td> <td>Team 8</td> <td>3:00 pm</td></tr>
<tr><td>Team 4</td> <td>vs</td> <td>Team 10</td> <td>3:00 pm</td></tr>
<tr><td>Team 5</td> <td>vs</td> <td>Team 11</td> <td>3:00 pm</td></tr>
<tr><td colspan="5"><strong>Monday October 17 2011</strong></td></tr>
<tr><td>Team 6</td> <td>vs</td> <td>Team 9</td> <td>7:45 pm</td></tr>
<tr><td colspan="5"><strong>Saturday October 22 2011</strong></td></tr>
<tr><td>Team 7</td> <td>vs</td> <td>Team 12</td> <td>3:00 pm</td></tr>
<tr><td>Team 1</td> <td>vs</td> <td>Team 2</td> <td>3:00 pm</td></tr>
<tr><td>Team 8</td> <td>vs</td> <td>Team 4</td> <td>3:00 pm</td></tr>
<tr><td>Team 3</td> <td>vs</td> <td>Team 6</td> <td>3:00 pm</td></tr>
<tr><td>Team 9</td> <td>vs</td> <td>Team 5</td> <td>3:00 pm</td></tr>
<tr><td>Team 10</td> <td>vs</td> <td>Team 11</td> <td>3:00 pm</td></tr>
</table>
What I am aiming to do is extract each date and then the rows that follow it, up until the next date, so that I can build an XML node like this for each of the dates:
<matchday date="Saturday October 15 2011">
    <fixture>
        <hometeam>Team 1</hometeam>
        <awayteam>Team 7</awayteam>
        <kickoff>3:00 pm</kickoff>
    </fixture>
    <fixture>
        <hometeam>Team 2</hometeam>
        <awayteam>Team 12</awayteam>
        <kickoff>3:00 pm</kickoff>
    </fixture>
</matchday>
So far I have extracted each of the dates from the HTML and built their respective XML nodes:
$dateNodes = $html->find('table tr td[colspan="5"] strong');
foreach ($dateNodes as $date) {
    echo '<matchday date="' . trim($date->innertext) . '">';
    // FIXTURES
    // END FIXTURES
    echo '</matchday>';
}
How would I go about getting the team names, kickoff time, etc. for each fixture up until the next matchday date?
Upvotes: 3
Views: 763
Reputation: 317049
Instead of SimpleHtmlDom (which I believe is a craptaculous library), you can use an XSLT transformation with PHP's native XSLT processor:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes" method="xml"/>
    <xsl:template match="/">
        <matchdays>
            <xsl:for-each select="table/tr[td[@colspan=5]]">
                <matchday>
                    <xsl:attribute name="date">
                        <xsl:value-of select="td/strong"/>
                    </xsl:attribute>
                    <xsl:for-each select="following-sibling::tr[
                        not(td[@colspan]) and
                        preceding-sibling::tr[td[@colspan]][1] = current()
                    ]">
                        <fixture>
                            <hometeam><xsl:value-of select="td[1]"/></hometeam>
                            <awayteam><xsl:value-of select="td[3]"/></awayteam>
                            <kickoff><xsl:value-of select="td[4]"/></kickoff>
                        </fixture>
                    </xsl:for-each>
                </matchday>
            </xsl:for-each>
        </matchdays>
    </xsl:template>
</xsl:stylesheet>
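Then just use the code given in the example at http://php.net/manual/en/xsltprocessor.transformtoxml.php to transform your HTML to XML. Note that `load()` expects the source to be well-formed XML, so every tag in the table has to be closed properly: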
$xml = new DOMDocument;
$xml->load('YourSourceFile.xml');
$xsl = new DOMDocument;
$xsl->load('YourStyleSheet.xsl');
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
echo $proc->transformToXML($xml);
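If you want to try the transformation without any files on disk, here is a minimal sketch that runs the same stylesheet against an inline, well-formed copy of a few rows. The variable names and the shortened sample table are assumptions for illustration, and it assumes the `xsl` extension is enabled. Keep in mind that the comparison against `current()` matches on the text of the date row, so it relies on each date appearing only once in the table.

```php
<?php
// Shortened, hypothetical sample of the table from the question (well-formed XML).
$tableXml = <<<'XML'
<table>
<tr><td colspan="5"><strong>Saturday October 15 2011</strong></td></tr>
<tr><td>Team 1</td> <td>vs</td> <td>Team 7</td> <td>3:00 pm</td></tr>
<tr><td>Team 2</td> <td>vs</td> <td>Team 12</td> <td>3:00 pm</td></tr>
<tr><td colspan="5"><strong>Monday October 17 2011</strong></td></tr>
<tr><td>Team 6</td> <td>vs</td> <td>Team 9</td> <td>7:45 pm</td></tr>
</table>
XML;

// Same stylesheet as above, kept in a string instead of a separate .xsl file.
$stylesheet = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes" method="xml"/>
    <xsl:template match="/">
        <matchdays>
            <xsl:for-each select="table/tr[td[@colspan=5]]">
                <matchday>
                    <xsl:attribute name="date">
                        <xsl:value-of select="td/strong"/>
                    </xsl:attribute>
                    <xsl:for-each select="following-sibling::tr[
                        not(td[@colspan]) and
                        preceding-sibling::tr[td[@colspan]][1] = current()
                    ]">
                        <fixture>
                            <hometeam><xsl:value-of select="td[1]"/></hometeam>
                            <awayteam><xsl:value-of select="td[3]"/></awayteam>
                            <kickoff><xsl:value-of select="td[4]"/></kickoff>
                        </fixture>
                    </xsl:for-each>
                </matchday>
            </xsl:for-each>
        </matchdays>
    </xsl:template>
</xsl:stylesheet>
XSL;

// Parse both strings as XML documents.
$source = new DOMDocument;
$source->loadXML($tableXml);

$xsl = new DOMDocument;
$xsl->loadXML($stylesheet);

// Run the transformation and keep the result as a string.
$proc = new XSLTProcessor;
$proc->importStylesheet($xsl);
$result = $proc->transformToXml($source);

echo $result;
```

If the real source is HTML rather than well-formed XML, loading it with `loadHTML()` wraps the table in `html` and `body` elements, so the `table/tr` path in the stylesheet would have to become something like `//table/tr`.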
In addition to using XSLT, you can also do it with PHP's native DOM extension:
$xml = new DOMDocument;
$xml->loadHtmlFile('YourHtmlFile.xml');
$xp = new DOMXPath($xml);
$new = new DOMDocument('1.0', 'utf-8');
$new->appendChild($new->createElement('matchdays'));
foreach ($xp->query('//table/tr/td[@colspan=5]/strong') as $gameDate) {
    $matchDay = $new->createElement('matchday');
    $matchDay->setAttribute('date', $gameDate->nodeValue);
    // Select the fixture rows whose nearest preceding date row is this date
    foreach ($xp->query(
        sprintf(
            '//tr[
                not(td[@colspan]) and
                preceding-sibling::tr[td[@colspan]][1]/td/strong/text() = "%s"
            ]',
            $gameDate->nodeValue
        )
    ) as $gameData) {
        $tds = $gameData->getElementsByTagName('td');
        $fixture = $matchDay->appendChild($new->createElement('fixture'));
        $fixture->appendChild($new->createElement(
            'hometeam', $tds->item(0)->nodeValue)
        );
        $fixture->appendChild($new->createElement(
            'awayteam', $tds->item(2)->nodeValue)
        );
        $fixture->appendChild($new->createElement(
            'kickoff', $tds->item(3)->nodeValue)
        );
    }
    $new->documentElement->appendChild($matchDay);
}
$new->formatOutput = true;
echo $new->saveXML();
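A possible variation on the DOM approach, offered as a sketch rather than a drop-in replacement: walk the rows once and remember the current matchday, instead of running one XPath query per date. This avoids embedding the date text in an XPath string, which would break on a date containing a double quote or on a date that appears twice. The inline sample and the variable names are assumptions for illustration.

```php
<?php
// Shortened, hypothetical sample of the table from the question.
$html = <<<'HTML'
<table>
<tr><td colspan="5"><strong>Saturday October 15 2011</strong></td></tr>
<tr><td>Team 1</td> <td>vs</td> <td>Team 7</td> <td>3:00 pm</td></tr>
<tr><td>Team 2</td> <td>vs</td> <td>Team 12</td> <td>3:00 pm</td></tr>
<tr><td colspan="5"><strong>Monday October 17 2011</strong></td></tr>
<tr><td>Team 6</td> <td>vs</td> <td>Team 9</td> <td>7:45 pm</td></tr>
</table>
HTML;

$doc = new DOMDocument;
$doc->loadHTML($html);
$xp = new DOMXPath($doc);

$out = new DOMDocument('1.0', 'utf-8');
$out->formatOutput = true;
$root = $out->appendChild($out->createElement('matchdays'));

$matchDay = null; // the <matchday> element that fixtures are currently added to

foreach ($xp->query('//table//tr') as $row) {
    $cells = $xp->query('td', $row);
    if ($cells->length === 0) {
        continue; // no data cells in this row
    }

    // A row whose first cell has a colspan attribute starts a new matchday.
    if ($cells->item(0)->hasAttribute('colspan')) {
        $matchDay = $root->appendChild($out->createElement('matchday'));
        $matchDay->setAttribute('date', trim($cells->item(0)->textContent));
        continue;
    }

    // Skip fixture rows that appear before any date or lack the expected cells.
    if ($matchDay === null || $cells->length < 4) {
        continue;
    }

    $fixture = $matchDay->appendChild($out->createElement('fixture'));
    // Map output element names to the column index they are read from.
    foreach (['hometeam' => 0, 'awayteam' => 2, 'kickoff' => 3] as $name => $index) {
        $element = $fixture->appendChild($out->createElement($name));
        // createTextNode escapes special characters such as & in team names.
        $element->appendChild(
            $out->createTextNode(trim($cells->item($index)->textContent))
        );
    }
}

$result = $out->saveXML();
echo $result;
```

Because each row is visited once, the work grows linearly with the number of rows, whereas the per-date query rescans the table for every date.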
Upvotes: 2