Reputation: 31
Hello, I have a script that fetches HTML data from a website.
The website is built like this:
<table class="table table-hover">
<tr>
<td><b>Cover</b></td>
<td><b>Platz</b></td>
<td><b>Titel</b></td>
<td><b>Videolink</b></td>
</tr>
<tr>
<td><a href="http://www.youtube.com" target="_blank"><img src="youtube.jpg" /></a></td>
<td>1</td>
<td><a href="http://www.youtube.com" target="_blank">name</a></td>
<td><input type="text" onclick="this.select()" id="1" size="45" name="1" value="http://www.youtube.com" /></td>
</tr><tr>
<td><a href="http://www.youtube.com2" target="_blank"><img src="youtube.jpg2" /></a></td>
<td>1</td>
<td><a href="http://www.youtube.com2" target="_blank">name2</a></td>
<td><input type="text" onclick="this.select()" id="2" size="45" name="2" value="http://www.youtube.com2" /></td>
</tr></table>
PHP
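<?php
include 'core/functions/dom.php';
include 'core/init.php';
$url = "http://MYWEBSITE";
$html = file_get_html($url);
$theData = array();
// Collect the inner HTML of every cell, row by row
foreach ($html->find('table tr') as $row) {
    $rowData = array();
    foreach ($row->find('td') as $cell) {
        $rowData[] = $cell->innertext;
    }
    $theData[] = $rowData;
}
// Index 0 is the header row, so index 2 is the second data row
$list = $theData[2];
// Index 3 is the fourth cell (Videolink); index 2 would be the Titel cell
$name = $list[3];
echo $name;
?>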
The data is now stored in a variable, but when I echo it, I get a link:
<a href="http://www.youtube.com2" target="_blank">name2</a>
(You can see this when you view the page source.)
I just need "name2" as plain text so that I can store it in my database.
Another problem is that the last cell echoes out a text field, where I also need only the text:
<input type="text" onclick="this.select()" id="2" size="45" name="2" value="http://www.youtube.com2" />
Here I need the value of the input as text for my database.
Upvotes: 2
Views: 312
Reputation: 2950
You can achieve this with the built-in DOMDocument class. After loading the HTML into a DOMDocument object, call getElementsByTagName('a') to collect the <a> tags. For each one, getAttribute('href') returns the URL and nodeValue returns the text content without the surrounding tags. Anchors that wrap only an image have an empty nodeValue.
Code:
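<?php
// $html should hold the page markup as a string,
// e.g. fetched with file_get_contents($url)
$dom = new DOMDocument;
$dom->loadHTML($html);
$result = $dom->getElementsByTagName('a');
foreach ($result as $v) {
    // href is the URL; nodeValue is the link text (empty for image-only links)
    echo $v->getAttribute('href') . ' ' . $v->nodeValue;
    echo '<br>';
}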
Output:
http://www.youtube.com
http://www.youtube.com name
http://www.youtube.com2
http://www.youtube.com2 name2
See: http://php.net/manual/en/domdocument.getelementsbytagname.php
Edit:
I've updated the code so that it outputs the URLs and the text values (if any) of the <a> tags.
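The question also asks for the value of the <input> field, which the code above does not cover. Below is a minimal sketch of one way to read both the title text and the input value for each row. It assumes the table layout shown in the question (title link in the third cell, input in the fourth). The sample markup is copied from the question, and the $rows array and its keys are names chosen here purely for illustration.

```php
<?php
// Sample markup from the question; in practice this string would be the
// fetched page (for example the result of file_get_contents($url)).
$html = <<<'HTML'
<table class="table table-hover">
<tr>
<td><b>Cover</b></td>
<td><b>Platz</b></td>
<td><b>Titel</b></td>
<td><b>Videolink</b></td>
</tr>
<tr>
<td><a href="http://www.youtube.com" target="_blank"><img src="youtube.jpg" /></a></td>
<td>1</td>
<td><a href="http://www.youtube.com" target="_blank">name</a></td>
<td><input type="text" onclick="this.select()" id="1" size="45" name="1" value="http://www.youtube.com" /></td>
</tr><tr>
<td><a href="http://www.youtube.com2" target="_blank"><img src="youtube.jpg2" /></a></td>
<td>1</td>
<td><a href="http://www.youtube.com2" target="_blank">name2</a></td>
<td><input type="text" onclick="this.select()" id="2" size="45" name="2" value="http://www.youtube.com2" /></td>
</tr></table>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$rows = [];

foreach ($xpath->query('//table//tr') as $tr) {
    $cells = $xpath->query('./td', $tr);
    if ($cells->length < 4) {
        continue; // not a full data row
    }
    // Look for the <input> inside the fourth cell; the header row has none.
    $input = $xpath->query('.//input', $cells->item(3))->item(0);
    if ($input === null) {
        continue;
    }
    $rows[] = [
        // textContent drops the <a> tag and keeps only its text
        'title' => trim($cells->item(2)->textContent),
        // the value attribute holds the text shown in the input field
        'link'  => $input->getAttribute('value'),
    ];
}

foreach ($rows as $row) {
    echo $row['title'] . ' ' . $row['link'] . PHP_EOL;
}
```

Each entry in $rows then holds plain strings, which can be passed to whatever database code you already use.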
Upvotes: 1