Reputation: 120
I have a question. How can I get the text between tags in HTML?
<ReviewsClientModel xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/Microsoft.OneStore.Site.Models">
<Items>
<ReviewsClientModel.ReviewClientModel>
<HelpfulNegative>0</HelpfulNegative>
<HelpfulPositive>2</HelpfulPositive>
<IsPublished>true</IsPublished>
<IsTakenDown>false</IsTakenDown>
<Rating xmlns:d4p1="http://schemas.datacontract.org/2004/07/Microsoft.OneStore.Site.Models.ViewModels">
<ReviewId>5ce5dc85-466d-a1cc-efe7-70bdd5183dfb</ReviewId>
<ReviewText>I downloaded this app it had someone in his eyes its kinda black so I don't know who it is.my cousin thinks its not scary but I get creeped out wen I saw him myself. Whoevers not scared then just wow just wow. FOR SAFETY DONT DOWNLOAD</ReviewText>
<SubmittedDateTime>2015-06-25T20:13:05.633</SubmittedDateTime>
<Title>FOR SAFETY DON"T PLAY</Title>
<UserId>985157380267961</UserId>
<UserName>natalie</UserName>
<ViolationsFound>false</ViolationsFound>
</ReviewsClientModel.ReviewClientModel>
For example, I would like to get "5ce5dc85-466d-a1cc-efe7-70bdd5183dfb". This is what I tried:
public function getXpath($str)
{
    \DB::connection('mongodb')->disableQueryLog();
    libxml_use_internal_errors(true);
    $str = str_replace("\0", '', $str);
    $dom = new \DomDocument();
    $dom->loadHTML('<?xml encoding="UTF-8">' . $str);
    return new \DomXPath($dom);
}

$xpath = $this->getXpath($str);
$tmpCommId = $xpath->query("//ReviewId");
$comm_id = trim($tmpCommId->item($j)->nodeValue);
I used cURL to download the web page and saved the response in $str.
Upvotes: 0
Views: 84
Reputation: 623
This answer assumes you want to use JavaScript.
You can parse HTML using the Pure JavaScript HTML Parser.
Check that blog post for documentation on the library. It might be a little outdated.
EDIT:
LarsH informed me that you wanted an XML scraper in PHP. I should have checked your sample code to see what language it was, but it would really help to state in the question which language you want the answer in.
As for the answer, while I'm not very familiar with PHP, DOM should be able to handle this pretty well.
In addition, here is an older SO answer that is a pretty good example of using DOM to parse HTML. It should be easy enough to adapt it to XML instead. Hope that helps.
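For what it's worth, here is a minimal sketch of the DOM approach applied to the XML in the question. It rests on a few assumptions: that the full response is well-formed XML (the sample pasted in the question is cut off, so the closing tags below are added by me), and that the elements live in the default namespace declared on the root element. The function name getReviewIds and the prefix m are made up for illustration. Two things in the original attempt likely get in the way: loadHTML() runs the HTML parser, which lowercases element names, so //ReviewId no longer matches; and once the document is parsed as XML, an unprefixed name in XPath does not match elements in a default namespace, so a prefix has to be registered.

```php
<?php
// Trimmed copy of the XML from the question, with closing tags added
// so that it is well-formed (the original paste is cut off).
$str = <<<'XML'
<ReviewsClientModel xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.datacontract.org/2004/07/Microsoft.OneStore.Site.Models">
<Items>
<ReviewsClientModel.ReviewClientModel>
<ReviewId>5ce5dc85-466d-a1cc-efe7-70bdd5183dfb</ReviewId>
<UserName>natalie</UserName>
</ReviewsClientModel.ReviewClientModel>
</Items>
</ReviewsClientModel>
XML;

/**
 * Hypothetical helper: returns the trimmed text of every <ReviewId> element.
 */
function getReviewIds(string $xml): array
{
    // Namespace URI copied from the xmlns attribute on the root element.
    $ns = 'http://schemas.datacontract.org/2004/07/Microsoft.OneStore.Site.Models';

    // Collect parse errors quietly instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $loaded = $dom->loadXML($xml); // XML parser keeps the case of element names
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        throw new RuntimeException('Input is not well-formed XML');
    }

    // Bind a prefix to the default namespace so XPath can address the elements.
    $xpath = new DOMXPath($dom);
    $xpath->registerNamespace('m', $ns);

    $ids = [];
    foreach ($xpath->query('//m:ReviewId') as $node) {
        $ids[] = trim($node->nodeValue);
    }
    return $ids;
}

// Print one review ID per line.
echo implode(PHP_EOL, getReviewIds($str)), PHP_EOL;
```

If you would rather not register a namespace, the query //*[local-name()="ReviewId"] should match the same elements regardless of namespace. If the downloaded response is truncated like the sample, loadXML() will fail and the sketch throws rather than returning partial results.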
Upvotes: 2