Boe Jingles

Reputation: 595

PHP Parsing String of HTML

Thanks for taking a second to look at this. I'm using a PHP script to get the source code of a page from a URL, and then I am attempting to parse it and display a certain part of the text. The problem appears to be that when I get the source for the link (with:

$data = file_get_contents($link);

) the variable $data stores it as HTML and not as just a string. I'm pretty new to PHP so I'm not 100% sure if that's the case, but I do know that if I try to display $data in any way, it displays not as plain text but as rendered HTML.
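One way to check this, as a minimal sketch: the hard-coded $data below is a stand-in (an assumption, not the real feed) for what file_get_contents($link) returns, so the snippet runs without a network call. is_string() reports the type, and htmlspecialchars() escapes the markup so that a browser would show the tags as text rather than render them.

```php
<?php
// Stand-in for file_get_contents($link); the real call also returns a string (or false on failure).
$data = '<entry><title>MyAccount</title><summary/></entry>';

// The value is an ordinary string, tags included.
$isString = is_string($data);
var_dump($isString);

// Escape <, > and & (plus quotes) so a browser displays the markup literally.
$escaped = htmlspecialchars($data);
echo $escaped, "\n";
```

If the escaped version shows the full markup in the browser, the tags were in the string all along and only the display was hiding them.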

Ordinarily this wouldn't be an issue but I am trying to get the value of something inside an HTML tag, like this:

$search = strpos($data, $searchterm);

and because it is either stored as HTML instead of as plain text or it is treated that way, strpos() will only search through the text that would be displayed if I loaded the page.

To be more specific, in my file (YouTube data about my account) it would only look at what would display if it were to be loaded as HTML, which is pure nonsense.
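For what it is worth, strpos() operates on the raw string, tags included. Here is a minimal sketch of pulling an attribute value out of such a string with strpos() and substr(). The $data literal is a shortened stand-in for the fetched feed, and subscriberCount is picked only as an example target; both are assumptions for illustration.

```php
<?php
// Shortened stand-in for the fetched feed (assumption; the real string comes from file_get_contents()).
$data = '<yt:statistics subscriberCount="126" videoWatchCount="0" viewCount="3385"/>';

$searchterm = 'subscriberCount="';
$start = strpos($data, $searchterm);

$value = null;
// strpos() returns false when nothing matches; 0 is a valid position, so compare strictly.
if ($start !== false) {
    // Move past the search term to the first character of the attribute value.
    $start += strlen($searchterm);
    // Find the closing quote of the attribute.
    $end = strpos($data, '"', $start);
    if ($end !== false) {
        $value = substr($data, $start, $end - $start);
    }
}

var_dump($value); // string(3) "126"
```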

Here is the source that I want it to search through (I have replaced my account name with 'MyAccount' for privacy):

<entry gd:etag='W/"A0MFR347eCp7I2A9WhNQEU4."' xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" xmlns:gd="http://schemas.google.com/g/2005" xmlns:yt="http://gdata.youtube.com/schemas/2007">
<id>tag:youtube.com,2008:user:A1RDBCYeYWY9dydB9MmPlg</id>
<published>2007-01-23T15:39:30.000Z</published>
<updated>2012-11-17T08:03:36.000Z</updated>
<category scheme="http://schemas.google.com/g/2005#kind" term="http://gdata.youtube.com/schemas/2007#userProfile"/>
<title>MyAccount</title>
<summary/>
<link rel="alternate" type="text/html" href="http://www.youtube.com/channel/UCA1RDBCYeYWY9dydB9MmPlg"/>
<link rel="self" type="application/atom+xml" href="http://gdata.youtube.com/feeds/api/users/A1RDBCYeYWY9dydB9MmPlg?v=2"/>
<author>
<name>MyAccount</name>
<uri>http://gdata.youtube.com/feeds/api/users/MyAccount</uri>
<yt:userId>A1RDBCYeYWY9dydB9MmPlg</yt:userId>
</author>
<yt:channelId>UCA1RDBCYeYWY9dydB9MmPlg</yt:channelId>
<gd:feedLink rel="http://gdata.youtube.com/schemas/2007#user.liveevent" href="http://gdata.youtube.com/feeds/api/users/MyAccount/live/events?v=2" countHint="0"/>
<gd:feedLink rel="http://gdata.youtube.com/schemas/2007#user.favorites" href="http://gdata.youtube.com/feeds/api/users/MyAccount/favorites?v=2" countHint="0"/>
<gd:feedLink rel="http://gdata.youtube.com/schemas/2007#user.contacts" href="http://gdata.youtube.com/feeds/api/users/MyAccount/contacts?v=2" countHint="71"/>
<gd:feedLink rel="http://gdata.youtube.com/schemas/2007#user.inbox" href="http://gdata.youtube.com/feeds/api/users/MyAccount/inbox?v=2"/>
<gd:feedLink rel="http://gdata.youtube.com/schemas/2007#user.playlists" href="http://gdata.youtube.com/feeds/api/users/MyAccount/playlists?v=2"/>
<gd:feedLink rel="http://gdata.youtube.com/schemas/2007#user.subscriptions" href="http://gdata.youtube.com/feeds/api/users/MyAccount/subscriptions?v=2" countHint="54"/>
<gd:feedLink rel="http://gdata.youtube.com/schemas/2007#user.uploads" href="http://gdata.youtube.com/feeds/api/users/MyAccount/uploads?v=2" countHint="41"/>
<gd:feedLink rel="http://gdata.youtube.com/schemas/2007#user.newsubscriptionvideos" href="http://gdata.youtube.com/feeds/api/users/MyAccount/newsubscriptionvideos?v=2"/>
<yt:location>US</yt:location>
<yt:maxUploadDuration seconds="43200"/>
<yt:statistics lastWebAccess="2012-07-08T15:58:07.000Z" subscriberCount="126" videoWatchCount="0" viewCount="3385" totalUploadViews="50179"/>
<media:thumbnail url="http://i2.ytimg.com/i/A1RDBCYeYWY9dydB9MmPlg/1.jpg?v=934f35"/>
<yt:userId>A1RDBCYeYWY9dydB9MmPlg</yt:userId>
<yt:username display="MyAccount">MyAccount</yt:username>
</entry>

And here is what it searches through/has access to:

tag:youtube.com,2008:user:A1RDBCYeYWY9dydB9MmPlg2007-01-23T15:39:30.000Z2012-11-17T08:03:36.000Z
MyAccounthttp://gdata.youtube.com/feeds/api/users/MyAccountA1RDBCYeYWY9dydB9MmPlgUCA1RDBCYeYWY9dydB9MmPlgUSA1RDBCYeYWY9dydB9MmPlgMyAccount

Any and all help is greatly appreciated!!

Upvotes: 1

Views: 290

Answers (1)

5hahiL

Reputation: 1076

Try this,

    $data = file_get_contents($link);
    $searchterm = ''; // set as necessary

    // Escape "<" and "&" in both strings so the markup shows up as text when echoed
    $data = strtr($data, array("<" => "&lt;", "&" => "&amp;"));
    $searchterm = strtr($searchterm, array("<" => "&lt;", "&" => "&amp;"));

    $search = strpos($data, $searchterm);

The middle lines escape "<" and "&" in both strings, so the markup is displayed as text instead of being rendered as HTML.
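An alternative to string searching, and a different technique from the snippet above, is to parse the feed as XML with PHP's DOM extension and read the attribute directly. This is a minimal sketch, assuming the DOM extension is enabled; the $data literal is a trimmed stand-in for the fetched feed, and subscriberCount is just an example target.

```php
<?php
// Trimmed stand-in for the fetched feed (assumption; normally this comes from file_get_contents($link)).
$data = <<<'XML'
<entry xmlns="http://www.w3.org/2005/Atom" xmlns:yt="http://gdata.youtube.com/schemas/2007">
<title>MyAccount</title>
<yt:statistics subscriberCount="126" viewCount="3385" totalUploadViews="50179"/>
</entry>
XML;

$doc = new DOMDocument();
$subscribers = null;

if ($doc->loadXML($data)) {
    // Look the element up by namespace URI and local name, independent of the "yt" prefix.
    $nodes = $doc->getElementsByTagNameNS('http://gdata.youtube.com/schemas/2007', 'statistics');
    if ($nodes->length > 0) {
        $subscribers = $nodes->item(0)->getAttribute('subscriberCount');
    }
}

var_dump($subscribers); // string(3) "126"
```

Parsing avoids depending on the exact attribute order or quoting in the raw text, at the cost of requiring well-formed XML.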

Upvotes: 0
