Reputation: 595
Thanks for taking a second to look at this. I'm using a PHP script to get the source code of a page from a URL, and then I am attempting to parse it and display a certain part of the text. The problem appears to be that when I get the source for the link (with:
$data = file_get_contents($link);
) the variable $data stores it as HTML and not as just a string. I'm pretty new to PHP so I'm not 100% sure that's the case, but I do know that if I try to display $data in any way, it displays not as plain text but as HTML with HTML formatting.
Ordinarily this wouldn't be an issue but I am trying to get the value of something inside an HTML tag, like this:
$search = strpos($data, $searchterm);
and because it is either stored as HTML instead of as plain text, or is treated that way, strpos() appears to search only through the text that would be displayed if I loaded the page.
To be more specific, for my file (YouTube data about my account) it seems to look only at what would display if the file were loaded as HTML, which is pure nonsense.
Here is the source that I want it to search through (I have replaced my account name with 'MyAccount' for privacy):
<entry gd:etag="W/&quot;A0MFR347eCp7I2A9WhNQEU4.&quot;" xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" xmlns:gd="http://schemas.google.com/g/2005" xmlns:yt="http://gdata.youtube.com/schemas/2007">
<id>tag:youtube.com,2008:user:A1RDBCYeYWY9dydB9MmPlg</id>
<published>2007-01-23T15:39:30.000Z</published>
<updated>2012-11-17T08:03:36.000Z</updated>
<category scheme="http://schemas.google.com/g/2005#kind" term="http://gdata.youtube.com/schemas/2007#userProfile"/>
<title>MyAccount</title>
<summary/>
<link rel="alternate" type="text/html" href="http://www.youtube.com/channel/UCA1RDBCYeYWY9dydB9MmPlg"/>
<link rel="self" type="application/atom+xml" href="http://gdata.youtube.com/feeds/api/users/A1RDBCYeYWY9dydB9MmPlg?v=2"/>
<author>
<name>MyAccount</name>
<uri>http://gdata.youtube.com/feeds/api/users/MyAccount</uri>
<yt:userId>A1RDBCYeYWY9dydB9MmPlg</yt:userId>
</author>
<yt:channelId>UCA1RDBCYeYWY9dydB9MmPlg</yt:channelId>
<gd:feedLink rel="http://gdata.youtube.com/schemas/2007#user.liveevent" href="http://gdata.youtube.com/feeds/api/users/MyAccount/live/events?v=2" countHint="0"/>
<gd:feedLink rel="http://gdata.youtube.com/schemas/2007#user.favorites" href="http://gdata.youtube.com/feeds/api/users/MyAccount/favorites?v=2" countHint="0"/>
<gd:feedLink rel="http://gdata.youtube.com/schemas/2007#user.contacts" href="http://gdata.youtube.com/feeds/api/users/MyAccount/contacts?v=2" countHint="71"/>
<gd:feedLink rel="http://gdata.youtube.com/schemas/2007#user.inbox" href="http://gdata.youtube.com/feeds/api/users/MyAccount/inbox?v=2"/>
<gd:feedLink rel="http://gdata.youtube.com/schemas/2007#user.playlists" href="http://gdata.youtube.com/feeds/api/users/MyAccount/playlists?v=2"/>
<gd:feedLink rel="http://gdata.youtube.com/schemas/2007#user.subscriptions" href="http://gdata.youtube.com/feeds/api/users/MyAccount/subscriptions?v=2" countHint="54"/>
<gd:feedLink rel="http://gdata.youtube.com/schemas/2007#user.uploads" href="http://gdata.youtube.com/feeds/api/users/MyAccount/uploads?v=2" countHint="41"/>
<gd:feedLink rel="http://gdata.youtube.com/schemas/2007#user.newsubscriptionvideos" href="http://gdata.youtube.com/feeds/api/users/MyAccount/newsubscriptionvideos?v=2"/>
<yt:location>US</yt:location>
<yt:maxUploadDuration seconds="43200"/>
<yt:statistics lastWebAccess="2012-07-08T15:58:07.000Z" subscriberCount="126" videoWatchCount="0" viewCount="3385" totalUploadViews="50179"/>
<media:thumbnail url="http://i2.ytimg.com/i/A1RDBCYeYWY9dydB9MmPlg/1.jpg?v=934f35"/>
<yt:userId>A1RDBCYeYWY9dydB9MmPlg</yt:userId>
<yt:username display="MyAccount">MyAccount</yt:username>
</entry>
And here is what it searches through/has access to:
tag:youtube.com,2008:user:A1RDBCYeYWY9dydB9MmPlg2007-01-23T15:39:30.000Z2012-11-17T08:03:36.000Z
MyAccounthttp://gdata.youtube.com/feeds/api/users/MyAccountA1RDBCYeYWY9dydB9MmPlgUCA1RDBCYeYWY9dydB9MmPlgUSA1RDBCYeYWY9dydB9MmPlgMyAccount
Any and all help is greatly appreciated!!
Upvotes: 1
Views: 290
Reputation: 1076
Try this,
$data = file_get_contents($link);
$searchterm = ''; //as necessary
$data = strtr($data,Array("<"=>"&lt;","&"=>"&amp;"));
$searchterm = strtr($searchterm,Array("<"=>"&lt;","&"=>"&amp;"));
$search = strpos($data, $searchterm);
The two middle lines make the HTML readable for PHP to process.
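For what it's worth, file_get_contents() returns an ordinary string, so strpos() can see the tags and attributes in it; it is the browser that hides them when the string is echoed. Here is a minimal sketch of that, assuming a trimmed, hard-coded copy of the feed from the question stands in for file_get_contents($link), and that the value wanted is the subscriberCount attribute (both are assumptions for illustration only):

```php
<?php
// Stand-in for file_get_contents($link): a trimmed copy of the feed in the question.
$data = '<entry xmlns="http://www.w3.org/2005/Atom" xmlns:yt="http://gdata.youtube.com/schemas/2007">'
      . '<title>MyAccount</title>'
      . '<yt:statistics subscriberCount="126" viewCount="3385"/>'
      . '</entry>';

// strpos() searches the raw string, markup included.
$searchterm = 'subscriberCount="';
$subscribers = null;
$start = strpos($data, $searchterm);
if ($start !== false) {
    $start += strlen($searchterm);        // move past the attribute name and opening quote
    $end = strpos($data, '"', $start);    // find the closing quote
    if ($end !== false) {
        $subscribers = substr($data, $start, $end - $start);
    }
}

// Escape only when printing, so the browser shows the markup instead of rendering it.
$visible = htmlspecialchars($data);

echo $subscribers, "\n";
echo $visible, "\n";
```

Escaping with htmlspecialchars() is kept for output only; the search itself runs on the unescaped $data, so $searchterm does not need to be escaped either.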
Upvotes: 0