acctman

Reputation: 4349

Getting data from an external webpage

What's the best way to get content from an external website via PHP?

Using PHP, how do I fetch a web page (e.g. http://store.domain.com/1/) and scan the HTML for the data found between the span tags below (here the letters C and E)? Which PHP method should I use?

<span id="ctl00_ContentPlaceHolder1_phstats1_pname">C</span>
<span id="ctl00_ContentPlaceHolder1_phstats2_pname">E</span>

Then save "C" (the found string) to $pname1 and "E" to $pname2:

$_SESSION['pname1'] = $pname1;
$_SESSION['pname2'] = $pname2;

Upvotes: 2

Views: 12176

Answers (4)

Ghost-Man

Reputation: 2187

It can be done with cURL, but you can also just include the Simple HTML DOM Parser in your project. It's very easy to use and will serve your purpose.

The documentation is here: http://simplehtmldom.sourceforge.net/
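For comparison, here is a minimal sketch of the same DOM-based idea using PHP's built-in DOM extension (DOMDocument and DOMXPath) rather than Simple HTML DOM, so nothing extra has to be installed. The helper name span_text_by_id is made up for illustration, and the sample markup is copied from the question instead of being downloaded.

```php
<?php
// Hypothetical helper (not part of any library): returns the trimmed text of
// the element with the given id, or null when no such element exists.
function span_text_by_id(string $html, string $id): ?string
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML; keep parser warnings out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Assumption: the id contains only letters, digits and underscores (as in
    // the question), so it can be embedded in the XPath expression directly.
    $nodes = $xpath->query('//*[@id="' . $id . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    return trim($nodes->item(0)->textContent);
}

// Sample markup copied from the question.
$html = '<span id="ctl00_ContentPlaceHolder1_phstats1_pname">C</span>'
      . '<span id="ctl00_ContentPlaceHolder1_phstats2_pname">E</span>';

$pname1 = span_text_by_id($html, 'ctl00_ContentPlaceHolder1_phstats1_pname');
$pname2 = span_text_by_id($html, 'ctl00_ContentPlaceHolder1_phstats2_pname');
```

With the sample markup, $pname1 and $pname2 hold the two span texts and can then be copied into $_SESSION after session_start().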

Upvotes: 0

Alasdair

Reputation: 14103

The most efficient method is:

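$content = file_get_contents('http://www.domain.com/whatever.html');

$list = [];  // collected id values
$on = 0;     // number of values found
$pos = strpos($content, 'id="c');
while ($pos !== false)
 {
 // Skip past id=" so that $content starts at the id value.
 $content = substr($content, $pos + 4);
 // The value ends at the closing quote.
 $pos = strpos($content, '"');
 $list[$on] = substr($content, 0, $pos);
 $on++;
 $pos = strpos($content, 'id="c');
 }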

Then all your values will be in the $list array, the count of which is $on.

You could also do it in one line with one of the preg functions, but I like the old-school method, it's a nanosecond faster.
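The preg variant mentioned above might look roughly like this. It is only a sketch: the pattern assumes the spans appear exactly as in the question (an id of the form ctl00_ContentPlaceHolder1_phstats<N>_pname and no other attributes), and $content is filled with sample markup here rather than a downloaded page.

```php
<?php
// Sample markup copied from the question; in practice $content would be the fetched page.
$content = '<span id="ctl00_ContentPlaceHolder1_phstats1_pname">C</span>' . "\n"
         . '<span id="ctl00_ContentPlaceHolder1_phstats2_pname">E</span>';

// Capture the text of every span whose id matches ..._phstats<N>_pname.
preg_match_all(
    '~<span id="ctl00_ContentPlaceHolder1_phstats\d+_pname">(.*?)</span>~s',
    $content,
    $matches
);

$list = $matches[1];  // captured texts, in document order
```

Unlike the loop, this collects the text between the tags rather than the id values, which is closer to what the question asks for.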

Upvotes: 3

nbk

Reputation: 1992

You need to use a web scraping technique. It can be done simply with an HTML DOM library, or with technologies like Node.js and jQuery. You can find some useful tutorials regarding this here and here.

You may also see this thread regarding implementing scraping using PHP.

Upvotes: 4

Nacht

Reputation: 3494

I think you can actually use file_get_contents("http://store.domain.com/1/"); to do an HTTP request.

As for parsing it, depending on how big your project is and how much effort you're willing to put in, you can use an HTML DOM parser like the one at http://simplehtmldom.sourceforge.net/, or simply search for id="ctl00_ContentPlaceHolder1_phstats1_pname" and take the string apart piece by piece (not the recommended way of doing things).
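A rough sketch of the fetching half, assuming allow_url_fopen is enabled on the server. The helper name fetch_page and the 10-second default timeout are my own choices for illustration, not anything the site requires.

```php
<?php
// Hypothetical helper: fetch a URL and return its body, or null on failure.
function fetch_page(string $url, int $timeoutSeconds = 10): ?string
{
    // The "http" context options apply to http:// and https:// URLs.
    $context = stream_context_create([
        'http' => ['timeout' => $timeoutSeconds],
    ]);
    // file_get_contents() returns false on failure and emits a warning,
    // which is suppressed here because the false check handles the error.
    $body = @file_get_contents($url, false, $context);
    return $body === false ? null : $body;
}
```

You would call it as $html = fetch_page('http://store.domain.com/1/'); and check for null before handing $html to whichever parsing approach you pick.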

Upvotes: 0
