Hunain Usman

Reputation: 2228

Reduce response time for scraping data using simpleHTMLDOM

So basically, I was using 42matter's Google market API to retrieve application information for my website. After I found that the free version only allows 500 requests/day and is for non-commercial use only, I had to develop my own API. Like any developer, I did some R&D, found out about scraping and the simpleHTML DOM parser, used it, and got my requirement done.

But now I have a major problem. 42matter's API was super fast: I called the API and got the info into my DOM quickly (in 2 sec). My API is slow: it processes the same request in 8 to 10 seconds, which is visibly slow on the page and not attractive.

I tried to remove the overhead and fetch only the part I need, but it still takes a lot of time.

The code is as follows:

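include('../common/simple_html_dom.php');

// Package name comes from the request; default to an empty string if missing.
$appPackageName = isset($_REQUEST['appPackageName']) ? $_REQUEST['appPackageName'] : '';

header('Content-Type: application/json');

// Download and parse the full Play Store page; URL-encode the user-supplied id.
$html = file_get_html('https://play.google.com/store/apps/details?id='.urlencode($appPackageName));

// Stays null when the page fails to load or the element is not found.
$description = null;

if ($html) {
    foreach($html->find('div.id-app-orig-desc') as $e){
        $description = $e->innertext;
    }
}

$appInfo['description'] = $description;

echo json_encode($appInfo);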

Please, if anyone knows a solution, tell me as soon as possible.

Upvotes: 1

Views: 136

Answers (1)

Quicker

Reputation: 1246

A generic HTML parser must process the full HTML code. I would not call a 2-second response time fast, either. If you are only looking for tiny extracts of information from a given HTML document, just use the good old strpos and substr. This requires you to find some unique markers in the fetched HTML and then implement a processing loop in your PHP. In practice, static offsets or 2- to 3-level marker recursion sometimes do a good job.
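As a rough illustration of the marker approach, here is a minimal sketch. The helper name extractBetween is hypothetical, and the markers assume the description sits in a div with class id-app-orig-desc (taken from the question) whose opening tag carries no other attributes; the real page markup may differ. The inline sample string stands in for the page you would normally fetch as a raw string, for example with file_get_contents.

```php
<?php
/**
 * Hypothetical helper: return the text between the first $startMarker
 * and the next $endMarker, or null when either marker is missing.
 */
function extractBetween(string $html, string $startMarker, string $endMarker): ?string
{
    $start = strpos($html, $startMarker);
    if ($start === false) {
        return null;
    }
    // Skip past the start marker itself.
    $start += strlen($startMarker);

    // Search for the end marker only after the start marker.
    $end = strpos($html, $endMarker, $start);
    if ($end === false) {
        return null;
    }
    return substr($html, $start, $end - $start);
}

// Stand-in for the downloaded page (assumed markup, not the real Play Store HTML).
$html = '<html><body><div class="id-app-orig-desc">A sample description.</div></body></html>';

$description = extractBetween($html, '<div class="id-app-orig-desc">', '</div>');

echo json_encode(['description' => $description]);
```

Note that this stops at the first closing div after the start marker, so it would cut the text short if the description itself contained nested div elements. That is where the multi-level marker handling mentioned above would come in.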

Upvotes: 1
