kenticny

Reputation: 525

HTTP request for the document head

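I need to get the title of a document.

So I send a request and parse the response HTML to extract the title.

Example (using the Node.js request module):

var request = require("request");
request.get("http://www.google.com", function(err, res, body) {
  // Escape the slash, and drop the "g" flag so the capture group is returned.
  var match = body.match(/<title>(.*?)<\/title>/);
  var title = match ? match[1] : null;
})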

But when the document is particularly large, the request is slow, because the whole body is downloaded before the callback runs.

Is there a way to get the document title quickly? Please suggest. Thanks.

Upvotes: 0

Views: 162

Answers (1)

laggingreflex

Reputation: 34677

Request can give you a stream of decompressed data as it is received: http://github.com/request/request#examples (2nd example)

You could keep appending the received data to a buffer and check whether it contains the desired content yet ("</title>"). As soon as it does, you can extract the title, abort the request, and ignore the rest of the stream.

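var request = require('request');
var buffer = '';
var finished = false;
var req = request({
        method: 'GET',
        uri: 'http://www.google.com',
        gzip: true
    }).on('data', function(data) {
        if (finished) return;
        // Append first, then check, so a title in the last chunk is not missed.
        buffer += data;
        if (buffer.indexOf('</title>') !== -1) done();
    });
function done() {
    finished = true;
    req.abort(); // stop downloading the rest of the document
    // Allow attributes on <title> and line breaks inside it.
    var match = buffer.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
    console.log(match ? match[1] : null);
}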
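
The buffering logic can also be pulled out of the request handler so it can be tried without any network access. The following is only a minimal sketch: `createTitleScanner` and its `maxChars` limit are hypothetical names, not part of the request module, and the 65536-character default is an assumed cap so that a page with no title does not grow the buffer forever.

```javascript
// Hypothetical helper: collects chunks until a complete <title> element is seen.
function createTitleScanner(maxChars) {
    var limit = maxChars || 65536; // assumed cap, adjust as needed
    var buffer = '';
    var title = null;
    var gaveUp = false;
    return {
        // Feed one chunk (string or Buffer); returns the title once found, else null.
        push: function (chunk) {
            if (title !== null || gaveUp) return title;
            buffer += chunk;
            var match = buffer.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
            if (match) {
                title = match[1].trim();
                buffer = '';
            } else if (buffer.length > limit) {
                // Too much data without a title: stop looking.
                gaveUp = true;
                buffer = '';
            }
            return title;
        },
        // True when the caller can stop reading (title found or limit exceeded).
        isDone: function () {
            return title !== null || gaveUp;
        }
    };
}

// The title tag may be split across two chunks.
var scanner = createTitleScanner();
console.log(scanner.push('<html><head><ti'));     // null
console.log(scanner.push('tle>Example</title>')); // Example
```

Inside the 'data' handler you would call scanner.push(data) and abort the request once scanner.isDone() returns true. Note that appending Buffers chunk by chunk may garble a multi-byte character that straddles a chunk boundary, so this is best treated as a starting point.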

Upvotes: 1
