Reputation: 4282
This is my code:
function myFunction() {
  var test = DocumentApp.openById('someid');
  test.clear();
  var html = UrlFetchApp.fetch('https://www.crunchbase.com/organization/google').getContentText();
  test.appendParagraph(html);
}
Request failed for https://www.crunchbase.com/organization/google returned code 416. Truncated server
How can I fix this? The script works when I set the URL to www.google.com, but it fails when I set it to https://www.crunchbase.com/organization/google.
Upvotes: 0
Views: 1919
Reputation: 3337
That's because crunchbase.com does not allow robots to crawl their site.
To keep the script from throwing, add the muteHttpExceptions parameter
to your UrlFetchApp request, so that the error response is returned instead of raising an exception:
// With muteHttpExceptions set, fetch() returns the response even on an HTTP error code.
var params = {muteHttpExceptions: true};
var response = UrlFetchApp.fetch('https://www.crunchbase.com/organization/google', params);
var html = response.getContentText();
test.appendParagraph(html);
Then you'll be able to see the response:
Pardon Our Interruption
Pardon Our Interruption...
As you were browsing http://www.crunchbase.com something about your browser made us think you were a bot. There are a few reasons this might happen:
- You're a power user moving through this website with super-human speed.
- You've disabled JavaScript in your web browser.
- A third-party browser plugin, such as Ghostery or NoScript, is preventing JavaScript from running. Additional information is available in this support article (http://ds.tl/help-third-party-plugins).
To request an unblock, please fill out the form below and we will review it as soon as possible.
You reached this page when attempting to access http://www.crunchbase.com/organization/google from 107.178.192.142 on 2016-08-31 07:38:18 GMT.
Trace: E2A843FA-6F4D-11E6-B2D7-9FC6DA1DE14E via c17ee8fd-4346-4832-a021-e5f8124f2861
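Note that muting the exception does not unblock the request: the body you get back is the block page shown above, not the organization page. A minimal sketch of how the script could check the status code before writing anything to the document follows. The helper name `fetchPage` and its optional `fetcher` argument are assumptions made for this example (the argument exists only so the function can be exercised outside Apps Script); `getResponseCode()` and `getContentText()` are the standard HTTPResponse methods.

```javascript
// Hypothetical helper: fetch a URL without throwing on HTTP errors and
// report whether the server answered with 200.
function fetchPage(url, fetcher) {
  // In Apps Script, omit `fetcher` to use the built-in UrlFetchApp service.
  fetcher = fetcher || UrlFetchApp;
  var response = fetcher.fetch(url, { muteHttpExceptions: true });
  var code = response.getResponseCode();
  return {
    ok: code === 200,                 // false for the 416 seen in the question
    code: code,                       // raw HTTP status code
    body: response.getContentText()   // block page or real content
  };
}
```

In the original function you could then call `var result = fetchPage('https://www.crunchbase.com/organization/google');` and only run `test.appendParagraph(result.body)` when `result.ok` is true, for instance appending a short message that includes `result.code` otherwise.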
Upvotes: 2