Reputation: 185
My group recently migrated from classic Google Sites to New Google Sites. Many things I used to do with classic Sites and/or Apps Script's Sites Service are no longer possible, since New Sites and Apps Script aren't integrated.
I would like to scrape content from our internal Sites pages using UrlFetchApp. However, running the code below (as a user with access to the page I'd like to scrape) returns the Google sign-in page, not the page's content.
Is it possible to scrape the group's internal Google Site using UrlFetchApp?
function myFunction() {
  var txt = UrlFetchApp.fetch("https://sites.google.com/a/domain.com/home").getContentText();
  Logger.log(txt);
}
This returns:
Logging output too large. Truncating output.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta content="width=300, initial-scale=1" name="viewport">
<meta name="google-site-verification" content="LrdTUW9psUAMbh4Ia074-BPEVmcpBxF6Gwf0MSgQXZs">
<title>Sign in - Google Accounts</title>
<style>
@font-face {
font-family: 'Open Sans';
font-style: normal;
font-weight: 300;
src: url(//fonts.gstatic.com/s/opensans/v15/mem5YaGs126MiZpBA-UN_r8OUuhs.ttf) format('truetype');
}
Upvotes: 0
Views: 488
Reputation: 7597
I appreciate your question is quite old now, but just to confirm: no, you can't scrape Google Sites using Apps Script. It's infuriating that Google still hasn't implemented an API for Sites.
In your example code, you're not passing in any authorisation, so the UrlFetchApp call is effectively anonymous and would not be expected to reach a restricted page in any case. However, even if you pass in a valid authorisation header, it won't work. For example:
let url = "https://sites.google.com/YOUR_SITE";
let opts = {
  method: "get",
  headers: { Authorization: `Bearer ${ScriptApp.getOAuthToken()}` }
};
let resp = UrlFetchApp.fetch(url, opts);
console.log(resp.getContentText());
It's disappointing.
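If you want your script to at least notice when it has been handed the sign-in page rather than the site content, something like the sketch below might help. The helper name `isSignInPage` and the function `checkSiteAccess` are made up for illustration, and the title string is simply the one shown in the output in the question; Google could change it at any time, so treat this as a rough heuristic rather than a reliable check.

```javascript
// Hypothetical helper: returns true when the HTML looks like the Google sign-in page.
// The title text is taken from the output in the question and is an assumption.
function isSignInPage(html) {
  var match = /<title>([^<]*)<\/title>/i.exec(html);
  return match !== null && match[1].indexOf("Sign in - Google Accounts") !== -1;
}

// Apps Script entry point (hypothetical name). UrlFetchApp and ScriptApp
// only exist inside the Apps Script runtime.
function checkSiteAccess() {
  var url = "https://sites.google.com/YOUR_SITE"; // placeholder, as in the answer above
  var resp = UrlFetchApp.fetch(url, {
    method: "get",
    headers: { Authorization: "Bearer " + ScriptApp.getOAuthToken() },
    muteHttpExceptions: true // return the response instead of throwing on HTTP errors
  });
  if (isSignInPage(resp.getContentText())) {
    console.log("Got the sign-in page, not the site content.");
  } else {
    console.log("HTTP " + resp.getResponseCode());
  }
}
```

This doesn't get you past the sign-in page; it only lets the script report that case instead of logging the raw HTML.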
Upvotes: 0