itchyny

Reputation: 824

Can we download a webpage completely with chrome.downloads.download? (Google Chrome Extension)

I want to save a webpage completely from my Google Chrome extension. I added the "downloads" and "<all_urls>" permissions and confirmed that the following code saves the Google page to google.html.

chrome.downloads.download({
    url: "http://www.google.com",
    filename: "google.html"
}, function (x) { console.log(x); });

However, this code only saves the HTML file. Stylesheets, scripts and images are not saved. I want to save the webpage completely, as if I had saved the page from the save dialog with Format: Webpage, Complete selected.

I looked through the documentation but couldn't find a way.

So my question is: how can I download a webpage completely from an extension using the API(s) of Google Chrome?

Upvotes: 7

Views: 10633

Answers (2)

Rob W

Reputation: 348992

The downloads API downloads a single resource only. If you want to save a complete web page, first open the page in a tab, then export it as MHTML using chrome.pageCapture.saveAsMHTML, create a blob: URL for the exported Blob using URL.createObjectURL, and finally save that URL using the chrome.downloads.download API.

The pageCapture API requires a valid tabId. For instance:

// Create new tab, wait until it is loaded and save the page
chrome.tabs.create({
    url: 'http://example.com'
}, function(tab) {
    chrome.tabs.onUpdated.addListener(function func(tabId, changeInfo) {
        if (tabId == tab.id && changeInfo.status == 'complete') {
            chrome.tabs.onUpdated.removeListener(func);
            savePage(tabId);
        }
    });
});

function savePage(tabId) {
    chrome.pageCapture.saveAsMHTML({
        tabId: tabId
    }, function(blob) {
        var url = URL.createObjectURL(blob);
        // Optional: chrome.tabs.remove(tabId); // to close the tab
        chrome.downloads.download({
            url: url,
            filename: 'whatever.mhtml'
        });
    });
}

To try it out, put the code above in background.js, add the permissions to manifest.json (as shown below), and reload the extension. example.com will then be opened in a new tab, and the web page will be saved as a self-contained MHTML file.

{
    "name": "Save full web page",
    "version": "1",
    "manifest_version": 2,
    "background": {
        "scripts": ["background.js"]
    },
    "permissions": [
        "pageCapture",
        "downloads"
    ]
}
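
The savePage example above hard-codes the file name and ignores failures. As a rough sketch only (not part of the original answer), the variant below derives a file name from the page URL and checks chrome.runtime.lastError in both callbacks. The names mhtmlFilenameFromUrl, savePageChecked, the `api` parameter (assumed to be the global `chrome` object) and the `onDone(error, downloadId)` callback are all hypothetical helpers introduced for illustration.

```javascript
// Hypothetical helper (not part of the chrome.* API): derive a file name
// such as "example.com_docs_intro.mhtml" from the URL of the saved page.
function mhtmlFilenameFromUrl(pageUrl) {
    var parsed = new URL(pageUrl);
    var base = (parsed.hostname + parsed.pathname)
        .replace(/[^A-Za-z0-9.-]+/g, '_')   // replace characters that are unsafe in file names
        .replace(/^[_.]+|[_.]+$/g, '');     // trim leading/trailing "_" and "."
    return (base || 'page') + '.mhtml';
}

// Variant of savePage that reports failures. `api` is assumed to be the
// global `chrome` object; passing it in is only a convenience for testing.
// `onDone(error, downloadId)` is a hypothetical Node-style callback.
function savePageChecked(api, tabId, pageUrl, onDone) {
    api.pageCapture.saveAsMHTML({ tabId: tabId }, function(blob) {
        // When the capture fails, chrome.runtime.lastError is set inside the callback.
        if (api.runtime.lastError || !blob) {
            var reason = api.runtime.lastError
                ? api.runtime.lastError.message
                : 'no data returned';
            onDone(new Error('saveAsMHTML failed: ' + reason));
            return;
        }
        api.downloads.download({
            url: URL.createObjectURL(blob),
            filename: mhtmlFilenameFromUrl(pageUrl)
        }, function(downloadId) {
            if (api.runtime.lastError) {
                onDone(new Error('download failed: ' + api.runtime.lastError.message));
            } else {
                onDone(null, downloadId);
            }
        });
    });
}
```

In background.js this could be called as savePageChecked(chrome, tabId, 'http://example.com', function(err) { if (err) console.error(err); }); the page URL is passed in explicitly because it is already known at the point where the tab is created.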

Upvotes: 10

Claudiu Creanga

Reputation: 8366

No, it does not download all of the page's files for you (images, JS, CSS, etc.). You should use a tool like HTTrack instead.

Upvotes: -1
