jay

Reputation: 41

Issue downloading a complete website for offline use with HTTrack

I downloaded sonst.cc with HTTrack, but when viewing it offline there’s no content. Every single tab is empty. Why is that?

Is there any other app with which I could download the whole thing?

I’m losing my mind over here.

Thanks.

Edit:

When I open the index file downloaded with HTTrack in Safari, the front page loads just fine: the background image, the menus, everything. But when I click on any of the menus, the tab opens empty, with no content at all. HTTrack seems to have downloaded the whole site (HTML, CSS, JS, images), and when I look at the code everything seems fine. It's all there!


index.html

<html>
  <head>
    <title>SONST</title>
    <meta http-equiv="content-type"     content="text/html;charset=UTF-8" />
    <meta name="title"          content="SONST" />
    <meta name="doc-type"           content="Web Page" />
    <meta name="Content-Language"       content="en" />
    <meta name="author"         content="Brill Webdesign, Eindhoven" />
    <meta name="web_author"         content="Brill Webdesign, Eindhoven" />
    <meta name="production"         content="Brill Webdesign - http://www.brill-webdesign.nl" />
    <meta name="copyright"          content="2015, Brill Webdesign" />
    <meta name="keywords"           content="" />
    <meta name="description"        content="" />
    <meta name="classification"     content="Business and Economy" />
    <meta name="Rating"         content="General" />
    <meta name="revisit-after"      content="5 Days" />
    <meta name="doc-class"          content="Living Document" />
    <meta name="robots"         content="all" />
    <meta http-equiv="imagetoolbar"     content="no" />
    <link rel="Shortcut Icon"       href="favicon.html" type="image/x-icon" />
    <link rel="icon"            href="favicon.html" type="image/x-icon" />
    <link rel="stylesheet"          href="css/styles.css" type="text/css" charset="utf-8" />
    <link rel="stylesheet"          href="css/slideshow.css" type="text/css" media="screen" />

    <script type="text/javascript"      src="scripts/mootools-core-1.3.1-full-compat-yc.js"></script>
    <script type="text/javascript"      src="scripts/mootools-more-1.3.1.1.js"></script>
    <script type="text/javascript"      src="scripts/interface.js"></script>
    <script type="text/javascript"      src="scripts/slideshow.js"></script>
    <script type="text/javascript"      src="scripts/fitimage.js"></script>

    <script type="text/javascript">
        window.addEvent('domready', function()
        {
            new FitImage('files/impressionen/SONST-Wald.jpg');
        });
    </script>

</head>
<body>

    <div id="show"></div>

    <div id="menu">
        <a href="page5cf1.html?page=aktuelles&amp;l=">Aktuelles</a> /
        <a href="pagee4c7.html?page=angebot&amp;l=">Angebot</a> /
        <a href="page6e95.html?page=projekte&amp;l=">Realisierte Projekte</a> /
        <a href="page0c6a.html?page=referenzen&amp;l=">Referenzen</a> /
        <a href="pagee1df.html?page=kontakt&amp;l=">Kontakt</a> /
        <a href="paged192.html?page=impressum&amp;l=">Impressum</a>
    </div>

    <div id="wrapper">

        <div id="block_01" class="block">
            <div class="remove">
                <a href="#" onclick="slidepic();slide04();slide03();slide02();slide01();resetDelay();">&times;</a>
            </div>
            <div id="block_01_inner"></div>
        </div>

        <div id="block_02" class="block">
            <div class="remove">
                <a href="#" onclick="slidepic();slide04();slide03();slide02();resetDelay();">&times;</a>
            </div>
            <div id="block_02_inner"></div>
        </div>

        <div id="block_03" class="block">
            <div class="remove">
                <a href="#" onclick="slidepic();slide04();slide03();resetDelay();">&times;</a>
            </div>
            <div id="block_03_inner"></div>
        </div>

        <div id="block_04" class="block">
            <div class="remove">
                <a href="#" onclick="slidepic();slide04();resetDelay();">&times;</a>
            </div>
            <div id="block_04_inner"></div>
        </div>

        <div id="block_pic" class="block" rel="off">
            <div class="remove" style="height: 0;">
                <a href="#" onclick="slidepic();resetDelay();" id="close_pic">&times;</a>
            </div>
            <div id="block_pic_slideshow" rel="0" onclick="javascript:next_pic(); return false;"></div>
        </div>

    </div>

    <div class="introLogo">
        <img src="images/logo.png" alt="sonst" width="920" height="291" border="0" />
    </div>
    <div class="lang">
        <a href="index124c.html?l=e">E</a> / <a href="index1d70.html?l=d">D</a>
    </div>
</body>
</html>

Upvotes: 4

Views: 13414

Answers (3)

erc mgddm

Reputation: 1

For the GUI version, list every URL that has to be downloaded in the "Web Addresses (URL)" field:

https://ok.mysite.com/src/js.js
https://ok.mysite.com/src/css.css
https://ok.mysite.com/src/
https://ok.mysite.com/folder/iwrHelp
https://ok.mysite.com/folder/mlnHelp
https://ok.mysite.com/folder/mlnRatings
https://ok.mysite.com/folder/iwrVariants
https://ok.mysite.com/folder/millionaire
https://ok.mysite.com/folder/
https://ok.mysite.com/favicon.ico
https://ok.mysite.com/
https://ok.mysite.com/src/erc
https://ok.mysite.com/src/jquery.js
https://ok.mysite.com/src/ico.ico
https://ok.mysite.com/src/bg.jpg
https://ok.mysite.com/src/fonts/font.eot
https://ok.mysite.com/src/fonts/font.otf
https://ok.mysite.com/src/fonts/
https://ok.mysite.com/src/fonts/font.ttf
https://ok.mysite.com/src/fonts/font.woff
https://ok.mysite.com/src/folder/gif.gif
https://ok.mysite.com/src/folder/jpg.jpg

If this is your own site, you can get the file paths easily with an OS command, for example dir /ogen /a /p /s /b *.* in cmd (or something similar in bash). Then replace the local prefix in a text editor such as Notepad, for example c:\site with https://ok.mysite.com, and change the backslashes to forward slashes.
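If you'd rather not do the search-and-replace by hand, the same list can be produced with a short script. This is only a sketch of the idea above: the function name local_paths_to_urls is made up for illustration, and c:\site and https://ok.mysite.com are the example values from this answer, not real locations.

```python
from pathlib import Path
from urllib.parse import quote


def local_paths_to_urls(site_root, base_url):
    """Map every file under site_root to the matching URL under base_url."""
    root = Path(site_root)
    prefix = base_url.rstrip("/")
    urls = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # as_posix() turns Windows backslashes into forward slashes;
            # quote() escapes spaces and similar characters but keeps "/".
            relative = quote(path.relative_to(root).as_posix())
            urls.append(f"{prefix}/{relative}")
    return urls


# Example call (placeholder values from the answer above):
#   for url in local_paths_to_urls(r"c:\site", "https://ok.mysite.com"):
#       print(url)
```

The printed lines can then be pasted into the URL field as they are.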

Check the size of every downloaded file. If a file is 0 bytes, either download it manually or restart the download, then check all files for a size of 0 again. See the HTTrack error log for details.
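Checking the sizes by hand gets tedious on a large mirror, so here is a minimal sketch that lists the 0-byte files for you. The function name find_empty_files and the folder in the example call are assumptions for illustration; point it at whatever folder HTTrack wrote the mirror to.

```python
import os


def find_empty_files(mirror_dir):
    """Return the sorted paths of all 0-byte files below mirror_dir."""
    empty = []
    for folder, _subdirs, files in os.walk(mirror_dir):
        for name in files:
            path = os.path.join(folder, name)
            if os.path.getsize(path) == 0:
                empty.append(path)
    return sorted(empty)


# Example call (hypothetical mirror folder):
#   for path in find_empty_files(r"c:\My Web Sites\mysite"):
#       print(path)
```

Anything it reports is a candidate for a manual download or a second run.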

Upvotes: 0

Diego Sagrera

Reputation: 263

Some servers require specific request headers from the browser. To mimic this exact behaviour, follow these steps:

  1. Press F12 in the browser and look for the "Network" or "Net" tab
  2. Open the webpage you want to download
  3. Expand the first item in the list, which should be a GET request
  4. Check where it says "Headers". If you're using Firebug in Firefox, you may also click "view source"
  5. Copy all of the headers, starting at the line that reads "Host:", by selecting them with the mouse and pressing Ctrl+C
  6. Go to HTTrack and click the "Set options" button of your current download (under the URLs)
  7. Go to the "Browser ID" tab, leave "Browser identity" empty, set HTML footer to "(none)", and paste what you copied in step 5 into "Additional HTTP headers"
  8. You're all set.
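Before pasting the headers into HTTrack, it can be worth checking that they actually change what the server returns. The sketch below is not part of HTTrack; it only replays the copied header block with Python's standard urllib. The helper names parse_copied_headers and build_request are made up for this example.

```python
import urllib.request

# Headers urllib should not receive from the copied block: it sets Host
# itself, and it does not decompress gzip responses automatically.
SKIPPED = {"host", "accept-encoding"}


def parse_copied_headers(raw_text):
    """Turn 'Name: value' lines copied from the dev tools into a dict."""
    headers = {}
    for line in raw_text.splitlines():
        name, sep, value = line.strip().partition(":")
        # Skip blank lines, lines without a colon, and HTTP/2
        # pseudo-headers such as ':authority', which start with a colon.
        if not sep or not name.strip():
            continue
        headers[name.strip()] = value.strip()
    return headers


def build_request(url, raw_headers):
    """Build a GET request that carries the copied browser headers."""
    headers = {
        name: value
        for name, value in parse_copied_headers(raw_headers).items()
        if name.lower() not in SKIPPED
    }
    return urllib.request.Request(url, headers=headers)


# Example call (paste your own header block into raw):
#   raw = "Host: sonst.cc\nUser-Agent: Mozilla/5.0\n"
#   with urllib.request.urlopen(build_request("http://sonst.cc/", raw)) as r:
#       print(r.status, len(r.read()))
```

If the response only looks right with the copied headers, those are the ones worth giving to HTTrack.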

Upvotes: 3

Huey

Reputation: 5220

I did a wget -p -k http://sonst.cc and got index.html with all its associated css and js files.

The background image didn't get pulled, but apart from that, the page looks okay.

I checked out the tabs, and indeed they weren't working. Closer inspection in the browser's dev tools reveals that they load their content from an external PHP file when clicked.

Since the PHP file is processed server-side, wget and HTTrack can never get their hands on the script itself, only on its output, so the offline copy can't load the relevant content. When the local copy tries to pull it from the server, I get an Access-Control cross-origin error.

If you really want a working version of the page, given the relatively small number of tabs, you could manually copy the responses from the PHP script and edit the JS in index.html to load the tabs from your local copy of the responses instead.
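The copying part can be scripted. What follows is a rough sketch under assumptions, not the site's actual setup: I'm guessing the script takes the same page and l parameters that appear in the menu links of index.html, and the endpoint in the example call is a placeholder. Take the real request URL from the Network tab before running it. The function names are made up for this example.

```python
from pathlib import Path
from urllib.parse import urlencode
from urllib.request import urlopen

# Page names taken from the menu links in the question's index.html.
PAGES = ["aktuelles", "angebot", "projekte", "referenzen", "kontakt", "impressum"]


def fetch_bytes(url):
    """Download one URL and return the raw response body."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def save_tab_responses(endpoint, pages, out_dir, fetch=fetch_bytes):
    """Save the response for each page name as <out_dir>/<page>.html."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for page in pages:
        # Assumed query format, mirroring "?page=aktuelles&l=" in the menu.
        url = f"{endpoint}?{urlencode({'page': page, 'l': ''})}"
        target = out / f"{page}.html"
        target.write_bytes(fetch(url))
        saved.append(target)
    return saved


# Example call (the endpoint is a placeholder, not the site's real script):
#   save_tab_responses("http://example.invalid/content.php", PAGES, "tabs")
```

You would still have to change the JS so each tab reads its saved file instead of calling the server, and depending on the browser you may need to serve the folder over a local web server rather than opening it from file://.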

Upvotes: 0
