klaudas

Reputation: 457

SSZipArchive fails to unzip Google Books epubs

My code executes this sequence:

  1. Epub download request
  2. Unzipping using SSZipArchive

For example, the e-book from this link unzips successfully:

https://www.z-epub.com/electronic-book/3633

The e-book from Google Books, however, doesn't:

https://books.google.it/books/download/Life_and_Times_of_Frederick_Douglass.epub?id=fFTcLFXId-wC&hl=&output=epub&source=gbs_api

I get this error:

Error Domain=SSZipArchiveErrorDomain Code=-1 "failed to open zip file" UserInfo={NSLocalizedDescription=failed to open zip file}

Here's an interesting thing: the same Google Books e-book unzips successfully if I add it to the app's bundle or import it through fileImporter (SwiftUI).

I'm assuming this might be related to file permissions, though I may be wrong. Can anyone assist?

Upvotes: 1

Views: 116

Answers (1)

pmqs

Reputation: 3705

Let's see what the files look like when they are downloaded.

$ wget -nv https://www.z-epub.com/electronic-book/3633
2022-10-17 14:35:23 URL:https://www.z-epub.com/electronic-book/3633 [753686/753686] -> "3633" [1]

What have we downloaded? It appears to be an epub file, and it is also a valid zip file.

$ file 3633 
3633: EPUB document

$ unzip -l 3633
Archive:  3633
  Length      Date    Time    Name
---------  ---------- -----   ----
       20  2022-09-07 00:36   mimetype
        0  2022-09-07 00:36   META-INF/
      244  2022-09-07 00:36   META-INF/container.xml
     2535  2022-09-07 00:36   content.opf
   280948  2022-09-07 00:36   index-1_1.jpg
    36675  2022-09-07 00:36   index-3_1.jpg
    30045  2022-09-07 00:36   index-420_1.jpg
    96011  2022-09-07 00:36   index-424_1.jpg
     9528  2022-09-07 00:36   index-424_2.jpg
     9563  2022-09-07 00:36   index-424_3.jpg
    12090  2022-09-07 00:36   index-424_4.jpg
    15663  2022-09-07 00:36   index-426_1.jpg
   138226  2022-09-07 00:36   index_split_000.html
   134177  2022-09-07 00:36   index_split_001.html
   136279  2022-09-07 00:36   index_split_002.html
   135831  2022-09-07 00:36   index_split_003.html
   135266  2022-09-07 00:36   index_split_004.html
   132119  2022-09-07 00:36   index_split_005.html
   265375  2022-09-07 00:36   index_split_006.html
     3634  2022-09-07 00:36   index_split_007.html
       58  2022-09-07 00:36   page_styles.css
      730  2022-09-07 00:36   stylesheet.css
     7769  2022-09-07 00:36   toc.ncx
---------                     -------
  1582786                     23 files
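For what it's worth, the `file` verdict can be reproduced by hand. The EPUB container format requires `mimetype` to be the first entry in the archive, stored uncompressed and with no extra field, so its name and its 20-byte content (`application/epub+zip`, which matches the length in the listing above) sit at a fixed offset right after the 30-byte zip local file header. A rough sketch of that check follows; the function name `is_epub` is mine, and it only inspects the start of the file, not the rest of the archive.

```shell
# Hypothetical helper: succeed if the file starts like an EPUB container.
is_epub() {
    # Bytes 0-3 of a zip local file header are the signature "PK\003\004".
    sig=$(dd if="$1" bs=1 count=4 2>/dev/null | od -An -tx1 | tr -d ' \n')
    [ "$sig" = "504b0304" ] || return 1
    # Bytes 30-57: the 8-byte entry name followed by its 20-byte content.
    # NUL bytes are stripped so the shell does not complain about binary data.
    start=$(dd if="$1" bs=1 skip=30 count=28 2>/dev/null | tr -d '\000')
    [ "$start" = "mimetypeapplication/epub+zip" ]
}
```

Something like `is_epub 3633 && echo "looks like an epub"` would then be the scripted equivalent of the `file` call above.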

Now for the document from Google Books.

$ wget -nv 'https://books.google.it/books/download/Life_and_Times_of_Frederick_Douglass.epub?id=fFTcLFXId-wC&hl=&output=epub&source=gbs_api'
2022-10-17 14:36:06 URL:https://books.google.it/books/download/Life_and_Times_of_Frederick_Douglass.epub?id=fFTcLFXId-wC&hl=it&source=gbs_api&capid=AFLRE71pa6m6U4kyhfqVhaMlSaPP4IhFhRDadl4EH8uPhMSCrD2HznTfJdCsQbhqdFjma8lxUITKJ7EAPw2Dvpo2J5lpmvDi6w&continue=https://books.google.it/books/download/Life_and_Times_of_Frederick_Douglass.epub%3Fid%3DfFTcLFXId-wC%26hl%3Dit%26output%3Depub%26source%3Dgbs_api [13867] -> "Life_and_Times_of_Frederick_Douglass.epub?id=fFTcLFXId-wC&hl=&output=epub&source=gbs_api" [1]

What have we downloaded?

$ file Life_and_Times_of_Frederick_Douglass.epub\?id\=fFTcLFXId-wC\&hl\=\&output\=epub\&source\=gbs_api 
Life_and_Times_of_Frederick_Douglass.epub?id=fFTcLFXId-wC&hl=&output=epub&source=gbs_api: HTML document, ISO-8859 text, with very long lines (11433)

That isn't an epub file; it is an HTML page. When I went to the URL in my browser, I found a page in Italian that needed user input. The image below has the English translation.

[Image: English translation of the Google Books page that asks for user input]

When I downloaded the epub file manually from that site, it looked fine.

Are you doing anything in your code to bypass the form?
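Whatever the answer, it may be worth checking what actually arrived before handing it to the unzipper, since an HTML page saved under an `.epub` name will fail to open as a zip in just this way. Below is a minimal sketch of such a guard in shell; `looks_like_zip`, `book.epub` and `$url` are made-up names for illustration, and the same first-four-bytes test could presumably be done in Swift on the downloaded data before calling SSZipArchive.

```shell
# Hypothetical guard: succeed only if the file begins with the zip
# local-file-header signature "PK\003\004" (hex 50 4b 03 04).
looks_like_zip() {
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    [ "$magic" = "504b0304" ]
}

# Example use after a download (URL and file name are placeholders):
#   curl -sSL -o book.epub "$url"
#   looks_like_zip book.epub || echo "not a zip archive; got something else" >&2
```

For the Google Books download shown above, this guard should fail, because the saved response is the HTML form page rather than the archive.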

Upvotes: 1
