Reputation: 1643
I want to be able to download the entire contents of a website and use the data in my app. I've used NSURLConnection to download files in the past, but I don't believe it is capable of downloading all files from an entire website. I'm aware of the app Site Sucker, but don't think there is a way to integrate it's functionality into my app. I looked into AFNetworking & ASIHttpRequest, but didn't see anything useful to me. Any ideas / thoughts? Thanks.
Upvotes: 0
Views: 680
Reputation: 16946
I doubt there is anything out of the box that you can use, but existing libraries that you mentioned (AFNetworking & ASIHttpRequest) will get you pretty far.
The way this works is, you load the main website. Then you go through the source and find any resources that that page uses to display its contents and link to other pages. You then need to recursively download the contents of those resources, as well as its resources.
As you can imagine, there are few caveats to this approach:
You will only be able to download files that are mentioned in the source codes. Hidden files or files that aren't used by any page will not be downloaded as the app doesn't know of their existence.
Be aware of relative and absolute paths: ./image.jpg, /image.jpg, http://website.com/image.jpg, www.website.com/image.jpg, etc. could all link to the same image.
Keep in mind that page1.html could link to page2.html and vice versa. If you don't put any checks in place, this could lead to an infinite loop.
Check for pages that link to external websites--you probably don't want to download those as many websites have links to the outside and here you downloading the entire Internet to an iPhone with 8GB of storage.
Any dynamic pages (the ones that use a server side scripting language, such as PHP) will become static because they lose their server backend to provide them with dynamic data.
Those are the ones I could come up with, but I'm sure that there's more.
Upvotes: 1