Uzay Macar
Uzay Macar

Reputation: 264

How to curl for extracting valid .zip file from redirecting link

I am trying to automate a data downloading process. For this purpose, my goal is to extract (using bash commands) the .zip from a redirection link that could be seen on display here: https://journals.sagepub.com/doi/suppl/10.1177/0022002706289303

I have seen that people suggest the -L tag with curl for redirections, but it doesn't seem to work for my case. The specific command I have tried is: curl -L -o output.zip https://journals.sagepub.com/doi/suppl/10.1177/0022002706289303/suppl_file/Sambanis_Aug_06.zip

The command file output.zip shows that the extracted .zip file is actually a HTML document text. On the other hand, clicking the redirection link (used inside curl command) downloads the extracted folder automatically via a browser.

Any ideas, tips, or suggestions on what I should try (or whether this is possible or not) will be highly appreciated!

Upvotes: 1

Views: 2774

Answers (2)

danrodlor
danrodlor

Reputation: 1459

If you execute curl with the --verbose option you can see that it is a cookie related problem. The cookie engine needs to be enabled. You can download the desired file as follows:

curl -b cookies.txt -L https://journals.sagepub.com/doi/suppl/10.1177/0022002706289303/suppl_file/Sambanis_Aug_06.zip -o test.zip

It doesn't matter if the file provided with the -b option doesn't exist. We just need to activate the cookie engine.

Refer to Send cookies with curl and Save cookies between two curl requests for futher information.

Upvotes: 1

Sonny
Sonny

Reputation: 3183

You can download that file with wget on Linux

$ wget https://journals.sagepub.com/doi/suppl/10.1177/0022002706289303/suppl_file/Sambanis_Aug_06.zip
$ unzip Sambanis_Aug_06.zip
Archive:  Sambanis_Aug_06.zip
inflating: Sambanis (Aug 06).dta
inflating: Sambanis Appendix (Aug 06).pdf

Upvotes: 1

Related Questions