graille

Reputation: 1231

Download GitHub public release assets from the CLI and a Dockerfile

I'm trying to download the models of a neural network project used to detect NSFW images. The packed models are available in the assets section of a release at this URL: https://github.com/notAI-tech/NudeNet/releases

I would like to create a Dockerfile that downloads these assets at build time. I started with the ADD instruction, but the files were not downloaded completely (only a few kB out of the 120 MB of some files). So I tried wget and curl from my Linux CLI, but neither worked as expected. For example, the command:

curl -OJL https://github.com/notAI-tech/NudeNet/releases/download/v0/classifier_model.onnx

starts a download, but only fetches an HTML file instead of the actual ONNX file. It seems GitHub is doing some kind of redirection, and I don't know how to handle it with curl/wget and, ultimately, with the ADD instruction of a Dockerfile.

Upvotes: 0

Views: 881

Answers (1)

Daweo

Reputation: 36550

I visited

https://github.com/notAI-tech/NudeNet/releases/download/v0/classifier_model.onnx

in my browser and got a login page, so apparently the file is not publicly available. That would explain why you got a small HTML file (a page containing the login form).
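To confirm this from the command line, a quick check on the downloaded file can help. The following is only a rough sketch: `looks_like_html` is a hypothetical helper name, and the heuristic (searching the first 512 bytes for an HTML doctype or `<html` tag) is an assumption that will not catch every possible error page.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) if the file appears to be
# an HTML page rather than binary data such as an ONNX model.
# Heuristic only: looks for an HTML doctype or <html tag in the first 512 bytes.
looks_like_html() {
    head -c 512 "$1" | grep -qiE '<!doctype html|<html'
}

# Example usage (file name taken from the question):
# if looks_like_html classifier_model.onnx; then
#     echo "Got an HTML page, not the model" >&2
# fi
```

If the check succeeds on your download, the server returned a web page (for example a login form) instead of the asset itself.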

Github is doing some kind of redirection and I don't know how I can handle it with(...)wget

You need to provide authentication data. I do not know exactly how it is done in this case, but I suspect they use one of the popular methods: basic authentication (see the wget options --http-user=user and --http-password=pass) or a cookie-based solution (see the wget options --load-cookies file, --save-cookies file and --keep-session-cookies).

The options mentioned above are described in the wget man page, which you can open via the link or by running man wget in a terminal.
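As an illustration of how the basic authentication options fit together, here is a minimal sketch. The function name `fetch_with_basic_auth` and the `DRY_RUN` switch are hypothetical, introduced only for this example, and I have not verified that GitHub accepts this form of authentication for release assets; treat it as a starting point for experimenting, not a confirmed solution.

```shell
#!/bin/sh
# Hypothetical helper: download a URL with wget using HTTP basic authentication.
# Arguments: URL, user name, password.
# With DRY_RUN=1 it only prints the command (password masked) instead of running it.
fetch_with_basic_auth() {
    url=$1
    user=$2
    pass=$3
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'wget --http-user=%s --http-password=*** %s\n' "$user" "$url"
    else
        wget --http-user="$user" --http-password="$pass" "$url"
    fi
}

# Example usage (credentials are placeholders):
# fetch_with_basic_auth \
#     https://github.com/notAI-tech/NudeNet/releases/download/v0/classifier_model.onnx \
#     myuser mypassword
```

For the cookie-based variant, the equivalent call would pass --load-cookies file to wget instead of the user and password options, assuming you have already saved a session cookie file.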

Upvotes: 1
