Marius Johan

Reputation: 392

Can't download a 1.34 GB file on my Google App Engine server

I'm trying to download a pretrained model on Google App Engine, but every time I try, the download process keeps restarting.
How do I prevent that?

A 2020-05-11T19:32:55Z [2020-05-11 19:32:55 +0000] [40] [INFO] Worker exiting (pid: 40)
A 2020-05-11T19:32:55Z 
Downloading:  66%|██████▌   | 885M/1.34G [00:27<00:14, 32.2MB/s]
A 2020-05-11T19:32:55Z [2020-05-11 19:32:55 +0000] [50] [INFO] Booting worker with pid: 50
A 2020-05-11T19:32:57Z Downloading files...
A 2020-05-11T19:32:57Z Downloaded bert tokenizer
A 2020-05-11T19:33:25Z 
Downloading:  69%|██████▊   | 921M/1.34G [00:27<00:11, 37.2MB/s][2020-05-11 19:33:25 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:50)
A 2020-05-11T19:33:25Z [2020-05-11 19:33:25 +0000] [50] [INFO] Worker exiting (pid: 50)
A 2020-05-11T19:33:25Z 
Downloading:  69%|██████▊   | 921M/1.34G [00:27<00:12, 33.8MB/s]
A 2020-05-11T19:33:26Z [2020-05-11 19:33:26 +0000] [60] [INFO] Booting worker with pid: 60
A 2020-05-11T19:33:27Z Downloading files...
A 2020-05-11T19:33:27Z Downloaded bert tokenizer
A 2020-05-11T19:33:56Z 
Downloading:  51%|█████     | 678M/1.34G [00:27<00:18, 36.2MB/s][2020-05-11 19:33:56 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:60) 
A 2020-05-11T19:33:56Z [2020-05-11 19:33:56 +0000] [60] [INFO] Worker exiting (pid: 60)
A 2020-05-11T19:33:56Z 
Downloading:  51%|█████     | 681M/1.34G [00:27<00:26, 25.0MB/s]
A 2020-05-11T19:33:56Z [2020-05-11 19:33:56 +0000] [70] [INFO] Booting worker with pid: 70
A 2020-05-11T19:33:57Z Downloading files...
A 2020-05-11T19:33:57Z Downloaded bert tokenizer

Notice how the process repeats even though the download hasn't finished yet.
I think it has something to do with the worker exiting, but I have quite high specs:

app.yaml

runtime: python
env: flex
entrypoint: gunicorn -b :$PORT main:app
runtime_config:
    python_version: 3

# System specifications
manual_scaling:
    instances: 1
resources:
    cpu: 8
    memory_gb: 16
    disk_size_gb: 15

# Health checks 
liveness_check:
    path: "/liveness_check"
    initial_delay_sec: 600

Would it help if I first uploaded it to a Google Cloud Storage bucket (in the same region) and then downloaded it from there?

Upvotes: 1

Views: 114

Answers (1)

siamsot

Reputation: 1575

Gunicorn workers have a default timeout of 30 seconds, which matches the WORKER TIMEOUT entries in your log. You can increase it by changing the entrypoint in your app.yaml file to something like:

entrypoint: gunicorn -t (value in seconds) -b :$PORT main:app
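If you would rather not keep the value on the command line, the same setting can probably live in a Gunicorn config file, which is plain Python. This is only a sketch: the file name gunicorn.conf.py and the 600-second value are assumptions, so pick a timeout that comfortably exceeds your download time.

```python
# gunicorn.conf.py: assumed file name, not part of the original setup.
import os

# App Engine flex passes the port in $PORT; 8080 is a fallback for local runs.
bind = ":" + os.environ.get("PORT", "8080")

# Seconds a worker may stay silent before the master kills and restarts it.
# Gunicorn's default is 30; 600 is an assumed value for illustration.
timeout = 600
```

The entrypoint would then become something like `entrypoint: gunicorn -c gunicorn.conf.py main:app`.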

However, please note that App Engine is a serverless solution. That means that every time an instance is spun up, those files will have to be downloaded to it again, and when the instance dies, its state dies with it.
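Within a single instance you can at least avoid repeating the download when a worker restarts. Here is a minimal sketch, assuming your download logic can be wrapped in a function that writes to a given path; `ensure_file` and `fetch` are hypothetical names, not part of any library. It does not survive the instance being replaced, since the local disk goes away with it.

```python
import os
import tempfile


def ensure_file(path, fetch):
    """Return path, calling fetch(tmp_path) only if path does not exist yet.

    fetch is a hypothetical callable that writes the file to the path it
    is given, e.g. a wrapper around your model download.
    """
    if os.path.exists(path):
        return path  # already downloaded by an earlier worker

    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)

    # Download to a temporary file first, so a worker killed halfway
    # through never leaves a truncated file at the final path.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        fetch(tmp_path)
        os.replace(tmp_path, path)  # atomic rename within one directory
    finally:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
    return path
```

Each worker would call `ensure_file` at startup; only the first call on a fresh instance pays for the download.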

Upvotes: 2
