Jimbo

Reputation: 3284

Failing to download a file using wget.download on AWS Lambda with Errno 97

I'm running the following Python commands in AWS Lambda:

import wget
src = 'https://ftp.ncbi.nlm.nih.gov/pubmed/updatefiles/pubmed21n1085.xml.gz'
dest = '/tmp/pubmed21n1085.xml.gz'
wget.download(src,out=dest)

and getting the following error:

[ERROR] URLError: <urlopen error [Errno 97] Address family not supported by protocol>
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 495, in lambda_handler
    download_and_add_file('pubmed21n1085.xml.gz')
  File "/var/task/lambda_function.py", line 453, in download_and_add_file
    wget.download(get_ftp_url(file_name,is_update),out=out_file_path,bar=bar_progress)
  File "/opt/python/wget.py", line 526, in download
    (tmpfile, headers) = ulib.urlretrieve(binurl, tmpfile, callback)
  File "/var/lang/lib/python3.8/urllib/request.py", line 247, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/var/lang/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/var/lang/lib/python3.8/urllib/request.py", line 525, in open
    response = self._open(req, data)
  File "/var/lang/lib/python3.8/urllib/request.py", line 542, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/var/lang/lib/python3.8/urllib/request.py", line 502, in _call_chain
    result = func(*args)
  File "/var/lang/lib/python3.8/urllib/request.py", line 1393, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/var/lang/lib/python3.8/urllib/request.py", line 1353, in do_open
    raise URLError(err)
END RequestId: e07da2fc-9c70-4217-a8bc-8963067ec1c6

The failure takes close to 2 minutes. Is this a Lambda permissions error, or some other error in the code?

This code works locally. The file is only ~20 MB.

EDIT: This seems to be related to using a VPC. The same Lambda function without a VPC works fine. The VPC is used to connect to Aurora Serverless, and its outbound rules are all open.
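
Because the failure takes close to two minutes, an explicit timeout can make the problem surface faster while debugging. Below is a minimal sketch that uses only the standard library instead of `wget`. The `fetch` helper name and the 10-second default are assumptions for illustration, not part of the original code:

```python
import shutil
import urllib.request


def fetch(url, dest, timeout=10.0):
    """Stream `url` to `dest`, failing early if the network does not respond.

    `timeout` applies to each blocking socket operation (connect, read),
    so the total time can still exceed it if several addresses are tried.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response, \
            open(dest, "wb") as out:
        # Copy in chunks so the ~20 MB file is never held fully in memory.
        shutil.copyfileobj(response, out)
    return dest
```

Inside the handler this would be called as `fetch(src, dest)` with the `src` and `dest` values shown above. It does not fix the underlying networking problem; it only shortens the wait before the error appears.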

Upvotes: 2

Views: 1213

Answers (1)

Jimbo

Reputation: 3284

Apparently, a Lambda function attached to a private VPC cannot reach the public internet by default. I had assumed the VPC only restricted traffic in and out, and that outbound rules allowing all destinations would be enough to reach public resources. That is not the case.
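
A quick way to confirm this kind of problem from inside the function is to attempt a plain TCP connection with a short timeout before starting the download. This is only a sketch: the `can_reach` helper name, the default port 443, and the 3-second timeout are assumptions chosen for illustration:

```python
import socket


def can_reach(host, port=443, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        # create_connection resolves the host and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, DNS failures, and
        # "Address family not supported by protocol" (Errno 97).
        return False
```

Calling `can_reach('ftp.ncbi.nlm.nih.gov')` at the top of the handler and logging the result should show quickly whether the function has any outbound route, without waiting for the full download attempt to fail.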

See: https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/

A decent walkthrough for this is here: https://stackoverflow.com/a/39082826/764365

Note that the instructions seem to require setting up a NAT gateway (along with other things, such as an Elastic IP). The cost of the Elastic IP is not clear to me, but the NAT gateway seems to carry a fairly steep hourly charge regardless of usage, which doesn't work well for someone running only the occasional Lambda function. I think I am going to look for a service elsewhere that can download the files I want to an S3 bucket.

Upvotes: 2
