Python Novice

Reputation: 2200

S3 Connection timeout when using boto3

I am using boto3 to work with S3. If my application cannot reach S3 because of a network issue, the connection hangs until it eventually times out. I would like to set a lower connection timeout. I came across this PR for botocore that allows setting a timeout. To reproduce the problem, I block outbound HTTPS traffic and then make a request:

$ sudo iptables -A OUTPUT -p tcp --dport 443 -j DROP

from botocore.client import Config
import boto3

config = Config(connect_timeout=5, read_timeout=5)

s3 = boto3.client('s3', config=config)

s3.head_bucket(Bucket='my-s3-bucket') 

This throws a ConnectTimeout, but it still takes too long to error out:

ConnectTimeout: HTTPSConnectionPool(host='my-s3-bucket.s3.amazonaws.com', port=443): Max retries exceeded with url: / (Caused by ConnectTimeoutError(<botocore.awsrequest.AWSHTTPSConnection object at 0x2ad5dd0>, 'Connection to my-s3-bucket.s3.amazonaws.com timed out. (connect timeout=5)'))

Tweaking the connect and read timeouts has no effect on how long the call takes to fail.

Upvotes: 50

Views: 80500

Answers (2)

EM Bee

Reputation: 1259

Did you ever get this resolved? My suspicion is that you need to supply credentials for your boto3 connection.

Here is how I connect to boto3:

import boto3
from botocore.exceptions import ClientError
import re
from io import BytesIO
import gzip
import dateutil.parser as dparser
from datetime import datetime
import tarfile
import requests
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

## Needed glue stuff
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)

## currently this will run for everything that is in the staging directory of omniture

# set needed parms
myProfileName = 'MyDataLake'
dhiBucket = 'data-lake'
#create boto3 session
try:
    # Placeholder credentials; region_name may only be passed once
    session = boto3.Session(aws_access_key_id='aaaaaaaaaaaa', aws_secret_access_key='abcdefghijklmnopqrstuvwxyz', aws_session_token=None, region_name='us-east-1', botocore_session=None)
    s3 = session.resource('s3')  # establish connection to s3
except Exception as conne:
    print("Unable to connect:  " + str(conne))
    errtxt = requests.post("https://errorcapturesite", data={'message': 'Unable to connect to : ' + myProfileName, 'notify': True, 'color': 'red'})
    print(errtxt.text)
    sys.exit(1)

Upvotes: 2

llude

Reputation: 1273

You are probably getting bitten by boto3's default behaviour of retrying connections multiple times and exponentially backing off in between. I had good results with the following:

from botocore.client import Config
import boto3

config = Config(connect_timeout=5, retries={'max_attempts': 0})
s3 = boto3.client('s3', config=config)
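
To put rough numbers on why retries matter here, the following is a back-of-the-envelope sketch, not botocore's actual retry code. The function name worst_case_wait and the backoff values are hypothetical, and the figure of four retries is an assumption about the default that may differ by botocore version, retry mode, and service.

```python
def worst_case_wait(connect_timeout, max_retries, backoff_delays=()):
    """Rough upper bound, in seconds, before a connect failure is raised.

    connect_timeout: per-attempt connect timeout, as passed to Config.
    max_retries:     retries after the initial attempt ('max_attempts').
    backoff_delays:  assumed sleep before each retry (hypothetical values).
    """
    attempts = max_retries + 1  # the initial attempt plus every retry
    sleeps = sum(backoff_delays[:max_retries])  # at most one sleep per retry
    return attempts * connect_timeout + sleeps


# Assumed: 4 retries, backoff sleeps not counted.
print(worst_case_wait(5, 4))                # 25
# Same, with hypothetical backoff sleeps of 1, 2, 4 and 8 seconds.
print(worst_case_wait(5, 4, (1, 2, 4, 8)))  # 40
# Retries disabled, as in the Config above.
print(worst_case_wait(5, 0))                # 5
```

Under those assumptions, a 5 second connect timeout can stretch to 25 seconds or more before the exception surfaces, while max_attempts set to 0 keeps it at roughly the configured timeout.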

Upvotes: 85
