axelios

Reputation: 203

Limiting Amazon S3 downloads for a service

I run a service where users can publicly upload and download files from our site, backed by Amazon S3. Last month we had a problem where a user uploaded a file that was downloaded like crazy, resulting in 170 TB of bandwidth and a huge bill.

After talking to Amazon and searching on StackOverflow, the suggested way to ensure this doesn't happen again is to download the S3 logs, parse them, and take action from there.

We could build such a script ourselves, but I assume there must be an open source tool or third-party service that already provides this?

Upvotes: 1

Views: 2044

Answers (4)

Zigglzworth

Reputation: 6813

What about:

  1. Create a CloudFront distribution for downloads

  2. Set up a CloudWatch alarm that is triggered when the distribution's BytesDownloaded metric exceeds your chosen monthly limit

  3. Add a notification (sent to an SNS topic you create) that is triggered when the alarm fires

  4. Add a Lambda function that is triggered by SNS when a notification is sent to that topic (the SNS topic should also have your email subscribed, of course, so you receive an email with the alarm)

  5. In the Lambda function, write code that uses the AWS SDK to update the CloudFront distribution and set its Enabled value to false (a sketch follows below)

(You could also create a notification that fires when the alarm's state changes back to OK and trigger a Lambda function that re-enables the distribution.)
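
A minimal sketch of the Lambda in step 5, assuming boto3 (available by default in the Lambda Python runtime) and a hypothetical distribution ID:

import boto3

# Hypothetical distribution ID; in practice pass it in via an environment variable.
DISTRIBUTION_ID = 'E1ABCDEF2GHIJ3'

cloudfront = boto3.client('cloudfront')

def lambda_handler(event, context):
    # Fetch the current config together with its ETag (required for updates).
    response = cloudfront.get_distribution_config(Id=DISTRIBUTION_ID)
    config = response['DistributionConfig']
    etag = response['ETag']

    if config['Enabled']:
        config['Enabled'] = False
        cloudfront.update_distribution(
            Id=DISTRIBUTION_ID,
            DistributionConfig=config,
            IfMatch=etag,  # the ETag proves we are updating the version we just read
        )

The Lambda's execution role would need cloudfront:GetDistributionConfig and cloudfront:UpdateDistribution permissions.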

Upvotes: 2

Ali

Reputation: 1105

We had a similar requirement a while ago.

We used CloudTrail logs to figure out all the activities being performed on our AWS account.

Hopefully this script for downloading and filtering CloudTrail logs helps you out. (The following script only extracts launched instance IDs, owner, and event name; please modify it according to your needs.)

import boto3
import gzip
import os
import json

client = boto3.client('s3')
bucketname = "mybucketname"
list_bucket_objects = client.list_objects(Bucket=bucketname)
download_path = '/home/ec2-user/cloudtrail/'

# DOWNLOADING: download the log files from S3.
for obj in list_bucket_objects['Contents']:
    print(obj['Key'])
    object_name = obj['Key'].split('/')
    # CloudTrail keys look like AWSLogs/<account>/CloudTrail/<region>/<yyyy>/<mm>/<dd>/<file>,
    # so splitting on '/' gives 8 parts for the actual log files.
    if len(object_name) == 8:
        print("Downloading ---> %s" % object_name[7])
        client.download_file(bucketname, obj['Key'], download_path + object_name[7])

# UNZIPPING: unzip the downloaded files into one folder.
file_path = '/home/ec2-user/cloudtrail/'
new_file_path = '/home/ec2-user/cloudtrail/logs/'

# Create the log directory if it does not already exist.
if not os.path.exists(new_file_path):
    os.mkdir(new_file_path)

files = os.listdir(file_path)
for file in files:
    if os.path.isfile(file_path + file):
        with gzip.GzipFile(file_path + file, 'rb') as f:
            s = f.read()
        split_file = file.split('.')
        log_path = new_file_path + split_file[0]
        print(log_path)
        with open(log_path, 'wb') as out:
            out.write(s)

        # PARSING AND FILTERING: parse the JSON, filter for RunInstances
        # events, and append event name, user and instance id to result.txt.
        with open(log_path) as fin:
            content = json.load(fin)
        for record in content['Records']:
            event = record['eventName']
            user = record['userIdentity'].get('userName', '')
            instance_id = ''  # reset per record so a stale id is not reused
            res_ele = record.get('responseElements')
            if res_ele and 'instancesSet' in res_ele:
                if 'items' in res_ele['instancesSet']:
                    instance_id = res_ele['instancesSet']['items'][0]['instanceId']
            if event == "RunInstances" and instance_id != "":
                with open('result.txt', 'a') as result:
                    result.write(event + ": :" + user + ": :" + instance_id + "\n")

# result.txt is stored in your current working directory.

Upvotes: 0

E.J. Brennan

Reputation: 46859

My solution to this, and to problems like it, is to have billing alerts on my account. I know roughly how much I should spend each month and set up alerts accordingly: roughly, I divided that amount by 4 (weeks) and set a series of billing alerts at 1/4, 1/2, 3/4, and 1x my estimated spend.

This is not a technical solution that stops the downloads, but at least someone gets notified and can take action before things get out of control.
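
For reference, a minimal sketch of creating one such alert with boto3 (this assumes billing alerts are enabled in the account's Billing preferences; the alarm name, threshold, and SNS topic ARN are hypothetical):

import boto3

# Billing metrics live in us-east-1 regardless of where your resources run.
cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')

cloudwatch.put_metric_alarm(
    AlarmName='billing-half-of-monthly-estimate',  # hypothetical name
    Namespace='AWS/Billing',
    MetricName='EstimatedCharges',
    Dimensions=[{'Name': 'Currency', 'Value': 'USD'}],
    Statistic='Maximum',
    Period=21600,                  # 6 hours; billing data only updates a few times a day
    EvaluationPeriods=1,
    Threshold=500.0,               # e.g. 1/2 of a $1000/month estimate
    ComparisonOperator='GreaterThanThreshold',
    AlarmActions=['arn:aws:sns:us-east-1:123456789012:billing-alerts'],  # your SNS topic
)

Repeat with different thresholds for the 1/4, 3/4, and 1x alerts.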

Upvotes: 1

Piyush Patil

Reputation: 14523

Your best approach is to distribute your S3 content using Amazon CloudFront, put AWS Web Application Firewall (WAF) in front of it, and implement IP blocking.

So if an IP hits your CloudFront distribution more than, say, 5 times within a given period, AWS WAF will block that IP.

Here is the detailed guide.

https://blogs.aws.amazon.com/security/post/Tx1ZTM4DT0HRH0K/How-to-Configure-Rate-Based-Blacklisting-with-AWS-WAF-and-AWS-Lambda
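
The linked post describes a Lambda-based rate blacklisting setup; WAF has since gained native rate-based rules, which give a similar effect with less machinery. A minimal sketch using the WAFv2 API (note that rate-based rules enforce a minimum request limit per 5-minute window that is well above 5; the ACL, rule, and metric names are hypothetical):

import boto3

# Web ACLs for CloudFront must be created with Scope='CLOUDFRONT' in us-east-1.
wafv2 = boto3.client('wafv2', region_name='us-east-1')

wafv2.create_web_acl(
    Name='download-rate-limit',            # hypothetical name
    Scope='CLOUDFRONT',
    DefaultAction={'Allow': {}},
    Rules=[{
        'Name': 'block-heavy-downloaders',
        'Priority': 0,
        'Statement': {
            'RateBasedStatement': {
                'Limit': 100,              # requests per 5-minute window, per IP
                'AggregateKeyType': 'IP',
            },
        },
        'Action': {'Block': {}},           # block IPs that exceed the limit
        'VisibilityConfig': {
            'SampledRequestsEnabled': True,
            'CloudWatchMetricsEnabled': True,
            'MetricName': 'blockHeavyDownloaders',
        },
    }],
    VisibilityConfig={
        'SampledRequestsEnabled': True,
        'CloudWatchMetricsEnabled': True,
        'MetricName': 'downloadRateLimit',
    },
)

You would then attach the returned web ACL to the CloudFront distribution (for WAFv2, the distribution config's WebACLId field takes the web ACL's ARN).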

Upvotes: 0
