user11503765

Reputation:

Lambda Function to write to csv and upload to S3

I have a Python script that gets the details of unused security groups. I want it to write the results to a CSV file and upload that file to an S3 bucket.

When I test it on my local machine, it writes the CSV locally. But when I run it as a Lambda function, it needs a place to save the CSV, so I am using S3.

import boto3
import csv

ses = boto3.client('ses')

def lambda_handler(event, context):
    with open('https://unused******-1.amazonaws.com/Unused.csv', 'w') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow([
            'Account Name',
            'Region',
            'Id'
        ])
        ec2 = boto3.resource('ec2')
        sgs = list(ec2.security_groups.all())
        insts = list(ec2.instances.all())

        all_sgs = set([sg.group_id for sg in sgs])
        all_inst_sgs = set([sg['GroupId'] for inst in insts for sg in inst.security_groups])

        unused_sgs = all_sgs - all_inst_sgs


        for elem in unused_sgs:
            writer.writerow([
                Account_Name,
                region,
                elem
                ])

I want to write the value of "elem" to the CSV file and upload it to an S3 bucket. Kindly advise.

Upvotes: 1

Views: 8425

Answers (4)

jawad846

Reputation: 772

This generates an inventory from multiple accounts and pushes it to an S3 bucket using a Lambda function.

Create an SSM Parameter Store parameter to hold the IAM assume-role ARNs of the accounts:

Name: 'rolearnlist'

Type: 'StringList'

Values: 'arn::::::,arn:::::'
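
If you would rather create that parameter with boto3 than through the console, here is a minimal sketch; the role ARNs below are placeholders for your real assume-role ARNs:

import boto3

ssm = boto3.client('ssm')

# Store the assume-role ARNs as a comma-separated StringList.
ssm.put_parameter(
    Name='rolearnlist',
    Type='StringList',
    Value='arn:aws:iam::111111111111:role/InventoryRole,arn:aws:iam::222222222222:role/InventoryRole',  # placeholder ARNs
    Overwrite=True
)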

Then create a Lambda function like the one below:

import boto3
import json
import datetime
import csv

lambda_client = boto3.client('lambda')
ssm_client = boto3.client('ssm')
s3_client = boto3.resource("s3")
sts_client = boto3.client('sts')

def lambda_handler(event, context):

    time = datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
    bucket = s3_client.Bucket('expo2020-core-master-me-south-1-agent-bucket')
    file_name = 'backup_job_weekly_report_' + time + '.csv'
    s3_path = 'Inventory/Weekly/' + file_name

    fieldnames = ['Account Id', 'Backup Job Id', 'Backup State', 'Resource Arn', 'Resource Type', 'Start By', 'Creation Date']

    # Read the comma-separated list of assume-role ARNs from Parameter Store.
    rolearnlist_from_ssm = ssm_client.get_parameter(Name='rolearnlist')
    rolearnlist = rolearnlist_from_ssm['Parameter']['Value'].split(",")

    # /tmp is the only writable path inside a Lambda function.
    with open('/tmp/' + file_name, 'w', newline='') as csvFile:
        w = csv.writer(csvFile, dialect='excel')
        w.writerow(fieldnames)

        for rolearn in rolearnlist:
            awsaccount = sts_client.assume_role(
                RoleArn=rolearn,
                RoleSessionName='awsaccount_session'
            )

            ACCESS_KEY = awsaccount['Credentials']['AccessKeyId']
            SECRET_KEY = awsaccount['Credentials']['SecretAccessKey']
            SESSION_TOKEN = awsaccount['Credentials']['SessionToken']

            backup = boto3.client('backup', aws_access_key_id=ACCESS_KEY, aws_secret_access_key=SECRET_KEY, aws_session_token=SESSION_TOKEN)
            response = backup.list_backup_jobs()

            for i in response['BackupJobs']:
                raw = [
                    i.get('AccountId'),
                    i.get('BackupJobId'),
                    i.get('State'),
                    i.get('ResourceArn'),
                    i.get('ResourceType'),
                    i.get('StartBy'),
                    i.get('CreationDate'),
                ]
                w.writerow(raw)

    # Upload the finished report to S3.
    bucket.upload_file('/tmp/' + file_name, s3_path)

Upvotes: 0

Hoan Dang

Reputation: 374

Follow jarmod's advice if your CSV file is small. Otherwise, you could use Lambda to spin up a temporary EC2 instance (you can go for an xlarge size for better performance) with user_data in it. The user_data script does all the CSV processing on a powerful, healthy EC2 instance; just remember to terminate the instance once the processing is done (the termination command can also be included in the user_data). A rough sketch of the idea is below.
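
A minimal sketch of that approach, assuming a placeholder AMI ID, instance profile, and bucket name, plus a hypothetical processing script baked into the AMI; with InstanceInitiatedShutdownBehavior='terminate', the shutdown at the end of the user_data terminates the instance:

import boto3

ec2 = boto3.client('ec2')

# Boot script: run the CSV job, copy the result to S3, then shut down.
# The shutdown becomes a termination because of InstanceInitiatedShutdownBehavior below.
user_data = """#!/bin/bash
python3 /opt/scripts/build_report.py        # hypothetical processing script
aws s3 cp /tmp/Unused.csv s3://my-bucket/Unused.csv
shutdown -h now
"""

def lambda_handler(event, context):
    ec2.run_instances(
        ImageId='ami-xxxxxxxxxxxxxxxxx',                      # placeholder AMI
        InstanceType='m5.xlarge',
        MinCount=1,
        MaxCount=1,
        IamInstanceProfile={'Name': 'csv-report-profile'},    # placeholder profile with S3 access
        InstanceInitiatedShutdownBehavior='terminate',
        UserData=user_data
    )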

Upvotes: 0

Lamanus

Reputation: 13581

By using StringIO(), you don't need to save the CSV locally; just upload the in-memory buffer to S3. Try my code and let me know if something is wrong. I can't test it, but it has worked for other cases.

import boto3
import csv
import io

s3 = boto3.client('s3')
ses = boto3.client('ses')

def lambda_handler(event, context):
    csvio = io.StringIO()
    writer = csv.writer(csvio)
    writer.writerow([
        'Account Name',
        'Region',
        'Id'
    ])

    ec2 = boto3.resource('ec2')
    sgs = list(ec2.security_groups.all())
    insts = list(ec2.instances.all())

    all_sgs = set([sg.group_id for sg in sgs])
    all_inst_sgs = set([sg['GroupId'] for inst in insts for sg in inst.security_groups])

    unused_sgs = all_sgs - all_inst_sgs

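    # Account_Name and region are assumed to be defined elsewhere; they are not set anywhere in this snippet.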
    for elem in unused_sgs:
        writer.writerow([
            Account_Name,
            region,
            elem
            ])

    s3.put_object(Body=csvio.getvalue(), ContentType='application/vnd.ms-excel', Bucket='bucket', Key='name_of.csv') 
    csvio.close()

Upvotes: 4

jarmod

Reputation: 78840

If the CSV file will be small, write it to the /tmp folder, then upload that file to S3. If it's large (say, larger than ~200MB) then you should probably stream it to S3.

Read the boto3 documentation for the relevant S3 client methods.
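
A minimal sketch of both approaches, with a placeholder bucket and key; upload_file and upload_fileobj are standard boto3 S3 client methods, and upload_fileobj performs a managed (multipart) upload from any file-like object:

import csv
import io
import boto3

s3 = boto3.client('s3')

# Option 1: small file - write to Lambda's /tmp, then upload the file.
def upload_small(rows):
    with open('/tmp/Unused.csv', 'w', newline='') as f:
        csv.writer(f).writerows(rows)
    s3.upload_file('/tmp/Unused.csv', 'my-bucket', 'Unused.csv')      # placeholder bucket/key

# Option 2: larger data - hand a file-like object to upload_fileobj.
# Here the CSV is still built in memory; for truly huge data you would
# feed upload_fileobj a streaming source instead of a fully buffered one.
def upload_large(rows):
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    body = io.BytesIO(buf.getvalue().encode('utf-8'))
    s3.upload_fileobj(body, 'my-bucket', 'Unused.csv')                # placeholder bucket/key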

Upvotes: 0
