Pucho

Reputation: 380

Read csv file from /tmp folder in Lambda, filter it and upload it to s3

I have a CSV file that I need to filter based on user login information, and then upload the filtered file to S3.

Here is my code:

csv_file = csv.reader(open('/tmp/users.csv', "r"))
for row in csv_file:
    if result > row[6]: #'result' is the date I'm measuring against column 6 of the csv

        with open('/tmp/filtered.csv', 'w') as g:
            wf = csv.writer(g)
            wf.writerow(['User', 'First', 'Last', 'Email', 'Local', 'Membership', 'Login'])
            wf.writerows(row)
            print (row)
        bucket.upload_file('/tmp/filtered.csv', key)

While the 'print (row)' line gives me this output:

[screenshot: output of print (row)]

The actual csv file uploaded to s3 looks like this:

[screenshot: the CSV file uploaded to S3]

The CSV file I end up with contains only a single user. I would like it to contain every user matched by the filter, properly formatted. Any help would be appreciated.

EDIT: When I change the line from 'wf.writerows(row)' to 'wf.writerow(row)', the file is properly formatted, but it still contains only one user (the last one) out of the entire dataset.

Upvotes: 0

Views: 1460

Answers (1)

John Rotenstein

Reputation: 269480

I would say your issue is related to the fact that you are opening the output file for every row:

for row in csv_file:
    with open('/tmp/filtered.csv', 'w') as g:
        wf = csv.writer(g)
        ...

This means that the contents of the output file are overwritten on every pass: mode 'w' truncates the file each time it is opened, so only the last row written survives.
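A small sketch of that effect, using a temporary file and made-up rows rather than the data from the question (the names and dates here are purely illustrative):

```python
import csv
import os
import tempfile

# Hypothetical sample rows, not the data from the question.
rows = [["alice", "2019-01-01"], ["bob", "2019-02-01"], ["carol", "2019-03-01"]]

path = os.path.join(tempfile.mkdtemp(), "filtered.csv")

for row in rows:
    # Mode 'w' truncates the file each time it is opened,
    # so each iteration discards what the previous one wrote.
    with open(path, "w", newline="") as g:
        csv.writer(g).writerow(row)

with open(path, newline="") as f:
    remaining = list(csv.reader(f))

print(remaining)  # only the last row survives
```

Three rows go in, but reading the file back returns just the final one, which matches the "last user only" symptom described in the question.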

Instead, open the output file and create the csv writer before looping through each row of the input file:

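import csv

# Open both files once, before the loop, so the output is not truncated per row.
with open('/tmp/users.csv', 'r', newline='') as input_file, \
        open('/tmp/filtered.csv', 'w', newline='') as output_file:
    wf = csv.writer(output_file)
    wf.writerow(['User', 'First', 'Last', 'Email', 'Local', 'Membership', 'Login'])

    for row in csv.reader(input_file):
        # 'result' is the date from the question; column 6 holds the login value.
        if result > row[6]:
            wf.writerow(row)

# Upload only after the with block has closed (and flushed) the output file.
bucket.upload_file('/tmp/filtered.csv', key)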

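This way, the output file is opened only once, so every matching row ends up in the same file, and the upload happens once, after the file has been closed.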
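As for the formatting problem mentioned in the question's EDIT, it most likely comes from the difference between writerows() and writerow(). A minimal sketch with a made-up row (the values are hypothetical) that writes to in-memory buffers instead of /tmp:

```python
import csv
import io

# Hypothetical row, standing in for one list returned by csv.reader.
row = ["jdoe", "John", "Doe"]

# writerows() expects an iterable of rows, so each string in 'row'
# is itself treated as a row and split into one character per column.
wrong = io.StringIO()
csv.writer(wrong).writerows(row)

# writerow() writes the list as a single row, one field per column.
right = io.StringIO()
csv.writer(right).writerow(row)

print(repr(wrong.getvalue()))
print(repr(right.getvalue()))
```

With writerows(), each field lands on its own line with its characters separated by commas. With writerow(), the whole list becomes one comma-separated line, which is what you want when writing a single row from csv.reader.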

Upvotes: 2
