How to read and validate the header of a CSV uploaded to S3 in an AWS Lambda Python function

Question

I need to upload CSV data to MySQL, and I am using AWS Lambda to do it. The Lambda function, which runs Python 2.7, reads a CSV file from an S3 bucket. I want to validate the CSV header against values I stored in environment variables in the Lambda console. However, when I print the data, I get '\r' at the end of the last column's value in each row.

I am able to read the data from the CSV, and the values are also being inserted into the MySQL DB.

import os
import boto3

def validateCSV(event, context):
    EXPECTED_HEADERS = os.environ['RM_EXPECTED_HEADERS']
    s3 = boto3.client("s3")
    file_obj = event["Records"][0]
    bucketname = str(file_obj['s3']['bucket']['name'])
    filename = str(file_obj['s3']['object']['key'])
    fileObj = s3.get_object(Bucket=bucketname, Key=filename)
    rows = fileObj["Body"].read().split('\n')
    print(rows)  # ['Name,Age,PinCode\r', 'Apple,15,411001\r', '']
    fList = []
    for line in rows:
        fList.append(line.split(','))
    print("fList Headers matched: ", fList[0] == EXPECTED_HEADERS)  # this is giving me False

I added an environment variable with key RM_EXPECTED_HEADERS and value ['Name','Age','PinCode']. But when I print fList[0], I get ['Name', 'Age', 'PinCode\r'].

How do I remove '\r' from fList[0]?

Patrick Artner · Accepted Answer

You can strip the \r manually from your rows:

  rows = [x.strip() for x in fileObj["Body"].read().split('\n')]

Alternatively, include it in your split(...):

  rows = fileObj["Body"].read().split('\r\n')
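
To compare the two approaches without touching S3, here is a minimal sketch run against the sample data from the question. The `body` string is an assumption standing in for what `fileObj["Body"].read()` returns, and `str.splitlines()` is a third option, not part of the original answer, that handles both line endings and drops the trailing empty entry.

```python
# Sample body, assuming the file was saved with Windows (\r\n) line endings.
body = 'Name,Age,PinCode\r\nApple,15,411001\r\n'

# Approach 1: split on \n, then strip the leftover \r from each row.
stripped = [x.strip() for x in body.split('\n')]

# Approach 2: split on the full \r\n sequence.
split_crlf = body.split('\r\n')

# Extra option (not in the original answer): let Python detect line endings.
lines = body.splitlines()

print(stripped)    # ['Name,Age,PinCode', 'Apple,15,411001', '']
print(split_crlf)  # ['Name,Age,PinCode', 'Apple,15,411001', '']
print(lines)       # ['Name,Age,PinCode', 'Apple,15,411001']
```

Note that the first two approaches still leave an empty string at the end of the list, so any loop over the rows should skip empty lines.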

I never had problems with a remaining \r - normally Python takes care of either \n (Linux) or \r\n (Windows). Problems might occur if you create text files on Windows and split them under Unix, but I'm not sure.
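
Even with the \r removed, the comparison in the question would likely still be False, because an environment variable is always a string while fList[0] is a list. Below is a hedged sketch of one way to do the header check, assuming Python 3 and that the variable holds the literal text ['Name','Age','PinCode']. The helper names `parse_expected_headers` and `headers_match` are hypothetical, and using the `csv` module instead of manual splitting is a swapped-in technique, not what the answer above proposes.

```python
import ast
import csv
import io


def parse_expected_headers(raw):
    """Turn env var text such as "['Name','Age','PinCode']" into a list of strings."""
    return [str(h).strip() for h in ast.literal_eval(raw)]


def headers_match(body_text, expected):
    """Compare the first CSV row of body_text with the expected header list."""
    # newline='' leaves line endings untouched so csv can handle \n and \r\n itself.
    reader = csv.reader(io.StringIO(body_text, newline=''))
    header = next(reader, [])
    return [h.strip() for h in header] == expected


# Stand-in for os.environ['RM_EXPECTED_HEADERS'] (assumed value from the question).
expected = parse_expected_headers("['Name','Age','PinCode']")
print(headers_match('Name,Age,PinCode\r\nApple,15,411001\r\n', expected))  # True
```

In the Lambda handler, `body_text` would come from something like `fileObj["Body"].read().decode("utf-8")`, assuming the file is UTF-8 encoded. Storing the variable as plain `Name,Age,PinCode` and calling `split(',')` on it would avoid the `ast.literal_eval` step entirely.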
