AWS_Developer

Reputation: 856

How to read and validate header in csv uploaded in s3 in aws lambda python function

I need to upload CSV data to MySQL, and I am using AWS Lambda to do it. The Lambda function, which runs Python 2.7, reads a CSV file from an S3 bucket. I want to validate the CSV header against values that I set in the environment variables in the Lambda console. However, when I print the data, I get '\r' at the end of the last column's value in each row.

I am able to read the data from the CSV, and the values are also inserted into the MySQL DB.

import os
import boto3

def validateCSV(event, context):
    EXPECTED_HEADERS = os.environ['RM_EXPECTED_HEADERS']
    s3 = boto3.client("s3")
    file_obj = event["Records"][0]
    bucketname = str(file_obj['s3']['bucket']['name'])
    filename = str(file_obj['s3']['object']['key'])
    fileObj = s3.get_object(Bucket=bucketname, Key=filename)
    rows = fileObj["Body"].read().split('\n')
    print(rows)  # ['Name,Age,PinCode\r', 'Apple,15,411001\r', '']
    fList = []
    for line in rows:
        fList.append(line.split(','))
    print("fList Headers matched: ", fList[0] == EXPECTED_HEADERS)  # this is giving me False

I have added an environment variable with the key RM_EXPECTED_HEADERS and the value ['Name','Age','PinCode']. But when I print fList[0], I get ['Name', 'Age', 'PinCode\r'].

How do I remove '\r' from fList[0]?

Upvotes: 1

Views: 2030

Answers (1)

Patrick Artner

Reputation: 51643

You can strip the \r manually from your rows:

  rows = [x.strip() for x in fileObj["Body"].read().split('\n')]

Alternatively, include it in your split(...):

  rows = [x.strip() for x in fileObj["Body"].read().split('\r\n')]

I have never had problems with leftover \r characters. Normally Python takes care of either \n (Linux) or \r\n (Windows) line endings. Problems might occur if you create text files on Windows and split them under Unix, but I am not sure.
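One more thing may keep the comparison False even after the \r is gone: environment variables are always strings, so fList[0] (a list) is being compared with the text "['Name','Age','PinCode']". The sketch below is one possible way to handle both issues. The helper name headers_match is made up for illustration, and it assumes the variable holds exactly the list-literal text shown in the question.

```python
import ast


def headers_match(body, expected_raw):
    """Return True if the first CSV line matches the expected header list.

    headers_match is a hypothetical helper, not part of boto3 or Lambda.
    """
    # Under Python 3 the S3 body arrives as bytes; decode it to text first.
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    # splitlines() handles \n, \r\n and \r, and leaves no trailing empty row.
    rows = body.splitlines()
    if not rows:
        return False
    header = [col.strip() for col in rows[0].split(",")]
    # Assumption: the variable stores text like "['Name','Age','PinCode']",
    # so parse it back into a real list before comparing.
    expected = ast.literal_eval(expected_raw)
    return header == expected
```

Inside the handler this could be called as headers_match(fileObj["Body"].read(), os.environ['RM_EXPECTED_HEADERS']). Storing the headers as plain comma-separated text (Name,Age,PinCode) and splitting on ',' would avoid the parsing step altogether.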

Upvotes: 1
