Reputation: 856
I need to upload CSV data to MySQL, and I am using AWS Lambda to do it. The Lambda function, which runs on Python 2.7, reads a CSV file from an S3 bucket. I want to validate the CSV header against values that I stored in environment variables in the Lambda console. However, when I print the data, I get '\r' at the end of the last column's value in each row.
I am able to read the data from the CSV, and the values are also being inserted into the MySQL DB.
import os
import boto3

def validateCSV(event, context):
    EXPECTED_HEADERS = os.environ['RM_EXPECTED_HEADERS']
    s3 = boto3.client("s3")
    file_obj = event["Records"][0]
    bucketname = str(file_obj['s3']['bucket']['name'])
    filename = str(file_obj['s3']['object']['key'])
    fileObj = s3.get_object(Bucket=bucketname, Key=filename)
    rows = fileObj["Body"].read().split('\n')
    print(rows)  # ['Name,Age,PinCode\r', 'Apple,15,411001\r', '']
    fList = []
    for line in rows:
        fList.append(line.split(','))
    print("fList Headers matched: ", fList[0] == EXPECTED_HEADERS)  # this is giving me False
I have added an environment variable with key RM_EXPECTED_HEADERS and value ['Name','Age','PinCode']. But when I print fList[0], I get ['Name', 'Age', 'PinCode\r'].
How do I remove '\r' from fList[0]?
Upvotes: 1
Views: 2030
Reputation: 51643
You can strip the \r manually from your rows:
rows = [x.strip() for x in fileObj["Body"].read().split('\n')]
Alternatively, include them in your split(...):
rows = [x.strip() for x in fileObj["Body"].read().split('\r\n')]
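A third option, offered only as a sketch and not taken from the original code, is str.splitlines(), which splits on \n, \r\n and \r alike and so leaves no trailing \r behind. The helper name split_rows and the sample string are made up for illustration; in the Lambda function the text would come from fileObj["Body"].read().

```python
def split_rows(text):
    """Split CSV text into lists of fields, whatever the line-ending style."""
    # splitlines() removes the line terminators and does not produce
    # a trailing empty string; the `if line` filter skips blank lines.
    return [line.split(',') for line in text.splitlines() if line]


# Hypothetical sample mirroring the data shown in the question
sample = 'Name,Age,PinCode\r\nApple,15,411001\r\n'
rows = split_rows(sample)
print(rows[0])
```

With this approach the empty last element that split('\n') produces also disappears, so the loop no longer has to deal with it.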
I never had problems with a remaining \r - normally Python takes care of either \n (Linux) or \r\n (Windows). Problems might occur if you create text files on Windows and split them under Unix, but I am not sure.
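One more thing to check once the \r is gone: values read from os.environ are always strings, so fList[0] == EXPECTED_HEADERS compares a list with a string and stays False. Below is a minimal sketch of a header check, assuming the environment variable is stored as a plain comma-separated string such as Name,Age,PinCode (that storage format and the function name headers_match are my assumptions, not something from the question).

```python
import csv


def headers_match(csv_text, expected_headers):
    """Return True if the first CSV row equals the expected header names.

    expected_headers is assumed to be a comma-separated string,
    e.g. the value of os.environ['RM_EXPECTED_HEADERS'].
    """
    # splitlines() handles \n and \r\n; csv.reader accepts any iterable of lines
    reader = csv.reader(csv_text.splitlines())
    header = next(reader, [])  # empty list if the file has no rows
    expected = [name.strip() for name in expected_headers.split(',')]
    return [name.strip() for name in header] == expected


# Hypothetical sample mirroring the data shown in the question
sample = 'Name,Age,PinCode\r\nApple,15,411001\r\n'
print(headers_match(sample, 'Name,Age,PinCode'))  # True
```

If you would rather keep the value as ['Name','Age','PinCode'], it has to be parsed into a list first; comparing against the raw string will not match.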
Upvotes: 1