user9300944

Can't read from XML file in S3 with Python

I have an XML file in S3 that I need to open from a Lambda function so I can write strings from it to a DynamoDB table. I am using lxml's etree to parse the file, but I don't think any content is actually being read from it. Below are my code, the error, and some sample XML.

Code:

import boto3
import lxml
from lxml import etree

def lambda_handler(event, context):
    output = 'Lambda ran successfully!'
    return output

def WriteItemToTable():
    s3 = boto3.resource('s3')

    obj = s3.Object('bucket', 'object')
    body = obj.get()['Body'].read()

    image_id = etree.fromstring(body.content).find('.//IMAGE_ID').text
    print(image_id)


WriteItemToTable()

Error:

'str' object has no attribute 'content'

XML:

 <HOST_LIST>
    <HOST>
      <IP network_id="X">IP</IP>
      <TRACKING_METHOD>EC2</TRACKING_METHOD>
      <DNS><![CDATA[i-xxxxxxxxxx]]></DNS>
      <EC2_INSTANCE_ID><![CDATA[i-xxxxxxxxx]]></EC2_INSTANCE_ID>
      <EC2_INFO>
        <PUBLIC_DNS_NAME><![CDATA[xxxxxxxxxxxx]]></PUBLIC_DNS_NAME>
        <IMAGE_ID><![CDATA[ami-xxxxxxx]]></IMAGE_ID>

I am trying to pull the AMI ID out of the <IMAGE_ID> tag.

Upvotes: 0

Views: 5881

Answers (1)

rczajka

Reputation: 1840

The content is being read; what you get is just an attribute error. body is already a string (it is the result of calling read() on the response body), and a string has no content attribute. Instead of fromstring(body.content), just call fromstring(body).
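
For illustration, here is a minimal sketch of that fix. The helper names extract_image_id and read_image_id_from_s3 are hypothetical, and the fallback to the standard library's ElementTree is an assumption added only so the parsing part runs where lxml is not installed. The sample XML is a shortened, closed version of the snippet in the question.

```python
try:
    from lxml import etree  # the parser used in the question
except ImportError:
    # Assumption: fall back to the standard library; fromstring() and
    # find() behave the same way for this simple lookup.
    import xml.etree.ElementTree as etree


def extract_image_id(xml_data):
    """Return the text of the first <IMAGE_ID> element, or None if absent."""
    # Pass the data returned by read() directly; it has no .content attribute.
    root = etree.fromstring(xml_data)
    node = root.find('.//IMAGE_ID')
    return node.text if node is not None else None


def read_image_id_from_s3(bucket, key):
    """Hypothetical helper: fetch the object from S3 and parse it."""
    import boto3  # imported here so the parsing code above runs without AWS
    body = boto3.resource('s3').Object(bucket, key).get()['Body'].read()
    return extract_image_id(body)


# Shortened, well-formed sample based on the XML in the question.
SAMPLE = b"""<HOST_LIST>
  <HOST>
    <EC2_INFO>
      <IMAGE_ID><![CDATA[ami-xxxxxxx]]></IMAGE_ID>
    </EC2_INFO>
  </HOST>
</HOST_LIST>"""

print(extract_image_id(SAMPLE))
```

Checking for None before reading .text avoids a second AttributeError when a document has no <IMAGE_ID> element.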

Upvotes: 1
