dracons2

Reputation: 23

Parsing XML content in Python

I want to read data from an XML file into a dict in Python. This is the XML file:

<data>
  <files>
    <links>
      <item>
        <file_name>file1</file_name>
        <id>100</id>
      </item>
      <item>
        <file_name>file2</file_name>
        <id>200</id>
      </item>
      <item>
        <file_name>file3</file_name>
        <id>300</id>
      </item>
    </links>
  </files>
</data>

I want to turn it into a Python dict like this:

xml_content = {'file1': 100, 'file2': 200, 'file3': 300}

Thanks for your help

Upvotes: 1

Views: 208

Answers (2)

ashishmohite

Reputation: 1120

Beautiful Soup should help you.

Link: https://www.crummy.com/software/BeautifulSoup/

Something like this should work:

from bs4 import BeautifulSoup

soup = BeautifulSoup("""
<data>
  <files>
    <links>
      <item>
        <file_name>file1</file_name>
        <id>100</id>
      </item>
      <item>
        <file_name>file2</file_name>
        <id>200</id>
      </item>
      <item>
        <file_name>file3</file_name>
        <id>300</id>
      </item>
    </links>
  </files>
</data>
""", "html.parser")  # name the parser explicitly; omitting it triggers a bs4 warning


xml_content = { item.find('file_name').string: item.find('id').string for item in soup.find_all('item') }

Output (the values are strings; wrap item.find('id').string in int() if you need integers):

{'file2': '200', 'file3': '300', 'file1': '100'}

Upvotes: 0

Simon Kirsten

Reputation: 2577

With xmltodict, a few lines of code are enough to build your dictionary.

Install xmltodict with pip install xmltodict, then:

import xmltodict

doc = xmltodict.parse("""
<data>
  <files>
    <links>
      <item>
        <file_name>file1</file_name>
        <id>100</id>
      </item>
      <item>
        <file_name>file2</file_name>
        <id>200</id>
      </item>
      <item>
        <file_name>file3</file_name>
        <id>300</id>
      </item>
    </links>
  </files>
</data>
""")

d = {}

for item in doc["data"]["files"]["links"]["item"]:
    d[item["file_name"]] = int(item["id"])

print(d)

d will be (this is Python 2 output; on Python 3 the keys print without the u prefix):

{u'file3': 300, u'file2': 200, u'file1': 100}

Alternatively, you can load the XML from a file like this:

with open('path/to/file.xml') as fd:
    doc = xmltodict.parse(fd.read())
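
If installing a third-party package is not an option, the same mapping can probably be built with the standard library alone. The following is only a sketch, assuming the XML has the layout shown in the question; xml.etree.ElementTree is swapped in here in place of xmltodict, and the names XML_TEXT and items_to_dict are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question (XML_TEXT is a hypothetical name).
XML_TEXT = """
<data>
  <files>
    <links>
      <item>
        <file_name>file1</file_name>
        <id>100</id>
      </item>
      <item>
        <file_name>file2</file_name>
        <id>200</id>
      </item>
      <item>
        <file_name>file3</file_name>
        <id>300</id>
      </item>
    </links>
  </files>
</data>
"""


def items_to_dict(xml_text):
    """Map each <item>'s <file_name> text to its <id> converted to int."""
    root = ET.fromstring(xml_text)
    # iter("item") walks the whole tree, so the nesting depth does not matter.
    return {
        item.findtext("file_name"): int(item.findtext("id"))
        for item in root.iter("item")
    }


xml_content = items_to_dict(XML_TEXT)
print(xml_content)
```

To read from a file instead, ET.parse('path/to/file.xml').getroot() returns the same kind of root element. This version assumes every item has both a file_name and an id; a missing id would make int() raise a TypeError.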

Upvotes: 1
