Reputation: 574
I am using BeautifulSoup to parse an XML document with the code below. I want to put all the elements into a single dictionary.
import requests
from bs4 import BeautifulSoup
f = open('/home/soundarya/Desktop/mv-v18-1526.nxml','r')
d = BeautifulSoup(f.read())
s = d.find('journal-meta')
j = s.findAll('journal-id')
print s.find('journal-title').renderContents()
print s.find('issn').renderContents()
print s.find('publisher-name').renderContents()
for x in j:
    print x.renderContents()
This is the output I got for these elements:
/usr/local/lib/python2.7/dist-packages/bs4/__init__.py:166: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
To get rid of this warning, change this:
BeautifulSoup([your markup])
to this:
BeautifulSoup([your markup], "lxml")
markup_type=markup_type))
Molecular Vision
1090-0535
Molecular Vision
Mol Vis
Mol. Vis
MV
import requests
from bs4 import BeautifulSoup
f = open('/home/soundarya/Desktop/mv-v18-1526.nxml','r')
d = BeautifulSoup(f.read())
a = {}
a['journal-meta'] = d.find('journal-meta')
a['journal-id'] = a.find('journal-id')
a['journal-title'] = a.find('journal-title').renderContents()
a['issn'] = a.find('issn').renderContents()
a['publisher-name'] = a.find('publisher-name').renderContents()
for x in a:
    print x.renderContents()
I am getting this error:
AttributeError: 'dict' object has no attribute 'find'
How can I put these elements into a dictionary?
Upvotes: 2
Views: 266
Reputation: 59604
The error comes from this line, where a is the dictionary rather than the parsed document:
a['journal-id'] = a.find('journal-id')
I think you wanted to use the variable d:
a['journal-id'] = d.find('journal-id')
The same applies to the other lines that call a.find. Generally, try to use more descriptive variable names.
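As a rough sketch of how the whole thing could look, something like the following collects the values into one dictionary. The names here are my own assumptions: SAMPLE is a made-up stand-in for the contents of your .nxml file (the real structure may differ), and journal_meta_to_dict is just an illustrative helper name. The parser is named explicitly, which should also silence the UserWarning from the question. I picked "html.parser" only because it ships with Python; since it is an HTML parser, "lxml" or "xml" may suit real XML better if lxml is installed.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the .nxml file contents; the values mirror
# the output shown in the question, the structure is an assumption.
SAMPLE = """
<journal-meta>
  <journal-id journal-id-type="nlm-ta">Mol Vis</journal-id>
  <journal-id journal-id-type="iso-abbrev">Mol. Vis</journal-id>
  <journal-id journal-id-type="publisher-id">MV</journal-id>
  <journal-title-group>
    <journal-title>Molecular Vision</journal-title>
  </journal-title-group>
  <issn pub-type="epub">1090-0535</issn>
  <publisher>
    <publisher-name>Molecular Vision</publisher-name>
  </publisher>
</journal-meta>
"""


def journal_meta_to_dict(markup):
    """Collect selected journal-meta fields into a plain dictionary."""
    # Naming the parser explicitly avoids the "No parser was explicitly
    # specified" warning.
    soup = BeautifulSoup(markup, "html.parser")
    meta = soup.find("journal-meta")
    return {
        # There can be several journal-id elements, so keep them as a list.
        "journal-id": [tag.get_text(strip=True)
                       for tag in meta.find_all("journal-id")],
        "journal-title": meta.find("journal-title").get_text(strip=True),
        "issn": meta.find("issn").get_text(strip=True),
        "publisher-name": meta.find("publisher-name").get_text(strip=True),
    }


info = journal_meta_to_dict(SAMPLE)
for key, value in info.items():
    # This print form works on both Python 2 and Python 3.
    print("%s: %s" % (key, value))
```

Note that the searches are done on the parsed tag (meta), and only the extracted strings go into the dictionary, so nothing ever calls find on the dict itself. With your real file you would pass f.read() instead of SAMPLE.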
Upvotes: 4