I am trying to parse an XML file that is located in the same folder as my Python script, but when I run the script it does not print anything in the terminal as it is supposed to. I am using ElementTree. Here is my code:
import xml.etree.ElementTree
f = xml.etree.ElementTree.parse('atom.xml').getroot()
for atype in f.findall('link'):
    print(atype.get('href'))
This is the XML; what I want to get from it is the href:
<?xml version='1.0' ?>
<feed xmlns="">
<title type="text">Gwern</title>
<link href="" rel="self" />
<generator uri="" version="HEAD">gitit</generator>
<id> utm_source=RSS&utm_medium=feed&utm_campaign=1</id>
<title type="text">Modified "Mail", Modified "", Modified "", Modified "", Modified "Wikipedia", "", Modified "hakyll.hs", Modified "newsletter/2017/", Modified "", Modified ""</title>
<link href="" rel="alternate" />
<summary type="text">record all minor pending edits</summary>
Upvotes: 0
Views: 908
Reputation: 15513
Question: ... what I want to get from the xml the href
Your XML has a namespace, <feed xmlns="">, so you have to pass a namespaces argument to findall.
Second, the XML has two <link ...> tags, one of them inside an <entry>.
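To see why a bare findall('link') prints nothing, it can help to look at the tag names ElementTree actually stores. The following is a minimal sketch on a made-up feed: the SAMPLE document and the namespace URI http://example.com/ns are placeholders for illustration, not the values from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; the namespace URI and hrefs are placeholders.
SAMPLE = """<feed xmlns="http://example.com/ns">
  <title type="text">Sample</title>
  <link href="http://example.com/feed" rel="self" />
  <entry>
    <link href="http://example.com/entry" rel="alternate" />
  </entry>
</feed>"""

root = ET.fromstring(SAMPLE)

# ElementTree stores each tag with its namespace URI in braces.
child_tags = [child.tag for child in root]
print(root.tag)
print(child_tags)

# A bare tag name carries no namespace, so it matches none of the children.
plain_matches = root.findall('link')
print(len(plain_matches))
```

Because the default namespace applies to every element under <feed>, the search path has to name that namespace for each step.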
findall(self, path, namespaces=None)
Finds all elements matching the ElementPath expression. Same as getroot().findall(path).
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
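The prefix-to-namespace mapping described above can be tried on a small inline document. This is only a sketch under assumptions: the SAMPLE feed, the URI http://example.com/ns, and the prefix name ns are all invented for the example. The prefix itself is arbitrary; only the URI it maps to has to match the document.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; the namespace URI and hrefs are placeholders.
SAMPLE = """<feed xmlns="http://example.com/ns">
  <link href="http://example.com/feed" rel="self" />
  <entry>
    <link href="http://example.com/entry" rel="alternate" />
  </entry>
</feed>"""

root = ET.fromstring(SAMPLE)

# Map an arbitrary prefix to the URI declared in <feed xmlns="...">.
namespaces = {'ns': 'http://example.com/ns'}

# <link> elements that are direct children of <feed>.
top_links = [el.get('href') for el in root.findall('./ns:link', namespaces)]
# <link> elements nested inside <entry>.
entry_links = [el.get('href') for el in root.findall('./ns:entry/ns:link', namespaces)]

print(top_links)
print(entry_links)
```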
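import xml.etree.ElementTree as ET

tree = ET.parse('atom.xml')
root = tree.getroot()
# Map the prefix 'xmlns' to the URI declared in <feed xmlns="...">.
# The URI is elided here, as it is in the question; fill in the real value.
namespaces = {'xmlns': ''}
# Get the first <link ...> outside <entry>
link = root.findall('./xmlns:link', namespaces)[0]
print('link:{} {}'.format(link, link.get('href')))
# Find all <link ...> inside <entry>
for link in root.findall('./xmlns:entry/xmlns:link', namespaces):
    print('link:{} {}'.format(link, link.get('href')))
Output:
link:<Element {}link at 0xf6a6d8ac>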
Tested with Python: 3.4.2
Upvotes: 3