Reputation: 35
I am very new to Python and started coding with it a few weeks ago. Until now I was able to resolve any issues by researching and reading. But this issue has been giving me headaches for several hours and I can't seem to find the correct solution.
I created a sample XML file (test_file.xml) on my hard drive, in the same folder as my read_xml.py file.
Content of read_xml.py (before):
import re
with open('test_file.xml') as xml_source:
    data = xml_source.read()
xml_result = re.compile(r'<title>(.+?)</title>')
mo = xml_result.search(data)
print(mo.group(1))
This gives me back TinkerTry, which it should.
But if I go further like this:
Content of read_xml.py (now):
import re
with open('test_file.xml') as xml_source:
    data = xml_source.read()
xml_result = re.compile(r'<title>(.+?)</title>\n<link href="(.+?)"/>', re.MULTILINE)
mo = xml_result.search(data)
print(mo.group(1))
it does not seem to find or match anything anymore...
Upvotes: 0
Views: 52
Reputation: 43169
In short: don't. If you're learning Python
(or any other language, for that matter), trying to analyze XML
nodes with regular expressions is usually considered an anti-pattern. Instead, use a parser (that's what they were made for).
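One plausible reason the second pattern stops matching (a guess, since test_file.xml itself is not shown): re.MULTILINE only changes how ^ and $ behave, so the literal \n in the pattern still requires <link to start directly after the newline. Any indentation before it breaks the match. A minimal sketch with a hypothetical sample and a made-up URL:

```python
import re

# Hypothetical sample: the real test_file.xml is not shown in the question.
# The <link> line is indented, as pretty-printed XML usually is.
data = (
    '<item>\n'
    '  <title>TinkerTry</title>\n'
    '  <link href="https://example.com/"/>\n'
    '</item>\n'
)

# Pattern from the question: "\n" must be followed immediately by "<link".
strict = re.compile(r'<title>(.+?)</title>\n<link href="(.+?)"/>', re.MULTILINE)
# Variant that allows any whitespace (newline plus indentation) between the tags.
tolerant = re.compile(r'<title>(.+?)</title>\s*<link href="(.+?)"/>')

strict_match = strict.search(data)      # no match: spaces follow the newline
tolerant_match = tolerant.search(data)  # \s* absorbs the newline and the indent

print(strict_match)
print(tolerant_match.groups())
```

Even the tolerant variant breaks as soon as the attribute order or quoting changes. A parser sidesteps this sensitivity entirely: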
from lxml import etree
tree = etree.parse('test.xml')
root = tree.getroot()
for title in root.xpath("//item/title"):
    print(title.text)
This yields:
It's Bugtober, with Adobe Flash crashes, numerous CVE vulnerability patches for Wi-Fi and routers, and an Intel SPI vulnerability patch for most Xeon D Supermicro SuperServers
Supermicro Xeon D SuperServer BIOS 1.2c / IPMI 3.58 released
Windows 10 Fall Creators Update introduces GPU monitoring features built right into Task Manager
VMUG Advantage EVALExperience includes latest VMware vRealize Log Insight 4.5 syslog server appliance for easy vSphere, vSAN, IoT, and networking gear log file analysis
Road-warrior productivity boosted by ASUS ZenScreen MB16AC secondary travel display that connects to Mac or PC with just one USB-C or USB 3.0 cable
You may need to install lxml via pip install lxml first.
(Note: a link tag was opened but never closed.)
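If installing lxml is not an option, the standard library's xml.etree.ElementTree can do the same job. The sketch below pulls out both the title and the link's href, which is what the second regex in the question was after. The XML fragment, the item/title/link layout and the URL are assumptions for illustration, since the original file is not shown:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; element names follow the question's regex
# (<title>, <link href="..."/>) and the //item/title XPath above.
sample = """<feed>
  <item>
    <title>TinkerTry</title>
    <link href="https://example.com/"/>
  </item>
</feed>"""

root = ET.fromstring(sample)

pairs = []
for item in root.iter('item'):
    title = item.findtext('title')   # text of the first <title> child, or None
    link = item.find('link')         # first <link> child, or None
    href = link.get('href') if link is not None else None
    pairs.append((title, href))

print(pairs)
```

To read from disk instead, ET.parse('test_file.xml').getroot() would replace ET.fromstring(sample). If the real feed declares an XML namespace, the tag names passed to iter() and find() need to include it.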
Upvotes: 1