Reputation: 301
I have String variables of the following format:
<?xml version="1.0" encoding="UTF-8"?>
<packages>
<package name="firstPackage" version="LATEST" params="xxx" force="false"/>
<package name="secondPackage" version="LATEST" params="xxx" force="false"/>
</packages>
My goal is to split the String and add the pieces to a list, so that I end up with the following:
packages = ['<package name="firstPackage" version="LATEST" params="xxx" force="false"/>', '<package name="secondPackage" version="LATEST" params="xxx" force="false"/>']
I am fairly new to Python, but I think I need a nested loop: the outer loop goes through the initial multi-line Strings, and an inner loop breaks each one apart and stores every line as a separate list item. Below is some semi-pseudocode I have written:
variableList = []  # list containing multi-line strings
packageList = []  # list to contain separated items
for item in variableList:
    stringVariable = item
    splitPackage = (stringVariable.split("\n"))
    for package in splitPackage:
        packageList.append(splitPackage)
Any help/advice would be appreciated.
Upvotes: 1
Views: 584
Reputation: 582
A suggestion: don't use splitlines. Use a library that parses XML.
Python comes with a library to parse XML. To achieve your goal, try this:
import xml.etree.ElementTree as ET
# your xml wasn't valid
source = """<?xml version="1.0" encoding="utf-8"?>
<packages>
<package name="firstPackage" version="LATEST" params="xxx" force="false"/>
<package name="secondPackage" version="LATEST" params="xxx" force="false"/>
</packages>"""
root = ET.fromstring(source)
for child in root:
    print(child.attrib)
This will output:
{'name': 'firstPackage', 'version': 'LATEST', 'params': 'xxx', 'force': 'false'}
{'name': 'secondPackage', 'version': 'LATEST', 'params': 'xxx', 'force': 'false'}
An advantage of this approach is that you can convert the structured data into any format you like, for example into the list you want:
desired_format = '<package name="{name}" version="{version}" params="{params}" force="{force}"/>'
packages = []
for child in root:
    # fill the template by attribute name rather than relying on attribute order
    packages.append(desired_format.format(**child.attrib))
This will give you your desired list.
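Since the question starts from a list of multi-line strings rather than a single one, the same idea can be wrapped in a loop. The following is only a sketch: the function name extract_packages and the constant PACKAGE_TEMPLATE are made up for illustration, variable_list stands in for the asker's variableList, and it assumes every package element carries all four attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical template; assumes each element has exactly these four attributes.
PACKAGE_TEMPLATE = '<package name="{name}" version="{version}" params="{params}" force="{force}"/>'


def extract_packages(xml_strings):
    """Return one formatted <package .../> string per package element."""
    packages = []
    for xml_string in xml_strings:
        root = ET.fromstring(xml_string)
        # iter("package") walks the whole tree, so nested packages are found too
        for element in root.iter("package"):
            packages.append(PACKAGE_TEMPLATE.format(**element.attrib))
    return packages


# Stand-in for the list of multi-line strings from the question.
variable_list = [
    """<?xml version="1.0" encoding="UTF-8"?>
<packages>
<package name="firstPackage" version="LATEST" params="xxx" force="false"/>
<package name="secondPackage" version="LATEST" params="xxx" force="false"/>
</packages>""",
]

package_list = extract_packages(variable_list)
print(package_list)
```

If an element is missing one of the attributes, format raises a KeyError, which is arguably better than silently producing a malformed string.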
By the way, you may want to use a different library to parse XML, since the documentation for this one says:
The xml.etree.ElementTree module is not secure against maliciously constructed data
Upvotes: 1
Reputation: 185
# multlinestring holds the multi-line XML string
package_list = []
spl = multlinestring.splitlines()
for line in spl:
    # keep only the lines that carry a name attribute
    if 'name' in line:
        # trim leading/trailing whitespace
        line = line.strip()
        package_list.append(line)
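One caveat with the snippet above is that it keeps any line containing the text name, whatever tag it belongs to. A slightly stricter variant, sketched here under the assumption that every package sits on its own line and starts with "<package " (the function name split_package_lines is made up for illustration), could look like this:

```python
def split_package_lines(multiline_string):
    """Collect the stripped lines that start a <package ...> element."""
    package_lines = []
    for line in multiline_string.splitlines():
        line = line.strip()
        # the trailing space keeps the enclosing <packages> tag out
        if line.startswith("<package "):
            package_lines.append(line)
    return package_lines


sample = """<?xml version="1.0" encoding="UTF-8"?>
<packages>
  <package name="firstPackage" version="LATEST" params="xxx" force="false"/>
  <package name="secondPackage" version="LATEST" params="xxx" force="false"/>
</packages>"""

result = split_package_lines(sample)
print(result)
```

This still breaks if a package element spans several lines or if two share one line, which is where a real XML parser is the safer choice.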
Upvotes: 1