Reputation: 57
I would like to retrieve a specific tag attribute. The file tag contains a child tag filename, and based on that field I would like to decide whether the modification time should be taken.
In other words: if the filename value contains .tar, I would like to print the modification time. In the example below I'd expect 2020-07-15T06:41:12.000Z to be printed.
I have been trying to do this for 2 hours without success, so I'd be really thankful for any tips that bring me closer to a solution.
Here is the code, but nothing is printed or added to the dates list:
import xml.etree.ElementTree as ET
tree = ET.parse(r"C:\path\to\file\logs.xml")
root = tree.getroot()
dates = []
for filetag in root.findall('.//{*}file'):
    for filename in filetag.findall('../{*}filename'):
        if ".tar" in filename.attrib['value']:
            print(filename)
            dates.append(filename)
Here is the XML document:
<?xml version="1.0" encoding="UTF-8"?>
<session xmlns="http://winscp.net/schema/session/1.0" name="[email protected]" start="2020-07-22T10:01:12.939Z">
<ls>
<destination value="/folder/processing" />
<files>
<file>
<filename value="." />
<type value="d" />
<modification value="2020-07-22T08:57:28.000Z" />
<permissions value="rwxrwsrwx" />
<owner value="1000130000" />
<group value="0" />
</file>
<file>
<filename value=".." />
<type value="d" />
<modification value="2020-07-22T08:51:15.000Z" />
<permissions value="rwxrwxrwx" />
<owner value="1000130000" />
<group value="0" />
</file>
<file>
<filename value="package_tsp200715092001_20200715074120.tar" />
<type value="-" />
<size value="4014536192" />
<modification value="2020-07-15T06:41:12.000Z" />
<permissions value="rw-rw-rw-" />
<owner value="1005" />
<group value="1005" />
</file>
<file>
<filename value="package_tsp200715092001_20200715074120" />
<type value="d" />
<modification value="2020-07-15T06:41:59.000Z" />
<permissions value="rwxr-Sr--" />
<owner value="1000130000" />
<group value="0" />
</file>
</files>
<result success="true" />
</ls>
</session>
Upvotes: 1
Views: 566
Reputation: 23815
Below is a one-liner:
import xml.etree.ElementTree as ET
xml = '''
<session xmlns="http://winscp.net/schema/session/1.0" name="[email protected]" start="2020-07-22T10:01:12.939Z">
<ls>
<destination value="/folder/processing" />
<files>
<file>
<filename value="." />
<type value="d" />
<modification value="2020-07-22T08:57:28.000Z" />
<permissions value="rwxrwsrwx" />
<owner value="1000130000" />
<group value="0" />
</file>
<file>
<filename value=".." />
<type value="d" />
<modification value="2020-07-22T08:51:15.000Z" />
<permissions value="rwxrwxrwx" />
<owner value="1000130000" />
<group value="0" />
</file>
<file>
<filename value="package_tsp200715092001_20200715074120.tar" />
<type value="-" />
<size value="4014536192" />
<modification value="2020-07-15T06:41:12.000Z" />
<permissions value="rw-rw-rw-" />
<owner value="1005" />
<group value="1005" />
</file>
<file>
<filename value="package_tsp200715092001_20200715074120" />
<type value="d" />
<modification value="2020-07-15T06:41:59.000Z" />
<permissions value="rwxr-Sr--" />
<owner value="1000130000" />
<group value="0" />
</file>
</files>
<result success="true" />
</ls>
</session>
'''
NS = {'scp': 'http://winscp.net/schema/session/1.0'}
root = ET.fromstring(xml)
tar_files_dates = [f.find('./scp:modification',NS).attrib['value'] for f in root.findall('.//scp:file',NS) if '.tar' in f.find('./scp:filename',NS).attrib['value']]
print(tar_files_dates)
Output:
['2020-07-15T06:41:12.000Z']
Upvotes: 1
Reputation: 42342
for filename in filetag.findall('../{*}filename'):
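Because of the leading .. this path looks for a filename in the parent of the file element (that is, as a sibling of file). ElementTree cannot navigate above the element that findall was called on, so the search returns nothing. It should be a single . so that the search starts from the file element itself.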
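To make the difference concrete, here is a minimal sketch on a made-up, namespace-free fragment (the tiny document and the variable names are illustrative assumptions, not taken from the question's file):

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document, without a namespace to keep the paths short.
doc = ET.fromstring('<files><file><filename value="a.tar" /></file></files>')
file_elem = doc.find('file')

# '..' tries to step above the element findall() is called on; nothing is found.
via_parent = file_elem.findall('../filename')

# '.' starts from the file element itself and finds its child.
via_self = file_elem.findall('./filename')

print(len(via_parent), len(via_self))
```

The first lookup comes back empty, which matches the symptom of nothing being printed or appended.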
Furthermore, namespace wildcards were added in Python 3.8. You don't indicate which Python version you're using, so this may also be an issue.
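If an older interpreter is a concern, one option that should not depend on wildcard support is to spell out the namespace URI in the tag name. The sketch below assumes a trimmed-down stand-in for the question's document; the sample string and variable names are illustrative only:

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the xmlns attribute of the question's <session> element.
NS_URI = 'http://winscp.net/schema/session/1.0'

# Trimmed-down stand-in for the real document (illustrative only).
sample = (
    '<session xmlns="http://winscp.net/schema/session/1.0"><ls><files>'
    '<file><filename value="a.tar" /></file>'
    '<file><filename value="b" /></file>'
    '</files></ls></session>'
)
root = ET.fromstring(sample)

# '{uri}tag' is the fully qualified tag name, so no wildcard is needed.
file_tag = '{%s}file' % NS_URI
filename_tag = '{%s}filename' % NS_URI

names = [f.find(filename_tag).get('value') for f in root.iter(file_tag)]
print(names)
```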
Anyway, you're probably better off using namespaces "properly" instead of looking for shortcuts; it's a bit more verbose but hardly difficult:
NS = {'scp': 'http://winscp.net/schema/session/1.0'}
for filetag in root.findall('.//scp:file', NS):
    for filename in filetag.findall('./scp:filename', NS):
        if ".tar" in filename.get('value', ''):
            print(filename)
            dates.append(filename)
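Note that the loop above collects the filename elements themselves. Since the question asks for the modification time, a possible variation is to look up the sibling modification element and store its value attribute. This is only a sketch: the shortened sample document is an assumption standing in for the real log file.

```python
import xml.etree.ElementTree as ET

NS = {'scp': 'http://winscp.net/schema/session/1.0'}

# Shortened stand-in for the question's document (illustrative only).
sample = '''<session xmlns="http://winscp.net/schema/session/1.0">
  <ls><files>
    <file>
      <filename value="package.tar" />
      <modification value="2020-07-15T06:41:12.000Z" />
    </file>
    <file>
      <filename value="package" />
      <modification value="2020-07-15T06:41:59.000Z" />
    </file>
  </files></ls>
</session>'''
root = ET.fromstring(sample)

dates = []
for filetag in root.findall('.//scp:file', NS):
    filename = filetag.find('./scp:filename', NS)
    modification = filetag.find('./scp:modification', NS)
    # Compare against None explicitly: an element without children is falsy.
    if filename is None or modification is None:
        continue
    if '.tar' in filename.get('value', ''):
        dates.append(modification.get('value'))

print(dates)
```

The explicit None checks skip any file entry that lacks one of the two child elements instead of raising an AttributeError.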
Upvotes: 2