Reputation: 613
I'm trying to do some simple modifications to an xml file (my text messages), but I'm having some difficulty understanding exactly what's going on and why.
My xml file is effectively formatted like this:
<smses count="123456" backup_set="123456" backup_date="12345"....>
<sms protocol="0" address="51511" date="1531363846440" type="1" subject="null" body="Welcome to Family Mobile! Your number is: ....>
<sms protocol="0" address="58038" date="1531407417581" type="1" subject="null" body="Family Mobile Important Message:...>
...
Thus when I create a tree:
import xml.etree.ElementTree as ET
import os
os.chdir('C:/Users/Sams PC/Desktop/')
tree = ET.parse("text_messages.xml")
root = tree.getroot()
My root tag and attributes would be:
>>> root.tag
'smses'
>>> root.attrib
{'count': '6079', 'backup_set': '1233456', 'backup_date': '12345'}
And my child nodes would thus be sms, i.e.:
>>> for child in root:
...     print(child.tag, child.attrib)
...
sms {'protocol': '0', 'address': '51511', 'date': '1531363846440', 'type': '1', 'subject': 'null', 'body': 'Welcome to Family Mobile! Your number is: ...}
sms {'protocol': '0', 'address': '58038', 'date': '1531407417581', 'type': '1', 'subject': 'null', 'body': 'Family Mobile Important Message: ...}
So, with the above being said, what I wanted to do was choose texts from specific numbers. So this is my approach.
for sms in root.findall('sms'):
    address = sms.get('address')
    if address != 51511:
        root.remove(sms)
tree.write('output.xml')
So the idea is basically: get the address value of every sms line, and if that value does not equal 51511, remove the entire sms line (in other words, only keep texts from number 51511).
However, my output file instead has every single sms line removed (even the ones with an address value of 51511, i.e. I get a blank file in return). Interestingly, if I change the condition to address == 51511, my output file instead includes every single address AND its body (so it removes the date, protocol, type, and subject).
I.E.
    if address == 51511:
        root.remove(sms)
#output is:
<sms address="51511" body="Welcome to Family Mobile! Your number is:..>
At this point, I don't know why I'm getting the output I'm getting, and feel I must have misunderstood how this Element Tree works. Any help would greatly be appreciated! Thank you!
EDIT: Just wanted to add one final thing. I believe the issue here is that it's saying there is no address == 'that value', i.e. if I do this:
for sms in root.findall('sms'):
    address = sms.get('address')
    body = sms.get('body')
    if address == 51511:
        print(address, body)
#output is nothing. However if I do address!=51511, I get every address with its associated body as an output. Basically implying that value of address does not exist in my xml file.
So the earlier command is actually working: I'm getting a blank file because none of my address values equal the value 51511. (I still don't know why the output of ==51511 is giving me only the address and body. Theoretically, since nothing equals that value, it should give me the exact same output as my input, which includes the date, type, and subject.)
Upvotes: 1
Views: 79
Reputation: 23206
You may notice in your question that when you print child.attrib you get:
{..., 'address': '51511', ...}
So the address attribute value is the string "51511", not the number 51511.
This explains your results.
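As a minimal sketch of the fix, compare against the string '51511' rather than the integer. The inline SAMPLE document and the keep_only helper below are hypothetical stand-ins for your text_messages.xml and your loop, not part of your original code:

```python
import xml.etree.ElementTree as ET

# Hypothetical inline sample standing in for text_messages.xml
SAMPLE = """<smses count="2">
  <sms protocol="0" address="51511" body="Welcome" />
  <sms protocol="0" address="58038" body="Important" />
</smses>"""


def keep_only(root, wanted):
    """Remove every sms child whose address differs from the string `wanted`."""
    # findall() returns a list, so removing children of root
    # while looping over that list is safe.
    for sms in root.findall('sms'):
        # Attribute values are always strings, so compare with a string.
        if sms.get('address') != wanted:
            root.remove(sms)
    return root


root = keep_only(ET.fromstring(SAMPLE), '51511')
remaining = [sms.get('address') for sms in root.findall('sms')]
print(remaining)
```

With your real file you would call tree.write('output.xml') after the loop, as you already do. Converting with int(sms.get('address')) would also work for purely numeric addresses, but it raises ValueError on anything non-numeric, so comparing strings is probably the safer choice here.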
Upvotes: 1