Hunter McMillen

Reputation: 61520

error parsing XML file using ElementTree.parse

I am using Python's ElementTree library to parse an XML file that I exported from MySQL Query Browser. When I export the result set to XML, the file includes a strange character that shows up as the letters "BS" highlighted in a green rounded rectangle in my editor (see screenshot). I iterate through the file and try to replace the character, but it must not be matching, because after I do this:

for lines in file:
    lines.replace("<Weird Char>", "").strip();

I get an error from the parse method. However, if I remove the character manually in WordPad or Notepad, the parse call works correctly. I am looking for a way to strip out the character without having to do it by hand.

Any help would be great. I included two screenshots: one of how the character appears in my editor, and another of how it appears in Chrome.

Thanks

[screenshot from my editor] [screenshot from Chrome]

EDIT: You will probably have to zoom in on the images, sorry.

Upvotes: 1

Views: 531

Answers (1)

Steve Prentice

Reputation: 23514

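The backspace character (0x08) is not a valid XML 1.0 character, and it cannot be escaped either: a character reference such as &#8; is rejected by the parser just like the raw byte. It has to be removed (or replaced with something legal) before parsing. I'm surprised MySQL lets it through in the export, but I'm not familiar with MySQL. You can also check your data and clean it up with an UPDATE statement to get rid of that character if it is not valid data for the table.

Your loop has no effect because str.replace returns a new string instead of modifying the original, so the result has to be kept. As far as stripping it out in Python, this should work on each line:

line = line.replace("\b", "")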
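For the whole file at once, here is a minimal sketch of cleaning the export before handing it to ElementTree. It assumes the file is UTF-8 or another ASCII-compatible encoding, and that dropping the control characters is acceptable for your data. The helper names `strip_invalid_xml_chars` and `parse_cleaned` are made up for illustration. It removes every control character that XML 1.0 forbids, not only backspace, and works on bytes so the file's own encoding declaration is left for the parser to handle.

```python
import re
import xml.etree.ElementTree as ET

# Control bytes that XML 1.0 forbids: everything below 0x20
# except tab (0x09), line feed (0x0A) and carriage return (0x0D).
# Assumption: the data is in an ASCII-compatible encoding such as UTF-8,
# where these byte values never occur inside a multi-byte character.
_INVALID_XML_BYTES = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_invalid_xml_chars(data):
    """Return the bytes with XML 1.0-illegal control characters removed."""
    return _INVALID_XML_BYTES.sub(b"", data)


def parse_cleaned(path):
    """Read the file at `path`, strip illegal bytes, and return the root element."""
    with open(path, "rb") as f:
        cleaned = strip_invalid_xml_chars(f.read())
    return ET.fromstring(cleaned)
```

Usage would look like `root = parse_cleaned("export.xml")`, where `export.xml` stands in for your exported file. If the backspace is real data you need to keep, substitute a visible placeholder in `sub` instead of the empty bytes.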

Upvotes: 1
