Parsing out data in an XML file using Python

Question

I have an XML file from which I need to strip out certain tags, ideally using a wildcard, because the data inside the tags differs from one grouping to the next. See the XML below:

 
        
            xxxxx
            AbDT-1398  ***this data will be different for each grouping****

Basically, I need to search the XML file for the grouping, match the data within its tags with a wildcard, and remove the entire grouping. The tag is the same throughout my XML; only the data changes.

Aufwind · Accepted Answer

If I understand you correctly, you want to remove certain tags (and possibly their contents) from your XML file. Try using lxml to process the XML file. Have a look at these functions from lxml.etree.

strip_elements():

Delete all elements with the provided tag names from a tree or subtree. This will remove the elements and their entire subtree, including all their attributes, text content and descendants.

strip_tags():

This will remove the elements and their attributes, but not their text/tail content or descendants. Instead, it will merge the text content and children of the element into its parent.
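The two functions above can be sketched as follows. This is only a minimal illustration: the tags in the question's XML were lost, so the names group, name, id, keep and b, as well as the helper functions remove_groups and unwrap_tags, are hypothetical placeholders. Since both functions match on the tag name, no wildcard for the changing data should be needed.

```python
from lxml import etree

# Hypothetical sample: the original XML lost its tags, so these
# tag names are placeholders chosen only for illustration.
SAMPLE = b"""<root>
  <group><name>xxxxx</name><id>AbDT-1398</id></group>
  <group><name>yyyyy</name><id>AbDT-2001</id></group>
  <keep>other data <b>bold</b> text</keep>
</root>"""


def remove_groups(xml_bytes, tag):
    """Remove every element named `tag` together with its whole subtree."""
    root = etree.fromstring(xml_bytes)
    # with_tail=False keeps the text that follows each removed element
    etree.strip_elements(root, tag, with_tail=False)
    return root


def unwrap_tags(xml_bytes, tag):
    """Remove the `tag` markup but merge its text and children into the parent."""
    root = etree.fromstring(xml_bytes)
    etree.strip_tags(root, tag)
    return root


stripped = remove_groups(SAMPLE, "group")
unwrapped = unwrap_tags(SAMPLE, "b")

print(etree.tostring(stripped).decode())
print(etree.tostring(unwrapped).decode())
```

After remove_groups, every group element is gone no matter what data it held. After unwrap_tags, the b markup is gone but its text remains inside keep.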

Is this what you are looking for? If so, there is a nice answer on SO you should have a look at.
