CW_20161

Reputation: 53

Remove all XML Attributes with a Given Name

I am editing a series of XML files, and I need to remove all attributes with the name "foo". This attribute appears in more than one type of element. An example snippet from the XML might be:

<bodymatter id="######">
  <level1 id="######">
    <pagenum page="#####" id="######" foo="######" />
    <h1 id="#####" foo="#####">Header</h1>
    <imggroup id="#######">
               .
               .
              etc.

The best solution I have uses Regex:

Regex regex = new Regex("foo=\"" + ".*?" + "\"", RegexOptions.Singleline);
content = regex.Replace(content, "");

I know built-in XML parsers could help, but ideally I want to make simple XML replacements/removals without having to deal with the baggage of an entire XML parser. Is Regex the best solution in this case?

Edit:

After some research in the XmlDocument class, here is one possible solution I came up with (to remove more than one attribute type stored in the array "ids"):

private void removeAttributesbyName(string[] ids)
{
    XmlDocument doc = new XmlDocument();
    doc.Load(path);
    XmlNodeList xnlNodes = doc.GetElementsByTagName("*");
    foreach (XmlElement el in xnlNodes)
    {
        for (int i = 0; i <= ids.Length - 1; i++)
        {
            if (el.HasAttribute(ids[i]))
            {
                el.RemoveAttribute(ids[i]);
            }
            if (el.HasChildNodes)
            {
                foreach (XmlNode child in el.ChildNodes)
                {
                    if (child is XmlElement && (child as XmlElement).HasAttribute(ids[i]))
                    {
                        (child as XmlElement).RemoveAttribute(ids[i]);
                    }
                }
            }
        }
    }
}

I don't know if this is as efficient as it possibly could be, but I've tested it and it seems to work fine.

Upvotes: 5

Views: 8280

Answers (3)

Scott Parker

Reputation: 11

I use the following to remove namespaces. It might also work for removing attributes from other nodes.

FileStream fs = new FileStream(filePath, FileMode.Open);
StreamReader sr = new StreamReader(fs);

DataSet ds = new DataSet();
ds.ReadXml(sr);
ds.Namespace = "";

string outXML = ds.GetXml();
ds.Dispose();
sr.Dispose();
fs.Dispose();

Upvotes: 0

fcuesta

Reputation: 4520

Do not use regex for XML manipulation. You can use LINQ to XML:

XDocument xdoc = XDocument.Parse(xml);
foreach (var node in xdoc.Descendants().Where(e => e.Attribute("foo") != null))
{
    node.Attribute("foo").Remove();
}

string result = xdoc.ToString();
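
The same idea can probably be written more compactly with the `Attributes` and `Remove` extension methods from `System.Xml.Linq`. This is only a sketch: the helper name `RemoveAttribute`, its parameters, and the sample input are made up for illustration, and it assumes the attribute is not in a namespace.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper (not from the question): parses an XML string,
// removes every attribute with the given name, and returns the result.
static string RemoveAttribute(string xml, string attributeName)
{
    XDocument xdoc = XDocument.Parse(xml);

    // Attributes(name) yields the matching attribute of each element;
    // the Remove() extension deletes all of them from their parents.
    xdoc.Descendants().Attributes(attributeName).Remove();

    // DisableFormatting keeps the serializer from re-indenting the output.
    return xdoc.ToString(SaveOptions.DisableFormatting);
}

// Made-up sample input modeled on the snippet in the question.
string sample =
    "<level1 id=\"1\"><pagenum page=\"2\" id=\"3\" foo=\"x\" />" +
    "<h1 id=\"4\" foo=\"y\">Header</h1></level1>";
Console.WriteLine(RemoveAttribute(sample, "foo"));
```

Unlike the regex, this only touches attributes whose name is exactly `foo`, so an attribute such as `barfoo` or the text `foo="..."` inside element content is left alone.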

Upvotes: 9

alc

Reputation: 1557

Is Regex the best solution in this case?

No.

You'll want to use something that works on XML at the object level (as an XmlElement, for example) and not at the string level.
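
As a rough sketch of what that could look like with `XmlDocument`: the helper name `RemoveAttributes` and the sample input below are invented for illustration, and the code works on a string rather than the file path used in the question. Since `GetElementsByTagName("*")` already returns every element in the document, including nested ones, a separate pass over child nodes should not be needed.

```csharp
using System;
using System.Xml;

// Hypothetical helper (name and signature are illustrative only):
// removes each named attribute from every element in the document.
static string RemoveAttributes(string xml, params string[] names)
{
    var doc = new XmlDocument();
    doc.LoadXml(xml);

    // "*" matches all elements at any depth, so children are covered too.
    foreach (XmlElement el in doc.GetElementsByTagName("*"))
    {
        foreach (string name in names)
        {
            // RemoveAttribute does nothing if the attribute is absent,
            // so no HasAttribute check is required first.
            el.RemoveAttribute(name);
        }
    }
    return doc.OuterXml;
}

// Made-up sample input modeled on the snippet in the question.
string input =
    "<bodymatter id=\"1\"><level1 id=\"2\">" +
    "<pagenum page=\"3\" id=\"4\" foo=\"x\" />" +
    "<h1 id=\"5\" foo=\"y\">Header</h1></level1></bodymatter>";
Console.WriteLine(RemoveAttributes(input, "foo"));
```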

Upvotes: 2
