Reputation: 53
I am editing a series of XML files, and I need to remove all attributes with the name "foo". This attribute appears in more than one type of element. An example snippet from the XML might be:
<bodymatter id="######">
  <level1 id="######">
    <pagenum page="#####" id="######" foo="######" />
    <h1 id="#####" foo="#####">Header</h1>
    <imggroup id="#######">
    .
    .
    etc.
The best solution I have uses Regex:
Regex regex = new Regex("foo=\"" + ".*?" + "\"", RegexOptions.Singleline);
content = regex.Replace(content, "");
I know built-in XML parsers could help, but ideally I want to make simple XML replacements/removals without having to deal with the baggage of an entire XML parser. Is Regex the best solution in this case?
Edit:
After some research into the XmlDocument class, here is one possible solution I came up with (it removes every attribute whose name is stored in the array "ids"):
// Requires: using System.Xml;
private void removeAttributesbyName(string[] ids)
{
    XmlDocument doc = new XmlDocument();
    doc.Load(path); // "path" is a field holding the file location

    // GetElementsByTagName("*") returns every element in the document,
    // including nested ones, so child nodes need no separate pass.
    XmlNodeList xnlNodes = doc.GetElementsByTagName("*");
    foreach (XmlElement el in xnlNodes)
    {
        foreach (string id in ids)
        {
            if (el.HasAttribute(id))
            {
                el.RemoveAttribute(id);
            }
        }
    }

    // Assumption: the result is written back to the same file.
    // Without a Save call the changes stay in memory only.
    doc.Save(path);
}
I don't know if this is as efficient as it possibly could be, but I've tested it and it seems to work fine.
Upvotes: 5
Views: 8280
Reputation: 11
I use the following to remove namespaces. This might also work in removing attributes from other nodes as well.
// Requires: using System.Data; using System.IO;
string outXML;
using (FileStream fs = new FileStream(filePath, FileMode.Open))
using (StreamReader sr = new StreamReader(fs))
using (DataSet ds = new DataSet())
{
    ds.ReadXml(sr);
    ds.Namespace = "";
    outXML = ds.GetXml();
}
Upvotes: 0
Reputation: 4520
Do not use regex for XML manipulation. You can use LINQ to XML:
XDocument xdoc = XDocument.Parse(xml);
foreach (var node in xdoc.Descendants().Where(e => e.Attribute("foo") != null))
{
    node.Attribute("foo").Remove();
}
string result = xdoc.ToString();
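If you need to strip several attribute names, the same idea can probably be written without the explicit filter, since System.Xml.Linq.Extensions offers Attributes(XName) and Remove() over sequences. This is only a minimal sketch, assuming the XML is available as a well-formed string; the AttributeStripper class and RemoveAttributes helper are names I made up for illustration:

```csharp
using System.Xml.Linq;

public static class AttributeStripper
{
    // Hypothetical helper: removes every attribute whose name appears in
    // "names" from all elements and returns the resulting XML as a string.
    public static string RemoveAttributes(string xml, params string[] names)
    {
        XDocument xdoc = XDocument.Parse(xml);
        foreach (string name in names)
        {
            // Attributes(name) yields the matching attributes of every element.
            // Remove() copies them to a list before detaching them, so calling
            // it on a lazy query is safe.
            xdoc.Descendants().Attributes(name).Remove();
        }
        return xdoc.ToString();
    }
}
```

Usage would look like string cleaned = AttributeStripper.RemoveAttributes(xml, "foo"); Note that XDocument.ToString() leaves out the XML declaration, so if you need it in the output, saving through xdoc.Save(...) is likely the better route.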
Upvotes: 9
Reputation: 1557
Is Regex the best solution in this case?
No.
You'll want to use something that works on XML at the object level (as an XmlElement, for example) and not at the string level.
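To illustrate the difference, here is a rough sketch comparing the two approaches on text that merely looks like the attribute. The RegexVsDom class and both method names are made up for this example, and the regex is the pattern from the question:

```csharp
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class RegexVsDom
{
    // String level: removes anything shaped like foo="..." wherever it
    // occurs, including inside element text.
    public static string StripWithRegex(string xml)
    {
        return Regex.Replace(xml, "foo=\".*?\"", "", RegexOptions.Singleline);
    }

    // Object level: only real attributes named foo are removed.
    public static string StripWithDom(string xml)
    {
        XDocument xdoc = XDocument.Parse(xml);
        xdoc.Descendants().Attributes("foo").Remove();
        return xdoc.ToString();
    }
}
```

Given an input such as <p foo="1">Set foo="bar" in the config</p>, the regex version also deletes the foo="bar" inside the paragraph text, while the object-level version only drops the attribute. The regex would likewise miss an attribute written with single quotes, which a parser handles without extra work.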
Upvotes: 2