Bumba

Reputation: 343

How to get part of a string from inside a node using C#

I have an XML file like this:

<?xml version="1.0"?>
<catalog>
<book id="bk101">
<author>Gambardella, Matthew</author>
<title>XML Developer's Guide [49-o]</title>
<genre>Computer</genre>
<price>44.95</price>
<publish_date>2000-10-01</publish_date>
<description>An in-depth look at [41-p] creating applications with XML.</description>
</book>
<book id="bk102">
<author>Ralls, Kim</author>
<title>Midnight Rain</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-12-16</publish_date>
<description>A former architect [100-x] battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.</description>
</book>
<book id="bk103">
<author>Corets, Eva</author>
<title>Maeve Ascendant</title>
<genre>Fantasy</genre>
<price>5.95</price>
<publish_date>2000-11-17</publish_date>
<description>After the collapse of a nanotechnology society in England, the [01-i] young survivors lay the foundation for a new society.</description>
</book>
</catalog>

How can I use LINQ to XML to extract the values matching "[(\d+)-([a-z])]" (for example "[41-p]") from each <description> node and store them in a variable? Alternatively, how can I add each extracted value as a new attribute on its own node, like <description val="41-p">?

Upvotes: 1

Views: 80

Answers (2)

Emre Kabaoglu

Reputation: 13146

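You could use Descendants to select the <description> elements and run a regex over each one's text:

// Requires: using System.Linq; using System.Text.RegularExpressions; using System.Xml.Linq;
Regex regex = new Regex(@"(\d+)-([a-z])");
var xdoc = XDocument.Parse(xml);   // xml holds the document as a string
var descriptions = xdoc.Descendants("description")
    .Select(x => regex.Match(x.Value))   // run the regex once per element
    .Where(m => m.Success)
    .Select(m => m.Value)
    .ToList();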

Output:
41-p
100-x
01-i
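
The pattern in the question has two capture groups, so if the number and the letter are needed separately, something along these lines might work. This is only a sketch: the class name DescriptionCodes and the method Extract are made up for illustration, and the pattern assumes the code is always wrapped in literal square brackets as in the sample XML.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class DescriptionCodes
{
    // Assumption: a code is digits, a hyphen and one lowercase letter inside literal brackets.
    private static readonly Regex CodePattern = new Regex(@"\[(\d+)-([a-z])\]");

    // Returns one (Number, Letter) pair for each <description> that contains a code.
    public static List<(string Number, string Letter)> Extract(string xml)
    {
        return XDocument.Parse(xml)
            .Descendants("description")
            .Select(d => CodePattern.Match(d.Value))   // match each description once
            .Where(m => m.Success)
            .Select(m => (Number: m.Groups[1].Value, Letter: m.Groups[2].Value))
            .ToList();
    }
}
```

Anchoring on the brackets keeps the regex from picking up other digit-hyphen-letter runs that might appear in a description.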

If you want to set the extracted values as an attribute:

Regex regex = new Regex(@"(\d+)-([a-z])");
var xdoc = XDocument.Parse(xml);
foreach (var description in xdoc.Descendants("description"))
{
    var match = regex.Match(description.Value);   // match once per element
    if (!match.Success)
        continue;
    // "val" is the attribute name asked for in the question
    description.Add(new XAttribute("val", match.Value));
}
xdoc.Save("sample.xml");
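
For reuse, the attribute step could be wrapped in a small helper. A minimal sketch, assuming the codes are always bracketed and the attribute should be called val as in the question; DescriptionAnnotator and AddValAttributes are hypothetical names. It uses SetAttributeValue, which creates or overwrites the attribute, whereas as far as I know Add throws if an attribute with that name already exists.

```csharp
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class DescriptionAnnotator
{
    // Assumption: group 1 is the code without its surrounding brackets, e.g. 41-p.
    private static readonly Regex CodePattern = new Regex(@"\[(\d+-[a-z])\]");

    // Parses the XML and adds val="<code>" to each <description> that contains a code.
    public static XDocument AddValAttributes(string xml)
    {
        var doc = XDocument.Parse(xml);
        foreach (var description in doc.Descendants("description"))
        {
            var match = CodePattern.Match(description.Value);
            if (match.Success)
            {
                // Creates the attribute, or replaces its value if it is already there.
                description.SetAttributeValue("val", match.Groups[1].Value);
            }
        }
        return doc;
    }
}
```

Calling DescriptionAnnotator.AddValAttributes(xml).Save("sample.xml") would then write the annotated document to disk.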

Upvotes: 3

Hans Kilian

Reputation: 25449

I'm not familiar with linq2xml, so I'd use XmlDocument and XPath expressions to find the nodes I'm interested in. Something like this:

XmlDocument doc = new XmlDocument();
doc.LoadXml(xmlString);

// Match a bracketed code such as [41-p]; group 1 is the text inside the brackets
Regex regex = new Regex(@"\[(\d+-[a-z])\]");
var books = doc.SelectNodes("//catalog/book");
foreach (XmlNode book in books)
{
     var description = book.SelectSingleNode("description");
     if (description == null)
          continue;   // skip books without a description
     var match = regex.Match(description.InnerText);
     if (match.Success)
     {
          var attribute = doc.CreateAttribute("val");
          attribute.Value = match.Groups[1].Value;
          description.Attributes.SetNamedItem(attribute);
     }
}

// Convert XmlDocument back to string
var stringWriter = new StringWriter();
using (var xmlTextWriter = XmlWriter.Create(stringWriter))
{
    doc.WriteTo(xmlTextWriter);
}   // disposing the writer flushes it
xmlString = stringWriter.ToString();

Upvotes: 0
