TJM

Reputation: 89

How to sort XML files by a node attribute in C#

Not asking for anyone to code this solution for me - just looking for guidance on the best approach. I'm working on an .aspx file in VS2015 using C# code behind.

I've found countless threads explaining how to sort nodes within an XML file. But, I have not found any threads on how to sort multiple XML files with the same structure, according to a common child node attribute.

My situation: I have a directory of hundreds of XML files named, simply, 0001.xml through 6400.xml. Each XML file has the same structure. I want to sort the files (not the nodes) according to the attribute of a child node.

Each XML file has an "item" parent node and has child nodes "year", "language", and "author", among others. For example:

<item id="0001">
   <year>2011</year>
   <language id="English" />
   <author sortby="Smith">John F. Smith</author>
   <content></content>
</item>

Instead of listing the files in order 0001 through 6400, I want to list them in alphabetical order according to the item/author node's @sortby attribute. How would I do that?

One idea that I had was to create a temporary XML file that gathers the information needed from each XML file. Then, I can sort the temporary XML file and then loop through the nodes to display the files in the proper order. Something like this...

XDocument tempXML = new XDocument();
// add parent node of <items>

string[] items = Directory.GetFiles(directory);
foreach (string item in items)
{
   // add child node of <item> with attributes "filename", "year", "language", and "author"
}

// then sort the XML nodes according to attributes

Does this make sense? Is there a smarter way to do this?

Upvotes: 3

Views: 5986

Answers (5)

Alberto Monteiro

Reputation: 6219

Sorting

We can list the XML files in sorted order with a bit of LINQ to XML, using the following code:

var xmlsWithFileName = Directory.GetFiles(directory)
                                .Select(fileName => new { fileName, xml = XDocument.Parse(File.ReadAllText(fileName)) })
                                .OrderBy(tuple => tuple.xml.Element("item").Element("author").Attribute("sortby").Value);

Each element of xmlsWithFileName will have

  • xml property, which contains the XML as an XDocument
  • fileName property, that contains the path of the XML file

Assuming that your target directory contains these XML files:

0001.xml

<item id="0001">
   <year>2011</year>
   <language id="English" />
   <author sortby="Smith">John F.Smith</author>
   <content></content>
</item>

0002.xml

<item id="0002">
   <year>2012</year>
   <language id="Portuguese" />
   <author sortby="Monteiro">Alberto Monteiro</author>
   <content></content>
</item>

You can use this code to test:

public static void ShowXmlOrderedBySortByAttribute(string directory)
{
    var xmlsWithFileName = Directory.GetFiles(directory)
                                    .Select(fileName => new { fileName, xml = XDocument.Parse(File.ReadAllText(fileName)) })
                                    .OrderBy(tuple => tuple.xml.Element("item").Element("author").Attribute("sortby").Value);

    foreach (var xml in xmlsWithFileName)
    {
        Console.WriteLine($"Filename: {xml.fileName}{Environment.NewLine}Xml content:{Environment.NewLine}");
        Console.WriteLine(xml.xml.ToString());
        Console.WriteLine("================");
    }
}

And the output of this code is:

Filename: c:\temp\teste\0002.xml
Xml content:

<item id="0002">
  <year>2012</year>
  <language id="Portuguese" />
  <author sortby="Monteiro">Alberto Monteiro</author>
  <content></content>
</item>
================
Filename: c:\temp\teste\0001.xml
Xml content:

<item id="0001">
  <year>2011</year>
  <language id="English" />
  <author sortby="Smith">John F.Smith</author>
  <content></content>
</item>
================

As you can see, 0002.xml appears first, followed by 0001.xml.

Upvotes: 4

Tim

Reputation: 887

Edit: Now that I think about it, you probably want the file contents rather than the file names. If so, you could replace the "items" array in this example with a collection of strings containing the file contents, and have GetAuthor parse each string and return the author name.

I think the best solution would be to add these file names to some sort of collection that can be sorted. This will take your file names and add them to a Lookup:

var lookup = items.ToLookup(a => GetAuthor(a)).OrderBy(a => a.Key);

This is going to rely on a method that uses the file name to get the author name:

private string GetAuthor(string filename)
{
    string author = String.Empty;

    // get author name logic

    return author;
}
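
One way to fill in the placeholder, as a minimal sketch rather than the answer author's own implementation: it assumes each file has an <item> root with an <author sortby="..."> child, as in the question. The class name AuthorLookup is hypothetical, and the method is made public static here only so the snippet stands on its own.

```csharp
using System;
using System.Xml.Linq;

public static class AuthorLookup
{
    // Hypothetical helper: returns the item/author @sortby value for one file,
    // or an empty string when the <author> element or the attribute is missing.
    public static string GetAuthor(string filename)
    {
        // XElement.Load returns the root element, which is <item> in the question's files.
        XElement item = XElement.Load(filename);

        // The null-conditional operator (C# 6) avoids a NullReferenceException
        // when a file has no <author> element.
        XAttribute sortBy = item.Element("author")?.Attribute("sortby");

        return sortBy?.Value ?? String.Empty;
    }
}
```

With this in place, the ToLookup call above would be written as items.ToLookup(a => AuthorLookup.GetAuthor(a)). Files without a sortby value all end up under the empty-string key, which sorts first.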

And finally, to iterate through your list:

foreach (IGrouping<string, string> author in lookup)
{
    foreach (string file in author)
    {
        Console.WriteLine(String.Format("{0}: {1}", author.Key, file ));
    }
}

If you decide you want to sort the list on more than one criterion, you'll have to take a different approach: create a custom object, add instances of it to a list, and use a custom IComparer. This example lets you avoid all that if you only care about the author name.
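
For comparison, LINQ's OrderBy and ThenBy can also express a multi-key sort without a custom IComparer. The following is only a sketch under assumptions: the class and method names (MultiKeySort, SortByAuthorThenYear) are hypothetical, the secondary key is assumed to be the <year> element from the question's sample, and every <year> that is present is assumed to hold an integer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class MultiKeySort
{
    // Returns file paths ordered by item/author/@sortby (case-insensitive),
    // then by the numeric value of <year>.
    public static List<string> SortByAuthorThenYear(string directory)
    {
        return Directory.GetFiles(directory, "*.xml")
            .Select(file => new { FileName = file, Item = XElement.Load(file) })
            // The explicit string conversion on XAttribute yields null for a
            // missing attribute, so such files fall back to "" and sort first.
            .OrderBy(x => (string)x.Item.Element("author")?.Attribute("sortby") ?? "",
                     StringComparer.OrdinalIgnoreCase)
            // The explicit int? conversion yields null for a missing <year>
            // and throws FormatException if the content is not an integer.
            .ThenBy(x => (int?)x.Item.Element("year") ?? 0)
            .Select(x => x.FileName)
            .ToList();
    }
}
```

A custom comparer may still be the better fit when the ordering rules are complex or reused in several places.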

Upvotes: 2

Reza Aghaei

Reputation: 125187

You can load items using XElement and sort them this way:

var items = System.IO.Directory.GetFiles(@"path", "*.xml")
                     .Select(file => System.Xml.Linq.XElement.Load(file))
                     .OrderBy(x => x.Element("author").Attribute("sortby").Value)
                     .ToList();

Also if you need file names, you can select an object containing FileName and Item:

var items = System.IO.Directory.GetFiles(@"path", "*.xml")
                     .Select(file => new
                     {
                         FileName = file, 
                         Item = System.Xml.Linq.XElement.Load(file)
                     })
                     .OrderBy(x => x.Item.Element("author").Attribute("sortby").Value)
                     .Select(x=>x.FileName) /*or .Select(x=>x.Item)*/
                     .ToList();

Upvotes: 1

Lewis Hai

Reputation: 1214

There are two ways to sort XML data by the InnerText of its nodes:

  1. Use LINQ. You can load all the items into a list and order them by a child element. You can write a function that takes the name of the child node as a parameter to do that.
  2. Use XSLT to transform the XML.

Refer to "Sorting of XML file by XMLElement's InnerText" for more detail.

Hope it helps!

Upvotes: 1

Bubba

Reputation: 295

If I understand what you are saying correctly, this is how I would go about it:

SortedDictionary<string, string> dict = new SortedDictionary<string, string>();
var files = Directory.GetFiles(@"[path to files]", "*.xml");

foreach (var item in files)
{
    XDocument doc = XDocument.Load(item);
    var sortvalue = (from lv1 in doc.Descendants("somesortvalue")
                     select lv1.Value).First();

    dict.Add(sortvalue, item);
}

Then you can do a foreach over dict. A SortedDictionary enumerates its entries in key order, so the file names (the values) come out sorted by the sort value.
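
One caveat with the SortedDictionary approach: Add throws an ArgumentException when the key already exists, which happens as soon as two files share a sort value (for example, two items by the same author). A possible variation, sketched under the assumption that the sort value is the item/author/@sortby attribute from the question, stores a list of files per key. The names GroupedSort and GroupFilesBySortKey are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.Linq;

public static class GroupedSort
{
    // Maps each sort value to all files that carry it; enumerating the
    // dictionary yields the keys in sorted order.
    public static SortedDictionary<string, List<string>> GroupFilesBySortKey(string directory)
    {
        var dict = new SortedDictionary<string, List<string>>();

        foreach (string file in Directory.GetFiles(directory, "*.xml"))
        {
            XDocument doc = XDocument.Load(file);

            // Assumed key: item/author/@sortby; files without it use "".
            string key = (string)doc.Root.Element("author")?.Attribute("sortby") ?? "";

            // Create the list on first sight of a key instead of calling Add
            // for every file, which would throw on duplicates.
            List<string> files;
            if (!dict.TryGetValue(key, out files))
            {
                files = new List<string>();
                dict.Add(key, files);
            }
            files.Add(file);
        }

        return dict;
    }
}
```

To list the files, loop over the dictionary's entries and then over each entry's Value list. The order of files within one key follows whatever order Directory.GetFiles returned them in, which is not guaranteed to be sorted.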

Upvotes: 1
