Reputation: 1094
I have an XML source document with multiple "report" nodes under the Root node. I need to read each "report" node into its own DataTable. It looks like I'll either need to transform my source XML data using an xsl stylesheet to get it in the format that'll work nicely or iterate through my xml elements like so:
namespace XmlParse2
{
class Program
{
static IEnumerable<string> expectedFields = new List<string>() { "Field1", "Field2", "Field3", "Field4" };
static void Main(string[] args)
{
string xml = @"<Root>
<Report1>
<Row>
<Field1>data1-1</Field1>
<Field2>data1-2</Field2>
<Field4>data1-4</Field4>
</Row>
<Row>
<Field1>data2-1</Field1>
<Field2>data2-2</Field2>
</Row>
</Report1>
<Report2>
<Row>
<Field1>data1-1</Field1>
<Field4>data1-4</Field4>
</Row>
<Row>
<Field1>data2-1</Field1>
<Field3>data2-3</Field3>
</Row>
</Report2>
</Root>";
DataTable report1 = new DataTable("Report1");
report1.Columns.Add("Field1");
report1.Columns.Add("Field2");
report1.Columns.Add("Field3");
report1.Columns.Add("Field4");
DataTable report2 = new DataTable("Report2");
report2.Columns.Add("Field1");
report2.Columns.Add("Field2");
report2.Columns.Add("Field3");
report2.Columns.Add("Field4");
var doc = XDocument.Parse(xml);
var report1Data = doc.Root.Elements("Report1").Elements("Row").Select(record => MapRecord(record));
var report2Data = doc.Root.Elements("Report2").Elements("Row").Select(record => MapRecord(record));
report1 = addRows(report1, report1Data);
report2 = addRows(report2, report2Data);
Console.ReadLine();
}
public static Dictionary<string, string> MapRecord(XElement element)
{
var output = new Dictionary<string, string>();
foreach (var field in expectedFields)
{
bool hasField = element.Elements(field).Any();
if (hasField)
{
output.Add(field, element.Elements(field).First().Value);
}
}
return output;
}
public static DataTable addRows(DataTable table, IEnumerable<Dictionary<string, string>> data)
{
foreach (Dictionary<string, string> dict in data)
{
DataRow row = table.NewRow();
foreach(var item in dict)
{
row[item.Key] = item.Value;
}
table.Rows.Add(row);
}
return table;
}
}
}
The problem with my source data not working seems to be that both Report1 and Report2 have child nodes that are named "Row" and my attempts to do stuff using DataSet.ReadXml is not successful because my code just groups all nodes named Row into one DataTable instead of separate DataTables. :/
What am I missing?
Upvotes: 1
Views: 4507
Reputation: 236328
XDocument xdoc = XDocument.Load(path_to_xml);
var tables = xdoc.Root.Elements()
.Select(report => {
DataTable table = new DataTable(report.Name.LocalName);
var fields = report
.Descendants("Row")
.SelectMany(row => row.Elements()
.Select(e => e.Name.LocalName))
.Distinct();
foreach(string field in fields)
table.Columns.Add(field);
foreach(var row in report.Descendants("Row"))
{
DataRow dr = table.NewRow();
foreach(var field in row.Elements())
dr[field.Name.LocalName] = (string)field;
table.Rows.Add(dr);
}
return table;
});
This query will return IEnumerable<DataTable>
. Each datatable will contain only those columns, which have values in xml. Column names retrieved from xml and could be different for each table. For your sample structure will look this way:
DataTable: Report1
Columns: Field1, Field2, Field4
DataTable: Report2
Columns: Field1, Field3, Field4
All rows data will be added to each table.
You can extract some code to methods. It will make code easier to understand:
XDocument xdoc = XDocument.Load(path_to_xml);
var tables = xdoc.Root.Elements()
.Select(report => CreateTableFrom(report));
And methods:
private static DataTable CreateTableFrom(XElement report)
{
DataTable table = new DataTable(report.Name.LocalName);
table.Columns.AddRange(GetColumnsOf(report));
foreach (var row in report.Descendants("Row"))
{
DataRow dr = table.NewRow();
foreach (var field in row.Elements())
dr[field.Name.LocalName] = (string)field;
table.Rows.Add(dr);
}
return table;
}
private static DataColumn[] GetColumnsOf(XElement report)
{
return report.Descendants("Row")
.SelectMany(row => row.Elements().Select(e => e.Name.LocalName))
.Distinct()
.Select(field => new DataColumn(field))
.ToArray();
}
Upvotes: 1