Reputation: 1194
I have been searching all over the web, i'm assuming some one has to have needed this before me and done it better, for an xml to csv converter. I have a very standard xml below:
<ArrayOfDealer xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<Dealer>
<Cmf>76066699</Cmf>
<DealerNumber/>
<DealershipName>BROWARD MOTORSPORTS - WPB</DealershipName>
</Dealer>
<Dealer>
<Cmf>76071027</Cmf>
<DealerNumber/>
<DealershipName>BROWARD MOTORSPORTS OF FT LAUDERDALE LLC</DealershipName>
</Dealer>
<Dealer>
<Cmf>76014750</Cmf>
<DealerNumber/>
<DealershipName>Jet Ski of Miami</DealershipName>
</Dealer>
<Dealer>
<Cmf>76066987</Cmf>
<DealerNumber/>
<DealershipName>BROWARD MOTORSPORTS - Davie</DealershipName>
</Dealer>
</ArrayOfDealer>
And I would like to parse this to something like
cmf dealernumber dealershipname
76066699 BROWARD MOTORSPORTS - WPB
76014750 Jet Ski of Miami
76066987 BROWARD MOTORSPORTS - Davie
XML is coming in from an api which I have stored as a string.
Thoughts?
Edit: To clarify, I know the structure will be like the above, with no further nesting of nodes, but the actual tag names can very.
Upvotes: 3
Views: 12249
Reputation: 5258
This is rough but it will create a CSV file or a tab delimited text file as long as structure stays with
Root -> Main Element (for each row) -> Child Elements (any number)
I included 2 different test sets of XML so you can see the result generates correctly.
Working sample:
namespace XmlToCsv
{
class Program
{
const int TabSpaces = 8;
static void GenerateCsvFromXml(string xmlString, string resultFileName, bool isTabDelimited)
{
XDocument xDoc = XDocument.Parse(xmlString);
var tabsNeededList = new List<int>(); // only used for TabDelimited file
string delimiter = isTabDelimited
? "\t"
: ",";
// Get title row
var titlesList = xDoc.Root
.Elements()
.First()
.Elements()
.Select(s => s.Name.LocalName)
.ToList();
// Get the values
var masterValuesList = xDoc.Root
.Elements()
.Select(e => e
.Elements()
.Select(c => c.Value)
.ToList())
.ToList();
// Add titles as first row in master values list
masterValuesList.Insert(0, titlesList);
// For tab delimited, we need to figure out the number of tabs
// needed to keep the file uniform, for each column
if (isTabDelimited)
{
for (var i = 0; i < titlesList.Count; i++)
{
int maxLength =
masterValuesList
.Select(vl => vl[i].Length)
.Max();
// assume tab is 4 characters
int rem;
int tabsNeeded = Math.DivRem(maxLength, TabSpaces, out rem);
tabsNeededList.Add(tabsNeeded);
}
}
// Write the file
using (var fs = new FileStream(resultFileName, FileMode.Create))
using (var sw = new StreamWriter(fs))
{
foreach (var values in masterValuesList)
{
string line = string.Empty;
foreach (var value in values)
{
line += value;
if (titlesList.IndexOf(value) < titlesList.Count - 1)
{
if (isTabDelimited)
{
int rem;
int tabsUsed = Math.DivRem(value.Length, TabSpaces, out rem);
int tabsLeft = tabsNeededList[values.IndexOf(value)] - tabsUsed + 1; // one tab is always needed!
for (var i = 0; i < tabsLeft; i++)
{
line += delimiter;
}
}
else // comma delimited
{
line += delimiter;
}
}
}
sw.WriteLine(line);
}
}
}
static void Main(string[] args)
{
String xmlString = @"<ArrayOfDealer xmlns:i=""http://www.w3.org/2001/XMLSchema-instance"">
<Dealer>
<Cmf>76066699</Cmf>
<DealerNumber/>
<DealershipName>BROWARD MOTORSPORTS - WPB</DealershipName>
</Dealer>
<Dealer>
<Cmf>76071027</Cmf>
<DealerNumber/>
<DealershipName>BROWARD MOTORSPORTS OF FT LAUDERDALE LLC</DealershipName>
</Dealer>
<Dealer>
<Cmf>76014750</Cmf>
<DealerNumber/>
<DealershipName>Jet Ski of Miami</DealershipName>
</Dealer>
<Dealer>
<Cmf>76066987</Cmf>
<DealerNumber/>
<DealershipName>BROWARD MOTORSPORTS - Davie</DealershipName>
</Dealer>
</ArrayOfDealer>";
String xmlString2 = @"<ArrayOfUnicorn xmlns:i=""http://www.w3.org/2001/XMLSchema-instance"">
<Unicorn>
<UnicornColor>Red</UnicornColor>
<Cmf>76066699</Cmf>
<UnicornNumber/>
<UnicornshipName>BROWARD MOTORSPORTS - WPB</UnicornshipName>
</Unicorn>
<Unicorn>
<UnicornColor>Red</UnicornColor>
<Cmf>76071027</Cmf>
<UnicornNumber/>
<UnicornshipName>BROWARD MOTORSPORTS OF FT LAUDERDALE LLC</UnicornshipName>
</Unicorn>
<Unicorn>
<UnicornColor>Red</UnicornColor>
<Cmf>76014750</Cmf>
<UnicornNumber/>
<UnicornshipName>Jet Ski of Miami</UnicornshipName>
</Unicorn>
<Unicorn>
<UnicornColor>Red</UnicornColor>
<Cmf>76066987</Cmf>
<UnicornNumber/>
<UnicornshipName>BROWARD MOTORSPORTS - Davie</UnicornshipName>
</Unicorn>
</ArrayOfUnicorn>";
// Comma delimited
GenerateCsvFromXml(xmlString, "Dealer.csv", false);
GenerateCsvFromXml(xmlString2, "Unicorn.csv", false);
// Tab delimited
GenerateCsvFromXml(xmlString, "Dealer.txt", true);
GenerateCsvFromXml(xmlString2, "Unicorn.txt", true);
}
}
}
Results:
Dealer.csv:
Cmf,DealerNumber,DealershipName
76066699,,BROWARD MOTORSPORTS - WPB,
76071027,,BROWARD MOTORSPORTS OF FT LAUDERDALE LLC,
76014750,,Jet Ski of Miami,
76066987,,BROWARD MOTORSPORTS - Davie,
Unicorn.csv:
UnicornColor,Cmf,UnicornNumber,UnicornshipName
Red,76066699,,BROWARD MOTORSPORTS - WPB,
Red,76071027,,BROWARD MOTORSPORTS OF FT LAUDERDALE LLC,
Red,76014750,,Jet Ski of Miami,
Red,76066987,,BROWARD MOTORSPORTS - Davie,
Unicorn.txt (tab delimited):
UnicornColor Cmf UnicornNumber UnicornshipName
Red 76066699 BROWARD MOTORSPORTS - WPB
Red 76071027 BROWARD MOTORSPORTS OF FT LAUDERDALE LLC
Red 76014750 Jet Ski of Miami
Red 76066987 BROWARD MOTORSPORTS - Davie
Upvotes: 3
Reputation: 117055
Would something like this get you what you need:
Func<string, string> csvFormat =
t => String.Format("\"{0}\"", t.Replace("\"", "\"\""));
var xml = XDocument.Parse(/* xml text here */);
Func<XDocument, IEnumerable<string>> getFields =
xd =>
xd
.Descendants("Dealer")
.SelectMany(d => d.Elements())
.Select(e => e.Name.ToString())
.Distinct();
var headers =
String.Join(",",
getFields(xml)
.Select(f => csvFormat(f)));
var query =
from dealer in xml.Descendants("Dealer")
select string.Join(",",
getFields(xml)
.Select(f => dealer.Elements(f).Any()
? dealer.Element(f).Value
: "")
.Select(x => csvFormat(x)));
var csv =
String.Join(Environment.NewLine,
new [] { headers }.Concat(query));
This still assumes that the <ArrayOfDealer><Dealer>
structure says the same, but the fields below might change.
Upvotes: 5
Reputation: 2551
The code below transforms a simple xml
content (as provided by the question) to a sequence of string
(header + rows) in a csv
format in a generic way, it is just necessary to implement a more reliable function to escape values (identity transformation in the example below).
string csvSeparator = ",";
Func<string, string> escapeValue = val => val;
string xml = "xml content";
XDocument doc = XDocument.Parse(xml);
var headers = doc.Root
.Elements()
.First()
.Elements()
.Select(el => el.Name.LocalName);
var headerRow = string.Join(csvSeparator, headers);
var rows = from el in doc.Root.Elements()
let values = from prop in el.Elements()
select escapeValue(prop.Value)
let row = string.Join(csvSeparator, values)
select row;
IEnumerable<string> csvLines = new[] { headerRow }.Concat(rows);
If you need the entire content of the csv file, you can do:
string csvContent = string.Join(Environment.NewLine, csvLines);
Upvotes: 0
Reputation: 3432
Deserialize your xml to an object using XmlSerializer. Then use the approach in this question to serialize your object to a csv file.
Upvotes: 0